The main goal was to predict IT salaries for a user, using data collected from Indeed job postings.
This project combines web scraping, data preprocessing, regression modeling, and a Flask application for interactive presentation.
My team and I wanted to predict salary ranges from job descriptions and other listing features, and to build an end-to-end ML pipeline from data collection to deployment.
We used BeautifulSoup, a Python library, to scrape job listings from Indeed.com. The scraping process extracted job titles, company names, locations, salary estimates, job descriptions, and other fields.
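To illustrate the extraction step, here is a minimal sketch of parsing job cards with BeautifulSoup. The HTML snippet, the class names (job-card, job-title, company, location, salary), and the helper names are assumptions made for this example; the real Indeed markup is not shown here and differs from this. The sketch parses an inline string rather than a live page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: every class name below is an assumption for
# illustration, not the actual structure of an Indeed results page.
SAMPLE_HTML = """
<div class="job-card">
  <h2 class="job-title">Data Engineer</h2>
  <span class="company">Acme Corp</span>
  <span class="location">Paris</span>
  <span class="salary">$60,000 - $80,000 a year</span>
</div>
<div class="job-card">
  <h2 class="job-title">Backend Developer</h2>
  <span class="company">Globex</span>
  <span class="location">Lyon</span>
</div>
"""


def text_or_none(card, css_class):
    """Return the stripped text of the first matching tag, or None if absent."""
    node = card.find(class_=css_class)
    return node.get_text(strip=True) if node else None


def parse_job_cards(html):
    """Turn each job card into a dict; missing fields become None."""
    soup = BeautifulSoup(html, "html.parser")
    jobs = []
    for card in soup.find_all("div", class_="job-card"):
        jobs.append({
            "title": text_or_none(card, "job-title"),
            "company": text_or_none(card, "company"),
            "location": text_or_none(card, "location"),
            "salary": text_or_none(card, "salary"),
        })
    return jobs


jobs = parse_job_cards(SAMPLE_HTML)
for job in jobs:
    print(job)
```

Returning None for a missing field matters here because many postings carry no salary estimate, and those rows have to be handled explicitly before modeling.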
After scraping, we cleaned and processed the data:
A simple Random Forest Regressor gave good results (screenshot), and we evaluated the model using metrics such as MAE (Mean Absolute Error) and MSE (Mean Squared Error).
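As a rough sketch of this modeling step, the snippet below fits a scikit-learn RandomForestRegressor and reports MAE and MSE on a held-out split. The data is synthetic and the two features (years_experience, is_senior_title) are hypothetical stand-ins; the project's real features, hyperparameters, and scores are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data only: both features and the salary formula are assumptions
# made for this example, not the scraped Indeed dataset.
rng = np.random.default_rng(0)
n_samples = 200
years_experience = rng.integers(0, 15, size=n_samples)
is_senior_title = rng.integers(0, 2, size=n_samples)
X = np.column_stack([years_experience, is_senior_title])
y = (35_000
     + 2_500 * years_experience
     + 10_000 * is_senior_title
     + rng.normal(0, 3_000, size=n_samples))

# Hold out 20% of the rows so the metrics are computed on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# MAE is in salary units; MSE is in squared salary units.
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
print(f"MAE: {mae:.0f}  MSE: {mse:.0f}")
```

MAE reads directly as an average error in salary units, while MSE penalizes large misses more heavily, so reporting both gives a fuller picture of the errors.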
Finally, I built a Flask web application. This project gave me hands-on experience with web scraping and supervised regression.
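For the Flask part, a minimal sketch of how a prediction endpoint could look is given below. The /predict route, the years_experience field, and the predict_salary function are all hypothetical; predict_salary is a fixed formula standing in for the trained regressor, and the actual application's routes and inputs may differ.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_salary(features):
    """Hypothetical stand-in for the trained model: a fixed linear formula."""
    return 35_000 + 2_500 * features["years_experience"]


@app.route("/predict", methods=["POST"])
def predict():
    # silent=True returns None instead of raising on a missing or invalid body.
    payload = request.get_json(silent=True) or {}
    if "years_experience" not in payload:
        return jsonify(error="years_experience is required"), 400
    try:
        years = float(payload["years_experience"])
    except (TypeError, ValueError):
        return jsonify(error="years_experience must be a number"), 400
    salary = predict_salary({"years_experience": years})
    return jsonify(predicted_salary=salary)
```

Flask's built-in test client can exercise the route without starting a server, for example app.test_client().post("/predict", json={"years_experience": 4}).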