Projects

Ingreed

Scrape data, predict your IT salary, get rich!

BeautifulSoup Flask MongoDB
2022 - Simplon project

About

The main goal was to predict IT salaries for a user, using data collected from Indeed job postings.

This project combines web scraping, data preprocessing, regression modeling, and a Flask application for interactive presentation.

My team and I wanted to predict salary ranges from job descriptions and other listing features, and to build an end-to-end ML pipeline from data collection to deployment.

We used BeautifulSoup, a Python library, to scrape job listings from Indeed.com. The scraping process extracted job titles, company names, locations, salary estimates, job descriptions, and more.
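As a rough illustration of that extraction step, here is a minimal sketch that parses job cards with BeautifulSoup. The HTML snippet, the class names (job-card, job-title, company, location, salary) and the helper names are made up for the example; they are not Indeed's real markup, which differs and changes over time, and this is not the project's actual scraper.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration only; a real listing page looks different.
SAMPLE_HTML = """
<div class="job-card">
  <h2 class="job-title">Data Analyst</h2>
  <span class="company">Acme</span>
  <span class="location">Lille</span>
  <span class="salary">30 000 - 35 000 per year</span>
</div>
<div class="job-card">
  <h2 class="job-title">Python Developer</h2>
  <span class="company">Globex</span>
  <span class="location">Paris</span>
</div>
"""


def text_or_none(card, css_class):
    """Return the stripped text of the first matching tag, or None if absent."""
    node = card.find(class_=css_class)
    return node.get_text(strip=True) if node else None


def parse_job_cards(html):
    """Turn every (hypothetical) job card in the page into a plain dict."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        {
            "title": text_or_none(card, "job-title"),
            "company": text_or_none(card, "company"),
            "location": text_or_none(card, "location"),
            "salary": text_or_none(card, "salary"),
        }
        for card in soup.find_all("div", class_="job-card")
    ]
```

Returning None for a missing field, rather than raising, keeps listings without a salary estimate in the dataset so they can be handled later during cleaning.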

After scraping, we cleaned and processed the data.
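The exact cleaning steps are not listed here. One typical step for this kind of data, shown only as a hypothetical sketch, is turning a free-text salary estimate into a single number. The function name parse_salary and the sample strings are assumptions for the example, not the project's code.

```python
import re
from typing import Optional


def parse_salary(text: str) -> Optional[float]:
    """Return the midpoint of a salary range, or None if no number is found.

    Assumption: thousands are separated by a comma or a space. The pay
    period (hour, month, year) is NOT normalized in this sketch.
    """
    # Drop thousands separators: "50,000" or "50 000" becomes "50000".
    cleaned = re.sub(r"(?<=\d)[,\s](?=\d{3}\b)", "", text)
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", cleaned)]
    if not numbers:
        return None
    bounds = numbers[:2]  # lower and upper bound, or a single value
    return sum(bounds) / len(bounds)
```

For example, parse_salary("$50,000 - $70,000 a year") gives 60000.0, and a listing with no figure at all gives None so it can be dropped or imputed.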

We got good results with a simple Random Forest Regressor (screenshot), and we evaluated the model using metrics such as MAE (Mean Absolute Error) and MSE (Mean Squared Error).
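To make the two metrics concrete, here is a small plain-Python sketch of what they compute. The function names mae and mse and the salary figures are invented for the example; they are not the project's results, and the project most likely relied on a library implementation instead.

```python
def mae(y_true, y_pred):
    """Mean Absolute Error: average size of the errors, in salary units."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Squared Error: average squared error, penalizes large misses."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented values, for illustration only.
actual = [30000, 40000, 50000]
predicted = [32000, 39000, 44000]

print(mae(actual, predicted))  # 3000.0
print(mse(actual, predicted))
```

MAE reads directly as "off by about 3,000 on average", while MSE is dominated by the single 6,000 miss, which is why the two are usually reported together.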

Finally, I built a Flask web application. This project gave me hands-on experience with web scraping and supervised regression.
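A Flask application serving predictions could look roughly like the sketch below. The /predict route, the JSON field names and the predict_salary placeholder are assumptions made for the example; the real application's routes and model loading are not described here.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_salary(title, location):
    """Hypothetical stand-in for the trained model; returns a fixed value."""
    return 40000.0


@app.route("/predict", methods=["POST"])
def predict():
    # Accept a JSON body; treat a missing or invalid body as empty.
    payload = request.get_json(silent=True) or {}
    title = payload.get("title")
    location = payload.get("location")
    if not title or not location:
        return jsonify(error="title and location are required"), 400
    return jsonify(predicted_salary=predict_salary(title, location))
```

In a real deployment the placeholder would be replaced by a call to the fitted regressor, loaded once at startup rather than on every request.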

Screenshots:

Source code