The application is a desktop scraper with a user-friendly interface, an MS SQL database, and a jobs crawler that collects job-related data from https://www.glassdoor.com/. The scraped data is stored in the application's database and exported in CSV format.
The project was fairly complex because https://www.glassdoor.com/ uses browser identification and responds with a CAPTCHA when a single user sends a large number of queries. The problem was solved with IP rotation, combined with client identification for the cases where a CAPTCHA still has to be handled. A dedicated service was created for this purpose. At a predefined interval, the service scans the web for free proxies, checks whether each one can be used with each target website, and saves the working proxy addresses to the database for later use.
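The proxy service described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the function names (http_check, save_working_proxies, rotate), the table layout, and the use of Python with an in-memory SQLite database in place of MS SQL are all hypothetical. Proxy discovery and the scheduler that runs the check at the predefined interval are left out.

```python
import http.client
import itertools
import sqlite3
import urllib.request
from typing import Callable, Iterable, Iterator

# Hypothetical checker signature: (proxy_url, target_url) -> usable or not.
Checker = Callable[[str, str], bool]


def http_check(proxy: str, target_url: str, timeout: float = 5.0) -> bool:
    """Fetch target_url through the proxy; True only on an HTTP 200 reply."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(target_url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, HTTP errors, bad replies.
        return False


def save_working_proxies(
    db: sqlite3.Connection,
    candidates: Iterable[str],
    target_url: str,
    check: Checker = http_check,
) -> int:
    """Test each candidate against one target site and store the ones that work."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS proxies ("
        "address TEXT, target TEXT, PRIMARY KEY (address, target))"
    )
    saved = 0
    for proxy in candidates:
        if check(proxy, target_url):
            db.execute(
                "INSERT OR REPLACE INTO proxies VALUES (?, ?)",
                (proxy, target_url),
            )
            saved += 1
    db.commit()
    return saved


def rotate(db: sqlite3.Connection, target_url: str) -> Iterator[str]:
    """Cycle endlessly through the proxies stored for target_url."""
    rows = db.execute(
        "SELECT address FROM proxies WHERE target = ? ORDER BY address",
        (target_url,),
    ).fetchall()
    return itertools.cycle([row[0] for row in rows])


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")  # stand-in for the MS SQL database
    # Placeholder addresses; a real run would use proxies found on the web.
    found = ["http://10.0.0.1:8080", "http://10.0.0.2:3128"]
    count = save_working_proxies(db, found, "https://www.glassdoor.com/")
    print(f"{count} working proxies saved")
```

The crawler would then take the next address from rotate() before each request, so that consecutive queries reach the site from different IP addresses. Passing the checker as a parameter keeps the storage and rotation logic testable without network access. If no proxy passed the check, the iterator returned by rotate() is empty, so callers should handle that case.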
Tools and Technologies