We're looking for a developer (or agency) with expertise in scraping data from a number of public websites.
The developer will deploy and manage a system to scrape available data on tens of thousands of professionals (accountants, attorneys, doctors, etc.) from dozens of websites. That data then needs to be organized in a database (preferably MySQL).
To end users, the data will ultimately appear as "listings" of professionals within a searchable directory. Each listing will represent one individual professional and consist of a dozen or so fields (including "name", "address", and "category of expertise", among others).
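As a rough illustration of what one listing and a directory search could look like, here is a minimal sketch. It shows only the three fields named above, and it uses Python's built-in SQLite module as a stand-in so it runs without a server; the table name, column names, and helper functions are hypothetical, and a real deployment would target MySQL as preferred.

```python
import sqlite3
from dataclasses import astuple, dataclass
from typing import List


@dataclass
class Listing:
    """One professional; only the three fields named in the posting are shown."""
    name: str
    address: str
    category: str  # "category of expertise"


def create_store() -> sqlite3.Connection:
    # In-memory SQLite is an assumption made for the sketch, not the target database.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE listings ("
        "id INTEGER PRIMARY KEY, "
        "name TEXT NOT NULL, address TEXT, category TEXT)"
    )
    return conn


def add_listing(conn: sqlite3.Connection, listing: Listing) -> None:
    # Parameterized query, so scraped text is never interpolated into SQL.
    conn.execute(
        "INSERT INTO listings (name, address, category) VALUES (?, ?, ?)",
        astuple(listing),
    )


def search(conn: sqlite3.Connection, term: str) -> List[Listing]:
    # Simple substring match on name or category, ordered by name.
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT name, address, category FROM listings "
        "WHERE name LIKE ? OR category LIKE ? ORDER BY name",
        (pattern, pattern),
    )
    return [Listing(*row) for row in rows]
```

The remaining fields would be added as further columns once the source websites and their available data are known.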
THE FOLLOWING SKILLS ARE BENEFITS, BUT NOT REQUIREMENTS:
Experience scraping large sites that employ anti-scraping measures, and bypassing those measures.
Scraping fast, using threading (or a better method; we're open to recommendations).
Ability to support proxies (both to bypass anti-scraping systems and scrape faster).
Scraping data based on XPath and saving it to a database (Elasticsearch, MongoDB, PostgreSQL, or MySQL).
Experience developing user interfaces to a database.
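To make the XPath and threading items above concrete, here is a minimal sketch using only Python's standard library. The XPath expressions, field names, and the `fetch` callable are hypothetical placeholders. `xml.etree.ElementTree` supports only a limited XPath subset and needs well-formed markup, so a real scraper would more likely rely on an existing open-source framework, in line with the requirement not to build from scratch.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List

# Hypothetical XPath per field; real expressions depend on each site's markup.
FIELD_XPATHS = {
    "name": ".//h1[@class='name']",
    "address": ".//p[@class='address']",
    "category": ".//span[@class='category']",
}


def parse_listing(markup: str) -> Dict[str, str]:
    """Extract one record from a well-formed page using the XPaths above."""
    root = ET.fromstring(markup)
    record = {}
    for field, xpath in FIELD_XPATHS.items():
        node = root.find(xpath)
        # A missing element yields an empty string rather than an error.
        record[field] = (node.text or "").strip() if node is not None else ""
    return record


def scrape_all(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    workers: int = 8,
) -> List[Dict[str, str]]:
    """Fetch pages concurrently with a thread pool, then parse each one.

    `fetch` is supplied by the caller; it is where an HTTP client,
    proxy rotation, and retries would live.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pages = pool.map(fetch, urls)  # results come back in input order
        return [parse_listing(page) for page in pages]
```

Passing `fetch` in as a parameter keeps the download strategy separate from parsing, so proxies or a different concurrency model can be swapped in without touching the XPath logic.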
THE FOLLOWING ARE REQUIREMENTS:
The proposed scraping system SHOULD NOT be built from scratch. There are plenty of existing open-source solutions available.
Please DO NOT bid if you don't have relevant scraping experience.