Please make sure to read the full description. This is the entire project requirement.
We need a PHP programmer with web scraping experience. We need to scrape a distributor's website. This is completely legitimate: the distributor does not provide a product feed, but has given us permission to copy data from their site.
The script does not need to have a web interface. We simply need a script to run as a cron job and do the following:
1) Visit the site and spider through the pages.
2) Pull information from each page and store it in a MySQL database. The information required is product name, SKU, availability, product image URL, product description, price, product URL, etc.
3) If the product already exists in the database, update it.
4) If the product does not exist in the database, add it.
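To illustrate steps 3 and 4, here is a minimal sketch of the update-or-add behavior we have in mind. It is not a required design. The class name `ProductModel`, the `products` table, its column names, and the sample values are all assumptions made for the example. It uses an in-memory SQLite connection through PDO only so that it runs without a server; the real script must use MySQL. It assumes PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical DB model for a scraped product (all names are assumptions).
 */
final class ProductModel
{
    public function __construct(private PDO $db) {}

    /** Create the illustrative table if it does not exist yet. */
    public function createTable(): void
    {
        $this->db->exec(
            'CREATE TABLE IF NOT EXISTS products (
                sku VARCHAR(64) PRIMARY KEY,
                name VARCHAR(255) NOT NULL,
                retail_price DECIMAL(10,2) NOT NULL,
                last_updated VARCHAR(19) NOT NULL
            )'
        );
    }

    /** Update the product if its SKU exists, otherwise add it. Returns the action taken. */
    public function save(string $sku, string $name, float $price, string $now): string
    {
        $find = $this->db->prepare('SELECT 1 FROM products WHERE sku = ?');
        $find->execute([$sku]);
        $exists = $find->fetchColumn() !== false;

        // Both statements take their parameters in the same order.
        $sql = $exists
            ? 'UPDATE products SET name = ?, retail_price = ?, last_updated = ? WHERE sku = ?'
            : 'INSERT INTO products (name, retail_price, last_updated, sku) VALUES (?, ?, ?, ?)';
        $this->db->prepare($sql)->execute([$name, $price, $now, $sku]);

        return $exists ? 'updated' : 'added';
    }
}

// In-memory SQLite stands in for MySQL here (assumption, for illustration only).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$model = new ProductModel($db);
$model->createTable();

$first  = $model->save('ABC-1', 'Widget', 9.99, '2024-01-01 00:00:00');
$second = $model->save('ABC-1', 'Widget v2', 10.49, '2024-01-02 00:00:00');

echo $first, ' ', $second, PHP_EOL; // prints "added updated"
```

The returned action string is the kind of value we would expect to be passed on to the log.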
A second cron job is required to clean our database. It will run through all the products that have not been updated in the last day and check the same site to see whether each product still exists (you could use the same models). If a product has been removed from the site, it must be marked inactive in our database.
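As a rough sketch of the cleanup logic described above, the selection could look something like this. The function name `findRemovedSkus`, the array keys, the example URLs, and the `$stillExists` callback are hypothetical. In the real script the callback would request the product page from the distributor's site; here it is stubbed so the example runs offline.

```php
<?php
declare(strict_types=1);

/**
 * Return the SKUs that should be marked inactive.
 *
 * @param array<int, array{sku: string, url: string, last_updated: string}> $products
 * @param callable(string): bool $stillExists hypothetical check that the product page is still live
 * @return string[]
 */
function findRemovedSkus(array $products, callable $stillExists, DateTimeImmutable $now): array
{
    $cutoff  = $now->modify('-1 day');
    $removed = [];

    foreach ($products as $product) {
        $lastUpdated = new DateTimeImmutable($product['last_updated']);
        if ($lastUpdated >= $cutoff) {
            continue; // refreshed by the main cron within the last day
        }
        if (!$stillExists($product['url'])) {
            $removed[] = $product['sku']; // stale and gone from the site
        }
    }

    return $removed;
}

// Sample rows and a stubbed site check (illustrative values only).
$products = [
    ['sku' => 'A1', 'url' => 'https://distributor.example/a1', 'last_updated' => '2024-01-03 06:00:00'],
    ['sku' => 'B2', 'url' => 'https://distributor.example/b2', 'last_updated' => '2024-01-01 00:00:00'],
    ['sku' => 'C3', 'url' => 'https://distributor.example/c3', 'last_updated' => '2024-01-01 00:00:00'],
];
$stillExists = fn (string $url): bool => $url !== 'https://distributor.example/c3';

$removed = findRemovedSkus($products, $stillExists, new DateTimeImmutable('2024-01-03 12:00:00'));
echo implode(',', $removed), PHP_EOL; // prints "C3"
```

Passing the site check in as a callback keeps the selection rule separate from the scraping code, so both cron jobs can share the same models.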
These are the fields of the database:
product retail price
product distributor price
product image url
date last updated
date last checked
(This is most of them; we may need to add a couple more.)
All actions on the DB must be logged. A simple log in a text file is sufficient (a MySQL log table is also acceptable). The following must be logged:
2) Product SKU
3) Action taken: added, updated, or removed (inactive).
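For the text-file option, a log writer along these lines would be fine. This is only a sketch: the class name `ActionLogger`, the tab-separated layout, and the leading timestamp column are assumptions and not part of the requirement as listed. It assumes PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical append-only text logger for DB actions.
 */
final class ActionLogger
{
    private const ACTIONS = ['added', 'updated', 'removed'];

    public function __construct(private string $path) {}

    /** Append one line to the log file and return the line that was written. */
    public function log(string $sku, string $action, DateTimeImmutable $when): string
    {
        if (!in_array($action, self::ACTIONS, true)) {
            throw new InvalidArgumentException("Unknown action: $action");
        }

        // Timestamp column is an assumption; SKU and action come from the requirement.
        $line = sprintf("%s\t%s\t%s", $when->format('Y-m-d H:i:s'), $sku, $action);
        file_put_contents($this->path, $line . PHP_EOL, FILE_APPEND | LOCK_EX);

        return $line;
    }
}

// Write to a temporary file so the example has no side effects on a real log.
$logFile = tempnam(sys_get_temp_dir(), 'scrape_log_');
$logger  = new ActionLogger($logFile);

$line = $logger->log('ABC-1', 'added', new DateTimeImmutable('2024-01-01 00:00:00'));
$logger->log('ABC-1', 'removed', new DateTimeImmutable('2024-01-02 00:00:00'));

echo file_get_contents($logFile);
```

`LOCK_EX` is used so that the two cron jobs do not interleave their lines if they happen to run at the same time.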
The code must be object-oriented and documented, with DB models. This is a fixed-price bid, with no hourly rate. Full payment will be made only when the project is successfully completed. My budget is a desired budget; bid appropriately based on the amount you require to complete the job successfully. No partial payments, no up-front bids accepted.
The script will be installed on my web server and will run for one week to verify functionality before payment is released. Any errors, bugs, or incorrect functionality relative to the requirements above must be corrected before payment.