We are looking for developers familiar with Python and the Scrapy library to help scrape product data from our partner ecommerce websites into an XML format that will be imported into our website.
Initially, we want you to write scraping scripts for 5-10 of our partner ecommerce websites over the next week.
The developer(s) who produce the best work will be asked to write scraping scripts for a further 190 sites over the next 3 months.
We will give you our prototype Scrapy spider and some examples of the XML output, as well as a detailed outline of how to decide which fields to populate for which sites.
Contractors must have solid experience with Python and web scraping, ideally with the Scrapy library. You will need to be familiar with using XPath or a similar query language to parse HTML.
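To give a rough sense of the kind of work involved, here is a minimal sketch of XPath-based extraction followed by XML output. It uses Python's standard-library xml.etree.ElementTree (which supports only a limited XPath subset) instead of Scrapy so that it runs without extra installs. The sample page, the field names and the XML layout are illustrative assumptions only, not our prototype spider or our real output format.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed product page fragment (not taken from any real site).
SAMPLE_PAGE = """
<html>
  <body>
    <div class="product">
      <h1 class="name">Canvas Tote Bag</h1>
      <a class="store" href="http://example.com/store">Example Store</a>
      <img class="photo" src="http://example.com/img/tote.jpg" />
    </div>
  </body>
</html>
"""


def extract_product(page_source):
    """Pull a few fields out of the page using XPath expressions."""
    root = ET.fromstring(page_source)
    product = root.find(".//div[@class='product']")
    store_link = product.find("./a[@class='store']")
    return {
        "name": product.findtext("./h1[@class='name']"),
        "store_name": store_link.text,
        "store_url": store_link.get("href"),
        "image_url": product.find("./img[@class='photo']").get("src"),
    }


def to_xml(fields):
    """Serialize the extracted fields as a <product> element (assumed layout)."""
    product = ET.Element("product")
    for key, value in fields.items():
        ET.SubElement(product, key).text = value
    return ET.tostring(product, encoding="unicode")


fields = extract_product(SAMPLE_PAGE)
print(to_xml(fields))
```

Real partner sites will rarely be well-formed XML, so in practice the same XPath expressions would run through Scrapy's selectors or a tolerant HTML parser instead.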
Please answer the following questions to be considered for this position. Form letters will be rejected without review.
1) How much experience do you have with web scraping? Describe some of the tasks you have done previously.
2) How much experience do you have programming in Python? Outline some of the projects you have worked on.
3) Write a series of XPath selectors that would select the following information from the product linked off the svpply.com homepage:
- Product name
- Store name
- Store URL
- Product image src URL
- Product category
- Related products
4) Our prototype Scrapy spiders are quite basic and hardcoded. How could you help us reuse code better between spiders, so that the approach scales to hundreds of different websites, all with slightly different HTML structures?
5) How would you go about scraping product information from a site that uses AJAX to load some of the product data?
6) What kind of availability do you have over the next couple of weeks? Would you be able to commit to a larger involvement in the project over the subsequent two to three months if our initial collaboration is successful?
7) Do you have experience with version control and Git?
8) How do you ensure good communication with your clients? Outline your process for keeping your client in the loop and ensuring things run smoothly.
9) Why should we hire you?
RULES AND EXPECTATIONS:
We expect contractors to check in daily and update our company's task management system as they finish tasks. We expect the working copy of your code to be checked in to our Git repository daily, or more often as you complete code. Also, please be aware that we do not pay for offline hours. You must do all work while connected to the internet with the time tracker on.
Thank you for considering this job. We look forward to working with you and, hopefully, providing a larger body of similar work to the right candidate.