We want a program that scrapes/extracts data from various websites. The program will go to a website and run a series of search queries. For each result, it will look at the used price of the book. If the used price is greater than $40, it should click on the link and extract the ISBN. If it is less than $40, the program should move on to the next book. It will continue this process until it reaches the end of the books/searches and then put the information in an Excel file. You will need to use a proxy list to continuously change the "location" of the program.
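To make the $40 rule concrete, here is a minimal Python sketch of the filtering step only. It is an illustration, not the required design: the `used_price` and `link` field names, the price format, and the choice to treat exactly $40 as "not above" are all assumptions, and fetching pages through the proxy list is left out.

```python
from decimal import Decimal, InvalidOperation

# From the description: only follow a book's link when its used price is above $40.
PRICE_THRESHOLD = Decimal("40")


def parse_price(text):
    """Turn a string such as '$1,234.50' into a Decimal, or None if it cannot be parsed."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None


def links_to_follow(listings):
    """Return the links of listings whose used price is strictly above the threshold.

    Each listing is assumed (hypothetically) to be a dict with
    'used_price' and 'link' keys taken from a search-results page.
    """
    selected = []
    for listing in listings:
        price = parse_price(listing["used_price"])
        if price is not None and price > PRICE_THRESHOLD:
            selected.append(listing["link"])
    return selected
```

Listings with an unreadable price are skipped here rather than followed; that choice would need to be confirmed.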
The program will then take the ISBNs from the Excel file, go to a second website, and search for the price of each ISBN one by one. Next, the program will look at the lowest price and check the merchant/website. The program should compare the new and used prices. If new is cheaper, it should select the price taking the rating into account as well. If used is cheaper, the program should check the condition of the lowest-priced book. If the condition is "brand new", "new", "like new", "mint", "very good", or "good", the program should move on to the available seller's comments. (If the condition does not match, it should move on to the next result.)
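The condition check for used books could look roughly like the Python sketch below. The accepted conditions come from the description above; the `price` and `condition` field names, the case-insensitive matching, and the cheapest-first ordering are assumptions. The rating-based choice for new books is not covered, since its rules are not specified here.

```python
# Conditions listed in the description as acceptable for a used copy.
ACCEPTED_CONDITIONS = {"brand new", "new", "like new", "mint", "very good", "good"}


def condition_is_acceptable(condition):
    """Compare case-insensitively and ignore extra whitespace (an assumption)."""
    normalized = " ".join(condition.lower().split())
    return normalized in ACCEPTED_CONDITIONS


def acceptable_used_offers(offers):
    """Yield used offers from cheapest to most expensive, skipping bad conditions.

    Each offer is assumed (hypothetically) to be a dict with
    'price' and 'condition' keys.
    """
    for offer in sorted(offers, key=lambda o: o["price"]):
        if condition_is_acceptable(offer["condition"]):
            yield offer
```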
However, when the program extracts the comments, it should look for specific words in the description. If any of them are present, the program should move on to the next search result and repeat the process. If the next one is good (does not contain any "blacklisted words"), the program should extract the information (price, condition, and the seller's comments) and put it in the same Excel file as the ISBNs.
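As an illustration of the blacklist step, the following Python sketch returns the first offer whose comments contain none of the blacklisted words. The matching rule (case-insensitive, whole words or phrases) and the `comments` field name are assumptions; the real word list would come from the text file we provide.

```python
import re


def build_blacklist_pattern(words):
    """Compile one case-insensitive pattern matching any blacklisted word or phrase.

    Returns None when the list is empty, meaning nothing is blacklisted.
    """
    escaped = [re.escape(w.strip()) for w in words if w.strip()]
    if not escaped:
        return None
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def first_clean_offer(offers, pattern):
    """Return the first offer whose comments contain no blacklisted word, else None.

    Each offer is assumed (hypothetically) to be a dict with a 'comments' key.
    """
    for offer in offers:
        if pattern is None or not pattern.search(offer["comments"]):
            return offer
    return None
```

The offers passed in would already be in the order the program wants to try them, for example cheapest first.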
Lastly, we will specify another website from which to scrape the sell price for the same ISBN.
That is the whole process. The program interface should be made as follows: the program should be extremely simple. It should include an option to search all subjects (biology, accounting, etc.). We will provide you with a list of subjects for the program to search and also a text file with the "blacklisted words". The program will put the information from each of the subjects into the previously mentioned Excel file, with the extracted information in individual tabs. For each subject, the program will extract all of the ISBNs from the first site and the information for each of those ISBNs. The program will also include a button to run it. I understand that the description seems complex, but it is much simpler than it sounds. Please feel free to ask any questions you may have. I will provide additional information on the specifics of the program once you apply.
We have been working with a programmer who has not been responding on a consistent basis, and I will provide you with the old program when we hire you. Please note that it is a Perl-based program that needs a few changes, so you can use it as a guide to build a better program.
Another condition is that you must be extremely responsive in the future. The program will require updates, and we will pay you for each one, but if you take more than 48 hours to respond, we will hire someone else.