We are looking for an experienced web scraping/crawling expert, preferably with experience in natural language processing and web development. The crawler will scrape data from multiple sites and store it in a series of database tables viewable by multiple users. Please see http://connectory.com/profile_view.aspx?connectoryId=3462&pl=p01p02p10 for an example of the profiles we're looking to create.
Additionally, we will be building a login-protected site where people can view and administer this data. We can hire multiple freelancers if the web scraping expert does not have the necessary web development skills.
This contract is part of a much larger project. If the selected freelancer is successful with this pilot project, there will be additional work to follow. In your quote, please include a timeline and cost estimate (e.g. "It will take 100 hours to complete this prototype, 40 hours to debug, etc."). Bids that do not include a specific timeline and cost estimate will not be accepted.
*We're interested in scraping the websites of manufacturing companies, identifying their capabilities (machinery, location, expertise, products manufactured, etc.), and using this information to build profiles. The scraper will need to collect information on several thousand companies, and it must be programmed and documented in English. The profiles will be stored in a database and published to a website where users can view and administer the data.
*The Connectory profiles linked above are the end product we're hoping to create. We want the profiles your scraper creates to capture the same information that those profiles contain. The crawler you create would search the internet, identify manufacturing companies, pull the relevant information from their websites, and place it into Connectory profiles.
*Most company websites follow a similar visual layout: an "About Us" page, a "Products" page, etc. See http://www.gkn.com/aerospace/products-and-capabilities/engine-casing-and-fixed-structure/Pages/default.aspx for an example of what we have in mind. Our hope is that we can program a scraper to start at a website's home page, check whether there is an "About Us" page, pull the first paragraph from that page, and store it in a database that we can query. We expect the quality of the profiles to improve over successive iterations.
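To make the "About Us" idea above concrete for bidders, here is a minimal Python sketch of that one step. It is an illustration under assumptions, not a required design: the function names, the `profiles` table, and its two columns are hypothetical, the link is matched simply by looking for "about" in its text, and the HTML is assumed to be reasonably well formed. Fetching pages, politeness rules (robots.txt, rate limits), and the remaining profile fields are left out.

```python
import sqlite3
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkFinder(HTMLParser):
    """Collects (href, link text) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


class FirstParagraph(HTMLParser):
    """Captures the text of the first non-empty <p> element."""

    def __init__(self):
        super().__init__()
        self.result = None
        self._in_p = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "p" and self.result is None:
            self._in_p = True
            self._buf = []

    def handle_data(self, data):
        if self._in_p:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            self._in_p = False
            text = " ".join("".join(self._buf).split())
            if text:
                self.result = text


def find_about_url(home_html, base_url):
    """Return the absolute URL of the first link whose text mentions 'about'."""
    finder = LinkFinder()
    finder.feed(home_html)
    for href, text in finder.links:
        if "about" in text.lower():
            return urljoin(base_url, href)
    return None


def first_paragraph(page_html):
    """Return the first non-empty paragraph of a page, or None."""
    parser = FirstParagraph()
    parser.feed(page_html)
    return parser.result


def save_summary(conn, company_url, summary):
    """Store the paragraph in a hypothetical 'profiles' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS profiles "
        "(company_url TEXT PRIMARY KEY, about TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO profiles (company_url, about) VALUES (?, ?)",
        (company_url, summary),
    )
    conn.commit()
```

In use, the home page HTML would be passed to `find_about_url`, the page at the returned URL would be downloaded and passed to `first_paragraph`, and the result would be written with `save_summary`. Sites that label the page differently ("Company", "Who We Are") or put the description outside a `<p>` tag would need extra rules, which is where the iterations mentioned above would come in.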