
Web Crawler Jobs

50 jobs were found based on your criteria.

Fixed-Price - Intermediate ($$) - Est. Budget: $500 - Posted
Looking for someone expert in scraping data from various websites and saving the data to MySQL/CSV. The script has to be Python or PHP. If Python, it should run on a Linux server alongside a LAMP-based PHP website. If you are really good, I don't mind offering you full-time work, as I will need hundreds of scraping tasks over the next 3 months. This job is for 6 websites, but I might need some other small scrapers before I award the bigger project. Please answer these questions:
1. Write 'ddonk' before your application.
2. Let me know whether you prefer PHP or Python.
3. Mention which websites you have scraped: Google, LinkedIn, Amazon, Yellow Pages?
4. Show me a link to any web application that does scraping, if you have built one.
5. Do you have a full-time job and freelance part time, or are you a full-time freelancer?
  • Number of freelancers needed: 3
Skills: Web Crawler Data mining Data scraping Django
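The post above asks for a Python script that scrapes page data and saves it to CSV. A minimal sketch of that pipeline, using only the standard library: the tag and class names (`item`, `name`, `price`) and the sample HTML are hypothetical placeholders, and a real target site would need its own selectors and a live fetch step.

```python
import csv
import io
from html.parser import HTMLParser

class ProductParser(HTMLParser):
    """Collect name/price pairs from <span class="name">/<span class="price">
    elements inside <div class="item"> blocks (hypothetical markup)."""
    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None
        self._current = {}

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "")
        if tag == "div" and "item" in classes:
            self._current = {}          # new product block starts
        elif tag == "span" and classes in ("name", "price"):
            self._field = classes       # remember which field the text belongs to

    def handle_data(self, data):
        if self._field:
            self._current[self._field] = data.strip()
            self._field = None
            if len(self._current) == 2:
                self.rows.append(self._current)
                self._current = {}

def scrape_to_csv(html, out):
    """Parse the HTML fragment and write the extracted rows as CSV."""
    parser = ProductParser()
    parser.feed(html)
    writer = csv.DictWriter(out, fieldnames=["name", "price"])
    writer.writeheader()
    writer.writerows(parser.rows)
    return parser.rows

sample = '<div class="item"><span class="name">Widget</span><span class="price">9.99</span></div>'
buf = io.StringIO()
rows = scrape_to_csv(sample, buf)
```

Saving to MySQL instead of CSV would swap the `DictWriter` for `INSERT` statements, but the parse step stays the same.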
Fixed-Price - Expert ($$$) - Est. Budget: $50 - Posted
Hi, our insurance brokerage needs someone to scrape the Contractors State License Board of California's website (www.cslb.ca.gov) to obtain the Workers' Compensation data listed below and format it into an Excel spreadsheet. We are only looking for businesses with "active" licenses. We already have an Excel list we can provide as an example; it just needs to be updated with current data, as it has been a while since the last update. I plan to ask for an updated list every 2-3 months and would like to find someone I can consistently go to for updates. Please let us know if you need any further details before applying and bidding. Fields needed: Contractors State License Number, Business Name, Classification Type (Plumbing, Landscaping, etc.), First & Last Name of Owner (contact), Policy Number, Current Carrier, Renewal Date, Address, City, Zip Code, State, Phone Number(s), and emails if available.
Skills: Web Crawler Data mining
Fixed-Price - Intermediate ($$) - Est. Budget: $250 - Posted
I need a freelancer to develop a web crawler that scrapes data off VRBO for specific geographic markets and stores listing data and calendar availability in Excel format. The scraper must meter itself to avoid being refused by the website for making too many requests to the server. I don't want my IP to get blocked!
A. The crawler crawls VRBO.com to identify all geographic destinations on the site.
B. The software presents the geography tree and allows the user to highlight which markets should be crawled.
C. The user selects one of two crawl functions and a third publication function:
1. Crawl: Inventory Pull
2. Crawl: Availability Check
3. Publish: Inventory statistics by market
1. Inventory Pull function
  • The crawler goes to VRBO and scrapes all data about all properties for the user-selected geographies. Examples of data for each listing:
    - Multiple levels (parents and children) of geographies (e.g., Hawaii -> Maui -> South -> Kihei)
    - Description: name of property (e.g., Grand Champion), listing identification number, listing name
    - Property type: condo, house, 1 bedroom, 1,442 sq. ft., etc.
    - Unit number (for condos; e.g., Grand Champion Unit #75)
    - Number of bedrooms, number of bathrooms, how many people it sleeps
    - Size of home
    - Minimum stay requirements
    - Low-season cost per night, high-season cost per night, low-season cost per week, high-season cost per week, holiday cost per night, holiday cost per week, etc.
    - Dates that define the low, high, and holiday seasons
    - URL link to the details page of the property
    - Number of reviews
    - Review rating
    - Information on amenities (check boxes)
    - Information on activities (check boxes)
    - Contact information of the owner
    - Tax ID
    - Date the calendar was last updated by the owner
  • The crawler also downloads the current availability calendar for the property and calculates a vacancy rate by month for the upcoming 12 months.
    - Calendar data needs to be archivable so that future downloads do not overwrite historical vacancy information.
    - We want the ability to track how quickly specific properties book over time.
  • Data is saved in Excel format.
2. Availability Check function
  • The crawler goes to VRBO to check vacancy for each of the user-selected markets.
  • The crawler will check one-week increments for 52 weeks into the future for the selected markets.
  • The crawler will record how many properties of each bedroom count (studio, 1 bed, 2 bed, 3 bed, 4 bed, 5 bed, 6+ beds) are currently available in the markets for each of the 52 weeks.
  • The spider will compare availability for each week to the total property count to calculate the current vacancy rate by market by week.
3. Publish Inventory Statistics by market
  • We need to discuss the best approach with our selected developer.
  • Our goal is an easy way to read summaries of the data and drill into the details when desired.
Skills: Web Crawler Data scraping Web scraping
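The post above requires the scraper to "meter itself" so the site doesn't block the crawler's IP. A minimal sketch of such a throttle, assuming a fixed minimum delay between requests (the interval value here is an arbitrary example, not a VRBO-specific number):

```python
import time

class Throttle:
    """Enforce a minimum interval between successive requests."""
    def __init__(self, min_interval=2.0):
        self.min_interval = min_interval
        self._last = None

    def wait(self):
        # Sleep just long enough that at least min_interval seconds
        # pass between consecutive calls.
        now = time.monotonic()
        if self._last is not None:
            elapsed = now - self._last
            if elapsed < self.min_interval:
                time.sleep(self.min_interval - elapsed)
        self._last = time.monotonic()

# Demo with a short interval so it runs quickly:
throttle = Throttle(min_interval=0.1)
start = time.monotonic()
for _ in range(3):
    throttle.wait()   # a real crawler would fetch a page here
elapsed = time.monotonic() - start
```

More robust crawlers often add jitter or back off exponentially when the server returns errors, but a fixed delay like this already satisfies the "don't hammer the server" requirement.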
Hourly - Expert ($$$) - Est. Time: More than 6 months, Less than 10 hrs/week - Posted
We are looking for a developer / data scientist with experience building web crawlers that scrape and index data continuously. Sample use case:
1. Different eCommerce links like this will be provided: http://www.amazon.com/b/ref=s9_acss_bw_cg_cattiles_2a1?_encoding=UTF8&ie=UTF8&node=3017941&pf_rd_m=ATVPDKIKX0DER&pf_rd_s=merchandised-search-6&pf_rd_r=0JM08XCZ52G2S9XQKC2N&pf_rd_t=101&pf_rd_p=2265474122&pf_rd_i=502394
2. The crawler will crawl the page to find new products, read the content on each product page, and extract the following: product name, seller's name, number of customer reviews, number of answered questions, and total number of star ratings.
3. All of this information needs to be stored in a database. The schema can be whatever you think is appropriate.
4. Continuously/periodically (i.e., it doesn't need to be running all the time, but it needs to be an automated process), the web crawler needs to monitor all links to identify any new products that have come out and bring them into our database, without creating duplicates.
NB: The scrapers/crawlers ideally should not violate any terms of service, if possible.
For the above use case, please provide a rough description of the following:
1. Whether you've ever worked on a similar project.
2. How you would approach this project, citing any existing libraries you would utilize, if appropriate.
3. What languages you would develop in, and your rationale for doing so.
4. Risks or things you think could be tricky or that we need to watch for.
5. An estimate of (a) the total number of hours of work, and (b) the total duration to complete the project.
6. If you have any experience in machine/deep learning, Bayesian prediction, or other data science methods, please describe it briefly as well.
Thanks
Skills: Web Crawler Data mining Data scraping Web scraping
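The key constraint in the post above is re-crawling "without creating duplicates". A common way to handle that is to key each product on a stable unique identifier so a re-crawl updates the existing row instead of inserting a new one. A minimal sketch with SQLite, where the schema, column names, and the use of Amazon's ASIN as the key are all illustrative assumptions, not requirements from the post:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE products (
        asin TEXT PRIMARY KEY,   -- stable product ID prevents duplicates
        name TEXT,
        seller TEXT,
        reviews INTEGER,
        answered_questions INTEGER,
        star_rating REAL
    )
""")

def upsert(product):
    # INSERT OR REPLACE rewrites the row when the ASIN already exists,
    # so repeated crawls refresh the data instead of duplicating it.
    conn.execute(
        """INSERT OR REPLACE INTO products
           VALUES (:asin, :name, :seller, :reviews,
                   :answered_questions, :star_rating)""",
        product,
    )

item = dict(asin="B000TEST", name="Example", seller="ACME",
            reviews=10, answered_questions=2, star_rating=4.5)
upsert(item)
upsert({**item, "reviews": 12})  # simulated re-crawl: updates, no duplicate row
count = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
reviews = conn.execute("SELECT reviews FROM products").fetchone()[0]
```

Note that `INSERT OR REPLACE` rewrites the whole row; if you need to keep some columns untouched on update, SQLite's `ON CONFLICT ... DO UPDATE` clause (available in newer versions) is the finer-grained alternative.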
Fixed-Price - Expert ($$$) - Est. Budget: $50 - Posted
We are looking for a data research expert who can extract company details from the given sites. Your job is to deliver Java code that extracts the following details from the web links given here.
A set of URLs for different areas:
http://www.immobilienscout24.de/anbieter/suchen/Baden-Wuerttemberg/Boeblingen-Kreis/Herrenberg?geocodeid=1276001006010&focustype=1,2,3,6,7&includeOperationAreas=true
http://www.immobilienscout24.de/anbieter/suchen/Baden-Wuerttemberg/Ostalbkreis/Schwaebisch-Gmuend?geocodeid=1276001028034&focustype=1,2,3,6,7&includeOperationAreas=true
A set of URLs for different areas:
http://www.meinestadt.de/herrenberg/berufe-branchen/handwerk
http://www.meinestadt.de/herrenberg/berufe-branchen/pflegeberufe
Your delivery has to be in the following format: https://docs.google.com/spreadsheets/d/17Ug0grVXxb7dZGv7cQgGjd2tzZHeDCN7PTNDIMbVkuw/edit?usp=sharing
Please ignore the test/checking tasks of the home pages. We need the full address, phone, email, and website address of the entries. Please ignore our listed budget and place your bid in USD, along with a brief description of which technology you prefer for the solution. We have a lot more work to do, as you may already see.
Skills: Web Crawler Data scraping Web scraping
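The post above asks for Java, but the extraction step it describes — pulling addresses, phone numbers, and emails out of page text — is language-agnostic. A minimal sketch of the contact-detail extraction in Python, where the regex patterns are deliberately simplified examples (production patterns would need tuning per site) and the sample text is invented:

```python
import re

# Simplified patterns: a basic email shape and a loose phone shape
# that tolerates country codes, spaces, slashes, and dashes.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d /()-]{6,}\d")

def extract_contacts(text):
    """Return all email addresses and phone-number-like strings in text."""
    return {
        "emails": EMAIL_RE.findall(text),
        "phones": [p.strip() for p in PHONE_RE.findall(text)],
    }

sample = "Kontakt: info@example.de, Tel. +49 7032 123456"
contacts = extract_contacts(sample)
```

Regexes like these work as a first pass over scraped text; structured sources (e.g. a labeled "Kontakt" block in the page markup) are more reliable when the site exposes them.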
Fixed-Price - Intermediate ($$) - Est. Budget: $150 - Posted
I need one JavaScript script to crawl 5 sites (only a small portion of each site, not the whole site) and create an Excel sheet (tab-delimited) of the results. These sites have products, and I need to run this script daily, so it needs to be fast. Please let me know if you have any questions. These sites are public and no password is needed to access them.
Skills: Web Crawler JavaScript
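The deliverable above is a tab-delimited sheet that Excel can open. For reference, this output format is trivial to produce; in Python the `csv` module even ships a built-in `excel-tab` dialect for it (the field names below are illustrative, not from the post):

```python
import csv
import io

def write_products(rows, out):
    """Write product rows as tab-delimited text that Excel opens directly."""
    writer = csv.DictWriter(out, fieldnames=["site", "product", "price"],
                            dialect="excel-tab")
    writer.writeheader()
    writer.writerows(rows)

buf = io.StringIO()
write_products([{"site": "example.com", "product": "Widget", "price": "9.99"}], buf)
```

A JavaScript version is equally short: join each row's fields with `"\t"` and the rows with newlines.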