You've landed at the right place. oDesk is now Upwork. Learn about the new platform.

Web Scraping Jobs

231 were found based on your criteria

Looking for the Team App?
Download the New Upwork Team App
Fixed-Price - Intermediate ($$) - Est. Budget: $50 - Posted
We wish to gather all the key data for beauty salons / hair salons in four Indian cities: Bangalore, Mumbai, Chennai, and Hyderabad.

For each listing we want the following information loaded into a spreadsheet, which will be provided for our review:
  • Category Name
  • Business Name
  • Owner or Lead Contact Name
  • Phone Number
  • Email
  • Website URL
  • Street Address
  • City
  • State
  • Postal Code
  • Open Hours
  • Description of the Business
  • Facebook Link
  • Google Plus Link
  • Twitter Link

We would like to have the scraping done by Wednesday 10 February.

We would like you to recommend a website which you feel would be the best source for this information and explain why. Some possibilities are listed below, but if you have another site that you would like to suggest, please do.
  • http://www.sulekha.com/ - Mixed
  • http://www.yellowpages.in/ - Mixed
  • http://www.justdial.com/ - Mixed
  • http://www.indiamart.com/ - Small and medium enterprises
  • http://www.tradeindia.com/ - Mixed

In your proposal, submit a sample of 500 records captured from your suggested website to prove that you can gather the information. Also, if you can let us know how many records you expect to get from the website, it will help justify your pricing.

Upon completion, we will review the spreadsheet to validate that the information is complete and accurate. If there are issues we will point them out. Once all issues are resolved, you will be paid.

This will be a fixed bid project. Please provide your bid for the entire project. Also, please indicate the date by which you can deliver the completed spreadsheet.
Skills: Web scraping Data mining Data scraping
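The spreadsheet layout requested in this posting can be sketched as a CSV with one column per listed field. This is only an illustration under assumptions: the column boundaries are read off the posting's run-together field list, the function names `write_listings` and `missing_fields` are hypothetical, and the sample row is made up.

```python
import csv
import io

# Column names as read from the posting's field list (the boundaries are an assumption).
COLUMNS = [
    "Category Name", "Business Name", "Owner or Lead Contact Name",
    "Phone Number", "Email", "Website URL", "Street Address", "City",
    "State", "Postal Code", "Open Hours", "Description of the Business",
    "Facebook Link", "Google Plus Link", "Twitter Link",
]


def write_listings(rows, out):
    """Write listing dicts to a file-like object as CSV, header row first."""
    # restval="" leaves a blank cell for any field a listing does not provide.
    writer = csv.DictWriter(out, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


def missing_fields(row):
    """Return the columns that are absent or blank in one listing."""
    return [c for c in COLUMNS if not str(row.get(c) or "").strip()]


# Made-up sample row, for illustration only.
sample = {"Category Name": "Hair Salon", "Business Name": "Example Salon", "City": "Chennai"}
buffer = io.StringIO()
write_listings([sample], buffer)
```

A helper like `missing_fields` could support the completeness review the posting describes, since it reports which requested fields a given record lacks.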
Fixed-Price - Intermediate ($$) - Est. Budget: $35 - Posted
I'm looking for a regular expression to scrape the URLs that appear on page 1 of Google search results for any search query. I have a Ruby on Rails project where I'm using Nokogiri for scraping. I am grabbing the Google SERP like this:

    doc = Nokogiri::HTML(open("http://www.google.com/search?q=#{search_keyword}"))

Then I get the page 1 search results like this:

    doc.xpath('//*[contains(concat( " ", @class, " " ), concat( " ", "r", " " ))]//a')

This mostly works, but I need to EXCLUDE a couple things:
  1) Exclude the excerpt/snippet Google sometimes includes as result #1
  2) Exclude sitelinks Google sometimes includes under result #1

See the attached pic for examples of these exclusions. At the end of the project, I will consider it a success if I can plug the regex or xpath into Nokogiri and get the desired results based on my specifications.
Skills: Web scraping Regular Expressions
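The XPath in this posting matches any element whose class attribute contains `r` as a whole token, then takes the links nested inside it. A minimal Python sketch of that same selection, with an added ancestor-based exclusion, is shown here. The posting does not show Google's markup, so the class names `answer-box` and `sitelinks` and the sample HTML are purely hypothetical stand-ins; the real containers would have to be read off the live page.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ResultLinkParser(HTMLParser):
    """Collect hrefs of <a> tags nested under an element with a given class token."""

    def __init__(self, include_class="r", exclude_classes=()):
        super().__init__()
        self.include_class = include_class
        self.exclude_classes = set(exclude_classes)
        self.stack = []   # (tag, inside_included, inside_excluded) per open element
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        tokens = set((attrs.get("class") or "").split())
        parent_in = self.stack[-1][1] if self.stack else False
        parent_ex = self.stack[-1][2] if self.stack else False
        included = parent_in or self.include_class in tokens
        excluded = parent_ex or bool(tokens & self.exclude_classes)
        # Like the XPath, require an *ancestor* carrying the class token.
        if tag == "a" and parent_in and not excluded and attrs.get("href"):
            self.links.append(attrs["href"])
        if tag not in VOID_TAGS:
            self.stack.append((tag, included, excluded))

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, tolerating sloppy HTML.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break


def extract_links(html, exclude_classes=()):
    parser = ResultLinkParser(exclude_classes=exclude_classes)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical markup, for illustration only.
SAMPLE = """
<div class="answer-box"><h3 class="r"><a href="http://example.com/snippet">Snippet</a></h3></div>
<h3 class="r"><a href="http://example.com/one">One</a></h3>
<div class="sitelinks"><h3 class="r extra"><a href="http://example.com/one/about">About</a></h3></div>
<h3 class="rc"><a href="http://example.com/not-a-result">Other</a></h3>
<h3 class="r"><a href="http://example.com/two">Two</a></h3>
"""

all_links = extract_links(SAMPLE)
filtered = extract_links(SAMPLE, exclude_classes=("answer-box", "sitelinks"))
```

Note that `rc` is not matched because the comparison is on whole class tokens, which is what the `concat(" ", @class, " ")` idiom in the XPath achieves. Filtering by ancestor container rather than by regex is a design choice of this sketch, not something the posting prescribes.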
Fixed-Price - Intermediate ($$) - Est. Budget: $50 - Posted
We wish to gather all the key data for play schools in four Indian cities: Bangalore, Mumbai, Chennai, and Hyderabad.

For each listing we want the following information loaded into a spreadsheet, which will be provided for our review:
  • School Name
  • Owner or Lead Contact Name
  • Phone Number
  • Email
  • Website URL
  • Street Address
  • City
  • State
  • Postal Code
  • Open Hours
  • Description of the School
  • Facebook Link
  • Google Plus Link
  • Twitter Link

We would like to have the scraping done by Wednesday 10 February.

We would like you to recommend a website which you feel would be the best source for this information and explain why. Some possibilities are listed below, but if you have another site that you would like to suggest, please do.
  • http://www.sulekha.com/ - Mixed
  • http://www.yellowpages.in/ - Mixed
  • http://www.justdial.com/ - Mixed
  • http://www.indiamart.com/ - Small and medium enterprises
  • http://www.tradeindia.com/ - Mixed

In your proposal, submit a sample of 500 records captured from your suggested website to prove that you can gather the information. Also, if you can let us know how many records you expect to get from the website, it will help justify your pricing.

Upon completion, we will review the spreadsheet to validate that the information is complete and accurate. If there are issues we will point them out. Once all issues are resolved, you will be paid.

This will be a fixed bid project. Please provide your bid for the entire project. Also, please indicate the date by which you can deliver the completed spreadsheet.
Skills: Web scraping Data scraping
Hourly - Entry Level ($) - Est. Time: More than 6 months, 10-30 hrs/week - Posted
Freelancer will receive a list of 100 client prospects. He/she must do some research online to help us find the gold nuggets in that list.
  1. Go to the company's website.
  2. Go to the company's career page.
  3. Search for IT (Information Technology / Technology) jobs in targeted location.
  4. Save the link and/or take a screenshot when the company has posted at least one IT job.
  5. Disregard the leads that don't post IT jobs.

Strong performers will receive repeat business every week.
  • Number of freelancers needed: 2
Skills: Web scraping Microsoft Excel Research
Fixed-Price - Intermediate ($$) - Est. Budget: $50 - Posted
We wish to gather all the key data for doctors or clinics in certain specialties in four Indian cities: Bangalore, Mumbai, Chennai, and Hyderabad.

The specialties for which we wish to collect the data are:
  • Dermatologist
  • Gastroenterologist
  • Hair Transplant Clinic
  • Ophthalmologist
  • Orthopedist
  • Pediatrician
  • Urologist
  • Physiotherapist

Based on listings in practo.com, it seems there are about 14722 records to be captured.

For each listing we want the following information loaded into a spreadsheet, which will be provided for our review:
  • Specialty
  • Doctor or Clinic Name
  • For Clinic, lead doctor name
  • Phone Number
  • Email
  • Website URL
  • Street Address
  • City
  • State
  • Postal Code
  • Open Hours
  • Description of the Practice, if possible including Education, Experience, Awards and Recognitions, Memberships, and Registrations
  • Facebook Link
  • Google Plus Link
  • Twitter Link

We need to have at least Dermatologist for Bangalore, Mumbai, and Hyderabad, and Physiotherapist for Chennai done by Wednesday 10 February, but ideally would like to have all done by that date.

We would like you to recommend a website which you feel would be the best source for this information and explain why. Some possibilities are listed below, but if you have another site that you would like to suggest, please do.
  • http://www.sulekha.com/ - Mixed
  • http://www.yellowpages.in/ - Mixed
  • http://www.justdial.com/ - Mixed
  • http://www.indiamart.com/ - Small and medium enterprises
  • https://www.practo.com - Doctors; doesn't appear to have a phone number for most records
  • http://www.tradeindia.com/ - Mixed

In your proposal, submit a sample of 500 records captured from your suggested website to prove that you can gather the information. Also, if you can let us know how many records you expect to get from the website, it will help justify your pricing.

Upon completion, we will review the spreadsheet to validate that the information is complete and accurate. If there are issues we will point them out. Once all issues are resolved, you will be paid.

This will be a fixed bid project. Please provide your bid for the entire project. Also, please indicate the date by which you can deliver the completed spreadsheet.
Skills: Web scraping Data mining Data scraping
Fixed-Price - Intermediate ($$) - Est. Budget: $150 - Posted
Simply: a program or spreadsheet that can do the following. I want to compare a list of UPCs / ASINs / product descriptions against pricing from 3 online stores (eBay, Amazon, Nextag). I want a simple process: 1) Upload (or copy-paste) bulk info. 2) Push one button (or one additional step). 3) The program does the work by getting the info I want from the sites. Attached is a more detailed project description. I am open to options (even if it costs more) since I'm new at this.
Skills: Web scraping API Development jQuery Microsoft Excel
Hourly - Intermediate ($$) - Est. Time: Less than 1 week, Less than 10 hrs/week - Posted
Automate the process of extracting stock market information by searching for ticker symbols across multiple financial websites that already have the information, and adding the results to Excel sheets, with the output in a web format.
  • Number of freelancers needed: 2
Skills: Web scraping CSS Data mining Data scraping
Fixed-Price - Expert ($$$) - Est. Budget: $500 - Posted
The no. 1 source is whatsinproducts.com. Two things we can use:

  1. All product information. Note that the URL string for the product detail page goes in simple numerical order: http://whatsinproducts.com/types/type_detail/1/16705/ That 16705 means that product is the 16,705th product entered into the database (on Jan 7 2016). Here's the first product, which entered the database Sept 3, 1996: http://whatsinproducts.com/types/type_detail/1/1/
  2. SDS information. The SDSs have a similar URL structure, but note that the last value in the string does NOT correspond to the Product IDs in the product detail queries. http://whatsinproducts.com/brands/show_msds/1/16537/ Still, as you know, very easy to join those tables.

Another good source of SDSs is msdsdigital.com. They have ~100K free SDSs, and the data extraction should be similarly uncomplicated. Example: http://msdsdigital.com/msds-database?sort_by=title&sort_order=DESC&items_per_page=25&page=0 Just manipulate the page value.

Starting with whatsinproducts.com and msdsdigital.com will be a launching pad. The task will be broken into three parts:

  1. Scrape 10 records from each site and produce a data output we can evaluate.
  2. Scrape 1000 records for each, submit to verify structure.
  3. Scrape the rest of the site.

There are images and some attachments. We are envisioning a CSV output with the final columns being identifiers for the attachments and pictures; the folders with the pictures and the attachments can be uploaded to my Dropbox account.
Skills: Web scraping Data mining Data Modeling
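The two URL patterns described in this posting lend themselves to simple enumeration. A rough sketch follows, assuming the `1` segment in the whatsinproducts.com path stays fixed and that only the trailing ID and the `page` query value change; the function names are hypothetical, and some IDs in the range may not resolve to a product.

```python
from urllib.parse import urlencode


def product_detail_urls(first_id, last_id):
    """Build product detail URLs for a range of sequential IDs (inclusive)."""
    base = "http://whatsinproducts.com/types/type_detail/1/{}/"
    return [base.format(product_id) for product_id in range(first_id, last_id + 1)]


def msds_page_url(page, items_per_page=25):
    """Build one msdsdigital.com listing URL; only the page value is varied."""
    query = urlencode({
        "sort_by": "title",
        "sort_order": "DESC",
        "items_per_page": items_per_page,
        "page": page,
    })
    return "http://msdsdigital.com/msds-database?" + query


# First stage of the posting's plan: 10 records from the product site.
first_ten = product_detail_urls(1, 10)
first_listing_page = msds_page_url(0)
```

Generating the URLs separately from fetching them keeps the three stages in the posting (10 records, 1000 records, the rest) a matter of changing the ID range or page count.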
Fixed-Price - Entry Level ($) - Est. Budget: $100 - Posted
Write a screen scraper that will scrape all ads found at http://toftegaardbiler.dk/ to separate JSON files. The screen scraper should be delivered as a Docker container (i.e. include a Dockerfile to build the container). It is up to the freelancer to choose what tools to use for solving the assignment as long as the provided Docker container can be built without using excessive resources or depending on external services. See the attached file for example output and an example Docker container.
  • Number of freelancers needed: 3
Skills: Web scraping Docker
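The "separate JSON files" output asked for in this posting could look roughly like the sketch below, which writes one file per ad. The ad fields, the `id` key, and the file naming are assumptions, since the example output attached to the posting is not shown here; the scraping step itself is left out.

```python
import json
from pathlib import Path


def write_ads(ads, out_dir, id_key="id"):
    """Write each ad dict to its own JSON file and return the file paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, ad in enumerate(ads, start=1):
        # Fall back to a running number when the ad has no usable identifier.
        name = str(ad.get(id_key) or "ad-{:05d}".format(index))
        # Keep file names safe by replacing anything unusual with "_".
        safe = "".join(ch if ch.isalnum() or ch in "-_" else "_" for ch in name)
        path = out / (safe + ".json")
        # ensure_ascii=False keeps non-ASCII characters readable in the output.
        path.write_text(json.dumps(ad, ensure_ascii=False, indent=2), encoding="utf-8")
        paths.append(path)
    return paths
```

Two ads that map to the same sanitized name would overwrite each other in this sketch, so a real scraper would need a unique identifier per ad.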