
Web Scraping Jobs

239 jobs were found based on your criteria

Hourly - Intermediate ($$) - Est. Time: Less than 1 month, 10-30 hrs/week - Posted
PROJECT OVERVIEW: We are currently working on an infographics campaign to promote our business, and we need an expert web researcher with experience in the area of home safety/monitoring/security.

JOB DETAILS: There are three parts to this job.

Part 1: I'm looking for someone who can gather statistics on property crimes and home invasions (state level) in the US for the last 10 years. Data should include, but is not limited to, the following:
• demographics
• time and exact location
• number of accomplices in a single crime
• tool(s) used for the crime
• point of entry
• how often they commit the same crime on the same house/place
• the estimated value of the property stolen
• stats on domestic violence committed by burglars
• how many burglary cases are solved each year

Part 2: Statistics on the average spending of Americans, per state, on home security for the last 10 years. Data to show:
• annual income
• number of family members
• how much is spent on home security gadgets/services/mobile applications
• whether there is a record of intruder reports
• whether there is a record of burglary reports
And then, stats on how much they spend (per state) on luxuries such as:
• gadgets
• cars
• home improvement
• home accessories
• personal items

Part 3: Make a list of the top 100 safest cities in the US for the last 10 years. The list should be based on the stats below:
• population
• stats on violent crimes
• stats on property crimes
• stats on burglary and home invasion
• number of burglary and home invasion cases solved
The ranking should be according to the crimes recorded versus the number of cases solved over the last 10 years.

STEPS TO GET THIS DONE:
1. Unless you already know the "go-to" places on the web, start your search with the following search strings:
• "crime statistics by _____": enter the city, zip code, state, street address, or year.
• "crime statistics services": there are services that offer free data, such as http://www.neighborhoodscout.com/neighborhoods/crime-rates/top100safest/, which already has some of the data ready to go.
• "NameOfCrime clearance rate in Year", e.g. "property crime clearance rate in 2005".
The key here is to think of possible keyword combinations to help you find the information you need.
2. Next, you'll record everything in a spreadsheet. To be considered for this job, I expect you to know your way around creating spreadsheets. You will fill the data into a spreadsheet we will provide.

NOTE: The first two parts of this job are at the US state level, while the third is at the city level. You may notice there are other websites offering data on the 100 safest cities out there. Our goal is to top those and offer our readers a much larger set of data.

A couple of things: when you apply, mention "I have the expertise" in the first paragraph of your cover letter. Also, give me an overview of your process and how soon you can deliver the first set of data. I'm looking for someone who can work on this right away, as we may have future projects together if things go well. See you on the other side! Rock on!
Skills: Web scraping, Big Data, Data Entry, Data mining
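Much of this posting is manual web research, but once a source page is identified, pulling its figures can be automated. Below is a minimal sketch, assuming the statistics are published as a plain HTML table; the URL and the table index are placeholders, not part of the job post.

    # A rough sketch: pull a state-level crime statistics table from a page that
    # publishes it as an HTML table and save it as a spreadsheet for review.
    # The URL below is a placeholder; substitute whichever source you settle on.
    # Requires: pandas, lxml, openpyxl (pip install pandas lxml openpyxl)
    import pandas as pd

    SOURCE_URL = "https://example.gov/property-crime-by-state-2005-2015"  # placeholder

    # read_html returns every <table> found on the page as a DataFrame
    tables = pd.read_html(SOURCE_URL)
    crime_stats = tables[0]  # assume the first table is the one we want

    # Save to Excel so the figures can be merged into the client's master spreadsheet
    crime_stats.to_excel("property_crime_by_state.xlsx", index=False)
    print(crime_stats.head())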
Fixed-Price - Intermediate ($$) - Est. Budget: $50 - Posted
I need someone to compile a leads list of engineers who are currently working in engineering companies or research organizations. I also want a small leads list of student engineers. The leads need to fit these specifications:
• 50 leads from Adelaide, Australia
• 200 leads from anywhere else in Australia
• 200 leads from anywhere else in English-speaking countries
• 100 student leads from anywhere in Australia
I only want one engineer from each company/organization/university; I do not want multiple engineers who work at the same organization. The students should be sourced from at least 4 different universities. I require the personal email of each engineer. No emails like "info@company.com," "staff@engineeringconsultants.com," or "clinic@gmail.com," or anything of that sort. Use your best judgment for this and get personal-sounding emails like john.smith@company.com. Finding personal emails is the hard part. When scraping the web, look for the personal email first, and then look for all the other information associated with that engineer's email. I require the following information for each engineer, organized into an Excel sheet: name, email, phone number (this can be the company's phone number), company/organization name, company/organization location, and company/organization/university website. I am looking for a fixed price for a leads list of this magnitude. Please reply with your bid for the job. Bids over $70 will be ignored. Lowest bid or best sales pitch gets the job. *If you have read this description, please tell me your favorite color in response to the screening question.
Skills: Web scraping, Data Entry, Lead generation, Microsoft Excel
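The filtering rules in this posting (no generic mailboxes, only one engineer per organization) are easy to encode. A rough sketch of that logic; the sample leads and the prefix list are illustrative assumptions.

    # Drop generic mailboxes (info@, staff@, etc.) and keep only one lead per
    # company domain. The sample leads are invented for illustration.
    GENERIC_PREFIXES = {"info", "staff", "admin", "office", "contact", "sales", "clinic", "hello"}

    def is_personal(email: str) -> bool:
        """Heuristic: treat an address as personal unless its local part is a generic mailbox."""
        local = email.split("@", 1)[0].lower()
        return local not in GENERIC_PREFIXES

    def dedupe_by_domain(leads):
        """Keep the first lead seen for each company domain."""
        seen, kept = set(), []
        for lead in leads:
            domain = lead["email"].split("@", 1)[1].lower()
            if domain not in seen:
                seen.add(domain)
                kept.append(lead)
        return kept

    leads = [
        {"name": "John Smith", "email": "john.smith@company.com"},
        {"name": "Front Desk", "email": "info@company.com"},
        {"name": "Jane Doe", "email": "jane.doe@otherfirm.com.au"},
    ]
    clean = dedupe_by_domain([l for l in leads if is_personal(l["email"])])
    print(clean)  # keeps john.smith@company.com and jane.doe@otherfirm.com.au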
Fixed-Price - Intermediate ($$) - Est. Budget: $250 - Posted
I need help scraping university websites for email addresses. This is lightweight work. The code can be written in any language, preferably Ruby. I will need a CSV file in the format I specify. I will also specify the sites and techniques to scrape, as I have done this before. There are close to 30 sites you will need to scrape. I will give you the university name, and you will need to get names from the university's Facebook group or use common names that I can provide. Find the student directory page for that university, use it for searching, and fetch the email and other meta information from the results.
  • Number of freelancers needed: 2
Skills: Web scraping, Data scraping, HTML, JavaScript
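The posting allows any language (Ruby preferred); purely as an illustration, here is a rough Python outline of the flow it describes: search a student directory for a list of common names and record any email addresses in the results. The directory URL, query parameter, and .edu-only pattern are hypothetical and would change per site.

    # Search a hypothetical student directory for each name and record any
    # email addresses found in the returned HTML.
    # Requires: requests (pip install requests)
    import csv
    import re
    import requests

    DIRECTORY_URL = "https://directory.example-university.edu/search"  # hypothetical
    NAMES = ["Smith", "Nguyen", "Garcia", "Patel"]  # supplied by the client
    EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.edu")

    rows = []
    for name in NAMES:
        resp = requests.get(DIRECTORY_URL, params={"q": name}, timeout=30)
        for email in set(EMAIL_RE.findall(resp.text)):
            rows.append({"university": "Example University", "search_name": name, "email": email})

    with open("directory_emails.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["university", "search_name", "email"])
        writer.writeheader()
        writer.writerows(rows)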
Fixed-Price - Expert ($$$) - Est. Budget: $2,000 - Posted
Hi there, I am looking to develop a crawler that will look online for email addresses of professionals in certain industries. For example, I am looking to create a database of New York City accountants, and I need all of their email addresses. There are about 30,000 accountants in New York, and putting this list together manually would be very time-consuming and very expensive. I need an online robot to do it for me. All I need is their email addresses collected, and once the system is done collecting them, I will need them exported to an Excel sheet. I have about 10 industries in New York that I am trying to collect emails from for future marketing campaigns, so I will be reusing this software. Please contact me with ideas, pricing, and a completion time for this project as soon as possible. I am looking to start right away. Thank you, Dave Ratner
Skills: Web scraping, Microsoft Excel
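A very small sketch of the kind of crawler the posting describes: start from seed pages, follow same-site links, harvest anything that looks like an email address, and export to Excel. The seed URL is a placeholder, and a crawl anywhere near 30,000 records would also need rate limiting, robots.txt handling, and retries.

    # Minimal breadth-first crawl over same-site links, collecting email-like
    # strings and writing them to an Excel sheet.
    # Requires: requests, beautifulsoup4, openpyxl
    import re
    from urllib.parse import urljoin, urlparse

    import requests
    from bs4 import BeautifulSoup
    from openpyxl import Workbook

    EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
    SEEDS = ["https://www.example-accounting-directory.com/new-york"]  # placeholder

    def crawl(seeds, max_pages=50):
        seen, queue, emails = set(), list(seeds), set()
        while queue and len(seen) < max_pages:
            url = queue.pop(0)
            if url in seen:
                continue
            seen.add(url)
            try:
                resp = requests.get(url, timeout=30)
            except requests.RequestException:
                continue
            emails.update(EMAIL_RE.findall(resp.text))
            soup = BeautifulSoup(resp.text, "html.parser")
            for a in soup.find_all("a", href=True):
                link = urljoin(url, a["href"])
                if urlparse(link).netloc == urlparse(url).netloc:
                    queue.append(link)  # stay on the same site
        return emails

    wb = Workbook()
    ws = wb.active
    ws.append(["email"])
    for email in sorted(crawl(SEEDS)):
        ws.append([email])
    wb.save("nyc_accountants_emails.xlsx")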
Hourly - Entry Level ($) - Est. Time: Less than 1 week, Less than 10 hrs/week - Posted
I need help extracting products and images from AliExpress and getting them into a CSV or Excel file so I can upload them as listings on a platform similar to eBay. The freelancer can either do the process for me or show me how to do it. You must speak English and be able to communicate over Skype.
Skills: Web scraping, Data Entry, Data scraping
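A generic sketch of the export step: fetch each product page, pull the title and main image, and append a row to a CSV a marketplace importer can read. The product URL is a placeholder, and AliExpress pages are rendered heavily with JavaScript, so browser automation or the site's own data feeds may be needed in practice; the parsing assumes standard <title> and og:image tags are present in the fetched HTML.

    # Fetch product pages and write url/title/image rows to a CSV.
    # Requires: requests, beautifulsoup4
    import csv
    import requests
    from bs4 import BeautifulSoup

    PRODUCT_URLS = [
        "https://www.aliexpress.com/item/1005001234567890.html",  # placeholder
    ]

    with open("products.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["url", "title", "image"])
        for url in PRODUCT_URLS:
            resp = requests.get(url, timeout=30, headers={"User-Agent": "Mozilla/5.0"})
            soup = BeautifulSoup(resp.text, "html.parser")
            title = soup.title.get_text(strip=True) if soup.title else ""
            og_image = soup.find("meta", property="og:image")  # assumes og:image is present
            image = og_image["content"] if og_image else ""
            writer.writerow([url, title, image])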
Fixed-Price - Entry Level ($) - Est. Budget: $50 - Posted
Hello. I am looking for someone to help me collect 2,000 website links and the names associated with those links. The task is very straightforward but time-consuming, and I would like to pay one individual to manually scrape the links and names. It will involve simple copying and pasting from multiple pages on one website to a Google spreadsheet.
Skills: Web scraping, Email Handling
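The poster wants manual copy/paste, but the same link/name pairs can often be collected programmatically. A minimal sketch, assuming the source pages are plain HTML; the URLs are placeholders, and the resulting CSV can be imported straight into a Google spreadsheet.

    # Collect every link and its visible text from a set of listing pages.
    # Requires: requests, beautifulsoup4
    import csv
    import requests
    from bs4 import BeautifulSoup

    PAGES = [f"https://example.com/directory?page={n}" for n in range(1, 6)]  # placeholders

    rows = []
    for page in PAGES:
        soup = BeautifulSoup(requests.get(page, timeout=30).text, "html.parser")
        for a in soup.find_all("a", href=True):
            name = a.get_text(strip=True)
            if name:  # skip image-only or empty links
                rows.append([name, a["href"]])

    with open("links_and_names.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "link"])
        writer.writerows(rows)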
Fixed-Price - Intermediate ($$) - Est. Budget: $100 - Posted
Extract the following data:

#1) Allstate: data for the state of Utah as a test. This is where we would find a list of Allstate agents by state: https://agents.allstate.com/index.html. Here is an actual agent profile: https://agents.allstate.com/weller-agency-eden-ut.html. Fields:
• First name and last name (as separate fields if possible)
• Address (separate fields if possible for address, city, state, zip)
• Email address
• Phone
• Mobile
• Number of reviews
• Star rating
• Link: the link to the rep's Allstate profile

#2) HomeAdvisor: carpet cleaners and pest control professionals in the state of Utah as a test. Search: http://www.homeadvisor.com/sitesearch/searchQuery?action=SEARCH&startIndex=&showBusinessOnly=false&searchType=ServiceProfessionalSearch&query=carpet+cleaning&explicitLocation=utah. Company profile: http://www.homeadvisor.com/rated.ASpotlessCarpet.39550642.html. Fields:
• Company name
• Phone
• Address (separate fields if possible for address, city, state, zip)
• Number of reviews
• Star rating
• Link: the link to the company's HomeAdvisor profile

The goal is to expand the data to all areas and states.
Skills: Web scraping, Data mining, Data scraping, Internet research
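One detail this posting calls out is splitting the address into separate fields. A small helper that breaks a one-line US address into street, city, state, and zip; the sample address is invented, and addresses scraped from real profile pages may need looser patterns.

    # Split a "street, city, ST 84310"-style US address into separate fields.
    import re

    ADDRESS_RE = re.compile(
        r"^(?P<street>.+?),\s*(?P<city>[^,]+),\s*(?P<state>[A-Z]{2})\s+(?P<zip>\d{5}(?:-\d{4})?)$"
    )

    def split_address(address: str) -> dict:
        match = ADDRESS_RE.match(address.strip())
        if not match:
            # Fall back to the raw string so no data is lost
            return {"street": address, "city": "", "state": "", "zip": ""}
        return match.groupdict()

    # Invented example address
    print(split_address("123 Main St, Eden, UT 84310"))
    # {'street': '123 Main St', 'city': 'Eden', 'state': 'UT', 'zip': '84310'}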
Hourly - Entry Level ($) - Est. Time: More than 6 months, 30+ hrs/week - Posted
You will log into a website that provides specifications for items, and you will export the data into Excel. Some of the data will be exported automatically, and some items you will need to copy/paste manually; if you can find another method, such as data mining, you are free to use it as long as the data is accurate. (If you are able to mine the data, you will receive bonus compensation equal to the average number of hours it would have taken to gather the data manually.) The information will be saved to a Google Drive file, as well as a local Excel file for backup. We will explain the steps and procedures to you over a Skype screen-share conference; we have also uploaded a tutorial video to YouTube, which will be provided to you upon hire to review. We have done this manually ourselves before and know how long it takes. The average rate is approximately 80-90 entries per hour worked, and you will be expected to meet this quota. If you are able to mine the data in bulk, you will be compensated with a bonus at this rate. YOU MUST BE AVAILABLE FOR WORK & COMMUNICATION DURING OUR WORKING HOURS FOR THE FIRST WEEK, WHICH IS 9AM-5PM PACIFIC TIME. AFTER BOTH PARTIES ARE COMFORTABLE WITH YOU WORKING ON YOUR OWN, YOU MAY WORK YOUR OWN HOURS. THIS IS ABSOLUTELY REQUIRED.
Skills: Web scraping, Data Entry, Data mining, Data scraping
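For the "find some other method" path the posting allows, a loose sketch of automating the export: authenticate once with a session, fetch an item page, and append its specification rows to a local Excel backup. Every URL, form field, and selector here is a placeholder for whatever the client's site actually uses.

    # Log in once, then scrape specification rows into a local Excel backup.
    # Requires: requests, beautifulsoup4, openpyxl
    import requests
    from bs4 import BeautifulSoup
    from openpyxl import Workbook

    LOGIN_URL = "https://specs.example.com/login"        # placeholder
    ITEM_URL = "https://specs.example.com/items/{item}"  # placeholder
    ITEM_IDS = ["A-1001", "A-1002"]                      # placeholders

    session = requests.Session()
    session.post(LOGIN_URL, data={"username": "me", "password": "secret"}, timeout=30)

    wb = Workbook()
    ws = wb.active
    ws.append(["item_id", "specification", "value"])

    for item_id in ITEM_IDS:
        page = session.get(ITEM_URL.format(item=item_id), timeout=30)
        soup = BeautifulSoup(page.text, "html.parser")
        # Assume each specification is rendered as a two-cell table row
        for row in soup.select("table.specs tr"):
            cells = [td.get_text(strip=True) for td in row.find_all("td")]
            if len(cells) == 2:
                ws.append([item_id, cells[0], cells[1]])

    wb.save("item_specifications_backup.xlsx")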
Hourly - Intermediate ($$) - Est. Time: 1 to 3 months, 10-30 hrs/week - Posted
This job is focused on advancing the experience that thousands of users get navigating, browsing, searching, and comparing the content offered through our proprietary technology platform. The end result (the output of the ontology model) will be a set of intuitive and comprehensive multi-level navigation structures (hierarchical taxonomies, facets) for browsing, searching, and tagging the content offered to our clients. The task is envisioned to be achieved primarily with Semantic Web concepts and data (LOD and other available SKOS) as per Semantic Web standards. It will most likely require knowledge/learning of several RDF-based schemas (Resume RDF, HRM Ontology, HR-XML, FOAF, SIOC, Schema.org) and usage of the W3C's Semantic Web technology stack components (SPARQL, Protege, semantic reasoners).

Key tasks:
- Definition of RDF Schema and ontologies based on several existing RDF schemas (Resume RDF, HRM Ontology, HR-XML, FOAF, SIOC, Schema.org, etc.)
- Linking available LOD and SKOS data sets; building several core multi-level hierarchical taxonomies (on the order of tens of thousands of elements) comprehensively describing the content in our system
- Rule-based processing and linking of multiple existing, as well as newly obtained, data sets using semantic reasoners
- Definition, structuring, and optimization of hierarchical data sets; definition and maintenance of hierarchical relationships between particular terms (facets)
- Research (independent as well as guided by the management team) on publicly available SKOS and LOD sets related to the content of the platform, from public sources (international standards, patent databases, public and government databases, various organizations, available XML datasets, etc.) as well as acquired proprietary sources
- Retrieval and ETL of multiple additional data sets from multiple sources
- Tagging, classification, entity extraction
- Working with the management team to maintain and advance particular segments of the defined taxonomies

Optional Stretch-Tasks (Depending on Candidate's Qualifications):
- Automatic analysis of content, extraction of semantic relationships
- Auto-tagging, auto-indexing
- Integration and usage of selected IBM Watson services for content analysis
- Integration with enterprise taxonomy management platforms (Mondeca, Smartlogic, PoolParty, or others)

This job will initially require a commitment of 15-20 hours per week over a 3-6 month engagement. Interaction with a responsible manager will be required at least twice a week over Skype and Google Hangouts. Longer-term cooperation is possible based on the results of the initial engagement.

Required Experience:
- Detailed knowledge of Semantic Web concepts and techniques
- Intimate familiarity with the W3C's Semantic Web technology stack (RDF, SPARQL, etc.)
- Hands-on experience with LOD (DBpedia and others) and various SKOS
- Experience modeling data based on various RDF schemas (Resume RDF, HRM Ontology, HR-XML, FOAF, SIOC, ISO 25964, etc.)
- Knowledge of common open-source ontology environments and tools (Mediawiki, Protege, etc.) or other enterprise-grade ontology tools (Synaptica, Data Harmony, PoolParty, Mondeca, TopBraid, etc.)
- Experience working with semantic reasoners
- Prior experience with content management and maintenance of taxonomies for consumer or e-commerce applications

Additional Preferred Experience:
- Background in Library and Information Science (MLIS), Knowledge Management, Information Management, Linguistics, or Cognitive Sciences
- Familiarity with common classification systems
- Experience working with catalog and classification systems and the creation of thesauri
- Auto-tagging, auto-classification, entity extraction
Skills: Web scraping, Web Crawling, Data Analytics, Data Entry
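As an illustrative starting point for the LOD work described, the public DBpedia SPARQL endpoint can be queried for the skos:broader children of a category, the kind of building block a hierarchical taxonomy is assembled from. The category chosen below is arbitrary.

    # Query DBpedia for subcategories of an arbitrary category via skos:broader.
    # Requires: SPARQLWrapper (pip install sparqlwrapper)
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper("https://dbpedia.org/sparql")
    sparql.setQuery("""
        PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        SELECT ?narrower ?label WHERE {
            ?narrower skos:broader <http://dbpedia.org/resource/Category:Engineering> ;
                      rdfs:label ?label .
            FILTER (lang(?label) = "en")
        }
        LIMIT 25
    """)
    sparql.setReturnFormat(JSON)

    results = sparql.query().convert()
    for binding in results["results"]["bindings"]:
        print(binding["label"]["value"], "->", binding["narrower"]["value"])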