
Data Scraping Jobs

300 jobs were found based on your criteria

Fixed-Price - Intermediate ($$) - Est. Budget: $100 - Posted
Extract the following data:

#1) Allstate: data for the state of Utah as a test. The list of all Allstate agents by state is here: https://agents.allstate.com/index.html and here is an actual agent profile: https://agents.allstate.com/weller-agency-eden-ut.html Fields to capture: first name and last name (as separate fields if possible); address (separate fields if possible for address, city, state, zip); email address; phone; mobile; number of reviews; star rating; and a link to the rep's Allstate profile.

#2) Home Advisor: carpet cleaning and pest control professionals in the state of Utah as a test. Search results: http://www.homeadvisor.com/sitesearch/searchQuery?action=SEARCH&startIndex=&showBusinessOnly=false&searchType=ServiceProfessionalSearch&query=carpet+cleaning&explicitLocation=utah and an example company profile: http://www.homeadvisor.com/rated.ASpotlessCarpet.39550642.html Fields to capture: company name; phone; address (separate fields if possible for address, city, state, zip); number of reviews; star rating; and a link to the company's Home Advisor profile.

The goal is to expand the data to all areas and states.
Skills: Data scraping Data mining Internet research Web scraping
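A minimal scraping sketch for the posting above, assuming the agent profile pages are static HTML fetched with requests and parsed with BeautifulSoup. The CSS selectors are placeholders: the real Allstate markup would need to be inspected first.

```python
# Hedged sketch: fetch one agent profile and write the requested fields to CSV.
# All selectors are placeholders, not the actual Allstate page structure.
import csv

import requests
from bs4 import BeautifulSoup

def scrape_agent(url):
    """Fetch one agent profile and pull the requested fields."""
    html = requests.get(url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")

    def text_or_blank(selector):
        node = soup.select_one(selector)
        return node.get_text(strip=True) if node else ""

    return {
        "name": text_or_blank("h1"),                 # placeholder selector
        "address": text_or_blank(".agent-address"),  # placeholder selector
        "phone": text_or_blank(".agent-phone"),      # placeholder selector
        "reviews": text_or_blank(".review-count"),   # placeholder selector
        "rating": text_or_blank(".star-rating"),     # placeholder selector
        "profile_url": url,
    }

if __name__ == "__main__":
    rows = [scrape_agent("https://agents.allstate.com/weller-agency-eden-ut.html")]
    with open("agents.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)
```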
Hourly - Entry Level ($) - Est. Time: More than 6 months, 30+ hrs/week - Posted
You will log into a website that provides specifications for items and export the data into Excel. Some of the data can be exported automatically; other items you will need to copy/paste manually, unless you can find another method such as data mining, which you are free to use as long as the data is accurate. (If you are able to mine the data, you will receive a bonus equal to the average number of hours it would have taken to gather the data manually.) The information will be saved to a Google Drive file, as well as a local Excel file for backup. We will explain the steps and procedures to you over a Skype screen-share conference; we have also uploaded a tutorial video to YouTube for your reference, which will be provided to you upon hire. We have done this manually ourselves and know how long it takes: the average rate is approximately 80-90 entries per hour worked, and you will be expected to meet this quota. If you are able to mine the data in bulk, you will be compensated with a bonus at this rate. YOU MUST BE AVAILABLE FOR WORK & COMMUNICATION DURING OUR WORKING HOURS (9AM-5PM PACIFIC TIME) FOR THE FIRST WEEK. AFTER BOTH PARTIES ARE COMFORTABLE WITH YOU WORKING ON YOUR OWN, YOU MAY WORK YOUR OWN HOURS. THIS IS ABSOLUTELY REQUIRED.
Skills: Data scraping Data Entry Data mining Internet research
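If the "data mining" route the posting allows is viable, a hedged sketch might look like the following: log in with a requests session and write rows straight to a local Excel backup with openpyxl. The URLs, form fields, credentials, and column names are all placeholders, not the actual site.

```python
# Hedged sketch of the bulk route: authenticated fetch, then local .xlsx backup.
import requests
from openpyxl import Workbook

LOGIN_URL = "https://example.com/login"      # placeholder
DATA_URL = "https://example.com/specs.json"  # placeholder

session = requests.Session()
session.post(LOGIN_URL, data={"user": "me", "password": "secret"})  # placeholder creds

items = session.get(DATA_URL, timeout=30).json()  # assumes a JSON endpoint exists

wb = Workbook()
ws = wb.active
ws.append(["item", "spec", "value"])  # header row
for item in items:
    ws.append([item.get("item", ""), item.get("spec", ""), item.get("value", "")])

wb.save("specs_backup.xlsx")  # local Excel backup; a copy would go to Google Drive
```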
Hourly - Intermediate ($$) - Est. Time: 1 to 3 months, 10-30 hrs/week - Posted
This job is focused on advancing the experience that thousands of users get when navigating, browsing, searching and comparing the content offered through our proprietary technology platform. The end result (the output of the ontology model) will be a set of intuitive and comprehensive multi-level navigation structures (hierarchical taxonomies, facets) for browsing, searching and tagging the content offered to our clients. The task is envisioned to be achieved primarily with Semantic Web concepts and data (LOD and other available SKOS) as per Semantic Web standards. It will most likely require knowledge (or learning) of several RDF-based schemas (Resume RDF, HRM Ontology, HR-XML, FOAF, SIOC, Schema.org) and use of the W3C's Semantic Web technology stack (SPARQL, Protege, semantic reasoners).

Key tasks:
- Definition of RDF Schema and ontologies based on several existing RDF schemas (Resume RDF, HRM Ontology, HR-XML, FOAF, SIOC, Schema.org, etc.)
- Linking available LOD and SKOS data sets; building several core multi-level hierarchical taxonomies (on the order of tens of thousands of elements) comprehensively describing the content in our system
- Rule-based processing and linking of multiple existing and newly obtained data sets using semantic reasoners
- Definition, structuring and optimization of hierarchical data sets; definition and maintenance of hierarchical relationships between particular terms (facets)
- Research (independent, as well as guided by the management team) on publicly available SKOS and LOD sets related to the content of the platform, from public sources (international standards, patent databases, public and government databases, various organizations, available XML datasets, etc.) as well as acquired proprietary sources
- Retrieval and ETL of multiple additional data sets from multiple sources
- Tagging, classification, entity extraction
- Working with the management team to maintain and advance particular segments of the defined taxonomies

Optional stretch tasks (depending on the candidate's qualifications):
- Automatic analysis of content, extraction of semantic relationships
- Auto-tagging, auto-indexing
- Integration and usage of selected IBM Watson services for content analysis
- Integration with enterprise taxonomy management platforms (Mondeca, Smartlogic, PoolParty, or others)

This job will initially require a commitment of 15-20 hours per week over a 3-6 month engagement. Interaction with a responsible manager will be required at least twice a week over Skype and Google Hangouts. Longer-term cooperation is possible based on the results of the initial engagement.

Required experience:
- Detailed knowledge of Semantic Web concepts and techniques
- Intimate familiarity with the W3C's Semantic Web technology stack (RDF, SPARQL, etc.)
- Hands-on experience with LOD (DBpedia and others) and various SKOS
- Experience modeling data based on various RDF schemas (Resume RDF, HRM Ontology, HR-XML, FOAF, SIOC, ISO 25964, etc.)
- Knowledge of common open-source ontology environments and tools (MediaWiki, Protege, etc.) or other enterprise-grade ontology tools (Synaptica, Data Harmony, PoolParty, Mondeca, TopBraid, etc.)
- Experience working with semantic reasoners
- Prior experience with content management and maintenance of taxonomies for consumer or e-commerce applications

Additional preferred experience:
- Background in Library and Information Science (MLIS), Knowledge Management, Information Management, Linguistics or Cognitive Sciences
- Familiarity with common classification systems
- Experience working with catalog and classification systems and creating thesauri
- Auto-tagging, auto-classification, entity extraction
Skills: Data scraping Web Crawling Data Analytics Data Entry
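As a concrete taste of the LOD work this posting describes, here is a hedged sketch that pulls one level of a SKOS hierarchy from DBpedia over SPARQL using the SPARQLWrapper library. The category queried is arbitrary; the posting's actual domain vocabularies would differ.

```python
# Hedged sketch: list the narrower SKOS concepts under one DBpedia category.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?narrower ?label WHERE {
        ?narrower skos:broader <http://dbpedia.org/resource/Category:Occupations> ;
                  rdfs:label ?label .
        FILTER (lang(?label) = "en")
    }
    LIMIT 20
""")
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["narrower"]["value"], "-", row["label"]["value"])
```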
Fixed-Price - Entry Level ($) - Est. Budget: $80 - Posted
I need a script I can run whenever I want to scrape info from my client's site and save it into Excel, and a second script that converts that Excel file to PDF. So I need 2 scripts; it's a very simple task. I don't want you to scrape any data for me, I only need the scripts. Thanks
Skills: Data scraping PDF Conversion
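A hedged sketch of the two requested scripts, assuming the client site exposes its data in an HTML table: one function scrapes into an Excel file (pandas), the other renders that file as a simple PDF table (fpdf2). The URL and the table assumption are placeholders.

```python
# Hedged sketch: script 1 scrapes a page into .xlsx, script 2 renders it as PDF.
import pandas as pd
import requests
from bs4 import BeautifulSoup
from fpdf import FPDF  # fpdf2 package

def scrape_to_excel(url, xlsx_path):
    soup = BeautifulSoup(requests.get(url, timeout=30).text, "html.parser")
    rows = [[cell.get_text(strip=True) for cell in tr.find_all(["td", "th"])]
            for tr in soup.find_all("tr")]  # assumes the data lives in a table
    pd.DataFrame(rows[1:], columns=rows[0]).to_excel(xlsx_path, index=False)

def excel_to_pdf(xlsx_path, pdf_path):
    df = pd.read_excel(xlsx_path)
    pdf = FPDF()
    pdf.add_page()
    pdf.set_font("Helvetica", size=9)
    for row in [list(df.columns)] + df.astype(str).values.tolist():
        for cell in row:
            pdf.cell(40, 8, cell[:20], border=1)  # crude fixed-width cells
        pdf.ln()
    pdf.output(pdf_path)

if __name__ == "__main__":
    scrape_to_excel("https://example.com/data", "client_data.xlsx")  # placeholder URL
    excel_to_pdf("client_data.xlsx", "client_data.pdf")
```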
Hourly - Entry Level ($) - Est. Time: Less than 1 month, Less than 10 hrs/week - Posted
I need an Excel file with the name of each fire department, a link to its website, and the names and email addresses of all its employees (only list employees who have an email), for all the fire departments listed in the database below. Unless you speak German, I expect you to use the Google Translate function to get things into English and navigate that way; that is my current approach to the language barrier. Use this site: http://www.feuerwehrlinks-deutschland.de/suche.php Search for "feuer" in the search box at the top of the site; that will give you about 3,250 hits on different fire departments/brigades.
Skills: Data scraping Data Entry Data mining
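A hedged starting-point sketch for harvesting the department list: fetch the search page and collect outbound links. The "begriff" query-parameter name and the result markup are guesses, since the real search form would need to be inspected; employee names and emails would then be gathered from each department's own site.

```python
# Hedged sketch: collect candidate department websites from the search page.
import requests
from bs4 import BeautifulSoup

resp = requests.get(
    "http://www.feuerwehrlinks-deutschland.de/suche.php",
    params={"begriff": "feuer"},  # placeholder query-parameter name
    timeout=30,
)
soup = BeautifulSoup(resp.text, "html.parser")

departments = []
for link in soup.find_all("a", href=True):
    href = link["href"]
    # Keep only outbound links, assumed to point at department sites.
    if href.startswith("http") and "feuerwehrlinks-deutschland.de" not in href:
        departments.append({"name": link.get_text(strip=True), "website": href})

print(len(departments), "candidate department sites")
```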
Fixed-Price - Intermediate ($$) - Est. Budget: $50 - Posted
Hi, I need a freelancer who can extract lead information from a LinkedIn group which I will provide to you. The group has 30,000 members; if you know data mining and data extraction, this will be easy for you. LinkedIn only shows 1,500 of a group's members, and we require 10,000 leads out of the 30,000 for now, so if you have a trick for getting past that limit, this can be easy for you. I need it done ASAP. Thanks
Skills: Data scraping Data Entry Data mining
Fixed-Price - Expert ($$$) - Est. Budget: $500 - Posted
We have a list of 1,000 websites and we need a data-building expert to find contacts at each company. The requirements:
1. Find contact name, email address, phone number, designation, and website monthly traffic from SimilarWeb and/or ClearWebStats
2. Contacts should be from Digital/Online Marketing, Ecommerce, Product Management, UI/UX, etc.
3. We need a minimum of 3 contacts from each company, of manager designation or above
If interested, please respond with the following details:
1. Relevant experience
2. Estimated time to complete 1,000 websites
3. Number of people on your team who'll work on the project
4. How soon can you start?
5. Data accuracy percentage
Skills: Data scraping Data Entry Data mining Internet research
Hourly - Entry Level ($) - Est. Time: 3 to 6 months, Less than 10 hrs/week - Posted
We need to insert ads into our portal by scraping them from the internet and loading them through our import tool. We use import.io, but other tools whose JSON output is compatible with our import tool are acceptable. Data entry should take care with the English text and focus on keywords. A customer-service and administrative-assistant job may be available at the end of this task.
Skills: Data scraping Data Entry json
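For the hand-off format, a hedged sketch of scraping ad listings into JSON that an import tool could consume. The source URL, the ".ad" selector, and the output schema are all placeholders; import.io's own export format would differ.

```python
# Hedged sketch: scrape ad listings and emit a JSON file for an import tool.
import json

import requests
from bs4 import BeautifulSoup

soup = BeautifulSoup(requests.get("https://example.com/ads", timeout=30).text,
                     "html.parser")

ads = []
for node in soup.select(".ad"):  # placeholder selector
    ads.append({
        "title": node.get_text(strip=True)[:80],
        "keywords": [w.lower() for w in node.get_text().split()[:5]],  # crude keywording
        "source": "example.com",  # placeholder
    })

with open("ads_import.json", "w", encoding="utf-8") as f:
    json.dump(ads, f, ensure_ascii=False, indent=2)
```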
Hourly - Entry Level ($) - Est. Time: Less than 1 month, Less than 10 hrs/week - Posted
We have recently won a grant to develop our new product, Pikhaya, which uses open data to provide free market intelligence to help entrepreneurs find viable business premises. There are 350 local authorities in England and Wales, of which 70 currently publish compliant data we need to include in our model. On a quarterly basis, we will need to extract data from their websites, transform it into a common, machine-readable (CSV) format, and then upload it into our database. This is the first time we are running this process and we are aiming to hire three different freelancers to do it. After the first round of ETL, the most accurate freelancer would then work with us on a regular basis to produce the data we require.

Scope of work: 70 current local authorities, most with only one dataset (usually Excel or CSV) to download and transform. An example is Chorley (http://chorley.gov.uk/Pages/AtoZ/Information.aspx). We are specifically interested in empty business premises, which is not always available (sometimes only 'all', sometimes only 'occupied'). On the example page, you'll need to click on Freedom of Information and then the file name 'Chorley - Ratepayer account data February 16.csv'. You will need to be comfortable transforming messy, inconsistently structured tabular data into a standardised machine-readable format, with a high level of data accuracy. We'll provide you with the details in Google Sheets for each local authority site where you need to do a download. We are also in the process of filing Freedom of Information requests with the other 280 local authorities (so we may have more shortly).

Please respond with:
1. The time you think it will take you (we'd like to have it complete by early March);
2. Examples of work which reflect the approach you would take with our brief (i.e. something relatively similar);
3. How you will ensure data quality;
4. An estimated price for the full job (although you can bill by the hour).
  • Number of freelancers needed: 3
Skills: Data scraping Data Entry Data mining Google Spreadsheets
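A hedged ETL sketch of one quarterly pass, assuming pandas: read one authority's ratepayer CSV, keep only the empty premises, and write the common schema. Column names vary by authority, so the rename map and filename here are illustrative only.

```python
# Hedged sketch: normalise one local authority's ratepayer CSV into a
# common schema. The column mapping is illustrative, not Chorley's actual headers.
import pandas as pd

COLUMN_MAP = {                      # per-authority mapping, illustrative only
    "Property Address": "address",
    "Rateable Value": "rateable_value",
    "Occupation Status": "status",
}

df = pd.read_csv("chorley_ratepayer_feb16.csv")  # placeholder filename
df = df.rename(columns=COLUMN_MAP)[list(COLUMN_MAP.values())]

# Keep only empty premises; some authorities publish only 'all' or only
# 'occupied', as the brief warns, so this column may not always exist.
if "status" in df.columns:
    df = df[df["status"].str.strip().str.lower() == "empty"]

df.insert(0, "local_authority", "Chorley")
df.to_csv("chorley_standardised.csv", index=False)
```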