Data Scraping Jobs

363 were found based on your criteria

Fixed-Price - Intermediate ($$) - Est. Budget: $800 - Posted
This is the first in a series of jobs, i.e., it can lead to a lot of work. It's the fun part of data science and analysis (in my opinion), where you are connecting disparate data sources and looking for interesting correlations/causations. Some of this *could* be done in Excel, but we'd prefer Python or R. Ideally, you are familiar with many public data sets, including US Census, NIH, IRS, WISQARS, Zillow, NAR, GSS, and many more; i.e., we won't always be able to tell you where to go to get data. The work you do here, if satisfactory, will pave the way to further analysis.

This initial job consists of two tasks.

1. Are the rates at which taxes are prepared correlated with the foreclosure rate in a given zip?
a) We will take property default rates at the zip level from Zillow (per 10,000 units): http://www.zillow.com/research/data/ and http://files.zillowstatic.com/research/public/Zip/Zip_HomesSoldAsForeclosures-Ratio_AllHomes.csv
b) We will take IRS-reported data on tax preparations (it's buried among lots of data here): https://www.irs.gov/uac/soi-tax-stats-individual-income-tax-statistics-zip-code-data-soi
Questions:
- How does the foreclosure rate correlate with tax preparation?
- Has that changed over time?
- Can you isolate the effects of this as an independent variable?
- Produce charts, the r value for the correlation, and other insights.

2. How does gun ownership affect foreclosure rates and home prices? (Note: this is a broader question.)
a) Property data: http://www.zillow.com/research/data/ (foreclosures, median price, and price-per-square-foot data)
b) Gun ownership data (this will require creativity): https://www.atf.gov/firearms/listing-federal-firearms-licensees-ffls-2013 (change the last four digits in the URL to get years from 2013 to 2015). Also use WISQARS, NCHS, GSS, Census, etc.
Questions: Does gun ownership impact property prices? Are they correlated in any way?
Skills: Data scraping, Analytics, Data Science, Statistics
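For a posting like this, a minimal pandas sketch of task 1 might look as follows. The Zillow CSV name comes from the posting; the IRS extract `irs_zip.csv` and its `paid_preparer_rate` column are hypothetical placeholders, since the real IRS zip-code tables need their own cleaning pass.

```python
# Minimal sketch of task 1, assuming the Zillow CSV layout (one row per zip,
# one column per month) and a hypothetical IRS extract "irs_zip.csv" with
# columns RegionName and paid_preparer_rate.
import pandas as pd
from scipy.stats import pearsonr

zillow = pd.read_csv(
    "Zip_HomesSoldAsForeclosures-Ratio_AllHomes.csv", dtype={"RegionName": str}
)
# Collapse the monthly columns into one average foreclosure ratio per zip.
month_cols = [c for c in zillow.columns if c[:2] in ("19", "20")]
zillow["foreclosure_ratio"] = zillow[month_cols].mean(axis=1)

irs = pd.read_csv("irs_zip.csv", dtype={"RegionName": str})  # hypothetical extract

merged = zillow.merge(irs, on="RegionName").dropna(
    subset=["foreclosure_ratio", "paid_preparer_rate"]
)
r, p = pearsonr(merged["foreclosure_ratio"], merged["paid_preparer_rate"])
print(f"Pearson r = {r:.3f} (p = {p:.3g}, n = {len(merged)})")
```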
Fixed-Price - Intermediate ($$) - Est. Budget: $540 - Posted
Hi, I'm looking to create a desktop app that will automatically pull in the data from AngelList after I set the criteria for a company search.

Here's an example search: https://angel.co/companies?locations[]=San+Francisco&locations[]=San+Francisco&locations[]=San+Francisco&locations[]=San+Francisco&locations[]=San+Francisco&locations[]=San+Francisco&locations[]=San+FRANCISCO&raised[min]=2830196&raised[max]=100000000&signal[min]=4.1&signal[max]=10

Here's an example output I'm looking to create: https://docs.google.com/spreadsheets/d/14pb8Vyy7hStUD8aP32zzF-b69JhqhbXPdc-AaGWKkV8/edit?usp=sharing

The only difference is there should be up to several employees for each company (depending on how many AngelList shows) rather than just one. Keep in mind, I also need the scraper to find the LinkedIn URLs of each employee listed. From my understanding, data from only 400 companies can be pulled at a time. I'm totally fine with that.
Skills: Data scraping, JavaScript, Web scraping
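Since AngelList renders search results with JavaScript, a plain HTTP fetch likely won't see the data; one plausible approach is browser automation. The CSS selectors below are placeholders that must be read off the live page; only the search URL comes from the posting.

```python
# A rough sketch of the pull via browser automation. The selectors are
# placeholders -- the real class names must be taken from the live page.
from selenium import webdriver
from selenium.webdriver.common.by import By

SEARCH_URL = (
    "https://angel.co/companies"
    "?locations[]=San+Francisco&raised[min]=2830196&raised[max]=100000000"
    "&signal[min]=4.1&signal[max]=10"
)

driver = webdriver.Chrome()
driver.get(SEARCH_URL)

rows = []
for card in driver.find_elements(By.CSS_SELECTOR, ".company-card"):  # placeholder
    name = card.find_element(By.CSS_SELECTOR, ".company-name").text  # placeholder
    employees = [
        (e.text, e.get_attribute("href"))  # employee name + LinkedIn URL
        for e in card.find_elements(By.CSS_SELECTOR, "a[href*='linkedin.com']")
    ]
    rows.append((name, employees))

driver.quit()
print(rows[:5])
```

The ~400-company cap the posting mentions would then just bound how far a pagination loop runs.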
Hourly - Expert ($$$) - Est. Time: Less than 1 week, 10-30 hrs/week - Posted
We are looking for an expert in WordPress, MySQL, and phpMyAdmin to provide proven advice or plugin suggestions on the transfer of JUST WordPress content, such as articles, metadata, and images that match up with the permalinks. We know there are many plugins for full site migration, such as BackupBuddy and VaultPress, and have tested them, but these are not what we are looking for. We specifically want ONLY articles and images migrated, with nothing else: no plugins, no themes, and no residual data. Just the content. What we require is someone who knows exactly how to do this, as we've tested the official WordPress import/export tools but they are not currently working. So we need some suggestions, and we may possibly require support via chat to assist us with a manual phpMyAdmin transfer. We anticipate 1-3 hours of time for this.
Skills: Data scraping, Data Backup, Data Recovery, phpMyAdmin
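One way to get a content-only transfer of the kind this posting asks for is to dump just the post, meta, and taxonomy tables rather than the whole database. A sketch, with placeholder credentials and the default `wp_` table prefix assumed:

```python
# Content-only export sketched with mysqldump: dump just the post/meta/term
# tables and skip wp_options, wp_users, plugin tables, etc.
# Credentials and table prefix are placeholders.
import subprocess

CONTENT_TABLES = [
    "wp_posts",      # articles, pages, and attachment (image) records
    "wp_postmeta",   # per-post metadata, incl. attachment file paths
    "wp_terms", "wp_term_taxonomy", "wp_term_relationships",  # categories/tags
]

with open("content_only.sql", "w") as dump:
    subprocess.run(
        ["mysqldump", "-u", "wp_user", "-p", "wp_database", *CONTENT_TABLES],
        stdout=dump,
        check=True,
    )
```

The image files themselves live under wp-content/uploads and would still need to be copied separately; the wp_postmeta rows carry the paths that tie them back to the permalinks.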
Fixed-Price - Expert ($$$) - Est. Budget: $350 - Posted
I need a web scraper / crawler that will access this URL, and I need to be able to modify the URL if changes are made in the future. This will be two scrapers, I assume, as they will access different sections. http://a810-bisweb.nyc.gov/bisweb/PropertyProfileOverviewServlet?requestid=3&bin=1015592 We will be using the "bin" as the unique identifier; the bin will be pulled from a database. We know that the website checks to see whether the visitor is a real browser or a crawler. There is also a prioritization page which will load at times. We need the application to be able to monitor/crawl the site for changes.

Once at the page, we need information from the "jobs/filings" section. This section can be accessed via a direct URL. Under this section there is a drop-down at the top to "show all filings"; we want to select "hide subsequent filings". We want to take each individual job # and access each of those pages. These are the pages we want to scrape / crawl. http://a810-bisweb.nyc.gov/bisweb/JobsQueryByLocationServlet?requestid=4&allbin=1015592&allstrt=WEST%20%20%2024%20STREET&allnumbhous=49

The second section/crawler is the "actions" screen. Here we want to scrape each of the pages. http://a810-bisweb.nyc.gov/bisweb/ActionsByLocationServlet?requestid=1&allbin=1015592

We will be inserting the records into a MySQL database. We will need a SQL dump to create the database and tables on our server, and a config file for database connection settings. We need a config file for proxy IP addresses, usernames, passwords, etc. If there are entries in the proxy config file, then the app has to crawl the pages using each proxy server with a round-robin strategy. We need another config file to configure a) how many instances we can launch concurrently, and b) a wait time in between each request and the prioritization page. We need a config file for entering User-agent strings. If there are entries in this file, the crawler will use each User-agent string when setting the User-agent string on HTTP headers while requesting pages.

We need to check HTTP responses for errors. If the status code is anything but 200, the app should try again. If the status code is 200 but the response body is the prioritization page, then the app should wait at least 5 seconds and refresh the page to pass the prioritization page.
Skills: Data scraping, JavaScript, Web scraping
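The fetch behavior this posting specifies (round-robin proxies, rotating User-agent strings, retry on non-200, and a 5-second wait on the prioritization page) can be sketched in a few lines of Python. The config file names and the marker text used to detect the prioritization page are placeholders:

```python
# Fetch-loop sketch: round-robin proxies, rotating User-agent strings,
# retry on non-200, and a 5-second wait on the prioritization page.
# Assumes both config files exist and are non-empty.
import itertools, time, requests

proxies = itertools.cycle(open("proxies.txt").read().split())          # host:port per line
user_agents = itertools.cycle(open("user_agents.txt").read().splitlines())

def fetch(url, max_tries=5):
    for _ in range(max_tries):
        proxy = next(proxies)
        resp = requests.get(
            url,
            headers={"User-Agent": next(user_agents)},
            proxies={"http": f"http://{proxy}", "https": f"http://{proxy}"},
            timeout=30,
        )
        if resp.status_code != 200:
            continue                                  # retry on any non-200
        if "prioritization" in resp.text.lower():     # placeholder marker text
            time.sleep(5)                             # wait, then refresh the page
            continue
        return resp.text
    raise RuntimeError(f"giving up on {url}")

bin_number = 1015592
html = fetch(
    "http://a810-bisweb.nyc.gov/bisweb/PropertyProfileOverviewServlet"
    f"?requestid=3&bin={bin_number}"
)
```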
Fixed-Price - Intermediate ($$) - Est. Budget: $500 - Posted
We are looking for someone who can help us do the following:
* Take data in an Excel spreadsheet, convert it to a dbf file, and confirm that the data was successfully converted and is in the correct fields.
* Take data that is in a dbf file, remove unwanted data, and upload it into the SQL database.
* Test and confirm that the data was successfully uploaded into the database and is located in the correct fields. This includes the ability to create queries and debug the SQL database and/or locate data upload errors.
The datasets range in size from 1 MB to large files of up to several hundred GB. The contract can be fixed-price by the job (file/data uploaded) or hourly with a cost not to exceed. Individual freelancers only, please.
Skills: Data scraping, Data mining
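For the dbf-to-SQL leg, a sketch assuming the simpledbf package for reading and SQLAlchemy for the upload; the connection string, table name, and dropped columns are placeholders:

```python
# dbf -> SQL sketch: read the dbf, drop unwanted columns, load into MySQL,
# then spot-check the row count. All names below are placeholders.
import pandas as pd
from simpledbf import Dbf5
from sqlalchemy import create_engine, text

df = Dbf5("input.dbf").to_dataframe()
df = df.drop(columns=["UNUSED1", "UNUSED2"], errors="ignore")  # placeholder columns

engine = create_engine("mysql+pymysql://user:password@localhost/mydb")
df.to_sql("imported_data", engine, if_exists="append", index=False)

with engine.connect() as conn:
    count = conn.execute(text("SELECT COUNT(*) FROM imported_data")).scalar()
print(f"{count} rows now in imported_data (loaded {len(df)} this run)")
```

For the multi-hundred-GB files the posting mentions, the same flow would need to stream in chunks rather than load one DataFrame.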
Fixed-Price - Expert ($$$) - Est. Budget: $200 - Posted
I need an automated way to scrape data from two specific filing types in the EDGAR database and have it populated into an Excel list by category. I'm open to suggestions for how to accomplish this. Ideally this could be automated one way or another to happen on a daily/weekly basis.
Skills: Data scraping, Data mining
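One way to automate this is against EDGAR's daily form index, which lists every filing for a day in a fixed-width text file. The filing types, the index date, and the column offsets below are placeholders to verify against the actual file header; the SEC asks that requests carry a descriptive User-Agent.

```python
# Daily EDGAR pull sketch: fetch the day's form index, keep two filing
# types, and write the matches to Excel. Offsets are approximate and
# should be checked against the .idx header row.
import pandas as pd
import requests

IDX_URL = "https://www.sec.gov/Archives/edgar/daily-index/2016/QTR1/form.20160104.idx"
WANTED = ("13F-HR", "SC 13D")  # placeholder filing types

raw = requests.get(IDX_URL, headers={"User-Agent": "you@example.com"}).text
lines = raw.splitlines()
# Data starts after the separator line of dashes.
start = next(i for i, l in enumerate(lines) if set(l.strip()) == {"-"}) + 1

rows = []
for line in lines[start:]:
    form = line[:12].strip()
    if form in WANTED:
        rows.append({
            "form": form,
            "company": line[12:74].strip(),
            "cik": line[74:86].strip(),
            "date": line[86:98].strip(),
            "path": line[98:].strip(),
        })

pd.DataFrame(rows).to_excel("edgar_filings.xlsx", index=False)
```

A cron job (or Windows Task Scheduler) pointed at the current day's index would cover the daily/weekly cadence.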
Fixed-Price - Intermediate ($$) - Est. Budget: $35 - Posted
I need someone to create a small script that does the following:
1. Scrape angel.co/jobs (Role: Sales, Location: United States).
2. For each listing:
a. Get the company name.
b. Get the company domain name.
c. Get the company founders' first and last names. There can be multiple founders.
d. Output to xls or Google Docs.
Thanks!
Skills: Data scraping, Web scraping
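The scrape itself would look much like the AngelList sketch above; the output step the posting describes (multiple founders per company) flattens naturally into one spreadsheet row per founder. `listings` below is placeholder data standing in for scraped results:

```python
# Output sketch: expand each company's founder list into one row per founder.
import pandas as pd

listings = [  # placeholder data standing in for scraped angel.co/jobs results
    {"company": "Acme", "domain": "acme.com",
     "founders": [("Jane", "Doe"), ("John", "Roe")]},
]

rows = [
    {"company": l["company"], "domain": l["domain"],
     "founder_first": first, "founder_last": last}
    for l in listings
    for first, last in l["founders"]
]
pd.DataFrame(rows).to_excel("angel_sales_jobs.xlsx", index=False)
```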
Fixed-Price - Intermediate ($$) - Est. Budget: $150 - Posted
I am looking for someone to perform a one-time scrape of information from approximately 14,000 pages on an eRetail website. For each page, you will locate and store a short, specific list of attributes which I will provide. Every page is formatted the same, so you will be able to find the attributes in the same place on each page. The final product that you will deliver is a CSV file that includes these attributes. Thanks!
Skills: Data scraping, Web Crawling, Data mining, Web Crawler
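Because every page shares one template, a single set of selectors covers all ~14,000 URLs. A minimal sketch, where the attribute list, CSS selectors, and `urls.txt` input are placeholders for what the client would supply:

```python
# Fixed-layout scrape sketch: same selectors on every page, rows to CSV.
import csv, time
import requests
from bs4 import BeautifulSoup

FIELDS = ["title", "price", "sku"]  # placeholder attribute list

with open("urls.txt") as f, open("out.csv", "w", newline="") as out:
    writer = csv.DictWriter(out, fieldnames=["url"] + FIELDS)
    writer.writeheader()
    for url in (line.strip() for line in f):
        soup = BeautifulSoup(requests.get(url, timeout=30).text, "html.parser")
        row = {"url": url}
        for field in FIELDS:
            node = soup.select_one(f".{field}")  # placeholder CSS selectors
            row[field] = node.get_text(strip=True) if node else ""
        writer.writerow(row)
        time.sleep(1)  # stay polite; ~4 hours of sleep time over 14,000 pages
```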
Hourly - Expert ($$$) - Est. Time: More than 6 months, 10-30 hrs/week - Posted
Pull college data from IPEDS (http://nces.ed.gov/ipeds/datacenter/Default.aspx) in subject areas like psychology, licensed practical nursing, registered nursing, healthcare administration, and more. For each subject area, pull institution contact information, school enrollment, teacher-student ratios, graduation rate, institution accreditation, program accreditation, degree levels offered, degree specialties offered, and more. You may need to pull data from several different databases, such as accreditation (http://ope.ed.gov/accreditation/GetDownloadFile.aspx) and the scorecard (https://collegescorecard.ed.gov/data/), and merge the data. Create simple use cases to ensure the datasets are merged correctly. Create a repeatable process that can be run again every six months or every year. Ensure the data is useful and relevant for the subject area. In some cases, help formulate datasets so they can be used as part of ranking college programs against each other. In addition to these projects, help pull and format other job and career data.
Skills: Data scraping, Data mining, Microsoft Excel
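For the merge step, a pandas sketch assuming both extracts carry the IPEDS UNITID (the Scorecard file does; the accreditation download may need to be joined through the OPE ID instead, so verify against the actual headers). File names and columns are placeholders:

```python
# Merge-and-check sketch for the Scorecard + accreditation extracts.
import pandas as pd

scorecard = pd.read_csv("Most-Recent-Cohorts-All-Data-Elements.csv",
                        usecols=["UNITID", "INSTNM", "C150_4"], low_memory=False)
accred = pd.read_csv("Accreditation_2016_03.csv")  # placeholder file name

merged = scorecard.merge(accred, on="UNITID", how="left")

# Simple use case: every institution should appear exactly once after the
# merge unless it holds multiple accreditations.
dupes = merged["UNITID"].duplicated().sum()
print(f"{len(merged)} rows, {dupes} institutions with multiple accreditation rows")
```

Wrapping this in a script keyed only on file names is what makes the six-month/annual refresh repeatable.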