Web Crawler Jobs

64 were found based on your criteria

Fixed-Price - Entry Level ($) - Est. Budget: $999,999 - Posted
For a new business opportunity I am looking for someone who can set up a crawler that automatically searches for jobs in the Netherlands using a couple of parameters (location, job title, etc.). The crawler must search multiple websites. If you have a better idea, please contact me so we can discuss it.
Skills: Web Crawler, Web Crawling
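As a rough illustration of what this posting asks for, here is a minimal sketch of a parameterized multi-site crawler. The site URLs, query parameters, and CSS selectors below are placeholders, not real Dutch job boards, and would need to be adapted to whichever sites are actually targeted.

```python
# Minimal sketch: query several job sites with location/title parameters
# and collect the job titles found. URLs and selectors are placeholders.
import requests
from bs4 import BeautifulSoup

SITES = [
    # (search URL template, CSS selector for a job-title element) -- hypothetical examples
    ("https://example-jobboard.nl/search?q={title}&location={location}", "h2.job-title"),
    ("https://another-board.nl/vacatures?titel={title}&plaats={location}", "a.vacancy"),
]

def crawl(title, location):
    results = []
    for url_template, selector in SITES:
        url = url_template.format(title=title, location=location)
        resp = requests.get(url, timeout=30)
        resp.raise_for_status()
        soup = BeautifulSoup(resp.text, "html.parser")
        for node in soup.select(selector):
            results.append({"site": url, "title": node.get_text(strip=True)})
    return results

if __name__ == "__main__":
    for job in crawl("developer", "Amsterdam"):
        print(job)
```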
Fixed-Price - Intermediate ($$) - Est. Budget: $100 - Posted
I used to subscribe to Rainking.com and was able to download its complete list of IT decision makers fairly easily, up to a maximum number of records at a time. I suspect that with a web crawler it can be downloaded more quickly. If you have access to the Rainking.com and/or discover.org database, I would need all available fields, including first name, last name, address, city, state, zip, country for data outside of the USA, phone #, fax # if available, any category fields, and most importantly the email address at the individual name level. I would also be interested in the same type of data from discover.org and anything else similar where the focus is IT executives around the world. I'd like the data in CSV format. Please let me know what it would cost and how quickly I could get the file. The budget shown is a placeholder.
Skills: Web Crawler, Web Crawling, Data Entry, Data mining
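A minimal sketch of just the requested output format follows. The fetch step is left as a placeholder, since pulling records out of Rainking/DiscoverOrg depends on having a subscription and on how their export tools or logged-in pages are structured; the field names mirror the list in the posting.

```python
# Minimal sketch: write contact records with the requested fields to CSV.
# fetch_contacts() is a placeholder for whatever source is actually used.
import csv

FIELDS = [
    "first_name", "last_name", "address", "city", "state", "zip",
    "country", "phone", "fax", "category", "email",
]

def fetch_contacts():
    # Placeholder: yield dicts keyed by FIELDS from the chosen data source.
    yield {"first_name": "Jane", "last_name": "Doe", "email": "jane.doe@example.com"}

with open("it_decision_makers.csv", "w", newline="", encoding="utf-8") as fh:
    writer = csv.DictWriter(fh, fieldnames=FIELDS, restval="")
    writer.writeheader()
    for record in fetch_contacts():
        writer.writerow(record)
```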
Fixed-Price - Intermediate ($$) - Est. Budget: $100 - Posted
I am interested in a list of people involved in commercial real estate in the New Jersey (NJ), USA counties on the attachment, from costar.com or a similar service: Bergen, Passaic (excluding West Milford and Ringwood), Morris, Hudson, Essex, Union, and Middlesex. The towns with their zip codes are shown below. You would need to have access to Costar.com or a similar service. I would like all available fields separately, such as first name, last name, job title, address, city, state, zip code, phone number, fax number, email address for the individual (with as few generic emails such as info@vornado.com as possible), and type of company, such as Brokers and Brokerage Firms. I would like the data on individuals under Brokers and Brokerage Firms, Owners and Investors, Multifamily Owners and Property Managers, and Retailers and Corporations. Please let me know what source(s) you will be using for this data and a cost per thousand records provided, or for the complete job if you can bring in all the data from Costar. Please dedupe if needed so we don't see the same individual contact names multiple times. Please provide the final file in XLS or CSV format. I'd also like the time needed to complete the project. Pricing shown is just a placeholder.
Skills: Web Crawler, Web Crawling, Data Entry, Data mining
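The dedupe-and-export step the posting asks for could look like the sketch below. It assumes the scraped records have already been collected into rows with the column names shown, which are illustrative rather than actual CoStar field names.

```python
# Minimal sketch: drop repeats of the same individual at the same firm
# before exporting to CSV. Column names and sample rows are illustrative.
import pandas as pd

rows = [
    {"first_name": "John", "last_name": "Smith", "company": "Vornado",
     "company_type": "Brokers and Brokerage Firms", "email": "jsmith@example.com"},
    {"first_name": "John", "last_name": "Smith", "company": "Vornado",
     "company_type": "Brokers and Brokerage Firms", "email": "jsmith@example.com"},
]

df = pd.DataFrame(rows)
# Keep the first occurrence of each (first name, last name, company) combination.
df = df.drop_duplicates(subset=["first_name", "last_name", "company"], keep="first")
df.to_csv("nj_cre_contacts.csv", index=False)
```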
Fixed-Price - Intermediate ($$) - Est. Budget: $100 - Posted
I will be providing you with a list of perhaps 5,000 or 10,000 New Jersey, USA realtors in the format on the attachment. I will need you to research accurate email addresses found from sources such as LinkedIn profiles and company websites, and to a lesser extent from less accurate sources such as data.com, possibly Hoovers if you have access, and Zoominfo, and add them to the Excel document I provide. You should be able to find the email addresses by googling around as well. I envision part of this task may be automated via a web crawler, whereby you pull in email addresses from various sources and then verify whether different sources return different email addresses for the same person at the same company. I'd like your cost per thousand email addresses added to the estimated 5,000 or 10,000 records and the time needed to complete this project. The budget provided is just a placeholder.
Skills: Web Crawler, Web Crawling, Data Entry, Data mining
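The verification step described above, reconciling candidate addresses gathered from several sources, could be sketched as follows. The source data and (name, company) keying are illustrative assumptions, not a prescribed workflow.

```python
# Minimal sketch: group candidate emails per (name, company) and flag
# conflicting candidates for manual review. Sample data is illustrative.
from collections import defaultdict

candidates = [
    ("Jane Doe", "Acme Realty", "linkedin", "jane.doe@acmerealty.com"),
    ("Jane Doe", "Acme Realty", "company_site", "jdoe@acmerealty.com"),
    ("Bob Roe", "Roe Homes", "zoominfo", "bob@roehomes.com"),
]

by_person = defaultdict(set)
for name, company, source, email in candidates:
    by_person[(name, company)].add(email.lower())

for (name, company), emails in by_person.items():
    if len(emails) == 1:
        print(name, company, "->", next(iter(emails)))
    else:
        print(name, company, "-> NEEDS REVIEW:", sorted(emails))
```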
Fixed-Price - Intermediate ($$) - Est. Budget: $250 - Posted
You will be looking up a catalog number on 3-4 websites and recording whether an item is exactly the same. If an item is exactly the same, you will record the catalog number in the Excel spreadsheet. We also need the manufacturer name to be recorded. All fields can be copied and pasted, so you don't need to type a large amount of information. Step-by-step instructions:
1. Take the Manuf Part No and search it on the three websites of Spectrum Chemicals, VWR, and Fisher (the websites are given in the Excel attachment).
2. Collect the catalog number for that product on each of the websites and put it in the spreadsheet.
3. If the search results do not match the product, put N/A in that field.
You need to look at the item description and ensure it is the same item. A number of manufacturer parts return multiple items or a different product, so you need to be careful. Complete the missing data in the attached Excel spreadsheet and attach it with your application as a sample. Also mention how many you can do per day.
Skills: Web Crawler, Data Entry, Internet research
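If part of this were automated, the spreadsheet-filling loop might look like the sketch below. lookup_catalog_number() is a placeholder, since each supplier site (Spectrum Chemicals, VWR, Fisher) needs its own search URL and result parsing, and the column positions assumed here would have to match the actual attachment.

```python
# Minimal sketch: read each Manuf Part No from the workbook and write the
# per-supplier catalog number (or N/A) back into the next three columns.
from openpyxl import load_workbook

SUPPLIERS = ["Spectrum Chemicals", "VWR", "Fisher"]

def lookup_catalog_number(part_no, supplier):
    # Placeholder: search the supplier site for part_no, confirm the item
    # description matches, and return its catalog number, or None.
    return None

wb = load_workbook("parts.xlsx")
ws = wb.active
for row in ws.iter_rows(min_row=2):          # assumes row 1 is the header
    part_no = row[0].value                   # assumes Manuf Part No is in column A
    for i, supplier in enumerate(SUPPLIERS, start=1):
        catalog_no = lookup_catalog_number(part_no, supplier)
        row[i].value = catalog_no if catalog_no else "N/A"
wb.save("parts_filled.xlsx")
```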
Fixed-Price - Entry Level ($) - Est. Budget: $40 - Posted
I hope to find the most popular keywords in Indonesian. My idea is that the crawler accepts some seed keywords. It generates 1M keywords using, say, ubersuggest.io. Then it fetches the detailed information on these 1M keywords using Google Keyword Planner. Your ideas are welcome. I prefer Python, but other languages are fine.
Skills: Web Crawler, Web Crawling, Data scraping, Python
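A minimal sketch of the seed-expansion step follows, using Google's public autocomplete endpoint as a stand-in for an ubersuggest-style service; the endpoint, its parameters, and the "id" language setting are assumptions, and pulling search-volume detail from Google Keyword Planner would additionally require a Google Ads API account, which is not shown.

```python
# Minimal sketch: expand seed keywords breadth-first via autocomplete
# suggestions until a target count is reached. Endpoint usage is assumed.
import requests

def expand(seed, lang="id"):
    resp = requests.get(
        "https://suggestqueries.google.com/complete/search",
        params={"client": "firefox", "hl": lang, "q": seed},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()[1]   # second element is the list of suggestions

seen = set()
frontier = ["kerja", "wisata"]        # example Indonesian seed keywords
while frontier and len(seen) < 1000:  # cap kept small for the sketch
    seed = frontier.pop()
    for kw in expand(seed):
        if kw not in seen:
            seen.add(kw)
            frontier.append(kw)
print(len(seen), "keywords collected")
```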
Hourly - Intermediate ($$) - Est. Time: Less than 1 week, 10-30 hrs/week - Posted
We need to crawl 10M geotagged records from Flickr / Instagram / Twitter to do a data visualization on a map, achieving something like http://www.digitaltrends.com/mobile/stunning-maps-visualize-twitter-and-flickr-use/
Tasks:
1. Register Flickr / Instagram / Twitter developer accounts.
2. Research their APIs to write a crawler that grabs the data within a geofence bounding box, e.g. the San Francisco bounding box: -123.0137, 37.6040, -122.3549, 37.8324.
3. Deliverables:
1. Three daemon/service-like Python programs that crawl the geotagged data from Flickr / Instagram / Twitter and store it in the NoSQL database MongoDB.
2. They should be stable enough to crawl data 24/7.
3. They should crawl 1 million geotagged records per week even given the rate limits of the APIs.
4. The programs must have scalability and multithreading/queueing ability, e.g. via the Celery library in Python.
GEOTAG is a must! We don't need data with no GPS information.
Required: Python experience writing services/daemons; MongoDB, Redis, Celery; Twitter / Instagram / Flickr API experience.
Skills: Web Crawler, Data Science, Data scraping, MongoDB
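A minimal sketch of the Twitter leg only follows (Flickr and Instagram would need their own clients). It assumes tweepy 4.x against the v1.1 streaming API with valid developer credentials and rate-limited access; for 24/7 operation this would be wrapped in a supervised service and fed through a queue such as Celery, which is not shown.

```python
# Minimal sketch: stream tweets inside the San Francisco bounding box and
# store only those carrying exact GPS coordinates into MongoDB.
import tweepy
from pymongo import MongoClient

SF_BBOX = [-123.0137, 37.6040, -122.3549, 37.8324]  # west, south, east, north

collection = MongoClient("mongodb://localhost:27017")["geo"]["tweets"]

class GeoStream(tweepy.Stream):
    def on_status(self, status):
        # Keep only tweets with point coordinates; skip place-only geotags.
        if status.coordinates:
            collection.insert_one({
                "id": status.id,
                "text": status.text,
                "coordinates": status.coordinates,
                "created_at": status.created_at,
            })

stream = GeoStream("CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_TOKEN", "ACCESS_SECRET")
stream.filter(locations=SF_BBOX)
```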