Web Crawler Jobs

24 were found based on your criteria

Hourly - Intermediate ($$) - Est. Time: 3 to 6 months, Less than 10 hrs/week - Posted
We're looking for an expert who has used the crawling software 'Crawl Anywhere 4' extensively in the past. More specifically, a deep understanding of the inner workings and data flow of that system is required.
Some key information:
- The system holds 500+ sources that need to be crawled continuously
- Deployed on Ubuntu 16.04
Requirements:
- Deep understanding of Crawl Anywhere (Crawler, Pipeline, and Indexer scripts) and its possible configurations
- Experience using Solr
- Confident working with the Ubuntu 16.04 OS
- Able to find performance bottlenecks and possible solutions (both OS and Crawl Anywhere)
- Confident with SSH
Skills: Web Crawler, Apache Solr, Web Crawling, Laravel Framework
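Not part of the posting, but a minimal sketch of the kind of health check such work involves: querying the Solr index that Crawl Anywhere feeds for per-source document counts, which helps spot stalled or slow sources. The Solr URL, core name, and the "source" facet field are assumptions, not details from the posting.

# Minimal sketch: ask Solr for per-source document counts to spot stalled crawls.
# The Solr base URL and the "source" facet field are assumptions; adjust them to
# match the actual Crawl Anywhere / Solr schema.
import requests

SOLR_SELECT = "http://localhost:8983/solr/crawl/select"  # assumed core name

def docs_per_source(top_n=20):
    params = {
        "q": "*:*",
        "rows": 0,                 # only facet counts are needed, no documents
        "facet": "true",
        "facet.field": "source",   # hypothetical field holding the source name
        "facet.limit": top_n,
        "wt": "json",
    }
    resp = requests.get(SOLR_SELECT, params=params, timeout=30)
    resp.raise_for_status()
    facets = resp.json()["facet_counts"]["facet_fields"]["source"]
    # Solr returns a flat [name, count, name, count, ...] list; pair it up.
    return dict(zip(facets[::2], facets[1::2]))

if __name__ == "__main__":
    for source, count in docs_per_source().items():
        print(f"{source}: {count} documents")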
Hourly - Intermediate ($$) - Est. Time: More than 6 months, Less than 10 hrs/week - Posted
Hello - I'm looking to work with a researcher/job recruiter who can help me find remote jobs within the field of media. I'm a media producer specializing in web content, editing for text, video and graphics, news items and more.
This is a two-part job:
1) Research and collect postings for remote / work-from-home jobs within this field. Work from home ONLY.
2) Help me submit my cover letter and resume to each job by tailoring both to each specific posting before we submit them.
You will be paid for each round of items 1 and 2 as we go. Please send a description of how you would approach this job and why you feel you are a qualified candidate. MUST BE: skilled at deep web research and at finding these specific types of jobs only. No on-site jobs; it must be 100% remote. Add the word "REMOTE" at the top of your response so that I know you are not spamming. Thanks!
Skills: Web Crawler, Human Resource Information Systems, Human Resource Management, Internet research
Hourly - Intermediate ($$) - Est. Time: Less than 1 week, 10-30 hrs/week - Posted
We need to crawl 10M geotagged posts from Flickr / Instagram / Twitter to build a data visualization on a map, something like http://www.digitaltrends.com/mobile/stunning-maps-visualize-twitter-and-flickr-use/
Tasks:
1. Register Flickr / Instagram / Twitter developer accounts.
2. Research their APIs and write a crawler that grabs the data within a geofence bounding box, e.g. the San Francisco bounding box: -123.0137, 37.6040, -122.3549, 37.8324.
3. Deliverables:
   1. Three daemon/service-like Python programs that crawl the geotagged data from Flickr / Instagram / Twitter and store it in the NoSQL database MongoDB (a rough sketch follows the skills line below).
   2. They should be stable enough to crawl data 24/7.
   3. They should crawl 1 million geotagged records per week even given the rate limits of the APIs.
   4. The programs must be scalable and able to run multithreaded, e.g. via a queue library such as Celery in Python.
GEOTAG is a must! We don't need data with no GPS information.
Required: Python experience writing services/daemons; MongoDB, Redis, Celery; Twitter / Instagram / Flickr API experience.
Skills: Web Crawler, Data Science, Data scraping, MongoDB
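None of this code is in the posting, but a minimal sketch of the daemon shape it describes, storing geotagged records in MongoDB with pymongo, might look like the following. fetch_geotagged() is a hypothetical placeholder for the platform-specific API call (Flickr, Instagram, or Twitter); the bounding box is the San Francisco one given above, and the database and collection names are illustrative.

# Minimal daemon sketch: poll a platform API for geotagged posts inside a bounding
# box and store them in MongoDB. fetch_geotagged() is a hypothetical placeholder
# for the real Flickr/Instagram/Twitter client call; only records that carry GPS
# coordinates are kept.
import time
from pymongo import MongoClient

SF_BBOX = (-123.0137, 37.6040, -122.3549, 37.8324)  # west, south, east, north

def fetch_geotagged(bbox):
    """Placeholder: call the platform API and return a list of dicts that may
    include a 'geo' field with coordinates. Must respect the API's rate limits."""
    raise NotImplementedError

def run(poll_interval=60):
    coll = MongoClient("mongodb://localhost:27017")["geo"]["posts"]
    while True:  # service-like loop; run under systemd/supervisor for 24/7 operation
        try:
            records = [r for r in fetch_geotagged(SF_BBOX) if r.get("geo")]
            if records:
                coll.insert_many(records, ordered=False)
        except Exception as exc:       # back off on rate limits or network errors
            print("fetch failed, backing off:", exc)
            time.sleep(poll_interval * 5)
            continue
        time.sleep(poll_interval)

if __name__ == "__main__":
    run()

Scaling this to the required volume would mean one such process per platform, with Celery or a similar queue in front of the MongoDB writes, as the posting suggests.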
Hourly - Expert ($$$) - Est. Time: Less than 1 week, Less than 10 hrs/week - Posted
I am hoping this is possible. I have an application on my Android phone that sends me a notification whenever a new task is available. If I don't select the task in time, I can't see it, because it will have been awarded to someone else. The alternative way for me to see a task is to click 'Available Tasks' or 'Refresh Available Tasks' inside the application; a task only appears there if it hasn't yet been awarded to someone else.
The scraper will work in the following way:
1) Capture all available tasks by triggering 'Available Tasks' or 'Refresh Available Tasks' in the actual application.
2) If a task fits the criteria I specify, immediately select/choose it, so that it appears on my calendar inside the app.
Skills: Web Crawler, Data mining, Web scraping
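The posting describes a select-if-matching loop rather than any particular code, but a hedged sketch of that decision logic could look like the following. The task fields and criteria are invented for illustration, and fetch_available_tasks() / accept_task() are stand-ins for whatever automation the Android app actually permits.

# Sketch of the selection logic only: poll for available tasks and accept the
# first ones that match the user's criteria. fetch_available_tasks() and
# accept_task() are hypothetical stand-ins for the app's 'Refresh Available
# Tasks' and selection actions; the criteria fields are illustrative.
import time

CRITERIA = {"min_pay": 50.0, "areas": {"downtown", "midtown"}}  # illustrative

def matches(task, criteria):
    return task.get("pay", 0) >= criteria["min_pay"] and task.get("area") in criteria["areas"]

def fetch_available_tasks():
    """Placeholder for capturing the current 'Available Tasks' list."""
    raise NotImplementedError

def accept_task(task):
    """Placeholder for selecting the task so it lands on the in-app calendar."""
    raise NotImplementedError

def run(poll_seconds=2):
    while True:
        for task in fetch_available_tasks():
            if matches(task, CRITERIA):
                accept_task(task)
        time.sleep(poll_seconds)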
Hourly - Intermediate ($$) - Est. Time: Less than 1 week, 10-30 hrs/week - Posted
For this project, you would log in to our thebluebook.com account with our user ID and password. There are no emails there. You get the company name, city, state, zip code, and contacts with their phone number and fax number where available. Under paving contractors there are 140 listings in northern New Jersey and 159 in central and southern New Jersey; some of the central and southern New Jersey listings fall under the geographic areas we are interested in, which are below.
I need you to pull the data, perhaps using an automated tool such as a spider (web crawler). Save the data into fields: contact name, address, city, state, zip code, phone number, and fax number. Find and add an email address for each, preferably at the individual-name level; an email address beginning with info@ will likely not be useful for us, and we would like the email addresses to be as accurate as possible. If there are multiple contacts, they go under multiple records. We only want records in our specific geographic area.
If this project goes well, we will have many more similar projects, some much larger in size. Please show me some of your other work using the same kind of skills.
Skills: Web Crawler, Web Crawling, Data Entry, Data mining
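Not from the posting, but a minimal sketch of the scrape-and-save step with requests and BeautifulSoup, writing the requested fields to CSV. The login URL, form field names, and CSS selectors are assumptions, since thebluebook.com's actual markup isn't given; a real implementation would start by inspecting the site behind the provided login.

# Minimal sketch: log in with a requests session, parse one listing page with
# BeautifulSoup, and write the contact fields to CSV. The login URL, form field
# names, and CSS selectors are assumptions and would need to be matched to the
# site's real markup.
import csv
import requests
from bs4 import BeautifulSoup

LOGIN_URL = "https://www.thebluebook.com/login"        # assumed
LISTING_URL = "https://www.thebluebook.com/search"     # assumed

def scrape(username, password, out_path="contractors.csv"):
    session = requests.Session()
    session.post(LOGIN_URL, data={"username": username, "password": password})
    soup = BeautifulSoup(session.get(LISTING_URL).text, "html.parser")

    with open(out_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["contact", "address", "city", "state", "zip", "phone", "fax"])
        for card in soup.select(".company-card"):      # assumed selector
            def text(sel):
                node = card.select_one(sel)
                return node.get_text(strip=True) if node else ""
            writer.writerow([text(".contact"), text(".address"), text(".city"),
                             text(".state"), text(".zip"), text(".phone"), text(".fax")])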
Hourly - Entry Level ($) - Est. Time: Less than 1 month, Less than 10 hrs/week - Posted
Hello, I need a script that can parse and download LinkedIn users' profiles matching a specific keyword, for example biotech: https://www.linkedin.com/vsearch/p?rsid=339609241475240167905&keywords=biotech&trk=vsrp_people_cluster_header&trkInfo=VSRPsearchId%3A339609241475240167905%2CVSRPcmpt%3Apeople_cluster
There are 68,046 profiles; I need the data from these profiles in an XML file, along with avatar pictures. Please estimate the number of hours you'll need for this task. The requirements are: 1) quality of the parsing, 2) a minimal per-hour rate. Thank you
Skills: Web Crawler, Web Crawling
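Not part of the posting, but a minimal sketch of the output step only: serializing already-collected profile records, including the local path of a downloaded avatar image, into one XML file with the standard library. The record fields are illustrative; collecting the profiles themselves is out of scope here and would have to respect LinkedIn's terms and rate limits.

# Minimal sketch of the XML output step: write already-collected profile records
# (dicts with illustrative fields) into a single XML file, including the local
# path of a downloaded avatar image.
import xml.etree.ElementTree as ET

def write_profiles_xml(profiles, out_path="profiles.xml"):
    root = ET.Element("profiles", keyword="biotech")
    for p in profiles:
        node = ET.SubElement(root, "profile")
        for field in ("name", "headline", "location", "avatar_file"):
            ET.SubElement(node, field).text = p.get(field, "")
    ET.ElementTree(root).write(out_path, encoding="utf-8", xml_declaration=True)

# Example usage with a single illustrative record:
write_profiles_xml([{"name": "Jane Doe", "headline": "Biotech researcher",
                     "location": "Boston", "avatar_file": "avatars/jane_doe.jpg"}])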