Web Crawler Jobs

55 jobs were found based on your criteria

Hourly - Expert ($$$) - Est. Time: Less than 1 week, Less than 10 hrs/week - Posted
This job is for a 30-minute Skype call to give my development team some tips to improve their scraping. The devs are scraping high-volume data from Google AdWords and search results. They are using proxies and other methods but are still having issues scaling this. The scraping is a repetitive activity that happens every day. The devs are programming in Python. You will be perfect for this job if you have experience with very high-volume data scraping from Google and know how to work around their system. You are bidding on a 30-minute phone call.
Skills: Web Crawler Data mining Data scraping Python
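Scaling scraping of this kind usually comes down to spreading requests over many exit IPs and avoiding machine-regular timing. A minimal Python sketch of that pattern, assuming a round-robin proxy pool (the proxy addresses, User-Agent string, and delay range below are placeholders, not part of the job post):

```python
import itertools
import random
import time
import urllib.request

# Hypothetical proxy pool -- in practice loaded from a file or a
# rotating-proxy provider.
PROXIES = ["http://10.0.0.1:8080", "http://10.0.0.2:8080", "http://10.0.0.3:8080"]

def make_rotator(proxies):
    """Return a callable that hands out proxies round-robin, so no single
    exit IP carries enough request volume to trip rate limits."""
    pool = itertools.cycle(proxies)
    return lambda: next(pool)

def fetch(url, proxy, timeout=10):
    """Fetch one page through the given proxy, sending a browser-like
    User-Agent instead of urllib's default."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy, "https": proxy}))
    opener.addheaders = [("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64)")]
    return opener.open(url, timeout=timeout).read()

def crawl(urls, rotator):
    """Visit each URL on a fresh proxy with randomized delays between
    requests -- constant intervals are themselves a bot signature."""
    for url in urls:
        yield fetch(url, rotator())
        time.sleep(random.uniform(2.0, 6.0))
```

The randomized sleep matters as much as the proxies: uniform request spacing is one of the easiest automation signals to detect server-side.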
Fixed-Price - Intermediate ($$) - Est. Budget: $100 - Posted
Hi, I need data on ALL agents.
Agent database URL: http://www.propertyguru.com.sg/property-agent-directory/firstname/A
Typical agent profile: http://www.propertyguru.com.sg/agent/mr-zola-tan-66755
  • Profile ID - 66755
  • Agent Img - agent/66755.jpg
  • Agent Name - Zola Tan
  • Agency - ERA REALTY NETWORK PTE LTD
  • Agency Licence - L3002382K
  • Agent Licence - R029291I
  • Agent E-mail - zoo_tan@hotmail.com
  • Agent Phone - 93692952
  • Agent Website - http://www.zola.myweb.sg/
For the agent email, you will have to look in their INTRO section or on the agent's website. Note: not all agents will have a website, and not all agents will display their email address in their intro - for those agents, the email and/or website can be left blank. For the agent phone, there might at times be two phone numbers; you only need to capture the main one. To run this scraper, proxies and multi-threading are required. I should be able to provide a .txt list of proxies for the scraper to use. I will require the script/bot to run on my Windows 7 machine so that I can refresh the data whenever I require. Regards
Skills: Web Crawler Web Crawling Data scraping Web scraping
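The "proxies plus multi-threading" requirement can be sketched as a worker pool where each profile URL is paired up front with the next proxy from a round-robin pool. This is a sketch only: the actual page download and field extraction are left behind a `fetch(url, proxy)` function, since those depend on the site's markup.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor

def load_proxies(path):
    """Read one proxy per line from the .txt list the client provides."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

def scrape_profiles(urls, proxies, fetch, workers=8):
    """Pair each profile URL with a proxy, then fan the (url, proxy) jobs
    out over a thread pool. Pairing is done before submission so worker
    threads never share a live iterator."""
    jobs = list(zip(urls, itertools.cycle(proxies)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda job: fetch(*job), jobs))
```

`ThreadPoolExecutor.map` preserves input order, so results line up with the URL list even though requests complete out of order.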
Hourly - Entry Level ($) - Est. Time: 1 to 3 months, 10-30 hrs/week - Posted
I need someone who does search engine research. The research needs to be evaluated against different criteria, and the researched information must be entered into an Excel sheet. Basic German language skills are required for this job. The work will be explained in training videos. Payment will be on an hourly basis, with a bonus on top for extra good work.
Skills: Web Crawler English German Microsoft Excel
Fixed-Price - Intermediate ($$) - Est. Budget: $150 - Posted
Hi, I need data from OneMap.
Here is the map: https://www.onemap.sg/index.html
Here is the API documentation link: http://www.onemap.sg/api/help/
What data do I need? Valid postal codes (the postal code range is 000000 - 999999). Find the valid postal codes, then use them to find:
  • Lat
  • Long
  • Building Name
  • Street Name
  • Full Street Address
Example:
  • Postal Code - 410636
  • Lat - 1.3311
  • Long - 103.9044
  • Building Name - EUNOS TENAGA VILLE
  • Street Name - BEDOK RESERVOIR ROAD
  • Full Street Address - 636 Bedok Reservoir Road
Bonus: if you scan through the map, you will notice they use particular colors for particular landscape types. For example:
  • 636 Bedok Reservoir Road is an HDB building - yellow.
  • Eunos MRT is an MRT station - dull blue.
  • Eunos Bus Interchange is a bus station - brown.
  • Bayshore Park is a private condominium - grey.
  • Bedok Stadium is a recreational place - pink.
  • East Coast Park is a national park - green.
  • Ping Yi Secondary School is a school - lighter yellow.
  • MASJID ABDUL GAFOOR is a mosque (place of worship) - light purple.
Why am I saying this? If you are able to get the full list of available place categories, it will help us put the scraped addresses in the right categories. Also, some important places like Bedok Reservoir Park, Admiralty Park, East Coast Park and more don't even have a postal code. So, again, if we can get the list of categories, we will be able to build a better list of places using the categories and the places that fall under them. If both can be done, we will have a complete list of places in Singapore. Let me know if you can do this.
Skills: Web Crawler Web Crawling Google Map Maker Google Maps API
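The brute-force part of this job is enumerating all one million candidate codes and keeping only the ones the API recognises. A minimal sketch, with the HTTP call left behind a `lookup(code)` function (the endpoint and the result key names below are assumptions to be checked against the linked OneMap API docs, not taken from them):

```python
def postal_codes(start=0, stop=1000000):
    """Every candidate code in the 000000-999999 range, zero-padded to
    six digits (e.g. 636 -> '000636')."""
    return (f"{n:06d}" for n in range(start, stop))

def parse_address(record):
    """Map one API result onto the fields the job asks for. The key
    names here are assumptions -- confirm them in the API docs."""
    return {
        "postal":   record.get("POSTAL"),
        "lat":      record.get("LATITUDE"),
        "long":     record.get("LONGITUDE"),
        "building": record.get("BUILDING"),
        "street":   record.get("ROAD_NAME"),
        "address":  record.get("ADDRESS"),
    }

def scan(lookup, codes=None):
    """Yield a parsed record for every postal code the API recognises.
    `lookup(code)` performs the HTTP call against the documented endpoint
    and returns a result dict, or None for an invalid code."""
    for code in (codes if codes is not None else postal_codes()):
        record = lookup(code)
        if record is not None:
            yield parse_address(record)
```

A code counts as "valid" exactly when the lookup returns a hit; everything else is skipped, which produces the valid-postal-code list and the address fields in one pass.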
Hourly - Entry Level ($) - Est. Time: Less than 1 month, Less than 10 hrs/week - Posted
I'm looking for data mining of all vacation homes for a specific vacation/travel destination. This would include gathering from sites such as VRBO, TripAdvisor, and many others. The challenge that needs to be addressed is that the same home can be listed on multiple sites; duplicates are not acceptable. Estimated 1,000 rentals. Data I would like to collect includes:
  • Home name
  • Address
  • Number & size of beds & bedrooms
  • Direct web addresses of the listing
  • Names of listing agency(s) with phone numbers, email, and web addresses
  • Price - per week, per night, and per person/couple
  • Air conditioning (y/n)
  • Pool (y/n)
  • Photos available
Other data may be added as needed. Data format - I'm open to ideas; initially I'd like a spreadsheet if possible. The goal is to use the data and/or sell it. I understand that without knowing the exact location this may be hard to estimate, so an hourly rate would be fine; just give an estimate of the hours required.
Skills: Web Crawler Data mining Database design Web scraping
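The cross-site duplicate problem above is usually handled by normalizing each record into a canonical key (name plus address, lowercased, punctuation stripped) and merging records that collide. A minimal sketch, assuming records with hypothetical `name`/`address`/`urls` fields:

```python
import re

def listing_key(name, address):
    """Canonical key for one home: lowercase, collapse punctuation and
    whitespace, so 'Sea-View Villa' on VRBO matches 'Sea View  Villa'
    on TripAdvisor."""
    canon = lambda s: re.sub(r"[^a-z0-9]+", " ", s.lower()).strip()
    return (canon(name), canon(address))

def dedupe(listings):
    """Keep one record per home, merging listing URLs from duplicates
    so the other sites' direct links are not lost."""
    homes = {}
    for item in listings:
        key = listing_key(item["name"], item["address"])
        if key in homes:
            homes[key]["urls"].extend(item["urls"])
        else:
            homes[key] = {**item, "urls": list(item["urls"])}
    return list(homes.values())
```

Exact-key matching like this catches formatting variants only; homes listed under different names on different sites would still need fuzzy matching (e.g. on address plus bedroom count) on top.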
Fixed-Price - Expert ($$$) - Est. Budget: $7,500 - Posted
We have to build a large database for marketing. Please let us know your price for a database of 1,000,000 records. Long-term opportunity for a good team or company. Looking for a low price and a quick finisher.
1. First Name
2. Last Name
3. Professional Title
4. Specialty
5. Organization Name
6. Address 1
7. Address 2
8. City
9. State
10. Zip
11. Phone
12. Email (verified) - this should be the individual's direct email address, not a shared email address. Company email addresses such as info@medsouth.com do not qualify. The domain must be a company domain. Addresses from personal email providers (AOL, Hotmail, Gmail, etc.) are unacceptable.
13. Company URL
Skills: Web Crawler Data Entry Data mining Data scraping
Fixed-Price - Expert ($$$) - Est. Budget: $2,500 - Posted
Automation & software development EXPERTS only! We are trying to emulate a real user in a real browser on a website that has advanced checks to stop bots.
~The scenario~
We have to interact with a certain website through a browser (clicks). There are multiple steps that have to be performed, and on each step the response page from the website has to be processed by our app/code. For the last couple of months we had a solution that uses a scriptable WebKit headless browser called PhantomJS, and it worked well without any flaws. Unfortunately, this website recently started detecting that we use PhantomJS, and our code no longer works.
~Your solution~
We need someone with fresh ideas who can help us restore our functionality. We have noticed that a normal browser works perfectly fine on the website, while PhantomJS does not and is detected. You need to come up with a solution that can use this website without being detected as a bot. You can use any method at your disposal, but the most likely solution will be automating an actual browser such as Chrome.
VERY IMPORTANT: your code/solution needs to be able to run multiple browsers at the same time! REALIZE that we need to be able to run all of your code from our C# or C++ app on Windows (your solution has to be controlled from code). IDEALLY we would like a solution similar to PhantomJS but more like a real browser.
Looking forward to forming a long-term business relationship with a skilled individual. Good work means more work!
Skills: Web Crawler Automated Testing Automation C
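For context on why the detection above works: PhantomJS leaks well-known globals (`window.callPhantom`, `window._phantom`), ships "PhantomJS" in its default User-Agent, and, being plain WebKit, exposes no Chrome-style `window.chrome` object. The durable fix is what the post suggests, driving a real Chrome, but it helps to know the fingerprints. A sketch of the kind of check the target site is likely running, written in Python over a dict snapshot of the page's globals (the snapshot format is an illustration, not a real API):

```python
# Globals PhantomJS is known to inject into every page.
PHANTOM_GLOBALS = ("callPhantom", "_phantom")

def looks_automated(env):
    """env: dict snapshot of the page's global/navigator properties,
    e.g. {"callPhantom": ..., "userAgent": "..."}. Returns True when
    the environment shows PhantomJS fingerprints."""
    if any(name in env for name in PHANTOM_GLOBALS):
        return True
    if "PhantomJS" in env.get("userAgent", ""):
        return True
    # A UA claiming to be Chrome without a window.chrome object is
    # another giveaway a spoofed headless WebKit trips over.
    return "Chrome" in env.get("userAgent", "") and "chrome" not in env
```

Spoofing the User-Agent alone fails the last check, which is consistent with the post's observation that only a real browser passes.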
Fixed-Price - Entry Level ($) - Est. Budget: $20 - Posted
Looking for one or two people to download a PDF a couple of thousand times from the link provided. Simple download - no search, no intelligence. You should have an international VPN with dynamic IPs. Experience in web traffic would help.
Skills: Web Crawler traffic geyser
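A repeated-download task like this reduces to a loop that fetches the same URL N times, optionally triggering an IP change between requests. A minimal sketch; the `rotate_ip` hook is hypothetical and would wrap whatever reconnect command the VPN client actually provides:

```python
import time
import urllib.request

def repeat_download(url, times, fetch=None, rotate_ip=None, pause=1.0):
    """Download the same PDF `times` times, calling an optional
    rotate_ip() hook (e.g. a VPN reconnect script) before each request.
    Returns how many responses actually looked like a PDF."""
    if fetch is None:
        fetch = lambda u: urllib.request.urlopen(u, timeout=30).read()
    ok = 0
    for _ in range(times):
        if rotate_ip is not None:
            rotate_ip()
        data = fetch(url)
        if data.startswith(b"%PDF"):  # PDF files begin with this magic header
            ok += 1
        time.sleep(pause)
    return ok
```

Counting only responses with the `%PDF` header separates real downloads from error pages served with a 200 status.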