Data Harvesting Specialist (Senior Level)

Posted 2 days ago

Worldwide

Summary

We are looking for a Senior Data Harvesting and Automation Engineer who can build strong, reliable systems that collect large amounts of information from online sources. The goal is to turn raw online data into clean, organized, useful information that our teams can use for automation, analytics, sales, research, and decision-making.

In this role, you will design full data pipelines. This means you will collect the data, clean it, enrich it, organize it, store it, and then send it to the tools our teams use every day. To do this well, you must be very comfortable working with Python, Playwright, Selenium, Scrapy, and modern data-handling tools. You should also know how to work with databases, APIs, and ETL processes.

The systems you build must handle JavaScript-heavy websites, login pages, dynamic content, changing layouts, and anti-automation defenses. You will create pipelines that run on their own, update themselves, and recover from minor failures without stopping.

We want someone who thinks about long-term solutions, not quick fixes. You should be able to plan, design, test, and maintain full data systems that remain reliable as they scale and handle more data. The work you do will support our automation systems, AI tools, dashboards, and market-research programs across many industries.

This job is perfect for someone who enjoys solving complex problems, building strong data systems, and making information easy for others to use.

Deliverables
1. Data Harvesting Systems
  • Build high-volume data harvesting tools using Python, Playwright, Selenium, Scrapy, and similar libraries.
  • Handle dynamic websites, multi-step forms, login pages, and heavy JavaScript content.
  • Create pipelines that collect data in real time, on schedules, or in large batches.
  • Use methods like proxy rotation, session management, and browser automation for reliability.
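As a rough illustration of the proxy-rotation point above: at its simplest, rotation is a cycling pool that each outgoing request draws from. The proxy URLs below are placeholders, and a production harvester would add health checks and per-session stickiness on top of this.

```python
from itertools import cycle

# Placeholder proxy endpoints; a real pool would come from a proxy provider's API.
PROXIES = [
    "http://proxy-a.example:8080",
    "http://proxy-b.example:8080",
    "http://proxy-c.example:8080",
]
_pool = cycle(PROXIES)

def next_proxy() -> str:
    """Return the next proxy so consecutive requests exit from different IPs."""
    return next(_pool)
```

Each browser context or HTTP session would be handed `next_proxy()` at creation time, spreading load across exit IPs.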
2. Automated ETL and Data Enrichment
  • Build ETL workflows that clean, organize, validate, and enrich data.
  • Standardize messy information into clear formats.
  • Add extra value through enrichment such as categories, tags, metadata, and location details.
  • Ensure the final datasets are clean, accurate, and ready to use.
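One hedged sketch of what "standardize and enrich" can look like in practice; the field names and the category rule here are illustrative placeholders, not a prescribed schema.

```python
def clean_record(raw: dict) -> dict:
    """Normalize a messy scraped record and attach a simple enrichment tag."""
    name = " ".join(raw.get("name", "").split()).title()  # collapse whitespace, fix casing
    phone = "".join(ch for ch in raw.get("phone", "") if ch.isdigit())  # digits only
    record = {"name": name, "phone": phone}
    # Toy enrichment rule; real pipelines would use lookup tables or classifiers.
    record["category"] = "retail" if "shop" in name.lower() else "other"
    return record
```

For example, `clean_record({"name": "  acme   SHOP ", "phone": "(555) 123-4567"})` yields `{"name": "Acme Shop", "phone": "5551234567", "category": "retail"}`.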
3. API and System Integrations
  • Build REST APIs that deliver processed data to internal tools.
  • Connect harvested data to systems such as CRMs, dashboards, and automation platforms.
  • Design schemas for leads, companies, products, service categories, and multi-industry datasets.
  • Maintain consistent data flow from pipelines into our operational tools.
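For instance, a lead schema can start as a typed record that serializes cleanly for an internal REST endpoint. The fields here are assumptions about what a lead record might carry, not the actual schema.

```python
from dataclasses import dataclass, asdict

@dataclass
class Lead:
    """Illustrative lead record; the field set is a placeholder, not the real schema."""
    company: str
    domain: str
    industry: str
    source_url: str

# Serialize to a plain dict, ready to return as JSON from an API endpoint.
payload = asdict(Lead(company="Acme Co", domain="acme.example",
                      industry="manufacturing", source_url="https://acme.example/about"))
```

Keeping the schema in one typed definition like this gives the CRM sync, the dashboard feed, and the API one shared contract.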
4. Database Architecture
  • Create and manage SQL and NoSQL databases.
  • Build data models that support fast reading, writing, and updating.
  • Use indexing and caching techniques for better performance.
  • Keep data stored safely, clearly, and logically.
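A minimal sketch of the indexing idea using Python's built-in sqlite3 (table and column names are made up): an index on the lookup column keeps reads fast as the table grows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE leads (id INTEGER PRIMARY KEY, domain TEXT, score REAL)")
conn.execute("CREATE INDEX idx_leads_domain ON leads (domain)")  # speeds up domain lookups
conn.executemany("INSERT INTO leads (domain, score) VALUES (?, ?)",
                 [("acme.example", 0.9), ("widgets.example", 0.4)])
row = conn.execute("SELECT score FROM leads WHERE domain = ?", ("acme.example",)).fetchone()
```

`EXPLAIN QUERY PLAN` on that `SELECT` confirms the engine searches via `idx_leads_domain` instead of scanning the whole table.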
5. Monitoring and System Stability
  • Build monitoring tools that check pipeline health, speed, and accuracy.
  • Set up alerts for errors, slowdowns, or unusual behavior.
  • Add retry systems, backup methods, and fail-safe processes.
  • Keep pipelines running smoothly with little manual work.
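The retry idea above can be sketched as a small wrapper with exponential backoff (delays shortened here for illustration); a real pipeline would also log each failure and fire an alert when the last attempt is exhausted.

```python
import time

def with_retries(fn, attempts=3, base_delay=0.01):
    """Call fn, retrying with exponential backoff; re-raise after the last attempt."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to monitoring/alerting
            time.sleep(base_delay * 2 ** attempt)  # 0.01s, 0.02s, ...
```

Wrapping each fetch or write step this way lets transient network errors heal themselves with no manual intervention.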
6. Documentation
  • Write clear instructions and explanations for each pipeline and tool.
  • Create diagrams showing how data moves through the system.
  • Document best practices so the system can grow over time.
  • Keep code and SOPs organized and easy for others to understand.
7. Team and Data Support
  • Work with analytics, automation, marketing, and sales teams.
  • Provide clean data that powers dashboards, workflows, and research.
  • Help teams use the data to improve decision-making and strategy.
  • Support multi-industry intelligence projects with accurate, timely information.
  • Hourly: more than 30 hrs/week
  • Duration: 6+ months
  • Experience Level: Expert
  • Rate: $50.00 - $100.00 /hr
  • Remote Job
  • Project Type: Complex project
Skills and Expertise
Mandatory skills
Data Analysis
Data Science
Nice-to-have skills
Data Mining
Big Data
Tools
Apache Spark
Python
SQL
Activity on this job
  • Proposals: 20 to 50
  • Last viewed by client: 2 days ago
  • Interviewing: 3
  • Invites sent: 3
  • Unanswered invites: 0
About the client
Member since Dec 5, 2022
  • United States (Canton)
  • $114K total spent
  • 61 hires, 29 active
  • 7,424 hours
