Data Scientist

Posted 2 weeks ago

Worldwide

Summary

Data Scientist: AI Evaluation Specialist (Contract, Remote)

The work

YellowPad helps businesses turn information buried in documents into structured, auditable data. We're looking for an analytical data scientist to rigorously evaluate the quality of outputs from our AI-powered document data-extraction system. The core question is simple to ask and hard to answer: how good are the outputs, and where should we improve them next?

You will not build or deploy the production system. You will understand the relevant workflow, measure the quality of its outputs, identify where errors come from, and tell the team, with evidence, what's working, what's broken, and where to invest. Your main deliverable is a clear, prioritized written report backed by data.

Engagement

Contract, fully remote. ~20–30 hrs/week to start (flexible), with potential to extend based on fit. Async-friendly. A few hours of overlap with US Eastern time for check-ins is helpful, but the work is largely independent. Rate: $45–70/hr depending on experience.

What we're looking for

You've measured the quality of an AI, NLP, search, classification, or information-extraction system before, and you know a single accuracy number is rarely the whole truth. You can dig into distributions and underlying errors, reason about sampling and statistical confidence, and run before-and-after comparisons that show whether quality actually improved. You can read documentation, inspect data, ask good questions, and build an accurate mental model of a complex workflow without needing every step explained. And you communicate exceptionally well, turning messy data into a crisp, prioritized recommendation the team can act on.

Core skills (must-have)

  • Designing evaluation metrics and methods from scratch
  • Building, sampling, or validating ground-truth datasets
  • Reasoning about sampling strategy, statistical confidence, and measurement quality
  • Inspecting errors and explaining what's driving them
  • Experience evaluating AI, NLP, search, classification, or extraction systems
  • Strong SQL and Python for analysis (pandas, numpy, visualization, notebooks)
  • Comfort with semi-structured data (JSON or similar)
  • Strong analytical writing

Nice to have

  • Experience with document AI, OCR, or information extraction
  • Enough AI/NLP literacy to reason about why an extraction or classification system behaves the way it does

To apply

In your proposal, briefly describe one time you measured whether an ML/NLP system's quality actually improved: what metric you used, how you sampled, and what you concluded. Proposals that skip this will not be considered.

  • More than 30 hrs/week
    Hourly
  • 3-6 months
    Duration
  • Expert
    Experience Level
  • $45.00 - $70.00
    Hourly
  • Remote Job
  • Ongoing project
    Project Type

Contract-to-hire opportunity

This lets talent know that this job could become full time.
Skills and Expertise
Mandatory skills
Data Science
Python
Data Analysis
Activity on this job
  • Proposals: 50+
  • Last viewed by client: last week
  • Interviewing: 2
  • Invites sent: 1
  • Unanswered invites: 1
About the client
Member since Jan 6, 2015
  • United States
    Burlingame, 10:34 AM
  • $9.9K total spent
    17 hires, 3 active
  • 305 hours
  • Tech & IT
    Small company (2-9 people)
