Senior Data Scientist

Posted 3 weeks ago

Worldwide

Summary

You will own the data enrichment strategy for a massive archive of world-class journalism. Your mission is to take 25 years of historical content and "hydrate" it: cleaning and tagging it with metadata so it can power next-gen AI products and search tools. You’ll act as a bridge between business leaders and engineering teams, turning complex editorial goals into smart, scalable data pipelines.

Most Important

  • Senior NLP & ML Experience: 5+ years of experience processing large-scale, unstructured text datasets.
  • Technical Stack: Advanced proficiency in Python (Pandas, PySpark) and building production-ready ETL pipelines.
  • NLP Frameworks: Hands-on experience with spaCy, Hugging Face, or Transformers for entity recognition and categorization.
  • Search Knowledge: Familiarity with OpenSearch or Elasticsearch, specifically regarding vector embeddings and index mapping.
  • Taxonomy Design: Ability to design metadata structures that capture the value of diverse content.
  • Strategy & Consultation: Experience leading technical discovery sessions and translating business needs into technical requirements.

Nice to Have

  • Legacy Data Handling: Experience working with messy, historical HTML and "dirty" data archives.
  • Efficiency Focus: Knowledge of using open-source LLMs to process data in a cost-effective way.
  • Modern Search: Exposure to hybrid search (lexical + vector) and graph-based retrieval.

Personal Traits

  • The Translator: You can explain complex AI concepts to non-technical people without losing them.
  • The Diplomat: You are great at mediating between different teams with competing priorities.
  • Pragmatic Thinker: You focus on results and ROI, knowing when a "good enough" model is better than a perfect one that’s too expensive.
  • Curious Investigator: You enjoy digging into decades of data to find patterns and solve "messy" problems.
  • Team Player: You enjoy working closely with backend and search engineers to ensure your data actually works in the final product.

  • More than 30 hrs/week
    Hourly
  • 6+ months
    Duration
  • Intermediate
    Experience Level
  • $15.00 - $40.00
    Hourly
  • Remote Job
  • Ongoing project
    Project Type

Contract-to-hire opportunity

Skills and Expertise
Mandatory skills
Data Science
Python
Machine Learning
Activity on this job
  • Proposals: 50+
  • Last viewed by client: 2 weeks ago
  • Interviewing: 22
  • Invites sent: 0
  • Unanswered invites: 0
About the client
Member since Sep 21, 2021
  • Germany
    Berlin, 1:30 PM
  • $105K total spent
    35 hires, 3 active
  • 4,825 hours
  • Engineering & Architecture
    Mid-sized company (10-99 people)
