Fintech Data Scientist for a new Lending Startup

Posted 4 weeks ago

Worldwide

Summary

APPLIED DATA SCIENTIST - PRODUCTION ML FOR SMB LENDING LEAD SCORING

Long-term engagement, global remote, async-first. We're looking for a data scientist to own a production intent model end to end.

THE WORK

We score small-business loan leads for vendor partners who need to know which leads to call first. Our model predicts whether a lead will submit an application for financing. That score is turned into a priority bucket vendors call against, so model quality maps directly to revenue per dial. The production model is a multi-tier union-trained architecture with a rule-based gate that demotes leads that look closed.

You'd report to the CTO and own the model end to end - features, training, evaluation, calibration, version handoff, and the contract between training and serving.

CONCRETE PROBLEMS ON DAY ONE

- Pooled AUC plateau - pooled held-out AUC is roughly flat across recent versions. Top-decile capture beats the prior baseline at the very top of the ranking, but pooled AUC has plateaued. Lifting pooled AUC is the headline goal.
- Source-class shortcut - structural properties across training sources let any model inflate pooled AUC by learning source identity. Mitigation today is a source-classifier diagnostic plus per-source AUC reporting. The architectural fix is open.
- Calibration gap - models rank correctly, but predicted probabilities are off from real-world conversion rates by an order of magnitude. Calibrated probabilities unlock downstream product features.
- Recoverable enrichment misses - a meaningful share of leads now have match keys that weren't available at original ingest, and re-running idempotent enrichers would recover real coverage.
- Form-only tier for unmatched leads - a large share of incoming leads get no useful score because they fail enrichment matching. Coarse firmographic features are available for every raw lead.

OUR STACK

- Snowflake with dbt for the data layer, medallion architecture
- Python end to end - scikit-learn, XGBoost, LightGBM, pandas, numpy. SMOTE and class-weight handlers for imbalance.
- Jupyter notebooks for training, production runners for scoring and export
- Dagster for orchestration (in build), AWS S3 for inputs and exports

EXPERIENCE REQUIREMENTS

- 4-7 years building production ML systems, including binary classification on imbalanced labels and at least one project where you owned the train/score handoff
- Strong intuition for adversarial validation, source shift, and label leakage. If you've shipped a model where pooled AUC and per-source AUC told different stories and you knew which one was real, we want to talk.
- SQL and dbt comfort - a meaningful share of feature engineering happens in the dbt layer, not in pandas
- Production calibration experience (Platt, isotonic, or temperature scaling), including knowing when each fails
- Tree-based modeling depth - XGBoost, LightGBM, RandomForest are our workhorses
- Independence in a small team - you'll be the only dedicated DS for now
- Plain writing - our docs are direct, our PRs are short

NICE TO HAVE

- Small-business lending, factoring, or commercial-filing data background
- Point-in-time-correct feature engineering experience
- Shipped a model where calibrated probabilities were load-bearing (pricing, EV, expected-conversion sizing)
- Comfort lightly modifying Python orchestration code (Dagster, Airflow, Prefect)
- Owned a model that delivered priority buckets rather than raw scores to an external party

LOGISTICS

- Global remote, async-first. A few hours of US Eastern overlap required.
- Long-term, ongoing engagement. Share your hourly range in your application.

  • Less than 30 hrs/week
    Hourly
  • 1-3 months
    Duration
  • Intermediate
    Experience Level
  • $20.00 - $40.00
    Hourly
  • Remote Job
  • Ongoing project
    Project Type
  • Contract-to-hire
    This job has the potential to turn into a full time role
Skills and Expertise
Mandatory skills
Data Science
Machine Learning
Nice-to-have skills
Data Analysis
Activity on this job
  • Proposals: 20 to 50
  • Last viewed by client: 6 days ago
  • Hires: 1
  • Interviewing: 8
  • Invites sent: 4
  • Unanswered invites: 2
About the client
Member since Jun 4, 2026
  • USA
    Miami 9:55 AM
  • 1 hire, 0 active
