Fintech Data Scientist for a new Lending Startup

Posted 4 weeks ago

Worldwide

Summary

APPLIED DATA SCIENTIST - PRODUCTION ML FOR SMB LENDING LEAD SCORING

Long-term engagement, global remote, async-first. We're looking for a data scientist to own a production intent model end to end.

THE WORK

We score small-business loan leads for vendor partners who need to know which leads to call first. Our model predicts whether a lead will submit an application for financing. That prediction is turned into a priority bucket vendors call against, so model quality maps directly to revenue per dial. The production model is a multi-tier union-trained architecture with a rule-based gate that demotes leads that look closed. You'd report to the CTO and own the model end to end - features, training, evaluation, calibration, version handoff, and the contract between training and serving.

CONCRETE PROBLEMS ON DAY ONE

- Pooled AUC plateau - pooled held-out AUC is roughly flat across recent versions. Top-decile capture beats the prior baseline at the very top of the ranking, but pooled AUC has plateaued. Lifting pooled AUC is the headline goal.
- Source-class shortcut - structural properties across training sources let any model inflate pooled AUC by learning source identity. Mitigation today is a source-classifier diagnostic plus per-source AUC reporting. The architectural fix is open.
- Calibration gap - models rank correctly, but predicted probabilities are off from real-world conversion rates by an order of magnitude. Calibrated probabilities unlock downstream product features.
- Recoverable enrichment misses - a meaningful share of leads now have match keys that weren't available at original ingest, and re-running idempotent enrichers would recover real coverage.
- Form-only tier for unmatched leads - a large share of incoming leads get no useful score because they fail enrichment matching. Coarse firmographic features are available for every raw lead.

OUR STACK

- Snowflake with dbt for the data layer, medallion architecture
- Python end to end - scikit-learn, XGBoost, LightGBM, pandas, numpy. SMOTE and class-weight handlers for imbalance.
- Jupyter notebooks for training, production runners for scoring and export
- Dagster for orchestration (in build), AWS S3 for inputs and exports

EXPERIENCE REQUIREMENTS

- 4-7 years building production ML systems, including binary classification on imbalanced labels and at least one project where you owned the train/score handoff
- Strong intuition for adversarial validation, source shift, and label leakage. If you've shipped a model where pooled AUC and per-source AUC told different stories and you knew which one was real, we want to talk.
- SQL and dbt comfort - a meaningful share of feature engineering happens in the dbt layer, not in pandas
- Production calibration experience (Platt, isotonic, or temperature scaling), including knowing when each fails
- Tree-based modeling depth - XGBoost, LightGBM, RandomForest are our workhorses
- Independence in a small team - you'll be the only dedicated DS for now
- Plain writing - our docs are direct, our PRs are short

NICE TO HAVE

- Small-business lending, factoring, or commercial-filing data background
- Point-in-time-correct feature engineering experience
- Shipped a model where calibrated probabilities were load-bearing (pricing, EV, expected-conversion sizing)
- Comfort lightly modifying Python orchestration code (Dagster, Airflow, Prefect)
- Owned a model that delivered priority buckets rather than raw scores to an external party

LOGISTICS

- Global remote, async-first. A few hours of US Eastern overlap required.
- Long-term, ongoing engagement. Share your hourly range in your application.
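
For readers unfamiliar with the source-class shortcut mentioned above, the following is a minimal sketch on synthetic data, not the production pipeline. The source names, base rates, and the `auc` helper are all illustrative assumptions. It shows how a score that encodes nothing but source identity can still post a pooled AUC well above 0.5 while per-source AUC stays at chance.

```python
import random


def auc(labels, scores):
    """AUC via the Mann-Whitney U statistic, using average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def make_source(name, n, base_rate, rng):
    """Hypothetical lead source: labels drawn at a source-specific base rate."""
    return [(name, 1 if rng.random() < base_rate else 0) for _ in range(n)]


rng = random.Random(0)

# Two made-up sources with very different conversion rates (assumed values).
base_rates = {"source_a": 0.30, "source_b": 0.03}
leads = []
for name, rate in base_rates.items():
    leads += make_source(name, 2000, rate, rng)

# A "model" that knows only source identity: every lead is scored with
# its source's base rate, so there is no within-source signal at all.
labels = [y for _, y in leads]
scores = [base_rates[s] for s, _ in leads]

pooled_auc = auc(labels, scores)

per_source_auc = {}
for name in base_rates:
    ys = [y for s, y in leads if s == name]
    ss = [base_rates[name]] * len(ys)
    per_source_auc[name] = auc(ys, ss)

print(f"pooled AUC: {pooled_auc:.3f}")
for name, value in per_source_auc.items():
    print(f"{name} AUC: {value:.3f}")
```

In this toy setup the pooled number looks like real lift, but each per-source AUC is exactly 0.5 because scores are constant within a source. That gap is the reason to read per-source AUC alongside the pooled figure.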

Hourly: Less than 30 hrs/week
Duration: 1-3 months
Experience Level: Intermediate
Hourly: $20.00 - $40.00
Remote Job
Project Type: Ongoing project
Contract-to-hire: This job has the potential to turn into a full-time role

Skills and Expertise

Mandatory skills

Data Science

Machine Learning

Nice-to-have skills

Data Analysis

Activity on this job

Proposals: 20 to 50
Last viewed by client: 6 days ago
Hires: 1
Interviewing: 8
Invites sent: 4
Unanswered invites: 2

About the client

Member since Jun 4, 2026

USA
Miami 9:55 AM
1 hire, 0 active
