Data Scientist / ML Engineer — Sports Data Pipelines, Snowflake & Feature Engineering

Posted yesterday

Worldwide

Summary

We are looking for a strong Data Scientist, ML Engineer, or Data Systems Engineer to help with an applied sports data modeling and backtesting project. The primary focus is NBA live-game data, but this is not a sports analyst role. We are looking for someone who is excellent with data acquisition, data pipelines, Snowflake, Python, SQL, feature engineering, machine learning workflows, and model-ready research datasets.

We have a structured data environment combining historical pricing data, NBA play-by-play data, and game-context data. The goal is to improve the flow of raw data into clean, usable research tables, build high-quality features, support backtesting, and help evaluate which variables contain predictive value. This role maps directly to the need for stronger data reliability, research workflow, execution monitoring, and production-grade systems described in the Kymelion roadmap.

Sports data experience is helpful, especially NBA or European football / soccer, but the most important requirement is being a strong technical data scientist who can work with messy real-world data and turn it into reliable modeling inputs.

What You’ll Work On

  • Acquire, clean, and structure new data sources
  • Improve data flow into Snowflake and Python research workflows
  • Write SQL queries and build reusable research datasets
  • Work with play-by-play, game-context, pricing, and market-style data
  • Build model-ready features from messy raw data
  • Create player, lineup, rotation, fatigue, foul trouble, injury, rest, and game-state features
  • Build Python notebooks for exploratory analysis and modeling
  • Support historical simulations and backtests
  • Help evaluate calibration, log loss, Brier score, ROI, drawdown, or other relevant metrics
  • Improve repeatability of research workflows
  • Summarize findings clearly and practically

Required Skills

  • Python
  • SQL
  • Snowflake or similar cloud data warehouses
  • Data acquisition / ingestion
  • API data ingestion
  • Data cleaning and transformation
  • ETL / ELT workflows
  • Pandas, NumPy, SciPy, scikit-learn
  • Feature engineering
  • Machine learning workflows
  • Predictive modeling
  • Model validation and backtesting
  • Working with messy real-world datasets

Nice to Have

  • Experience building repeatable research datasets
  • Experience with live or near-real-time data workflows
  • NBA play-by-play or possession-level data
  • European football / soccer match-event data
  • Odds, pricing, or market-style data
  • Time-series data
  • XGBoost, PyTorch, TensorFlow, or similar tools
  • Airflow, dbt, Dagster, Prefect, or similar tools
  • Experience moving raw data into model-ready features
  • Experience with trading, forecasting, or market-style research workflows

Deliverables

Initial deliverables may include:

  • Clean data acquisition / ingestion workflows
  • Snowflake queries and structured research tables
  • Python notebooks
  • Feature engineering work
  • ML-ready datasets
  • Backtesting support
  • Signal validation analysis
  • Written summaries of findings
  • Recommendations for improving data quality, workflow reliability, and research speed

Project Structure

We are open to starting with a smaller paid project or trial assignment, then expanding if there is a strong fit. This could become an ongoing part-time or contract-to-hire role.

To Apply

Please include:

  • A short overview of your data science / ML / data engineering background
  • Relevant work in data acquisition, data pipelines, machine learning, forecasting, or backtesting
  • Links to GitHub, Kaggle, notebooks, dashboards, models, pipelines, or prior projects
  • Any experience with Python, SQL, Snowflake, APIs, time-series data, or messy real-world datasets
  • A brief note on how you would approach turning raw live-game data into useful modeling features

  • More than 30 hrs/week
    Hourly
  • 6+ months
    Duration
  • Intermediate
    Experience Level
  • $20.00 - $65.00
    Hourly
  • Remote Job
  • Ongoing project
    Project Type

Contract-to-hire opportunity

Skills and Expertise
Mandatory skills
Data Engineering
Data Integration
Activity on this job
  • Proposals: 50+
  • Last viewed by client: yesterday
  • Interviewing: 0
  • Invites sent: 0
  • Unanswered invites: 0
About the client
Member since Apr 26, 2018
  • CAN
    Montreal
  • $86K total spent
    6 hires, 6 active
  • 1,911 hours
  • Sales & Marketing
    Individual client
