ML experts needed to run GPU benchmarks

Posted 2 weeks ago

Worldwide

Summary

We’re looking for an experienced ML engineer or GPU benchmarking specialist to help design and run practical benchmark tests for cloud GPU infrastructure. This project is focused on real-world AI workloads, not just synthetic hardware tests. We want to benchmark GPUs across use cases such as LLM inference, RAG workloads, long-context prompts, batch document processing, fine-tuning, and possibly image/video or TTS workloads. The goal is to produce reliable, reproducible benchmark data that can be used in public-facing technical content, comparison pages, and research-style reports.

What You’ll Help With:
- Design a clear benchmark methodology for cloud GPUs.
- Run LLM inference tests using tools such as vLLM, SGLang, TGI, GenAI-Perf, or similar.
- Measure practical metrics like TTFT, TPOT, P95/P99 latency, throughput, GPU utilization, VRAM usage, cost per 1M tokens, and cost per completed task.
- Help create benchmark workflows for real workloads such as RAG assistants, batch summarization, agent-style tasks, and fine-tuning.
- Capture clean run metadata: GPU model, driver/CUDA version, runtime version, model settings, pricing assumptions, region/provider details, and failed runs.
- Package results into clean CSV/JSON outputs with notes that a non-ML audience can understand.

Ideal Candidate:
You should have hands-on experience with at least some of the following:
- GPU benchmarking for LLMs or AI workloads
- vLLM, SGLang, TGI, TensorRT-LLM, or similar serving frameworks
- NVIDIA GPUs, CUDA, `nvidia-smi`, DCGM, PyTorch
- LLM inference benchmarking, fine-tuning, or RAG evaluation
- Benchmark methodology, reproducibility, and performance analysis
- Cloud GPU platforms or distributed GPU environments

You do not need to write polished marketing content, but you should be able to explain results clearly and help us avoid misleading or unfair benchmark claims.
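As a rough illustration of the latency and cost metrics named above, the sketch below derives TTFT, TPOT, P95/P99, and cost per 1M tokens from per-request timings. Everything in it is an assumption made for illustration: the `RequestTiming` fields and function names are hypothetical, TPOT is taken as decode time divided by the tokens after the first, and the percentile uses the nearest-rank definition. Serving tools may define these slightly differently, so the chosen definitions should be stated in the methodology.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class RequestTiming:
    """Timing for one completed request (hypothetical field names)."""
    start_s: float        # time the request was sent
    first_token_s: float  # time the first output token arrived
    end_s: float          # time the last output token arrived
    output_tokens: int    # number of generated tokens


def ttft(r: RequestTiming) -> float:
    """Time to first token, in seconds."""
    return r.first_token_s - r.start_s


def tpot(r: RequestTiming) -> float:
    """Time per output token after the first (assumed definition)."""
    if r.output_tokens <= 1:
        return 0.0
    return (r.end_s - r.first_token_s) / (r.output_tokens - 1)


def percentile(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile, e.g. pct=95 for P95."""
    if not values:
        raise ValueError("no values to summarize")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def cost_per_million_tokens(gpu_hourly_usd: float, tokens_per_second: float) -> float:
    """Cost in USD to generate 1M tokens at a sustained throughput."""
    if tokens_per_second <= 0:
        raise ValueError("throughput must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_usd / tokens_per_hour * 1_000_000
```

In practice `percentile` would be applied to the list of per-request `ttft` or end-to-end latency values from one run, and the hourly price would come from the recorded pricing assumptions for that provider and region.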
Deliverables:
- Recommended benchmark methodology
- Benchmark scripts or clear runbooks
- Raw benchmark outputs in CSV/JSON
- Summary tables and key findings
- Notes on limitations, anomalies, and reproducibility
- Optional: recommendations for future benchmark tests

This is a hands-on technical project with potential for ongoing work. Please apply with examples of previous GPU, ML, LLM, or infrastructure benchmarking work, and mention which tools you would recommend for this type of project.
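One possible shape for the raw CSV/JSON outputs, sketched here only to make the request concrete. The `RunRecord` schema and its field names are hypothetical, not a required format; the point is that each run carries its own metadata and that failed runs are kept as rows, not dropped.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, fields
from typing import List


@dataclass
class RunRecord:
    """Metadata for one benchmark run (hypothetical schema)."""
    gpu_model: str
    driver_version: str
    cuda_version: str
    runtime: str            # serving framework name and version
    model: str              # model name plus key settings
    provider: str
    region: str
    hourly_price_usd: float
    status: str             # "ok" or "failed"
    notes: str = ""         # plain-language note for non-ML readers


def to_csv(records: List[RunRecord]) -> str:
    """Serialize runs to CSV text with one header row."""
    buf = io.StringIO()
    names = [f.name for f in fields(RunRecord)]
    writer = csv.DictWriter(buf, fieldnames=names, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buf.getvalue()


def to_json(records: List[RunRecord]) -> str:
    """Serialize runs to a JSON array of objects."""
    return json.dumps([asdict(r) for r in records], indent=2)
```

Keeping the CSV and JSON generated from the same records avoids the two outputs drifting apart, and the `notes` column gives a place for the limitations and anomalies mentioned in the deliverables.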

  • Less than 30 hrs/week
    Hourly
  • 1-3 months
    Duration
  • Intermediate
    Experience Level
  • $15.00 - $35.00
    Hourly
  • Remote Job
  • Ongoing project
    Project Type
Skills and Expertise
Mandatory skills
MLOps
Model Testing & Optimization
Activity on this job
  • Proposals: 20 to 50
  • Last viewed by client: 2 weeks ago
  • Interviewing: 18
  • Invites sent: 0
  • Unanswered invites: 0
About the client
Member since Feb 22, 2016
  • Malaysia
    Kuala Lumpur, 7:45 PM
  • $12K total spent
    44 hires, 6 active
  • 429 hours
  • Sales & Marketing
    Individual client
