Off-Robot Inference Engineer
Worldwide
The Role

Dubai-based robotics company deploying a robot fleet into commercial super-user sites starting this summer. We run a tiered inference architecture. Low-latency work (motor control, SLAM, obstacle avoidance) runs on-robot. The heavy layer (VLA, LLM, speech, multi-camera reasoning) runs on a near-edge NVIDIA DGX Spark we're standing up now.

This hire supports the off-robot side of that split. You will build the inference stack, benchmark the model layer on real hardware, and put structured numbers on the table so the team stops estimating and starts measuring.

What You'll Actually Do

- Stand up DGX Spark as a production inference server. Ubuntu on Grace Blackwell, CUDA, TensorRT-LLM or Triton. Multi-model serving, KV-cache budgeting, health checks, metrics.
- Benchmark the Phase 1 model stack against real load. UnifoLM 7.5B, GR00T N1.5 3B, Qwen 2.5 3B, Whisper large-v3, YOLO26m-seg. Precision sweeps (FP16, INT8, FP4). Tokens per second, latency distribution at p50/p95/p99, memory headroom under concurrent load. Numbers end with units.
- Build the routing layer between the robots and the near-edge box. Decide what runs where, handle cloud fallback for the full 7B UnifoLM when connectivity is there, structure the telemetry we capture off every deployment. This scaffolding will be owned and extended by the core team; you build it clean and documented.
- Validate Isaac Sim on ARM. PhysX GPU is broken on GB10 Blackwell. Newton physics is the documented workaround. Get sim-to-real working for our simplest behavior before the humanoid arrives.
- Produce technical briefs when the team needs a call made. Gemini Robotics-ER 1.6 as the fleet supervisor layer. OpenMind OM1 as tool or threat. Qwen 2.5 vs Llama 3.2 for Arabic deployment. Briefs end in measurements and a recommendation, not summaries.

What We Need

- Strong Python. C++ for ROS 2 nodes and latency-sensitive paths.
- Linux, CUDA, Docker, ROS 2.
- Production model-serving experience. vLLM, TensorRT-LLM, Triton, or equivalent. Not a tutorial. You've served multiple models concurrently, debugged OOM under load, owned a p99 latency target.
- Benchmarking discipline. You measure, you don't estimate. If the table has LOW or MED confidence on a line item with published weights, you treat that as a bug to fix.
- Writing that respects the reader's time. No filler.
- You notice CC BY-NC-SA 4.0 and flag it before it becomes a commercial problem.

Nice to Have

- Isaac Sim hands-on (x86 is fine; ARM quirks are documented).
- NVIDIA Jetson hands-on (Orin NX, AGX Thor).
- Unitree SDK (Go2, G1).
- Published or open-source work in VLA, model serving, or robotics infrastructure.

Do Not Apply If You Want

- A firmware role.
- A full-stack web role.
- A PhD research seat where you pick your own problem.

Commitment

10-15 hours per week. Fully remote, any timezone. Path to expanded scope, higher rate, etc.

First Deliverable

Before hardware lands, produce a benchmark plan and harness for the Phase 1 model stack on DGX Spark. Precision sweeps, load profiles, memory accounting. One page of methodology, one page of expected numbers based on published benchmarks, one page of what we'll actually measure when the box boots. This is the filter.
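
For illustration only, here is a minimal sketch of the summary a benchmark harness might report: p50/p95/p99 latency and tokens per second, with units in every field name. It assumes per-request latencies and token counts have already been collected. The function names, field names, and nearest-rank percentile method are assumptions, not part of any existing harness, and the sample data is synthetic.

```python
import math
import statistics


def percentile(sorted_values, q):
    """Nearest-rank percentile of an ascending list; q is in (0, 100]."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = math.ceil(q * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]


def summarize(latencies_ms, tokens_generated, wall_time_s):
    """Summarize one benchmark run; every key carries its unit."""
    ordered = sorted(latencies_ms)
    return {
        "requests": len(ordered),
        "p50_ms": percentile(ordered, 50),
        "p95_ms": percentile(ordered, 95),
        "p99_ms": percentile(ordered, 99),
        "mean_ms": statistics.fmean(ordered),
        "tokens_per_s": tokens_generated / wall_time_s,
    }


if __name__ == "__main__":
    # Synthetic placeholder samples, not measurements from any hardware.
    fake_latencies_ms = [float(i) for i in range(1, 101)]
    print(summarize(fake_latencies_ms, tokens_generated=5000, wall_time_s=10.0))
```

Nearest-rank is used here because it always returns an observed sample, which keeps tail numbers honest on small runs. An interpolating method would be an equally valid choice as long as the methodology page names it.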
- Hourly: More than 30 hrs/week
- Duration: 6+ months
- Experience Level: Intermediate
- Hourly rate: $25.00 - $40.00
- Remote Job
- Project Type: Ongoing project
Activity on this job
- Proposals: 10 to 15
- Last viewed by client: last week
- Hires: 1
- Interviewing: 2
- Invites sent: 0
- Unanswered invites: 0
About the client
- Dubai, ARE (5:34 PM local time)
- $7.7K total spent; 2 hires, 2 active
- 245 hours