Python AI Engineer for LLM Deployment

Posted yesterday

Worldwide

Summary

We're not just another AI shop – we're a people-first business on a mission to solve real-world problems, from removing CO₂ from the atmosphere to building tools that actually matter. And we want you to grow with us – not for 7 days, but for years. This gig starts as a part-time 7-10 day project, but if you're the right fit, we're keeping you long-term. We need a Python AI Engineer who wants to build something epic, have fun, and be part of a team that actually cares. Still with us? Good. Here's what we're building:

---

PYTHON AI ENGINEER – PART-TIME (7–10 DAY PROJECT)
Long-term role for the right person.

We are deploying a full AI ecosystem with LLMs, memory systems, search, video generation, and creative tools. We need a Python AI Engineer to deploy the AI core on production infrastructure. This is a PRODUCTION deployment role – NOT research.

---

WHAT YOU'LL DO:

· Deploy vLLM inference servers on RunPod GPU pods (Llama 3.3 70B, Qwen 2.5 Coder 32B, DeepSeek R1 32B, BGE-large)
· Set up OpenAI-compatible API endpoints, VRAM management, Flash Attention, health checks
· Build Memory Service (FastAPI) – embeddings with BGE, vector storage with Qdrant, similarity search
· Build Brain Gateway (FastAPI) – router for LLM requests, orchestration, context retrieval, session management
· Build Scraping Worker – web scraping, text chunking, async processing with NATS
· Set up RAG pipeline with Qdrant + Typesense (hybrid search)
· Deploy FLUX / Playground inference + Whisper.cpp for subtitles + Video Engine API
· Containerise all services with Docker, integrate with Kong API Gateway
· Set up logging (Loki), metrics (Prometheus), tracing (Tempo)
· Write API docs (Swagger) and deployment runbook
· Train the Full Stack team on using the AI APIs

---

REQUIREMENTS (Must Have):

· Python (5+ years professional)
· FastAPI or Flask (expert)
· vLLM or TensorRT-LLM (production deployment)
· RunPod, AWS SageMaker, or similar GPU cloud
· Docker & Docker Compose (advanced)
· Vector DBs (Qdrant, Pinecone, or Weaviate)
· PostgreSQL with pgvector
· Redis (caching + queue)
· Git & CI/CD
· Linux admin (Ubuntu, shell scripting)

Nice to have: Llama/Qwen/DeepSeek experience, AWQ/GPTQ quantisation, NATS, Kong, Vault, Prometheus/Grafana/Loki/Tempo, MinIO/S3, Whisper, FLUX/Stable Diffusion.

---

HOW TO APPLY:

Send your application to
Include:
· CV/Resume highlighting Python AI production experience
· GitHub/Portfolio with AI deployment projects
· Brief intro video

Subject Line: "Python AI Engineer - [Your Name]"

Less than 30 hrs/week
Hourly
3-6 months
Duration
Intermediate
Experience Level
$50.00 - $80.00
Hourly
Remote Job
Ongoing project
Project Type

Skills and Expertise

Mandatory skills

RESTful API

Python

Artificial Intelligence

Activity on this job

Proposals: 50+
Last viewed by client: 2 hours ago
Interviewing: 12
Invites sent: 30
Unanswered invites: 26

About the client

Member since Jun 16, 2026

United States
Houston 6:04 PM
Sales & Marketing
Individual client
