Solution Architect / Scaling Lead — Technical Requirements
Worldwide
AnavClouds Software Analytics · real-time voice AI, already in production

1. Scaling the live voice pipeline (the core)
- Scale concurrent calls by growing droplet pools behind a WebSocket-aware load balancer (the ~15-calls-per-4-vCPU-droplet unit)
- Per-tenant concurrency caps in Redis, statistical-multiplexing/overbooking, and N+1 headroom
- Capacity planning (concurrent calls → compute), autoscale triggers (~70% utilization), and load testing (k6 / Locust)
- Audio-path & latency-budget tuning (sub-second first byte)

2. Production reliability & provider resilience
- Tune real-time timeout budgets (e.g., LLM read-timeout ~10–12 s) so a slow provider fails fast instead of dead-air, with circuit breakers, API-key rotation, cross-provider failover (already built; operate & tune)
- Manage provider rate limits / concurrent-stream caps per key
- Zero-downtime deploys, graceful shutdown (don't kill droplets with live calls)
- Observability: OpenTelemetry per-call tracing, metrics, deep health checks, alerting

3. Operating the service stack
- DigitalOcean: droplets, Load Balancers, managed databases, Spaces (scale & operate the fleet)
- Vercel: operate the Next.js frontend (auto-scales; watch bandwidth/function cost)
- MongoDB (indexing org_id+deleted_at, replica sets, region pinning)
- Redis (shared session/concurrency/rate-limit store; Memurai in dev)
- Qdrant (vector scaling, reranker caching)
- AWS S3
- FastAPI async tuning, httpx connection pooling

4. Multi-region & data residency (operate across regions)
- Region-aware droplet pools (DO TOR/BLR/FRA/AMS), per-region data pinning, geo-routing of calls
- Residency: India DPDP on-soil, EU GDPR; region-local provider selection (Sarvam in India)
- Extend to AWS / Azure where DigitalOcean can't meet a bar (e.g., HIPAA BAA)

5. Capacity, cost & rate limiting
- Per-org rate limits + cost guards (Redis token buckets)
- Cost optimization across the provider mix and droplet fleet; concurrency-based cost modeling (~$3.20/line)
- Pre-scale for campaigns/bursts; reserved concurrency for enterprise tenants

6. Telephony at scale
- Twilio / Plivo: concurrent-channel & per-region number provisioning; outbound pacing/queueing

7. Security & compliance (operate & harden)
- Multi-tenant isolation, RBAC, secrets & encryption (Fernet, JWT/python-jose, Google OAuth/Authlib)
- Production compliance controls: SOC 2, HIPAA, GDPR, India DPDP, PCI-DSS

8. Workflows & integrations (extend, not rebuild)
- Operate & extend the existing workflow engine (nodes, scheduler, triggers) and third-party integrations (CRM, webhooks, iPaaS); scale the campaign/scheduler workers

9. Core stack familiarity
- Python 3.12 · FastAPI · httpx · Pydantic v2
- The AI provider roster: LLM (OpenAI, Anthropic, Google, Groq, Azure OpenAI), STT (Deepgram, AssemblyAI, Sarvam, Speechmatics), TTS (Cartesia, ElevenLabs, Google, Sarvam, OpenAI)
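For candidates who want a feel for the capacity-planning arithmetic the requirements mention (concurrent calls → compute), here is a minimal Python sketch. It uses only the figures quoted in the posting: roughly 15 calls per 4-vCPU droplet, an autoscale trigger near 70% utilization, and N+1 headroom. The function names and the way the figures are combined are assumptions made for illustration, not a description of the team's actual tooling.

```python
import math

# Figures taken from the posting; everything else here is a hypothetical sketch.
CALLS_PER_DROPLET = 15      # ~15 concurrent calls per 4-vCPU droplet
TARGET_UTILIZATION = 0.70   # autoscale trigger at ~70% utilization
SPARE_DROPLETS = 1          # N+1 headroom


def droplets_needed(peak_concurrent_calls: int,
                    calls_per_droplet: int = CALLS_PER_DROPLET,
                    target_utilization: float = TARGET_UTILIZATION,
                    spare: int = SPARE_DROPLETS) -> int:
    """Droplets required to keep peak load at or below the target utilization, plus spares."""
    if peak_concurrent_calls < 0:
        raise ValueError("peak_concurrent_calls must be non-negative")
    usable_calls_per_droplet = calls_per_droplet * target_utilization
    base = math.ceil(peak_concurrent_calls / usable_calls_per_droplet)
    return base + spare


def fleet_utilization(active_calls: int, droplets: int,
                      calls_per_droplet: int = CALLS_PER_DROPLET) -> float:
    """Fraction of the fleet's nominal call capacity currently in use."""
    if droplets <= 0:
        raise ValueError("droplets must be positive")
    return active_calls / (droplets * calls_per_droplet)


def should_scale_out(active_calls: int, droplets: int,
                     trigger: float = TARGET_UTILIZATION) -> bool:
    """True once utilization reaches the autoscale trigger."""
    return fleet_utilization(active_calls, droplets) >= trigger


if __name__ == "__main__":
    peak = 100
    print(f"{peak} concurrent calls -> {droplets_needed(peak)} droplets")
    print("scale out at 120 calls on 10 droplets:", should_scale_out(120, 10))
```

Sizing against the utilization target rather than the nominal 15 calls per droplet leaves room for bursts before the autoscaler reacts; the extra spare droplet covers the loss of a single node.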
- Hourly: More than 30 hrs/week
- Duration: 3-6 months
- Experience Level: Expert
- Hourly rate: $40.00 - $70.00
- Remote Job
- Project Type: Ongoing project
Activity on this job
- Proposals: 20 to 50
- Last viewed by client: 2 weeks ago
- Interviewing: 7
- Invites sent: 0
- Unanswered invites: 0
About the client
- India, Lucknow, 11:05 PM
- $36K total spent; 114 hires, 23 active
- 1,299 hours
- Mid-sized company (10-99 people)