Production pipeline reliability
Worldwide
We run an AI document-processing system in production. It ingests documents, runs a structured-extraction pipeline, and from the extracted data it generates articles and dashboards. The pipelines are orchestrated on a container platform. The system is split across multiple repositories, microservices style, so the stalls can cross service boundaries.
The problem: these pipelines can stall in production. Jobs stop making progress and the flow halts until someone intervenes manually. We need them to run unattended and reliably.
Scope, and only this: make the structured extraction, article generation, and dashboard generation pipelines never get stuck in production. Find why they stall, fix it, and prove it holds. Nothing else is in scope. We will share what we have observed about the stalls once you start.
Doing this properly is not just surface patches. It may require strengthening the business logic and the orchestration behind these pipelines. We expect real fixes to root causes, not workarounds that hide the symptom.
How the project works:
- Two milestones, published on Upwork.
- Milestone 1, setup and framework understanding: get set up across the multiple repositories, get access, and demonstrate a working understanding of the architecture and the orchestration.
- Milestone 2, the actual work: make the pipelines stall-free in production. Stability will be evaluated by multiple pipeline runs across the following week. This milestone is passed only if no pipeline stalls across those runs. If any run still stalls, it is not met, and the fix continues until it holds.
- 5 days maximum.
- We require daily updates on Slack, with short check-in calls on Google Meet.
If this milestone is passed successfully, we may continue with a longer term commitment.
You should be strong in production debugging of orchestrated data pipelines, backend services, and the infrastructure they run on, and comfortable proving reliability rather than just patching symptoms.
No agencies and no project managers. We want the engineer who will do the work, directly. The selected developer signs an NDA before commencing work. To apply, send us a short video where you address the questions and show examples of your previous work, in particular work related to what we need here. Applications written with AI will be rejected automatically. If we are happy with your application, there is a quick follow-up interview.
- $2,500.00, Fixed-price
- Experience Level: Expert
- Remote Job
- Project Type: One-time project
Activity on this job
- Proposals: 10 to 15
- Last viewed by client: 2 hours ago
- Interviewing: 8
- Invites sent: 8
- Unanswered invites: 4
About the client
- United Arab Emirates, Dubai (7:41 PM)
- $36K total spent; 30 hires, 3 active
- 149 hours