Production Hardening & Reliability Engineer

Posted last month

Worldwide

Summary

Production Hardening & Reliability Engineer (n8n, Twilio, PostgreSQL)

Overview: I'm a first-time founder building a multi-tenant SMS automation platform for home service businesses. I built it with help from AI tools, and it's in advanced MVP testing. Core workflows function, but I'm not a professional engineer and I fully expect design flaws, missing safeguards, undiscovered edge cases, and reliability issues I don't yet know how to recognize.

I'm looking for an experienced engineer to treat this as a production-readiness project. I do not need a new MVP built. I need someone to evaluate what exists, tell me what's solid, what's fragile, and what's dangerous, and help make it reliable enough for paying customers. Assume the system was built by a motivated founder, not a dev team. Some parts are likely fine, others are held together with duct tape, and part of the job is determining which is which.

Technology Stack: n8n (self-hosted), PostgreSQL, Twilio, Docker, Google Calendar, Google Sheets, Cloudflare Tunnel, Ollama (local LLMs), Mac Studio M1 Ultra.

What Exists Today:
  • Multi-tenant: separate Twilio numbers per tenant, tenant-scoped data, per-tenant scheduling rules
  • Lead management: CSV import, webhook intake, conversation history, activity logging
  • SMS automation: inbound handling, outbound follow-up campaigns, STOP/DNC handling, human escalation, AI-assisted conversations, appointment booking
  • Scheduling: Google Calendar availability checks, booking, confirmations
  • Reporting: weekly/monthly reports, Google Sheets "Needs Assistance" queue, email notifications
  • AI: local Ollama for intent classification, routing, escalation decisions, and scheduling assistance

What Success Looks Like: I'm not hiring for a dozen new features. Success means finding flaws before customers do, improving reliability, eliminating duplicate actions, hardening workflow safety and tenant isolation, improving observability, and helping me understand where the architecture is strong versus weak. At the end I want significantly more confidence in the platform than I have today.

Areas Likely Needing Review: SMS and booking workflows, scheduler logic, multi-tenant controls, Calendar and Sheets integration, AI routing and guardrails, lead state management, database schema, logging, and reporting. Likely problem types include race conditions, state management issues, duplicate-processing risks, inconsistent AI guardrails, scheduling edge cases, and immature monitoring. I'm specifically looking for someone who enjoys finding and solving these.

How You'll Work: The system runs on a Mac Studio I control. Remote access can be arranged for the engagement. The database currently contains only my own test data, no real customer information, so there are no privacy concerns with the working data. I keep current backups of both the database and all workflow exports, so any mistake is a quick restore rather than a problem.

Engagement Structure: I want to start with a paid audit and production-readiness assessment before committing to larger implementation work. The audit should deliver a written report of what's solid, what's risky, and what's dangerous, plus a prioritized roadmap. Based on those findings, we'll scope the implementation work as a second milestone. Fixed-price milestones preferred. I'm open to your input on structure, but the audit-first approach is how I'd like to begin.

Optional Phase 2, Billing Automation (only if recommended after audit): Separate from hardening, I may later want Stripe-based tenant billing states (Active, Past Due, Paused) with grace periods and automatic service restore on payment. This is not part of the initial engagement. Mention relevant experience if you have it, but the first priority is making the existing platform reliable.

Ownership & Confidentiality: This is a work-for-hire engagement. All deliverables and IP created for this project are assigned to me upon payment, consistent with Upwork's standard contract terms. I'll also ask the selected contractor to sign a short mutual NDA. Confidentiality is expected during and after the engagement. All credentials and access must be returnable or revocable on request.

Experience Required: Apply only if you have meaningful, demonstrable experience with n8n, Twilio, PostgreSQL, Docker, API integrations, and production SaaS or workflow automation systems. Not a fit if your background is primarily prompt engineering, custom GPTs, chatbot setup, AI wrappers, or marketing automation. I care about reliability, architecture, and testing, not AI buzzwords. If your instinct is to rebuild everything from scratch, this isn't the right fit. If your instinct is to understand the system, find the risks, and harden what works, let's talk.

  • Hours to be determined
    Hourly
  • 1-3 months
    Duration
  • Expert
    Experience Level
  • Remote Job
  • Ongoing project
    Project Type
Skills and Expertise
Mandatory skills
API Integration
n8n
Ollama
Activity on this job
  • Proposals: 15 to 20
  • Last viewed by client: last week
  • Hires:
    1
  • Interviewing:
    1
  • Invites sent:
    0
  • Unanswered invites:
    0
About the client
Member since May 21, 2026
  • USA
    Cheltenham, 3:26 AM
  • 1 hire, 1 active
  • Tech & IT
    Individual client