QA Engineer — Break the AI Operating System Before Our Clients Do
Worldwide
About This Role

We're a transformation design studio building CXROS, an AI-powered operating system we run internally and install for clients. It's built with Claude, MCP (Model Context Protocol), and direct API integrations. No Zapier, no n8n, no Make.

Our engineers are shipping fast. We need a dedicated QA person sitting alongside them: someone who reads the code well enough to know what should happen, then proves whether it actually does. You're not clicking buttons on a finished product. You're testing automations, MCP integrations, and AI agents that have to run unattended without breaking.

This is QA for people who can debug. If you can only report "it doesn't work," this isn't the role. We need "it fails here, on this input, because of this, and here's how to reproduce it."

What You're Testing

You'll be the quality layer under the Outcomes Machine, the operational backbone of CXROS:

- Transcript Intelligence: transcripts in, AI summaries out, filed to Notion. Does it file to the right place every time?
- Communications Intelligence: a Missive bridge unifying email, Slack, and Teams with AI-drafted replies. Does sync hold under real volume? Do drafts ever auto-send when they shouldn't?
- Routing Agent: AI content auto-filed by type, entity, and project. Where does it misroute, and why?
- Control Tower: operational health dashboard. Are the signals accurate or stale?
- PM Infrastructure: Notion databases for projects, tasks, time, and meetings.

Detailed specs exist for all of it. Your job is to test against the spec, find where reality diverges, and hand the engineer a reproducible case.

What We're Looking For

Must Have
- Reads Python or Node well enough to understand what's being tested
- Systematic test design
- Reproducible bug reports (steps, input, expected vs. actual)
- Debugging: isolates root cause, not just symptoms
- API testing (Postman, curl, or similar)

Strong Signal
- Has tested AI/LLM or other non-deterministic output
- Integration testing across multiple APIs
- MCP or Claude familiarity
- Regression discipline
- Self-directed

Valuable
- Reading logs and traces to localize a failure
- OAuth / token-refresh edge cases
- Clear written test docs
- Pushes back when a spec is ambiguous

Nice to Have
- Light scripting to automate repetitive test runs
- Webhook / scheduled-job testing
- CI familiarity

The must-haves are a hard filter. We'd rather have a sharp debugger who's new to MCP than a checklist tester who's seen it.

How to Apply

Answer these (concise answers only):

- Walk us through a bug you found that others had missed. How did you isolate the root cause?
- You're testing an AI automation that drafts email replies, and the output is different on every run. How do you decide whether a result is a pass or a fail?
- Show us how you'd write a bug report a developer can act on without asking you a single follow-up question. Use any real example.

Project Details

- Rate: $10–$13/hr
- Hours: Part-time, minimum 15 hrs/week, ongoing
- Working with: Directly alongside the CXROS build engineers, under our Technical Lead pair.
- Location: Latin America and Pakistan preferred. Strong applicants in compatible timezones are welcome.
- Growth: More scope as more CXROS modules ship.

About Us

codeswitcher builds CXROS for ourselves and installs it for clients. Claude Code is our primary development environment. We write specs, build to them, and ship things that run without someone watching. Not demos, operations. QA is how we keep it that way.
- Hourly: Less than 30 hrs/week
- Duration: 1-3 months
- Experience Level: Intermediate
- $10.00 - $13.00 Hourly
- Remote Job
- Project Type: One-time project
Activity on this job
- Proposals: 20 to 50
- Last viewed by client: 2 weeks ago
- Interviewing: 0
- Invites sent: 0
- Unanswered invites: 0
About the client
- United States, Santa Monica, 3:14 AM
- $128K total spent; 79 hires, 33 active
- 7,888 hours