QA Engineer — Break the AI Operating System Before Our Clients Do

Posted 4 weeks ago

Worldwide

Summary

About This Role

We're a transformation design studio building CXROS, an AI-powered operating system we run internally and install for clients. It's built with Claude, MCP (Model Context Protocol), and direct API integrations. No Zapier, no n8n, no Make.

Our engineers are shipping fast. We need a dedicated QA person sitting alongside them: someone who reads the code well enough to know what should happen, then proves whether it actually does. You're not clicking buttons on a finished product. You're testing automations, MCP integrations, and AI agents that have to run unattended without breaking.

This is QA for people who can debug. If you can only report "it doesn't work," this isn't the role. We need "it fails here, on this input, because of this, and here's how to reproduce it."

What You're Testing

You'll be the quality layer under the Outcomes Machine, the operational backbone of CXROS:

Transcript Intelligence: transcripts in, AI summaries out, filed to Notion. Does it file to the right place every time?
Communications Intelligence: a Missive bridge unifying email, Slack, and Teams with AI-drafted replies. Does sync hold under real volume? Do drafts ever auto-send when they shouldn't?
Routing Agent: AI content auto-filed by type, entity, and project. Where does it misroute, and why?
Control Tower: operational health dashboard. Are the signals accurate or stale?
PM Infrastructure: Notion databases for projects, tasks, time, and meetings.

Detailed specs exist for all of it. Your job is to test against the spec, find where reality diverges, and hand the engineer a reproducible case.

What We're Looking For

Must Have: Reads Python or Node well enough to understand what's being tested · Systematic test design · Reproducible bug reports (steps, input, expected vs. actual) · Debugging that isolates root cause, not just symptoms · API testing (Postman, curl, or similar)
Strong Signal: Has tested AI/LLM or other non-deterministic output · Integration testing across multiple APIs · MCP or Claude familiarity · Regression discipline · Self-directed
Valuable: Reading logs and traces to localize a failure · OAuth / token-refresh edge cases · Clear written test docs · Pushes back when a spec is ambiguous
Nice to Have: Light scripting to automate repetitive test runs · Webhook / scheduled-job testing · CI familiarity

The must-haves are a hard filter. We'd rather have a sharp debugger who's new to MCP than a checklist tester who's seen it.

How to Apply

Answer these, concise answers only:

Walk us through a bug you found that others had missed. How did you isolate the root cause?
You're testing an AI automation that drafts email replies, and the output is different on every run. How do you decide whether a result is a pass or a fail?
Show us how you'd write a bug report a developer can act on without asking you a single follow-up question. Use any real example.

Project Details

Rate: $10–$13/hr
Hours: Part-time, minimum 15 hrs/week, ongoing
Working with: Directly alongside the CXROS build engineers, under our Technical Lead pair.
Location: Latin America and Pakistan preferred. Strong applicants in compatible timezones are welcome.
Growth: More scope as more CXROS modules ship.

About Us

codeswitcher builds CXROS for ourselves and installs it for clients. Claude Code is our primary development environment. We write specs, build to them, and ship things that run without someone watching. Not demos: operations. QA is how we keep it that way.

Hourly: Less than 30 hrs/week
Duration: 1-3 months
Experience Level: Intermediate
Hourly: $10.00 - $13.00
Remote Job
Project Type: One-time project

Skills and Expertise

Mandatory skills

Annotated Screenshot

QA Engineering

Activity on this job

Proposals: 20 to 50
Last viewed by client: 2 weeks ago
Interviewing: 0
Invites sent: 0
Unanswered invites: 0

About the client

Member since Aug 14, 2024

United States
Santa Monica, 3:14 AM
$128K total spent
79 hires, 33 active
7,888 hours

