You will get an AI agent security and reliability audit + test suite


Project details
Your AI agent demos perfectly — then fails in production: it hallucinates, mishandles edge cases, leaks data through prompt injection, or calls the wrong tool at the worst moment.
I run an independent reliability + security audit and hand you the evidence to fix it.
• Reliability: failure modes, edge cases, non-determinism, hallucination, accuracy, error handling.
• Security: prompt injection, tool/function abuse, data exfiltration, jailbreak resistance, over-permissioned agency, secrets & output handling.
You get a severity-ranked findings report with a prioritized fix list — and, on higher tiers, a runnable test/eval harness so you catch regressions before your users do, plus implementation and re-verification of the critical fixes.
I audit support, voice, RAG, workflow and multi-agent systems on any stack (Claude Agent SDK, LangChain, custom), with a specialty in on-chain / Web3 agents.
Background: 1st-place Solana security auditor (Anchor CPI spoofing, CVSS 7.5) who also ships agents with real test suites (287 automated tests on a production agent). Builder and breaker in one.
Fully async, written-first. Let's get your agent production-ready.
I run an independent reliability + security audit and hand you the evidence to fix it.
• Reliability: failure modes, edge cases, non-determinism, hallucination, accuracy, error handling.
• Security: prompt injection, tool/function abuse, data exfiltration, jailbreak resistance, over-permissioned agency, secrets & output handling.
You get a severity-ranked findings report with a prioritized fix list — and, on higher tiers, a runnable test/eval harness so you catch regressions before your users do, plus implementation and re-verification of the critical fixes.
I audit support, voice, RAG, workflow and multi-agent systems on any stack (Claude Agent SDK, LangChain, custom), with a specialty in on-chain / Web3 agents.
Background: 1st-place Solana security auditor (Anchor CPI spoofing, CVSS 7.5) who also ships agents with real test suites (287 automated tests on a production agent). Builder and breaker in one.
Fully async, written-first. Let's get your agent production-ready.
AI Algorithms
Large Language Model, Multimodal Large Language ModelAI Applications
AI Chatbot, Conversational AIAI Development Language
PythonAI Models
ChatGPT, GPT-4, LLaMAWhat's included
| Service Tiers |
Starter
$249
|
Standard
$699
|
Advanced
$1,499
|
|---|---|---|---|
| Delivery Time | 4 days | 7 days | 14 days |
Number of Revisions | 1 | 2 | 3 |
AI Model Integration | - | - | - |
Batch Normalization | - | - | - |
Database Integration | - | - | - |
Detailed Code Comments | |||
Image Upscaling | - | - | - |
MLOps | - | - | - |
Model Deployment | - | - | |
Model Documentation | |||
Model Monitoring | - | - | |
Model Testing & Optimization | - | ||
Model Tuning | - | - | - |
Natural Language Processing | - | - | - |
NLP Tokenization | - | - | - |
Pre-Training | - | - | - |
Prompt Engineering | - | - | - |
Setup File | - | - | - |
Source Code | - |
Optional add-ons
You can add these on the next page.
Additional agent or flow audited
(+ 3 Days)
+$199
Wire the test suite into your CI
(+ 2 Days)
+$149Frequently asked questions
About Rheza
AI Agent Engineer on Solana | Audit-Proven | RECTOR LABS
Bekasi, Indonesia - 11:27 am local time
I don't build chatbots that get stuck. I build agentic systems that survive mainnet: LLMs that orchestrate real on-chain actions, agent privacy infrastructure, multi-chain stealth tooling. Every line of agent code I ship is audited like a smart contract — because that's what agents are: programs with money and autonomy.
What I build:
- Production agentic apps — LLM tool calling, multi-step reasoning, auto-recovery, preflight simulation
- Agent infrastructure on Solana — REST APIs + skill files for AI agents to consume on-chain primitives
- Solana programs — Rust + Anchor, audit-grade architecture, on-chain logic that survives mainnet
- Agent privacy — stealth addresses, viewing keys, Pedersen commitments across 17 chains
- Smart contract security — vulnerability hunting, audit reports, mitigations (13 findings across 14 audited Solana repos)
Proof of work:
- Sipher — Privacy-as-a-Skill for Multi-Chain Agents (66 endpoints, 17 chains, on-chain Anchor program, 1,402 tests, live in production)
- Kami — AI Co-Pilot for Kamino DeFi (plain English → signed mainnet tx, LLM auto-recovery, live on Solana mainnet)
- 11 hackathon & bounty wins — $36,050+ across the Solana ecosystem
- $10K Solana Foundation grant + $6K audit subsidy for SIP Protocol
- 1st of 116 in a Solana Security Audit — 13 findings across 14 repos, incl. a framework-level Anchor CPI bug (CVSS 7.5, fixed upstream)
How I work:
- Production-first — every line written as if it ships tonight, audited tomorrow
- Fully async — written updates + Loom walkthroughs by default. I'm in Jakarta (UTC+7); async keeps us both unblocked across time zones. Saves you status meetings, gives me deep focus.
- Honest comms — clear timelines, no scope creep, no ghosting
- Security-conscious — agents move money, so the bar is high
- Pragmatic — I ask questions before coding, pick the right tool, and say "no" when a feature doesn't serve the goal
Stack: Solana, Rust, Anchor, TypeScript, Next.js, Vercel AI SDK, MCP, Noir (ZK), Python, Docker, PostgreSQL
Let's ship a production agent — for your DeFi protocol, your trading desk, your DAO treasury, or whatever else needs to be smart, secure, and on-chain.
Steps for completing your project
After purchasing the project, send requirements so Rheza can start the project.
Delivery time starts when Rheza receives requirements from you.
Rheza works on your project following the steps below.
Revisions may occur after the delivery date.
Scope & access
Confirm the agent or flow in scope, then collect repo access, an architecture overview, and any known failure cases.
Reliability + security review
Manual and tooled analysis across reliability (failure modes, edge cases, hallucination) and security (prompt injection, tool abuse, data exfiltration).