Your agent is the brain.
We are the rails it runs on.
Your Claude or Cursor agent already knows how to test. We give it the browser, the inboxes, the SMS numbers, the personas, the video evidence. One MCP call away, no infra to own.
Demo generated by TestMyVibes' own assemble_demo_video MCP tool. Auto-regenerated when features change.
Wire it in
Add the MCP server to Claude Code:
claude mcp add --transport http testmyvibes \
  https://testmyvibes.com/mcp \
  --header "Authorization: Bearer $TMV_API_KEY"
Now your agent can run an entire signup-to-dashboard test against your site from any conversation:
Run a TMV Whole Kit Core against my staging site
(https://staging.myapp.com). Three stories: signup,
checkout, dashboard. Use the existing test inbox.
Report bugs by severity.
What's behind the tools
- 12 AI evaluation personalities: Marketing Strategist, Legal Eagle, MCP Auditor (yes, we test other MCP servers), Skeptical CTO, Consistency Auditor, First-Time User, and more. All run with stickler mode on top, so they catch the microcopy and pricing drift that the obvious bugs hide.
- Whole Kit combo tiers: Solo, Core, Plus, Max. 1 to 10 named user stories × every personality, with auto-pause-and-refund if cumulative bugs cross the threshold.
- N-agent interaction scenes: multi-role flows via signal()/wait_for_signal(). Publisher + viewer. Buyer + seller. Multi-user chat. The kind of testing single-browser tools can't do.
- Real session inboxes + SMS pool, provisioned per job. wait_for_email and wait_for_sms are first-class.
- Persona generation that cleans up after itself. No more fake users in your DB.
- Evidence by default: WebM video, screenshot grid, console + network capture, signed URLs to Spaces.
- Hands-off demo + voiceover assembly: assemble_demo_video + Hume Octave TTS, the same toolchain that generated the demo above.
- Feedback intake MCP: your agent can file structured bug reports against us. Queued for joint review, never auto-acted.
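To make the N-agent idea concrete, here is a rough sketch of the signal()/wait_for_signal() handshake for a publisher + viewer scene. It is an in-process illustration only, not our implementation: the SignalBoard class, the signal names, and the timeout parameter are all hypothetical, and the real tools coordinate separate browser sessions rather than asyncio tasks.

```python
import asyncio


class SignalBoard:
    """Hypothetical in-process stand-in for signal()/wait_for_signal()."""

    def __init__(self):
        self._events = {}

    def _event(self, name):
        # Create the named event lazily so either side may arrive first.
        return self._events.setdefault(name, asyncio.Event())

    def signal(self, name):
        self._event(name).set()

    async def wait_for_signal(self, name, timeout=5.0):
        # Raises asyncio.TimeoutError if the other role never signals.
        await asyncio.wait_for(self._event(name).wait(), timeout)


async def publisher(board, log):
    log.append("publisher: post published")
    board.signal("post_published")
    await board.wait_for_signal("viewer_saw_post")
    log.append("publisher: viewer confirmed")


async def viewer(board, log):
    # The viewer does nothing until the publisher has acted.
    await board.wait_for_signal("post_published")
    log.append("viewer: post visible")
    board.signal("viewer_saw_post")


async def run_scene():
    board, log = SignalBoard(), []
    await asyncio.gather(publisher(board, log), viewer(board, log))
    return log


def run_scene_sync():
    return asyncio.run(run_scene())
```

Calling run_scene_sync() returns the three log lines in causal order. The point of the pattern is that each role blocks on the other's signal, so the ordering holds no matter which role starts first.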
Pricing, honest
1 credit = $0.10. Smallest pack is $5 (50 credits). Pre-flight cost quote in the submit_test response, refund-on-underrun on every job. Whole Kit tiers charge a flat per-tier price so multi-story runs can't surprise-bill. See /pricing for the full breakdown or /api/v1/pricing.json for the machine-readable version.
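The credit math above can be sketched in a few lines. The $0.10 rate and the 50-credit pack come from the text. The refund rule (quoted credits minus credits actually used, never negative) is our reading of "refund-on-underrun" for illustration, and the function names are hypothetical rather than part of the API.

```python
from decimal import Decimal

CREDIT_PRICE_USD = Decimal("0.10")  # from the text: 1 credit = $0.10


def credits_to_usd(credits: int) -> Decimal:
    """Convert a credit count to US dollars."""
    return CREDIT_PRICE_USD * credits


def underrun_refund(quoted_credits: int, used_credits: int) -> int:
    """Credits returned when a job uses fewer credits than quoted.

    Assumed rule for illustration: refund the difference, never negative.
    """
    return max(quoted_credits - used_credits, 0)
```

For example, credits_to_usd(50) gives the $5 smallest pack, and a job quoted at 40 credits that uses 31 would return 9 credits under this assumed rule.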
Why us, when anyone could wire Puppeteer to GPT-4o
Anyone can, in a weekend. We have already solved the long tail: Chromium request-interception races, Mailgun catchall routing, 10DLC SMS campaigns, persona-collision recovery, fan-out, dedupe, retries, evidence capture, the queue, the refunds, the auth, the rate-limited egress proxy pool. Months of plumbing you don't have to own at 3am.
Plus humans, on demand, when judgment matters more than throughput.
Daily release notes at /release-notes · One blog post a day at /blog