We Audited 20 Vibe-Coded Apps: Security Findings

We've been reviewing AI-built applications for Australian businesses since early 2025. Twenty apps, built with Claude Code, Cursor, Lovable, Bolt, and combinations of them, for companies and founders getting ready to launch.

Every single one had at least one critical or high-severity finding. Not most of them. All twenty.

The 20 apps

Solo-founder MVPs through to multi-developer projects at companies with 50-plus staff. Professional services, property tech, healthcare admin, legal tech, ecommerce, internal tooling. Most were web applications; a few had mobile components. All were pre-launch or recently launched when we reviewed them.

Tools used: Claude Code (9 apps), Cursor (6), Lovable or Bolt (4), with 7 built across more than one tool. The breakdown by tool didn't matter much. Mixed codebases had the same patterns as single-tool ones.

Typical build time, by clients' own account: three to eight weeks. Typical time before anyone external looked at the security: for most of them, never.

The numbers

20 / 20 had at least one critical or high-severity finding
17 / 20 had hardcoded secrets or exposed credentials in the codebase or git history
14 / 20 had broken access control — authenticated users could read other users' data by changing a parameter in a request
13 / 20 had no rate limiting on authentication endpoints
11 / 20 had at least one endpoint returning stack traces or internal error details to callers
8 / 20 trusted security-critical values sent from the browser without server-side validation
6 / 20 had Australian Privacy Act exposure

Secrets and credentials

17 of the 20 apps had credentials where they shouldn't be.

Three had OpenAI or Anthropic API keys committed to git. Two of those repos had been public or semi-public at some point. One had a Stripe secret key that was committed, then removed from the working tree — still sitting in git history, recoverable with one command.

Five apps used Supabase. Two had the service role key in client-side JavaScript. The service role key bypasses all row-level security. Anyone on the site could pull it from dev tools in about thirty seconds.

Several apps were receiving Stripe or Zapier webhooks without validating the signature, or with the webhook secret written directly into the route handler.
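As a rough illustration of what validating a webhook signature involves, here is a minimal sketch. It assumes a generic scheme in which the sender signs `timestamp.payload` with HMAC-SHA256 and a shared secret. The function name `verify_webhook`, the argument layout and the five-minute tolerance are our assumptions for the example, not any provider's API. Stripe and similar providers document their own header format and ship SDK helpers, which are the better choice in a real app.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_webhook(payload: bytes, timestamp: str, signature_hex: str,
                   secret: bytes, tolerance_s: int = 300,
                   now: Optional[float] = None) -> bool:
    """Return True only if the signature matches and the timestamp is recent."""
    now = time.time() if now is None else now
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False
    # Reject stale or far-future deliveries to limit replay of captured requests.
    if abs(now - sent_at) > tolerance_s:
        return False
    # Sign the raw request bytes, not a re-serialised copy of the parsed JSON.
    signed = timestamp.encode() + b"." + payload
    expected = hmac.new(secret, signed, hashlib.sha256).hexdigest()
    # compare_digest compares in constant time, so a mismatch leaks no position.
    return hmac.compare_digest(expected.encode(), signature_hex.encode())
```

The secret itself would be read from the environment at startup rather than written into the route handler.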

None of this is obscure. Secrets go in environment variables, service role keys stay server-side, webhook signatures get validated on every request. The AI tools know how to write this. They just don't do it automatically.
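A minimal sketch of the environment-variable habit, assuming a Python backend. The helper `require_secret` and the variable name `STRIPE_SECRET_KEY` are illustrative, not taken from any of the audited apps. The idea is to fail at startup when a secret is missing instead of falling back to a value baked into the source.

```python
import os
from typing import Mapping, Optional


class MissingSecretError(RuntimeError):
    """Raised at startup when a required secret is not configured."""


def require_secret(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Read a secret from the environment and refuse to start without it."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        # No hardcoded fallback: a missing secret should stop the deploy.
        raise MissingSecretError(f"{name} is not set")
    return value


# Typical use at module load time on the server (never in client-side code):
# STRIPE_SECRET_KEY = require_secret("STRIPE_SECRET_KEY")
```

This only keeps new secrets out of the repository. A key that has already been committed still needs to be rotated, because it remains in git history.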

Broken access control

14 of the 20 apps had broken access control. OWASP's Top 10 has ranked it the number one web application security risk since 2021, which tracks. It works fine in testing and fails badly once real users are on the platform.

A booking app had an endpoint that returned appointment details without checking whether the logged-in user owned that appointment. Anyone authenticated could read other users' bookings by incrementing the ID in the URL.

A property management tool had the same gap with lease documents. A tenant could pull another tenant's documents by changing the document ID in the request. The endpoint checked for a valid login. It didn't check whether that login had any claim to the document being requested.

In both cases the code worked exactly as designed. The check that asks "does this user own this record?" just wasn't there.
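The missing check is small. A minimal sketch, assuming an in-memory dictionary in place of a database and hypothetical names (`Appointment`, `get_appointment`, `NotFound`) that do not come from the audited apps:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Appointment:
    id: int
    owner_id: int
    details: str


class NotFound(Exception):
    """Raised for both missing and foreign records, so IDs cannot be probed."""


# Hypothetical in-memory store standing in for a database table.
APPOINTMENTS = {
    1: Appointment(1, owner_id=10, details="Tue 9am"),
    2: Appointment(2, owner_id=20, details="Wed 2pm"),
}


def get_appointment(current_user_id: int, appointment_id: int) -> Appointment:
    """Return the appointment only if the logged-in user owns it."""
    record = APPOINTMENTS.get(appointment_id)
    # The ownership check: a valid login alone is not enough.
    if record is None or record.owner_id != current_user_id:
        raise NotFound(appointment_id)
    return record
```

Here `current_user_id` is assumed to come from the verified session, not from the request body or URL. Returning the same error for "does not exist" and "not yours" is a design choice that avoids confirming which IDs are valid.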

Four apps had Supabase Row Level Security turned off or misconfigured. Three of those also had the service role key client-side, meaning RLS was the only thing standing between one tenant's data and another's.

Rate limiting

13 of the 20 apps had no rate limiting on authentication endpoints.

Without it, login endpoints can be brute-forced. Password reset endpoints can leak which email addresses have accounts — the response time differs between a valid address (the code runs a hash check) and an invalid one (it returns early). That difference is measurable. We saw one app where it was. Magic link endpoints can hit a mail provider's daily sending cap fast enough to lock real users out.

Rate limiting is a few lines of middleware. It doesn't come standard.
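To show roughly what those few lines look like, here is a sliding-window limiter sketch. It is in-memory and per-process, so it is an assumption-laden illustration and not production middleware. A real deployment would usually keep the counters in a shared store and key them on IP address, account, or both. The class name and parameters are ours.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_attempts per key within any window_s-second window."""

    def __init__(self, max_attempts: int, window_s: float, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.window_s = window_s
        self.clock = clock  # injectable so tests do not need to sleep
        self._hits = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        hits = self._hits[key]
        # Drop attempts that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.max_attempts:
            return False  # caller should answer with HTTP 429
        hits.append(now)
        return True
```

A login handler would call something like `limiter.allow(client_ip)` before checking the password and return a 429 response when it gets `False`.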

Client-side trust

8 of the 20 apps trusted values from the browser for decisions that should happen server-side.

The clearest case: a checkout flow where the item price came in the POST body and the server accepted whatever was sent. We changed the price field in the request. A $499 subscription cost us $0.01.
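The fix is for the server to look the price up itself. A minimal sketch, assuming a hypothetical server-side catalogue (`PRICES`) and product ID (`pro-annual`). In a real app the catalogue would live in a database or in the payment provider's own price objects.

```python
from decimal import Decimal

# Hypothetical server-side catalogue; the client never supplies amounts.
PRICES = {"pro-annual": Decimal("499.00")}


def build_charge(request_body: dict) -> dict:
    """Build the charge from the product ID only, ignoring any client-sent price."""
    product_id = request_body.get("product_id")
    if not isinstance(product_id, str) or product_id not in PRICES:
        raise ValueError("unknown product")
    # Any "price" or "amount" field in request_body is deliberately not read.
    return {"product_id": product_id, "amount": PRICES[product_id]}
```

With this shape, a tampered request body such as `{"product_id": "pro-annual", "price": "0.01"}` still produces a charge for the catalogue price.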

Another app passed the user's role as a client-generated value that the app trusted on every subsequent request without ever checking the database. Changing the value gave full admin access to any authenticated account.
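The same principle applies to roles: authorise from the server-side record on each request. A minimal sketch, assuming a hypothetical `USER_ROLES` table keyed by the user ID taken from a verified session. The names are illustrative only.

```python
# Hypothetical user table; in practice this is a database lookup.
USER_ROLES = {10: "member", 20: "admin"}


class Forbidden(Exception):
    """Raised when the server-side record does not grant the required role."""


def role_for(session_user_id: int) -> str:
    """Look the role up server-side; unknown users get no role."""
    return USER_ROLES.get(session_user_id, "none")


def require_admin(session_user_id: int, request_body: dict) -> None:
    """Authorise from the stored role. request_body["role"] is never consulted."""
    if role_for(session_user_id) != "admin":
        raise Forbidden(session_user_id)
```

A request from user 10 carrying `{"role": "admin"}` is still rejected, because the claimed value never enters the decision.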

Normal testing doesn't catch these because developers test their own flows. Nobody sends a request with price: 0.01 unless they're looking for it.

Privacy Act exposure

Six apps had findings under the Australian Privacy Act 1988.

The pattern: collecting data the app had no reason to collect, no retention policy on what was stored, and personal information in application logs without masking. Two apps had Medicare numbers in the logs.
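Masking can happen at the logging layer. The sketch below uses Python's standard `logging.Filter` and a deliberately broad pattern for 10 or 11 digit numbers in the usual Medicare card grouping. The regex is our assumption for illustration and will also redact other numbers of the same shape. It does not validate check digits, and it is no substitute for not logging the field in the first place.

```python
import logging
import re

# Assumed pattern: 4 + 5 + 1 digits, optional separators, optional reference digit.
MEDICARE_LIKE = re.compile(r"\b\d{4}[ -]?\d{5}[ -]?\d(?:[ -]?\d)?\b")


def mask(text: str) -> str:
    """Replace anything shaped like a Medicare number with a placeholder."""
    return MEDICARE_LIKE.sub("[REDACTED]", text)


class MaskingFilter(logging.Filter):
    """Redact Medicare-like numbers from the fully formatted log message."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format first so values passed as %-style args are masked too.
        record.msg = mask(record.getMessage())
        record.args = ()
        return True  # keep the record, now redacted
```

Attaching the filter to each handler (`handler.addFilter(MaskingFilter())`) covers records that propagate up from child loggers. A filter attached to a logger only applies to records logged directly through that logger.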

Two were collecting health-adjacent information with no disclosure to users about how it would be stored or used. Health information carries stricter obligations under the Australian Privacy Principles than general personal data. Both apps were handling it the same as any other form field.

What the AI tools are getting wrong

The tools write correct code for the feature as described. Whether the right people can call an endpoint, whether the input can be abused, whether a secret ends up somewhere public — those things aren't in the feature description, so they don't make it into the output.

OX Security scanned four million AI-generated pull requests and found 62% had at least one vulnerability. Their review of 15 vibe-coded apps found 100% had SSRF vulnerabilities and 0% had security headers. RedHunt Labs found credentials in 1 in 5 of roughly 130,000 publicly accessible vibe-coded sites.

We've also seen this more than once: asking the AI to fix a vulnerability introduced a new one. More prompting doesn't fix the underlying problem. Security needs to be a separate pass by someone looking for problems, not features.

The free tool

We built a free Claude Code skill that runs 18 audit categories against your codebase — OWASP Top 10 plus AI-specific risks. It's not a substitute for hands-on review, but it will find committed secrets, flag Supabase RLS gaps, and catch the access control patterns that keep showing up. MIT licensed, works with Claude Code, Cursor, ChatGPT, or anything else. One command: /ironsights-vibe-check. Available at github.com/ironsightscyber/vibe-security.

If you're about to launch

Run the free tool. If it finds nothing, good. If it finds something, you have a list to work through before go-live.

If you're handling customer data, taking payments, or building for a client who can't absorb a breach, a professional review is worth doing. We run fixed-scope vibe security checks for Australian businesses: OWASP Top 10, access control, secrets, authentication, dependency risk, Privacy Act exposure, written findings brief. Usually 2–5 business days. Details at ironsights.com.au/vibe-security. Not sure if you need it? Contact us and we'll be straight with you.
