We've been reviewing AI-built applications for Australian businesses since early 2025. 20 apps, built with Claude Code, Cursor, Lovable, Bolt, and combinations of all of them, for companies and founders getting ready to launch.
Every single one had at least one critical or high-severity finding. Not most of them. All twenty.
The 20 apps
Solo-founder MVPs through to multi-developer projects at companies with 50-plus staff. Professional services, property tech, healthcare admin, legal tech, ecommerce, internal tooling. Most were web applications; a few had mobile components. All were pre-launch or recently launched when we reviewed them.
Tools used: Claude Code (9 apps), Cursor (6), Lovable or Bolt (4), with 7 built across more than one tool. The breakdown by tool didn't matter much. Mixed codebases had the same patterns as single-tool ones.
Typical build time, by clients' own account: three to eight weeks. Typical time before anyone external looked at the security: for most of them, never.
The numbers
- 20 / 20 had at least one critical or high-severity finding
- 17 / 20 had hardcoded secrets or exposed credentials in the codebase or git history
- 14 / 20 had broken access control — authenticated users could read other users' data by changing a parameter in a request
- 13 / 20 had no rate limiting on authentication endpoints
- 11 / 20 had at least one endpoint returning stack traces or internal error details to callers
- 8 / 20 trusted security-critical values sent from the browser without server-side validation
- 6 / 20 had Australian Privacy Act exposure
Secrets and credentials
17 of the 20 apps had credentials where they shouldn't be.
Three had OpenAI or Anthropic API keys committed to git. Two of those repos had been public or semi-public at some point. One had a Stripe secret key that was committed, then removed from the working tree — still sitting in git history, recoverable with one command.
Five apps used Supabase. Two had the service role key in client-side JavaScript. The service role key bypasses all row-level security. Anyone on the site could pull it from dev tools in about thirty seconds.
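A quick way to check a built front end for this problem is to scan the bundle for JWTs and decode their payloads. The sketch below assumes the key is a JWT-format Supabase API key whose payload carries a role claim of "service_role"; the function names are ours, and it reads claims without verifying signatures, which is all a leak check needs.

```python
import base64
import json
import re

# Three base64url segments separated by dots; the header starts with "eyJ".
JWT_PATTERN = re.compile(r"eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")


def decode_segment(segment: str):
    """Decode one base64url JWT segment into JSON (no signature check)."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def find_service_role_keys(bundle_text: str) -> list[str]:
    """Return every JWT in the text whose payload claims role == 'service_role'."""
    hits = []
    for token in JWT_PATTERN.findall(bundle_text):
        try:
            payload = decode_segment(token.split(".")[1])
        except ValueError:
            continue  # looked like a JWT but did not decode; skip it
        if isinstance(payload, dict) and payload.get("role") == "service_role":
            hits.append(token)
    return hits
```

Run it over the text of each file in the build output. Any hit means the key needs rotating, not just removing from the bundle.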
Several apps were receiving Stripe or Zapier webhooks without validating the signature, or with the webhook secret written directly into the route handler.
None of this is obscure. Secrets go in environment variables, service role keys stay server-side, webhook signatures get validated on every request. The AI tools know how to write this. They just don't do it automatically.
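As a rough illustration of the first and third points, here is a minimal sketch of a webhook check. It assumes a generic scheme where the sender signs the raw request body with HMAC-SHA256 and sends the hex digest in a header; WEBHOOK_SECRET and both function names are hypothetical. Real providers define their own header formats and usually ship a verification helper in their SDK, which is the better choice when one exists.

```python
import hashlib
import hmac
import os


def load_webhook_secret() -> bytes:
    """Read the secret from the environment; fail loudly if it is missing."""
    secret = os.environ.get("WEBHOOK_SECRET")  # hypothetical variable name
    if not secret:
        raise RuntimeError("WEBHOOK_SECRET is not set")
    return secret.encode()


def is_valid_signature(secret: bytes, raw_body: bytes, signature_hex: str) -> bool:
    """Recompute the HMAC-SHA256 of the raw body and compare in constant time."""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), signature_hex.encode())
```

The route handler would call is_valid_signature on the unparsed body before doing anything else, and reject the request if it returns False.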
Broken access control
14 of the 20 apps had broken access control. OWASP has ranked it the top web application security risk for years, which tracks: it works fine in testing and fails badly once real users are on the platform.
A booking app had an endpoint that returned appointment details without checking whether the logged-in user owned that appointment. Anyone authenticated could read other users' bookings by incrementing the ID in the URL.
A property management tool had the same gap with lease documents. A tenant could pull another tenant's documents by changing the document ID in the request. The endpoint checked for a valid login. It didn't check whether that login had any claim to the document being requested.
In both cases the code worked exactly as designed. The check that asks "does this user own this record?" just wasn't there.
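The missing check is small. A minimal sketch, using an in-memory dictionary as a stand-in for the database and hypothetical names throughout (Appointment, NotFound, get_appointment):

```python
from dataclasses import dataclass


@dataclass
class Appointment:
    id: int
    owner_id: int
    details: str


class NotFound(Exception):
    """Raised for missing records and for records the caller does not own."""


# Stand-in for a database table, keyed by appointment ID.
APPOINTMENTS = {
    1: Appointment(id=1, owner_id=100, details="Alice, 9:00"),
    2: Appointment(id=2, owner_id=200, details="Bob, 10:30"),
}


def get_appointment(current_user_id: int, appointment_id: int) -> Appointment:
    """Return the appointment only if it belongs to the logged-in user."""
    appointment = APPOINTMENTS.get(appointment_id)
    # Same error for "missing" and "not yours", so IDs cannot be probed.
    if appointment is None or appointment.owner_id != current_user_id:
        raise NotFound(f"appointment {appointment_id} not found")
    return appointment
```

The user ID has to come from the server-side session, not from the request. In a real app the same effect is often better achieved by putting the owner into the query itself, so an unowned record is never loaded at all.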
Four apps had Supabase Row Level Security turned off or misconfigured. Three of those also had the service role key client-side, meaning RLS was the only thing standing between one tenant's data and another's.
Rate limiting
13 of the 20 apps had no rate limiting on authentication endpoints.
Without it, login endpoints can be brute-forced. Password reset endpoints can leak which email addresses have accounts — the response time differs between a valid address (the code runs a hash check) and an invalid one (it returns early). That difference is measurable. We saw one app where it was. Magic link endpoints can hit a mail provider's daily sending cap fast enough to lock real users out.
Rate limiting is a few lines of middleware. It doesn't come standard.
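To give a sense of scale, here is a minimal sliding-window limiter. It is a sketch under simplifying assumptions: state lives in one process's memory, and the class name and parameters are ours. A deployment with more than one server would need a shared store, and most web frameworks have maintained middleware that does this job.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` attempts per key within `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.attempts = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        recent = self.attempts[key]
        # Drop attempts that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True
```

A login handler would call something like limiter.allow(client_ip) before checking the password, and return HTTP 429 when it comes back False. Keying on the target account as well as the IP address covers attackers who spread attempts across many addresses.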
Client-side trust
8 of the 20 apps trusted values from the browser for decisions that should happen server-side.
The clearest case: a checkout flow where the item price came in the POST body and the server accepted whatever was sent. We changed the price field in the request. A $499 subscription cost us $0.01.
Another app kept the user's role in a value set by the client and trusted it on every subsequent request without ever checking the database. Changing the value gave full admin access to any authenticated account.
Normal testing doesn't catch these because developers test their own flows. Nobody sends a request with price: 0.01 unless they're looking for it.
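The fix for both cases has the same shape: the client says what it wants, and the server decides what that costs and who is allowed. A minimal sketch, with dictionaries standing in for database lookups and every name (PRICES_CENTS, USER_ROLES, the product ID, the function names) hypothetical:

```python
# Stand-ins for server-side data; in a real app these would be database lookups.
PRICES_CENTS = {"pro-annual": 49900}
USER_ROLES = {"user-1": "member", "user-2": "admin"}


def price_for_checkout(request_body: dict) -> int:
    """Look up the price by product ID; ignore any price the client sent."""
    product_id = request_body.get("product_id")
    if not isinstance(product_id, str) or product_id not in PRICES_CENTS:
        raise ValueError(f"unknown product: {product_id!r}")
    return PRICES_CENTS[product_id]


def is_admin(session_user_id: str, request_body: dict) -> bool:
    """Decide admin access from the server's record, never from request_body."""
    return USER_ROLES.get(session_user_id) == "admin"
```

With this shape, a request body of {"product_id": "pro-annual", "price": 0.01} is still charged 49900 cents, and a body containing "role": "admin" changes nothing.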
Privacy Act exposure
Six apps had findings under the Australian Privacy Act 1988.
The pattern: collecting data the app had no reason to collect, no retention policy on what was stored, and personal information in application logs without masking. Two apps had Medicare numbers in the logs.
Two were collecting health-adjacent information with no disclosure to users about how it would be stored or used. Health information carries stricter obligations under the Australian Privacy Principles than general personal data. Both apps were handling it the same as any other form field.
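Masking personal information in logs can be done in one place with a logging filter. The sketch below is a rough illustration, not a compliance control: the patterns are our assumptions (a 10-digit card number, optionally grouped 4-5-1, with an optional trailing reference digit, plus a simple email pattern), and they will over-match other long numbers. Not logging the value in the first place is always the better option.

```python
import logging
import re

# Assumed formats; these deliberately err on the side of redacting too much.
MEDICARE_PATTERN = re.compile(r"\b\d{4}[ -]?\d{5}[ -]?\d(?:[ -]?\d)?\b")
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def mask_personal_info(text: str) -> str:
    """Replace card-number-like digit runs and email addresses with placeholders."""
    text = MEDICARE_PATTERN.sub("[REDACTED-NUMBER]", text)
    return EMAIL_PATTERN.sub("[REDACTED-EMAIL]", text)


class MaskingFilter(logging.Filter):
    """Mask the fully formatted message before any handler writes it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = mask_personal_info(record.getMessage())
        record.args = ()  # args are already merged into msg
        return True
```

Attach it to each handler with handler.addFilter(MaskingFilter()) so that every record passing through that handler is masked, including records from third-party libraries.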
What the AI tools are getting wrong
The tools write correct code for the feature as described. Whether the right people can call an endpoint, whether the input can be abused, whether a secret ends up somewhere public — those things aren't in the feature description, so they don't make it into the output.
OX Security scanned four million AI-generated pull requests and found 62% had at least one vulnerability. Their review of 15 vibe-coded apps found 100% had SSRF vulnerabilities and 0% had security headers. RedHunt Labs found credentials in 1 in 5 of roughly 130,000 publicly accessible vibe-coded sites.
We've also seen this more than once: asking the AI to fix a vulnerability introduced a new one. More prompting doesn't fix the underlying problem. Security needs to be a separate pass by someone looking for problems, not features.
The free tool
We built a free Claude Code skill that runs 18 audit categories against your codebase — OWASP Top 10 plus AI-specific risks. It's not a substitute for hands-on review, but it will find committed secrets, flag Supabase RLS gaps, and catch the access control patterns that keep showing up. MIT licensed, works with Claude Code, Cursor, ChatGPT, or anything else. One command: /ironsights-vibe-check. Available at github.com/ironsightscyber/vibe-security.
If you're about to launch
Run the free tool. If it finds nothing, good. If it finds something, you have a list to work through before go-live.
If you're handling customer data, taking payments, or building for a client who can't absorb a breach, a professional review is worth doing. We run fixed-scope vibe security checks for Australian businesses: OWASP Top 10, access control, secrets, authentication, dependency risk, Privacy Act exposure, written findings brief. Usually 2–5 business days. Details at ironsights.com.au/vibe-security. Not sure if you need it? Contact us and we'll be straight with you.


