The thing that changed, and the thing that didn’t
There is a version of AI-assisted development that is genuinely transformative. A non-technical founder can describe a product idea in plain language and have a working prototype within a day. A solo builder can ship what previously required a small engineering team. A developer who has been wanting to build a tool for eight years can finish it in three months. These outcomes are real.
What did not change is which decisions require judgment. AI tools lowered the cost of writing code. They did not lower the cost of making bad decisions about what to build, how to structure it, who can access it, or what happens when something goes wrong. For builders with deep technical experience, this is manageable. They already know which categories of decision require care. For builders who are newer to the process, or building entirely through AI tooling, the risk is invisible because the tools make it easy to skip past the decisions that matter most.
The experience of building has changed. The topology of the risks has not.
Why it works at first
Lalit Maganti, a developer who spent eight years wanting to build a proper SQLite toolset, finally built it this year using AI assistance. He shipped in three months. He also lost an entire month to what he calls “vibe-coding” — accepting AI-generated code that looked plausible, letting the model make architectural decisions he had not explicitly reviewed, and pushing forward until the thing collapsed under its own weight.
The collapse did not look like a crash. Tests passed. The code compiled. Individual functions did what they were supposed to do. What failed was the global coherence of the thing. The implicit decisions about how pieces related to each other had accumulated into a structure that could not be built on further without contradiction. He rewrote the whole thing and finished it the second time by keeping all design decisions in his own hands, delegating only the mechanical work.
This is the shape of the problem. AI-assisted code tends to be locally correct. Individual components work. The tests you think to write pass. What the model cannot produce reliably is global coherence: the kind of architectural consistency that comes only from someone who has been living with the decisions and understands why each one was made. When you outsource that to the model, you do not just lose control of one decision. You lose the thread that makes the next decision sensible.
The tools are genuinely good at the mechanical work. They are not good at understanding what the thing is supposed to become.
What judgment actually is
It is worth being clear about what judgment means here, because the word is easy to wave at without defining.
Judgment, in the context of building software, is the ability to evaluate whether a decision is correct along dimensions that cannot be reduced to a test. A function that compiles and passes its test suite is correct in the local sense. But whether that function is the right function to write — whether the abstraction is at the right level, whether the interface it creates will feel coherent when you are building on top of it six months later — cannot be checked mechanically. There is no test for “this API is pleasant to work with.” There is no compiler error for “this design will make everything harder as the product grows.”
The model predicts plausible next steps based on what you have told it and what it learned during training. It does not have a mental model of your product, your users, or your constraints that would let it evaluate whether a decision is right in the larger sense. It produces what is locally consistent with the prompt. Whether that local consistency adds up to something coherent and durable is a different question — and it is yours to answer, not the model’s.
The practical consequence is that there is a class of decisions you need to own explicitly, and the most important thing to do before starting is to know what that class is.
The risk categories most builders don’t know to ask about
For builders with engineering backgrounds, the risk categories are familiar territory. For builders without that background, or founders who are building their first software product using AI tools, these categories often go unrecognised until something goes wrong.
Security and access control. Who can access what data, under what conditions, is a decision that compounds quietly. AI will generate authentication code that technically works. It will not necessarily structure permissions in a way that makes sense when you have multiple user types, or flag that you are storing credentials in a way that creates exposure. The model follows the pattern you describe; it does not audit whether that pattern is safe. By the time a security problem is visible, it has usually been present for a long time.
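The difference between a pattern the model emits and a decision you own can be made concrete. The sketch below, not from the source and with hypothetical roles and actions, shows permissions as an explicit, deny-by-default table, so that every access rule is something a person chose on purpose.

```python
from enum import Enum

class Role(Enum):
    VIEWER = "viewer"
    EDITOR = "editor"
    ADMIN = "admin"

# Explicit permission table: every (role, action) pair is a decision
# someone made deliberately, not a pattern the model happened to emit.
PERMISSIONS = {
    Role.VIEWER: {"read"},
    Role.EDITOR: {"read", "write"},
    Role.ADMIN: {"read", "write", "delete", "manage_users"},
}

def is_allowed(role: Role, action: str) -> bool:
    # Deny by default: an action absent from the table is forbidden,
    # so adding a new feature forces an explicit permissions decision.
    return action in PERMISSIONS.get(role, set())
```

The point is not the mechanism but the posture: anything not explicitly granted is denied, which means a new user type or feature cannot silently inherit access.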
Data handling and what you’re collecting. If your product touches user data — and almost everything does — decisions about what you store, how long you keep it, and what you do with it carry regulatory and reputational weight that is invisible in the prototype but unavoidable in production. AI tools will generate data models that handle the immediate requirement. They will not volunteer the question of whether you need GDPR consent flows, what your data retention policy should be, or what happens to user data if someone requests deletion. Those are judgment calls that require knowing the question exists.
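One way to keep these decisions visible is to state retention as data rather than burying it in individual queries. This is a minimal sketch with hypothetical table names and retention windows; real retention periods are a legal and product decision, not a default the model should pick.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical retention policy, declared in one place so the decision
# is reviewable instead of implied by scattered delete queries.
RETENTION = {
    "analytics_events": timedelta(days=90),
    "audit_logs": timedelta(days=365),  # regulatory floor, not a guess
}

@dataclass
class Record:
    table: str
    created_at: datetime

def is_expired(record: Record, now: datetime) -> bool:
    """True if the record has outlived its declared retention window.

    A table with no declared policy is never auto-deleted here; the
    gap in the table is the prompt to make a conscious decision."""
    limit = RETENTION.get(record.table)
    return limit is not None and now - record.created_at > limit
```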
What breaks at scale. AI-generated code tends to be optimised for the case you described, not the case where ten times as many people use it at once. Architectural decisions that work fine for a prototype become problems when load increases: database queries that scan entire tables, synchronous operations that block on slow external calls, infrastructure that assumes a single server. The model has no knowledge of your traffic expectations, your budget constraints, or which parts of the system will be under pressure first. Scalability is a judgment call about what matters to get right now versus what you can fix later, and that judgment requires understanding the product’s growth trajectory.
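The table-scan case can be shown in a few lines. This sketch uses SQLite with invented table and column names; the point is the shape of the fix, an index plus a bound, not the specific engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, ts INTEGER)"
)
conn.executemany(
    "INSERT INTO events (user_id, ts) VALUES (?, ?)",
    [(i % 50, i) for i in range(1000)],
)

# Prototype-shaped query: with no index on user_id, this scans the
# whole table on every request. Fine at 1,000 rows; a problem at
# 10 million.
rows = conn.execute("SELECT * FROM events WHERE user_id = ?", (7,)).fetchall()

# The same lookup with an index and a bound: work proportional to the
# page of results, not to the size of the table.
conn.execute("CREATE INDEX idx_events_user ON events (user_id, ts)")
page = conn.execute(
    "SELECT * FROM events WHERE user_id = ? ORDER BY ts LIMIT 20", (7,)
).fetchall()
```

Both queries return the same data at prototype scale, which is exactly why the difference stays invisible until load arrives.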
Vendor and platform lock-in. When AI helps you build on a specific platform, framework, or service, it tends to reach for the native tools of that platform. This is efficient in the short term. It can also mean that migrating away from that platform later is substantially more expensive than it needed to be. The decision about which external dependencies are load-bearing versus optional is rarely made explicitly. It accumulates through a series of individually reasonable choices. A year in, you may discover that switching payment providers, authentication systems, or hosting infrastructure would require rebuilding significant parts of the product. That is a judgment call that should happen early, but often does not happen at all.
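One early decision that limits this cost is to put a narrow interface between the product and any vendor. A minimal sketch, with a hypothetical payment interface; the provider here is a stand-in, not a real SDK.

```python
from abc import ABC, abstractmethod

class PaymentProvider(ABC):
    """Narrow seam: the rest of the product depends on this interface,
    never on a specific vendor's SDK."""

    @abstractmethod
    def charge(self, amount_cents: int, customer_id: str) -> str:
        """Charge the customer; return a provider-agnostic transaction id."""

class FakeProvider(PaymentProvider):
    # Stand-in implementation for illustration; a real adapter would
    # wrap the vendor's client library behind the same interface.
    def charge(self, amount_cents: int, customer_id: str) -> str:
        return f"txn-{customer_id}-{amount_cents}"

def checkout(provider: PaymentProvider, customer_id: str, amount_cents: int) -> str:
    # Application code sees only the interface, so switching vendors
    # means writing one new adapter, not rebuilding checkout.
    return provider.charge(amount_cents, customer_id)
```

The judgment call is which dependencies deserve a seam like this; wrapping everything is its own kind of debt.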
Maintainability. Code that works today but that you cannot understand or modify in six months is a hidden liability. AI-generated code can be syntactically coherent and operationally functional while being structured in ways that make future changes harder than they need to be. If no one on the team can explain why a component is structured the way it is, then changing it later means reverse-engineering a decision that was never documented, because the model made that choice and you accepted it without reviewing what it meant. Maintainability is not a property the model optimises for.
Where to hold the line
The practical division is between decisions with objective correctness criteria and decisions without them.
Tasks with objective correctness criteria are safe to delegate. If there is a test, a compiler, a type checker, or a format spec that confirms the output is correct, the model will generally do this well and faster than you will. Generating boilerplate, implementing a well-defined algorithm, writing tests for existing functionality, converting data between formats, building out repetitive UI — these are mechanical. Delegate them fully.
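The division can be seen in miniature: a format conversion is safe to delegate precisely because a short test is the whole contract. The converter and test below are hypothetical illustrations, not from the article.

```python
import csv
import io
import json

def csv_to_json(text: str) -> str:
    """Mechanical task of the kind that is safe to delegate: the
    correctness criterion is objective and checkable."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return json.dumps(rows)

def test_roundtrip():
    # The test is the contract: if this passes, the delegated
    # implementation is acceptable, whoever or whatever wrote it.
    src = "name,qty\nwidget,3\ngadget,5\n"
    data = json.loads(csv_to_json(src))
    assert data == [
        {"name": "widget", "qty": "3"},
        {"name": "gadget", "qty": "5"},
    ]
```

Contrast this with "design the authentication system": no test of comparable size can certify that the design is right, which is what puts it on the other side of the line.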
Tasks without objective correctness criteria require you to stay in the loop. This includes anything where the question of whether the result is right involves context the model does not have: your users’ expectations, your product’s trajectory, your team’s operational capacity, your risk tolerance, regulatory requirements specific to your domain. The model cannot answer these questions because they require knowledge about the future and about constraints that live in your head, not in the prompt.
The practical decision rule: before accepting a block of AI-generated code or design, ask what it would take to verify that this is the right decision, not just that the code works. If you can write a test for it, you can probably trust the model’s output. If the answer is “it would require understanding the long-term architecture” or “it depends on what our users will actually need”, you are in judgment territory.
The faster you build, the more deliberately you need to learn
Drew Breunig observed that AI generates roughly a thousand lines of code per commit, compared to a human developer’s ten to twenty lines per day. This is not a complaint about AI — it is a description of a real change in how fast implementation moves. The consequence is that the ratio between “building” and “understanding what you just built” has shifted dramatically.
Traditional development was slow enough that feedback loops were built in. You spent a week on a feature, noticed problems during that week, adjusted, and arrived at something you understood well by the time it shipped. The iteration cycle was calibrated to human decision speed. When AI compresses a week of implementation into a day, that calibration breaks. You can rebuild before you have understood what you built. You can be three versions ahead of your own understanding of whether the first version was right.
Harper Reed calls this conviction collapse. The problem is not that you are moving too fast. The problem is that speed is only valuable if your mechanism for learning from what you built is intact. If you are rebuilding before the previous version has taught you anything, you are accumulating complexity faster than you are accumulating understanding. The gap between what you have built and what you can explain compounds over time.
For builders without technical backgrounds, this is especially acute. Experienced engineers have accumulated intuitions about failure modes — patterns that tend to break, architectural choices that create debt, security assumptions that do not hold. Those intuitions are built through slow, painful feedback cycles over years. AI tools hand non-technical builders the implementation speed without the accompanying intuitions. The risk is not just building wrong things faster. It is not having the equipment to notice.
What to do about it
The answer is not to slow down. It is to be deliberate about the decisions that require your judgment, and to invest in learning mechanisms that can keep pace with implementation speed.
Before starting a build, name the risk categories explicitly. Ask whether you have considered access control, data handling, scalability, vendor dependencies, and maintainability — not because you need to solve all of them immediately, but because knowing the question exists means you can make a conscious choice about when to address it. Most builders who end up with security problems or architectural debt did not knowingly choose to incur it. They just did not know to ask.
Build shorter learning cycles into the process. At AI implementation speed, a week without structured feedback from real users or real usage data is enough time to accumulate significant architectural debt. Tight contact cadences with actual users, frequent audits of what the product is teaching you, and deliberate checkpoints where you ask “do we understand this well enough to keep building on it” are not overhead. They are the mechanism that keeps speed useful.
Be explicit about what you are delegating and why. The failure mode is not using AI — it is using AI without being conscious about which decisions you are handing over. A builder who says “I want the model to write this function because it is mechanical and I can verify the output” is in a different position from a builder who says “I want the model to design the authentication system because that seems complicated.” The first is delegation. The second is abdication.
Implementation is cheap now. The decisions that shape what you are building, who can use it safely, and whether it will hold up over time are still yours to make. That boundary is worth knowing.