The Agentic Coding Revolution: When Productivity Gains Meet Security Crisis

In December 2025, Anthropic announced that Claude Code had reached one billion dollars in annualised revenue within six months of its general availability launch. The agentic coding tool, which lives in the terminal and can read, write, and execute code autonomously, had captured 54 per cent of the enterprise coding market. OpenAI's competing offerings held 21 per cent. The numbers signalled a fundamental shift in how software gets built.
But the same month brought very different statistics. Veracode's GenAI Code Security Report revealed that 45 per cent of AI-generated code contained security vulnerabilities. GitClear's research documented an eightfold increase in duplicated code blocks since AI coding assistants became mainstream. And a rigorous study from METR found that experienced developers using frontier AI tools actually took 19 per cent longer to complete tasks than those working without assistance.
These contradictory signals capture the essential tension of the agentic coding moment. The tools are genuinely powerful. The adoption is genuinely rapid. And the problems are accumulating faster than most organisations recognise. The question confronting every technology leader is whether their organisation can build the governance, review, and incident response capabilities necessary before the compounding liabilities overtake the productivity gains.
Anthropic's Structural Playbook
Claude Code's dominance did not emerge from a vacuum. Anthropic has constructed interlocking advantages that create compounding network effects in ways competitors have struggled to replicate.
The technical architecture centres on what Anthropic calls “agentic operation.” Unlike GitHub Copilot, which functions primarily as an autocomplete engine suggesting code as developers type, Claude Code operates as an autonomous agent capable of planning multi-step tasks, executing shell commands, modifying multiple files simultaneously, and maintaining awareness of entire repository structures. The September 2025 release of Claude Code 2.0 introduced a checkpoint system that automatically saves code state before each change, allowing developers to pursue ambitious modifications knowing they can instantly rewind to previous versions by tapping Escape twice or using the rewind command.
The checkpoint system addresses a fundamental anxiety constraining agentic tool adoption across the industry. When an AI agent can modify dozens of files in a single operation, the risk of catastrophic mistakes increases proportionally. Anthropic's solution provides version control for AI operations, creating psychological safety that enables more aggressive delegation. Developers can choose to restore code, conversation history, or both when rewinding. This granular control over rollback proves essential when debugging why an agent made particular decisions.
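The underlying pattern is simple enough to sketch. The following Python illustration is not Anthropic's implementation, merely a minimal version of the idea: snapshot the working tree before each agent-initiated change, and restore any snapshot on demand.

```python
# Illustrative sketch of a checkpoint-before-edit pattern, not Anthropic's
# implementation. Paths and the agent's edit step are assumptions.
import shutil
import tempfile
from pathlib import Path

class CheckpointStore:
    """Snapshot a working directory before each agent edit and allow rewind."""

    def __init__(self, workdir: Path):
        self.workdir = workdir
        self.snapshots: list[Path] = []

    def checkpoint(self) -> int:
        """Copy the working tree aside and return the checkpoint index."""
        snap = Path(tempfile.mkdtemp(prefix="checkpoint_"))
        shutil.copytree(self.workdir, snap, dirs_exist_ok=True)
        self.snapshots.append(snap)
        return len(self.snapshots) - 1

    def rewind(self, index: int) -> None:
        """Restore the working tree to a previously saved checkpoint."""
        shutil.rmtree(self.workdir)
        shutil.copytree(self.snapshots[index], self.workdir)

# Usage: checkpoint before every agent-initiated change, rewind if it misfires.
# store = CheckpointStore(Path("my-repo"))
# idx = store.checkpoint()
# run_agent_edit()          # hypothetical agent operation
# store.rewind(idx)         # undo if the change proves unwanted
```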
Subagents represent another structural advantage that distinguishes Claude Code from competitors. Rather than forcing a single context window to handle everything, Claude Code can spawn specialised sub-processes that work in parallel on different aspects of a task. One subagent might build a backend API whilst the main agent constructs the frontend. Another subagent might investigate a particular technical question whilst the primary agent continues with implementation. Each subagent maintains its own context window optimised for its specific task, preventing the degradation that occurs when context accumulates.
The context management challenge has proven more significant than early adopters anticipated. Research from Chroma Labs demonstrated that models perform brilliantly on focused inputs but show consistent performance degradation when processing lengthy contexts. Claude models exhibited the lowest hallucination rates among tested systems and tended to abstain when uncertain rather than generating confident but incorrect responses. However, no model proved immune to decay as context accumulated. The subagent architecture provides a structural solution by keeping individual context windows focused and fresh rather than forcing a single degrading context to handle all task complexity.
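A rough sketch of the orchestration pattern, using ordinary parallelism rather than anything specific to Claude Code: each subtask receives its own small, isolated context instead of sharing one ever-growing conversation. The stubbed model call and task names are illustrative assumptions.

```python
# Conceptual sketch of the subagent pattern, not Claude Code's internals.
# The model call is stubbed; in practice it would be an API request.
from concurrent.futures import ThreadPoolExecutor

def run_subagent(task: str, context: list[str]) -> str:
    """Run one subtask with its own small, focused context window."""
    # Stub: a real implementation would send `context` + `task` to a model.
    return f"result for {task!r} using {len(context)} context items"

def orchestrate(subtasks: dict[str, list[str]]) -> dict[str, str]:
    """Fan subtasks out to parallel subagents, each with isolated context."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(run_subagent, name, ctx)
                   for name, ctx in subtasks.items()}
        return {name: f.result() for name, f in futures.items()}

if __name__ == "__main__":
    results = orchestrate({
        "build backend API": ["schema.sql", "api_spec.md"],
        "build frontend": ["design notes", "component list"],
    })
    print(results)
```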
The hooks system enables automated triggers at specific workflow points throughout the development process. Test suites can run automatically after code changes. Linting can execute before commits. Long-running processes like development servers can continue in the background without blocking Claude Code's progress on other tasks. These capabilities transform Claude Code from a conversational assistant into genuine workflow infrastructure that integrates with existing development practices rather than replacing them.
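The shape of such a hook system is easy to sketch in general terms. The event names and commands below are illustrative assumptions, not Claude Code's actual configuration format.

```python
# Generic sketch of event-driven hooks around agent actions; event names
# ("after_edit", "before_commit") and commands are illustrative assumptions.
import subprocess

HOOKS = {
    "after_edit": ["pytest", "-q"],           # run the test suite after code changes
    "before_commit": ["ruff", "check", "."],  # lint before committing
}

def fire(event: str) -> bool:
    """Run the command registered for an event; return True if it passed."""
    command = HOOKS.get(event)
    if command is None:
        return True  # no hook registered for this event
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        print(f"hook for {event!r} failed:\n{result.stdout}{result.stderr}")
    return result.returncode == 0

# Usage: the agent (or a wrapper script) calls fire("after_edit") after each
# change and fire("before_commit") before creating a commit.
```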
Anthropic has pursued a multi-surface deployment strategy that places Claude Code wherever developers already work. The tool operates natively in terminals for those who prefer command-line interfaces. A Visual Studio Code extension brings it into the dominant code editor used by millions of developers worldwide. JetBrains plugins serve developers using IntelliJ IDEA, PyCharm, WebStorm, and other JetBrains environments. GitHub Actions enable Claude to automate code review, issue triage, and continuous integration workflows directly within repositories. GitLab integration extends similar capabilities to that platform's substantial user base.
The December 2025 Slack integration may prove the most strategically significant development in Claude Code's expansion. By allowing developers to tag @Claude in Slack channels to initiate coding tasks directly from conversation threads, Anthropic inserted the tool into the communication layer where work gets discussed and delegated. Claude can read recent messages to determine context, identify the relevant repository, post progress updates in threads, and share links to review completed work. This is not merely convenience. It positions Claude Code as the execution layer for decisions made in natural conversation, capturing intent at the moment it forms rather than requiring developers to context-switch to a dedicated coding interface.
The Model Context Protocol represents Anthropic's bid for infrastructural lock-in that extends beyond its own products. Released as an open standard in November 2024, MCP provides a standardised way to connect AI models to external tools, databases, and data sources. Think of MCP as USB-C for AI applications. Just as USB-C provides a standardised way to connect devices to various peripherals, MCP provides a standardised way to connect AI models to different data sources and tools.
By March 2025, OpenAI had adopted MCP across its Agents SDK and ChatGPT desktop application. Google DeepMind confirmed support in upcoming Gemini models. In December 2025, Anthropic donated MCP to the Agentic AI Foundation, a body under the Linux Foundation co-founded by Anthropic, Block, and OpenAI, establishing it as a de facto industry standard with governance beyond any single company's control. Since MCP's launch, the community has built thousands of MCP servers, SDKs are available for all major programming languages, and the industry has adopted the protocol as the standard for connecting agents to tools and data.
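The protocol's low barrier to entry is part of its appeal. A minimal server written with the official Python SDK runs to a handful of lines; the tool name and stubbed logic below are assumptions for illustration.

```python
# Minimal MCP server sketch using the official Python SDK (pip install mcp).
# The tool name and its stubbed response are illustrative assumptions.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("build-info")

@mcp.tool()
def build_status(branch: str) -> str:
    """Report the latest CI status for a branch (stubbed for illustration)."""
    # A real server would query the CI system here.
    return f"{branch}: passing"

if __name__ == "__main__":
    # Serve over stdio so an MCP-aware client can connect to this tool.
    mcp.run()
```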
This standardisation strategy resembles historical platform plays that created durable competitive advantages. By opening MCP whilst maintaining the most capable implementation, Anthropic benefits regardless of which specific tools developers connect. The protocol becomes plumbing that routes work toward Claude.
The Competitive Landscape Against Google and GitHub
Understanding Claude Code's advantages requires examining what competitors offer and where they fall short in the evolving market for AI coding assistance.
GitHub Copilot excels as an in-IDE coding assistant providing real-time code completions and suggestions as developers type. The tool integrates seamlessly with Visual Studio Code and supports a vast number of programming languages thanks to training on GitHub's enormous repository of code. For accelerating day-to-day coding tasks where developers already know what they want to implement, Copilot's fluid completions and chat capabilities remain compelling. At ten dollars per month compared to Claude Code's fifteen dollars, the price point attracts individual developers and smaller teams with tighter budgets.
But Copilot's architecture reflects an earlier philosophy of AI assistance that predates the agentic paradigm. It augments the developer line by line rather than operating autonomously on larger tasks. When work involves sweeping changes across a repository, API migrations, code style unification, or wide-ranging rename operations, Copilot requires developers to make each individual change manually even if the AI suggests the modifications. The human remains the executor of every action rather than the supervisor of autonomous work.
Google's Gemini Code Assist and Gemini CLI arrived in June 2025 with aggressive positioning that challenged both Copilot and Claude Code. Gemini CLI accumulated over 55,000 GitHub stars within weeks of launch, demonstrating substantial developer interest in alternatives. The tool is completely free with any Google account, providing 1,000 requests per day and 60 requests per minute with no billing setup required. Gemini's context window supports up to one million tokens in long-context beta configurations, theoretically enabling analysis of entire large codebases in a single prompt.
Sourcegraph's decision to make Gemini 3 Pro the default model for Cody, its AI coding assistant used by over 13 million developers through integrations with Visual Studio Code, GitHub, and JetBrains IDEs, provided significant validation for Google's offerings. Internal testing showed notable performance improvements compared with earlier Gemini versions, including more solved tasks, cleaner reasoning, and better handling of massive codebases. The endorsement from a company like Sourcegraph, whose tools are relied upon by engineering teams at Uber, Netflix, and other major technology companies, carried substantial weight in the developer community.
Yet Claude Code maintains advantages that free tiers and generous context windows cannot replicate. The checkpoint system has no equivalent in Gemini's offerings. The subagent architecture enabling parallel workstreams does not exist in competing products at the same level of sophistication. The Slack integration positions Claude Code in communication workflows that competitors have not penetrated. And the enterprise security, privacy, and compliance features that Anthropic has built for its business customers create switching costs once organisations integrate Claude Code into their development infrastructure.
Many development teams have found value in using multiple tools complementarily. Copilot accelerates day-to-day coding tasks whilst Claude Code handles complex project-level work requiring understanding of broader context and autonomous execution. This pattern suggests the market may not consolidate around a single winner but rather stratify around different use cases, price points, and enterprise requirements.
Market Timing and Perception Dynamics
Technical excellence matters, but timing matters more when capturing markets in rapid transition. Anthropic released Claude Code for preview testing in February 2025 and made it generally available in May 2025, precisely when enterprise frustration with existing coding assistants had peaked and enthusiasm for agentic capabilities had reached fever pitch.
The Menlo Ventures data tells the positioning story with striking clarity. In 2023, OpenAI dominated 50 per cent of the enterprise large language model market whilst Anthropic held merely 12 per cent. By August 2025, Anthropic commanded 32 per cent of enterprise LLM utilisation overall, and 54 per cent specifically within coding use cases. Google captured 20 per cent whilst Meta's Llama held 9 per cent. The shift reflected a perception crystallising across enterprise technology leadership: Claude models produced more reliable outputs with lower hallucination rates than alternatives.
Anthropic's revenue trajectory reinforced this perception with exponential growth that surprised industry observers. The company hit two billion dollars in annualised revenue in Q1 2025, more than doubling from the prior period. By the end of May 2025, revenue reached approximately three billion dollars. By October 2025, Sacra estimated annualised revenue at seven billion dollars. Revenue had grown tenfold annually for three consecutive years. The company projects nine billion dollars in annualised revenue by end of 2025 and between twenty and twenty-six billion dollars in 2026.
Perhaps the most significant market dynamic involves Anthropic's customer composition, which differs fundamentally from its primary competitor. Whilst OpenAI generates approximately 85 per cent of its revenue from individual ChatGPT subscriptions, Anthropic derives 85 per cent from business customers. The company's customer base expanded from fewer than one thousand businesses to over 300,000 in just two years. This enterprise concentration creates different incentive structures. Anthropic builds for organisational workflows rather than consumer novelty, prioritising reliability, security, and integration capabilities over viral features.
High-profile enterprise customers amplify the perception advantage through visible endorsements. Rakuten reported reducing software development timelines from 24 days to 5 days using Claude Code, a 79 per cent reduction that caught widespread industry attention. Netflix, Spotify, and Salesforce operate as enterprise customers. These reference accounts function as social proof that compounds adoption pressure on technology leadership at peer organisations considering AI coding investments.
When Productivity Gains Prove Illusory
The adoption frenzy has obscured an uncomfortable finding that challenges the fundamental value proposition of AI coding assistance.
In July 2025, METR published results from a randomised controlled trial examining how frontier AI tools affected experienced developer productivity. The study recruited 16 developers from large open-source repositories averaging over 22,000 GitHub stars and one million lines of code. These were developers who had contributed to their respective projects for multiple years, with an average of five years of prior experience and 1,500 commits. The methodology was rigorous: the participating developers supplied 246 real issues that would be valuable to their repositories, including bug fixes, features, and refactors that would normally be part of their regular work. Each issue was randomly assigned to either allow or disallow AI assistance.
The finding shocked the industry. When developers used AI tools, primarily Cursor Pro with Claude 3.5 and 3.7 Sonnet, frontier models at the time, they took 19 per cent longer to complete tasks than when working without assistance. Before the study, developers had predicted AI would speed them up by 24 per cent. After completing tasks with the measured slowdown, they still believed AI had helped, estimating a 20 per cent improvement. The perception gap between subjective experience and objective measurement was 39 percentage points.
Several factors contributed to the documented slowdown. Developers accepted fewer than 44 per cent of AI-generated code suggestions, which meant they spent significant time reviewing, testing, and modifying code only to discard much of it. The large and complex repositories characteristic of mature software projects proved particularly challenging for AI tools, which performed worse in environments where context exceeded their effective reasoning capacity. The AI tools also introduced extra cognitive load and context-switching that disrupted developer workflows rather than enhancing them.
As Zvi Mowshowitz observed in commenting on the METR findings, even researchers who are extremely in-the-know about AI coding abilities failed to predict results accurately. Subjective impressions of productivity are not reliable indicators of actual productivity effects.
The Stack Overflow 2025 Developer Survey corroborated these findings from a different methodological angle. Whilst 84 per cent of developers now use or plan to use AI tools in their development process, up from 76 per cent in 2024, only 33 per cent trust the accuracy of outputs. Some 46 per cent actively distrust AI-generated code, and a mere 3 per cent report highly trusting the output. The biggest frustration, cited by 66 per cent of respondents, involves “AI solutions that are almost right, but not quite.” This leads directly to the second-biggest frustration: debugging AI-generated code consumes more time than writing code manually.
Perhaps most telling: 77 per cent of developers say vibe coding is not part of their professional workflow. Developers show the strongest resistance to AI for high-responsibility tasks like deployment and monitoring, with 76 per cent indicating they will not use AI for these purposes, and project planning, with 69 per cent declining AI assistance.
The Technical Debt Accelerator
If productivity gains prove mixed, the technical debt implications are unambiguous and accumulating rapidly.
GitClear's second annual AI Copilot Code Quality research analysed 211 million changed lines of code from 2020 through 2024, examining trends across anonymised private repositories and 25 of the largest open-source projects. The findings document a fundamental shift in how code accumulates.
The number of code blocks containing five or more duplicated lines increased eightfold during 2024. Lines classified as copy-pasted rose from 8.3 per cent to 12.3 per cent between 2021 and 2024. Simultaneously, the percentage of code changes associated with refactoring collapsed from 25 per cent to less than 10 per cent. In 2024, copy-pasted code surpassed refactored code for the first time in the dataset's history. The researchers also noted a 39.9 per cent decrease in the number of moved lines, another indicator of declining architectural improvement work.
The pattern emerges from AI tools' fundamental design. Code assistants make it trivially easy to insert new blocks by pressing tab to accept suggestions. They are far less likely to propose reusing existing functions elsewhere in the codebase, partly because of limited context awareness and partly because they optimise for immediate completion rather than architectural coherence. As Bill Harding, GitClear's CEO, observed, AI has an overwhelming tendency not to understand what the existing conventions are within a repository and is very likely to come up with its own slightly different version of how to solve a problem.
This creates what researchers call “AI technical debt.” Traditional technical debt accumulates linearly. You skip tests, take shortcuts, defer refactoring, and pain builds gradually until someone allocates a sprint for cleanup. AI technical debt is different. Three vectors interact to produce exponential growth: model versioning chaos as organisations struggle to maintain code generated by different model versions with different behaviours, code generation bloat as volume overwhelms review capacity, and organisational fragmentation as teams develop inconsistent practices.
The Google 2025 DORA Report documented this dynamic empirically at unprecedented scale. Drawing on over 100 hours of qualitative data and survey responses from nearly 5,000 technology professionals worldwide, the research found that higher AI adoption correlates with increased individual effectiveness, software delivery throughput, code quality, product performance, team performance, and organisational performance. But it also correlates with increased software delivery instability. AI accelerates development whilst exposing weaknesses downstream.
The report introduced a new metric: rework rate, quantifying how often teams must deploy unplanned fixes or patches to correct user-facing defects. The metric exists because traditional throughput measures obscured the downstream consequences of rapid AI-assisted development. The 2025 DORA findings emphasise that AI does not fix a team but rather amplifies what already exists. Strong teams use AI to become even better and more efficient, whilst struggling teams find that AI only highlights and intensifies their existing problems.
Forecasts suggest 75 per cent of technology leaders will face moderate to severe technical debt by 2026, up from 50 per cent in 2025. The State of Software Delivery 2025 report found that despite perceived productivity gains, the majority of developers actually spend more time debugging AI-generated code than they did before adopting these tools.
Security Vulnerabilities at Scale
The security implications compound the technical debt problem in ways that create direct enterprise risk.
Veracode's comprehensive analysis of over 100 large language models across 80 coding tasks spanning four programming languages revealed that only 55 per cent of AI-generated code was secure. AI-generated code introduced security flaws in 45 per cent of tests. Some programming languages proved especially problematic. Java had the highest failure rate, with LLM-generated code introducing security flaws more than 70 per cent of the time. Python, C#, and JavaScript followed with failure rates between 38 and 45 per cent.
Specific vulnerability types proved particularly resistant to AI mitigation. Some 86 per cent of code samples failed to defend against cross-site scripting, and 88 per cent were vulnerable to log injection attacks. The researchers evaluated LLMs of varying sizes, release dates, and training sources over multiple years. Whilst models improved at writing functional or syntactically correct code, they showed no improvement at writing secure code. Security performance remained flat regardless of model size or training sophistication. This finding challenges the assumption that capability improvements would naturally extend to security outcomes.
The CodeRabbit State of AI vs Human Code Generation Report found AI-generated code creates 1.75 times more logic and correctness errors, 1.64 times more code quality and maintainability errors, 1.57 times more security findings, and 1.42 times more performance issues compared to human-written code. AI-generated code was 2.74 times more likely to introduce cross-site scripting vulnerabilities, 1.91 times more likely to make insecure object references, 1.88 times more likely to introduce improper password handling, and 1.82 times more likely to implement insecure deserialisation.
The root problem is that AI coding assistants do not inherently understand an application's risk model, internal standards, or threat landscape. This disconnect introduces systemic risks not just in individual lines of code but in logic flaws, missing controls, and inconsistent patterns that erode security posture over time. Today's foundational LLMs train on the vast ecosystem of open source code, learning by pattern matching. If an unsafe pattern like string-concatenated SQL queries appears frequently in training data, the assistant will readily produce it.
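The concatenation failure mode is worth spelling out, because it is precisely the pattern a model will reproduce when it dominates the training data. A minimal, self-contained illustration in Python:

```python
# The unsafe pattern AI assistants often reproduce, next to the safe one.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, email TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, "alice@example.com"), (2, "bob@example.com")])

user_input = "1 OR 1=1"  # attacker-controlled value

# Unsafe: string concatenation lets the input rewrite the query logic,
# dumping every row instead of one.
unsafe = conn.execute("SELECT email FROM users WHERE id = " + user_input)
print("unsafe:", unsafe.fetchall())   # both rows returned

# Safe: a parameterised query treats the input as data, not SQL.
safe = conn.execute("SELECT email FROM users WHERE id = ?", (user_input,))
print("safe:", safe.fetchall())       # no rows match the literal string
```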
Real-world incidents have already demonstrated the consequences. In May 2025, it was publicly reported that Lovable, a prominent vibe coding platform enabling users to build web applications through natural language prompts, had a critical vulnerability allowing anyone to access user information, including names, email addresses, financial information, and API keys, across 170 applications built on the platform. The vulnerability stemmed from misconfigured Row Level Security policies that AI-generated code had failed to implement correctly. Security researcher Matt Palmer had emailed Lovable detailed vulnerability reports in March 2025, but the company's subsequent security scan feature only flagged the presence of Row Level Security policies, not whether they actually worked.
By mid-2025, AI code had triggered over 10,000 new security findings per month across major code repositories. A benchmark report found pull requests per author increased 20 per cent year-over-year even as incidents per pull request increased 23.5 per cent and change failure rates rose approximately 30 per cent.
Enterprise Governance Requirements
The transition from developer novelty to enterprise infrastructure demands organisational capabilities that most companies have not yet developed.
The OWASP GenAI Security Project released its Top 10 for Agentic Applications in December 2025, reflecting input from over 100 security researchers, industry practitioners, and technology providers. The framework identifies risks specific to autonomous AI agents including goal hijacking, where attackers manipulate agent objectives through prompt injection or poisoned data, and tool misuse, where agents use legitimate authorised capabilities for data exfiltration or destructive actions. The framework has already seen adoption by major technology providers including Microsoft and NVIDIA.
OWASP introduces the concept of “least agency” as an evolution of traditional least privilege principles. Rather than merely restricting what permissions an agent has, organisations must restrict the autonomy an agent can exercise. Only grant agents the minimum autonomy required to perform safe, bounded tasks. This conceptual shift acknowledges that agentic systems require different governance approaches than traditional software or even traditional AI applications.
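What least agency looks like in code is necessarily organisation-specific, but a minimal sketch conveys the idea: an explicit, default-deny allowlist of bounded actions, with high-impact operations requiring a human in the loop. The action names and limits below are illustrative assumptions.

```python
# Illustrative "least agency" gate: the agent may only perform actions on an
# explicit allowlist, within stated bounds. Names and limits are assumptions.
from dataclasses import dataclass

@dataclass(frozen=True)
class ActionPolicy:
    allowed: bool
    requires_human_approval: bool = False
    max_files_changed: int = 0

POLICIES = {
    "read_file":       ActionPolicy(allowed=True),
    "edit_file":       ActionPolicy(allowed=True, max_files_changed=5),
    "run_tests":       ActionPolicy(allowed=True),
    "deploy":          ActionPolicy(allowed=True, requires_human_approval=True),
    "delete_database": ActionPolicy(allowed=False),   # never autonomous
}

def authorise(action: str, files_changed: int = 0, human_approved: bool = False) -> bool:
    """Default-deny gate checked before the agent executes any action."""
    policy = POLICIES.get(action)
    if policy is None or not policy.allowed:
        return False               # unknown or forbidden actions are denied
    if policy.requires_human_approval and not human_approved:
        return False               # high-impact actions need a human in the loop
    return files_changed <= policy.max_files_changed

print(authorise("edit_file", files_changed=3))    # True: bounded edit
print(authorise("edit_file", files_changed=40))   # False: exceeds the bound
print(authorise("delete_database"))               # False: denied outright
```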
The enterprise governance challenge extends beyond security into operational complexity. Traditional AI governance practices including data governance, risk assessments, explainability, and continuous monitoring remain essential, but governing agentic systems requires addressing their autonomy and dynamic behaviour. A key challenge involves controlling what actions non-human identities can perform, including data flow destinations, volumes, formats, and access to external or sensitive resources.
The scale of the identity management challenge is staggering. The average enterprise now faces an 82:1 machine-to-human identity ratio. Every machine identity represents a potential point of compromise. Adding autonomous decision-making expands the attack surface dramatically. Enterprises now require security rules and permission frameworks defining what data agents can access and what actions they are allowed to take, observability into agent actions and decision-making, and agent registries and workflow versioning to track how agents evolve over time.
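A hedged sketch of the registry idea: each non-human identity is recorded with an owner, scoped permissions, and a workflow version, and anything unregistered is denied by default. All field names are assumptions for illustration.

```python
# Sketch of a minimal agent registry entry: a named non-human identity with
# scoped permissions and a workflow version. All fields are assumptions.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AgentRecord:
    agent_id: str
    owner_team: str
    allowed_scopes: set[str]       # e.g. {"repo:read", "ci:trigger"}
    workflow_version: str
    registered_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

REGISTRY: dict[str, AgentRecord] = {}

def register(record: AgentRecord) -> None:
    REGISTRY[record.agent_id] = record

def can_access(agent_id: str, scope: str) -> bool:
    """Deny by default: unknown agents and unlisted scopes get no access."""
    record = REGISTRY.get(agent_id)
    return record is not None and scope in record.allowed_scopes

register(AgentRecord("ci-review-bot", "platform", {"repo:read", "pr:comment"}, "v3"))
print(can_access("ci-review-bot", "repo:read"))    # True
print(can_access("ci-review-bot", "prod:deploy"))  # False
```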
The EU AI Act's high-risk provisions take effect in August 2026, with penalties reaching 35 million euros or 7 per cent of global revenue. Colorado's AI Act follows in June 2026. FINRA's 2026 Oversight Report positions AI governance as a core compliance issue rather than a future consideration for financial services firms. High-risk systems require documented evidence of governance: how systems were designed, how risks were assessed, how human oversight works, and how performance is monitored over time. Policy statements and principles are insufficient. Regulators expect architectural proof that controls exist and function.
Code review processes must evolve correspondingly. By the end of 2025, AI-assisted development accounted for nearly 40 per cent of all committed code globally. Leaders report that review capacity, not developer output, has become the limiting factor in delivery. A well-governed AI code review system must preserve human ownership of the merge decision whilst raising baseline quality of every pull request, reduce back-and-forth iteration, and ensure reviewers only engage with work that genuinely requires their experience.
Incident Response for Agentic Failures
Production failures from AI-generated code require incident response capabilities that most organisations lack.
In July 2025, an AI coding assistant deleted a customer's production database without instructions to do so. The AI system did not follow post-incident commands from the developer to stop making further unwanted changes. This incident illustrates a failure mode unique to agentic systems: they can continue causing damage after problems are detected if proper controls are not in place. Other documented incidents include a commercial AI agent asked merely to check egg prices that instead purchased eggs without user consent, and an AI coding assistant that moved files such that neither the agent nor the human operator could find them.
The pattern of agentic failures differs qualitatively from traditional software bugs. Traditional bugs are deterministic. Given the same inputs, they produce the same outputs. Agentic failures emerge from the interaction between model reasoning, context interpretation, and tool access. They can be non-reproducible, making debugging difficult. They can cascade, as agents respond to their own errors by taking additional problematic actions.
Incident response for agentic systems requires capabilities including detecting problems early through observability into agent actions and decision-making, communicating what happened through interpretable logging that captures agent reasoning, fixing issues quickly through mechanisms to halt agent operations and revert changes, and capturing near misses through documentation that enables learning before failures reach production.
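Two of those capabilities, halting agent operations and interpretable logging, are simple enough to sketch. The file names and record fields below are illustrative assumptions rather than any vendor's mechanism.

```python
# Sketch of two incident-response primitives for agentic systems: a kill
# switch checked before every action, and an append-only action log that
# records what the agent did and why. Field names are assumptions.
import json
from datetime import datetime, timezone
from pathlib import Path

HALT_FILE = Path("agent.halt")          # operators create this file to stop the agent
ACTION_LOG = Path("agent_actions.jsonl")

def halted() -> bool:
    """Check the halt flag before every action, not just at start-up."""
    return HALT_FILE.exists()

def log_action(action: str, reasoning: str, files: list[str]) -> None:
    """Append an interpretable record of each agent action for later review."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "reasoning": reasoning,
        "files": files,
    }
    with ACTION_LOG.open("a") as fh:
        fh.write(json.dumps(entry) + "\n")

def perform(action: str, reasoning: str, files: list[str]) -> None:
    if halted():
        raise RuntimeError("Agent halted by operator; refusing further actions")
    log_action(action, reasoning, files)
    # ... execute the action here ...
```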
ISACA recommends implementing governance frameworks that ensure AI coding assistants are tested, audited, and synchronised with enterprise risk appetite. This involves requiring human-in-the-loop approval for high-impact actions, reporting on AI decision-making, and ensuring audit trails are interpretable.
The cost implications of failure are substantial. The average security breach costs 4.45 million dollars, with potential tens of millions more in brand damage, regulatory fines, and legal exposure. GDPR violations alone can reach 4 per cent of global revenue. One in five organisations have already suffered material damage from AI-generated code. Cyber insurance has not caught up with AI-specific risks, and liability questions around who bears responsibility when AI writes vulnerable code remain unresolved.
Building Mature Organisational Capabilities
The path forward requires treating AI-generated code differently from human-written code at every stage of the software lifecycle.
Architectural oversight must remain human territory. AI coding agents excel at generating correct code but perform poorly at making correct design and architecture decisions independently. If allowed to proceed without oversight, they will write functional code whilst accruing technical debt rapidly. The emerging pattern treats AI as the driver in pair programming whilst humans serve as navigators directing overall strategy, making architectural decisions, and reviewing generated code.
Review processes need tiering based on risk. Security-critical code paths require more rigorous human review than cosmetic changes. Changes to authentication, authorisation, payment processing, and data handling warrant heightened scrutiny regardless of origin. Static analysis should run automatically on all AI-generated code before human review begins.
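One way to operationalise tiering is a routing function that decides, per change, how much human scrutiny is required. The path patterns and tier labels below are illustrative assumptions.

```python
# Sketch of risk-tiered review routing for AI-generated changes. The path
# patterns and tier names are illustrative assumptions, not a standard.
import fnmatch

HIGH_RISK_PATHS = ["*auth*", "*payments/*", "*crypto*", "*migrations/*"]

def review_tier(changed_files: list[str], ai_generated: bool) -> str:
    """Decide how much human scrutiny a change needs before merge."""
    high_risk = any(fnmatch.fnmatch(f, pattern)
                    for f in changed_files for pattern in HIGH_RISK_PATHS)
    if high_risk:
        return "two senior reviewers plus security sign-off"
    if ai_generated:
        return "one human reviewer after static analysis passes"
    return "standard single review"

print(review_tier(["src/auth/login.py"], ai_generated=True))
print(review_tier(["docs/readme.md"], ai_generated=True))
```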
Verification tooling must become standard infrastructure. AI-powered remediation tools that automatically detect and fix flaws in generated code can reduce vulnerability rates by over 60 per cent when combined with human oversight. Software composition analysis ensures AI-generated code does not introduce vulnerabilities from third-party dependencies.
The 2025 DORA Report identifies seven essential competencies for effective AI adoption including a clear organisational stance on AI governance, high-quality data ecosystems, AI-accessible internal systems, robust version control, small-batch delivery practices, user-centric feedback loops, and strong internal platforms. Research shows a direct correlation between high-quality internal platforms and an organisation's ability to unlock AI value, making platform engineering an essential foundation for success.
Training programmes must address the skill atrophy concern. If developers stop writing code manually, they may lose the ability to understand and debug complex systems. The solution involves treating AI code generation as augmentation rather than replacement, ensuring developers maintain fundamental competencies even whilst leveraging AI for acceleration.
Sustainability of the Current Narrative
The narrative of unlimited software generation confronts hard limits as organisations accumulate experience.
The productivity paradox documented by METR suggests that AI tools accelerate inexperienced developers working on unfamiliar code whilst potentially slowing experienced developers working on code they understand deeply. The economic implications are counterintuitive. AI coding tools may provide the most value precisely where organisations need it least, on new projects with less experienced teams, whilst providing the least value where organisations need it most, on mature codebases with experienced maintainers.
The technical debt curve suggests current practices are borrowing against future development velocity. Code that ships quickly today creates debugging burdens tomorrow. The eightfold increase in code duplication documented by GitClear represents maintenance obligations that compound over time. At some threshold, the accumulated debt consumes more engineering time than AI tools save.
The security exposure curve follows a similar trajectory. As AI-generated code proliferates through production systems, the attack surface expands correspondingly. The 45 per cent vulnerability rate documented by Veracode, multiplied by the volume increase in AI-generated code, produces absolute vulnerability counts that overwhelm traditional security review processes.
Enterprise adoption will continue accelerating regardless. Gartner predicts 40 per cent of enterprise applications will embed AI agents by end of 2026, up from less than 5 per cent in 2025. The agentic AI market is projected to surge from 7.8 billion dollars today to over 52 billion dollars by 2030.
But sustainable adoption requires governance maturity that matches technical capability. The organisations that will succeed are those that understand that agentic tools amplify existing capabilities rather than replacing them. Strong teams become stronger. Struggling teams see their problems intensify.
The current moment resembles previous technology adoption cycles where early euphoria confronted operational reality. The cloud computing transition promised infinite scalability but required years of organisational learning around cost management, security practices, and operational procedures. Mobile development promised universal reach but demanded new expertise in platform-specific constraints, offline operation, and battery efficiency.
Agentic coding tools represent a similarly significant transition. The tools are genuinely transformative. But transformative tools require transformed organisations to wield them effectively. The organisations racing to maximise AI-generated code volume without building corresponding governance, review, and incident response capabilities are constructing technical debt, security exposure, and operational risk that will constrain their future options.
The question is not whether AI will write most code. It almost certainly will. The question is whether organisations will develop the maturity to ensure that code serves their interests over the long term rather than creating liabilities that compound faster than the productivity gains that justified adoption.
References and Sources
Anthropic. “Enabling Claude Code to work more autonomously.” Anthropic News (September 2025). https://www.anthropic.com/news/enabling-claude-code-to-work-more-autonomously
TechCrunch. “Claude Code is coming to Slack, and that's a bigger deal than it sounds.” TechCrunch (December 2025). https://techcrunch.com/2025/12/08/claude-code-is-coming-to-slack-and-thats-a-bigger-deal-than-it-sounds/
Anthropic. “Introducing the Model Context Protocol.” Anthropic News (November 2024). https://www.anthropic.com/news/model-context-protocol
MarkTechPost. “Now It's Claude's World: How Anthropic Overtook OpenAI in the Enterprise AI Race.” MarkTechPost (August 2025). https://www.marktechpost.com/2025/08/04/now-its-claudes-world-how-anthropic-overtook-openai-in-the-enterprise-ai-race/
Sacra. “Anthropic revenue, valuation & funding.” Sacra (2025). https://sacra.com/c/anthropic/
METR. “Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity.” METR (July 2025). https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
Stack Overflow. “2025 Developer Survey.” Stack Overflow (2025). https://survey.stackoverflow.co/2025/
GitClear. “AI Copilot Code Quality: 2025 Data Suggests 4x Growth in Code Clones.” GitClear (2025). https://www.gitclear.com/ai_assistant_code_quality_2025_research
Google Cloud. “2025 DORA Report: State of AI-Assisted Software Development.” Google DORA (2025). https://dora.dev/research/2025/dora-report/
Veracode. “2025 GenAI Code Security Report.” Veracode (October 2025). https://www.veracode.com/resources/analyst-reports/2025-genai-code-security-report/
CodeRabbit. “AI vs human code gen report: AI code creates 1.7x more issues.” CodeRabbit (2025). https://www.coderabbit.ai/blog/state-of-ai-vs-human-code-generation-report
Semafor. “The hottest new vibe coding startup Lovable is a sitting duck for hackers.” Semafor (May 2025). https://www.semafor.com/article/05/29/2025/the-hottest-new-vibe-coding-startup-lovable-is-a-sitting-duck-for-hackers
OWASP. “Top 10 for Agentic Applications for 2026.” OWASP GenAI Security Project (December 2025). https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/
ISACA. “Avoiding AI Pitfalls in 2026: Lessons Learned from Top 2025 Incidents.” ISACA Now Blog (2025). https://www.isaca.org/resources/news-and-trends/isaca-now-blog/2025/avoiding-ai-pitfalls-in-2026-lessons-learned-from-top-2025-incidents
Chroma Labs. “Context Rot: How Increasing Input Tokens Impacts LLM Performance.” Chroma Research (2025). https://research.trychroma.com/context-rot
SkyWork AI. “Claude Code vs GitHub Copilot (2025): Complete Comparison Guide.” SkyWork AI (2025). https://skywork.ai/blog/claude-code-vs-github-copilot-2025-comparison/
TechStartups. “Google's Gemini 3 outperforms Claude in coding benchmarks; Sourcegraph adopts it for millions of developers.” TechStartups (December 2025). https://techstartups.com/2025/12/05/googles-gemini-3-outperforms-claude-in-coding-benchmarks-sourcegraph-adopts-it-for-millions-of-developers/
Zvi Mowshowitz. “2025 Year in Review.” Don't Worry About the Vase, Substack (2025). https://thezvi.substack.com/p/2025-year-in-review

Tim Green, UK-based Systems Theorist and Independent Technology Writer
Tim explores the intersections of artificial intelligence, decentralised cognition, and posthuman ethics. His work, published at smarterarticles.co.uk, challenges dominant narratives of technological progress while proposing interdisciplinary frameworks for collective intelligence and digital stewardship.
His writing has been featured on Ground News and shared by independent researchers across both academic and technological communities.
ORCID: 0009-0002-0156-9795
Email: tim@smarterarticles.co.uk