The AI Value Paradox: When Breakthroughs Don't Translate to Progress

The numbers tell a revealing story about the current state of artificial intelligence. Academic researchers continue to generate the overwhelming majority of highly cited AI breakthroughs, with AlphaFold's protein structure predictions earning a Nobel Prize in 2024. Yet industry is simultaneously abandoning AI projects at rates far exceeding initial predictions. What Gartner forecast in mid-2024 has proven conservative: whilst the firm predicted that at least 30% of generative AI projects would be abandoned after proof of concept by the end of 2025, a stark MIT report from August 2025 revealed that approximately 95% of generative AI pilot programmes are falling short, delivering little to no measurable impact on profit and loss statements. Meanwhile, data from S&P Global shows that 42% of companies scrapped most of their AI initiatives in 2025, up dramatically from just 17% the previous year.
This disconnect reveals something more troubling than implementation challenges. It exposes a fundamental misalignment between how AI capabilities are being developed and how they're being deployed for genuine societal impact. The question isn't just why so many projects fail. It's whether the entire enterprise of AI development has been optimised for the wrong outcomes.
The Academic-Industrial Divide
The shift in AI research leadership over the past five years has been dramatic. In 2023, industry produced 51 notable machine learning models whilst academia contributed only 15, according to Stanford's AI Index Report. By 2024, nearly 90% of notable models originated from industry, up from 60% in 2023. A handful of large companies (Anthropic, Google, OpenAI, Meta, and Microsoft) have produced most of the world's foundation models over the last five years. The 2025 AI Index Report confirms this trend continues, with U.S.-based institutions producing 40 notable AI models in 2024, significantly surpassing China's 15 and Europe's combined total of three.
Yet this industrial dominance in model production hasn't translated into deployment success. According to BCG research, only 22% of companies have advanced beyond proof of concept to generate some value, and merely 4% are creating substantial value from AI. The gap between capability and application has never been wider.
Rita Sallam, Distinguished VP Analyst at Gartner, speaking at the Gartner Data & Analytics Summit in Sydney in mid-2024, noted the growing impatience amongst executives: “After last year's hype, executives are impatient to see returns on GenAI investments, yet organisations are struggling to prove and realise value. Unfortunately, there is no one size fits all with GenAI, and costs aren't as predictable as other technologies.”
The costs are indeed staggering. Current generative AI deployment costs range from $5 million to $20 million in upfront investments. Google's Gemini 1.0 Ultra training alone cost $192 million. These figures help explain why 70% of the 2,770 companies surveyed by Deloitte have moved only 30% or fewer of their generative AI experiments into production.
Meanwhile, academic research continues to generate breakthrough insights with profound societal implications. AlphaFold, developed at Google DeepMind, has now been used by more than two million researchers from 190 countries. The AlphaFold Protein Structure Database, which began with approximately 360,000 protein structure predictions at launch in July 2021, has grown to a staggering 200 million protein structures from over one million organisms. The database has been downloaded in its entirety over 23,000 times, and the foundational paper has accumulated over 29,000 citations. This is what genuine impact looks like: research that accelerates discovery across multiple domains, freely accessible, with measurable scientific value.
The Economics of Abandonment
The abandonment rate isn't simply about technical failure. It's a symptom of deeper structural issues in how industry frames AI problems. When companies invest millions in generative AI projects, they're typically seeking efficiency gains or productivity improvements. But as Gartner noted in 2024, translating productivity enhancement into direct financial benefit remains exceptionally difficult.
The data reveals a pattern. Over 80% of AI projects fail, according to RAND research, which is twice the failure rate of corporate IT projects that don't involve AI. Only 48% of AI projects make it into production, and the journey from prototype to production takes an average of eight months. These aren't just implementation challenges. They're indicators that the problems being selected for AI solutions may not be the right problems to solve.
The situation has deteriorated sharply over the past year. As mentioned, S&P Global data shows 42% of companies scrapped most of their AI initiatives in 2025, up dramatically from just 17% in 2024. According to IDC, 88% of AI proof-of-concepts fail to transition into production, creating a graveyard of abandoned pilots and wasted investment.
The ROI measurement problem compounds these failures. As of 2024, roughly 97% of enterprises still struggled to demonstrate business value from their early generative AI efforts. Nearly half of business leaders said that proving generative AI's business value was the single biggest hurdle to adoption. Traditional ROI models don't fit AI's complex, multi-faceted impacts. Companies that successfully navigate this terrain combine financial metrics with operational and strategic metrics, but such sophistication remains rare.
However, there are emerging positive signs. According to a Microsoft-sponsored IDC report released in January 2025, three in four enterprises now see positive returns on generative AI investments, with 72% of leaders tracking ROI metrics such as productivity, profitability and throughput. The same IDC research estimates that every dollar invested in generative AI returns an average of $3.70, with financial services seeing as much as 4.2 times ROI. Yet these successes remain concentrated amongst sophisticated early adopters.
Consider what success looks like when it does occur. According to Gartner's 2024 survey of 822 early adopters, those who successfully implemented generative AI reported an average 15.8% revenue increase, 15.2% cost savings, and 22.6% productivity improvement. The companies BCG identifies as “AI future-built” achieve five times the revenue increases and three times the cost reductions of other organisations. Yet these successes remain outliers.
The gap suggests that most companies are approaching AI with the wrong frame. They're asking: “How can we use AI to improve existing processes?” rather than “What problems does AI uniquely enable us to solve?” The former leads to efficiency plays that struggle to justify massive upfront costs. The latter leads to transformation but requires rethinking business models from first principles.
The Efficiency Paradigm Shift
Against this backdrop of project failures and unclear value, a notable trend has emerged and accelerated through 2025: the industry is pivoting toward smaller, specialised models optimised for efficiency. The numbers are remarkable. In 2022, Google's PaLM needed 540 billion parameters to reach 60% accuracy on the MMLU benchmark. By 2024, Microsoft's Phi-3-mini achieved the same threshold with just 3.8 billion parameters, a 142-fold reduction in model size at equivalent performance. The trend has continued into 2025: models with 7 billion to 14 billion parameters now reach 85% to 90% of the performance of much larger 70-billion-parameter models on general benchmarks.
The efficiency gains extend beyond parameter counts. Inference costs plummeted from $20 per million tokens in November 2022 to $0.07 by October 2024, an over 280-fold reduction in roughly two years. For an LLM of equivalent performance, costs are falling by a factor of ten each year. At the hardware level, costs have declined by 30% annually whilst energy efficiency has improved by 40% each year. Smaller, specialised AI models now outperform their massive counterparts on specific tasks whilst consuming 70 times less energy and costing 1,000 times less to deploy.
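The headline ratios in these paragraphs follow directly from the figures cited. The short Python sketch below simply recomputes them as a sanity check rather than new analysis; all inputs are the numbers reported above.

```python
# Back-of-the-envelope check of the efficiency figures quoted above.
# All inputs are the reported numbers, not new measurements.

palm_params = 540e9        # Google PaLM (2022), parameters
phi3_mini_params = 3.8e9   # Microsoft Phi-3-mini (2024), parameters
print(f"Parameter reduction: {palm_params / phi3_mini_params:.0f}x")      # ~142x

cost_nov_2022 = 20.00      # USD per million tokens, November 2022
cost_oct_2024 = 0.07       # USD per million tokens, October 2024
print(f"Inference cost reduction: {cost_nov_2022 / cost_oct_2024:.0f}x")  # ~286x, i.e. "over 280-fold"

# A 30% annual decline in hardware cost compounds quickly:
years = 5
remaining = (1 - 0.30) ** years
print(f"Hardware cost after {years} years at -30%/year: {remaining:.0%} of today's")  # ~17%
```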
This shift raises a critical question: Does the move toward smaller, specialised models represent a genuine shift toward solving real problems, or merely a more pragmatic repackaging of the same pressure to commodify intelligence?
The optimistic interpretation is that specialisation forces clearer problem definition. You can't build a specialised model without precisely understanding what task it needs to perform. This constraint might push companies toward better-defined problems with measurable outcomes. The efficiency gains make experimentation more affordable, potentially enabling exploration of problems that wouldn't justify the cost of large foundation models.
The pessimistic interpretation is more troubling. Smaller models might simply make it easier to commodify narrow AI capabilities whilst avoiding harder questions about societal value. If a model costs 1,000 times less to deploy, the financial threshold for justifying its use drops dramatically. This could accelerate deployment of AI systems that generate marginal efficiency gains without addressing fundamental problems or creating genuine value.
Meta's Llama 3.3, released in December 2024, was trained on approximately 15 trillion tokens, demonstrating that even efficient models require enormous resources. Yet the model's open availability has enabled thousands of researchers and developers to build applications that would be economically infeasible with proprietary models costing millions to access.
The key insight is that efficiency itself is neither good nor bad. What matters is how efficiency shapes problem selection. If lower costs enable researchers to tackle problems that large corporations find unprofitable (rare diseases, regional languages, environmental monitoring), then the efficiency paradigm serves societal benefit. If lower costs simply accelerate deployment of marginally useful applications that generate revenue without addressing real needs, then efficiency becomes another mechanism for value extraction.
The Healthcare Reality Check
Healthcare offers a revealing case study in the deployment gap, and 2025 has brought dramatic developments. Healthcare is now deploying AI at more than twice the rate (2.2 times) of the broader economy. Healthcare organisations have achieved 22% adoption of domain-specific AI tools, representing a 7 times increase over 2024 and 10 times over 2023. In just two years, healthcare went from 3% adoption to becoming a leader in AI implementation. Health systems lead with 27% adoption, followed by outpatient providers at 18% and payers at 14%.
Ambient clinical documentation tools have achieved near-universal adoption. In a survey of 43 U.S. health systems, ambient notes was the only use case for which 100% of respondents reported adoption activity, and 53% reported a high degree of success. Meanwhile, imaging and radiology AI, despite widespread deployment, shows a high-success rate of only 19%, and clinical risk stratification manages just 38%.
The contrast is instructive. Documentation tools solve a clearly defined problem: reducing the time clinicians spend on paperwork. Doctors are spending two hours doing digital paperwork for every one hour of direct patient care. Surgeons using large language models can write high-quality clinical notes in five seconds versus seven minutes manually, representing an 84-fold speed increase. The value is immediate, measurable, and directly tied to reducing physician burnout.
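The 84-fold figure is simple arithmetic on the reported times; a minimal sketch, using only the numbers quoted above:

```python
# Arithmetic behind the reported documentation speed-up (figures as quoted above).
manual_minutes = 7     # reported time to write a clinical note manually
llm_seconds = 5        # reported time with an LLM-based tool
print(f"Speed-up: {manual_minutes * 60 / llm_seconds:.0f}x")   # 84x

# Two hours of paperwork per hour of direct patient care implies
# paperwork consumes roughly two-thirds of that combined time.
print(f"Implied paperwork share: {2 / (2 + 1):.0%}")           # ~67%
```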
At UChicago Medicine, participating clinicians believed the introduction of ambient clinical documentation made them feel more valued, and 90% reported being able to give undivided attention to patients, up from 49% before the tool was introduced. Yet despite these successes, only 28% of physicians say they feel prepared to leverage AI's benefits, though 57% are already using AI tools for things like ambient listening, documentation, billing or diagnostics.
But these are efficiency plays, not transformative applications. The harder problems, where AI could genuinely advance medical outcomes, remain largely unsolved. Less than 1% of AI tools developed during COVID-19 were successfully deployed in clinical settings. The reason isn't lack of technical capability. It's that solving real clinical problems requires causal understanding, robust validation, regulatory approval, and integration into complex healthcare systems.
Consider the successes that do exist. New AI software trained on 800 brain scans and trialled on 2,000 patients proved twice as accurate as professionals at examining stroke patients. Machine learning models achieved prediction scores of 90.2% for diabetic nephropathy, 85.9% for neuropathy, and 88.9% for angiopathy. In 2024, AI tools accelerated Parkinson's drug discovery, with one compound progressing to pre-clinical trials in six months versus the traditional two to three years.
These represent genuine breakthroughs, yet they remain isolated successes rather than systemic transformation. The deployment gap persists because most healthcare AI targets the wrong problems or approaches the right problems without the rigorous validation and causal understanding required for clinical adoption. Immature AI tools remain a significant barrier to adoption, cited by 77% of respondents in recent surveys, followed by financial concerns (47%) and regulatory uncertainty (40%).
The Citation-Impact Gap
The academic research community operates under different incentives entirely. Citation counts, publication venues, and peer recognition drive researcher behaviour. This system has produced remarkable breakthroughs. AI adoption has surged across scientific disciplines, with over one million AI-assisted papers identified, representing 1.57% of all papers. Depending on the field, the share of AI papers increased by a factor of between 21 and 241 from 1980 to 2024. Between 2013 and 2023, the total number of AI publications in venues related to computer science and other scientific disciplines nearly tripled, increasing from approximately 102,000 to over 242,000.
Yet this productivity surge comes with hidden costs. A recent study examining 4,051 articles found that only 370 articles (9.1%) were explicitly identified as relevant to societal impact. The predominant “scholar-to-scholar” paradigm remains a significant barrier to translating research findings into practical applications and policies that address global challenges.
The problem isn't that academic researchers don't care about impact. It's that the incentive structures don't reward it. Faculty are incentivised to publish continuously rather than translate research into real-world solutions, with job security and funding depending primarily on publication metrics. This discourages taking risks and creates a disconnect between global impact and what academia values.
The translation challenge has multiple dimensions. To achieve societal impact, researchers must engage in boundary work by making connections to other fields and actors. To achieve academic impact, they must demarcate boundaries by accentuating divisions with other theories or fields of knowledge. These are fundamentally opposing activities. Achieving societal impact requires adapting to other cultures or fields to explain or promote knowledge. Achieving academic impact requires emphasising novelty and differences relative to other fields.
The communication gap further complicates matters. Reducing linguistic complexity without being accused of triviality is a core challenge for scholarly disciplines. Bridging the social gap between science and society means scholars must adapt their language, though at the risk of compromising their epistemic authority within their fields.
This creates a paradox. Academic research generates the breakthroughs that win Nobel Prizes and accumulate tens of thousands of citations. Industry possesses the resources and organisational capacity to deploy AI at scale. Yet the breakthroughs don't translate into deployment success, and the deployments don't address the problems that academic research identifies as societally important.
The gap is structural, not accidental. Academic researchers are evaluated on scholarly impact within their disciplines. Industry teams are evaluated on business value within fiscal quarters or product cycles. Neither evaluation framework prioritises solving problems of genuine societal importance that may take years to show returns and span multiple disciplines.
Some institutions are attempting to bridge this divide. The Translating Research into Action Center (TRAC), established by a $5.7 million grant from the National Science Foundation, aims to strengthen universities' capacity to promote research translation for societal and economic impact. Such initiatives remain exceptions, swimming against powerful institutional currents that continue to reward traditional metrics.
Causal Discovery and the Trust Deficit
The failure to bridge this gap has profound implications for AI trustworthiness. State-of-the-art AI models largely lack understanding of cause-effect relationships. Consequently, these models don't generalise to unseen data, often produce unfair results, and are difficult to interpret. Research describes causal machine learning as “key to ethical AI for healthcare, equivalent to a doctor's oath to 'first, do no harm.'”
The importance of causal understanding extends far beyond healthcare. When AI systems are deployed without causal models, they excel at finding correlations in training data but fail when conditions change. This brittleness makes them unsuitable for high-stakes decisions affecting human lives. Yet companies continue deploying such systems because the alternative (investing in more robust causal approaches) requires longer development timelines and multidisciplinary expertise.
Building trustworthy AI through causal discovery demands collaboration across statistics, epidemiology, econometrics, and computer science. It requires combining aspects from biomedicine, machine learning, and philosophy to understand how explanation and trustworthiness relate to causality and robustness. This is precisely the kind of interdisciplinary work that current incentive structures discourage.
The challenge is that “causal” does not equate to “trustworthy.” Trustworthy AI, particularly within healthcare and other high-stakes domains, necessitates coordinated efforts amongst developers, policymakers, and institutions to uphold ethical standards, transparency, and accountability. Ensuring that causal AI models are both fair and transparent requires careful consideration of ethical and interpretive challenges that cannot be addressed through technical solutions alone.
Despite promising applications of causality for individual requirements of trustworthy AI, there is a notable lack of efforts to integrate dimensions like fairness, privacy, and explainability into a cohesive and unified framework. Each dimension gets addressed separately by different research communities, making it nearly impossible to build systems that simultaneously satisfy multiple trustworthiness requirements.
The Governance Gap
The recognition that AI development needs ethical guardrails has spawned numerous frameworks and initiatives. UNESCO's Recommendation on the Ethics of Artificial Intelligence, adopted by all 193 member states in November 2021, represents the most comprehensive global standard available. The framework comprises 10 principles protecting and advancing human rights, human dignity, the environment, transparency, accountability, and legal adherence.
In 2024, UNESCO launched the Global AI Ethics and Governance Observatory at the 2nd Global Forum on the Ethics of Artificial Intelligence in Kranj, Slovenia. This collaborative effort between UNESCO, the Alan Turing Institute, and the International Telecommunication Union (ITU) represents a commitment to addressing the multifaceted challenges posed by rapid AI advancement. The observatory aims to foster knowledge, expert insights, and good practices in AI ethics and governance. Major technology companies including Lenovo and SAP signed agreements to build more ethical AI, with SAP updating its AI ethics policies specifically to align with the UNESCO framework.
The 3rd UNESCO Global Forum on the Ethics of Artificial Intelligence, held on 24-27 June 2025 in Bangkok, Thailand, set out to highlight achievements in AI ethics since the 2021 Recommendation and to underscore the need for continued progress through actionable initiatives.
Yet these high-level commitments often struggle to translate into changed practice at the level where AI problems are actually selected and framed. The gap between principle and practice remains substantial. What is generally unclear is how organisations that make use of AI understand and address ethical issues in practice. Whilst there's an abundance of conceptual work on AI ethics, empirical insights remain rare and often anecdotal.
Moreover, governance frameworks typically address how AI systems should be built and deployed, but rarely address which problems deserve AI solutions in the first place. The focus remains on responsible development and deployment of whatever projects organisations choose to pursue, rather than on whether those projects serve societal benefit. This is a fundamental blind spot in current AI governance approaches.
The Problem Selection Problem
This brings us to the fundamental question: If causal discovery and multidisciplinary approaches are crucial for trustworthy AI advancement, shouldn't the selection and framing of problems themselves (not just their solutions) be guided by ethical and societal criteria rather than corporate roadmaps?
The current system operates backwards. Companies identify business problems, then seek AI solutions. Researchers identify interesting technical challenges, then develop novel approaches. Neither starts with: “What problems most urgently need solving for societal benefit, and how might AI help?” This isn't because individuals lack good intentions. It's because the institutional structures, funding mechanisms, and evaluation frameworks aren't designed to support problem selection based on societal impact.
Consider the contrast between AlphaFold's development and typical corporate AI projects. AlphaFold addressed a problem (protein structure prediction) that the scientific community had identified as fundamentally important for decades. The solution required deep technical innovation, but the problem selection was driven by scientific and medical needs, not corporate strategy. The result: a freely accessible database of 200 million predicted protein structures, used by more than two million researchers and generating insights across multiple disciplines.
Now consider the projects being abandoned. Many target problems like “improve customer service response times” or “optimise ad targeting.” These are legitimate business concerns, but they're not societally important problems. When such projects fail, little of value is lost. The resources could have been directed toward problems where AI might generate transformative rather than incremental value.
The shift toward smaller, specialised models could enable a different approach to problem selection if accompanied by new institutional structures. Lower deployment costs make it economically feasible to work on problems that don't generate immediate revenue. Open-source models like Meta's Llama enable researchers and nonprofits to build applications serving public interest rather than shareholder value.
But these possibilities will only be realised if problem selection itself changes. That requires new evaluation frameworks that assess research and development projects based on societal benefit, not just citations or revenue. It requires funding mechanisms that support long-term work on complex problems that don't fit neatly into quarterly business plans or three-year grant cycles. It requires breaking down disciplinary silos and building genuinely interdisciplinary teams.
Toward Ethical Problem Framing
What would ethical problem selection look like in practice? Several principles emerge from the research on trustworthy AI and societal impact:
Start with societal challenges, not technical capabilities. Instead of asking “What can we do with large language models?” ask “What communication barriers prevent people from accessing essential services, and might language models help?” The problem defines the approach, not vice versa.
Evaluate problems based on impact potential, not revenue potential. A project addressing rare disease diagnosis might serve a small market but generate enormous value per person affected. Current evaluation frameworks undervalue such opportunities because they optimise for scale and revenue rather than human flourishing.
Require multidisciplinary collaboration from the start. Technical AI researchers, domain experts, ethicists, and affected communities should jointly frame problems. This prevents situations where technically sophisticated solutions address the wrong problems or create unintended harms.
Build in causal understanding and robustness requirements. If a problem requires understanding cause-effect relationships (as most high-stakes applications do), specify this upfront. Don't deploy correlation-based systems in domains where causality matters.
Make accessibility and openness core criteria. Research that generates broad societal benefit should be accessible to researchers globally, as with AlphaFold. Proprietary systems that lock insights behind paywalls or API charges limit impact.
Plan for long time horizons. Societally important problems often require sustained effort over years or decades. Funding and evaluation frameworks must support this rather than demanding quick results.
These principles sound straightforward but implementing them requires institutional change. Universities would need to reform how they evaluate and promote faculty, shifting from pure publication counts toward assessing translation of research into practice. Funding agencies would need to prioritise societal impact over traditional metrics. Companies would need to accept longer development cycles and uncertain financial returns for some projects, balanced by accountability frameworks that assess societal impact alongside business metrics.
The Pragmatic Path Forward
The gap between academic breakthroughs and industrial deployment success reveals a system optimised for the wrong objectives. Academic incentives prioritise scholarly citations over societal impact. Industry incentives prioritise quarterly results over long-term value creation. Neither framework effectively identifies and solves problems of genuine importance.
The abandonment rate for generative AI projects isn't a temporary implementation challenge that better project management will solve. The MIT report showing 95% of generative AI pilots falling short demonstrates fundamental misalignment. When you optimise for efficiency gains and cost reduction, you get brittle systems that fail when conditions change. When you optimise for citations and publications, you get research that doesn't translate into practice. When you optimise for shareholder value, you get AI applications that extract value rather than create it.
Several promising developments suggest paths forward. The explosion in AI-assisted research papers (over one million identified across disciplines) demonstrates growing comfort with AI tools amongst scientists. The increasing collaboration between industry and academia shows that bridges can be built. The growth of open-source models provides infrastructure for researchers and nonprofits to build applications serving public interest. In 2025, 82% of enterprise decision makers now use generative AI weekly, up from just 37% in 2023, suggesting that organisations are learning to work effectively with these technologies.
Funding mechanisms need reform. Government research agencies and philanthropic foundations should create programmes explicitly focused on AI for societal benefit, with evaluation criteria emphasising impact over publications or patents. Universities need to reconsider how they evaluate AI research. A paper enabling practical solutions to important problems should count as much as (or more than) a paper introducing novel architectures that accumulate citations within the research community.
Companies deploying AI need accountability frameworks that assess societal impact alongside business metrics. This isn't merely about avoiding harms. It's about consciously choosing to work on problems that matter, even when the business case is uncertain. The fact that 88% of leaders expect to increase generative AI spending in the next 12 months, with 62% forecasting more than 10% budget growth over 2 to 5 years, suggests substantial resources will be available. The question is whether those resources will be directed wisely.
The Intelligence We Actually Need
The fundamental question isn't whether we can build more capable AI systems. Technical progress continues at a remarkable pace, with efficiency gains enabling increasingly sophisticated capabilities at decreasing costs. The question is whether we're building intelligence for the right purposes.
When AlphaFold's developers (John Jumper and Demis Hassabis at Google DeepMind) earned the Nobel Prize in Chemistry in 2024 alongside David Baker at the University of Washington, the recognition wasn't primarily for technical innovation, though the AI architecture was undoubtedly sophisticated. It was for choosing a problem (protein structure prediction) whose solution would benefit millions of researchers and ultimately billions of people. The problem selection mattered as much as the solution.
The abandoned generative AI projects represent wasted resources, but more importantly, they represent missed opportunities. Those millions of dollars in upfront investments and thousands of hours of skilled labour could have been directed toward problems where success would generate lasting value. The opportunity cost of bad problem selection is measured not just in failed projects but in all the good that could have been done instead.
The current trajectory, left unchanged, leads to a future where AI becomes increasingly sophisticated at solving problems that don't matter whilst failing to address challenges that do. We'll have ever-more-efficient systems for optimising ad targeting and customer service chatbots whilst healthcare, education, environmental monitoring, and scientific research struggle to access AI capabilities that could transform their work.
This needn't be the outcome. The technical capabilities exist. The research talent exists. The resources exist. McKinsey estimates generative AI's economic potential at $2.6 trillion to $4.4 trillion annually. What's missing is alignment: between academic research and practical needs, between industry capabilities and societal challenges, between technical sophistication and human flourishing.
Creating that alignment requires treating problem selection as itself an ethical choice deserving as much scrutiny as algorithmic fairness or privacy protection. It requires building institutions and incentive structures that reward work on societally important challenges, even when such work doesn't generate maximum citations or maximum revenue.
The shift toward smaller, specialised models demonstrates that the AI field can change direction when circumstances demand it. The efficiency paradigm emerged because the economic and environmental costs of ever-larger models became unsustainable. Similarly, the value extraction paradigm can shift if we recognise that the societal cost of misaligned problem selection is too high.
The choice isn't between academic purity and commercial pragmatism. It's between a system that generates random breakthroughs and scattered deployments versus one that systematically identifies important problems and marshals resources to solve them. The former produces occasional Nobel Prizes and frequent project failures. The latter could produce widespread, lasting benefit.
What does the gap between academic breakthroughs and industrial deployment reveal about the misalignment between how AI capabilities are developed and how they're deployed? The answer is clear: We've optimised the entire system for the wrong outcomes. We measure success by citations that don't translate into impact and revenue that doesn't create value. We celebrate technical sophistication whilst ignoring whether the problems being solved matter.
Fixing this requires more than better project management or clearer business cases. It requires fundamentally rethinking what we're trying to achieve. Not intelligence that can be commodified and sold, but intelligence that serves human needs. Not capabilities that impress peer reviewers or generate returns, but capabilities that address challenges we've collectively decided matter.
The technical breakthroughs will continue. The efficiency gains will compound. The question is whether we'll direct these advances toward problems worthy of the effort. That's ultimately a question not of technology but of values: What do we want intelligence, artificial or otherwise, to be for?
Until we answer that question seriously, with institutional structures and incentive frameworks that reflect our answer, we'll continue seeing spectacular breakthroughs that don't translate into progress and ambitious deployments that don't create lasting value. The abandonment rate isn't the problem. It's a symptom. The problem is that we haven't decided, collectively and explicitly, what problems deserve the considerable resources we're devoting to AI. Until we make that decision and build systems that reflect it, the gap between capability and impact will only widen, and the promise of artificial intelligence will remain largely unfulfilled.
Sources and References
Gartner, Inc. (July 2024). “Gartner Predicts 30% of Generative AI Projects Will Be Abandoned After Proof of Concept By End of 2025.” Press release from Gartner Data & Analytics Summit, Sydney. Available at: https://www.gartner.com/en/newsroom/press-releases/2024-07-29-gartner-predicts-30-percent-of-generative-ai-projects-will-be-abandoned-after-proof-of-concept-by-end-of-2025
MIT Report (August 2025). “95% of Generative AI Pilots at Companies Failing.” Fortune. Available at: https://fortune.com/2025/08/18/mit-report-95-percent-generative-ai-pilots-at-companies-failing-cfo/
S&P Global (2025). “AI Initiative Abandonment Research.” Data showing 42% of companies scrapped AI initiatives in 2025 versus 17% in 2024.
Stanford University Human-Centered Artificial Intelligence (2024). “AI Index Report 2024.” Stanford HAI. Available at: https://aiindex.stanford.edu/report/
Stanford University Human-Centered Artificial Intelligence (2025). “AI Index Report 2025, Chapter 1: Research and Development.” Stanford HAI. Available at: https://hai.stanford.edu/assets/files/hai_ai-index-report-2025_chapter1_final.pdf
BCG (October 2024). “AI Adoption in 2024: 74% of Companies Struggle to Achieve and Scale Value.” Boston Consulting Group. Available at: https://www.bcg.com/press/24october2024-ai-adoption-in-2024-74-of-companies-struggle-to-achieve-and-scale-value
Nature (October 2024). “Chemistry Nobel goes to developers of AlphaFold AI that predicts protein structures.” DOI: 10.1038/d41586-024-03214-7
Microsoft and IDC (January 2025). “Generative AI Delivering Substantial ROI to Businesses Integrating Technology Across Operations.” Available at: https://news.microsoft.com/en-xm/2025/01/14/generative-ai-delivering-substantial-roi-to-businesses-integrating-the-technology-across-operations-microsoft-sponsored-idc-report/
Menlo Ventures (2025). “2025: The State of AI in Healthcare.” Available at: https://menlovc.com/perspective/2025-the-state-of-ai-in-healthcare/
PMC (2024). “Adoption of artificial intelligence in healthcare: survey of health system priorities, successes, and challenges.” PMC12202002. Available at: https://pmc.ncbi.nlm.nih.gov/articles/PMC12202002/
AlphaFold Protein Structure Database (2024). “AlphaFold Protein Structure Database in 2024: providing structure coverage for over 214 million protein sequences.” Nucleic Acids Research. Oxford Academic. DOI: 10.1093/nar/gkad1010
UNESCO (2024). “Global AI Ethics and Governance Observatory.” Launched at 2nd Global Forum on the Ethics of Artificial Intelligence, Kranj, Slovenia. Available at: https://www.unesco.org/ethics-ai/en
UNESCO (2025). “Global Forum on the Ethics of AI 2025.” Scheduled for 24-27 June 2025, Bangkok, Thailand. Available at: https://www.unesco.org/en/forum-ethics-ai
Wharton School (October 2025). “82% of Enterprise Leaders Now Use Generative AI Weekly.” Multi-year study. Available at: https://www.businesswire.com/news/home/20251028556241/en/82-of-Enterprise-Leaders-Now-Use-Generative-AI-Weekly-Multi-Year-Wharton-Study-Finds-as-Investment-and-ROI-Continue-to-Build
Steingard et al. (2025). “Assessing the Societal Impact of Academic Research With Artificial Intelligence (AI): A Scoping Review of Business School Scholarship as a 'Force for Good'.” Learned Publishing. DOI: 10.1002/leap.2010
Deloitte (2024). “State of Generative AI in the Enterprise.” Survey of 2,770 companies.
RAND Corporation. “AI Project Failure Rates Research.” Multiple publications on AI implementation challenges.
IDC (2024). “AI Proof-of-Concept Transition Rates.” Research on AI deployment challenges showing 88% failure rate.
ACM Computing Surveys (2024). “Causality for Trustworthy Artificial Intelligence: Status, Challenges and Perspectives.” DOI: 10.1145/3665494
Frontiers in Artificial Intelligence (2024). “Implications of causality in artificial intelligence.” Available at: https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2024.1439702/full
Medwave (2024). “How AI is Transforming Healthcare: 12 Real-World Use Cases.” Available at: https://medwave.io/2024/01/how-ai-is-transforming-healthcare-12-real-world-use-cases/
UNESCO (2021). “Recommendation on the Ethics of Artificial Intelligence.” Adopted by 193 Member States. Available at: https://unesdoc.unesco.org/ark:/48223/pf0000385082
Oxford Academic (2022). “Achieving societal and academic impacts of research: A comparison of networks, values, and strategies.” Science and Public Policy, Volume 49, Issue 5. Available at: https://academic.oup.com/spp/article/49/5/728/6585532
National Science Foundation (2024). “Translating Research into Action Center (TRAC).” Accelerating Research Translation (ART) programme, $5.7M grant to American University. Available at: https://www.american.edu/centers/trac/
UChicago Medicine (2025). “What to know about AI ambient clinical documentation.” Available at: https://www.uchicagomedicine.org/forefront/patient-care-articles/2025/january/ai-ambient-clinical-documentation-what-to-know
McKinsey & Company (2025). “Generative AI ROI and Economic Impact Research.” Estimates of $3.70 return per dollar invested and $2.6-4.4 trillion annual economic potential.
Andreessen Horowitz (2024). “LLMflation – LLM inference cost is going down fast.” Analysis of 280-fold cost reduction. Available at: https://a16z.com/llmflation-llm-inference-cost/

Tim Green UK-based Systems Theorist & Independent Technology Writer
Tim explores the intersections of artificial intelligence, decentralised cognition, and posthuman ethics. His work, published at smarterarticles.co.uk, challenges dominant narratives of technological progress while proposing interdisciplinary frameworks for collective intelligence and digital stewardship.
His writing has been featured on Ground News and shared by independent researchers across both academic and technological communities.
ORCID: 0009-0002-0156-9795 Email: tim@smarterarticles.co.uk