Navigating Barriers to Generative AI Adoption in Enterprises (Part I)

May 1, 2024 | By Anik Bose, General Partner at BGV & Katya Evchenko, Sr. Associate at BGV

This piece was crafted for enterprises and startups implementing Gen AI-based applications. The blog consists of two parts: in Part I, we explore the key challenges of GenAI adoption; then, in Part II, we discuss strategies and best practices to mitigate these obstacles. Our insights are drawn from BGV’s direct experience with our portfolio companies and a broader set of engagements with the enterprise ecosystem.

Summary 

Despite strong interest in Gen AI technologies among enterprises, actual adoption rates remain low due to several critical obstacles. Securing tangible ROI for Gen AI-based business applications has proven challenging, making it difficult for enterprises to justify use cases, particularly under usage-based pricing models. Additionally, high AI infrastructure costs erode margins, complicating the creation of sustainable business models. Poor data quality results in inaccurate outputs, and cumbersome IT systems hinder smooth workflow integration. Finally, trust-related questions about data privacy, compliance, and security further complicate deployments, as enterprises remain cautious about exposing sensitive data to external parties.

A Patchwork of Progress 

Corporations recognize the need to innovate in business applications on the Gen AI frontier. Many enterprises are engaging in experimentation to identify viable use cases. JPMorgan Chase’s CEO, Jamie Dimon, revealed that the bank has already implemented “more than 300 AI use cases in production today.” Meanwhile, companies like Bayer boast of having identified “more than 700 potential applications” for Gen AI. 

While the productivity potential is widely recognized, Gen AI implementations are progressing at a slower pace than initially anticipated, according to The Economist. Past technological breakthroughs, like the Internet, reveal a common trend: innovation is necessary but not sufficient for widespread adoption. Surveys show that fewer than 5% of organizations widely use Gen AI tools, even though many employees try them out with enthusiasm. Most companies have not yet adopted Gen AI tools like ChatGPT, Google’s Gemini, or Microsoft’s Copilot on a large scale.

Barriers To Generative AI Adoption

The slow adoption of Gen AI in enterprises can be attributed to several key factors. First, fuzzy ROI metrics make it difficult to justify large investments in AI infrastructure. Second, messy, disparate data leads to poor model accuracy, while cumbersome IT systems make workflow integration unwieldy. Finally, low trust in AI systems hinders acceptance, given data privacy, compliance, and security concerns.

1. No Clear ROI

Despite the considerable hype surrounding Gen AI technologies, CFOs are increasingly demanding clear returns and a solid business case with defined KPIs, shifting from mere experimentation to proven utility. Many struggle to translate these investments into tangible, measurable productivity improvements, for example through systematic A/B testing of a manual workflow against its AI-enabled counterpart, as sketched below.
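A minimal sketch of what such an A/B comparison might compute, using time per task as the simplest possible KPI; all task times below are hypothetical, for illustration only:

```python
# Hypothetical A/B comparison of a manual workflow vs. an AI-assisted one.
# All task times are illustrative assumptions, not measured results.

manual_minutes = [42, 38, 55, 47, 50]    # control group: time per task (min)
assisted_minutes = [30, 26, 41, 33, 35]  # treatment group: time per task (min)

def mean(xs: list[float]) -> float:
    return sum(xs) / len(xs)

manual_avg = mean(manual_minutes)
assisted_avg = mean(assisted_minutes)
lift = (manual_avg - assisted_avg) / manual_avg  # fraction of time saved per task

print(f"manual: {manual_avg:.1f} min/task, AI-assisted: {assisted_avg:.1f} min/task")
print(f"productivity lift: {lift:.0%}")
```

In practice, a credible comparison also needs enough samples for statistical significance and a quality check on the AI-assisted output, but even this skeleton makes the KPI explicit.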

Pricing adds another layer of complexity. Large tech companies are integrating AI features into their existing applications and either rolling these costs into the overall pricing or subsidizing the functionality, hoping for future adoption. Both strategies can be problematic, as customers are often reluctant to absorb these additional charges. Usage-based pricing models lead to a further disconnect between ROI and the cost of the AI application.
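To see why usage-based pricing can decouple cost from ROI, consider a back-of-the-envelope model; every price and volume in this sketch is an assumption for illustration, not a vendor quote:

```python
# Hypothetical ROI model for a usage-priced Gen AI feature.
# All prices and volumes are illustrative assumptions, not vendor pricing.

PRICE_PER_1K_TOKENS = 0.03  # assumed blended API price per 1K tokens (USD)
TOKENS_PER_QUERY = 2_000    # assumed average prompt + completion size
VALUE_PER_QUERY = 0.05      # assumed productivity value of one query (USD)

def monthly_economics(queries_per_month: int) -> dict:
    cost = queries_per_month * (TOKENS_PER_QUERY / 1_000) * PRICE_PER_1K_TOKENS
    value = queries_per_month * VALUE_PER_QUERY
    return {"cost": cost, "value": value, "net": value - cost}

for volume in (10_000, 100_000, 1_000_000):
    r = monthly_economics(volume)
    print(f"{volume:>9,} queries/mo: cost=${r['cost']:,.0f} "
          f"value=${r['value']:,.0f} net=${r['net']:,.0f}")
```

With these assumed numbers, the per-query cost ($0.06) exceeds the per-query value ($0.05), so the net loss grows with every new user: under usage-based pricing, adoption success scales the cost problem right along with it.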

According to our internal benchmarks, only 2% of Gen AI POCs successfully make the leap to production. Many enterprise projects start as low-hanging fruit, but once the POC is complete, teams realize that Gen AI is only part of the solution and that further investment is required to meet the project’s goals.

Startups specializing in Gen AI applications can spend a significant portion of their budget, sometimes up to 80%, on AI infrastructure costs. This includes investments in computation hardware, proprietary LLMs, and cloud computing resources. While certain AI infrastructure strategies (e.g., those facilitated by hyperscalers or enabled through APIs) can accelerate time-to-value for enterprises, they come at the price of soaring compute and inference costs. Such costs can destroy the viability of startup business models and the ability of enterprises to deliver on ROI. Finally, third-party hosted models can erode a startup’s or enterprise’s ability to adequately address data privacy concerns, particularly for proprietary or sensitive data.

The towering cost of AI infrastructure is driven by the size of large language models, most with tens to hundreds of billions of parameters, which require significant GPU and cloud computing resources for efficient training and inference. What’s more, larger LLMs, selected in the belief that bigger is better (i.e., delivers better performance), boost these costs further. When transitioning LLM tools from POC to production, prioritizing FinOps and employing techniques like model orchestration and token optimization is essential to keep costs as low as possible.
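As a concrete, if simplified, illustration of what model orchestration and token optimization can look like, here is a minimal sketch; the model names, per-token prices, and routing heuristic are all assumptions made for the example:

```python
# Minimal sketch of cost-aware model orchestration with token trimming.
# Model names, prices, and the routing heuristic are hypothetical.

CHEAP_MODEL = ("small-llm", 0.0005)  # (name, assumed USD per 1K tokens)
LARGE_MODEL = ("large-llm", 0.03)

MAX_CONTEXT_TOKENS = 1_000  # token budget enforced before every call

def rough_token_count(text: str) -> int:
    # Crude approximation: roughly 4 characters per token in English.
    return max(1, len(text) // 4)

def trim_to_budget(prompt: str, budget: int = MAX_CONTEXT_TOKENS) -> str:
    # Token optimization: keep only the most recent context that fits.
    while rough_token_count(prompt) > budget:
        prompt = prompt[len(prompt) // 10:]  # drop the oldest 10%, re-check
    return prompt

def route(prompt: str, needs_reasoning: bool) -> tuple[str, float]:
    # Orchestration: escalate to the expensive model only when necessary.
    prompt = trim_to_budget(prompt)
    name, price_per_1k = LARGE_MODEL if needs_reasoning else CHEAP_MODEL
    estimated_cost = rough_token_count(prompt) / 1_000 * price_per_1k
    return name, estimated_cost

model, cost = route("Summarize this support ticket: ..." * 50, needs_reasoning=False)
print(f"routed to {model}, estimated input cost ${cost:.4f}")
```

Production systems typically replace the boolean flag with a lightweight classifier over the query, and FinOps teams track the resulting per-request cost as a first-class metric.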

2. Insufficient Accuracy and Poor Integration with Workflows

Traditionally, IT systems have been designed to operate on structured data. However, LLMs handle vast amounts of unstructured data from disparate sources, resulting in data, IT system, and workflow integration challenges.

In specialized domains, the accuracy of current LLMs is surprisingly low. In the tax domain, for instance, research indicates that GPT-4 currently achieves only 54.5% accuracy. In other words, out of 100 relatively challenging multiple-choice tax questions, GPT-4 answers only about 55 correctly, which is far from reliable (random selection would score 25%, and a few models performed worse than that). Most LLMs were initially developed for consumer search, making much of their training data irrelevant for specific enterprise applications. Our internal benchmarks suggest that less than 1% of the training data used by general LLMs is pertinent to the intended application. Although Cohere appears tailored for enterprise use cases, its accuracy on the above-mentioned tax benchmark stood at a striking 22% as of April 2024. Another challenge lies in the probabilistic nature of LLM outputs: a model may generate entirely made-up responses while sounding confident, a phenomenon known as hallucination.
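For readers curious how figures like the 54.5% above are produced, the sketch below shows the skeleton of a multiple-choice evaluation harness; the questions and the model stub are placeholders, not the actual benchmark:

```python
# Skeleton of a multiple-choice benchmark scorer. The question set and the
# model stub are placeholders; a real harness calls an LLM API and parses
# the selected letter out of its response.

QUESTIONS = [
    {"prompt": "Q1 ...", "choices": ["A", "B", "C", "D"], "answer": "B"},
    {"prompt": "Q2 ...", "choices": ["A", "B", "C", "D"], "answer": "D"},
    # ... a real benchmark would have hundreds of vetted questions
]

def ask_model(prompt: str, choices: list[str]) -> str:
    return "B"  # placeholder answer standing in for a model call

def accuracy(questions: list[dict]) -> float:
    correct = sum(
        ask_model(q["prompt"], q["choices"]) == q["answer"] for q in questions
    )
    return correct / len(questions)

print(f"accuracy: {accuracy(QUESTIONS):.1%}")  # random guessing over 4 choices: ~25%
```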

Improving data quality isn’t enough: to truly benefit from Gen AI, these tools must be integrated smoothly into existing workflows. Integrating Gen AI with legacy systems is challenging, and enterprises may even build Gen AI on top of legacy processes before realizing that the new technology makes those processes obsolete. If AI tools don’t mesh with existing workflows, they can disrupt productivity instead of improving it.

3. Trust Issues

The adoption of new technology fundamentally hinges on trust. This includes trust in the technology itself, the product, and the company behind it. Building trust is a gradual process, deeply intertwined with human factors. As users gain familiarity and observe reliable performance, their confidence in the technology grows, reinforcing the cycle of trust and adoption.

According to the AI Index Report published by the Stanford Institute for Human-Centered Artificial Intelligence, 52% of survey participants from the general population expressed nervousness toward AI products and services in 2023, a 13 percentage point increase from 2022. In the US, Pew data likewise suggests that 52% of Americans feel more concerned than excited about AI, up from 38% in 2022. This trend runs parallel to another: the significant increase in AI-related regulation in the US. In 2023 there were 25 AI-related regulations, up from just one in 2016; the count grew by 56% in 2023 alone. One possible reason for this widespread skepticism is anthropomorphism: the attribution of human-like intelligence to AI, which is fundamentally just applied mathematics.

Enterprises are increasingly preoccupied with data privacy and the protection of sensitive information. As an enterprise’s size increases, so does its security team’s authority to restrict external Gen AI solutions. As a result, larger corporations are often reluctant to let their sensitive data pass beyond their firewalls, fearing potential breaches. Additionally, enterprises seek assurance that their AI models adhere to safety frameworks and regulations and align with corporate values and ethics. Failure to meet these standards undermines the trustworthiness of the technology.

Whether a company is built around Gen AI or is incorporating LLM features into existing products, consistent implementation practices are emerging. In Part II, we will explore these insights, drawn from our portfolio and other enterprises successfully using Gen AI. Stay tuned to learn how these strategies can benefit your organization.

If you are building in the space, or have some ideas on the topic and want to discuss them, feel free to reach out to us at [email protected] and [email protected]

We want to thank all our colleagues and friends, including Julio Casal, Sujay Rao, and Chyngyz Dzhumanazarov, who contributed to this blog.

Other sources used:

  1. Public Enterprise LLM Benchmarks, https://www.vals.ai/
  2. The Economist, “How Businesses Are Actually Using Gen AI,” February 2024
  3. The Wall Street Journal, “AI Is Taking On New Work. But Change Will Be Hard—and Expensive,” March 2024
  4. Stanford Institute for Human-Centered Artificial Intelligence, “AI Index Report,” April 2024
  5. Key takeaways from the “AI x Risk Management” meet-up organized by BGV and Astorya VC, Paris, April 2024