Beyond “Prompt and Pray” – O’Reilly

January 21, 2025

TL;DR:

Enterprise AI teams are discovering that purely agentic approaches (dynamically chaining LLM calls) don’t deliver the reliability needed for production systems.
The prompt-and-pray model, where business logic lives entirely in prompts, creates systems that are unreliable, inefficient, and impossible to maintain at scale.
A shift toward structured automation, which separates conversational ability from business logic execution, is needed for enterprise-grade reliability.
This approach delivers substantial benefits: consistent execution, lower costs, better security, and systems that can be maintained like traditional software.

Picture this: The current state of conversational AI is like a scene from Hieronymus Bosch’s Garden of Earthly Delights. At first glance, it’s mesmerizing: a paradise of potential. AI systems promise seamless conversations, intelligent agents, and effortless integration. But look closely and chaos emerges: a false paradise all along.

Your company’s AI assistant confidently tells a customer it’s processed their urgent withdrawal request, except that it hasn’t, because it misinterpreted the API documentation. Or perhaps it cheerfully informs your CEO that it’s archived those sensitive board documents, but into entirely the wrong folder. These aren’t hypothetical scenarios; they’re the daily reality for organizations betting their operations on the prompt-and-pray approach to AI implementation.

The Evolution of Expectations

For years, the AI world was driven by scaling laws: the empirical observation that larger models and bigger datasets led to proportionally better performance. This fueled a belief that simply making models bigger would solve deeper issues like accuracy, understanding, and reasoning. However, there’s growing consensus that the era of scaling laws is coming to an end. Incremental gains are harder to achieve, and organizations betting on ever-more-powerful LLMs are beginning to see diminishing returns.

Against this backdrop, expectations for conversational AI have skyrocketed. Remember the simple chatbots of yesterday? They handled basic FAQs with preprogrammed responses. Today’s enterprises want AI systems that can:

Navigate complex workflows across multiple departments
Interface with hundreds of internal APIs and services
Handle sensitive operations with security and compliance in mind
Scale reliably across thousands of users and millions of interactions

However, it’s important to carve out what these systems are and are not. When we talk about conversational AI, we’re referring to systems designed to have a conversation, orchestrate workflows, and make decisions in real time. These are systems that engage in conversations and integrate with APIs but don’t create stand-alone content like emails, presentations, or documents. Use cases like “write this email for me” and “create a deck for me” fall into content generation, which lies outside this scope. This distinction is important because the challenges and solutions for conversational AI are unique to systems that operate in an interactive, real-time environment.

We’ve been told 2025 will be the year of Agents, but at the same time there’s a growing consensus from the likes of Anthropic, Hugging Face, and other leading voices that complex workflows require more control than simply trusting an LLM to figure everything out.

The Prompt-and-Pray Problem

The standard playbook for many conversational AI implementations today looks something like this:

Collect relevant context and documentation
Craft a prompt explaining the task
Ask the LLM to generate a plan or response
Trust that it works as intended

This approach, which we call prompt and pray, seems attractive at first. It’s quick to implement and demos well. But it harbors serious issues that become apparent at scale:

Unreliability

Every interaction becomes a new opportunity for error. The same query can yield different results depending on how the model interprets the context that day. When dealing with enterprise workflows, this variability is unacceptable.

To get a sense of the unreliable nature of the prompt-and-pray approach, consider that Hugging Face reports the state of the art on function calling is well under 90% accurate. 90% accuracy for software will often be a deal-breaker, but the promise of agents rests on the ability to chain calls together: even at 90% per call, five in a row will fail over 40% of the time!
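The arithmetic behind that figure can be checked directly. The sketch below assumes every call in the chain succeeds independently and with the same probability, which is a simplification; the function name `chain_failure_rate` is chosen here for illustration only.

```python
def chain_failure_rate(step_accuracy: float, steps: int) -> float:
    """Probability that at least one call in a chain fails.

    Assumes each call succeeds independently with probability step_accuracy.
    """
    if not 0.0 <= step_accuracy <= 1.0:
        raise ValueError("step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    # The whole chain succeeds only if every step succeeds.
    return 1.0 - step_accuracy ** steps


if __name__ == "__main__":
    for n in (1, 3, 5, 10):
        rate = chain_failure_rate(0.9, n)
        print(f"{n} chained calls at 90% accuracy: {rate:.1%} chance of failure")
```

With five chained calls at 90% accuracy each, the chain succeeds with probability 0.9^5 ≈ 0.59, so it fails roughly 41% of the time.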

Inefficiency

Dynamic generation of responses and plans is computationally expensive. Each interaction requires multiple API calls, token processing, and runtime decision-making. This translates to higher costs and slower response times.

Complexity

Debugging these systems is a nightmare. When an LLM doesn’t do what you want, your main recourse is to change the input. But the only way to know the impact that your change will have is trial and error. When your application involves many steps, each of which uses the output from one LLM call as input for another, you’re left sifting through chains of LLM reasoning, trying to understand why the model made certain decisions. Development velocity grinds to a halt.

Security

Letting LLMs make runtime decisions about business logic creates unnecessary risk. The OWASP AI Security & Privacy Guide specifically warns against “Excessive Agency,” that is, giving AI systems too much autonomous decision-making power. Yet many current implementations do exactly that, exposing organizations to potential breaches and unintended outcomes.

A Better Way Forward: Structured Automation

The alternative isn’t to abandon AI’s capabilities but to harness them more intelligently through structured automation. Structured automation is a development approach that separates conversational AI’s natural language understanding from deterministic workflow execution. This means using LLMs to interpret user input and clarify what users want, while relying on predefined, testable workflows for critical operations. By separating these concerns, structured automation ensures that AI-powered systems are reliable, efficient, and maintainable.

This approach separates concerns that are often muddled in prompt-and-pray systems:

Understanding what the user wants: Use LLMs for their strength in understanding, manipulating, and generating natural language
Business logic execution: Rely on predefined, tested workflows for critical operations
State management: Maintain clear control over system state and transitions
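The three concerns above can be sketched as three separate pieces of code. Everything in this example is an illustrative assumption rather than a prescribed design: `interpret` is a keyword-matching stand-in for an LLM call, and the workflow names, the `State` enum, and the `Conversation` class are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, Dict


class State(Enum):
    """Explicit conversation states, owned by the application rather than the model."""
    AWAITING_INPUT = auto()
    RUNNING_WORKFLOW = auto()
    NEEDS_CLARIFICATION = auto()
    DONE = auto()


def interpret(message: str) -> str:
    """Stand-in for an LLM call that maps free text to a known intent name.

    A real system would call a model here; keyword matching keeps the sketch runnable.
    """
    text = message.lower()
    if "refund" in text:
        return "refund_order"
    if "address" in text:
        return "update_address"
    return "unknown"


# Business logic: plain functions that can be unit tested and versioned.
def refund_order(params: dict) -> str:
    return f"Refund issued for order {params['order_id']}"


def update_address(params: dict) -> str:
    return f"Address updated for order {params['order_id']}"


WORKFLOWS: Dict[str, Callable[[dict], str]] = {
    "refund_order": refund_order,
    "update_address": update_address,
}


@dataclass
class Conversation:
    state: State = State.AWAITING_INPUT
    history: list = field(default_factory=list)

    def handle(self, message: str, params: dict) -> str:
        intent = interpret(message)  # language understanding
        self.history.append((message, intent))
        workflow = WORKFLOWS.get(intent)
        if workflow is None:
            # No predefined workflow matches, so ask rather than improvise.
            self.state = State.NEEDS_CLARIFICATION
            return "Could you tell me more about what you need?"
        self.state = State.RUNNING_WORKFLOW
        result = workflow(params)  # deterministic execution
        self.state = State.DONE
        return result
```

The model only ever chooses among intents the application already knows about; what happens next is ordinary code.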

The key principle is simple: Generate once, run reliably forever. Instead of having LLMs make runtime decisions about business logic, use them to help create robust, reusable workflows that can be tested, versioned, and maintained like traditional software.

By keeping the business logic separate from conversational capabilities, structured automation ensures that systems remain reliable, efficient, and secure. This approach also reinforces the boundary between generative conversational tasks (where the LLM thrives) and operational decision-making (which is best handled by deterministic, software-like processes).

By “predefined, tested workflows,” we mean creating workflows during the design phase, using AI to assist with ideas and patterns. These workflows are then implemented as traditional software, which can be tested, versioned, and maintained. This approach is well understood in software engineering and contrasts sharply with building agents that rely on runtime decisions, an inherently less reliable and harder-to-maintain model.
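To make “tested, versioned, and maintained” concrete, here is a minimal sketch of one business rule written as ordinary code with tests beside it. The rule itself (a 30-day refund window), the version string, and all names are assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta

WORKFLOW_VERSION = "1.0.0"  # hypothetical version tag, bumped whenever the rule changes
REFUND_WINDOW_DAYS = 30     # assumed policy value, for illustration only


@dataclass(frozen=True)
class Order:
    order_id: str
    delivered_on: date
    total_cents: int


def refund_eligible(order: Order, today: date) -> bool:
    """Deterministic rule: refunds are allowed within the window after delivery."""
    elapsed = today - order.delivered_on
    return timedelta(0) <= elapsed <= timedelta(days=REFUND_WINDOW_DAYS)


# Plain test functions; a runner such as pytest would collect these by name.
def test_refund_inside_window():
    order = Order("A1", date(2025, 1, 1), 4200)
    assert refund_eligible(order, date(2025, 1, 31))


def test_refund_outside_window():
    order = Order("A1", date(2025, 1, 1), 4200)
    assert not refund_eligible(order, date(2025, 2, 1))
```

Because the rule is a pure function, a change to the policy shows up as a reviewable diff and a failing or passing test, not as a change in model behavior.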

Alex Strick van Linschoten and the team at ZenML have recently compiled a database of 400+ (and growing!) LLM deployments in the enterprise. Not surprisingly, they discovered that structured automation delivers significantly more value across the board than the prompt-and-pray approach:

There’s a striking disconnect between the promise of fully autonomous agents and their presence in customer-facing deployments. This gap isn’t surprising when we examine the complexities involved. The reality is that successful deployments tend to favor a more constrained approach, and the reasons are illuminating…
Take Lindy.ai’s journey: they began with open-ended prompts, dreaming of fully autonomous agents. However, they discovered that reliability improved dramatically when they shifted to structured workflows. Similarly, Rexera found success by implementing decision trees for quality control, effectively constraining their agents’ decision space to improve predictability and reliability.

The prompt-and-pray approach is tempting because it demos well and feels fast. But beneath the surface, it’s a patchwork of brittle improvisation and runaway costs. The antidote isn’t abandoning the promise of AI; it’s designing systems with a clear separation of concerns: conversational fluency handled by LLMs, business logic powered by structured workflows.

What Does Structured Automation Look Like in Practice?

Consider a typical customer support scenario: a customer messages your AI assistant saying, “Hey, you messed up my order!”

The LLM interprets the user’s message, asking clarifying questions like, “What’s missing from your order?”
Having received the relevant details, the structured workflow queries backend data to determine the issue: Were items shipped separately? Are they still in transit? Were they out of stock?
Based on this information, the structured workflow determines the appropriate options: a refund, reshipment, or another resolution. If needed, it requests more information from the customer, leveraging the LLM to handle the conversation.

Here, the LLM excels at navigating the complexities of human language and dialogue. But the critical business logic (querying databases, checking stock, and determining resolutions) lives in predefined workflows.
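The workflow half of this scenario might look something like the sketch below. It assumes the LLM has already turned the chat into a structured `Complaint`; the in-memory `FAKE_BACKEND` dictionary stands in for a real order service, and the statuses, resolutions, and mapping between them are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ItemStatus(Enum):
    DELIVERED = "delivered"
    IN_TRANSIT = "in_transit"
    OUT_OF_STOCK = "out_of_stock"
    LOST = "lost"


class Resolution(Enum):
    WAIT_FOR_DELIVERY = "wait_for_delivery"
    REFUND = "refund"
    RESHIP = "reship"
    ASK_CUSTOMER = "ask_customer"


# Stand-in for a backend lookup; a real workflow would query an order service.
FAKE_BACKEND = {
    ("ORD-1", "mug"): ItemStatus.IN_TRANSIT,
    ("ORD-1", "kettle"): ItemStatus.OUT_OF_STOCK,
    ("ORD-2", "lamp"): ItemStatus.LOST,
}


@dataclass(frozen=True)
class Complaint:
    """Structured details the LLM is assumed to have extracted from the chat."""
    order_id: str
    missing_item: str


def resolve_missing_item(complaint: Complaint) -> Resolution:
    """Deterministic decision: the same inputs always yield the same resolution."""
    status = FAKE_BACKEND.get((complaint.order_id, complaint.missing_item))
    if status is None or status is ItemStatus.DELIVERED:
        # Unknown item, or records say it arrived: gather more detail via the LLM.
        return Resolution.ASK_CUSTOMER
    if status is ItemStatus.IN_TRANSIT:
        return Resolution.WAIT_FOR_DELIVERY
    if status is ItemStatus.OUT_OF_STOCK:
        return Resolution.REFUND
    return Resolution.RESHIP  # the only remaining status is LOST
```

For example, `resolve_missing_item(Complaint("ORD-1", "kettle"))` returns `Resolution.REFUND`, and it does so for every customer who reports the same situation.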

This approach ensures:

Reliability: The same logic applies consistently across all users.
Security: Sensitive operations are tightly controlled.
Efficiency: Developers can test, version, and improve workflows like traditional software.

Structured automation bridges the best of both worlds: conversational fluency powered by LLMs and dependable execution handled by workflows.

What About the Long Tail?

A common objection to structured automation is that it doesn’t scale to handle the “long tail” of tasks: those rare, unpredictable scenarios that seem impossible to predefine. But the truth is that structured automation simplifies edge-case management by making LLM improvisation safe and measurable.

Here’s how it works: Low-risk or rare tasks can be handled flexibly by LLMs in the short term. Each interaction is logged, patterns are analyzed, and workflows are created for tasks that become frequent or critical. Today’s LLMs are very capable of generating the code for a structured workflow given examples of successful conversations. This iterative approach turns the long tail into a manageable pipeline of new functionality, with the knowledge that by promoting these tasks into structured workflows we gain reliability, explainability, and efficiency.
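The “log, analyze, promote” loop can be sketched as a small analysis over interaction logs. The log schema, the `llm_fallback` label, and the frequency threshold are all assumptions for illustration; a real pipeline would also weigh how critical a task is, not just how often it occurs.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class InteractionLog:
    intent: str       # label assigned to the conversation (assumed to exist)
    handled_by: str   # "workflow" or "llm_fallback"
    succeeded: bool


def promotion_candidates(logs: Iterable[InteractionLog], min_count: int = 3) -> List[str]:
    """Return intents the LLM fallback handled successfully at least min_count times.

    Results are ordered from most to least frequent.
    """
    counts = Counter(
        log.intent
        for log in logs
        if log.handled_by == "llm_fallback" and log.succeeded
    )
    return [intent for intent, n in counts.most_common() if n >= min_count]
```

Each intent returned is a candidate for a new structured workflow, with its successful conversations serving as examples for design and testing.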

From Runtime to Design Time

Let’s revisit the earlier example: a customer messages your AI assistant saying, “Hey, you messed up my order!”

The Prompt-and-Pray Approach

Dynamically interprets messages and generates responses
Makes real-time API calls to execute operations
Relies on improvisation to resolve issues

This approach leads to unpredictable results, security risks, and high debugging costs.

A Structured Automation Approach

Uses LLMs to interpret user input and gather details
Executes critical tasks through tested, versioned workflows
Relies on structured systems for consistent results

The Benefits Are Substantial:

Predictable execution: Workflows behave consistently every time
Lower costs: Reduced token usage and processing overhead
Better security: Clear boundaries around sensitive operations
Easier maintenance: Standard software development practices apply

The Role of Humans

For edge cases, the system escalates to a human with full context, ensuring sensitive scenarios are handled with care. This human-in-the-loop model combines AI efficiency with human oversight for a reliable and collaborative experience.

This approach can be extended beyond expense reports to other domains like customer support, IT ticketing, and internal HR workflows: anywhere conversational AI needs to reliably integrate with backend systems.
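Escalating “with full context” can be sketched as packaging the transcript and workflow state into a handoff record. The `Session` and `EscalationTicket` types and the escalation rule (no matching workflow) are hypothetical; real systems would likely add other triggers, such as sensitive topics or repeated failures.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class EscalationTicket:
    customer_id: str
    reason: str
    transcript: List[str]
    workflow_state: dict


@dataclass
class Session:
    customer_id: str
    transcript: List[str] = field(default_factory=list)
    workflow_state: dict = field(default_factory=dict)

    def escalate(self, reason: str) -> EscalationTicket:
        # Copy the context so later session changes do not alter the ticket.
        return EscalationTicket(
            customer_id=self.customer_id,
            reason=reason,
            transcript=list(self.transcript),
            workflow_state=dict(self.workflow_state),
        )


def maybe_escalate(
    session: Session, intent: Optional[str], known_intents: Set[str]
) -> Optional[EscalationTicket]:
    """Hand off to a human when no predefined workflow covers the request."""
    if intent is None or intent not in known_intents:
        return session.escalate("no matching workflow")
    return None
```

The human agent receives the conversation so far and the state of any partially completed workflow, so the customer does not have to start over.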

Building for Scale

The future of enterprise conversational AI isn’t in giving models more runtime autonomy; it’s in using their capabilities more intelligently to create reliable, maintainable systems. This means:

Treating AI-powered systems with the same engineering rigor as traditional software
Using LLMs as tools for generation and understanding, not as runtime decision engines
Building systems that can be understood, maintained, and improved by normal engineering teams

The question isn’t how to automate everything at once but how to do so in a way that scales, works reliably, and delivers consistent value.

Taking Action

For technical leaders and decision makers, the path forward is clear:

1. Audit current implementations:

Identify areas where prompt-and-pray approaches create risk
Measure the cost and reliability impact of current systems
Look for opportunities to implement structured automation

2. Start small but think big:

Begin with pilot projects in well-understood domains
Build reusable components and patterns
Document successes and lessons learned

3. Invest in the right tools and practices:

Look for platforms that support structured automation
Build expertise in both LLM capabilities and traditional software engineering
Develop clear guidelines for when to use different approaches

The era of prompt and pray may be beginning, but you can do better. As enterprises mature in their AI implementations, the focus must shift from impressive demos to reliable, scalable systems. Structured automation provides the framework for this transition, combining the power of AI with the reliability of traditional software engineering.

The future of enterprise AI isn’t just about having the latest models; it’s about using them wisely to build systems that work consistently, scale effectively, and deliver real value. The time to make this transition is now.
