
Building an Agentic RAG with Phidata


When building applications using Large Language Models (LLMs), the quality of responses depends heavily on effective planning and reasoning for a given user task. While traditional RAG systems are powerful, incorporating Agentic workflows can significantly enhance the system's ability to process and respond to queries.

In this article, you'll build an Agentic RAG system with memory components using the Phidata open-source Agentic framework, demonstrating how to combine vector databases (here, Qdrant), embedding models, and intelligent agents for improved results.

Learning Objectives

  • Understand and design the architecture of the components required for Agentic RAG systems.
  • Learn how vector databases and embedding models for knowledge base creation can be integrated within the Agentic workflow.
  • Learn to implement memory components for improved context retention.
  • Develop an AI Agent that can perform multiple tool calls and decide which tool to choose based on user questions or tasks using Phidata.
  • Build a real-world Document Analyzer Assistant Agent that can interact with private information from the knowledge base and fall back to DuckDuckGo when the knowledge base lacks context.

This article was published as a part of the Data Science Blogathon.

What are Agents and RAG?

Agents, in the context of AI, are components designed to emulate human-like thinking and planning capabilities. Agent components handle:

  • Task decomposition into manageable subtasks.
  • Intelligent decision-making about which tools to use and taking the necessary action.
  • Reasoning about the best approach to solving a problem.

RAG (Retrieval-Augmented Generation) combines information retrieval with LLM capabilities. When we integrate agents into RAG systems, we create a powerful workflow that can:

  • Analyze user queries intelligently.
  • Save the user's documents in a knowledge base or vector database.
  • Choose appropriate knowledge sources or context for the given user query.
  • Plan the retrieval and response generation process.
  • Maintain context through memory components.

The key difference between traditional RAG and Agentic RAG lies in the decision-making layer that determines how to process each query and how to interact with tools to get real-time information.

Now that we know there is such a thing as Agentic RAG, how do we build it? Let's break it down.

What is Phidata?

Phidata is an open-source framework designed to build, monitor, and deploy Agentic workflows. It supports multimodal AI agents equipped with memory, knowledge, tools, and reasoning capabilities. Its model-agnostic architecture ensures compatibility with various large language models (LLMs), enabling developers to transform any LLM into a functional AI agent. Additionally, Phidata lets you deploy your Agent workflows using a bring-your-own-cloud (BYOC) approach, offering both flexibility and control over your AI systems.

Key features of Phidata include the ability to build teams of agents that collaborate to solve complex problems, a user-friendly Agent UI for seamless interaction (the Phidata playground), and built-in support for agentic retrieval-augmented generation (RAG) and structured outputs. The framework also emphasizes monitoring and debugging, providing tools to ensure robust and reliable AI applications.
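For a quick feel of the framework before the full walkthrough, here is a minimal agent sketch using the same Agent and OpenAIChat classes we rely on later; the prompt is just an illustrative placeholder and assumes an OpenAI API key is already set.

from phi.agent import Agent
from phi.model.openai import OpenAIChat

# A bare-bones agent: one reasoning model, no tools or knowledge base yet.
basic_agent = Agent(
    model=OpenAIChat(id="gpt-4o"),  # any supported LLM can be swapped in here
    markdown=True,                  # render responses as markdown
)

# Streams the model's answer to stdout.
basic_agent.print_response("Explain retrieval-augmented generation in two sentences.", stream=True)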

Agent Use Cases Using Phidata

Explore the transformative power of Agent-based systems in real-world applications, leveraging Phidata to enhance decision-making and task automation.

Financial Analysis Agent

By integrating tools like YFinance, Phidata enables the creation of agents that can fetch real-time stock prices, analyze financial data, and summarize analyst recommendations. Such agents assist investors and analysts in making informed decisions by providing up-to-date market insights. A rough sketch of this pattern follows.
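The sketch below is not part of the main walkthrough; the YFinanceTools flags shown are assumptions based on Phidata's YFinance integration, and the ticker question is only an example.

from phi.agent import Agent
from phi.model.openai import OpenAIChat
from phi.tools.yfinance import YFinanceTools

# Agent equipped with YFinance tools for live prices and analyst views.
finance_agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    tools=[YFinanceTools(stock_price=True, analyst_recommendations=True)],
    show_tool_calls=True,  # print which tool the agent decides to call
    markdown=True,
)

finance_agent.print_response("Summarize analyst recommendations for NVDA.", stream=True)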

Web Search Agent

Phidata also helps develop agents capable of retrieving real-time information from the web using search tools like DuckDuckGo, SerpAPI, or Serper. These agents can answer user queries by sourcing the latest data, making them valuable for research and information-gathering tasks. A compact sketch is shown below.
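This standalone web-search agent uses the same DuckDuckGo tool we wire into the main example later; the query string is purely illustrative.

from phi.agent import Agent
from phi.model.openai import OpenAIChat
from phi.tools.duckduckgo import DuckDuckGo

# Agent that answers questions by searching the live web instead of a knowledge base.
web_agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    tools=[DuckDuckGo()],
    show_tool_calls=True,
    markdown=True,
)

web_agent.print_response("What is the latest stable release of Qdrant?", stream=True)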

Multimodal Agents

Phidata also supports multimodal capabilities, enabling the creation of agents that analyze images, videos, and audio. These multimodal agents can handle tasks such as image recognition, text-to-image generation, audio transcription, and video analysis, offering versatile solutions across various domains. For text-to-image or text-to-video tasks, tools like DALL-E and Replicate can be integrated, while for image-to-text and video-to-text tasks, multimodal LLMs such as GPT-4, Gemini 2.0, Claude AI, and others can be used.

Real-time Use Case for Agentic RAG

Imagine you have documentation for your startup and want to create a chat assistant that can answer user questions based on that documentation. To make your chatbot more intelligent, it also needs to handle real-time data. Typically, answering real-time queries requires either rebuilding the knowledge base or retraining the model.

This is where Agents come into play. By combining the knowledge base with Agents, you can create an Agentic RAG (Retrieval-Augmented Generation) solution that not only improves the chatbot's ability to retrieve accurate answers but also enhances its overall performance.


We have three main components that come together to form our knowledge base. First, we have data sources, like documentation pages, PDFs, or any websites we want to use. Then we have Qdrant, our vector database – a smart storage system that helps us find similar information quickly. And finally, we have the embedding model that converts our text into a format that computers can work with. These three components feed into our knowledge base, which acts as the brain of our system.

Now we define the Agent object from Phidata.

The agent is connected to three components:

  • A reasoning model (like GPT-4, Gemini 2.0, or Claude) that helps it think and plan.
  • Memory (SqlAgentStorage) that helps it remember previous conversations.
  • Tools (like DuckDuckGo search) that it can use to find information.

Note: Here, the knowledge base and DuckDuckGo will both act as tools, and based on the task or user query the Agent will decide which tool to use to generate the response. Also, the embedding model is OpenAI by default, so we'll use OpenAI's GPT-4o as the reasoning model.
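If you want to make that default embedder explicit (or swap it out), Phidata exposes an embedder parameter on the vector database. This is an optional sketch under that assumption, separate from the step-by-step code below; the collection name and placeholders mirror the walkthrough.

from phi.embedder.openai import OpenAIEmbedder
from phi.vectordb.qdrant import Qdrant

# Explicitly pin the embedding model used when indexing and querying Qdrant
# (assumed parameter; by default Phidata falls back to an OpenAI embedder).
vector_db = Qdrant(
    collection="agentic-rag",
    url="<replace>",
    api_key="<replace>",
    embedder=OpenAIEmbedder(model="text-embedding-3-small"),
)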

Let's build this code.

Step-by-Step Code Implementation: Agentic RAG Using Qdrant, OpenAI, and Phidata

It's time to build a Document Analyzer Assistant Agent that can interact with private information (a website) from the knowledge base and use DuckDuckGo when the knowledge base lacks context.

Step 1: Setting Up Dependencies

To build the Agentic RAG workflow we need to install a few libraries, which include:

  • Phidata: To define the Agent object and the workflow execution.
  • Google Generative AI – Reasoning model, i.e., Gemini 2.0 Flash.
  • Qdrant – Vector database where the knowledge base will be stored and later used to retrieve relevant context.
  • DuckDuckGo – Search engine used to fetch real-time information.
pip install phidata google-generativeai duckduckgo-search qdrant-client

Step 2: Initial Configuration and Setting Up API Keys

In this step, we'll set up the environment variables and gather the required API credentials to run this use case. You can get your OpenAI API key from https://platform.openai.com/. Create an account and generate a new key.

from phi.knowledge.website import WebsiteKnowledgeBase
from phi.vectordb.qdrant import Qdrant

from phi.agent import Agent
from phi.storage.agent.sqlite import SqlAgentStorage
from phi.model.openai import OpenAIChat
from phi.tools.duckduckgo import DuckDuckGo

import os

os.environ['OPENAI_API_KEY'] = "<replace>"

Step 3: Set Up the Vector Database – Qdrant

You now need to initialize the Qdrant client by providing the collection name, URL, and API key for your vector database. The Qdrant database stores and indexes the information from the website, allowing the agent to retrieve relevant information based on user queries. This step sets up the data layer for your agent:

  • Create a cluster: https://cloud.qdrant.io/
  • Give your cluster a name and copy the API key once the cluster is created.
  • Under the curl command, you can copy the endpoint URL.
COLLECTION_NAME = "agentic-rag"
QDRANT_URL = "<replace>"
QDRANT_API_KEY = "<replace>"

vector_db = Qdrant(
    collection=COLLECTION_NAME,
    url=QDRANT_URL,
    api_key=QDRANT_API_KEY,
)

Step 4: Creating the Knowledge Base

Here, you'll define the sources from which the agent will pull its knowledge. In this example, we're building a document analyzer agent that makes it easy to answer questions from a website. We'll use the Qdrant documentation website URL for indexing.

The WebsiteKnowledgeBase object interacts with the Qdrant vector database to store the indexed information from the provided URL. It is then loaded into the knowledge base for retrieval by the agent.

Note: Remember, we use the load function to index the data source into the knowledge base. This needs to be run just once per collection name; only run the load function again if you change the collection name and want to add new data.

URL = "https://qdrant.tech/documentation/overview/"

knowledge_base = WebsiteKnowledgeBase(
    urls = [URL],
    max_links = 10,
    vector_db = vector_db,
)

knowledge_base.load() # only run once; after the collection is created, comment this out
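If you would rather keep the call in place across re-runs, the loader can be parameterized instead of commented out; the recreate/upsert flags below are assumptions about Phidata's loader rather than something shown in the original walkthrough.

# Assumed flags: recreate=False keeps the existing collection,
# upsert=True only writes documents that are not already in Qdrant.
knowledge_base.load(recreate=False, upsert=True)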

Step 5: Define Your Agent

The Agent configures an LLM (GPT-4o) for response generation, a knowledge base for information retrieval, and an SQLite storage system to track interactions and responses as memory. It also sets up a DuckDuckGo search tool for additional web searches when needed. This setup forms the core AI agent capable of answering queries.

We'll set show_tool_calls to True to observe the backend runtime execution and see whether the query is routed to the knowledge base or the DuckDuckGo search tool. When you run this cell, it will create a database file where all messages are saved, since we enable memory storage and set add_history_to_messages to True.

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    knowledge=knowledge_base,
    tools=[DuckDuckGo()],

    show_tool_calls=True,
    markdown=True,

    storage=SqlAgentStorage(table_name="agentic_rag", db_file="agents_rag.db"),
    add_history_to_messages=True,
)

Step 6: Try Multiple Queries

Finally, the agent is ready to process user queries. By calling the print_response() function, you pass in a user query, and the agent responds by retrieving relevant information from the knowledge base and processing it. If the query cannot be answered from the knowledge base, it will use the search tool. Let's observe the behavior.

Query 1: From the knowledge base

agent.print_response(
  "what are the indexing methods talked about within the doc?", 
  stream=True
)

Query 2: Outside the knowledge base

agent.print_response(
  "who's Virat Kohli?", 
  stream=True
)
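Because the agent stores its conversation history in SQLite and add_history_to_messages is enabled, you can also probe the memory component with a follow-up turn; this extra query is an illustrative addition rather than part of the original walkthrough.

# The agent should answer from stored history rather than calling a tool again.
agent.print_response(
  "What was my previous question?",
  stream=True
)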

Advantages of Agentic RAG

Discover the key advantages of Agentic RAG, where intelligent agents and retrieval-augmented generation combine to optimize data retrieval and decision-making.

  • Enhanced reasoning capabilities for better response generation.
  • Intelligent tool selection based on query context, such as the knowledge base, DuckDuckGo, or any other tools from which we can fetch context to provide to the Agent.
  • Memory integration for improved context awareness, allowing the agent to remember and retrieve past conversation messages.
  • Better planning and task decomposition: the first part of an Agentic workflow is to take the task, break it down into sub-tasks, and then make better decisions and action plans.
  • Flexible integration with various data sources such as PDFs, websites, CSVs, docs, and many more.

Conclusion

Implementing Agentic RAG with memory components offers a reliable solution for building intelligent information retrieval systems and search engines. In this article, we explored what Agents and RAG are, and how to combine them. With Agentic RAG, query routing improves thanks to the decision-making capabilities of the Agents.

Key Takeaways

  • Discover how Agentic RAG with Phidata enhances AI by integrating memory, a knowledge base, and dynamic query handling.
  • Learn to implement Agentic RAG with Phidata for efficient information retrieval and adaptive response generation.
  • The Phidata knowledge library offers a streamlined implementation process with just ~30 lines of core code, along with multimodal models such as Gemini 2.0 Flash.
  • Memory components are crucial for maintaining context and improving response relevance.
  • Integration of multiple tools (knowledge base, web search) enables flexible information retrieval – vector databases like Qdrant provide advanced indexing capabilities for efficient search.

Frequently Asked Questions

Q1. Can Phidata handle multimodal tasks, and what tools does it integrate for this purpose?

A. Yes, Phidata is built to support multimodal AI agents capable of handling tasks involving images, videos, and audio. It integrates tools like DALL-E and Replicate for text-to-image or text-to-video generation, and uses multimodal LLMs such as GPT-4, Gemini 2.0, and Claude AI for image-to-text and video-to-text tasks.

Q2. What tools and frameworks are available for developing Agentic RAG systems?

A. Developing Agentic Retrieval-Augmented Generation (RAG) systems involves using various tools and frameworks that facilitate the integration of autonomous agents with retrieval and generation capabilities. Some of the tools and frameworks available for this purpose are LangChain, LlamaIndex, Phidata, CrewAI, and AutoGen.

Q3. Can Phidata integrate with external tools and knowledge bases?

A. Yes, Phidata allows the integration of various tools and knowledge bases. For instance, it can connect with financial data tools like YFinance for real-time stock analysis or web search tools like DuckDuckGo for retrieving up-to-date information. This flexibility enables the creation of specialized agents tailored to specific use cases.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author's discretion.

Data Scientist at AI Planet || YouTube - AIWithTarun || Google Developer Expert in ML || Won 5 AI hackathons || Co-organizer of TensorFlow User Group Bangalore || Pie & AI Ambassador at DeepLearningAI
