RAG

By design, we do not offer a native API for retrieval-augmented generation (RAG).

There are many vector stores (Chroma, FAISS, Qdrant, Weaviate, Milvus, Pinecone, Elasticsearch, pgvector); we do not want to enforce just one, and we don't want to bundle a vector store library. To keep ParaLLeM lightweight, we consider RAG outside our scope.

Nonetheless, you can access this functionality via function calling. Here is an in-memory chromadb example:

pip install chromadb python-dotenv

import chromadb
from dotenv import load_dotenv
import parallem as pllm

# RAG implementation.
# parallem does not bundle any RAG libraries, but RAG can be implemented via function calling.

client = chromadb.Client()
collection = client.create_collection(name="rag_demo")

collection.add(
    ids=["doc1", "doc2", "doc3"],
    documents=[
        "Refunds are available within 30 days of purchase with a valid receipt.",
        "Digital products are non-refundable after download unless required by law.",
        "Our offices are based in Palo Alto, California.",
    ],
)


def vector_store_tool(query: str, k: int = 2) -> str:
    """Given a query, retrieves relevant documents from the vector store."""
    result = collection.query(query_texts=[query], n_results=k)
    docs = result["documents"][0]
    return "\n".join(docs)


# Begin parallem logic


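def rag_agent(agt: pllm.AgentContext, query: str):
    conv = agt.get_msg_state()
    # First turn: the model sees the query and may request a call to vector_store_tool.
    resp = conv.ask_llm(
        query,
        instructions="Only supply information relevant to the user's question.",
        tools=[vector_store_tool],
    )
    # Execute the requested function call(s) against the vector store.
    conv.ask_functions(vector_store_tool=vector_store_tool)
    # Second turn: the model answers using the retrieved documents.
    conv.ask_llm()
    print(resp.function_calls)
    return conv[-1].final_answer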


if __name__ == "__main__":
    load_dotenv()
    with pllm.resume_directory(
        ".pllm/example/rag", provider="google"
    ) as orch:
        with orch.agent() as agt:
            out = rag_agent(agt, "What is the refund policy for digital products?")
            print(out)

In the above example, retrieval is exposed to the model as a function call. Alternatively, you can query the vector store yourself and pass the retrieved documents to the model along with the query. The choice is up to you.

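def simpler_rag_agent(agt: pllm.AgentContext, query: str):
    # Query the vector store directly instead of exposing it as a tool.
    documents = vector_store_tool(query)
    # vector_store_tool returns one newline-joined string, so pass it as a single item.
    resp = agt.ask_llm(
        [documents, query]
    )
    return resp.final_answer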
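
Because the retrieval step is just a function from a query string to text, any store can sit behind the same signature. The following is a minimal sketch, using only the standard library, of a stand-in retriever that ranks documents by shared words rather than embeddings. The names `keyword_retriever` and `build_prompt` are hypothetical helpers for illustration and are not part of ParaLLeM or chromadb.

```python
import re

DOCS = [
    "Refunds are available within 30 days of purchase with a valid receipt.",
    "Digital products are non-refundable after download unless required by law.",
    "Our offices are based in Palo Alto, California.",
]


def _tokens(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_retriever(query: str, k: int = 2) -> str:
    """Hypothetical stand-in for vector_store_tool: rank by word overlap, not embeddings."""
    q = _tokens(query)
    # sorted() is stable, so documents with equal overlap keep their original order.
    ranked = sorted(DOCS, key=lambda d: len(q & _tokens(d)), reverse=True)
    return "\n".join(ranked[:k])


def build_prompt(context: str, query: str) -> str:
    """Hypothetical helper: place the retrieved context ahead of the user's question."""
    return f"Context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    question = "What is the refund policy for digital products?"
    print(build_prompt(keyword_retriever(question), question))
```

Under this assumption, swapping in chromadb, FAISS, or another store only changes the body of the retriever; the function handed to the model, or called directly before the model, keeps the same `(query, k)` shape.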