Persistence

ParaLLeM saves LLM responses locally by hashing request content. On subsequent runs, a matching hash returns the cached response — no API call is made.

Every call to ask_llm computes a SHA-256 hash of:

  • The system prompt (instructions)
  • All input documents (strings, images, function call outputs, …)
  • LLM name (e.g. "gpt-5-nano")
  • Any additional salt terms (see below)

If ParaLLeM has already seen your hash, the previous response is returned immediately. Otherwise, a request is sent to the provider and the response is stored under that hash.

with pllm.resume_directory(
    ".pllm/myproject",
    provider="openai",
    strategy="sync"
) as orch:
    with orch.agent() as agt:
        resp = agt.ask_llm("Name a prime number.")
        print(resp.final_answer)  # live on first run, instant on subsequent runs
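
To make the lookup concrete, here is a minimal sketch of content-hash caching using only the standard library. It is not ParaLLeM's implementation: `cache_key`, `ResponseCache`, and `fake_provider` are hypothetical names, and the JSON serialization of the hashed terms is an assumption made for illustration.

```python
import hashlib
import json


def cache_key(instructions, documents, llm, salt=None):
    """Hypothetical helper: derive a SHA-256 key from the request content."""
    payload = json.dumps(
        {
            "instructions": instructions,
            "documents": documents,
            "llm": llm,
            "salt": salt,
        },
        sort_keys=True,  # stable key order, so equal content gives an equal hash
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ResponseCache:
    """Hypothetical in-memory stand-in for the on-disk response store."""

    def __init__(self):
        self._store = {}
        self.live_calls = 0

    def ask(self, prompt, send, *, instructions="", llm="gpt-5-nano", salt=None):
        key = cache_key(instructions, [prompt], llm, salt)
        if key in self._store:
            # Seen before: return the stored response without calling the provider.
            return self._store[key]
        self.live_calls += 1
        response = send(prompt)
        self._store[key] = response
        return response


def fake_provider(prompt):
    """Stand-in for a real API call (assumption for this sketch)."""
    return f"echo: {prompt}"


cache = ResponseCache()
first = cache.ask("Name a prime number.", fake_provider)
second = cache.ask("Name a prime number.", fake_provider)  # served from the cache
salted = cache.ask("Name a prime number.", fake_provider, salt=1)  # new key, live call
```

In this sketch the second call never reaches `fake_provider`, while the salted call does, because the salt changes the key.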

What is and isn't hashed

By default, only message content and LLM name are hashed. Config settings that are not hashed:

  • Tool definitions
  • Structured output
  • Provider-specific keyword arguments (e.g. reasoning_level)

If you change one of these settings but keep the same prompt, the hash stays the same, so the response cached under the old configuration is returned. To avoid this, customize hash_by or supply a custom salt.

hash_by

hash_by is a list of named terms to fold into the hash. By default, hash_by=["llm"]. Available options are:

  • "llm": Include the LLM identity (model name/provider)
  • "tool_names": Include tool names only
  • "structured_output": Include structured output schema
  • "kwargs": Include extra kwargs passed to the request
  • "all": Include everything (equivalent to all of the above)

Message content, such as instructions and documents, is always hashed.

agt.ask_llm(
    "Search the web",
    tools=[{"type": "web_search"}],
    hash_by=["tool_names"]
)

Using hash_by=["tool_names"] ensures that different tool sets produce separate cache entries.
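
The effect of folding tool names into the hash can be sketched in plain Python. This is an illustration under stated assumptions, not ParaLLeM's code: `key_with_hash_by` is a hypothetical function, only the "llm", "tool_names", and "all" terms are modeled, and each tool's name is assumed to be its "type" field.

```python
import hashlib
import json


def key_with_hash_by(prompt, *, llm, tools=(), hash_by=("llm",)):
    """Hypothetical sketch: hash the content plus only the terms named in hash_by."""
    terms = {"content": prompt}  # message content is always part of the hash
    if "llm" in hash_by or "all" in hash_by:
        terms["llm"] = llm
    if "tool_names" in hash_by or "all" in hash_by:
        # Assumption: a tool's name is its "type" field.
        terms["tool_names"] = sorted(tool["type"] for tool in tools)
    blob = json.dumps(terms, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


search = [{"type": "web_search"}]

# With the default hash_by, adding a tool does not change the key.
plain_default = key_with_hash_by("Search the web", llm="gpt-5-nano")
tools_default = key_with_hash_by("Search the web", llm="gpt-5-nano", tools=search)

# With "tool_names" in hash_by, different tool sets give different keys.
plain_named = key_with_hash_by(
    "Search the web", llm="gpt-5-nano", hash_by=("tool_names",)
)
tools_named = key_with_hash_by(
    "Search the web", llm="gpt-5-nano", tools=search, hash_by=("tool_names",)
)
```

Here `plain_default` and `tools_default` are identical, which is the stale-cache case described earlier, while `plain_named` and `tools_named` differ.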

salt

salt distinguishes requests whose content is otherwise identical. Use it to force a fresh request even when a cached result already exists for the same content.

agt.ask_llm("Name a prime.")
agt.ask_llm("Name a prime.", salt=1)  # Will not collide

MessageState save and load

MessageState is another tool for persistent conversations: it can be saved to disk and restored on subsequent runs.

def chatbot(agt: pllm.AgentContext):
    msgs = agt.get_msg_state().load()

    agt.print("Current messages:", msgs)
    out = input("Send a message: ")
    while out:
        msgs.append(out)
        msgs.ask_llm()
        agt.print("Response:", msgs[-1].resolve())
        out = input("Send a message: ")

    msgs.save()

See also: memoize for non-deterministic blocks.