Persistence¶

ParaLLeM saves LLM responses locally by hashing request content. On subsequent runs, a matching hash returns the cached response — no API call is made.

Every call to ask_llm computes a SHA-256 hash of:

The system prompt (instructions)
All input documents (strings, images, function call outputs, …)
LLM name (ie. "gpt-5-nano")
Any additional salt terms (see below)

If ParaLLeM has already seen your hash, then the previous value is returned immediately. Otherwise, a request is sent to the provider and stored.

with pllm.resume_directory(
    ".pllm/myproject",
    provider="openai",
    strategy="sync"
) as orch:
    with orch.agent() as agt:
        resp = agt.ask_llm("Name a prime number.")
        print(resp.final_answer)  # live on first run, instant on subsequent runs

What is and isn't hashed¶

By default, only message content and LLM name are hashed. Config settings that are not hashed:

Tool definitions
Structured output
Provider-specific keyword arguments (ie. reasoning_level)

If there is a config change but the same prompt is used, there could be a hash collision. To avoid this, customize hash_by or compute a custom salt.

`hash_by`¶

hash_by is a list of named terms to fold into the hash. By default, hash_by=["llm"]. Available options are:

"llm": Include the LLM identity (model name/provider)
"tool_names": Include tool names only
"structured_output": Include structured output schema
"kwargs": Include extra kwargs passed to the request
"all": Include everything (equivalent to all of the above)

Message content, like instructions and documents, are always hashed.

agt.ask_llm(
    "Search the web",
    tools=[{"type": "web_search"}],
    hash_by=["tool_names"]
)

Using hash_by=["tool_names"] ensures that different tool sets produce separate cache entries.

`salt`¶

salt can distinguish otherwise identical content. Use it to bypass a cached result.

agt.ask_llm("Name a prime.")
agt.ask_llm("Name a prime.", salt=1)  # Will not collide

MessageState `save` and `load`¶

Another tool for persistent conversations: MessageState can be saved to disk and restored on subsequent runs.

def chatbot(agt: pllm.AgentContext):
    msgs = agt.get_msg_state().load()

    agt.print("Current messages:", msgs)
    out = input("Send a message: ")
    while out:
        msgs.append(out)
        msgs.ask_llm()
        agt.print("Response:", msgs[-1].resolve())
        out = input("Send a message: ")

    msgs.save()

See also: memoize for non-deterministic blocks.

Persistence¶

What is and isn't hashed¶

hash_by¶

salt¶

MessageState save and load¶

`hash_by`¶

`salt`¶

MessageState `save` and `load`¶