Quickstart
Basic usage
Define a SQLAlchemy model whose columns mirror the params your simulation takes,
then a runner that writes its output to params["result_file"]. Hand both to a
Store:
from pathlib import Path

import numpy as np
from entropic import Store, Base, Mapped


class SimResult(Base):
    __tablename__ = "results"

    n: Mapped[int]
    steps: Mapped[int]
    dt: Mapped[float]


def my_sim(params: dict) -> None:
    data = np.random.randn(params["n"], params["steps"])
    np.save(params["result_file"], data)


store = Store(
    runner=my_sim,
    result_cls=SimResult,
    results_dir="./results",
    db_url="sqlite:///./runs.sqlite3",
    file_suffix=".npy",
)

record = store.run_or_retrieve({"n": 100, "steps": 5000, "dt": 0.01})
data = np.load(record.result_file)
The first call runs the simulation. Every subsequent call with the same parameters returns the cached row without re-running.
The result record
Every Store method that returns a record returns an instance of your
result_cls. The four reserved columns from Base are always present; the
rest come from your model.
record.id           # "a3f8c1d2e4b6f7a8": 16-char hash, primary key
record.result_file  # "./results/a3f8c1d2e4b6f7a8.npy"
record.created_at   # datetime (UTC), set on insert
record.custom_data  # {"elapsed_seconds": 0.042}
record.n            # 100
record.dt           # 0.01
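The 16-character id suggests a truncated digest of the parameter dict. entropic's actual hashing scheme is not described here, so the following is only a sketch of one common approach: serialize the params to canonical JSON and truncate a SHA-256 hex digest. The name `params_hash`, the choice of SHA-256, and the JSON canonicalization are all assumptions made for illustration.

```python
import hashlib
import json


def params_hash(params: dict, length: int = 16) -> str:
    """Hypothetical helper: deterministic short hash of a params dict.

    Not entropic's implementation. It only illustrates how identical
    parameters can map to the same primary key.
    """
    # Sorted keys and fixed separators make the serialization
    # independent of dict insertion order.
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return digest[:length]


a = params_hash({"n": 100, "steps": 5000, "dt": 0.01})
b = params_hash({"dt": 0.01, "steps": 5000, "n": 100})   # same params, different key order
c = params_hash({"n": 100, "steps": 5000, "dt": 0.005})  # different dt
```

In this sketch `a` and `b` are equal while `c` differs, which is the property a parameter-keyed cache relies on.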
Retrieving without running
record = store.retrieve({"n": 100, "steps": 5000, "dt": 0.01})
Returns the model instance on a hit, None on a miss.
Forcing a re-run
run always invokes the runner. Identical params hash to the same id, so a forced
re-run overwrites the existing record (and result file) for that hash:
record = store.run({"n": 100, "steps": 5000, "dt": 0.01})
Deleting runs
store.delete({"n": 100, "steps": 5000, "dt": 0.01})
Pass remove_file=True to also delete the result file from disk:
store.delete({"n": 100, "steps": 5000, "dt": 0.01}, remove_file=True)
Returns True if a row was removed, False otherwise.
Registering external files
If a result file was produced outside entropic, index it via register:
store.register(
    {"n": 100, "steps": 5000, "dt": 0.01},
    result_file="./results/my_existing_run.npy",
)
The file must already exist. After registration the row is reachable via
retrieve like any other run.
Parameter sweeps
sweep is the batch counterpart to run_or_retrieve: it takes an iterable of
param dicts, reuses cached entries, and only invokes the runner for new
parameter sets. It makes no assumption about how the sets relate, so any sweep
shape is just an iterable.
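The reuse behaviour described above can be pictured with a small stand-in. This is a sketch under stated assumptions, not entropic's code: `sweep_sketch` and `fake_runner` are hypothetical names, and an in-memory dict replaces the database that the real Store uses.

```python
from typing import Callable, Iterable


def sweep_sketch(runner: Callable[[dict], object], param_sets: Iterable[dict]) -> list:
    """Hypothetical stand-in for a cached sweep loop (in-memory cache only)."""
    cache: dict = {}
    results = []
    for params in param_sets:
        key = tuple(sorted(params.items()))  # order-independent cache key
        if key not in cache:
            cache[key] = runner(params)  # the runner is called only on a miss
        results.append(cache[key])
    return results


calls = []


def fake_runner(params: dict) -> float:
    calls.append(params)  # record each real invocation
    return params["n"] * params["dt"]


out = sweep_sketch(
    fake_runner,
    [{"n": 50, "dt": 0.01}, {"n": 100, "dt": 0.01}, {"n": 50, "dt": 0.01}],
)
```

Three parameter sets go in and three results come out, but `fake_runner` is invoked only twice because the third set repeats the first.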
For the common full-product case, build the iterable with expand_grid — each
key maps to a list of candidate values, and it returns one dict per combination:
from entropic import expand_grid
records = store.sweep(
    expand_grid({"n": [100], "steps": [5000], "dt": [0.01, 0.005, 0.001]})
)
# runs 3 combinations: (n=100, steps=5000, dt=0.01), (…, dt=0.005), (…, dt=0.001)
For a multi-axis product:
records = store.sweep(
    expand_grid({"n": [50, 100], "steps": [5000], "dt": [0.01, 0.005]})
)
# runs 4 (n, dt) combinations: (50, 0.01), (50, 0.005), (100, 0.01), (100, 0.005)
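For intuition, a full product like this can be built with `itertools.product`. The sketch below is not entropic's `expand_grid`; `expand_grid_sketch` is a hypothetical name, and whether the real function returns a list or a lazy iterator is not stated here.

```python
from itertools import product
from typing import Iterator


def expand_grid_sketch(grid: dict) -> Iterator[dict]:
    """Hypothetical equivalent: yield one dict per combination of values."""
    keys = list(grid)
    # product() varies the last axis fastest, like nested for-loops.
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


combos = list(expand_grid_sketch({"n": [50, 100], "dt": [0.01, 0.005]}))
# combos == [
#     {"n": 50, "dt": 0.01}, {"n": 50, "dt": 0.005},
#     {"n": 100, "dt": 0.01}, {"n": 100, "dt": 0.005},
# ]
```

Because the result is an ordinary iterable of dicts, it can be filtered, sliced, or concatenated with other parameter sets before being passed on.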
Because sweep takes a plain iterable, non-product sweeps need no special
support — build the dicts however you like:
# zipped / diagonal sweep
records = store.sweep(
    [{"n": n, "steps": 5000, "dt": dt} for n, dt in zip([50, 100], [0.01, 0.005])]
)

# filtered product (drop unstable regions)
grid = {"n": [50, 100], "steps": [5000], "dt": [0.01, 0.005]}
records = store.sweep(p for p in expand_grid(grid) if p["dt"] * p["n"] < 1.0)
To parallelise with Dask:
from dask.distributed import Client

with Client() as dask_client:
    records = store.sweep(
        expand_grid({"n": [50, 100], "steps": [5000], "dt": [0.01, 0.005]}),
        client=dask_client,
    )
Custom metadata
Any extra keyword argument passed to run, run_or_retrieve, or register is stored
in the row’s custom_data JSON column:
record = store.run_or_retrieve(
    {"n": 100, "steps": 5000, "dt": 0.01},
    git_sha="abc123",
    note="initial sweep",
)
record.custom_data
# {"elapsed_seconds": 0.042, "git_sha": "abc123", "note": "initial sweep"}
elapsed_seconds is added automatically on actual runs.
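A value such as git_sha would typically be computed rather than typed by hand. One possible way to obtain it is sketched below; `current_git_sha` is a hypothetical helper, not part of entropic, and it assumes the git command-line tool is the source of truth.

```python
import subprocess
from typing import Optional


def current_git_sha() -> Optional[str]:
    """Return the current commit SHA, or None if it cannot be determined."""
    try:
        proc = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, or the working directory is not a repository
        return None
    return proc.stdout.strip()


sha = current_git_sha()
# Extra keyword arguments to pass along, e.g.
# store.run_or_retrieve(params, **metadata)
metadata = {"git_sha": sha} if sha is not None else {}
```

Since custom_data is a JSON column, it seems safest to keep metadata values to JSON-serializable types such as strings, numbers, lists, and dicts.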
Logging
entropic attaches a NullHandler by default, so it is silent until you add a handler. To enable logging:
import logging
logging.getLogger("entropic").addHandler(logging.StreamHandler())
logging.getLogger("entropic").setLevel(logging.INFO)
This logs cache hits, run completions, ingestion, and file operations.
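For longer sweeps, timestamps in the log output can help. The following is a slightly fuller setup using only the standard library; the format string is an arbitrary choice, and the guard against duplicate handlers is a precaution for code that may run more than once (for example in a notebook), not something entropic requires.

```python
import logging

logger = logging.getLogger("entropic")

handler = logging.StreamHandler()  # writes to stderr by default
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
)

# Avoid attaching a second stream handler if this setup runs again.
if not any(isinstance(h, logging.StreamHandler) for h in logger.handlers):
    logger.addHandler(handler)
logger.setLevel(logging.INFO)
```

Raising the level to logging.WARNING later would quieten routine messages without removing the handler.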