# Quickstart

## Basic usage

Define a SQLAlchemy model whose columns mirror the parameters your simulation
takes, then write a runner that saves its output to `params["result_file"]`.
Pass both to a `Store`:

```python
from pathlib import Path

import numpy as np

from entropic import Store, Base, Mapped


class SimResult(Base):
    __tablename__ = "results"

    n: Mapped[int]
    steps: Mapped[int]
    dt: Mapped[float]


def my_sim(params: dict) -> None:
    # Generate placeholder data, then save it to the path in params["result_file"].
    data = np.random.randn(params["n"], params["steps"])
    np.save(params["result_file"], data)


store = Store(
    runner=my_sim,
    result_cls=SimResult,
    results_dir="./results",
    db_url="sqlite:///./runs.sqlite3",
    file_suffix=".npy",
)

record = store.run_or_retrieve({"n": 100, "steps": 5000, "dt": 0.01})
data = np.load(record.result_file)
```

The first call runs the simulation. Every subsequent call with the same
parameters returns the cached row without re-running.

## The result record

Every `Store` method that returns a record returns an instance of your
`result_cls`. The four reserved columns from `Base` are always present; the
rest come from your model.

```python
record.id            # "a3f8c1d2e4b6f7a8" — 16-char hash, primary key
record.result_file   # "./results/a3f8c1d2e4b6f7a8.npy"
record.created_at    # datetime — UTC, set on insert
record.custom_data   # {"elapsed_seconds": 0.042}
record.n             # 100
record.steps         # 5000
record.dt            # 0.01
```
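
The `id` is derived from the parameters, but this page does not say how. The sketch below is only an assumption about how such an ID could be computed (canonical JSON plus SHA-256, truncated to 16 hex characters). `params_hash` is a hypothetical helper, not part of entropic's API.

```python
import hashlib
import json


def params_hash(params: dict, length: int = 16) -> str:
    """Hypothetical helper: derive a short, stable ID from a params dict."""
    # sort_keys makes the serialisation independent of key insertion order.
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:length]


a = params_hash({"n": 100, "steps": 5000, "dt": 0.01})
b = params_hash({"dt": 0.01, "steps": 5000, "n": 100})  # same values, different order
c = params_hash({"n": 100, "steps": 5000, "dt": 0.005})  # one value changed

print(a == b)  # same parameters give the same ID
print(a != c)  # different parameters give a different ID
```

Under this particular scheme, `100` and `100.0` serialise differently and would produce different IDs, so it pays to keep parameter types consistent between calls.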

## Retrieving without running

```python
record = store.retrieve({"n": 100, "steps": 5000, "dt": 0.01})
```

Returns the model instance on a hit, `None` on a miss.

## Forcing a re-run

`run` always invokes the runner. Identical parameters hash to the same row, so a
forced re-run overwrites the existing record (and result file) for that hash:

```python
record = store.run({"n": 100, "steps": 5000, "dt": 0.01})
```

## Deleting runs

```python
store.delete({"n": 100, "steps": 5000, "dt": 0.01})
```

Pass `remove_file=True` to also delete the result file from disk:

```python
store.delete({"n": 100, "steps": 5000, "dt": 0.01}, remove_file=True)
```

Returns `True` if a row was removed, `False` otherwise.

## Registering external files

If a result file was produced outside entropic, index it via `register`:

```python
store.register(
    {"n": 100, "steps": 5000, "dt": 0.01},
    result_file="./results/my_existing_run.npy",
)
```

The file must already exist. After registration the row is reachable via
`retrieve` like any other run.

## Parameter sweeps

`sweep` is the batch counterpart to `run_or_retrieve`: it takes an **iterable of
param dicts**, reuses cached entries, and only invokes the runner for new
parameter sets. It makes no assumption about how the sets relate, so any sweep
shape is just an iterable.

For the common full-product case, build the iterable with `expand_grid` — each
key maps to a list of candidate values, and it returns one dict per combination:

```python
from entropic import expand_grid

records = store.sweep(
    expand_grid({"n": [100], "steps": [5000], "dt": [0.01, 0.005, 0.001]})
)
# runs 3 combinations: (n=100, steps=5000, dt=0.01), (…, dt=0.005), (…, dt=0.001)
```

For a multi-axis product:

```python
records = store.sweep(expand_grid({"n": [50, 100], "dt": [0.01, 0.005]}))
# runs 4 combinations: (50, 0.01), (50, 0.005), (100, 0.01), (100, 0.005)
```
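
To make the expansion concrete, here is a standalone stand-in built on `itertools.product`. It is a sketch of the behaviour described above (one dict per combination), not entropic's actual implementation, and `expand_grid_sketch` is a hypothetical name.

```python
from itertools import product
from typing import Any, Iterator


def expand_grid_sketch(grid: dict[str, list[Any]]) -> Iterator[dict[str, Any]]:
    """Yield one params dict per combination of the candidate values."""
    keys = list(grid)
    # product() varies the last key fastest, like nested for-loops.
    for values in product(*(grid[key] for key in keys)):
        yield dict(zip(keys, values))


combos = list(expand_grid_sketch({"n": [50, 100], "dt": [0.01, 0.005]}))
for combo in combos:
    print(combo)
```

Because the sketch is a generator, it can be filtered or chained lazily before being handed to anything that accepts an iterable of dicts.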

Because `sweep` takes a plain iterable, non-product sweeps need no special
support — build the dicts however you like:

```python
# zipped / diagonal sweep
records = store.sweep(
    [{"n": n, "dt": dt} for n, dt in zip([50, 100], [0.01, 0.005])]
)

# filtered product (drop unstable regions)
grid = {"n": [50, 100], "dt": [0.01, 0.005]}
records = store.sweep(p for p in expand_grid(grid) if p["dt"] * p["n"] < 1.0)
```

To parallelise with Dask:

```python
from dask.distributed import Client

with Client() as dask_client:
    records = store.sweep(
        expand_grid({"n": [50, 100], "dt": [0.01, 0.005]}), client=dask_client
    )
```

## Custom metadata

Any keyword argument to `run`, `run_or_retrieve`, or `register` lands
on the row's `custom_data` JSON column:

```python
record = store.run_or_retrieve(
    {"n": 100, "steps": 5000, "dt": 0.01},
    git_sha="abc123",
    note="initial sweep",
)
record.custom_data
# {"elapsed_seconds": 0.042, "git_sha": "abc123", "note": "initial sweep"}
```

`elapsed_seconds` is added automatically on actual runs.
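
Since `custom_data` is a JSON column, it is reasonable to assume that metadata values need to be JSON-serialisable. The sketch below shows one way to coerce common Python types before passing them as keyword arguments. `json_safe` is a hypothetical helper, not part of entropic, and the metadata values are purely illustrative.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Any


def json_safe(value: Any) -> Any:
    """Hypothetical helper: coerce common non-JSON types to JSON-friendly ones."""
    if isinstance(value, Path):
        return value.as_posix()
    if isinstance(value, datetime):
        return value.isoformat()
    if isinstance(value, dict):
        return {str(k): json_safe(v) for k, v in value.items()}
    if isinstance(value, (list, tuple, set)):
        return [json_safe(v) for v in value]
    return value


# Illustrative metadata only; none of these names come from entropic.
metadata = {
    "config_path": Path("configs/base.toml"),
    "started_at": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "tags": ("baseline", "cpu"),
}
safe_metadata = {key: json_safe(value) for key, value in metadata.items()}

# json.dumps raises TypeError if anything is still unserialisable.
encoded = json.dumps(safe_metadata)
```

The cleaned dict can then be unpacked into a call, for example `store.run_or_retrieve(params, **safe_metadata)`.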

## Logging

By default entropic installs only a `NullHandler`, so it is silent. To enable logging, attach a handler and set a level:

```python
import logging
logging.getLogger("entropic").addHandler(logging.StreamHandler())
logging.getLogger("entropic").setLevel(logging.INFO)
```

This logs cache hits, run completions, ingestion, and file operations.
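
For timestamps and level names in the output, a formatter can be attached to the handler. This variant uses only the standard `logging` module; the format string is just one reasonable choice, and the `"entropic"` logger name is taken from the snippet above.

```python
import logging
import sys

logger = logging.getLogger("entropic")

# Send records to stderr with a timestamp, logger name, and level.
handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s")
)

logger.addHandler(handler)
logger.setLevel(logging.INFO)
```

Calling `logger.removeHandler(handler)` later detaches this handler again.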