Research lane // live experiment

Local Model Lab

DriftLoom needed a place to compare local models without turning one internal GPU box into an open buffet for the internet. So this lab publishes bounded benchmark snapshots, model metadata, and operator notes instead of exposing a public prompt cannon.

live Research ollama local-models benchmarks history-aware
  • Compare: see what a model costs before pretending it is a free upgrade
  • Document: turn one-off tests into reusable receipts
  • Protect: keep the public site read-only so the GPU does not become community property

The lab now keeps bounded run history instead of overwriting reality every time the runner wakes up. Same guarded public surface, better receipts, less goldfish-memory infrastructure.
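A minimal sketch of what "bounded run history" could look like, assuming the history lives in a single JSON file holding a list of run records. The file layout, the `append_run` helper, and the cap of 20 runs are all hypothetical; the lab's actual runner and retention limit are not described here.

```python
import json
from pathlib import Path

MAX_RUNS = 20  # assumed retention cap; the lab's real limit is not stated


def append_run(history_path: Path, run: dict, max_runs: int = MAX_RUNS) -> list:
    """Append one run record and keep only the newest max_runs entries."""
    if max_runs < 1:
        raise ValueError("max_runs must be at least 1")
    if history_path.exists():
        history = json.loads(history_path.read_text(encoding="utf-8"))
    else:
        history = []
    history.append(run)
    history = history[-max_runs:]  # drop the oldest runs beyond the cap
    # Write to a temp file, then swap it in, so readers never see a half-written file.
    tmp_path = history_path.with_name(history_path.name + ".tmp")
    tmp_path.write_text(json.dumps(history, indent=2), encoding="utf-8")
    tmp_path.replace(history_path)
    return history
```

The point of the cap is that the file stays small enough to publish as a static, read-only snapshot, while still holding more than one run.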


Immediate takeaways

Finding signal

Deriving rankings from the latest receipts.

Compact scoreboard

Working models, sorted for actual operator usefulness

Speed still matters, but so do footprint and whether the thing keeps tripping over reality.
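One way to fold speed, footprint, and reliability into a single ordering is sketched below. The `ModelResult` fields, the `usefulness` formula, and the definition of "working" are illustrative assumptions, not the scoring the lab actually uses.

```python
from dataclasses import dataclass


@dataclass
class ModelResult:
    name: str
    tokens_per_sec: float  # throughput; higher is better
    size_gb: float         # memory footprint; lower is better
    failures: int          # prompts that errored or timed out
    prompts: int           # prompts attempted


def usefulness(r: ModelResult) -> float:
    """Hypothetical score: throughput per GB, discounted by failure rate."""
    success_rate = 1.0 - r.failures / r.prompts
    # Squaring the success rate punishes flaky models harder than slow ones.
    return (r.tokens_per_sec / r.size_gb) * success_rate ** 2


def scoreboard(results: list[ModelResult]) -> list[ModelResult]:
    # "Working" here means at least one prompt succeeded; that cutoff is an assumption.
    working = [
        r for r in results
        if r.prompts > 0 and r.size_gb > 0 and r.failures < r.prompts
    ]
    return sorted(working, key=usefulness, reverse=True)
```

Under this scoring, a small model that finishes every prompt can outrank a faster, larger one that keeps failing, which is the trade-off the scoreboard is meant to surface.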


How to read this without lying to yourself

Models under test
Prompt suite
Latest run results
Trend summary
Recent run history
Operator notes

Local model performance is noisy. A single snapshot is a mood. A retained sequence is evidence.
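To make the "mood versus evidence" distinction concrete, here is a small sketch that summarizes a retained sequence of throughput samples for one model. The `trend` helper and its output keys are hypothetical, and the samples are assumed to be tokens-per-second readings from successive runs.

```python
from statistics import median


def trend(samples: list[float]) -> dict:
    """Summarize retained tokens-per-second samples for one model."""
    if not samples:
        raise ValueError("need at least one run")
    mid = median(samples)
    low, high = min(samples), max(samples)
    return {
        "runs": len(samples),
        "median": mid,
        "min": low,
        "max": high,
        # Relative spread: width of the observed range compared to the median.
        "spread": (high - low) / mid if mid else 0.0,
    }
```

With a single run the spread is always zero, which says nothing about stability. Only with several retained runs does the spread start to show how much of a reading is noise.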

No public prompts, no open benchmark API, no community GPU free-for-all. Read-only receipts are still the right shape for this box.