Local development

Iterate on env.py and benchanything.json on your machine before submitting to the platform. Local runs use Ollama — no cloud API keys required.

Prerequisites

  1. CLI: pip install swecc-mesocosm (use pip — see Getting started — Install)

  2. Project: mesocosm init in your env directory (or use an existing repo with the same layout)

  3. Ollama: Install from ollama.com, then pull a model:

    ollama pull llama3.2

  4. Ollama running: The desktop app usually starts the server; otherwise run ollama serve.
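
To confirm steps 3 and 4 before starting a run, a small script can ask the Ollama server which models are pulled. This is a minimal sketch, not part of the mesocosm CLI. It assumes Ollama is listening on its default address (http://localhost:11434) and uses its /api/tags endpoint; the helper names pulled_models and has_model are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default address.
OLLAMA_URL = "http://localhost:11434"


def pulled_models(base_url: str = OLLAMA_URL, timeout: float = 3.0) -> list[str]:
    """Return the names of locally pulled models, e.g. ['llama3.2:latest'].

    Raises OSError (urllib.error.URLError) if the server is not reachable.
    """
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        payload = json.load(resp)
    return [m["name"] for m in payload.get("models", [])]


def has_model(names: list[str], wanted: str) -> bool:
    """True if `wanted` is in `names`; a bare name is treated as the ':latest' tag."""
    if ":" not in wanted:
        wanted += ":latest"
    return wanted in names
```

Calling has_model(pulled_models(), "llama3.2") should return True once the pull has finished and the server is up. A connection error from pulled_models() usually means ollama serve is not running.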

About requirements.txt

The file created by mesocosm init lists optional libraries that your environment imports (for example NumPy). You do not need to run pip install -r requirements.txt just to run adapter.py and mesocosm run local, because the CLI already includes the HTTP stack. If env.py does import one of those libraries, install it locally first.

When you submit to the platform, Mesocosm installs those dependencies in the cloud runtime.

Dev loop

Optional but recommended when using a local SWECC stack:

export MESOCOSM_LOCAL=1
mesocosm doctor --local

mesocosm doctor --local checks that your adapter responds on port 8765 and, if you have a local bench-api running, that it responds on port 8010.

Terminal 1 — env adapter

python adapter.py
# Health check: http://localhost:8765/health
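
When scripting the two-terminal loop, it can help to wait until the adapter answers its health check before starting episodes. The sketch below polls the health URL shown above. It assumes the endpoint returns a 2xx status once the adapter is ready, which the text does not state explicitly; wait_for_health is a hypothetical helper, not a mesocosm API.

```python
import time
import urllib.request

# Health endpoint of the adapter started with `python adapter.py`.
HEALTH_URL = "http://localhost:8765/health"


def wait_for_health(url: str = HEALTH_URL, timeout: float = 15.0,
                    interval: float = 0.5) -> bool:
    """Poll `url` until it answers with a 2xx status or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=2.0) as resp:
                if 200 <= resp.status < 300:
                    return True
        except OSError:
            # Not ready yet: connection refused, timeout, or an HTTP error
            # status (urllib.error.HTTPError is a subclass of OSError).
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

If wait_for_health() returns False, check Terminal 1 for a traceback from adapter.py, or pass a different URL if you changed the port.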

Terminal 2 — benchmark episodes

mesocosm run local
# Equivalent: mesocosm run local --model ollama/llama3.2

run local:

  • Reads benchanything.json for the binding vow and scoring rules
  • Calls your adapter at http://localhost:8765 by default
  • Uses an Ollama model (ollama/… prefix required)
  • Does not register the domain or create platform runs

Useful flags

Flag             Default                       Purpose
--model          ollama/llama3.2               LiteLLM model id; must start with ollama/ and match a pulled model
--episodes       5                             Number of episodes
--env-url        http://localhost:8765         Adapter base URL if you changed the port
--manifest       benchanything.json            Alternate manifest path
--domain-id      from manifest or folder name  Override domain id for local runs
--system-prompt  -                             Extra instruction for the agent
--temperature    0.0                           Sampling temperature
--max-tokens     512                           Max tokens per step
--parallel       1                             Max parallel episodes
--seeds          -                             Space-separated integer seeds
--quiet          -                             Less progress output

Full reference: Command reference — run local.
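
For repeatable experiments, the flags above can be assembled programmatically. The following is a sketch only: build_run_local_cmd is a hypothetical helper, and it assumes that --seeds takes its values as separate arguments (for example --seeds 1 2 3), which is one reading of "space-separated". Check the command reference before relying on it.

```python
import shlex


def build_run_local_cmd(model="ollama/llama3.2", episodes=5, seeds=None,
                        env_url=None, temperature=None):
    """Build an argv list for `mesocosm run local` from the documented flags."""
    # The docs require an ollama/ prefix for local runs.
    if not model.startswith("ollama/"):
        raise ValueError(f"local runs need an ollama/ model id, got {model!r}")
    cmd = ["mesocosm", "run", "local", "--model", model, "--episodes", str(episodes)]
    if env_url:
        cmd += ["--env-url", env_url]
    if temperature is not None:
        cmd += ["--temperature", str(temperature)]
    if seeds:
        # Assumption: seeds are passed as separate arguments after --seeds.
        cmd += ["--seeds", *map(str, seeds)]
    return cmd


# Print the command line for a 10-episode run with fixed seeds.
print(shlex.join(build_run_local_cmd(episodes=10, seeds=[1, 2, 3])))
# prints: mesocosm run local --model ollama/llama3.2 --episodes 10 --seeds 1 2 3
```

The returned list can be passed to subprocess.run(cmd, check=True) to start the run. Rejecting a model id without the ollama/ prefix early catches the most common mistake before any episode starts.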

Ship to the platform

When local runs look good:

mesocosm auth login
mesocosm env submit \
  --name "My env" \
  --github-url https://github.com/you/your-repo
 
mesocosm env list    # note domain_id when status is ready
 
mesocosm run create \
  --domain DOMAIN_ID \
  --vow-version 1.0.0 \
  --model gemini/gemini-3.1-flash-lite \
  --episodes 5

Platform runs use cloud models on SWECC infrastructure. Ollama is only for your machine.

See Submitting environments and Running benchmarks.

Legacy domain.py repos

Repos created before benchanything.json scaffolding was introduced may use domain.py with DOMAIN_CONFIG. You can still register these manually:

mesocosm register path/to/domain.py [--auto-id] [--publish]

New projects should prefer mesocosm init followed by mesocosm env submit.