Command reference

Complete parameter reference for the mesocosm CLI. Install with pip install swecc-mesocosm (see Getting started — Install).

See also: Mesocosm CLI, Getting started, Local development.

Run mesocosm --help or mesocosm run --help for the built-in command tree.


Quick reference

| Command | Arguments / flags | Description |
| --- | --- | --- |
| (root) | -V, --version | Show mesocosm version and exit |
| auth login | --server-url | Log in; prompts for username and password |
| auth token | | Print saved member JWT for curl or scripts |
| auth guest | --bench-url | Create guest session on bench-api |
| auth whoami | --bench-url | Show current principal via GET /v1/me |
| auth logout | | Clear saved credentials file |
| team create | --name; --use | Create team; optionally set active |
| team join | CODE | Join team using invite code |
| team list | | List teams you belong to |
| team show | TEAM_ID | Print team details as JSON |
| team use | TEAM_ID | Set active team in credentials |
| team clear | | Clear active team; use solo default |
| team runs | TEAM_ID | List runs for a team |
| team code show | TEAM_ID | Show team including join code |
| team code regenerate | TEAM_ID | Rotate team join invite code |
| team members remove | TEAM_ID; --user-id | Remove member from team (owner) |
| team transfer | TEAM_ID; --user-id | Transfer team ownership to user |
| team leave | TEAM_ID | Leave a team you belong to |
| team delete | TEAM_ID | Delete a team (owner) |
| init | --dir; --force | Scaffold benchanything.json, adapter, env, showcase |
| run local | see below | Bench locally with Ollama and benchanything.json |
| env submit | --name; --github-url; … | Submit GitHub repo as developer environment |
| env list | --team; --solo | List your developer environments |
| env delete | ENV_ID | Delete a developer environment (member auth) |
| run create | --domain; --vow-version; … | Start platform bench run via API |
| run export | RUN_ID; -o / --output | Download run JSON for showcase or replay |
| register | domain.py; --auto-id; --publish | Legacy register domain.py; prefer env submit |
| doctor | --base-url; --local | Check bench-api reachability; local checks adapter |
| validate | FILE or - | Validate domain JSON against policy constraints |
| eval test | --domain-id; … | Run single test episode via bench-api |
| eval run | --domain-id; … | Run multi-episode eval with scoring aggregation |
| run get | RUN_ID; --base-url | Fetch run status and aggregate scores |
| run episodes | RUN_ID; --traces; --base-url | List episodes for a run; optional traces |

Global configuration {#global-configuration}

Credentials file

| Item | Description |
| --- | --- |
| Default path | ~/.config/swecc/bench_credentials.json |
| Override | Set SWECC_BENCH_CREDENTIALS to another file path. |
| Typical keys | mode (member or guest), token, server_url, bench_url, active_team_id (optional). |

Written by auth login, auth guest, team use, team clear, and team create --use. Cleared by auth logout.
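As an illustration of the lookup described above, here is a minimal Python sketch that locates and reads the credentials file from a script. The helper names `credentials_path` and `load_credentials` are hypothetical (they are not part of the mesocosm package), and returning an empty dict when the file is missing is an assumption made for convenience.

```python
import json
import os
from pathlib import Path

# Default location documented above; "~" is expanded at call time.
DEFAULT_CREDENTIALS = "~/.config/swecc/bench_credentials.json"


def credentials_path(env=None):
    """Return the credentials path, honoring SWECC_BENCH_CREDENTIALS when set."""
    env = os.environ if env is None else env
    override = env.get("SWECC_BENCH_CREDENTIALS")
    return Path(os.path.expanduser(override or DEFAULT_CREDENTIALS))


def load_credentials(env=None):
    """Read the credentials JSON.

    Returns {} when the file does not exist (for example after auth logout);
    that fallback is this sketch's choice, not documented CLI behavior.
    """
    path = credentials_path(env)
    if not path.is_file():
        return {}
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

A script could then read `load_credentials().get("active_team_id")` to see which team, if any, is currently active.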

Environment variables

| Variable | Used by | Description |
| --- | --- | --- |
| MESOCOSM_LOCAL | URL defaults, doctor --local | When set to 1, true, yes, or on, use local URLs (127.0.0.1:8000 server, :8010 bench-api, :8765 adapter) unless overridden. See getting-started (Configure). |
| MESOCOSM_BASE_URL | --base-url, URL resolution | bench-api base URL. Production must include /bench (e.g. https://api.swecc.org/bench). |
| SWECC_BENCH_URL | URL resolution | Alias for bench-api base URL. |
| BENCH_API_URL | URL resolution | Third alias for bench-api base URL. |
| SWECC_SERVER_URL | auth login default | swecc-server base URL for member login (default prod: https://api.swecc.org). |
| SWECC_BENCH_TOKEN | API calls without auth login | Member JWT for scripts/CI. |
| SWECC_BENCH_GUEST_TOKEN | API calls | Guest token; forces guest mode when set. |
| SWECC_BENCH_CREDENTIALS | Credential store | Path to JSON credentials file. |
| MESOCOSM_ENV_URL | doctor --local | Override env adapter URL (default http://127.0.0.1:8765). |
| MESOCOSM_ADAPTER_URL | doctor --local | Alias for env adapter URL. |
| BENCH_AUTH_DISABLED | bench_common session (dev) | When 1/true/yes, skip auth and use empty bearer (local bench-api dev only). |

Resolution order (member bench-api URL): CLI --bench-url / --base-url → environment variables above → saved bench_url in credentials → derive from server_url → MESOCOSM_LOCAL → production default.

Guest default URL: auth guest uses CLI --bench-url or env bench URL vars only — not saved credentials and not MESOCOSM_LOCAL — defaulting to https://api.swecc.org/bench when unset.
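The two resolution rules above can be sketched in Python as follows. This is an illustrative model, not the CLI's actual code: the function names are hypothetical, the precedence among the three bench URL variables is assumed to follow the order of the table, truthy values are assumed to be case-insensitive, and the server-to-bench derivation (prod server to /bench, a :8000 server to :8010) is inferred from the defaults listed in this reference.

```python
import os

PROD_SERVER_URL = "https://api.swecc.org"
PROD_BENCH_URL = "https://api.swecc.org/bench"
LOCAL_BENCH_URL = "http://127.0.0.1:8010"
# Assumption: aliases are consulted in the order the table lists them.
BENCH_URL_VARS = ("MESOCOSM_BASE_URL", "SWECC_BENCH_URL", "BENCH_API_URL")
TRUTHY = {"1", "true", "yes", "on"}


def _env_bench_url(env):
    """Return the first bench URL variable that is set, else None."""
    for name in BENCH_URL_VARS:
        if env.get(name):
            return env[name]
    return None


def _derive_from_server(server_url):
    """Assumed mapping: prod server -> /bench; a :8000 server -> :8010."""
    url = (server_url or "").rstrip("/")
    if url == PROD_SERVER_URL:
        return PROD_BENCH_URL
    if url.endswith(":8000"):
        return url[: -len(":8000")] + ":8010"
    return None


def resolve_member_bench_url(cli_url=None, env=None, creds=None):
    """CLI flag, then env vars, saved bench_url, server_url, MESOCOSM_LOCAL, prod."""
    env = os.environ if env is None else env
    creds = creds or {}
    is_local = env.get("MESOCOSM_LOCAL", "").strip().lower() in TRUTHY
    candidates = (
        cli_url,
        _env_bench_url(env),
        creds.get("bench_url"),
        _derive_from_server(creds.get("server_url")),
        LOCAL_BENCH_URL if is_local else None,
    )
    return next((c for c in candidates if c), PROD_BENCH_URL)


def resolve_guest_bench_url(cli_url=None, env=None):
    """Guest path: CLI flag or env vars only; credentials and MESOCOSM_LOCAL are ignored."""
    env = os.environ if env is None else env
    return cli_url or _env_bench_url(env) or PROD_BENCH_URL
```

The practical difference is that with only MESOCOSM_LOCAL=1 set, the member resolver lands on the local bench-api, while the guest resolver still returns the production URL.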

Global CLI flags

| Flag | Applies to | Description |
| --- | --- | --- |
| --bench-url | auth, team, env, init, register, run create, run local, run export | Override bench-api base URL for that invocation. |
| --base-url | doctor, validate, eval, run get, run episodes | Same role as --bench-url; also reads MESOCOSM_BASE_URL. |
| -V, --version | Root | Print package version and exit. |

Active team context

Many platform commands attach team_id from credentials active_team_id, unless you pass --team TEAM_ID or --solo. Set active team with mesocosm team use TEAM_ID or team create --use.

See Teams.
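The team selection rule in this section can be modeled with a small Python sketch. The function name `effective_team_id` is hypothetical, and rejecting a call that passes both a team and solo is an assumption of this sketch; the reference does not say how the CLI treats that combination.

```python
def effective_team_id(creds, team=None, solo=False):
    """Return the team_id a platform command would attach, or None for solo scope.

    creds is the parsed credentials dict; team and solo mirror --team and --solo.
    """
    if team and solo:
        # Assumption: the two options are treated as mutually exclusive here.
        raise ValueError("pass either team or solo, not both")
    if solo:
        return None  # --solo: do not attach any team
    if team:
        return team  # --team TEAM_ID overrides the active team
    return creds.get("active_team_id")  # fall back to the saved active team
```

For example, with `{"active_team_id": "t1"}` saved, no flags yields `t1`, `team="t2"` yields `t2`, and `solo=True` yields no team.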


Authentication {#authentication}

mesocosm auth login

Summary: Interactive member login via SWECC API; saves JWT and bench URL to credentials.

| Parameter | Required | Description |
| --- | --- | --- |
| --server-url | No | Server base URL for login. Default: SWECC_SERVER_URL, else MESOCOSM_LOCAL → http://127.0.0.1:8000, else https://api.swecc.org. |
| --bench-url | No | bench-api URL stored after login. Default derived from server (prod → https://api.swecc.org/bench, local server → :8010). |
| (interactive) | Yes | Prompts for username (default: OS username) and password. No --username / --password flags. |

After success: Writes mode: member, token, server_url, bench_url. For CI, use SWECC_BENCH_TOKEN (Authentication).

mesocosm auth token

Summary: Print the saved member JWT for curl or scripts.

| Parameter | Required | Description |
| --- | --- | --- |
| --bench-url | No | Parent flag (unused for output). |

Requires: Prior auth login with mode: member. Errors if only guest credentials exist.
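A minimal Python sketch of using the saved token from a script is shown below. The helper names are hypothetical. It assumes that `mesocosm auth token` prints only the JWT on stdout and that bench-api accepts the token as an `Authorization: Bearer` header; the reference mentions a bearer credential but does not spell out the header format, so treat both points as assumptions.

```python
import subprocess
import urllib.request


def saved_member_token():
    """Run `mesocosm auth token` and return its output (requires a prior auth login)."""
    result = subprocess.run(
        ["mesocosm", "auth", "token"],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if only guest credentials exist
    )
    return result.stdout.strip()


def build_me_request(bench_url, token):
    """Build (but do not send) a GET /v1/me request against the bench-api base URL."""
    url = bench_url.rstrip("/") + "/v1/me"
    # Assumption: the JWT is sent as a Bearer token.
    headers = {"Authorization": f"Bearer {token}"}
    return urllib.request.Request(url, headers=headers, method="GET")
```

Passing the result of `build_me_request("https://api.swecc.org/bench", saved_member_token())` to `urllib.request.urlopen` would then perform the call.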

mesocosm auth guest

Summary: Create a short-lived guest session on bench-api (no SWECC account).

| Parameter | Required | Description |
| --- | --- | --- |
| --bench-url | No | bench-api URL for POST /v1/auth/guest. Default: env vars, else production https://api.swecc.org/bench (ignores saved credentials and MESOCOSM_LOCAL). |

After success: Saves mode: guest, token, bench_url. Guest cannot run member-only commands (team *, env submit, etc.).

mesocosm auth whoami

Summary: Call GET /v1/me and print JSON principal.

| Parameter | Required | Description |
| --- | --- | --- |
| --bench-url | No | Combined with credentials for URL resolution. |

Errors: Connection failures print hints for prod vs local. If guest token is rejected, suggests re-running auth guest.

mesocosm auth logout

Summary: Delete the credentials file. No arguments.


Teams {#teams}

mesocosm team create

| Parameter | Required | Description |
| --- | --- | --- |
| --name | Yes | Display name for the new team. |
| --use | No | Save returned team_id as active_team_id. |
| --bench-url | No | Parent flag. |

Output: team_id and join_code.

mesocosm team join

| Parameter | Required | Description |
| --- | --- | --- |
| CODE | Yes | Invite code (normalized to uppercase). |
| --bench-url | No | Parent flag. |

mesocosm team list / team show / team use / team clear / team runs

| Command | Positional | Notes |
| --- | --- | --- |
| team list | | One line per team |
| team show | TEAM_ID | JSON details |
| team use | TEAM_ID | Sets active_team_id (no API call) |
| team clear | | Removes active team |
| team runs | TEAM_ID | GET /v1/teams/{id}/runs |

mesocosm team code show / team code regenerate

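team code show takes TEAM_ID and prints the same team details as team show, including the join code. team code regenerate takes TEAM_ID and rotates the join code (owner only).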

mesocosm team members remove / team transfer / team leave / team delete

| Command | Required arguments |
| --- | --- |
| team members remove | TEAM_ID, --user-id (owner) |
| team transfer | TEAM_ID, --user-id (owner) |
| team leave | TEAM_ID |
| team delete | TEAM_ID (owner) |

Project scaffolding

mesocosm init

Summary: Scaffold a new env author repo (no API calls).

| Parameter | Required | Description |
| --- | --- | --- |
| --dir | No | Target directory (default: .). |
| --force | No | Overwrite existing scaffold files. |
| --bench-url | No | Parent flag (unused by init). |

Writes: benchanything.json, adapter.py, env.py, requirements.txt, LOCAL_DEV.md, showcase/README.md, showcase/replay.example.json.

See Local development.


Running benchmarks {#running-benchmarks}

mesocosm run local {#mesocosm-run-local}

Summary: Run episodes locally via Ollama + benchanything.json (no platform submit).

| Parameter | Required | Description |
| --- | --- | --- |
| --manifest | No | Path to benchanything.json (default: ./benchanything.json). |
| --domain-id | No | Override domain id; default from manifest or parent folder name. |
| --model | No | LiteLLM model id (default: ollama/llama3.2). Must start with ollama/. |
| --env-url | No | Env adapter URL (default: http://localhost:8765). |
| --episodes | No | Number of episodes (default: 5). |
| --seeds | No | Space-separated integer seeds. |
| --system-prompt | No | Optional system prompt for the agent. |
| --temperature | No | Sampling temperature (default: 0.0). |
| --max-tokens | No | Max tokens per step (default: 512). |
| --parallel | No | Max parallel episodes (default: 1). |
| --quiet | No | Reduce progress output. |
| --bench-url | No | Parent flag (unused; no bench-api call). |
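For scripted use, the parameters above can be assembled into an argument list before handing it to a process runner. The following Python sketch uses only flags from the table; the helper name `run_local_argv` is hypothetical, and passing each seed as its own argument after --seeds is an assumption about how "space-separated" is parsed.

```python
def run_local_argv(model="ollama/llama3.2", episodes=5, seeds=None,
                   manifest=None, env_url=None, quiet=False):
    """Build an argv list for `mesocosm run local` from the documented flags."""
    if not model.startswith("ollama/"):
        # The reference states that run local only accepts ollama/ model ids.
        raise ValueError("run local requires a model id starting with 'ollama/'")
    argv = ["mesocosm", "run", "local", "--model", model, "--episodes", str(episodes)]
    if manifest:
        argv += ["--manifest", manifest]
    if env_url:
        argv += ["--env-url", env_url]
    if seeds:
        # Assumption: seeds are supplied as separate arguments after --seeds.
        argv += ["--seeds", *[str(int(seed)) for seed in seeds]]
    if quiet:
        argv.append("--quiet")
    return argv
```

The resulting list can be passed to `subprocess.run(argv, check=True)` once Ollama and the env adapter are running.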

mesocosm run create

Summary: Start a bench run on the platform (POST /v1/runs).

| Parameter | Required | Description |
| --- | --- | --- |
| --domain | Yes | Target domain id. |
| --vow-version | Yes | Binding vow version (e.g. 1.0.0). |
| --model | Yes | Model identifier (e.g. gemini/gemini-3.1-flash-lite). |
| --episodes | No | Episode count (default: 1). |
| --parallel | No | Max parallel episodes (default: 1). |
| --system-prompt | No | Optional system prompt in agent_config. |
| --temperature | No | Agent temperature (default: 0.0). |
| --max-tokens | No | Max tokens per step (default: 512). |
| --team | No | Explicit team id on the run. |
| --solo | No | Do not attach active team. |
| --visibility | No | private or gallery_public. |
| --env-id | No | Developer environment id to pin env URL/runtime. |
| --bench-url | No | Parent flag. |

mesocosm run export {#mesocosm-run-export}

Summary: Download run JSON (traces + replay) for showcase.

| Parameter | Required | Description |
| --- | --- | --- |
| RUN_ID | Yes | Platform run id. |
| -o, --output | No | Write JSON to file; default stdout. |
| --bench-url | No | Parent flag. |

API: GET /v1/runs/{run_id}/export. See Showcase.

mesocosm run get

Summary: Fetch run status, episodes, and aggregate scores.

| Parameter | Required | Description |
| --- | --- | --- |
| RUN_ID | Yes | Platform run id. |
| --base-url | No | bench-api base URL. |

mesocosm run episodes

Summary: List episodes for a run; optionally include traces.

| Parameter | Required | Description |
| --- | --- | --- |
| RUN_ID | Yes | Platform run id. |
| --traces | No | Also fetch run traces keyed by episode. |
| --base-url | No | bench-api base URL. |

Environments

mesocosm env submit {#mesocosm-env-submit}

Summary: Submit a GitHub repo as a developer environment (member auth).

| Parameter | Required | Description |
| --- | --- | --- |
| --name | Yes | Human-readable environment name. |
| --github-url | Yes | Public GitHub repo URL; platform clones and registers from benchanything.json. |
| --description | No | Optional description (default: empty). |
| --team | No | Explicit team_id (overrides active team). |
| --solo | No | Force solo scope (no team_id). |
| --bench-url | No | Parent flag. |

See Submitting environments.

mesocosm env list

| Parameter | Required | Description |
| --- | --- | --- |
| --team | No | Explicit team id filter. |
| --solo | No | List solo-scoped environments only. |
| --bench-url | No | Parent flag. |

mesocosm env delete

Summary: Delete a developer environment by id (member auth).

| Parameter | Required | Description |
| --- | --- | --- |
| ENV_ID | Yes | Developer environment id from env list or submit output. |
| --bench-url | No | Parent flag. |

API: DELETE /v1/developer/environments/{env_id}.


Legacy register

mesocosm register

Summary: Legacy: register domain.py with DOMAIN_CONFIG (prefer env submit for new repos).

| Parameter | Required | Description |
| --- | --- | --- |
| domain_file | Yes | Path to domain.py. |
| --auto-id | No | Set domain id to parent directory name. |
| --publish | No | Publish domain after register. |
| --bench-url | No | bench-api URL for register/publish API. |

Diagnostics and validation

mesocosm doctor

Summary: Probe bench-api health; with --local, also probe env adapter.

| Parameter | Required | Description |
| --- | --- | --- |
| --base-url | No | bench-api URL. Default prod https://api.swecc.org/bench or local when MESOCOSM_LOCAL=1. |
| --local | No | Check adapter at MESOCOSM_ENV_URL / default :8765 and bench-api at :8010. |

Exit code: 0 if checks pass, 1 otherwise. Prints JSON with issues and hints (e.g. missing /bench on prod).

Local profile: With --local or MESOCOSM_LOCAL=1, doctor passes if the env adapter health check succeeds — bench-api can be down, which is fine for the Ollama + run local loop.

See Troubleshooting.

mesocosm validate {#mesocosm-validate}

Summary: Validate JSON against bundled policy constraints (offline, no HTTP).

| Parameter | Required | Description |
| --- | --- | --- |
| FILE | Yes | Path to JSON file, or - for stdin. Auto-detects benchanything.json manifest shape or legacy POST /v1/domains register body. |
| --base-url | No | Declared but unused (no network). |

Output: JSON with ok, issues, suggested_fixes, rules_version, and optional schema: benchanything_manifest.

Exit code: 0 if validation ok, else 1.
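A script or CI job might consume this output as in the Python sketch below. The helper names are hypothetical, the field names come from the output description above, and the shape of individual `issues` and `suggested_fixes` entries is not specified in this reference, so the sketch formats each entry as an opaque value.

```python
import json
import subprocess


def run_validate(document):
    """Pipe a JSON-serializable document to `mesocosm validate -`.

    Returns (exit_code, stdout). Not executed by this sketch on import.
    """
    result = subprocess.run(
        ["mesocosm", "validate", "-"],
        input=json.dumps(document),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout


def summarize_validation(stdout_text):
    """Turn the validate JSON report into (ok, human-readable lines)."""
    report = json.loads(stdout_text)
    ok = bool(report.get("ok"))
    lines = [f"ok={ok} rules_version={report.get('rules_version')}"]
    # Entry shapes are unknown, so each one is printed as-is.
    lines += [f"issue: {issue}" for issue in report.get("issues", [])]
    lines += [f"fix: {fix}" for fix in report.get("suggested_fixes", [])]
    return ok, lines
```

Because the exit code already distinguishes success from failure, the summary is mainly useful for surfacing issues and suggested fixes in build logs.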


Eval

mesocosm eval test

Summary: Run one dev test episode (POST /v1/test/episode).

| Parameter | Required | Description |
| --- | --- | --- |
| --domain-id | Yes | Target domain id. |
| --vow-version | No | If omitted, read from domain record. |
| --model | Yes | Model id. |
| --env-url | No | Override environment HTTP URL. |
| --seed | No | Episode seed integer. |
| --temperature | No | Default 0.0. |
| --max-tokens | No | Default 4096. |
| --base-url | No | bench-api base URL. |

Exit: Non-zero if episode status is failed, cancelled, or error.

run create vs eval run

Both call POST /v1/runs, but they target different workflows:

| | run create | eval run |
| --- | --- | --- |
| Domain flag | --domain | --domain-id |
| Episodes | --episodes | --num-episodes |
| Parallelism | --parallel | --max-parallel |
| Default max_tokens | 512 | 4096 |
| --vow-version | required | optional (from domain record) |
| Draft domains | no guard | --require-published / --allow-draft |
| Teams / visibility / env-id | yes | no |

Prefer run create for hackathon platform benchmarks; use eval run for developer eval workflows.
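When moving a script from one command to the other, the differences in the comparison above amount to a small flag translation. The Python sketch below is illustrative only: `to_eval_run_options` is a hypothetical helper, the --bench-url to --base-url mapping is taken from the Global CLI flags section, and pinning --max-tokens to 512 when it was not given is this sketch's way of preserving the run create default.

```python
# Flags that exist on both commands under different names.
RENAMED = {
    "--domain": "--domain-id",
    "--episodes": "--num-episodes",
    "--parallel": "--max-parallel",
    "--bench-url": "--base-url",
}
# Flags that both commands accept under the same name.
SHARED = {"--vow-version", "--model", "--temperature", "--max-tokens"}
RUN_CREATE_DEFAULT_MAX_TOKENS = "512"


def to_eval_run_options(run_create_options):
    """Translate a {flag: value} dict for run create into one for eval run.

    Returns (translated, dropped), where dropped lists flags that eval run
    does not offer (for example --team, --solo, --visibility, --env-id).
    """
    translated, dropped = {}, []
    for flag, value in run_create_options.items():
        if flag in RENAMED:
            translated[RENAMED[flag]] = value
        elif flag in SHARED:
            translated[flag] = value
        else:
            dropped.append(flag)
    # eval run defaults to 4096; keep the run create default unless overridden.
    translated.setdefault("--max-tokens", RUN_CREATE_DEFAULT_MAX_TOKENS)
    return translated, dropped
```

Checking the `dropped` list before switching commands shows which team, visibility, or environment settings would be lost.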

mesocosm eval run

Summary: Multi-episode run with aggregation (POST /v1/runs).

| Parameter | Required | Description |
| --- | --- | --- |
| --domain-id | Yes | Target domain id. |
| --vow-version | No | Default from domain record if omitted. |
| --model | Yes | Model id. |
| --num-episodes | No | Default 1. |
| --seed-set | No | JSON array of integers, e.g. '[1,2,3]'. |
| --temperature | No | Default 0.0. |
| --max-tokens | No | Default 4096. |
| --max-parallel | No | Default 1. |
| --require-published / --allow-draft | No | Default rejects non-published domains; --allow-draft allows draft. |
| --base-url | No | bench-api base URL. |
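Unlike run local --seeds, the --seed-set value is a single JSON array, which is easy to get wrong with shell quoting. A minimal Python sketch that builds the argument list with `json.dumps` is shown below; the helper name `eval_run_argv` is hypothetical, and only flags from the table above are used.

```python
import json


def eval_run_argv(domain_id, model, seeds=None, allow_draft=False):
    """Build an argv list for `mesocosm eval run` with a JSON-encoded seed set."""
    argv = ["mesocosm", "eval", "run", "--domain-id", domain_id, "--model", model]
    if seeds is not None:
        # Compact separators produce the '[1,2,3]' form shown in the table.
        seed_json = json.dumps([int(seed) for seed in seeds], separators=(",", ":"))
        argv += ["--seed-set", seed_json]
    if allow_draft:
        argv.append("--allow-draft")
    return argv
```

Passing the list directly to `subprocess.run` avoids shell quoting of the brackets altogether.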

Root

mesocosm (root)

| Parameter | Required | Description |
| --- | --- | --- |
| -V, --version | No | Print mesocosm <version> and exit. |
| --help | No | Show command tree or subcommand help. |

mesocosm run with no subcommand shows run subcommand help.