Submit a Model

Add your model to the TritonGym leaderboard.

HuggingFace model

Submit a public HuggingFace model id and we'll run TritonGym on it as GPU time becomes available. This path is best for open-weight models served via vLLM/sglang. For closed-API models, use Option B, pre-computed results (PR), since we don't handle your API key.

Pre-computed results (PR)

Already ran TritonGym on your own machine? Open a pull request with your logs/ and we'll merge them into the leaderboard. This is the only path for closed-API models since we don't accept API keys.

  1. Clone the repo and run the benchmark on your model:
    git clone https://github.com/yil384/TritonGym-public
    cd TritonGym-public
    pip install -r requirements.txt
    export OPENAI_API_KEY=...   # or whichever provider
    python -m benchmark.cli \
        --model your-model-name \
        --agent alphaevolve \
        --benchmark-json benchmark/data_v2/benchmark.json \
        --json --logs logs
  2. Confirm the numbers locally:
    python web/scripts/aggregate_logs.py --include-logs
    python summarize_results.py --logs logs/your-model-name
  3. Move your run into submissions/<your-model-name>/ and open a PR titled [submission] your-model-name. Include in the PR description:
    • Model name + size + license
    • Hardware used (GPU, CUDA version)
    • Anything we should know about the setup
  4. We sanity-check the submission (no answer leakage, OOD numbers reproducible) and then merge it. Your numbers go up on the leaderboard with a [submitted] tag.
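
Step 3 above can be sketched as a pair of shell helpers, run from the repository root. This is a minimal sketch, not part of TritonGym: the function names, the branch name submission/<model>, and the assumption that your run lives in logs/<your-model-name> and that origin points at your own fork are all illustrative choices.

```shell
# Hypothetical helpers; names, paths, and branch naming are assumptions.

# Move a finished run from logs/<model> into submissions/<model>.
stage_submission() {
    model="$1"
    src="logs/$model"            # assumed: where the benchmark wrote the run
    dest="submissions/$model"
    if [ ! -d "$src" ]; then
        echo "no run found at $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "$dest already exists" >&2
        return 1
    fi
    mkdir -p submissions
    mv "$src" "$dest"
    echo "$dest"                 # print the new location on success
}

# Commit the staged run on a new branch and push it.
# Assumes "origin" is a fork you can push to.
push_submission() {
    model="$1"
    git checkout -b "submission/$model" &&
    git add "submissions/$model" &&
    git commit -m "[submission] $model" &&
    git push -u origin "submission/$model"
}

# Usage:
#   stage_submission your-model-name && push_submission your-model-name
```

After pushing, open the pull request from that branch with the title [submission] your-model-name and the description items listed in step 3. The staging helper refuses to overwrite an existing submissions/<model> directory so that an earlier run is not silently replaced.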


Queue status

Submissions run on a single shared GPU. At launch, expect a wait on the order of hours; if the queue grows, we'll add more workers.
