Submit a Model

Add your model to the TritonGym leaderboard.

HuggingFace model

Submit a public HuggingFace model ID and we'll run TritonGym on it as GPU time becomes available. This is best for open-weight models served via vLLM/SGLang. For closed-API models, submit pre-computed results via a pull request instead (we don't handle your API key).

Pre-computed results (PR)

Already ran TritonGym on your own machine? Open a pull request with your logs/ directory and we'll merge the results into the leaderboard. This is the only path for closed-API models, since we don't accept API keys.

Clone the repo and run the benchmark on your model:

git clone https://github.com/yil384/TritonGym-public
cd TritonGym-public
pip install -r requirements.txt
export OPENAI_API_KEY=...   # or whichever provider
python -m benchmark.cli \
    --model your-model-name \
    --agent alphaevolve \
    --benchmark-json benchmark/data_v2/benchmark.json \
    --json --logs logs

Confirm the numbers locally:

python web/scripts/aggregate_logs.py --include-logs
python summarize_results.py --logs logs/your-model-name

Move your run into submissions/<your-model-name>/ and open a PR titled [submission] your-model-name. Include in the PR description:
- Model name + size + license
- Hardware used (GPU, CUDA version)
- Anything we should know about the setup

We review for sanity (no answer leakage, OOD numbers reproducible) and merge. Your numbers then appear on the leaderboard with a [submitted] tag.
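
The "move your run" step can be sketched as a small shell helper. This is only an illustration, not part of the TritonGym repo: stage_submission is a hypothetical function name, and it assumes the run's logs live in logs/<your-model-name>, as the --logs flags above suggest. Adjust the paths if your layout differs.

```shell
# stage_submission MODEL [LOGS_ROOT] [SUBMISSIONS_ROOT]
# Hypothetical helper: moves LOGS_ROOT/MODEL to SUBMISSIONS_ROOT/MODEL.
# Assumes logs are stored per model under LOGS_ROOT (default: logs).
stage_submission() {
    local model="$1"
    local logs_root="${2:-logs}"
    local dest_root="${3:-submissions}"
    local src="${logs_root}/${model}"
    local dest="${dest_root}/${model}"

    # Refuse to continue if there is no run to move.
    if [ ! -d "$src" ]; then
        echo "no run found at $src" >&2
        return 1
    fi
    # Never overwrite an existing submission directory.
    if [ -e "$dest" ]; then
        echo "$dest already exists; refusing to overwrite" >&2
        return 1
    fi

    # Create the parent directory, move the run, and print the new location.
    mkdir -p "$(dirname "$dest")" && mv "$src" "$dest" && echo "$dest"
}

# Example, run from the repo root (the branch name is an arbitrary choice):
#   stage_submission your-model-name
#   git checkout -b submission/your-model-name
#   git add submissions/your-model-name
#   git commit -m "[submission] your-model-name"
```

The helper refuses to overwrite an existing submissions/<your-model-name>/ directory, so an earlier submission is not silently replaced.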


Queue status

Submissions run on a single shared GPU. At launch, expect a wait on the order of hours; if the queue grows, we'll add more workers.
