# CooperBench

## Docs

- [Agent interface](https://mintlify.wiki/cooperbench/CooperBench/api/agents.md): Work with different agent frameworks in CooperBench
- [Execution backends](https://mintlify.wiki/cooperbench/CooperBench/api/backends.md): Configure and use different execution backends for running tasks
- [Discovery functions](https://mintlify.wiki/cooperbench/CooperBench/api/discover.md): Query available tasks and completed runs in CooperBench
- [Environment classes](https://mintlify.wiki/cooperbench/CooperBench/api/environments.md): Work with task environments and create custom sandboxes
- [evaluate()](https://mintlify.wiki/cooperbench/CooperBench/api/evaluate.md): Evaluate completed benchmark runs by testing patches against feature tests
- [run()](https://mintlify.wiki/cooperbench/CooperBench/api/run.md): Execute benchmark tasks with configurable agents and backends
- [Agent configuration](https://mintlify.wiki/cooperbench/CooperBench/cli/agent-config.md): Pass custom configuration files to agents
- [Execution backends](https://mintlify.wiki/cooperbench/CooperBench/cli/backends.md): Configure and use different execution backends for running tasks and evaluations
- [cooperbench config](https://mintlify.wiki/cooperbench/CooperBench/cli/config.md): Configure execution backends for CooperBench
- [Environment variables](https://mintlify.wiki/cooperbench/CooperBench/cli/environment-variables.md): Configure CooperBench with environment variables
- [cooperbench eval](https://mintlify.wiki/cooperbench/CooperBench/cli/eval.md): Evaluate completed benchmark runs
- [cooperbench run](https://mintlify.wiki/cooperbench/CooperBench/cli/run.md): Run agents on CooperBench tasks
- [System architecture](https://mintlify.wiki/cooperbench/CooperBench/concepts/architecture.md): Understanding CooperBench's execution backends, evaluation pipeline, and infrastructure components
- [Dataset structure](https://mintlify.wiki/cooperbench/CooperBench/concepts/dataset.md): Understanding the 652 benchmark tasks across 12 repositories, including task organization, subsets, and data format
- [Overview](https://mintlify.wiki/cooperbench/CooperBench/concepts/overview.md): Understanding multi-agent coordination in software engineering and how CooperBench evaluates collaborative AI systems
- [Evaluation settings](https://mintlify.wiki/cooperbench/CooperBench/concepts/settings.md): Comparing cooperative and solo modes to measure the coordination deficit in multi-agent systems
- [Backends](https://mintlify.wiki/cooperbench/CooperBench/guides/backends.md): Choose the right execution backend for running CooperBench experiments
- [Custom agents](https://mintlify.wiki/cooperbench/CooperBench/guides/custom-agents.md): Implement your own agent framework to run on CooperBench
- [Evaluation](https://mintlify.wiki/cooperbench/CooperBench/guides/evaluation.md): Understand how CooperBench evaluates agent-generated code changes
- [GCP setup](https://mintlify.wiki/cooperbench/CooperBench/guides/gcp-setup.md): Configure Google Cloud Platform as your CooperBench execution backend
- [Running experiments](https://mintlify.wiki/cooperbench/CooperBench/guides/running-experiments.md): Learn how to run CooperBench experiments with different settings, filters, and backends
- [Installation](https://mintlify.wiki/cooperbench/CooperBench/installation.md): Set up CooperBench with your preferred execution backend and configure LLM providers
- [Introduction](https://mintlify.wiki/cooperbench/CooperBench/introduction.md): CooperBench is the first benchmark designed to measure how well AI agents cooperate when each handles its own task and those tasks may conflict
- [Quick start](https://mintlify.wiki/cooperbench/CooperBench/quickstart.md): Run your first CooperBench experiment in minutes and evaluate agent coordination
- [Key findings](https://mintlify.wiki/cooperbench/CooperBench/results/findings.md): Discover the three critical coordination failures revealed by CooperBench's evaluation of AI agent collaboration
- [Output structure](https://mintlify.wiki/cooperbench/CooperBench/results/output-structure.md): Complete reference for CooperBench's output format, including the logs directory, trajectory files, patches, and evaluation results
- [Dataset statistics](https://mintlify.wiki/cooperbench/CooperBench/results/statistics.md): Comprehensive statistics about CooperBench's dataset including task counts, repository coverage, and language distribution