Compare the performance of different AI coding models on your actual codebase
Buildermark supports agent benchmarking across Claude, Codex, Gemini, and Cursor, so developers can see which models perform best on their own projects.

Measure how much of your code is AI-generated. Open source.
Buildermark matches your coding agent diffs with commits to calculate how much of your code was written by agents. Runs natively on macOS, Windows, and Linux.
Buildermark's upcoming Team Server enables self-hosted aggregation of metrics across an organization, allowing teams to compare agent adoption rates and changes in developer productivity.
Buildermark automatically matches agent conversation logs with git commits to determine which lines of code were written by AI agents, providing exact per-commit percentages without requiring agent hooks or manual tagging.
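To make the matching idea concrete, here is a rough Python sketch. It is not Buildermark's actual implementation, which the text above does not describe in detail. It assumes that both the commit and the agent conversation are available as unified diffs, and the function names `added_lines` and `agent_share` are hypothetical. A line in the commit counts as agent-written if the same added line appears in an agent diff.

```python
from collections import Counter


def added_lines(diff_text: str) -> list[str]:
    """Return the non-blank lines a unified diff adds, stripped of the '+' marker.

    Naive by design: lines starting with '+++ ' are treated as file headers.
    """
    result = []
    for line in diff_text.splitlines():
        if line.startswith("+") and not line.startswith("+++ "):
            content = line[1:].strip()
            if content:
                result.append(content)
    return result


def agent_share(commit_diff: str, agent_diffs: list[str]) -> float:
    """Percentage of a commit's added lines that also appear in agent diffs.

    Hypothetical helper: matching is exact (after whitespace stripping), and
    each agent line can account for at most one commit line.
    """
    commit_added = added_lines(commit_diff)
    if not commit_added:
        return 0.0

    # Multiset of lines the agent produced across all its diffs.
    pool = Counter()
    for diff in agent_diffs:
        pool.update(added_lines(diff))

    matched = 0
    for line in commit_added:
        if pool[line] > 0:
            pool[line] -= 1
            matched += 1
    return 100.0 * matched / len(commit_added)


if __name__ == "__main__":
    commit = (
        "--- a/app.py\n+++ b/app.py\n@@ -1,2 +1,5 @@\n import os\n"
        "+def greet(name):\n+    return f'hi {name}'\n+\n+print(greet('x'))\n"
    )
    agent = "+++ b/app.py\n+def greet(name):\n+    return f'hi {name}'\n"
    print(f"{agent_share(commit, [agent]):.1f}% of added lines match agent output")
```

Exact line matching is the simplest possible choice. A real tool would likely also need to handle lines the developer edited after the agent wrote them, which this sketch counts as human-written.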
Buildermark supports manual conversation ratings and automatic agent self-critique via the /rate-buildermark skill, enabling quality assessment of agent-generated code.
Buildermark runs entirely locally on the user's machine with zero telemetry, so proprietary code never leaves the user's infrastructure while attribution metrics remain available.