Hacker News
We've been running coding agents against standardized repos with natural-language prompts — no tool names, no hints — and measuring what they actually choose.

Early finding: Claude Code picks Custom/DIY in 12 of 20 categories. Not because it can't use the tools (BFCL scores suggest it can), but because it doesn't reach for them. That's a different failure mode than the one capability benchmarks measure.

We score each tool on: agent visibility, pick rate vs Custom/DIY, cross-context breadth, expert human ratings, and implementation success rate. Tools above survival=1 persist. Below it, agents synthesize around them.
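To make the shape of that scoring concrete, here is a toy sketch. Everything in it is an assumption on my part, not the published methodology: the function name, the equal-ish weights, the weighted-geometric-mean combination, and the convention that each signal is normalized so 1.0 means parity with the Custom/DIY baseline.

```python
# Hypothetical sketch only; the real formula, weights, and scales
# at survivalindex.org/methodology may differ.

def survival_score(visibility, pick_rate, breadth, human_rating, success_rate,
                   weights=(0.2, 0.3, 0.15, 0.15, 0.2)):
    """Combine the five signals into one score.

    Each signal is assumed normalized so that 1.0 is parity with the
    Custom/DIY baseline. Weights sum to 1.0, so a tool at parity on
    every signal lands exactly on the survival=1 threshold.
    """
    signals = (visibility, pick_rate, breadth, human_rating, success_rate)
    # Weighted geometric mean: a tool that scores zero on any one
    # signal cannot be rescued by strength on the others.
    score = 1.0
    for s, w in zip(signals, weights):
        score *= s ** w
    return score

# Parity on every signal sits exactly at the threshold.
print(survival_score(1.0, 1.0, 1.0, 1.0, 1.0))  # → 1.0
```

The geometric mean is just one plausible choice; a weighted arithmetic mean would let strong visibility paper over a zero pick rate, which seems contrary to the "agents synthesize around them" failure mode.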

Methodology is at survivalindex.org/methodology. Very curious what people think of the measurement approach, especially the human coefficient variable.
