Senior SWE-Bench: open-source benchmark that assesses agents as senior engineers
153 points by matt_d 16 hours ago | 102 comments