Edward Stevinson
Hi, I’m Ed, a research scientist and PhD candidate at the CIRCLE group at Imperial College London, where I work on mechanistic interpretability and adversarial robustness. My research centres on understanding the representation geometry of neural networks and how this shapes adversarial vulnerability.
Find me on Twitter, Google Scholar, GitHub and LinkedIn. Please reach out by email if you want to talk about any research!
news
| Date | News |
|---|---|
| Jun 11, 2026 | Our paper on how mechanistic knowledge can predict vulnerabilities was accepted to the Mechanistic Interpretability workshop at ICML 2026. |
| May 23, 2026 | Our paper was accepted to the Pluralistic Alignment and Trustworthy AI for Good workshops at ICML 2026. |
| May 08, 2026 | Recognised as a Gold Reviewer at ICML 2026 |
| Apr 30, 2026 | Paper on how superposition creates adversarial vulnerability accepted at ICML 2026 – read it on arXiv |
| Jan 26, 2026 | Our paper on feature geometry was accepted at ICLR 2026. |
| Jan 26, 2026 | Our LASR Labs paper, ContextBench, was accepted at ICLR 2026 |
| Dec 12, 2025 | Two spotlight papers at the Mechanistic Interpretability workshop, NeurIPS 2025 |
| Sep 15, 2025 | Outstanding paper award at NeSy 2025 for our paper on probabilistic neuro-symbolic robustness verification |
| Mar 30, 2025 | Our paper on the ViT-Prisma toolkit, an open-source mechanistic interpretability library for vision models, was accepted at the MIV workshop at CVPR 2025 |
| Nov 25, 2024 | 2nd place in the Apart AI safety hackathon for work on detecting adversarial prompt vulnerabilities |
| May 22, 2024 | Took the opposition in an EA debate at Imperial entitled "Is AI an Existential Risk?" |