
AstroAlertBench

AstroAlertBench is a vision–language benchmark built on publicly available ZTF alerts brokered by ALeRCE. Each example pairs tabular metadata (ZTF-style candidate fields) with a single RGB montage of the science, reference, and difference stamps, stored as one PNG.

Models return JSON organized as Parts A–C: Part A copies and interprets key metadata fields, Part B self-rates the reasoning, and Part C runs a three-stage cascade into a five-way science class.
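As a rough illustration of the three-part layout, the sketch below parses a model reply and checks that all three parts are present. The key names (`part_a`, `part_b`, `part_c`), the nested fields, and the helper `parse_response` are hypothetical placeholders chosen for this example; the benchmark's actual schema is defined by the public code and may differ.

```python
import json

# Hypothetical top-level keys; the real schema may use different names.
REQUIRED_PARTS = ("part_a", "part_b", "part_c")


def parse_response(raw: str) -> dict:
    """Parse a model reply and check that Parts A-C are all present."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")
    missing = [key for key in REQUIRED_PARTS if key not in data]
    if missing:
        raise ValueError(f"missing parts: {missing}")
    return data


# Illustrative reply; every field name and value below is invented.
example = json.dumps({
    # Part A: metadata copied from the alert, plus an interpretation.
    "part_a": {"magpsf": 18.7, "interpretation": "moderately bright detection"},
    # Part B: the model's rating of its own reasoning.
    "part_b": {"self_rating": 4},
    # Part C: one label per cascade stage, then the final class.
    "part_c": {
        "stages": ["stage_1_label", "stage_2_label", "stage_3_label"],
        "final_class": "class_placeholder",
    },
})

parsed = parse_response(example)
```

A structural check like this, run before scoring, separates malformed replies from incorrect classifications.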

Example figures and heatmaps are on the Visualizations page. GitHub, Hugging Face, and other links are collected on the Resources page.

Why it exists

Classifying variable and transient events from survey alerts is a core step in time-domain astronomy. This benchmark fixes three things: a per-class pool of high-confidence examples, PNG montages suited to vision–language models (VLMs), and a deterministic scorer, so that different models are compared on the same inputs and gold labels.

What you need locally

Get montages and metadata from AstroAlertBench on Hugging Face. The Setup & reproduction page summarizes environment setup, paths, and how to run inference and scoring with the public code.

AstroAlertBench is released under the same terms as the accompanying paper repository.