*Result*: EEG-CLIP: A transformer-based framework for EEG-guided image generation.
*Further Information*
*Decoding visual perception from neural signals is a fundamental step toward advanced brain-computer interfaces (BCIs). Functional magnetic resonance imaging (fMRI) has shown promising results but faces practical constraints in deployment and cost. Electroencephalography (EEG), with its superior temporal resolution, portability, and cost-effectiveness, is a promising alternative for real-time BCI applications. While existing EEG-based approaches have advanced neural decoding, they remain constrained by inadequate architectural designs, limited reconstruction fidelity, and inconsistent evaluation protocols. To address these challenges, we present EEG-CLIP, a novel Transformer-based framework that tackles each limitation in turn: (1) we introduce a specialized EEG-ViT encoder that captures the spatial and temporal characteristics of EEG signals to increase model capacity, together with a Diffusion Prior Transformer that approximates the image feature distribution; (2) we employ a dual-stage reconstruction pipeline that combines class contrastive learning with pretrained diffusion models to improve visual reconstruction quality; (3) we establish comprehensive evaluation protocols across multiple datasets. The framework operates in two stages: it first projects EEG signals into the CLIP image space via class contrastive learning and refines them into image priors, then reconstructs the perceived images with a pretrained conditional diffusion model. Comprehensive empirical analysis, including temporal window sensitivity studies and regional brain activation visualization, demonstrates the framework's robustness. Ablations show that EEG-CLIP's performance improvements over previous methods stem from its specialized EEG encoding architecture and improved training techniques.
Quantitative and qualitative evaluations on the ThingsEEG and Brain2Image datasets establish EEG-CLIP's state-of-the-art performance in both classification and reconstruction, advancing neural signal-based visual decoding.
(Copyright © 2025 Elsevier Ltd. All rights reserved.)*
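The abstract's first stage aligns EEG embeddings with CLIP image embeddings via contrastive learning. The paper's exact objective is not given here; the sketch below shows a generic CLIP-style symmetric InfoNCE loss as one plausible form of that alignment. All names (`clip_style_contrastive_loss`), the temperature value, and the embedding shapes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Project embeddings onto the unit sphere, as CLIP does before scoring."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def clip_style_contrastive_loss(eeg_emb, img_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of (EEG, image) pairs.

    Matched pairs share the same batch index; every other pairing in the
    batch serves as a negative. This is a generic sketch of contrastive
    alignment, not the paper's actual objective.
    """
    eeg = l2_normalize(np.asarray(eeg_emb, dtype=float))
    img = l2_normalize(np.asarray(img_emb, dtype=float))
    logits = eeg @ img.T / temperature  # (B, B) cosine similarities / T
    idx = np.arange(len(logits))

    def xent(lg):
        # numerically stable log-softmax over each row, then pick the
        # diagonal (the matched pair) as the correct "class"
        lg = lg - lg.max(axis=1, keepdims=True)
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[idx, idx].mean()

    # average the EEG->image and image->EEG directions
    return 0.5 * (xent(logits) + xent(logits.T))
```

As expected of a contrastive objective, the loss is small when each EEG embedding is closest to its paired image embedding and grows when pairings are scrambled.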
*Declaration of competing interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.*