My research is on representation learning for personalization, with a current focus on generative models, LLM agents, and LLM-as-a-Judge evaluation. Previously, I did my PhD at UPF Barcelona on algorithmic bias in graph-based recommenders.
At Spotify, I work on personalized LLM-as-a-Judge and judge-guided self-improvement for recommender systems, bridging the gap between subjective product quality and scalable, reliable evaluation.
Generating coherent recommendation slates with prompt-conditioned diffusion. Moves recommendation from ranking-and-cut to learned slate generation.
A scalable, calibrated LLM judge that incorporates user profile context — a better middle ground between offline metrics and A/B tests for evaluating recommendations.
Cross-platform graph foundation model that detects coordinated inauthentic behavior — transferable across networks without per-platform retraining.
First cross-content GNN recommender for audiobook discovery. Combines a co-listening graph design with LLM content representations and inductive item coverage.
Position paper laying out what graph foundation models could look like for the personalization problem at industrial scale.
A minimal-edit graph rewiring method that shortens radicalization pathways in video recommenders — without retraining the underlying model.