TU Wien Informatics

Alberto Maria Metelli: Policy Gradients in Reinforcement Learning

  • 2026-05-27
  • Machine Learning

Join us on May 27 for the Guest Lecture “Policy Gradients in Reinforcement Learning” by Alberto Maria Metelli!

Join us on May 27, when Alberto Maria Metelli will give a guest lecture on Policy Gradients in Reinforcement Learning! The lecture is part of Guillaume Bellec’s, Tatjana Chavdarova’s, and Thomas Gärtner’s Reinforcement Learning course.

Policy Gradients in Reinforcement Learning

This lecture introduces policy-gradient methods as a framework for directly optimizing parameterized policies in reinforcement learning. Starting from the expected-return objective, it explains how policy quality can be expressed through both trajectory-based and occupancy-based perspectives, and how gradients of this objective can be estimated from sampled experience. The lecture contrasts simple black-box approaches with likelihood-ratio methods such as REINFORCE and G(PO)MDP, emphasizing the role of variance reduction through causality and baselines. It then develops the Policy Gradient Theorem as the foundation for actor–critic methods, where learned value estimates guide policy improvement, and extends this view to compatible function approximation and natural gradients.
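As a rough illustration of the likelihood-ratio idea mentioned above, the following sketch estimates a REINFORCE-style gradient on a toy multi-armed bandit and compares it with the exact gradient. It is not taken from the lecture: the softmax policy, the bandit setting, the unit-variance Gaussian reward noise, and all function names (`softmax`, `grad_log_softmax`, `exact_gradient`, `reinforce_gradient`) are assumptions chosen for brevity. Only the Python standard library is used.

```python
import math
import random


def softmax(theta):
    """Action probabilities of a softmax policy with one parameter per arm."""
    m = max(theta)  # subtract the max for numerical stability
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]


def grad_log_softmax(theta, action):
    """Score function: d/d theta_k log pi(action) = 1[k == action] - pi(k)."""
    probs = softmax(theta)
    return [(1.0 if k == action else 0.0) - p for k, p in enumerate(probs)]


def exact_gradient(theta, mean_rewards):
    """Exact gradient of J(theta) = sum_a pi(a) r(a), i.e. pi(k) * (r(k) - J)."""
    probs = softmax(theta)
    j = sum(p * r for p, r in zip(probs, mean_rewards))
    return [p * (r - j) for p, r in zip(probs, mean_rewards)]


def reinforce_gradient(theta, mean_rewards, n_samples, baseline, rng):
    """Likelihood-ratio estimate: average of score * (reward - baseline)."""
    probs = softmax(theta)
    arms = range(len(theta))
    grad = [0.0] * len(theta)
    for _ in range(n_samples):
        action = rng.choices(arms, weights=probs)[0]
        # Assumed reward model: arm mean plus unit-variance Gaussian noise.
        reward = mean_rewards[action] + rng.gauss(0.0, 1.0)
        score = grad_log_softmax(theta, action)
        for k in arms:
            grad[k] += score[k] * (reward - baseline) / n_samples
    return grad


if __name__ == "__main__":
    theta = [0.0, 0.0, 0.0]
    mean_rewards = [1.0, 2.0, 3.0]
    rng = random.Random(0)
    print("exact:        ", exact_gradient(theta, mean_rewards))
    print("no baseline:  ", reinforce_gradient(theta, mean_rewards, 20000, 0.0, rng))
    print("with baseline:", reinforce_gradient(theta, mean_rewards, 20000, 2.0, rng))
```

In this sketch, subtracting a constant baseline leaves the expected value of the estimate unchanged, because the score function has zero mean under the policy. Both estimates should therefore approach the exact gradient as the number of samples grows, and a well-chosen baseline typically lowers the variance.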

About Alberto Maria Metelli

Alberto Maria Metelli is an Assistant Professor of Information Processing Systems with the Dipartimento di Elettronica, Informazione e Bioingegneria, in the Artificial Intelligence and Robotics Laboratory at the Politecnico di Milano. He obtained his PhD (cum laude) in Information Technology in March 2021; his thesis was awarded the “Premio NeoDottori di Ricerca Marco Cadoli 2021” as the best Italian PhD thesis in AI. He is the co-founder of ML Cube S.r.l., an innovative start-up for machine learning lifecycle management. In 2025, he won the FIS Starting Grant, a prestigious Italian award for fundamental research. His research revolves around reinforcement learning. He is a member of the ELLIS Society and co-author of over 50 papers in top venues (JMLR, MLJ, ICML, NeurIPS, AAAI).

Join us Online

The lecture will be streamed via Zoom (Meeting-ID: 660 3702 7780, Password: 4vDmJ3mj).
