A Roadmap for the Everyday Use of LLMs: Emerging Risks and Research Directions
Large Language Models (LLMs) are increasingly part of everyday life, shaping how people seek information, advice, and guidance. This rapid shift raises new challenges that extend beyond traditional NLP benchmarks, as models can influence decisions, beliefs, and perceptions in subtle but powerful ways. In this talk, I will reflect on recent research and ongoing work aimed at identifying these challenges and exploring how we can design LLMs that foster safer, more trustworthy, and more pluralistic interactions.
Camilla Casula, Sebastiano Vecellio Salto, Elisa Leonardelli and Sara Tonelli. Job Unfair: An Investigation of Gender and Occupational Bias in Free-Form Text Completion by LLMs. (EMNLP 2025)
Dennis Fucci, Marco Gaido, Matteo Negri, Luisa Bentivogli, Andre Martins, Giuseppe Attanasio. Different Speech Translation Models Encode and Translate Speaker Gender Differently. (ACL 2025)
Beatrice Savoldi, Giuseppe Attanasio, Eleonora Cupin, Eleni Gkovedarou, Janiça Hackenbuchner, Anne Lauscher, Matteo Negri, Andrea Piergentili, Manjinder Thind, Luisa Bentivogli. Mind the Inclusivity Gap: Multilingual Gender-Neutral Translation Evaluation with mGeNTE (EMNLP 2025)
Saba Ghanbari Haez, Mauro Dragoni. Neutral Is Not Unbiased: Evaluating Implicit and Intersectional Identity Bias in LLMs Through Structured Narrative Scenarios (EMNLP 2025 Findings)
Sofia Brenna, Elisabetta Jezek, Bernardo Magnini. Investigating Proactivity in Task-Oriented Dialogues. (Dialogue & Discourse)
Daniela Occhipinti, Marco Guerini and Malvina Nissim. When Harry Meets Superman: The Role of The Interlocutor in Persona-Based Dialogue Generation. (ACL 2025)
Patrizio Bellan, Saba Ghanbari Haez, Leonardo Sanna, Simone Magnolini and Mauro Dragoni. Leveraging Multi-Agent Systems for Domain-Pertinence Query Classification in Informative Chatbots. (AIME 2025)
Sara Papi, Marco Gaido, Luisa Bentivogli, Alessio Brutti, Mauro Cettolo, Roberto Gretter, Marco Matassoni, Mohamed Nabih Ali Mohamed Nawar and Matteo Negri. FAMA: The First Large-Scale Open-Science Speech Foundation Model for English and Italian. (CLIC-it 2025)
Sara Papi, Peter Polák, Dominik Macháček, Ondřej Bojar. How “Real” is Your Real-Time Simultaneous Speech-to-Text Translation System? (TACL)
Abdul Hannan, Alessio Brutti, Daniele Falavigna. Input Conditioned Layer Dropping in Speech Foundation Models. (MLSP 2025)
Alan Ramponi, Marco Rovera, Robert Moro and Sara Tonelli. Multilingual vs Crosslingual Retrieval of Fact-checked Claims: A Tale of Two Approaches. (EMNLP 2025)
Daniel Russo, Fariba Sadeghi, Stefano Menini, Marco Guerini. EuroVerdict: A Multilingual Dataset for Verdict Generation Against Misinformation. (ACL 2025 Findings)
Alan Ramponi, Agnese Daffara and Sara Tonelli. Fine-grained Fallacy Detection with Human Label Variation. (NAACL 2025)
Seraphina Fong, Marco Matassoni, Alessio Brutti. Speech LLMs in Low-Resource Scenarios: Data Volume Requirements and the Impact of Pretraining on High-Resource Languages. (Interspeech 2025)
Marco Rovera, Serena Cristoforetti and Sara Tonelli. ModaFact: Multi-paradigm Evaluation for Joint Event Modality and Factuality Detection. (COLING 2025)
Stefano Menini, Daniel Russo, Alessio Palmero Aprosio, Marco Guerini. First-AID: the first Annotation Interface for grounded Dialogues. (ACL 2025 Demo)
Nicolò Penzo, Marco Guerini, Bruno Lepri, Goran Glavaš, Sara Tonelli. Don't Stop the Multi-Party! On Generating Synthetic Multi-Party Conversations with Constraints.
Mohamed Nabih Ali, Daniele Falavigna, Alessio Brutti. EFL-PEFT: A communication Efficient Federated Learning framework using PEFT sparsification for ASR. (ICASSP 2025)
Beatrice Savoldi, Alan Ramponi, Matteo Negri, Luisa Bentivogli. Translation in the Hands of Many: Centering Lay Users in Machine Translation Interactions (EMNLP 2025)
Tsz Kin Lam*, Marco Gaido*, Sara Papi, Luisa Bentivogli, Barry Haddow. Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison. (NAACL 2025)
Dennis Fucci, Marco Gaido, Matteo Negri, Mauro Cettolo, Luisa Bentivogli. Echoes of Phonetics: Unveiling Relevant Acoustic Cues for ASR via Feature Attribution. (Interspeech 2025)
Bernardo Magnini, Roberto Zanoli, Michele Resta, Martin Cimmino, Paolo Albano, Marco Madeddu, Viviana Patti. Evalita-llm: Benchmarking large language models on Italian.
Bernardo Magnini, Saeed Farzi, Pietro Ferrazzi, Soumitra Ghosh, Alberto Lavelli, Giulia Mezzanotte, Manuela Speranza. A cost-effective approach to counterbalance the scarcity of medical datasets. (Frontiers in Disaster and Emergency Medicine)
Leonardo Sanna, Marco Bolpagni, Valentina Fietta, Giorgia Gavioli, Mattia Franzin, Mauro Dragoni and Silvia Gabrielli. Enriching Mental Health Chatbots using LLM-augmented State Machines: A Case Study on Self-Help+. (AIME 2025)
Darline Marx, Marco Matassoni, Alessio Brutti. Automatic detection of speech sound disorders in German speaking children: augmenting the data with typically developed speech. (Interspeech 2025)
Eugenio Marzona, Maria Goikhman, Alessio Palmero Aprosio, Massimo Zancanaro. Exploring Paraphrasing Strategies for CEFR A1-Level Constraints in LLMs (EMNLP 2025 Findings)
Daniel Russo, Stefano Menini, Jacopo Staiano, Marco Guerini. Face the Facts! Evaluating RAG-based Fact-checking Pipelines in Realistic Settings (INLG 2025)
Giacomo Gonella, Gian Maria Campedelli, Stefano Menini, Marco Guerini. CrisiText: A dataset of warning messages for LLM training in emergency communication.