Hi! I am Daniel, a double PhD student at the 🇮🇹 University of Milano-Bicocca and the 🇳🇱 University of Groningen, working on the interpretability, fairness, and security of generative (and non-generative) Large Language Models. My supervisors are Elisabetta Fersini and Malvina Nissim.
My research focuses on using interpretability as a tool to make generative models safer, more reliable, and less toxic, in order to extend and improve their real-world applications.
Let the Models Respond: Interpreting Language Model Detoxification Through the Lens of Prompt Dependence, 2023
MIND at SemEval-2023 Task 11: From Uncertain Predictions to Subjective Disagreement. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), Jul 2023