Putting Yourself in Someone Else’s Shoes: Improving Human-Generated Labels with Empathy Priming

The quality of human-generated labels is crucial for training and evaluating machine learning models, especially in subjective domains. Traditional methods for improving label quality often face scalability and cost challenges, while computational techniques struggle to mitigate the biases inherent in subjective labeling. This study introduces cognitive empathy priming, a psychological treatment that strengthens perspective-taking skills, as a strategy for improving label quality. Through multiple experiments on a sexism classification task, we demonstrate that empathy priming increases agreement among crowd-sourced labelers and aligns their labels more closely with expert labels. Furthermore, training state-of-the-art open-source models on empathy-primed labels improved their performance, whereas models trained on labels obtained through standard methods exhibited decreased performance. Our findings show that empathy-based treatments can reduce annotator bias in identifying sexist content, improving label quality at scale and at low cost.