Reconfidencing LLM Uncertainty from the Grouping Loss Perspective - Laboratoire Traitement et Communication de l'Information
Conference paper, 2024

Abstract

Large Language Models (LLMs), such as GPT and LLaMA, are susceptible to generating hallucinated answers in a confident tone. While previous efforts to elicit and calibrate uncertainty have shown some success, they often overlook biases towards certain groups, such as specific nationalities.

Existing calibration methods typically focus on average performance, failing to address this disparity. In our study, we demonstrate that the concept of grouping loss is an effective metric for understanding and correcting the heterogeneity in confidence levels. We introduce a novel evaluation dataset, derived from a knowledge base, specifically designed to assess the confidence scores of LLM responses across different groups. Our experimental results highlight significant variations in confidence, which are accurately captured by grouping loss. To tackle this issue, we propose a new method to calibrate the confidence scores of LLMs by considering different groups, a process we term reconfidencing. Our findings indicate that this approach effectively mitigates biases against minority groups, contributing to the development of fairer LLMs. The code is available at https://github.com/tigerchen52/reconfidencing_llms
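To make the idea of per-group recalibration concrete, here is a minimal sketch of the general technique: instead of one global confidence-to-accuracy mapping, a separate histogram-binning calibrator is fit for each group. The function name `reconfidence`, the triple format, and the binning scheme are illustrative assumptions, not the paper's exact method (see the repository above for the authors' implementation).

```python
from collections import defaultdict

def reconfidence(samples, n_bins=5):
    """Per-group histogram recalibration (illustrative sketch only).

    samples: iterable of (group, confidence, correct) triples,
             with confidence in [0, 1] and correct a bool.
    Returns a function mapping (group, confidence) to a
    recalibrated confidence.
    """
    def bin_of(conf):
        # Map a confidence in [0, 1] to one of n_bins equal-width bins.
        return min(int(conf * n_bins), n_bins - 1)

    # (group, bin) -> [number correct, total count]
    stats = defaultdict(lambda: [0, 0])
    for group, conf, correct in samples:
        key = (group, bin_of(conf))
        stats[key][0] += int(correct)
        stats[key][1] += 1

    def calibrated(group, conf):
        n_correct, n_total = stats.get((group, bin_of(conf)), (0, 0))
        if n_total == 0:
            # No data for this group/bin: leave the score unchanged.
            return conf
        # Replace the raw score with the group's empirical accuracy.
        return n_correct / n_total

    return calibrated
```

For example, if answers about one group are stated with 0.9 confidence but are only right half the time, the calibrator for that group maps 0.9 down toward 0.5, while other groups' scores are left untouched.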

Main file: Preprint_Reconfidencing_LLMs.pdf (1.14 MB)
Origin: files produced by the author(s)

Dates and versions

hal-04750567, version 1 (23-10-2024)

Cite

Lihu Chen, Alexandre Perez-Lebel, Fabian Suchanek, Gaël Varoquaux. Reconfidencing LLM Uncertainty from the Grouping Loss Perspective. EMNLP 2024 - Conference on Empirical Methods in Natural Language Processing, Nov 2024, Miami, United States. ⟨10.48550/arXiv.2402.04957⟩. ⟨hal-04750567⟩