Publication details

Towards Hybrid Evaluation Methodologies for Large Language Models in the Legal Domain

Authors

SANCHI Marco, NOVOTNÁ Tereza

Year of publication 2024
Type Article in Proceedings
Conference Legal Knowledge and Information Systems: Proceedings of JURIX 2024 (Frontiers in Artificial Intelligence and Applications, Vol. 395)
MU Faculty or unit

Faculty of Law

Citation
Web Open access to the proceedings
DOI https://doi.org/10.3233/FAIA241279
Keywords Large Language Models; Thematic Analysis; Performance Evaluation
Description This paper analyses automated and human-driven approaches to evaluating the performance of Large Language Models (LLMs) in the legal domain, stressing the need to combine both into hybrid evaluation frameworks. This conclusion is reinforced by a qualitative case study that uncovers the assessment factors lawyers consider when using LLMs. The diversity of these factors, each calling for a distinct evaluation approach, underscores the need to adopt a hybrid methodology.
