Salvatore Junior Curello
Court judgment prediction and explanation based on Transformers.
Supervisors: Luca Cagliero, Irene Benedetto. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2023
PDF (Tesi_di_laurea), thesis, 4MB
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract: |
In recent years, Natural Language Processing (NLP) has substantially transformed the legal industry. NLP technologies have opened new opportunities for legal professionals, providing a wide range of resources to address traditional challenges and increase operational efficiency. One promising application is supporting judges by predicting the outcome of a legal case, thereby accelerating the judicial process. However, predicting the outcome of a case is not sufficient unless the prediction is accompanied by a corresponding explanation. Performing this task with an automated system is not straightforward. The legal domain is complex, with many specialized terms that require specific expertise to interpret. Furthermore, legal documents tend to be long, and extracting relevant information from them can be difficult. This thesis aims to address these challenges. The first part of the work addresses Case Decision Prediction, with extensive experiments using Transformer models on the ILDC (Indian Legal Documents Corpus) dataset. Domain-specific transformers such as CaseLawBERT and LegalLSGBERT perform well even though their pre-training corpora include US/EU legal documents, which come from legal systems entirely different from the Indian one. Moreover, Hierarchical Transformers yield a further slight increase in performance. Notably, the concluding section of the legal document was found to have the greatest impact on the prediction. The second part of the thesis focuses on the most challenging task: Case Decision Explanation. This phase uses a smaller portion of the corpus, annotated with gold-standard explanations by five legal experts. Human explanations are very difficult to obtain. Consequently, the annotated documents are not provided during training, so that the resulting model, trained only to make predictions, can generate explanations without being explicitly trained on them. To generate explanations, the occlusion method and the attention mechanism are used to extract the phrases from the case description that best justify the final decision. Performance is evaluated using a battery of metrics that measure the overlap between the expert annotators’ gold explanations and the machine-generated ones. In the current literature, there is still a wide gap between how a machine and a legal expert would explain a judgment for the same legal case. This thesis contributes to partially reducing this gap. |
---|---|
Supervisors: | Luca Cagliero, Irene Benedetto |
Academic year: | 2023/24 |
Publication type: | Electronic |
Number of pages: | 106 |
Subjects: | |
Degree programme: | Master's degree programme in Data Science And Engineering |
Degree class: | New organization > Master's degree > LM-32 - COMPUTER ENGINEERING |
Collaborating companies: | Not specified |
URI: | http://webthesis.biblio.polito.it/id/eprint/29318 |
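The abstract mentions the occlusion method for extracting the phrases that best justify a decision, and overlap metrics for comparing them with expert explanations. The following is only a minimal sketch of that general idea, not the thesis's implementation. The function `toy_predict_proba` is a hypothetical keyword-based stand-in for a fine-tuned Transformer classifier, sentence-level occlusion is an assumed granularity, and Jaccard similarity is just one illustrative overlap measure; the record does not state which metrics were actually used.

```python
from typing import Callable, List, Tuple


def toy_predict_proba(sentences: List[str]) -> float:
    """Hypothetical stand-in for a trained classifier (assumption).

    Returns a pseudo-probability that the appeal is accepted, based only on
    cue words. A real system would call a Transformer model here.
    """
    text = " ".join(sentences).lower()
    score = 0.5 + 0.2 * text.count("allowed") - 0.2 * text.count("dismissed")
    return min(1.0, max(0.0, score))


def occlusion_scores(
    sentences: List[str], predict: Callable[[List[str]], float]
) -> List[Tuple[str, float]]:
    """Score each sentence by how much the prediction changes when it is removed."""
    base = predict(sentences)
    scores = []
    for i, sent in enumerate(sentences):
        occluded = sentences[:i] + sentences[i + 1:]  # drop sentence i
        scores.append((sent, abs(base - predict(occluded))))
    return scores


def top_k_explanation(
    sentences: List[str], predict: Callable[[List[str]], float], k: int = 1
) -> List[str]:
    """Return the k sentences whose removal changes the prediction the most."""
    ranked = sorted(
        occlusion_scores(sentences, predict), key=lambda pair: pair[1], reverse=True
    )
    return [sent for sent, _ in ranked[:k]]


def jaccard_overlap(predicted: List[str], gold: List[str]) -> float:
    """Token-level Jaccard similarity between two sets of explanation sentences."""
    pred_tokens = set(" ".join(predicted).lower().split())
    gold_tokens = set(" ".join(gold).lower().split())
    if not pred_tokens and not gold_tokens:
        return 1.0
    return len(pred_tokens & gold_tokens) / len(pred_tokens | gold_tokens)
```

In this sketch the model is never trained on explanations: they are derived after the fact from how the prediction reacts to each removed sentence, which mirrors the setting described in the abstract. Swapping `toy_predict_proba` for a real model only requires a callable with the same signature.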