InSaAF: Incorporating Safety Through Accuracy and Fairness

Centre for Responsible AI, IIT Madras | Precog, IIIT Hyderabad | AmexAI Labs
*Indicates Equal Contribution

The Problem

Large Language Models (LLMs) have demonstrated remarkable capabilities across various domains, including the legal sector. But are they really ready for deployment, especially in the Indian legal domain? Or do they exhibit biases?

LLaMA Bias Prediction

LLaMA predicts different outputs for prompts that differ only in the identity of the individual (Christian vs. Hindu). Deploying such LLMs in the real world may lead to biased and unfavourable outcomes.

Abstract

Large Language Models (LLMs) have emerged as powerful tools to perform various tasks in the legal domain, ranging from generating summaries to predicting judgments. Despite their immense potential, these models have been proven to learn and exhibit societal biases and make unfair predictions. Hence, it is essential to evaluate these models prior to deployment. In this study, we explore the ability of LLMs to perform Binary Statutory Reasoning in the Indian legal landscape across various societal disparities. We present a novel metric, β-weighted Legal Safety Score (LSSβ), to evaluate the legal usability of the LLMs. Additionally, we propose a finetuning pipeline, utilising specialised legal datasets, as a potential method to reduce bias. Our proposed pipeline effectively reduces bias in the model, as indicated by improved LSSβ. This highlights the potential of our approach to enhance fairness in LLMs, making them more reliable for legal tasks in socially diverse contexts.

Methodology

The proposed work is divided into three components:

  1. Construction of a synthetic dataset
  2. Quantifying the usability of LLMs in the Indian legal domain through the lens of the fairness-accuracy tradeoff
  3. Bias mitigation by finetuning the LLM

Finetuning Pipeline

Our proposed finetuning pipeline. The Vanilla LLM is finetuned with two sets of prompts: with and without identity terms. The baseline dataset ensures that the model's natural language generation abilities remain intact. After finetuning, each model is evaluated on the test dataset against the LSS metric.

1. Dataset Construction

We created a synthetic dataset for Binary Statutory Reasoning (BSR), the task of determining whether a given law applies to a described situation. The dataset includes (a construction sketch follows the list):

  • 1500 samples for each identity type
  • 74K prompt instances in total
  • 7% of samples labelled "YES" (the law applies)
  • BSR-with-ID: dataset with identity information
  • BSR-without-ID: auxiliary dataset with identity terms removed
  • BSR-Test-with-ID: test dataset with identity terms
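
To make the construction concrete, below is a minimal sketch of how identity-conditioned BSR prompt instances could be assembled from templates. The identity terms, situation and statute texts, labels, and field names are illustrative placeholders, not the actual dataset contents.

```python
# Minimal sketch of template-based BSR prompt construction.
# All identity terms, templates, and labels below are illustrative placeholders.
from itertools import product

identities = ["a Hindu man", "a Christian man", "a Muslim woman"]  # example identity terms
situations = [
    # (situation template with an {identity} slot, statute text, gold label)
    ("{identity} is denied entry to a public restaurant because of their religion.",
     "Article 15 prohibits discrimination on grounds of religion, race, caste, sex or place of birth.",
     "YES"),
    ("{identity} parks a car in a no-parking zone.",
     "Article 15 prohibits discrimination on grounds of religion, race, caste, sex or place of birth.",
     "NO"),
]

def make_prompt(identity: str, situation: str, statute: str) -> str:
    """Build one Binary Statutory Reasoning prompt: does the statute apply?"""
    return (
        f"Situation: {situation.format(identity=identity)}\n"
        f"Law: {statute}\n"
        "Question: Does the law apply to this situation? Answer YES or NO."
    )

# BSR-with-ID keeps the identity in the prompt; BSR-without-ID neutralises the identity slot.
bsr_with_id = [
    {"prompt": make_prompt(idt, situ, law), "label": lbl}
    for idt, (situ, law, lbl) in product(identities, situations)
]
bsr_without_id = [
    {"prompt": make_prompt("a person", situ, law), "label": lbl}
    for (situ, law, lbl) in situations
]
```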

2. Legal Safety Score (LSS)

We introduced a novel metric to evaluate LLMs in the legal domain:

  • Relative Fairness Score (RFS): Measures the proportion of samples for which the LLM gives the same prediction regardless of the individual's identity
  • F1 Score: Measures the task accuracy of the predictions
  • β-weighted Legal Safety Score (LSSβ): Combines RFS and F1 score to quantify usability

The formula: LSSβ = (1 + β²) × (RFS × F1) / (RFS + β² × F1)
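
For concreteness, here is a minimal sketch of the metric computation. The function names and the example RFS/F1 values are our own illustrative choices, not numbers from the paper.

```python
def relative_fairness_score(preds_by_situation: list[list[str]]) -> float:
    """RFS: fraction of situations whose predictions agree across all identity variants."""
    agree = sum(len(set(variants)) == 1 for variants in preds_by_situation)
    return agree / len(preds_by_situation)

def legal_safety_score(rfs: float, f1: float, beta: float = 1.0) -> float:
    """Beta-weighted LSS, per the formula above: beta < 1 emphasises F1, beta > 1 emphasises RFS."""
    if rfs == 0.0 and f1 == 0.0:
        return 0.0
    return (1 + beta**2) * (rfs * f1) / (rfs + beta**2 * f1)

# Illustrative values only: as beta grows, the score is pulled toward the RFS.
rfs, f1 = 0.9, 0.6
for beta in (0.5, 1.0, 2.0):
    print(f"beta={beta}: LSS = {legal_safety_score(rfs, f1, beta):.3f}")
```

With these illustrative numbers, LSSβ moves from about 0.64 at β = 0.5 (closer to F1) to about 0.82 at β = 2 (closer to RFS), mirroring findings 3 and 4 in the Key Findings below.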

3. Finetuning for Bias Mitigation

We studied three LLM variants (a finetuning sketch follows this list):

  • LLM-Vanilla: Original model (baseline)
  • LLM-with-ID: Finetuned on the BSR-with-ID dataset
  • LLM-without-ID: Finetuned on the BSR-without-ID dataset (inspired by Rawls' Veil of Ignorance theory)
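
As a rough sketch, one finetuned variant (e.g. LLM-with-ID) could be produced with LoRA via Hugging Face transformers and peft roughly as below. The checkpoint name, LoRA hyperparameters, data file, and prompt formatting are assumptions for illustration, not the exact training recipe used in the paper.

```python
# Hedged sketch: LoRA finetuning of a LLaMA-family model on a BSR-style dataset.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

base_model = "meta-llama/Llama-2-7b-hf"          # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],          # typical LLaMA attention projections
))

# Assume a JSONL file with one BSR sample per line, holding "prompt" and "label" fields.
data = load_dataset("json", data_files="bsr_with_id.jsonl")["train"]
data = data.map(lambda ex: tokenizer(ex["prompt"] + " " + ex["label"],
                                     truncation=True, max_length=512))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llm_with_id", per_device_train_batch_size=4,
                           num_train_epochs=1, learning_rate=2e-4, fp16=True,
                           logging_steps=50, save_strategy="epoch"),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The same pipeline applied to bsr_without_id.jsonl would yield the LLM-without-ID variant.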

Results & Discussion

Experimental Setup

We evaluated multiple variants of Meta's LLaMA models:

  • LLaMA 7B
  • LLaMA-2 7B
  • LLaMA-3.1 8B

Models were finetuned using Low-Rank Adaptation (LoRA) on an A100 80GB GPU with float16 precision. We monitored validation loss on the Penn Treebank to guard against catastrophic forgetting.
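
To illustrate the catastrophic-forgetting check, the snippet below estimates validation loss and perplexity on Penn Treebank text for a finetuned checkpoint. The dataset identifier ("ptb_text_only"), checkpoint path, window size, and stride are assumptions for this sketch, not the paper's exact evaluation setup.

```python
# Hedged sketch: track Penn Treebank validation loss after finetuning,
# to check that general language-modelling ability has not degraded.
import math

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
# "llm_with_id" is a placeholder path to a finetuned checkpoint (recent transformers
# versions can load PEFT adapter directories directly).
model = AutoModelForCausalLM.from_pretrained("llm_with_id", torch_dtype=torch.float16,
                                             device_map="auto").eval()

ptb = load_dataset("ptb_text_only", "penn_treebank", split="validation")
enc = tokenizer("\n".join(ptb["sentence"]), return_tensors="pt")

losses, max_len, stride = [], 1024, 512
for start in range(0, enc.input_ids.size(1) - max_len, stride):
    ids = enc.input_ids[:, start:start + max_len].to(model.device)
    with torch.no_grad():
        out = model(ids, labels=ids)          # mean cross-entropy over the window
    losses.append(out.loss.item())

val_loss = sum(losses) / len(losses)
print(f"PTB validation loss: {val_loss:.3f}, perplexity: {math.exp(val_loss):.1f}")
```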

Key Findings

  1. Our finetuning strategy progressively increased the LSS for all LLaMA models
  2. The vanilla LLaMA-3.1 model showed a significantly higher LSS than the other models, and its LSS improved further after finetuning
  3. When β < 1, LSSβ is primarily controlled by the F1 score
  4. As β increases, LSSβ becomes dominated by the RFS values

LSS Trends

Trends of F1 score, RFS, and LSS across various finetuning checkpoints for the LLaMA models. We observe that the LSS progressively increases with finetuning. The variation shows that LSS takes into account both the RFS and F1 score. The Vanilla LLM corresponds to checkpoint 0, marked separately by ◦.

Conclusion & Future Work

Our research examines bias, fairness, and task performance of LLMs in the Indian legal domain, introducing the β-weighted Legal Safety Score (LSSβ) to jointly quantify fairness and task performance. Finetuning with custom legal datasets improves the LSS, making models more suitable for legal contexts.

While our findings provide valuable insights, further research is needed to:

  • Address recent case histories and legal precedents
  • Conduct deeper social group analysis
  • Expand beyond Binary Statutory Reasoning to more complex legal tasks

Our work is a preliminary step toward safer LLM use in the legal field, particularly in socially diverse contexts like India.

BibTeX

@inbook{Tripathi2024,
  title = {InSaAF: Incorporating Safety Through Accuracy and Fairness - Are LLMs Ready for the Indian Legal Domain?},
  ISBN = {9781643685625},
  ISSN = {1879-8314},
  url = {http://dx.doi.org/10.3233/FAIA241266},
  DOI = {10.3233/faia241266},
  booktitle = {Legal Knowledge and Information Systems},
  publisher = {IOS Press},
  author = {Tripathi, Yogesh and Donakanti, Raghav and Girhepuje, Sahil and Kavathekar, Ishan and Vedula, Bhaskara Hanuma and Krishnan, Gokul S. and Goel, Anmol and Goyal, Shreya and Ravindran, Balaraman and Kumaraguru, Ponnurangam},
  year = {2024},
  month = dec 
}