Removing bias and ensuring impartiality in legal LLMs

30 Apr 2024


Nour Sibai, AI engineer at Siren, presenting at AUB's Women in Data Science conference on the use of LLMs and legal chatbots to address legal bias.

Generative AI systems have immense potential benefits and risks. While applications like deepfakes demonstrate the technology's potentially destructive power, many speakers at the American University of Beirut's Women in Data Science conference this year emphasised the importance of developing and deploying GenAI responsibly to solve clearly defined problems and use cases. One area of focus is using LLMs to mitigate bias and enhance reasoning in the legal sector. We spoke with Nour Sibai, an AI engineer at Siren, to learn more about the legal AI project she’s been working on.

At Women in Data Science, you presented “Casey,” a legal chatbot to aid in litigation. Can you tell us a bit more about it and what inspired the team to take on this project?

Our inspiration stemmed from the recognition that implicit social biases can influence judicial decision-making when, for example, determining whether to admit or dismiss a case, weighing evidence or exercising discretionary powers. There are concerns that these biases may manifest when legal rulings do not clearly cite the specific laws or precedents serving as their basis, which can lead the public to question the rationale behind the judgments. If large language models (LLMs) are then grounded on data derived from biased decisions, they can further perpetuate and amplify these biases. To pre-empt this vicious cycle, we sought to assist judges by using AI to suggest relevant legal articles and precedents that might not have been initially considered when ruling on a case. This can help enrich the decision-making process, making it more comprehensive, more transparent and less susceptible to bias, and ultimately supporting fairer outcomes.

Can you explain the process you took to eliminate bias from the LLM?

To ensure that Casey remains unbiased, we carefully curated its knowledge base, grounding it on laws, international treaties and case facts. We deliberately excluded any judicial decisions, opinions or reasoning processes. This method eliminates potential biases that might arise from subjective judicial interpretations.

Throughout the development process, we tested various prompts through trial and error to minimise biases to the greatest possible degree. The chatbot's design emphasises transparency; it cites specific legal articles and the documents from which information is drawn with every response. This not only ensures accuracy but also prevents any factual misrepresentations or hallucinations in outputs. Casey is also programmed to explain the reasoning for each legal article and decision suggested, helping to clarify the thought process. To validate and improve Casey's functionality, we conducted extensive testing with legal consultants. These experts ensured that Casey operates with the impartiality expected of a human judge.

Critically, we are implementing a feedback system where judges can interact with Casey's responses by liking, disliking, and commenting on them. They can also review and re-rank the relevance of the legal texts it uses to construct answers. This feedback, which we are collecting from a diverse group of judges, is then normalised to further refine Casey's responses.
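As an illustration only, here is a minimal Python sketch of one way feedback from several judges could be normalised before it is aggregated. The interview does not say how Siren normalises feedback; the per-judge z-score approach, the function name `normalise_feedback` and the `(judge_id, response_id, score)` record layout are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean, pstdev


def normalise_feedback(ratings):
    """Aggregate judge feedback after per-judge z-score normalisation.

    `ratings` is a list of (judge_id, response_id, score) tuples.
    This layout is hypothetical, not Casey's actual schema.
    Returns a dict mapping response_id to its mean normalised score.
    """
    # Collect each judge's scores so a strict judge and a generous
    # judge can be put on a comparable scale.
    scores_by_judge = defaultdict(list)
    for judge_id, _, score in ratings:
        scores_by_judge[judge_id].append(score)

    stats = {
        judge_id: (mean(scores), pstdev(scores))
        for judge_id, scores in scores_by_judge.items()
    }

    normalised = defaultdict(list)
    for judge_id, response_id, score in ratings:
        mu, sigma = stats[judge_id]
        # A judge who gives every response the same score carries no
        # ranking information, so their contribution is neutral (0.0).
        z = (score - mu) / sigma if sigma > 0 else 0.0
        normalised[response_id].append(z)

    return {rid: mean(zs) for rid, zs in normalised.items()}
```

Under this assumption, two judges who agree on which response is better contribute equally to the final ranking, even if one of them habitually gives higher scores than the other.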

You mentioned that Casey has not been grounded on jurisprudence and doctrine yet. How will grounding on this material affect its output?

Grounding Casey on jurisprudence and legal doctrine would significantly transform its capabilities and the sophistication of outputs. Currently, Casey operates within a highly objective framework where its primary function is to align the factual elements of a case with the corresponding legal statutes. Jurisprudence and legal doctrine go beyond mere statutory interpretation. They encompass the foundational principles, theoretical underpinnings and philosophical insights that guide the application and evolution of laws. Grounding Casey on these aspects would allow it to appreciate the reasons behind legal rules, explore the interplay between different legal norms, and understand the historical and ethical contexts that influence legal developments. This would enable it to better handle cases where the law is ambiguous or silent, engage with ethical dilemmas, and provide insights into the broader implications of legal decisions.

It sounds as though this involves introducing a degree of subjectivity into Casey. How do you manage this?

That’s right. These materials are inherently interpretative and context-dependent, and grounding Casey on them would require careful management and oversight to ensure accuracy and relevance.

To handle this complexity, we plan to engage a diverse panel of legal experts who can assess Casey’s outputs from a range of perspectives. Their expertise will be crucial in fine-tuning Casey’s responses, making sure its application of jurisprudence and doctrine is both legally sound and contextually appropriate.

Importantly, we’d still enable users to opt for interpretations based strictly on legislation. Alternatively, they could engage with more complex analyses that incorporate jurisprudence and doctrine. The aim is to make Casey adaptable to user needs and preferences, enabling it to offer recommendations that range from strictly legalistic to more philosophically enriched, depending on the user’s selection.
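The choice between a legislation-only mode and a richer mode could be sketched as a simple filter over the knowledge base, as in the Python example below. This is a hypothetical illustration, not Casey's design: the `Source` class, the `kind` labels and the `select_sources` function are invented here to show the idea.

```python
from dataclasses import dataclass

# Hypothetical source categories; the labels are assumptions.
LEGISLATION_ONLY = {"statute", "treaty"}
EXTENDED = LEGISLATION_ONLY | {"jurisprudence", "doctrine"}


@dataclass(frozen=True)
class Source:
    title: str
    kind: str  # one of the category labels above


def select_sources(sources, include_interpretive=False):
    """Return the sources the assistant may draw on for this query.

    With include_interpretive=False, only legislative texts pass,
    which mirrors the strictly legislation-based option described
    in the interview. Setting it to True also admits jurisprudence
    and doctrine.
    """
    allowed = EXTENDED if include_interpretive else LEGISLATION_ONLY
    return [s for s in sources if s.kind in allowed]
```

Keeping the choice as an explicit flag means the stricter mode is the default and interpretive material is only consulted when the user asks for it.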

If everyone has access to Casey, you can imagine that prosecution and defence will use it to pick apart their cases. In a sense they will be using the same brain. What impact do you think this will have on the legal sector?

We think this would 'level the playing field,' ensuring that both sides have equal access to high-quality legal insights and arguments. All parties would be able to better identify the strengths and weaknesses of their cases, potentially leading to more informed strategies, fewer errors, and fairer outcomes. Better prepared cases can also lead to quicker resolutions.

The aim is not to replace judges but to augment their ability to make unbiased, well-grounded and contextually relevant decisions. Of course, lawyers and judges could lean too heavily on the AI to do the "thinking" for them rather than developing their own legal faculties. This would be a mistake, as we should not treat the output of any AI system as infallible. But as a shared resource, we think Casey would likely push lawyers to be more creative due to its potential to raise the quality of legal argumentation across the board.
