Are you ready to bring more awareness to your brand? Consider becoming a sponsor for The AI Impact Tour. Learn more about the opportunities here.
As artificial intelligence infiltrates nearly every aspect of modern life, researchers at startups like Anthropic are working to prevent harms like bias and discrimination before new AI systems are deployed.
Now, in yet another seminal study published by Anthropic, researchers from the company have unveiled their latest findings on AI bias in a paper titled, “Evaluating and Mitigating Discrimination in Language Model Decisions.” The newly published paper brings to light the subtle prejudices ingrained in decisions made by artificial intelligence systems.
The paper not only exposes these biases, but also proposes a comprehensive strategy for creating AI applications that are more fair and just. The publication of this study comes in the wake of earlier research conducted by the company on the potential “catastrophic risks” of AI and the establishment of a constitutional framework for AI ethics earlier this year.
The company’s new research comes at just the right time, as the AI industry continues to scrutinize the ethical implications of rapid technological growth, particularly in the wake of OpenAI’s internal upheaval following the dismissal and reappointment of CEO Sam Altman.
The AI Impact Tour
Connect with the enterprise AI community at VentureBeat’s AI Impact Tour coming to a city near you!
Research method aims to proactively evaluate discrimination in AI
The paper, available on arXiv, presents a proactive approach in assessing the discriminatory impact of large language models (LLMs) in high-stakes scenarios such as finance and housing — a notable concern as artificial intelligence continues to penetrate sensitive societal areas.
“While we do not endorse or permit the use of language models for high-stakes automated decision-making, we believe it is crucial to anticipate risks as early as possible,” said lead author Alex Tamkin. “Our work enables developers and policymakers to get ahead of these issues.”
Study finds Patterns of discrimination in language model
Anthropic used their own Claude 2.0 language model and generated a diverse set of 70 hypothetical decision scenarios that could be input into a language model.
Examples included high-stakes societal decisions like granting loans, approving medical treatment, and granting access to housing. The prompts systematically varied demographic factors like age, gender and race to enable detecting discrimination.
“Applying this methodology reveals patterns of both positive and negative discrimination in the Claude 2.0 model in select settings when no interventions are applied,” the paper states. Specifically, the authors found their model exhibited positive discrimination favoring women and non-white individuals, while discriminating against those over age 60.
Interventions reduce measured discrimination
The researchers explain in the paper that the goal of the research is to enable developers and policymakers to proactively address risks: “As language model capabilities and applications continue to expand, our work enables developers and policymakers to anticipate, measure, and address discrimination.”
The researchers propose mitigation strategies like adding statements that discrimination is illegal and asking models to verbalize their reasoning while avoiding biases. These interventions significantly reduced measured discrimination.
Steering the course of AI ethics
The paper aligns closely with Anthropic’s much-discussed Constitutional AI paper from earlier this year. The paper outlined a set of values and principles that Claude must follow when interacting with users, such as being helpful, harmless and honest. It also specified how Claude should handle sensitive topics, respect user privacy and avoid illegal behavior.
“We are sharing Claude’s current constitution in the spirit of transparency,” Anthropic co-founder Jared Kaplan told VentureBeat in May, when the AI constitution was published. “We hope this research helps the AI community build more beneficial models and make their values more clear. We are also sharing this as a starting point — we expect to continuously revise Claude’s constitution, and part of our hope in sharing this post is that it will spark more research and discussion around constitution design.”
It also closely aligns with Anthropic’s work at the vanguard of reducing catastrophic risk in AI systems. Anthropic co-founder Sam McCandlish shared insights into the development of the company’s policy and its potential challenges in September — which could give some insight into the though process behind publishing AI bias research as well.
“As you mentioned [in your question], some of these tests and procedures require judgment calls,” McClandlish told VentureBeat in response to a question about Anthropic’s board of directors. ” We have real concern that with us both releasing models and testing them for safety, there is a temptation to make the tests too easy, which is not the outcome we want. The board (and LTBT) provide some measure of independent oversight. Ultimately, for true independent oversight it’s best if these types of rules are enforced by governments and regulatory bodies, but until that happens, this is the first step.”
Transparency and Community Engagement
By releasing the paper, data set, and prompts, Anthropic is championing transparency and public discourse — at least in this very specific instance — inviting the broader AI community to partake in refining new ethics systems. This openness fosters collective efforts in creating unbiased AI systems.
For those in charge of technical decision-making at enterprises, Anthropic’s research presents an essential framework for scrutinizing AI deployments, ensuring they conform to ethical standards. As the race to harness enterprise AI intensifies, the industry is challenged to build technologies that marry efficiency with equity.
VentureBeat’s mission is to be a digital town square for technical decision-makers to gain knowledge about transformative enterprise technology and transact. Discover our Briefings.