The AI2 ImpACT License Project: Open, Responsible AI Licenses for the Common Good

A new way to think about AI licensing


By Jen Dumas, Crystal Nam, Will Smith, David Atkinson, and Nicole DeCario

AI Is Everywhere; What Are the Risks?

Artificial intelligence is everywhere — on your smartphone, in your doctor’s office, in your car, and at your workplace. New AI products are emerging daily, and the rate at which these AI-enabled technologies are being adopted looks like the early stage of a steep growth curve. With a myriad of potential benefits also come significant challenges, including the diffusion of potentially harmful AI applications and powerful AI tools in the hands of malicious actors.

One way to address these challenges is through regulation, like the EU AI Act. But these efforts by lawmakers may not take effect for years to come. The challenges, meanwhile, are urgent and the risks of many AI use cases have already been documented. Like other institutions and researchers engaged in fundamental AI research, AI2 cannot wait for regulations and is brainstorming solutions it can apply today, like licenses.

Advancing Responsible AI

AI licenses are a potentially valuable instrument for incentivizing responsible uses of AI because licenses create enforceable legal obligations between the grantor and the user (Contractor et al., 2022). Among other advantages, a license allows ethical guidelines to be operationalized in the release and adoption of new AI technologies (see, e.g., Contractor & Muñoz Ferrandis, 2022). We developed the AI2 ImpACT Licenses not only to support and advance the movement around Responsible AI Licenses (RAILs) already in use with BLOOM and Llama, for example, but also to inspire a sea change in the way that AI development is done.

As described in greater detail in this blog post, the AI2 ImpACT Licenses make at least two novel contributions to responsible AI licensing: first, they adopt a risk-based rather than artifact-based approach; and second, they introduce public reporting of violators and disclosure requirements covering intended uses and project inputs. We feel that a risk-based license more closely tracks the organizational structure and philosophy of anticipated regulation. And although public reporting and disclosure requirements may seem atypical in a license, we believe they further support and reinforce the importance of transparency in the AI development process. We want to incentivize AI developers to be available for public scrutiny and to disclose the intended uses and intended users of their new AI applications. We also believe that AI developers, including AI2, can make critical contributions to the public commons through a collective commitment to accountability, collaboration, and transparency: in this case, by willingly sharing the inputs that make our AI research possible and by voluntarily submitting ourselves to public scrutiny where we fail to meet our own standards for responsible AI use.

We welcome your feedback on this initiative: please reach us at ai2impact@allenai.org.

Introducing the AI2 ImpACT Licenses

The ambition of the AI2 ImpACT License project is to leverage the power of shared community norms to make an impact on the common good in the field of AI. The AI2 ImpACT License project is named for and designed to implement the following core values of AI2: Impact, Accountability, Collaboration, and Transparency ("ImpACT"). There are currently three licenses in the AI2 ImpACT License family, each tailored to one of three risk categories: low, medium, and high risk.

Overall, we believe the AI2 ImpACT License family is distinctive in four ways:

  1. Artifact Agnostic: Each AI2 ImpACT License may be used to release any model or dataset. We believe the principles reflected in this approach could also be used to release any AI-related technology, including, for example, not just weights and data, but also software and source code, and possibly technology related to other industries outside of AI research and engineering.
  2. Risk-Based Use Restrictions: Each AI2 ImpACT License relies on a risk assessment completed by a multidisciplinary group of lawyers, ethicists, and scientists that triggers behavioral use restrictions rather than segmenting by artifact or artifact type. We believe this risk-based approach can be more direct and more effective in mitigating the potential harms of AI technology.
  3. Breaking the Black Box: Each AI2 ImpACT License includes disclosure requirements about the inputs used to develop derivatives based on the underlying artifacts (including identifying intended uses, funding sources, energy consumption, and data provenance for the derivatives). These Derivative Impact Reports facilitate community accountability, collaborative research, and public transparency by providing insight into currently opaque inputs that make today’s AI models and AI research possible. To incentivize public reporting, we also built in a safe harbor liability exemption for the information included in these good-faith public disclosures.
  4. Multi-Layered Enforceability Tools: Each AI2 ImpACT License acknowledges the technical difficulty of enforcing license restrictions for AI technology. We rely first on building up and supporting community norms around transparency and community accountability. These community norms constitute our primary enforcement tools, with more traditional legal and litigation-based remedies in a supporting role.

More Details: How We Built the AI2 ImpACT Licenses

For our first deployment of the AI2 ImpACT Licenses, which will be connected with the AI2 OLMo (Open Language Model) project, we completed an internal risk analysis to identify low-, medium-, and high-risk artifacts for the planned AI2 OLMo releases. In the future, we plan to publish guidance that establishes proposed best practices for doing AI artifact risk assessments.

Each AI2 ImpACT License starts from the basic premise that the power and potential use of a particular AI artifact, whether for good or for bad, is decoupled from the type of artifact it is. For example, the potential risk or harm of using a dataset composed of public information is very different from that of one composed of health data, yet both are "datasets" and would ordinarily be released under the same dataset license. To address this difference in harm characteristics, we flipped how licenses are typically organized and made potential risk, rather than artifact type, the determining factor driving licensing rights.

The second premise motivating each AI2 ImpACT License is the belief that community norms can be a more effective tool for both collaboration and enforcement than legal remedies alone. To encourage the responsible use of AI applications, we rely on an overlapping fabric of normative and legal incentives: (1) risk-based license terms; (2) behavioral use restrictions; (3) public disclosure of license violators; and (4) public disclosure of the use and impact of any derivatives created from the licensed artifact.

  • Risk-Based License Terms: As discussed above, each AI2 ImpACT License provides the corresponding rights and obligations based on a risk classification associated with each licensed artifact. Drawing on recent research concerning the ethical and social risks of language models (Weidinger et al., 2021), the risk category for each artifact is determined by several factors, including a harm risk score that considers the likelihood and impact of the AI artifact causing the following harms:
    (a) Discrimination, exclusion, and toxicity: risks arising from language models accurately reflecting natural speech, including unjust, toxic, and oppressive tendencies present in the training data;
    (b) Information hazards: risks arising from the language model predicting utterances that constitute private or safety-critical information present in (or inferable from) the training data;
    (c) Misinformation: risks arising from the language model assigning high probabilities to false, misleading, nonsensical, or poor-quality information;
    (d) Malicious uses: risks arising from humans intentionally using the language model to cause harm, such as fraud and cyber attacks;
    (e) Human-computer interaction: risks arising from conversational applications that engage users directly, such as users misjudging or mistakenly trusting the model, or applications exploiting that trust to elicit sensitive or private information linked to a user's identity; and
    (f) Automation, access, and environmental: risks arising from language models underpinning widely used downstream applications that disproportionately benefit some groups over others, perpetuating or increasing social inequalities.

Using this approach, the greater the risk of harm, the higher the risk category; and the higher the risk category, the more controls are required to mitigate the risk of the artifact contributing to harmful AI. The harm score, along with other considerations like the uniqueness and novelty of the artifact, informs the final overall risk classification: low, medium, or high (a toy illustration of how such a rubric might roll up into a classification appears after this list). We acknowledge that these particular harms may change over time, and we plan to revisit and iterate on our ratings and related analysis as we refine our internal process for identifying and assessing the risk of harm presented by our research.

  • Behavioral Use Restrictions: In developing the AI2 ImpACT Licenses, we drew inspiration from the work of the Responsible AI Licenses (RAIL) initiative, BigScience, Hugging Face, and others who defined behavioral use restrictions in the first generation of responsible AI licenses. We iterated on existing lists and asked ourselves what might already be covered by existing laws and what might be too ambiguous to be enforceable. We are planning a supplemental blog post about responsible AI releases that explains our analysis of use restrictions and harm mitigation in further detail. This is a fast-changing area, and the prohibited uses will need to be revisited regularly for breadth and currency.
  • Public Disclosure of License Violators: We also addressed the technical challenges of license enforcement by thinking about supplemental ways to encourage compliance. We were inspired by the effectiveness of community norms in supporting open-source software communities. Each AI2 ImpACT License authorizes AI2 to post notices on its website publicly identifying any licensee who violates the use-based restrictions of the license. This disclosure function allows researchers who release AI works to clearly identify who has, and who has not, honored their original commitment to use that AI responsibly. Anonymity should not shield malicious users from accountability, for example by concealing their role in disseminating harmful AI applications. Together, we can start to self-police the research community and the larger pool of AI users by providing pro-social support for responsible AI use and social disincentives for intentional misuse.
  • Public Disclosure of the Derivative Impact Reports: The Derivative Impact Reports request public disclosure of the inputs used to create a derivative model or dataset before its release; topics include intended uses, energy consumption, data provenance, and funding sources (a hypothetical sketch of what such a report might contain appears at the end of this section). We believe these Derivative Impact Reports will, in the long run, provide a valuable resource for future research and investigation into how AI artifacts are intended to be used and the costs of creating them, as compared to their later actual use in the world. In the short run, the value of the Derivative Impact Reports is to support a mind-shift in research and AI development towards openness, transparency, and collaboration. The more we share in good faith about what makes our research possible, the more we can learn from one another, hold each other accountable for our mistakes, and celebrate our successes. Freely shared information is a necessary condition for fairly making these determinations.
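To make the risk-based premise concrete, here is a toy sketch in Python of how a likelihood-times-impact harm score across the six harm categories above might roll up into a low/medium/high classification. The 1-5 scoring scale, the thresholds, and the aggregation rule below are invented for illustration only; they are not AI2's actual internal rubric, which also weighs considerations such as the uniqueness and novelty of the artifact.

```python
# Toy illustration only: not AI2's actual internal rubric.
HARM_CATEGORIES = [
    "discrimination_exclusion_toxicity",
    "information_hazards",
    "misinformation",
    "malicious_uses",
    "human_computer_interaction",
    "automation_access_environmental",
]

def classify_risk(scores: dict[str, tuple[int, int]]) -> str:
    """Map per-category (likelihood, impact) ratings, each 1-5, to a risk tier.

    The thresholds are invented for illustration; the real analysis also
    weighs factors such as the uniqueness and novelty of the artifact.
    """
    # Score each harm category as likelihood * impact and take the worst case,
    # so a single severe category is enough to escalate the classification.
    worst = max(likelihood * impact for likelihood, impact in scores.values())
    if worst >= 15:
        return "high"
    if worst >= 8:
        return "medium"
    return "low"

# Example: mostly low ratings, but sensitive records raise information hazards.
example = {category: (2, 2) for category in HARM_CATEGORIES}
example["information_hazards"] = (3, 4)
print(classify_risk(example))  # -> medium
```

Under these invented thresholds, one severe harm category is enough to escalate the artifact's tier, mirroring the principle above that the greater the risk of harm, the higher the risk category.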

For further details about how the AI2 ImpACT Licenses work, please see our summary of the legal text here.
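To illustrate the kind of structured disclosure a Derivative Impact Report calls for, here is a hypothetical sketch of the report as a small data structure. The field names and example values are ours, assembled from the topics named above (intended uses, energy consumption, data provenance, and funding sources); the authoritative list of required disclosures lives in the license text itself.

```python
from dataclasses import dataclass

# Hypothetical sketch: field names and values are illustrative, not the
# authoritative disclosure list defined in the AI2 ImpACT License text.
@dataclass
class DerivativeImpactReport:
    derivative_name: str        # the derivative model or dataset being released
    base_artifact: str          # the licensed artifact it was built from
    intended_uses: list[str]    # what the derivative is meant to do, and for whom
    funding_sources: list[str]  # who funded the work
    energy_consumption: str     # e.g., an estimate of GPU-hours or kWh used
    data_provenance: list[str]  # where the training or fine-tuning data came from

# A filled-in example for a hypothetical fine-tune of a licensed model.
report = DerivativeImpactReport(
    derivative_name="example-summarizer-v1",
    base_artifact="a hypothetical OLMo checkpoint",
    intended_uses=["academic research on news summarization"],
    funding_sources=["university research grant"],
    energy_consumption="~120 GPU-hours (estimated)",
    data_provenance=["publicly available news corpora"],
)
print(report)
```

Publishing these fields before release is what lets the community compare a derivative's stated intentions and costs against its later actual use.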

Next Steps: Our Invitation

We acknowledge that mitigating the challenges presented by AI’s potential harms is complicated. AI licensing is already a complex and ambiguous topic. Our goal is to think differently about how to address the upstream decisions that have great downstream impact. We are excited about this new approach and the flexibility it provides to address potential and unforeseen harms.

This approach is also not frozen in time: we plan to iterate on the AI2 ImpACT License family to make it clearer and easier to understand as thinking and technology evolve. We believe the AI2 ImpACT License project is a step forward in the movement around responsible AI licenses, one that will support all AI developers in responsibly developing and releasing AI models, datasets, and other artifacts for the benefit of the general public.

Our invitation to the research community and other AI developers is to join us in a new shared project: working together to embed ethical considerations and responsible practices into the AI development pipeline so that new AI applications and their outputs are responsible by design. We welcome feedback on this project in general and on this iteration of the AI2 ImpACT Licenses in particular at ai2impact@allenai.org.

References

In the RAILs terminology, the AI2 ImpACT Licenses can be classified as Open RAIL-DM licenses because they: (1) incorporate behavioral use-based restrictions for the purpose of mitigating risks associated with the distribution of AI technologies; (2) permit the free use and distribution of the licensed artifacts so long as the behavioral use restrictions are applied similarly to downstream derivatives; and (3) cover AI models (M suffix) and datasets (D suffix) (Muñoz Ferrandis et al., 2022).

Acknowledgments

Special thanks to Alvaro Herrasti, a researcher on AI2's PRIOR team, who first pitched the idea of a "software license for the common good" at AI2's 2022 Hackathon event. The current AI2 ImpACT License project is indebted to Alvaro for his inspiration and insight, and represents just one part of his original grand vision for pursuing the common good through licensing.

Visit the OLMo project page for the latest information about AI2’s upcoming language model.

Check out our current openings, follow @allen_ai on Twitter, and subscribe to the AI2 Newsletter to stay current on news and research coming out of AI2.
