Towards a Safe Large Multimodal Multilingual Model

At Ontocord.AI we strongly advocate for universal access to Artificial Intelligence that adheres to legal standards and upholds the rights of individuals. The recently adopted EU AI Act presents a promising avenue for establishing trust and safety in future AI, and we support this initiative. It will position Europe as a global leader in AI while prioritizing its people and economy.

A Proposal for the Next Foundation Model

A consortium of researchers, of which Ontocord was a major part, proposed the Synthetic and Augmented Data, Fair and Extreme-scaled Large Multimodal Model (SafeLMM) in an application to EuroHPC. SafeLMM is, we hope, a vital effort to meet the rising demand for regulatory-compliant, ethical and high-performing multimodal foundation models. We aspire to create high-quality synthetic and real data to train such models, while ensuring strict adherence to the forthcoming European Union AI Act. While the submitted proposal was very much an application, you can read a more user-friendly version of the proposal here.

The AI Act is expected to be a comprehensive framework for regulating AI systems in the EU, and perhaps a de facto standard globally as multinational corporations incorporate EU AI standards into their company policies. Based on our reading of the current draft of the AI Act, it promotes data governance, risk mitigation and the protection of fundamental rights.

We are committed to developing Large Language Models (LLMs) and Large Multimodal Models (LMMs) responsibly and legally, aligned with the EU's "human-centric" approach to AI technology. Our goal is to contribute to transparency in AI development by fully disclosing our methods and subjecting our work to peer review. Our ongoing work will be informed by industry-accepted principles of safety by design.

What is the difference between “safe” and “lawful” AI?

SafeLMM is but one of our efforts and a starting point for us to discuss the difference between safe and lawful AI. While the terms safe and lawful are often used interchangeably, Ontocord develops products, services and research that are both safe and lawful. First, safety concerns protecting downstream users from harm and disclosing potential harms, so that users are empowered to decide whether or not to use products, services or research. Our ethical view of safe AI is that adult users should be able to choose how to use their AI, as long as that use is lawful. Users who are minors, however, should have additional protection in terms of content and usage, and we urge parents of minors to monitor AI usage by their children. We acknowledge that others may have different ethical standards than we do.

Second, lawfulness is a requirement: while people may have different ethics, we all must follow the law. The law constantly changes, which poses a challenge, so we aspire to comply with the law in our jurisdiction as it currently stands and to proactively plan for upcoming laws such as the EU AI Act.

Biden-Harris Executive Order on AI

The proposal is especially relevant in light of the Executive Order on Artificial Intelligence issued by the Biden Administration on October 30, 2023. While our proposal is not specifically focused on U.S. requirements, the order underscores the importance of law-following AI.

Back to SafeLMM

At Ontocord, we aspire to create a safer and lawful LMM and look forward to contributing to SafeLMM if and when it is approved. Our aim is to bring to life the vision of SafeLMM, trained on 2 trillion tokens, with parameters ranging from 7 to 34 billion. We are committed to making these models available as open source and open science, ensuring that everyone can use AI models that comply with the EU AI Act. The consortium includes Ontocord.AI, PIISA.org, LAION e.V., the Juelich Supercomputing Center, the Horizon Europe project HPLT, Efficient Translation Limited, faculty at the University of Chicago, and many others.

We Value Your Comments and Support

In our ongoing efforts to facilitate a broader public discourse on how open source can contribute to AI safety, particularly through reference designs and training and usage safeguards, we encourage the scientific and AI community to reach out to us at engage@ontocord.ai. We value your input and support, so please don't hesitate to share your suggestions and join us in this endeavor.
