Defining the Minimum Viable Dataset for Cultural and Linguistic Empowerment
Volunteer now to be part of CODI’s inaugural research initiative
The Coalition on Digital Impact believes the Internet can break down barriers, amplify unheard voices, and fuel global innovation. But we must also recognize an inequity that is growing starker in the age of AI: a lack of language representation online and in the technologies shaping our future.
That’s why I’m thrilled to announce CODI’s first major research initiative: Defining the Minimum Viable Dataset for Cultural and Linguistic Empowerment. This milestone marks the beginning of our journey to ensure everyone has a seat at the digital table, regardless of language, culture, or geography.
Why This Initiative, and Why Now?
AI’s benefits are only as inclusive as the data behind them. Yet most of the world’s languages, especially Indigenous, minority, and oral-tradition languages, remain almost invisible in digital systems. To change this trajectory, we need to address the foundational issue: data.
That’s where our inaugural research project comes in. CODI is launching an effort to create a Minimum Viable Dataset (MVD) that can serve as a foundation for culturally relevant, ethically sourced, and community-owned digital resources.
Join Us: Call for Volunteers
We are now forming a Working Group to guide this initiative, and I want to personally invite experts, advocates, and community leaders to get involved. This is an all-hands-on-deck effort, and we are seeking volunteers from a wide range of disciplines, including:
- Language and cultural preservation
- Applied ethics and digital rights
- AI, NLP, and data science
- Legal and IP
- Community advocacy
- Metadata standards, data classification, and data governance
This is a volunteer role, but one with real impact. Working Group members will help shape a first-of-its-kind global framework to ensure communities thrive in the digital future.
Together, this group will define what a culturally relevant dataset looks like, establish ethical standards for data collection and use, and build a foundation that enables multicultural AI development.
Want to be a part of it? Submit your Expression of Interest via this form before September 29, 2025.
Read more about this initiative on the Coalition on Digital Impact’s website.