
07 November 2023 | doi: 10.5281/zenodo.13221794

Participation with Impact: Insights into the processes of Common Voice

Common Voice is a crowdsourcing project of the Mozilla Foundation. It is building a publicly available voice dataset from recordings donated by volunteers around the world. Since 2019, people have been able to use this dataset as a basis for building voice applications that are as non-discriminatory as possible. Previously, many of the voice datasets used to build AI systems, such as translation tools or the voice assistant Alexa, favoured white, English-speaking men. As a result, many of these technologies do not work at all in many languages, and in the languages where they do work, they often do not work equally well for all people. This is why Common Voice, with its inclusive dataset, is committed to including previously excluded and future user groups in many of its decision-making processes.

As our research report for the Civic Coding Innovation Network (2022) highlights, this is a prerequisite for the development of AI for the public interest. But how exactly does Common Voice pave the way for more equitable and inclusive voice-driven applications through social participation? What can other projects learn from it? These and other questions are explored in this blog post.

The Importance of Participation in AI Development

Henrik Mucha, together with other researchers, describes participatory design in AI development in i-com magazine: “Participatory design means recognising that those who will be affected by a future technology should have an active say in its development.” This is necessary to develop AI applications that work for and benefit the technology’s target audience. This only works if these people are actually involved in the process of development.

The idea of participatory design and collaborative decision-making is playing an increasingly important role, and not only in the public interest AI projects that are the particular focus of our research. However, participation is not a universal remedy for tailoring technologies more closely to the needs of their users. In the worst case, participation measures without a clear impact can even weaken trust in technologies.

The Civic Coding research report, with reference to the publication by Shirley Ogolla and A. Gupta, shows that the “relevance of participatory approaches to AI development […] is increasingly widely recognised and implemented”. At the same time, Dr Züger and Dr Asghari point out that the question remains open as to what exactly forms of participation should look like and, above all, how they become effective. The term can quickly be used merely to make a name for oneself. It is therefore important to be vigilant against possible “participation-washing”. According to Mona Sloane, this refers to the inclusion of a community in an “exploitative and extractive way”. She also writes that true participatory design must be understood as situation- and context-dependent.

But how is effective participation implemented in practice? We describe this below using the example of Common Voice and its publicly available voice dataset.

How Common Voice implements participation

As mentioned above, Common Voice aims to build a voice dataset that is as fair and non-discriminatory as possible. This means that the applications developed on the basis of Common Voice should be equally accessible and usable for all language communities and user groups. So far, this cannot be taken for granted in other applications. According to the Civic Coding report, male-classified voices or an American accent, for example, are recognised better than female voices or the accents of a less represented language, such as Persian or Indonesian. It is therefore important to include language communities that are otherwise often underrepresented in the construction of such datasets.

The project illustrates well how an actual implementation of participatory decision-making processes can be realised. It thus offers insights that are also relevant for other actors in the field of AI development. For our case analysis, we interviewed Common Voice project staff and quote from this interview below.

For Common Voice, it is important to raise awareness of data sovereignty among the people who donate their data, in other words, to make them realise that they have a say regarding their data (extract from the interview). For example, the Māori language community decided not to make their voice data available when considering a collaboration with various tech players, including Common Voice. Common Voice respects this. A distinctive feature of Common Voice is that its dataset is openly available under the CC0 licence. This licence allows anyone who downloads the dataset to use it as if it were free of copyright, including for commercial purposes. However, the fact that data donors give up their copyright must also be viewed critically, and it was decisive for the Māori’s decision not to donate their data.

On the Common Voice homepage, it is easy to donate a voice recording by saying prescribed sentences aloud. It is also possible to validate recorded sentences by providing feedback on whether they were read out correctly. But participation does not only take place through a voice donation. It also plays a role when it comes to making concrete decisions about development processes.

How Common Voice makes project decisions

There is a whole range of processes and structures for this:

The Representatives Council ensures that the language communities concerned are represented in the decision-making processes. Anyone from a language community can nominate themselves and be elected to the council. Elected members then keep their seat for a certain period of time.
The opinions of the different language communities are regularly sought through surveys.
Experts consulted include language experts, programmers, technical advisors and political scientists. Their assessments are incorporated into the development via advisory committees (so-called steering committees). These committees consist of the management of the Mozilla Foundation and advisory and funding partners of Common Voice. Especially in cases of conflict, these advisory committees are consulted for decision-making (Mozilla Common Voice Governance Doc V1.0).
Whether or not a change is made to the dataset is decided on the basis of the prioritisation matrix. Here, the cost-benefit ratio is weighed in relation to the public interest. Changes or new features are ranked accordingly and then implemented or discarded.
In addition, transparency is to be ensured, for example through a community forum, a blog and the publication of decisions. Through these measures for transparency and openness, a participatory and deliberative decision-making process is created overall.
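
The prioritisation step described above can be pictured as a simple scoring exercise. The following is a minimal sketch in Python and not Common Voice's actual matrix: the `Proposal` fields, the 1 to 5 scales, the benefit-to-cost formula, the threshold and the example proposals are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    """A proposed change or feature (all fields are hypothetical)."""
    name: str
    public_benefit: int  # assumed scale: 1 (low) to 5 (high)
    cost: int            # assumed scale: 1 (cheap) to 5 (expensive)


def priority(p: Proposal) -> float:
    """Benefit-to-cost ratio; a higher value means more worthwhile."""
    return p.public_benefit / p.cost


def prioritise(proposals, threshold=1.0):
    """Rank proposals by ratio and split them into (implement, discard)."""
    ranked = sorted(proposals, key=priority, reverse=True)
    implement = [p for p in ranked if priority(p) >= threshold]
    discard = [p for p in ranked if priority(p) < threshold]
    return implement, discard


# Invented example proposals, not taken from the Common Voice project.
proposals = [
    Proposal("Add a new language", public_benefit=5, cost=3),
    Proposal("Redesign the landing page", public_benefit=2, cost=4),
    Proposal("Improve accent metadata", public_benefit=4, cost=2),
]

implement, discard = prioritise(proposals)
print("Implement:", [p.name for p in implement])
print("Discard:", [p.name for p in discard])
```

In a participatory setting, the scores themselves would presumably come out of the surveys and committee discussions described above rather than being set by a single person; the sketch only illustrates the ranking and cut-off logic.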

All these structures have proven their worth over the years of work and have been constantly developed further.
According to the Mozilla Foundation, the dataset is now used for training and testing by major technology companies developing speech recognition and speech-to-text engines.

Challenges of participatory decision-making

Participatory decision-making processes are often more complex than hierarchical decision-making because more people are involved, which increases the time required. Moreover, adequate implementation of such participatory structures is costly: “Doing this is expensive. People’s time is expensive. The infrastructure is expensive. Making changes to the infrastructure is expensive, et cetera. And I think organisations sometimes go in without full appreciation that it is a pricey endeavour, and doing it well takes years.” (extract from the interview with Common Voice). The expense involved in a participatory process is often underestimated and sometimes inadequately factored into budgeting, which can become a problem for ongoing projects with the ambition to create a participatory process. One point Common Voice emphasises in the interview is the challenge of dealing with power inequality. It requires active facilitation in the process to give groups in weaker positions as much influence on decisions as important donors.

What we can learn from Common Voice

Participatory decisions and developments are expensive, time-consuming and difficult. So what now? A cheap and seemingly simple alternative to participatory processes is to scrape voice data from the internet without the consent of the individuals concerned. In this approach, the voices are extracted from videos and compiled into a dataset. One example is the YouTube-8M dataset. One consequence is that voice technologies based on such data do not work equally well for all user groups (Birhane 2021). It also raises the question of who owns these voice datasets and who may and should decide what data they contain and how the data may be used.

As an instructive case study, Common Voice shows how issues of data governance can be addressed in a participatory way, and how even complex decision-making processes about the further development of technologies can be designed to be participatory. It shows that participatory decision-making is achievable and successful if organisations are willing to take up the challenge.

References

Züger, T., Faßbender, J., Kuper, F., Nenno, S., Katzy-Reinshagen, A., & Kühnlein, I. (2022). Civic Coding: Grundlagen und empirische Einblicke zur Unterstützung gemeinwohlorientierter KI. Civic Coding Initiative.

Züger, T., & Asghari, H. (2022). AI for the public. How public interest theory shifts the discourse on AI. AI & Society. DOI: 10.1007/s00146-022-01480-5

Mucha, H., Correia de Barros, A., Benjamin, J., Benzmüller, C., Bischof, A., Buchmüller, S., de Carvalho, A., Dhungel, A., Draude, C., Fleck, M., Jarke, J., Klein, S., Kortekaas, C., Kurze, A., Linke, D., Maas, F., Marsden, N., Melo, R., Michel, S., Müller-Birn, C., Pröbster, M., Rießenberger, K., Schäfer, M., Sörries, P., Stilke, J., Volkmann, T., Weibert, A., Weinhold, W., Wolf, S., Zorn, I., Heidt, M. & Berger, A. (2022). Collaborative Speculations on Future Themes for Participatory Design in Germany. i-com, 21(2), 283-298. DOI: https://doi.org/10.1515/icom-2021-0030

Sloane, M., Moss, E., Awomolo, O., & Forlano, L. (2020). Participation is not a Design Fix for Machine Learning. arXiv: http://arxiv.org/abs/2007.02423

Birhane, A. (2021). Algorithmic injustice: a relational ethics approach. DOI: https://doi.org/10.1016/j.patter.2021.100205

This post represents the view of the author and does not necessarily represent the view of the institute itself. For more information about the topics of these articles and associated research projects, please contact info@hiig.de.


Birte Lübbert

Irina Kühnlein
