Dholuo speech dataset
Part of African Next Voices: Data collection in Kenya (KenCorpus Consortium, Gates Foundation). The corpus contains scripted and unscripted speech across multiple domains, collected through ethical, community-led processes, and is released under CC BY 4.0. This configuration covers Dholuo (Nyandwat and Milambo dialects). See the Hugging Face organization for the latest splits and attribution, and the dataset card for transcription status and ethical use terms.
Key details
- Hours available: 723
- Speakers: not listed (the source page shows 0)
- Access: Hugging Face
- Audio format: WAV (per dataset card)
- Accents: Kenyan Dholuo
Dataset details
- Hours available: 723
- Age range: 18 to 60+
- Download size: not listed (see the Hugging Face dataset card)
- Number of speakers: not listed (the source page shows 0)
- Audio format: WAV (per dataset card)
- Accents: Kenyan Dholuo
Additional information
African Next Voices — Kenya
This listing points to African Next Voices in Kenya (KenCorpus Consortium, Gates Foundation): scripted and unscripted speech collected through community-led processes, with per-language dataset repos under the Anv-ke organization on Hugging Face. The public cards describe domains, splits, transcription coverage, and ethical use; treat releases as work in progress and follow CC BY 4.0 attribution on the dataset card.
More languages & resources
Open the Hugging Face dataset card for this language for loading instructions, columns, and the latest statistics. The Anv-ke organization lists sibling repos (Dholuo, Kikuyu, Somali, Kalenjin, Maasai). Use only as permitted on the card (research and ASR-related development; no surveillance or unethical profiling).
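As a rough illustration of what loading might look like, here is a minimal sketch using the Hugging Face `datasets` library in streaming mode, so that nothing close to the full 723 hours is downloaded up front. The organization name `Anv-ke` comes from this listing; the repo name `Dholuo` and the split name `train` are assumptions made for illustration and may not match the actual repo, so check the dataset card first. The card may also require accepting access terms or logging in before the data can be read.

```python
from typing import List

ORG = "Anv-ke"        # organization name as given in this listing
REPO_NAME = "Dholuo"  # ASSUMPTION: hypothetical repo name, confirm on the org page
SPLIT = "train"       # ASSUMPTION: hypothetical split name, confirm on the dataset card


def repo_id(org: str = ORG, name: str = REPO_NAME) -> str:
    """Build the 'organization/repo' identifier used by the Hugging Face Hub."""
    return f"{org}/{name}"


def peek_columns(split: str = SPLIT) -> List[str]:
    """Stream the first example and return its column names.

    Requires `pip install datasets` and network access. Streaming avoids
    downloading the whole corpus just to inspect its schema.
    """
    from datasets import load_dataset  # imported lazily so repo_id() works offline

    stream = load_dataset(repo_id(), split=split, streaming=True)
    first_example = next(iter(stream))
    return list(first_example.keys())


if __name__ == "__main__":
    print("Repo:", repo_id())
    # Uncomment once the repo and split names are confirmed on the dataset card:
    # print("Columns:", peek_columns())
```

Inspecting the column names before writing any training code is a cheap way to confirm which fields hold audio and transcripts, since the card notes that transcription coverage is still in progress. Any use should stay within the terms on the card and carry the CC BY 4.0 attribution.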