African Next Voices

Kikuyu speech dataset

African Next Voices: data collection in Kenya (KenCorpus Consortium, Gates Foundation). The corpus contains scripted and unscripted speech across multiple domains, collected through ethical, community-led processes, and is released under CC BY 4.0. This configuration covers Kikuyu (Gĩ-Kabete, Ki-Mathira, Ki-Muranga, Ki-Ndia, Gĩ-Gichugu). See the dataset card and the Hugging Face organization page for the latest hours, splits, and attribution.

Key details

Hours available: 754
Speakers: 0 (placeholder; see the dataset card)
Access: Hugging Face
Audio format: WAV (per dataset card)
Accents: Kenyan Kikuyu

Dataset details

Hours available: 754
Age range: 18 to 60+
Download size: see the Hugging Face dataset card
Number of speakers: 0 (placeholder; see the dataset card)
Audio format: WAV (per dataset card)
Accents: Kenyan Kikuyu

Additional information

African Next Voices — Kenya

This listing points to African Next Voices in Kenya (KenCorpus Consortium, Gates Foundation): scripted and unscripted speech collected through community-led processes. Each language has its own dataset repo under the Anv-ke organization on Hugging Face. The public cards describe domains, splits, transcription coverage, and ethical use. Treat releases as work in progress, and follow the CC BY 4.0 attribution given on the dataset card.

More languages & resources

Open the Hugging Face dataset card for this language for loading instructions, columns, and the latest statistics. The Anv-ke organization lists the per-language repos (Dholuo, Kikuyu, Somali, Kalenjin, Maasai). Use the data only as the card permits: research and ASR-related development, with no surveillance or unethical profiling.
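
As a rough illustration of what loading might look like, the sketch below streams a few examples with the Hugging Face `datasets` library instead of downloading the whole corpus. Several things here are assumptions and not taken from the dataset card: the `<org>/<language>` repo naming, the `train` split name, and the helper names `repo_id` and `peek`, which are hypothetical. The dataset card remains the authority on the exact repo id, splits, and any access terms you must accept first.

```python
ORG = "Anv-ke"  # organization name as given in this listing


def repo_id(language: str) -> str:
    """Build a repo id such as 'Anv-ke/Kikuyu'.

    The '<org>/<language>' pattern is an assumption for illustration;
    copy the exact id from the dataset card.
    """
    return f"{ORG}/{language.strip()}"


def peek(language: str = "Kikuyu", split: str = "train", n: int = 1):
    """Return the first n examples of a split as a list of dicts.

    The split name 'train' is assumed; check the card for real split names.
    """
    # Imported lazily so repo_id() works without the library installed.
    from datasets import load_dataset

    # streaming=True reads examples on demand instead of downloading
    # the full corpus up front.
    ds = load_dataset(repo_id(language), split=split, streaming=True)
    return list(ds.take(n))
```

Calling `peek()` and printing `sorted(example.keys())` for each returned example is one way to discover the column names before writing a training pipeline. Decoding the audio column may require additional audio dependencies, and a repo with access conditions may require logging in to Hugging Face first.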