African speech datasets for AI

South African English conversational speech dataset built for ASR training, evaluation, and multilingual AI development, featuring real-world contact-centre-style interactions and diverse regional accents.

South Africa, Insurance, Retail +2

Sesotho Paid

Conversational Sesotho speech data collected from first-language speakers, designed to improve representation of under-resourced African languages in speech recognition and language model training.

South Africa, Insurance, Retail +2

isiZulu Paid

Production-ready isiZulu conversational speech dataset supporting ASR benchmarking and multilingual AI workflows, with tonal language coverage and realistic acoustic environments.

South Africa, Insurance, Retail +2

Afrikaans Paid

Afrikaans conversational speech data designed for speech recognition, conversational AI, and evaluation use cases, reflecting natural language usage across multiple domains.

Free & open

Africa Next Voices (Swivuriso)

Large-scale multilingual speech dataset for 7 South African languages—over 3,000 hours in total. Built for ASR research and inclusive technologies. Available free on Hugging Face (CC BY 4.0). Way With Words produced the South African component with DSFSI.

isiZulu Free

503h Hugging Face

Over 500 hours of isiZulu speech from the Swivuriso dataset—scripted and unscripted, first-language speakers—for ASR and inclusive speech technology.

isiXhosa Free

504h Hugging Face

Over 500 hours of isiXhosa speech from Swivuriso—scripted and unscripted, first-language speakers—for ASR and inclusive speech technology.

Sesotho Free

504h Hugging Face

Over 500 hours of Sesotho speech from Swivuriso—scripted and unscripted, first-language speakers—for ASR and inclusive speech technology.

Setswana Free

502h Hugging Face

Over 500 hours of Setswana speech from Swivuriso—scripted and unscripted, first-language speakers—for ASR and inclusive speech technology.

Xitsonga Free

500h Hugging Face

Over 500 hours of Xitsonga speech from Swivuriso—scripted and unscripted, first-language speakers—for ASR and inclusive speech technology.

Tshivenda Free

251h Hugging Face

Over 250 hours of Tshivenda speech from Swivuriso—scripted and unscripted, first-language speakers—for ASR and inclusive speech technology.

isiNdebele Free

252h Hugging Face

Over 250 hours of isiNdebele speech from Swivuriso—scripted and unscripted, first-language speakers—for ASR and inclusive speech technology.
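
The per-language hour counts listed for the Swivuriso entries can be tallied to check the "over 3,000 hours" total. The sketch below is only an illustration: it assumes the hour figures are exactly as displayed in the listing (they may be rounded), and the names `SWIVURISO_HOURS`, `total_hours`, and `share_by_language` are hypothetical helpers, not part of any published tooling.

```python
# Hours per language as displayed in the listing (assumed to be rounded figures).
SWIVURISO_HOURS = {
    "isiZulu": 503,
    "isiXhosa": 504,
    "Sesotho": 504,
    "Setswana": 502,
    "Xitsonga": 500,
    "Tshivenda": 251,
    "isiNdebele": 252,
}


def total_hours(hours_by_language):
    """Return the summed hours across all listed languages."""
    return sum(hours_by_language.values())


def share_by_language(hours_by_language):
    """Return each language's share of the total, as a percentage to one decimal place."""
    total = total_hours(hours_by_language)
    return {
        language: round(100 * hours / total, 1)
        for language, hours in hours_by_language.items()
    }


if __name__ == "__main__":
    print(f"Languages: {len(SWIVURISO_HOURS)}")
    print(f"Total hours: {total_hours(SWIVURISO_HOURS)}")
    for language, share in share_by_language(SWIVURISO_HOURS).items():
        print(f"{language}: {share}%")
```

A tally like this can help when planning a balanced multilingual training mix, since Tshivenda and isiNdebele each contribute roughly half the hours of the other five languages.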