-
Beyond the Data: Language as Living Identity in a Digital Age
By Danica Roberts
A seTswana proofreader reflects on preserving language as identity through African Next Voices and why authentic speech data matters.
-
A Guide to Speech Datasets: Types, Uses, and Best Practices
A practical guide to scripted vs unscripted speech, multilingual and parallel data, multimodal and multi-speaker corpora, and how to match dataset types to ASR, conversational AI, and production deployments.
-
Beyond the Data: Rooted in Language
By Danica Roberts
A recorder’s reflection on reconnecting with heritage, family and community through the African Next Voices project.
-
Training AI on Reality: What African Languages Teach Us About Speech Recognition
An informational look at how African language contexts expose key speech recognition challenges, from code-switching and data scarcity to multilingual model performance in real-world ASR.
-
What ELIS 2026 Reveals About the Real Future of Language Work
By Adam Kossowski
The European Language Industry Survey 2026 shows falling confidence, tighter margins, and rapid AI adoption. The deeper issue may be how language value is being measured through speed and scale instead of trust, context, and meaning.
-
Beyond the Data: Authenticity in Every Word
By Danica Roberts
An isiXhosa recorder reflects on identity, self-expression, and why preserving indigenous languages matters. Busisiwe Madikane shares her experience contributing to African Next Voices.
-
The Hidden Costs of Poor AI Training Data in Machine Learning
Poor training data can break AI systems, increase costs, and introduce bias. Learn the hidden costs of low-quality datasets, real-world AI failures, and how organizations can build better training data pipelines.
-
The Complete Speech Data Collection Checklist
A practical, experience-driven guide to planning speech data properly, from defining the use case to locking down ethics and documentation, without overcomplicating the process.
-
How to Create a Dataset Card (And Why It Matters More Than You Think)
A dataset card is a transparency document, a governance tool, and a signal of maturity. Here's how to create one properly and why it matters for enterprise, research, and trustworthy AI.
-
Beyond the Data: Every Voice Carries More Than Words
By Danica Roberts
A recorder's reflection on contributing to South African language technology. Lenepa Molaoa shares what it meant to preserve seSotho through our African Next Voices project.
-
World Read Aloud Day with Nal'ibali
By Danica Roberts
Way With Words was proud to add our voices to World Read Aloud Day on February 4th, 2026, a global celebration of the joy of stories, the power of reading aloud, and the connections we build through shared words.
-
Honouring the Individuals Who Made Africa Next Voices Possible
How personalised certificates for Africa Next Voices participants became a powerful reminder that recognition matters, and why acknowledging contributors goes beyond payment.
-
AI Expo Africa 2025: Reflections from the Way With Words Team
Our team reflects on attending AI Expo Africa 2025 in Johannesburg and the growing importance of locally sourced African speech data.
-
Building African Next Voices: Our Journey
How we helped deliver the South African component of the African Next Voices initiative alongside DSFSI, building TalkTag and producing large-scale ethical speech datasets.