---
title: "Field Notes – African Speech Data & AI Insights | Way With Words"
description: "Stay updated with our Field Notes blog: stories and insights on African speech data, language technology, and our community-driven AI projects."
image: "https://waywithwords.ai/og-default.png"
---

# Way With Words AI Field Notes

Our **Beyond the Data** interviews with contributors, plus updates, stories, and reflections on language, speech data, and the people behind our work.

## Beyond the Data

Beyond the Data features recorders, proofreaders, language leads, and collaborators in their own words. They write about work that often felt less like a contract than a cultural responsibility: reconnecting with mother tongues and community, growing through patience and rigour, and understanding why real speech (tone, dialect, emotion, and variation) must shape how technology listens. Together they frame dignity and visibility in AI as recognition rather than erasure, and they share the hope of laying groundwork so that future generations meet digital tools in their languages as those languages are truly spoken and lived.

[

![Beyond the Data blog banner featuring Treasure Makhanye, with the tagline Every voice carries more than words and silhouettes beside the series title.](/images/blog/beyond-the-data-Treasure-Makhanye.jpg)

11 May 2026

### Beyond the Data: The Weight of Being Seen

By Danica Roberts

A Xitsonga Language Manager reflects on recognition, representation, and reconnecting with her mother tongue through Africa Next Voices. Treasure Makhanye shares how ANV helped Xitsonga be seen in technology and media.

Read article →

](/blog/beyond-the-data-treasure-makhanye)[

![Beyond the Data: The Quiet Truth About Mastery featured image](/images/blog/beyond-the-data-Sphamandla-Mthimunye.jpg)

7 May 2026

### Beyond the Data: The Quiet Truth About Mastery

By Danica Roberts

A proofreader’s reflection on what mastering a language really looks like.

Read article →

](/blog/beyond-the-data-sphamandla-mthimunye)[

![Beyond the Data: Language as Living Identity in a Digital Age featured image](/images/blog/beyond-the-data-Brenden-Letsatsi.jpg)

10 April 2026

### Beyond the Data: Language as Living Identity in a Digital Age

By Danica Roberts

A seTswana proofreader reflects on preserving language as identity through African Next Voices and why authentic speech data matters.

Read article →

](/blog/beyond-the-data-brenden-letsatsi)[

![Beyond the Data: Rooted in Language featured image](/images/blog/beyond-the-data-Carlfornia-Minyuku.jpg)

1 April 2026

### Beyond the Data: Rooted in Language

By Danica Roberts

A recorder’s reflection on reconnecting with heritage, family and community through the African Next Voices project.

Read article →

](/blog/beyond-the-data-carlfornia-minyuku)[

![Beyond the Data: Authenticity in Every Word featured image](/images/blog/beyond-the-data-Busisiwe-Madikane.jpg)

17 March 2026

### Beyond the Data: Authenticity in Every Word

By Danica Roberts

An isiXhosa recorder reflects on identity, self-expression, and why preserving indigenous languages matters. Busisiwe Madikane shares her experience contributing to Africa Next Voices.

Read article →

](/blog/beyond-the-data-busisiwe-madikane)[

![Beyond the Data: Every Voice Carries More Than Words featured image](/images/blog/beyond-the-data-Lenepa-Molaoa.jpg)

14 February 2026

### Beyond the Data: Every Voice Carries More Than Words

By Danica Roberts

A recorder's reflection on contributing to South African language technology. Lenepa Molaoa shares what it meant to preserve seSotho through our African Next Voices project.

Read article →

](/blog/beyond-the-data-lenepa-molaoa)

## From the Blog

Articles on datasets, projects, events, and language AI — newest first.

[

![Speaker recording session for African speech dataset collection](/images/blog/why-high-quality-speech-data-demands-careful-investment.png)

30 April 2026

### Why High-Quality Speech Data Requires Careful Investment

Building high-quality speech datasets is expensive and time intensive. This article explains why careful prioritisation, sustainable economics, and collaboration across community and commercial models are essential for broader language coverage.

Read article →

](/blog/why-high-quality-speech-data-demands-careful-investment)[

![Visual overview of speech dataset types and use cases](/images/blog/guide-to-speech-dataset-types.jpg)

7 April 2026

### A Guide to Speech Datasets: Types, Uses, and Best Practices

A practical guide to scripted vs unscripted speech, multilingual and parallel data, multimodal and multi-speaker corpora, and how to match dataset types to ASR, conversational AI, and production deployments.

Read article →

](/blog/a-guide-to-speech-datasets-types-uses-and-best-practices)[

![Training AI on Reality: What African Languages Teach Us About Speech Recognition featured image](/images/blog/multilingual-speech-for-asr.jpg)

25 March 2026

### Training AI on Reality: What African Languages Teach Us About Speech Recognition

An informational look at how African language contexts expose key speech recognition challenges, from code-switching and data scarcity to multilingual model performance in real-world ASR.

Read article →

](/blog/training-ai-on-reality-what-african-languages-teach-us-about-speech-recognition)[

![What ELIS 2026 Reveals About the Real Future of Language Work featured image](/images/blog/european-language-industry-survey.jpg)

23 March 2026

### What ELIS 2026 Reveals About the Real Future of Language Work

By Adam Kossowski

The European Language Industry Survey 2026 shows falling confidence, tighter margins, and rapid AI adoption. The deeper issue may be how language value is being measured through speed and scale instead of trust, context, and meaning.

Read article →

](/blog/what-elis-2026-reveals-about-the-real-future-of-language-work)[

![The Hidden Costs of Poor AI Training Data in Machine Learning featured image](/images/blog/garbage-in-garbage-out.jpg)

12 March 2026

### The Hidden Costs of Poor AI Training Data in Machine Learning

Poor training data can break AI systems, increase costs, and introduce bias. Learn the hidden costs of low-quality datasets, real-world AI failures, and how organizations can build better training data pipelines.

Read article →

](/blog/hidden-costs-poor-training-data)[

![The Complete Speech Data Collection Checklist featured image](/images/blog/complete-speech-data-collection-checklist.jpg)

1 March 2026

### The Complete Speech Data Collection Checklist

A practical, experience-driven guide to planning speech data properly, from defining the use case to locking down ethics and documentation, without overcomplicating the process.

Read article →

](/blog/complete-speech-data-collection-checklist)[

![How to Create a Dataset Card (And Why It Matters More Than You Think) featured image](/images/blog/how-to-create-a-dataset-card.jpg)

23 February 2026

### How to Create a Dataset Card (And Why It Matters More Than You Think)

A dataset card is a transparency document, a governance tool, and a signal of maturity. Here's how to create one properly and why it matters for enterprise, research, and trustworthy AI.

Read article →

](/blog/how-to-create-a-dataset-card)[

![World Read Aloud Day with Nal'ibali featured image](/images/blog/world-read-aloud-day.jpg)

4 February 2026

### World Read Aloud Day with Nal'ibali

By Danica Roberts

Way With Words was proud to add our voices to World Read Aloud Day on 4 February 2026, a global celebration of the joy of stories, the power of reading aloud, and the connections we build through shared words.

Read article →

](/blog/world-read-aloud-day-nalibali)[

![Honouring the Individuals Who Made Africa Next Voices Possible featured image](/images/blog/africa-next-voices-certificate-appreciation.jpg)

15 December 2025

### Honouring the Individuals Who Made Africa Next Voices Possible

How personalised certificates for Africa Next Voices participants became a powerful reminder that recognition matters — and why acknowledging contributors goes beyond payment.

Read article →

](/blog/africa-next-voices-certificates)[

![AI Expo Africa 2025: Reflections from the Way With Words Team featured image](/images/blog/ai-expo-africa-2025.jpg)

31 October 2025

### AI Expo Africa 2025: Reflections from the Way With Words Team

Our team reflects on attending AI Expo Africa 2025 in Johannesburg and the growing importance of locally sourced African speech data.

Read article →

](/blog/ai-expo-africa-2025)[

![Building African Next Voices: Our Journey featured image](/images/blog/africa-next-voices-way-with-words.jpg)

28 February 2025

### Building African Next Voices: Our Journey

How we helped deliver the South African component of the Africa Next Voices initiative alongside DSFSI, building TalkTag and producing large-scale ethical speech datasets.

Read article →

](/blog/africa-next-voices-project)

[Chronological archive (17 posts)](/blog/archive)
