Naija-TTS-Dataset
License: NOODL-1.0
Steward: Institute of African Digital Humanities
Task: TTS
Release Date: 4/14/2026
Format: MP3, TSV
Size: 324.82 MB
Description
This dataset comprises audio recordings of Nigerian Pidgin English speech aligned with textual transcriptions. It is organized into 16 folders, each containing MP3 audio files and a tab-separated mapping file that pairs every audio file with its transcription. The clips are short, typically 1 to 38 seconds, and are suitable for training and evaluating Text-to-Speech (TTS) systems. The text originates from a variety of written and spoken sources in Nigerian Pidgin English, including narrative texts, conversational exchanges, news-style content, and everyday speech samples. These texts were segmented into short utterances suitable for read speech and TTS modelling.
Specifics
Licensing
Nwulite Obodo Open Data Licence 1.0 (NOODL-1.0)
https://licensingafricandatasets.com/nwulite-obodo-license
Considerations
Restrictions/Special Constraints
- For research and scientific use only
- You agree not to re-host or redistribute this dataset
Forbidden Usage
You agree not to use the data for:
- Generative AI
- Voice cloning or speaker imitation
- Reproduction, duplication, modification, or redistribution
- Commercial use without explicit permission
Processes
Intended Use
This dataset is intended for the training and evaluation of Text-to-Speech (TTS) systems for the Nigerian Pidgin English (Naija) language. It aims to support:
- Language technology development for one of Africa's most widely spoken contact languages
- Development of speech technologies for under-served and creole language communities
- Educational applications in multilingual and multilectal contexts
- Research in low-resource and African language speech synthesis
Metadata
Language
Nigerian Pidgin English (native name: Naija; also known as Nigerian Pidgin, Naija Pidgin, or simply Pidgin) is an English-based creole language spoken primarily in Nigeria. It emerged from centuries of contact between the English language and various indigenous Nigerian languages, particularly in coastal trading communities. Today it has evolved into a widely spoken vernacular language used across the country.
Nigerian Pidgin English is spoken by an estimated 75 to 100 million people in Nigeria, making it one of the most widely spoken languages in Africa. It is used as a first language in many urban households and as a second language or lingua franca by the vast majority of Nigerians, cutting across ethnic, religious, and socio-economic boundaries. Although it is not officially recognized as a national language, it functions as the de facto lingua franca of Nigeria alongside Standard English.
Naija is present across all regions of Nigeria, with particularly strong concentrations in the Niger Delta, Lagos, Port Harcourt, Benin City, Warri, and Calabar. It is also spoken in Nigerian diaspora communities across West Africa and internationally. The BBC operates a dedicated Naija-language news service (BBC Pidgin), reflecting the language's broad reach and cultural significance.
Variants
Nigerian Pidgin English is not monolithic; it exhibits significant regional and sociolectal variation across Nigeria. While all varieties are mutually intelligible, linguists and speakers recognize distinct regional flavors.
Niger Delta varieties include:
Warri Pidgin — spoken in Warri, Delta State; considered one of the most distinctive varieties, known for its speed, intonation, and unique lexical items
Port Harcourt Pidgin — spoken in Rivers State; influenced heavily by Igbo and Ijoid languages
Calabar Pidgin — spoken in Cross River State; shows influence from Efik and Ibibio
Lagos variety:
Lagos Pidgin — urban variety spoken in Nigeria's largest city; cosmopolitan, blending influence from Yoruba, Igbo, Hausa, and global youth culture; increasingly used in music, social media, and entertainment
Northern Nigerian variety:
Northern Pidgin — spoken across northern urban centres such as Kano, Kaduna, and Abuja; more conservative in structure, often showing influence from Hausa
Creolized variety:
In many urban households, particularly in the Niger Delta and Lagos, Naija is spoken as a first language and has begun creolizing, showing greater grammatical regularity and expanded lexical development compared to earlier pidgin forms.
The variety represented in this dataset reflects the urban spoken variety widely used in Nigerian popular culture, media, and everyday urban communication, drawing on the common spoken register intelligible across regions.
Writing System
Nigerian Pidgin English lacks a formally standardized official orthography. It is most commonly written using a semi-phonemic adaptation of the Roman alphabet, drawing on English spelling conventions while reflecting Naija pronunciation patterns. This informal writing practice is widely used in social media, text messages, journalism (especially in online and tabloid media), and creative writing.
Several scholarly and advocacy efforts have proposed more systematic orthographic conventions for Naija, but none has been formally adopted at a national level. The BBC Pidgin service uses a semi-standardized form of Roman script for its publications.
Key orthographic features commonly encountered in written Naija:
Words are often spelled as they are pronounced, not according to Standard English orthography (e.g., "dey" for "there/are", "na" for "it is/that is", "dem" for "them")
Reduplication is common and is typically written by repeating the word (e.g., "small small" for "gradually")
Tense and aspect markers are written as separate words preceding the verb (e.g., "go" for future, "don" for perfective, "dey" for progressive)
Loanwords from Nigerian languages are often spelled phonemically (e.g., "kpai" for "die", from Yoruba influence)
The transcriptions in this dataset are written in the conventional semi-phonemic Roman script typical of written Naija, reflecting natural written usage.
Grammar and Linguistic Features
Nigerian Pidgin English has a predominantly English-derived lexicon, though its grammar reflects structural features of West African languages. Key grammatical features include:
Tense-Aspect-Mood (TAM) system:
"go" — future marker (e.g., "I go do am" = "I will do it")
"don" — perfective aspect (e.g., "She don go" = "She has gone")
"dey" — progressive/continuous marker (e.g., "E dey rain" = "It is raining")
"bin" — past tense marker (e.g., "Im bin come" = "He/she came")
Pronominal system:
Subject and object pronouns are often undifferentiated: "im" (he/she/it, him/her), "dem" (they/them), "wi" (we/us), "una" or "unu" (you, plural)
Copula:
"na" functions as an identificational/equative copula (e.g., "Na him do am" = "It was him who did it")
"be" or zero copula is used predicatively
Negation:
"no" or "nor" precedes the verb for negation (e.g., "I no sabi" = "I don't know")
Serial verb constructions: Multiple verbs are chained without overt connectives, following West African language patterns.
Source
The textual material in this dataset originates from a variety of Nigerian Pidgin English written and transcribed sources, including narrative texts, conversational and dialogue-based content, news and social media material, and everyday speech samples. The texts were segmented into short utterances suitable for read speech and used as prompts for audio recording sessions.
Domain
This dataset is derived from prompted read speech. The speaker read aloud pre-written Nigerian Pidgin English texts drawn from narrative, conversational, and informational sources. The content covers a range of registers and everyday topics typical of spoken Naija, including personal narratives, social commentary, and general discourse.
The dataset has been structured as segmented, read-style speech suitable for speech synthesis tasks.
Size
The dataset is composed of 16 folders containing audio clips and corresponding mapping files.
Each folder contains between 4 and 200 audio files. Individual audio clips typically range from 1 to 38 seconds in duration.
Folder-level durations range from approximately 22 seconds to over 42 minutes of audio.
The dataset represents a total of 1,987 audio files with a combined duration of approximately 5 hours 54 minutes and 58 seconds of segmented Nigerian Pidgin English speech.
A detailed breakdown of durations and file counts per folder is provided below.
| Folder | Files | Duration |
|---|---|---|
| tts__pcm_dataset_07_204clips_2095s_20260410-1251_part1of2 | 200 | 29m 38s |
| tts__pcm_dataset_07_204clips_2095s_20260410-1251_part2of2 | 4 | 22s |
| tts_pcm_dataset_01_153clips_1812s_20260405-2138 | 153 | 24m 51s |
| tts_pcm_dataset_02_109clips_1463s_20260406-1740 | 109 | 20m 15s |
| tts_pcm_dataset_03_139clips_1488s_20260407-0346 | 139 | 21m 03s |
| tts_pcm_dataset_04_163clips_1896s_20260407-2118 | 163 | 28m 22s |
| tts_pcm_dataset_05_156clips_2361s_20260409-0208 | 156 | 28m 24s |
| tts_pcm_dataset_06_198clips_2922s_20260409-1250 | 198 | 36m 05s |
| tts_pcm_dataset_08_218clips_2617s_20260410-1512_part1of2 | 200 | 32m 47s |
| tts_pcm_dataset_08_218clips_2617s_20260410-1512_part2of2 | 18 | 2m 12s |
| tts_pcm_dataset_08_218clips_2617s_20260410-1512_part2of2 2 | 18 | 2m 12s |
| tts_pcm_dataset_09_175clips_1913s_20260410-0348 | 175 | 27m 35s |
| tts_pcm_dataset_10_175clips_1961s_20260410-0725 | 175 | 29m 21s |
| tts_pcm_dataset_11_16clips_170s_20260410-0739 | 16 | 2m 27s |
| tts_pcm_dataset_12_153clips_3023s_20260413-1722 | 153 | 42m 50s |
| tts_pcm_dataset_13_110clips_1973s_20260413-1852 | 110 | 26m 29s |
| GRAND TOTAL | 1,987 | 5h 54m 58s |
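The totals in the table above can be cross-checked with a short script. This is a minimal sketch: the `ROWS` values are copied from the table in row order, and the helper names `to_seconds` and `format_hms` are illustrative, not part of the dataset. Summing the durations as printed gives 5h 54m 53s, five seconds below the stated grand total, which is consistent with each row being rounded to whole seconds. The sum includes the row whose folder name ends in `part2of2 2`.

```python
import re

# (files, duration) per folder, copied from the table in row order.
ROWS = [
    (200, "29m 38s"), (4, "22s"), (153, "24m 51s"), (109, "20m 15s"),
    (139, "21m 03s"), (163, "28m 22s"), (156, "28m 24s"), (198, "36m 05s"),
    (200, "32m 47s"), (18, "2m 12s"), (18, "2m 12s"), (175, "27m 35s"),
    (175, "29m 21s"), (16, "2m 27s"), (153, "42m 50s"), (110, "26m 29s"),
]


def to_seconds(duration: str) -> int:
    """Convert strings such as '2m 12s' or '22s' to whole seconds."""
    match = re.fullmatch(r"(?:(\d+)m\s*)?(\d+)s", duration.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {duration!r}")
    minutes, seconds = match.groups()
    return int(minutes or 0) * 60 + int(seconds)


def format_hms(total: int) -> str:
    """Format a number of seconds as 'Hh MMm SSs'."""
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours}h {minutes:02d}m {seconds:02d}s"


total_files = sum(files for files, _ in ROWS)
total_seconds = sum(to_seconds(duration) for _, duration in ROWS)
print(total_files, format_hms(total_seconds))  # prints: 1987 5h 54m 53s
```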
Structure
Each folder in the dataset contains:
A collection of audio files in MP3 format
A tab-separated mapping file linking each audio file to its transcription
Each line in the mapping file follows the format:
audio_filename.mp3 key sentence attempts
The dataset is designed for TTS pipelines requiring paired audio-text data.
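The mapping format described above could be loaded along the following lines. This is a sketch under stated assumptions: it treats each line as four tab-separated fields in the order filename, key, sentence, attempts, and it assumes the audio files sit in the same folder as the mapping file. The mapping file's name and the meaning of the `key` and `attempts` columns are not documented here, so the class `Utterance` and the functions `read_mapping` and `missing_audio` are hypothetical helpers.

```python
import csv
from pathlib import Path
from typing import NamedTuple


class Utterance(NamedTuple):
    audio_path: Path  # resolved relative to the mapping file's folder (assumption)
    key: str          # meaning not documented; kept verbatim
    sentence: str     # the transcription text
    attempts: str     # kept as text because the column's type is not documented


def read_mapping(mapping_file: Path) -> list[Utterance]:
    """Read one folder's tab-separated mapping file into Utterance records."""
    utterances = []
    with mapping_file.open(encoding="utf-8", newline="") as handle:
        # QUOTE_NONE keeps quotation marks in the transcriptions as literal text.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for fields in reader:
            # Skip blank lines, a possible header row, and malformed rows.
            if len(fields) != 4 or not fields[0].endswith(".mp3"):
                continue
            filename, key, sentence, attempts = fields
            utterances.append(
                Utterance(mapping_file.parent / filename, key, sentence.strip(), attempts)
            )
    return utterances


def missing_audio(utterances: list[Utterance]) -> list[Utterance]:
    """Return the records whose audio file is not present on disk."""
    return [u for u in utterances if not u.audio_path.is_file()]
```

Calling `read_mapping(Path("some_folder") / "mapping.tsv")` with the actual mapping file name would return the paired records for one folder, and `missing_audio` gives a quick integrity check before the pairs are fed to a TTS training pipeline.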
Sample
5425628fc939498ba16e2d600931bd1a.mp3 | Pidgin English for Naija today, we no fit dey write about am make we no call di name of di papa dem all wey make di language watin e bi today.
8ac90d713cd5f0f6d243f651c7c98662.mp3 | She even dey wave hand
f2e35fa0f49e64703f426210f73cac8b.mp3 | like say she wan slap the person through phone, while she continue dey warn her caller.
cc421e3184b8fd6e60f8af6db957fddc.mp3 | Na two times agbero don drag me come down bus beat me like say tomorrow nor dey.
e99f8f9e4af97a8bde187260c959d0b3.mp3 | Hare come ask am again, so tortis come say na di tin wey im talk n aim e dey hear from am.