Datasets
LibriVox Czech TTS Female Voice
License: CC0-1.0
Locale: cs
Task: TTS
Format: MP3, TXT, TSV
Size: 178.58 MB
UK Sort Codes - ASR Evaluation
License: CC-BY-4.0
Locale: en-GB
Task: ASR
Format: WEBM, TSV
Size: 23.76 MB
otomí-hñähñu TTS Voz Masculina
License: CC-BY-SA-4.0
Locale: ote
Task: TTS
Format: MP3, TXT, TSV
Size: 119.54 MB
Yoruba-English Code-Switching (YECS) Corpus
License: NOODL-1.0
Locale: yo, en
Task: ASR
Format: WAV, CSV
Size: 9.71 GB
Awal Tamazight Dataset
License: CC-BY-4.0
Locale: zgh
Task: LM
Format: TSV, JSON, TXT
Size: 11.57 MB
RFE/RL Serbian, Bosnian, and Montenegrin (Balkan) News Text Corpus
License: CC-BY-NC-SA-4.0
Locale: hbs
Task: NLP
Format: TXT
Size: 310.39 MB
RFE/RL Bulgarian News Text Corpus
License: CC-BY-NC-SA-4.0
Locale: bg
Task: NLP
Format: TXT
Size: 49.82 MB
RFE/RL Azerbaijani News Text Corpus
License: CC-BY-NC-SA-4.0
Locale: az, ru
Task: NLP
Format: TXT
Size: 211.65 MB
RFE/RL Belarusian News Text Corpus
License: CC-BY-NC-SA-4.0
Locale: be
Task: NLP
Format: TXT
Size: 486.55 MB
RFE/RL Macedonian News Text Corpus
License: CC-BY-NC-SA-4.0
Locale: mk
Task: NLP
Format: TXT
Size: 133.95 MB
LibriVox Croatian TTS Male Voice
License: CC0-1.0
Locale: hr
Task: TTS
Format: MP3, TXT, TSV
Size: 377.60 MB
RFE/RL Romanian (Moldova) News Text Corpus
License: CC-BY-NC-SA-4.0
Locale: ro, ru, en
Task: NLP
Format: TXT
Size: 311.87 MB
RFE/RL Tajik News Text Corpus
License: CC-BY-NC-SA-4.0
Locale: tg, ru
Task: NLP
Format: TXT
Size: 145.27 MB
Punjabi 10 Hours TTS
License: CC-BY-NC-SA-4.0
Locale: pnb
Task: TTS
Format: WEBM, TSV
Size: 481.96 MB
RFE/RL Turkmen News Text Corpus
License: CC-BY-NC-SA-4.0
Locale: tk, ru
Task: NLP
Format: TXT
Size: 48.28 MB
RFE/RL Kyrgyz News Text Corpus
License: CC-BY-NC-SA-4.0
Locale: ky, ru, en
Task: NLP
Format: TXT
Size: 282.41 MB
RFE/RL Georgian News Text Corpus
License: CC-BY-NC-SA-4.0
Locale: ka
Task: NLP
Format: TXT
Size: 257.53 MB
RFE/RL Kazakh News Text Corpus
License: CC-BY-NC-SA-4.0
Locale: kk
Task: NLP
Format: TXT
Size: 126.81 MB
RFE/RL Crimean Tatar News Text Corpus
License: CC-BY-NC-SA-4.0
Locale: crh
Task: NLP
Format: TXT
Size: 18.35 MB
RFE/RL Chechen News Text Corpus
License: CC-BY-NC-SA-4.0
Locale: ce
Task: NLP
Format: TXT
Size: 28.29 MB
Naija-TTS-Dataset
License: NOODL-1.0
Locale: pcm
Task: TTS
Format: MP3, TSV
Size: 324.82 MB
RFE/RL Hungarian News Text Corpus
License: CC-BY-NC-SA-4.0
Locale: hu
Task: NLP
Format: TXT
Size: 36.64 MB
RFE/RL Ukrainian (Crimea) News Text Corpus
License: CC-BY-NC-SA-4.0
Locale: uk
Task: NLP
Format: TXT
Size: 180.13 MB
RFE/RL Pashto (Pakistani) News Text Corpus
License: CC-BY-NC-SA-4.0
Locale: ps
Task: NLP
Format: TXT
Size: 39.26 MB