Datasets
English Hausa Parallel Corpus
License: CC-BY-NC-4.0
Locale: eng, hau
Task: MT
Format: CSV
Size: 164.32 KB
Persian Literature Corpus by Najwai Sukhan
License: CC-BY-NC-4.0
Locale: fas
Task: NLP
Format: TXT
Size: 38.62 MB
Heroes English-Spanish Dubbed Movie Speech Corpus
License: CC-BY-SA-4.0
Locale: eng, spa
Task: NLP
Format: WAV, CSV, TXT
Size: 1.68 GB
Common Voice Scripted Speech 25.0 - Swahili
License: CC0-1.0
Locale: sw
Task: ASR
Format: MP3
Size: 20.87 GB
Common Voice Scripted Speech 25.0 - Kabyle
License: CC0-1.0
Locale: kab
Task: ASR
Format: MP3
Size: 17.43 GB
Common Voice Scripted Speech 25.0 - Basque
License: CC0-1.0
Locale: eu
Task: ASR
Format: MP3
Size: 14.48 GB
Common Voice Scripted Speech 25.0 - Japanese
License: CC0-1.0
Locale: ja
Task: ASR
Format: MP3
Size: 14.34 GB
Common Voice Scripted Speech 25.0 - Luganda
License: CC0-1.0
Locale: lg
Task: ASR
Format: MP3
Size: 11.06 GB
Common Voice Scripted Speech 25.0 - Czech
License: CC0-1.0
Locale: cs
Task: ASR
Format: MP3
Size: 5.56 GB
Common Voice Scripted Speech 25.0 - Urdu
License: CC0-1.0
Locale: ur
Task: ASR
Format: MP3
Size: 5.78 GB
Common Voice Scripted Speech 25.0 - Georgian
License: CC0-1.0
Locale: ka
Task: ASR
Format: MP3
Size: 6.37 GB
Common Voice Scripted Speech 25.0 - Thai
License: CC0-1.0
Locale: th
Task: ASR
Format: MP3
Size: 8.38 GB
Common Voice Scripted Speech 25.0 - Russian
License: CC0-1.0
Locale: ru
Task: ASR
Format: MP3
Size: 6.55 GB
Common Voice Scripted Speech 25.0 - Italian
License: CC0-1.0
Locale: it
Task: ASR
Format: MP3
Size: 9.71 GB
Common Voice Scripted Speech 25.0 - Galician
License: CC0-1.0
Locale: gl
Task: ASR
Format: MP3
Size: 7.81 GB
Common Voice Scripted Speech 25.0 - Latvian
License: CC0-1.0
Locale: lv
Task: ASR
Format: MP3
Size: 5.84 GB
Common Voice Scripted Speech 25.0 - Persian
License: CC0-1.0
Locale: fa
Task: ASR
Format: MP3
Size: 10.40 GB
Common Voice Scripted Speech 25.0 - Tamil
License: CC0-1.0
Locale: ta
Task: ASR
Format: MP3
Size: 8.57 GB
Common Voice Scripted Speech 25.0 - Uyghur
License: CC0-1.0
Locale: ug
Task: ASR
Format: MP3
Size: 9.69 GB
Common Voice Scripted Speech 25.0 - Kabardian
License: CC0-1.0
Locale: kbd
Task: ASR
Format: MP3
Size: 5.52 GB
Common Voice Scripted Speech 25.0 - Frisian
License: CC0-1.0
Locale: fy-NL
Task: ASR
Format: MP3
Size: 4.34 GB
Common Voice Scripted Speech 25.0 - Welsh
License: CC0-1.0
Locale: cy
Task: ASR
Format: MP3
Size: 3.89 GB
Common Voice Scripted Speech 25.0 - Central Kurdish
License: CC0-1.0
Locale: ckb
Task: ASR
Format: MP3
Size: 3.59 GB
Common Voice Scripted Speech 25.0 - Hungarian
License: CC0-1.0
Locale: hu
Task: ASR
Format: MP3
Size: 3.58 GB