Task: ASR
Release Date: 5/12/2026
Format: WAV, TSV
Size: 692.29 MB
3 hours (1717 utterances) of read speech in Türkçe (Turkish), collected via the VoxForge project.
Licensing
GNU General Public License v3.0 or later (GPL-3.0-or-later)
https://spdx.org/licenses/GPL-3.0-or-later.html
Restrictions/Special Constraints
N/A
Forbidden Usage
N/A
Ethical Review
Describe the ethical review process that was followed for this dataset, including any approvals or considerations related to data collection and usage.
Intended Use
ASR training and evaluation
Voice data contributed by volunteers who read prompts out loud. For Türkçe (Turkish), there are 3 hours of recorded speech.
The following is a breakdown of the number of utterances per speaker (at least 61 speakers):
| Name | Count |
|---|---|
| anonymous | 728 |
| BarbarosOM | 80 |
| ya06ren | 60 |
| ysfkck | 60 |
| yasintoga | 50 |
| FurkanAlniak | 40 |
| vipretek | 30 |
| 5Forward | 20 |
| Furkan | 20 |
| Hasan | 20 |
| Sezkajin | 20 |
| VolkanUGUR | 20 |
| aydin | 20 |
| dolphin | 20 |
| iletisimcandasveterinerklinigicom | 20 |
| leguzel | 20 |
| muratduman | 20 |
| onur | 20 |
| recep | 20 |
| yildiray | 20 |
| ixirdox | 14 |
| AhmetBSD | 10 |
| Ali | 10 |
| AzizGm | 10 |
| DeepTheorem | 10 |
| Levent132 | 10 |
| OnderKuscu | 10 |
| OnsPrdctn | 10 |
| TR | 10 |
| abangl | 10 |
| acakir77 | 10 |
| akemerci | 10 |
| alperakbilek | 10 |
| aykutb | 10 |
| bayciddiyetsiz | 10 |
| burhan656 | 10 |
| byrm | 10 |
| cdmn | 10 |
| cihan | 10 |
| evren | 10 |
| fatih | 10 |
| ibrahimzkan | 10 |
| ismailavc | 10 |
| maidis | 10 |
| mert | 10 |
| mtekbicak | 10 |
| mysteron | 10 |
| onurguven | 10 |
| remis | 10 |
| ryu | 10 |
| sebo | 10 |
| selimdeneme | 10 |
| sibel | 10 |
| tolgababa55 | 10 |
| turk | 10 |
| umite | 10 |
| yakup | 10 |
| zgrSarer | 10 |
| byazici | 9 |
| futuk | 9 |
| Erhan | 7 |
The top-level directory contains one subdirectory per recorded speaker session. Each of these subdirectories is structured as follows:
├── wav/
│ ├── file1.wav
│ ├── file2.wav
│ ├── ...
├── etc/
│ ├── GPL_license.txt
│ ├── PROMPTS
│ ├── prompts-original
│ ├── README
where each line of PROMPTS and prompts-original contains an audio ID, followed by a space and the prompt text (the transcript).
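The layout above can be read with a few lines of Python. The following is a minimal sketch, not an official loader: the function names `parse_prompts` and `load_session` are hypothetical, and it assumes that the last path component of each audio ID matches the stem of a file in `wav/` and that the prompt files are UTF-8 encoded. Neither assumption is stated by the dataset, so check both against a real session directory first.

```python
from pathlib import Path


def parse_prompts(text):
    """Parse PROMPTS-style text into a dict mapping audio id -> transcript."""
    prompts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split only on the first run of whitespace: id, then the transcript.
        parts = line.split(maxsplit=1)
        audio_id = parts[0]
        transcript = parts[1] if len(parts) > 1 else ""
        prompts[audio_id] = transcript
    return prompts


def load_session(session_dir, encoding="utf-8"):
    """Return (wav_path, transcript) pairs for one session directory."""
    session_dir = Path(session_dir)
    # Assumption: UTF-8 text; undecodable bytes are replaced, not fatal.
    text = (session_dir / "etc" / "PROMPTS").read_text(
        encoding=encoding, errors="replace"
    )
    pairs = []
    for audio_id, transcript in parse_prompts(text).items():
        # Assumption: the id's last path component is the WAV file stem.
        stem = audio_id.rsplit("/", 1)[-1]
        wav_path = session_dir / "wav" / f"{stem}.wav"
        if wav_path.exists():  # ignore prompts without a recording
            pairs.append((wav_path, transcript))
    return pairs
```

Calling `load_session` on each subdirectory of the top-level directory and concatenating the results would give a flat list of audio/transcript pairs for ASR training or evaluation.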
See https://www.voxforge.org/home/about for more details about the project and dataset.