License:
CC-BY-SA-4.0
Steward:
Community
Dataset ID:
cmr6i7wxv034imm07yf2up2hd
Task: TTS
Release Date: 7/4/2026
Format: WEBM, TSV
Size: 297.19 MB
TTS Sundanese - Sunda Priangan Timur (SPRINT) consists of short texts covering a wide range of topics, including culture, family, education, social interaction, and everyday activities. The dataset is based on the East Priangan dialect of the Sundanese language, specifically the variety spoken in Garut Regency. It includes three speech levels: informal (kasar), neutral (pertengahan/loma), and polite (lemes). Additionally, the dataset employs an apostrophe/curek (') to mark one of the /e/ vowels, distinguishing it from the other /e/ vowel and the /eu/ vowel, thereby preserving the phonological distinctions of Sundanese as spoken in West Java Province, Indonesia.
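The three-way vowel distinction described above can be checked mechanically. The following is a minimal Python sketch, not part of the dataset tooling. It assumes the marked vowel appears as the letter é, as in the sample sentences on this page; if the transcripts use a literal apostrophe instead, the pattern would need adjusting. The function name count_e_vowels is hypothetical.

```python
import re
import unicodedata

# Longest alternative first, so the digraph "eu" is matched before a bare "e".
_E_VOWELS = re.compile(r"eu|é|e")


def count_e_vowels(text: str) -> dict[str, int]:
    """Count the spellings "eu", "é", and plain "e" in a Sundanese string.

    Assumption: the curek-marked vowel is written as "é". This is a purely
    orthographic count and does not check syllable boundaries.
    """
    # NFC folds "e" + combining acute accent (U+0301) into the single letter "é".
    normalized = unicodedata.normalize("NFC", text).lower()
    counts = {"eu": 0, "é": 0, "e": 0}
    for match in _E_VOWELS.finditer(normalized):
        counts[match.group()] += 1
    return counts


counts = count_e_vowels("Sia mah nanaon téh sok kudu aya ayeuna kénéh")
```

Normalizing to NFC first matters because the same visible letter can be stored either as one code point or as a base letter plus a combining accent, and only the former would match the pattern.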
Licensing
Creative Commons Attribution-ShareAlike 4.0 International (CC-BY-SA-4.0)
Restrictions/Special Constraints
This dataset may be used for public benefit, including commercial applications. Any use of the dataset requires prior notification and permission from the dataset owner.
Forbidden Usage
Modifying this dataset without the owner’s permission is prohibited.
Ethical Review
This dataset was created by writing texts in the East Priangan dialect of Sundanese with code-mixing of Indonesian and English. The texts were read aloud and recorded by native speakers through the hosting platform https://sabre-2.onrender.com/. The audio recordings were then compiled into a single dataset.
Intended Use
This dataset is designed to document Sundanese language speech across various domains of community life.
This dataset uses the East Priangan dialect of the Sundanese language from West Java, with Indonesian and English code-mixing.
Created by the dataset owner, who is a linguist and a native speaker.
Approximately 5.5 hours
Maulana, Taofik. (2026). TTS Sundanese - Sunda Priangan Timur (SPRINT) [Data set]. Mozilla Data Collective. URL [link dataset].
Audio file name, text
“Kuring kamari ngiluan méngbal ka Bogor jeung babaturan sakola”
“Sia mah nanaon téh sok kudu aya ayeuna kénéh”
“Kapungkur mah abdi sok ilubiung sareng warga dina kagiatan sosial”
“Amanda téh siswa anu geulis tur bageur, matak loba nu resep”
“Pun Biang kakaraeun masihan kuéh nastar nalika nuju boboran”
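Since the transcripts pair an audio file name with its text in a TSV file, they can be read with Python's standard csv module. The sketch below is illustrative only: the column order, the absence of a header row, and the file names clip_0001.webm and clip_0002.webm are assumptions, and the function name read_transcripts is hypothetical. Check the actual TSV layout before relying on it.

```python
import csv
import io


def read_transcripts(tsv_file, audio_col=0, text_col=1):
    """Return (audio_file_name, text) pairs from a tab-separated file object.

    Assumption: one clip per row, audio file name and text in the given columns.
    """
    # QUOTE_NONE keeps any quotation marks inside the sentences as literal text.
    reader = csv.reader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    pairs = []
    for row in reader:
        if len(row) <= max(audio_col, text_col):
            continue  # skip blank or malformed lines
        pairs.append((row[audio_col].strip(), row[text_col].strip()))
    return pairs


# Hypothetical two-row sample standing in for the real TSV file.
sample = io.StringIO(
    "clip_0001.webm\tSia mah nanaon téh sok kudu aya ayeuna kénéh\n"
    "\n"
    "clip_0002.webm\tAmanda téh siswa anu geulis tur bageur, matak loba nu resep\n"
)
pairs = read_transcripts(sample)
```

For a file on disk, the same function can be called on a handle opened with encoding="utf-8" and newline="" so that accented letters are decoded correctly.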
Latin alphabet (A–Z), Arabic numerals (0–9)
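A simple way to check transcripts against this character inventory is to list every character that falls outside it. The sketch below is an illustration under stated assumptions: in addition to A-Z and 0-9, it allows é (seen in the sample sentences), spaces, and a small set of punctuation marks. That extra set is a guess and should be adjusted to the real data. The names ALLOWED and unexpected_characters are hypothetical.

```python
import string
import unicodedata

# Assumed inventory: Latin letters, digits, the marked vowel, and basic punctuation.
ALLOWED = set(string.ascii_letters + string.digits + "éÉ" + " ,.?!'-")


def unexpected_characters(text: str) -> set[str]:
    """Return the characters in text that are not in the assumed inventory."""
    # Normalize first so a decomposed "e" + accent is compared as "é".
    normalized = unicodedata.normalize("NFC", text)
    return {ch for ch in normalized if ch not in ALLOWED}


clean = unexpected_characters("Kuring kamari ngiluan méngbal ka Bogor jeung babaturan sakola")
flagged = unexpected_characters("kuéh #1")
```

An empty result means the line uses only the assumed inventory; anything returned is worth inspecting by hand before deciding whether it is an error or a legitimate character.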