License:
Apache-2.0
Steward:
Community
Task: TTS
Release Date: 4/28/2026
Format: WEBM, TSV
Size: 680.16 MB
A single-speaker read-speech dataset in Italian. The dataset contains ~10 hours of pre-segmented utterances, recorded by an anonymous 66-year-old female Italian speaker. Sentences were prompted from a script. The speaker is a native of the Abruzzo region and speaks with a slight accent. The archive includes .webm audio files together with a metadata TSV containing the transcription, file path, and duration of each recording.
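As a quick sanity check after download, the metadata TSV can be read with Python's standard library to count clips and total the recorded time. This is a minimal sketch, not part of the dataset tooling: the column names (`path`, `transcription`, `duration`) and the assumption that durations are in seconds are hypothetical, so check them against the actual TSV header.

```python
import csv
import io


def summarize_metadata(tsv_file, duration_column="duration"):
    """Return (clip_count, total_hours) for an open metadata TSV.

    `duration_column` is a hypothetical column name, and durations are
    assumed to be in seconds. Adjust both to match the real header.
    """
    # QUOTE_NONE keeps quotation marks inside transcriptions as literal text.
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    clip_count = 0
    total_seconds = 0.0
    for row in reader:
        clip_count += 1
        total_seconds += float(row[duration_column])
    return clip_count, total_seconds / 3600.0


# Tiny in-memory stand-in for the real TSV (made-up rows, hypothetical columns).
sample = io.StringIO(
    "path\ttranscription\tduration\n"
    "clips/0001.webm\tPrima frase.\t3.5\n"
    "clips/0002.webm\tSeconda frase.\t4.5\n"
)
count, hours = summarize_metadata(sample)
print(count, round(hours * 3600, 1))
```

For the real file, pass a handle opened with `open(tsv_path, newline="", encoding="utf-8")` in place of the in-memory sample.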
Restrictions/Special Constraints
No restrictions
Forbidden Usage
No forbidden usage
Ethical Review
Participants are fully aware of the study's purpose. They have been informed about the Mozilla Data Collective initiative.
Intended Use
This dataset is intended for use in creating automatic speech generation and recognition systems.
The speaker is a 66-year-old anonymous woman from the Abruzzo region, in the province of Teramo. She speaks with a slight accent, which may be noticeable in some fragments.
The fragments are excerpts from Il fu Mattia Pascal by Luigi Pirandello and L’Argentina vista com’è by Luigi Barzini, published in Corriere della Sera in 1901–1902.
The Italian in the excerpts may contain some outdated words that still belong to standard Italian.
Failed attempts are collected in a separate folder. It is up to the speaker to decide whether an attempt was a failure.
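Because failed attempts live in their own folder, they can be excluded by path before training. The sketch below assumes relative, forward-slash paths and a folder named `failed`; both are hypothetical, since the archive layout is not described here.

```python
from pathlib import PurePosixPath


def split_failed(paths, failed_dir="failed"):
    """Split relative clip paths into (kept, failed) lists.

    A path counts as failed if any of its parent folders is named
    `failed_dir` (a hypothetical name; check the archive layout).
    """
    kept, failed = [], []
    for p in paths:
        # Look only at folder components, not at the file name itself.
        parents = PurePosixPath(p).parts[:-1]
        (failed if failed_dir in parents else kept).append(p)
    return kept, failed


# Made-up paths for illustration only.
example_paths = ["clips/0001.webm", "failed/0002.webm", "clips/0003.webm"]
kept, failed = split_failed(example_paths)
print(len(kept), len(failed))
```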