Task: ASR
Release Date: 4/1/2026
Format: OGG, SRT
Size: 355.77 MB
The Kannada Time-Aligned Speech Corpus is a 5-hour speech dataset containing Kannada audio with corresponding time-aligned transcriptions. It is designed to support speech technology and research tasks such as automatic speech recognition, forced alignment, speech segmentation, pronunciation modeling, and spoken language analysis. The dataset provides a useful resource for developing and evaluating Kannada language technologies.
Licensing
Creative Commons Attribution Non Commercial Share Alike 4.0 International (CC-BY-NC-SA-4.0)
https://spdx.org/licenses/CC-BY-NC-SA-4.0.html
Restrictions/Special Constraints
Use is permitted with attribution for non-commercial purposes only, and any shared adaptations must be distributed under the same license terms.
Forbidden Usage
Forbidden uses include commercial use, redistribution without proper attribution, and sharing modified versions under a different license.
Intended Use
This dataset is intended for use in speech technology and language research, including automatic speech recognition, forced alignment, speech-text matching, and spoken Kannada language processing.
Kannada is a major Dravidian language primarily spoken in the Indian state of Karnataka and by Kannada-speaking communities in other parts of India and abroad. It has a long literary history, a rich written tradition, and its own script. Kannada is widely used in education, media, administration, literature, and everyday communication, making it one of the most important languages of South India.
The dataset is organized into two main folders:
Audio/ — contains the Kannada speech recordings
Transcription/ — contains the corresponding text transcriptions for each audio file
Each transcription file corresponds to an audio file, making the dataset easy to use for speech processing, alignment, and transcription-based tasks.
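The one-to-one correspondence between the two folders can be checked with a short script. The following is a minimal sketch, assuming that each audio file and its transcription share the same filename stem and use the .ogg and .srt extensions. The listing states the formats but not the naming scheme, so the helper name pair_files and the stem-matching rule are illustrative assumptions.

```python
from pathlib import Path


def pair_files(root, audio_ext=".ogg", text_ext=".srt"):
    """Pair audio and transcription files that share a filename stem.

    Assumption: files in Audio/ and Transcription/ use matching stems,
    e.g. Audio/x.ogg and Transcription/x.srt. Adjust if the corpus
    uses a different naming scheme.
    """
    root = Path(root)
    audio = {p.stem: p for p in (root / "Audio").glob(f"*{audio_ext}")}
    text = {p.stem: p for p in (root / "Transcription").glob(f"*{text_ext}")}

    # Stems present in both folders form usable (audio, transcript) pairs.
    pairs = [(audio[s], text[s]) for s in sorted(audio.keys() & text.keys())]
    # Stems present in only one folder have no counterpart.
    unmatched = sorted(audio.keys() ^ text.keys())
    return pairs, unmatched
```

Calling pair_files on the dataset root returns the matched pairs together with a list of stems that appear in only one folder, which should be empty if every recording has a transcription.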
The dataset includes recordings from two native Kannada speakers:
Speaker 1: Male, 32 years old
Speaker 2: Female, 39 years old
This provides basic speaker diversity in terms of gender and age within the corpus.
Verify audio quality
Normalize transcription text
Match audio and transcription filenames
Check alignment consistency
Remove noisy or corrupted files
Standardize formats and metadata
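Two of the steps listed above, text normalization and alignment consistency, can be sketched in a few lines. This is an illustrative example only, not the procedure used to build the corpus. The function names, the choice of Unicode NFC normalization, and the representation of cues as (start, end) pairs in milliseconds are all assumptions.

```python
import unicodedata


def normalize_text(text):
    """Apply Unicode NFC normalization and collapse runs of whitespace.

    Assumption: NFC is a reasonable canonical form for Kannada text;
    the listing does not say which normalization was used.
    """
    return " ".join(unicodedata.normalize("NFC", text).split())


def check_alignment(cues, audio_duration_ms=None):
    """Return a list of problems found in (start_ms, end_ms) cues.

    Checks that each cue ends after it starts, that cues do not
    overlap earlier cues, and (optionally) that no cue runs past
    the end of the audio.
    """
    problems = []
    prev_end = 0
    for n, (start, end) in enumerate(cues, 1):
        if end <= start:
            problems.append(f"cue {n}: end is not after start")
        if start < prev_end:
            problems.append(f"cue {n}: overlaps previous cue")
        if audio_duration_ms is not None and end > audio_duration_ms:
            problems.append(f"cue {n}: extends past end of audio")
        prev_end = max(prev_end, end)
    return problems
```

An empty list from check_alignment means the cues are ordered, non-overlapping, and within the optional duration bound.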
1
00:00:00,001 --> 00:00:02,956
ನಾನು ಇಂದು ಶಿಕ್ಷಣದ ಬಗ್ಗೆ ಮಾತನಾಡಲ್ಲ ಶಿಕ್ಷಣದ
2
00:00:02,980 --> 00:00:04,783
ಮಹತ್ವದ ಬಗ್ಗೆ
3
00:00:04,807 --> 00:00:06,031
ಮಾತನಾಡಲು ಹೊರಟಿದ್ದೇನೆ.
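The sample above follows the SRT cue layout: an index line, a timing line of the form HH:MM:SS,mmm --> HH:MM:SS,mmm, and one or more text lines. The following is a minimal sketch of a parser that converts such cues to millisecond offsets. The names parse_srt and to_ms are illustrative, and the parser assumes two-digit hour fields. It accepts cues separated by blank lines, as in standard SRT files, as well as cues that follow one another directly, as in the sample shown here.

```python
import re

# Matches an SRT timing line such as "00:00:02,980 --> 00:00:04,783".
TIME_RE = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def to_ms(h, m, s, ms):
    """Convert hour, minute, second, millisecond strings to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_srt(text):
    """Return a list of (start_ms, end_ms, text) tuples, one per cue."""
    lines = [ln.strip() for ln in text.splitlines()]
    cues = []
    i = 0
    while i < len(lines):
        m = TIME_RE.fullmatch(lines[i])
        if m is None:
            i += 1  # skip index lines and blank lines
            continue
        g = m.groups()
        i += 1
        body = []
        # Collect text lines until a blank line or the next cue's index.
        while i < len(lines) and lines[i]:
            next_is_timing = (
                i + 1 < len(lines) and TIME_RE.fullmatch(lines[i + 1]) is not None
            )
            if lines[i].isdigit() and next_is_timing:
                break
            body.append(lines[i])
            i += 1
        cues.append((to_ms(*g[:4]), to_ms(*g[4:]), " ".join(body)))
    return cues
```

Applied to the three cues above, the parser yields start and end offsets of 1 to 2956 ms, 2980 to 4783 ms, and 4807 to 6031 ms, each paired with its Kannada text.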