License: CC-BY-SA-4.0
Steward: Collaborative Action For Research & Development (CARD)
Task: NLP
Release Date: 3/5/2026
Format: CSV
Size: 312.87 KB
The IBT Torwali Wordlist contains approximately 20,000 unique entries in Torwali (ISO 639-3: trw), an under-documented Indo-Aryan language spoken in northern Pakistan. The dataset comprises standardized lexical entries covering core vocabulary, function words, and culturally salient terms, with consistent orthography and normalization suitable for linguistic and computational use. Each entry is aligned with English and Urdu glosses and includes a part-of-speech tag.
Licensing
Creative Commons Attribution Share Alike 4.0 International (CC-BY-SA-4.0)
https://spdx.org/licenses/CC-BY-SA-4.0.html
Restrictions/Special Constraints
Use is restricted for commercial entities with annual revenue above 1 million USD
Forbidden Usage
The dataset cannot be employed in systems that fabricate language output or in projects that could amplify negative, biased, or abusive content.
Intended Use
The IBT Torwali wordlist is intended for linguistic research, language documentation, and the development of educational and computational resources for the Torwali language.
Torwali (trw) is an Indo-Aryan language spoken in the Swat Kohistan region of northern Pakistan, mainly in the Bahrain and Chail areas. It is used across several valleys and exhibits variation in pronunciation and vocabulary among local speech communities. Torwali has a strong oral tradition, with folktales, poetry, songs, and storytelling forming a core part of its cultural identity. Although actively spoken, Torwali remains under-documented and has limited standardized writing, making it an essential language for linguistic research, preservation efforts, and resource development.
Person and Society: Kinship, social roles, customs, relations
Body and Health: Body parts, illness, physical states
Emotion and Cognition: Feelings, perception, thinking
Language and Communication: Speech, writing, interaction
Nature and Environment: Weather, landforms, plants, animals
Food and Livelihood: Agriculture, cooking, work, economy
Objects and Material Culture: Tools, clothing, household items
Action and Movement: Activities, motion, physical actions
Time, Space, and Quantity: Temporal, spatial, numerical concepts
Culture, Belief, and Knowledge: Tradition, religion, education
آ اَ ٲ ب پ ت ٹ ث ج چ ڇ خ د ذ ڑ ر ز ڙ ژ ط ض ص ش ݜ س ظ غ ف ق ک گ ل م ن و ہ ی ء او
Torwali: اتفاق تے
Part of Speech: Adverb
English Gloss: Unitedly; with unity
Urdu Gloss: اتفاق سے، مل کر
Semantic Domain (EN): States
Semantic Domain (UR): حالت
Date: 23 Nov 2015
Torwali: اَٹکے
Part of Speech: Noun
English Gloss: Severe cold days
Urdu Gloss: سخت سردی کے دن
Semantic Domain (EN): States
Semantic Domain (UR): حالت
Date: 23 Nov 2015
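The sample entries above suggest one row per lexical entry with seven fields. The following is a minimal sketch of reading such rows and filtering by part of speech, using only Python's standard library. The column names are assumptions modeled on the field labels in the sample entries; the actual CSV header may differ, so check it before relying on these keys. The two rows are the sample entries shown above, embedded inline so the sketch runs without the file.

```python
import csv
import io
from collections import Counter

# ASSUMPTION: the header names mirror the field labels of the sample entries.
# The real CSV header is not documented here and may differ.
SAMPLE_CSV = """Torwali,Part of Speech,English Gloss,Urdu Gloss,Semantic Domain (EN),Semantic Domain (UR),Date
اتفاق تے,Adverb,Unitedly; with unity,"اتفاق سے، مل کر",States,حالت,23 Nov 2015
اَٹکے,Noun,Severe cold days,سخت سردی کے دن,States,حالت,23 Nov 2015
"""


def load_entries(handle):
    """Read every row from an open text handle into a list of dicts."""
    return list(csv.DictReader(handle))


def filter_by_pos(entries, pos):
    """Return entries whose part-of-speech tag matches `pos`, ignoring case."""
    wanted = pos.strip().lower()
    return [e for e in entries if e["Part of Speech"].strip().lower() == wanted]


entries = load_entries(io.StringIO(SAMPLE_CSV))
nouns = filter_by_pos(entries, "noun")

# Tally how many entries carry each part-of-speech tag.
pos_counts = Counter(e["Part of Speech"] for e in entries)

for entry in nouns:
    print(entry["Torwali"], "-", entry["English Gloss"])
```

To run the same functions on the downloaded file, pass a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved locally. Opening the file explicitly as UTF-8 matters because the Torwali and Urdu fields are written in Perso-Arabic script.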
Proper citation is required when using this dataset.
@misc{IBT_North_Pakistan,
title = {IBT North Pakistan},
howpublished = {\url{https://ibtnorthpakistan.org/}},
note = {Accessed: 2025-01-06}
}