Task: MT
Release Date: 5/20/2026
Format: TSV
Size: 100.28 KB
Bitext from the online AmaWar dictionary of the Tamazight dialect of Ait Warain spoken in northeastern Morocco. Contains sentences, stories, and poems in Tamazight written in the Neo-Tifinagh script along with their translations into Modern Standard Arabic. The dataset is based on Dr. Noureddine Amhaoui's work. Please cite him if you use this dataset in your work. Citation: Amhaoui, N. (2023). Computerized Dictionary of the Meanings of Amazigh Nouns and Verbs (Dialect of Ait Ourain as a pattern). Afro-Asian Journal of Scientific Research, 1(2), 321–329. https://www.aajsr.com
Restrictions/Special Constraints
No restrictions apply.
Forbidden Usage
Ethical Review
Data sourced from a publicly available online dictionary with permission from the author.
Intended Use
This dataset is intended for research and development of machine translation systems and language learning tools, as well as for linguistic research.
The dataset has 2,324 rows in total, split into the following subsets:
examples.tsv: Example parallel sentences taken from dictionary entries.
idioms.tsv: Idiomatic expressions translated literally into Arabic. The meaning column mostly contains an explanation of the expression, not an attempt at translation.
proverbs.tsv: Proverbs translated literally into Arabic. Each proverb can have more than one meaning, found in the columns meaning1 and meaning2, which are mostly explanatory, not direct translations.
riddles.tsv: Riddles and their answers translated literally into Arabic.
poems/*.tsv: A collection of 20 translated poems.
stories/*.tsv: A collection of 8 translated folk stories.
Each subset has a source_string column containing the original Tamazight text and a target_string column containing the Modern Standard Arabic translation.
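As a minimal sketch of reading one subset, the snippet below pulls the source_string and target_string columns out of a TSV file using only the Python standard library. It assumes each file is UTF-8 encoded, starts with a header row, and does not use quote characters as field delimiters; none of these details are confirmed by the description above. The function names and the demo row are illustrative and are not taken from the dataset.

```python
import csv
import io


def read_pairs(tsv_file):
    """Yield (source_string, target_string) tuples from an open TSV file.

    Assumes a header row naming at least these two columns. Extra columns
    (for example the meaning columns in some subsets) are ignored.
    """
    # QUOTE_NONE keeps any quotation marks in the text as literal characters.
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield row["source_string"], row["target_string"]


def load_subset(path):
    """Read every sentence pair from the TSV file at `path` into a list."""
    with open(path, encoding="utf-8", newline="") as f:
        return list(read_pairs(f))


# In-memory demo with a made-up row (not taken from the dataset).
demo_file = io.StringIO("source_string\ttarget_string\nⴰⵣⵓⵍ\tمرحبا\n")
demo_pairs = list(read_pairs(demo_file))
```

With the files downloaded locally, load_subset("examples.tsv") would return the pairs of that subset as a list of tuples, provided the assumptions above hold.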
Training machine translation models on the literal translations may degrade model performance, especially in the Tamazight-to-Arabic direction.
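One possible way to act on this caveat is to keep the subsets described as literally translated apart from the rest when assembling training data. The sketch below separates file paths on that basis. The choice of idioms, proverbs, and riddles as the literal subsets follows the subset descriptions above; treating them as a separate pool is an assumption of this sketch, not a recommendation from the dataset authors. The function names and the file name poems/p1.tsv are hypothetical.

```python
from pathlib import Path

# Subsets whose descriptions say they are translated literally into Arabic.
LITERAL_SUBSETS = {"idioms", "proverbs", "riddles"}


def subset_name(path):
    """Map a file path to its subset name.

    Files under poems/ or stories/ take the directory name; top-level
    files such as examples.tsv take the file stem.
    """
    p = Path(path)
    if p.parent.name in {"poems", "stories"}:
        return p.parent.name
    return p.stem


def split_by_literalness(paths):
    """Return (literal_paths, other_paths), preserving input order."""
    literal, other = [], []
    for path in paths:
        if subset_name(path) in LITERAL_SUBSETS:
            literal.append(path)
        else:
            other.append(path)
    return literal, other
```

The literal pool could then be left out of Tamazight-to-Arabic training or held back for a separate evaluation, depending on the experiment.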