Task: NLP
Release Date: 5/12/2026
Format: CSV
Size: 100.24 KB
This repository contains the dataset associated with the paper "Less is More? The Role of Demographic Author Information in Emotion Classification of Ambiguous Text".
Licensing
Creative Commons Attribution 4.0 International (CC-BY-4.0)
https://spdx.org/licenses/CC-BY-4.0.html
Restrictions/Special Constraints
This dataset should be used for research purposes only.
Forbidden Usage
This data should not be used for personalized author profiling.
Emotion annotation is inherently subjective, often resulting in low agreement between annotators. This dataset supports research investigating whether providing annotators with demographic information about the text author reduces ambiguity and improves annotation consistency.
The dataset is derived from the crowd-enVENT corpus, which consists of personal event descriptions and associated emotion annotations.
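One simple way to quantify "annotation consistency" is the share of annotators who pick the most frequent label for a text. The following is a minimal sketch of that idea, not necessarily the agreement measure used in the paper. The function names, the text ids, and the toy labels are all hypothetical and are not taken from the dataset.

```python
from collections import Counter


def majority_agreement(labels):
    """Fraction of annotators who chose the most frequent label for one text."""
    if not labels:
        raise ValueError("labels must not be empty")
    top_count = Counter(labels).most_common(1)[0][1]
    return top_count / len(labels)


def mean_agreement(annotations):
    """Average majority agreement over texts.

    `annotations` maps a text id to the list of labels it received.
    """
    scores = [majority_agreement(labels) for labels in annotations.values()]
    return sum(scores) / len(scores)


# Hypothetical toy annotations (assumption: NOT real data from this dataset).
without_demographics = {
    "t1": ["joy", "pride", "relief", "joy"],
    "t2": ["anger", "sadness", "anger", "fear"],
}
with_demographics = {
    "t1": ["joy", "joy", "relief", "joy"],
    "t2": ["anger", "anger", "anger", "sadness"],
}

baseline = mean_agreement(without_demographics)
informed = mean_agreement(with_demographics)
print(f"without demographics: {baseline:.2f}")
print(f"with demographics:    {informed:.2f}")
```

Comparing the two averages gives a rough indication of whether the extra author information moved annotators toward the same label. A chance-corrected statistic such as Fleiss' kappa would be a stricter alternative.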
📊 Dataset Description
Total texts: 250
Total annotators: 500
Source: crowd-enVENT corpus (subset with low agreement cases)