License:
CC BY-NC-ND 4.0
Steward:
EELLAK - GreekFOSS
Task: NLP
Release Date: 4/16/2026
Format: PARQUET
Size: 32.60 MB
This dataset is a structured digital collection of press releases and news articles sourced from the official press platform of the Hellenic Broadcasting Corporation, ERT (https://press.ert.gr). It contains 18,979 entries representing the official communication and archival record of the Greek national public broadcaster. The dataset covers a wide range of content, including institutional announcements, television and radio programming updates, and news coverage. Each entry provides the full article text along with its publication timestamp and original source URL.
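The entry structure described above (article text, publication timestamp, source URL) can be explored with a short script. The following is a minimal sketch, not an official loader: the field names `text`, `timestamp`, and `url` and the file path passed to `load_records` are assumptions, since the actual schema and file name are not stated here, and the sample rows are invented purely to illustrate the shape of an entry. Reading Parquet through pandas also requires `pyarrow` or `fastparquet` to be installed.

```python
from collections import Counter
from datetime import datetime


def load_records(path):
    """Read a Parquet file into a list of dicts, one per entry."""
    # Imported lazily so the helper below also works without pandas.
    import pandas as pd

    return pd.read_parquet(path).to_dict(orient="records")


def entries_per_year(records, timestamp_field="timestamp"):
    """Count entries per publication year.

    Assumes each timestamp is either a datetime object or an
    ISO 8601 string; the field name is a hypothetical default.
    """
    years = Counter()
    for record in records:
        value = record[timestamp_field]
        if not isinstance(value, datetime):
            value = datetime.fromisoformat(str(value))
        years[value.year] += 1
    return dict(sorted(years.items()))


# Invented rows that only mimic the assumed schema; not real dataset content.
sample = [
    {"text": "Example announcement", "timestamp": "2023-05-01T10:00:00", "url": "https://example.org/1"},
    {"text": "Example programme update", "timestamp": "2023-11-20T08:30:00", "url": "https://example.org/2"},
    {"text": "Example news item", "timestamp": "2024-01-15T12:00:00", "url": "https://example.org/3"},
]

print(entries_per_year(sample))
```

With the real file, `entries_per_year(load_records(path))` would give a quick view of how the 18,979 entries are spread over time, provided the column names are first checked against the actual schema and adjusted as needed.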
Licensing
Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International
https://creativecommons.org/licenses/by-nc-nd/4.0/deed.en
Restrictions/Special Constraints
This dataset is distributed under CC BY-NC-ND 4.0 (attribution required, non-commercial use only, no derivatives). All intellectual property rights remain with ERT. Use is strictly limited to non-commercial research purposes. Any commercial use, redistribution, or creation of derivative works (including trained models intended for commercial use) requires prior authorization from ERT. The dataset creators do not grant any additional rights beyond those permitted under applicable law.
Forbidden Usage
Any commercial exploitation, redistribution of modified versions, or use that violates ERT’s terms of service or copyright.
Ethical Review
This dataset contains content derived from copyrighted sources that are not owned by the dataset creators. The data has been collected and processed from lawfully accessed materials and structured for research purposes in language technology. The decision to include this dataset is based on the understanding that such use falls within applicable copyright frameworks for research. No ownership over the original content is claimed.
To ensure responsible use:
- The dataset is released under a CC BY-NC-ND 4.0 license (non-commercial, no derivatives).
- Users are explicitly informed that any use beyond research may require permission from the original rights holders.
- The dataset is distributed via controlled platforms to support appropriate governance.
Intended Use
This dataset is intended for non-commercial research in natural language processing, linguistics, and Greek-language AI. Example applications include:
- Language modeling experiments
- Lexical and semantic analysis
- Evaluation of NLP systems
- Academic research and benchmarking
Use of this dataset for training machine learning models (including large language models) is permitted only within a research context. Any commercial use, or use beyond research purposes, requires appropriate authorization from the original content rights holders. The dataset creators do not grant such rights.