Filtrage

Filtrage is a project that aims to produce a multilingual corpus annotated with named entities. The languages are Arabic, French and English. The corpus consists of 30,000 newswires from the French news agency AFP (Agence France Presse). It is annotated semi-automatically: a state-of-the-art system produces a first annotation pass, and the annotations are then corrected manually.

ELDA is responsible for this manual correction.

The following paper, which describes the project, was presented at the 2nd International Conference on Arabic Language Resources and Tools (MEDAR 2009), held on April 22-23 in Cairo, Egypt:

A Multilingual Named Entities Corpus for Arabic, English and French (Mostefa Djamel, Laïb Mariama, Chaudiron Stéphane, Choukri Khalid and de Chalendar Gaël)

Contact: Khalid Choukri