
Evaluation events organized by ELRA

ELRA 18th Anniversary

On November 19-20, 2013, ELRA celebrated its 18th anniversary and invited NLP experts from all over the world to participate in the ELRA International Workshop on Sharing Language Resources: Landscape and Trends.

Planning conferences to avoid overlaps, boosting the identification and discovery of Language Resources, promoting best practices in Language Resource citation in publications, encouraging the sharing of Language Resources and Tools through infrastructures, and strengthening relations with neighbouring communities were among the main themes discussed by the participants.



As part of the CLARA Marie Curie Initial Training Network, ELDA and the University of Copenhagen organized a four-day thematic training course on the evaluation of Human Language Technologies, held on November 26-29 in Paris. The course was attended by 10 CLARA fellows and other researchers from the HLT field.

By introducing and describing key evaluation concepts (comparative evaluation versus competition, technology evaluation versus usage/usability evaluation), the course provided both ESRs (Early Stage Researchers) and ERs (Experienced Researchers) with the background and skills to use and implement state-of-the-art evaluation tools and techniques for speech technologies, grammars and parsing, machine translation and speech-to-speech translation, information retrieval/filtering, multi-modal interfaces, etc.


The programme file provides both the course schedule and the slides presented during the course.

Invited Speakers

  • Victoria Arranz, ELDA, France
  • Khalid Choukri, ELDA, France
  • Edouard Geoffrois, DGA, France
  • Olivier Hamon, ELDA, France
  • Bente Maegaard, University of Copenhagen, Denmark
  • Aurélien Max, LIMSI, France
  • Djamel Mostefa, ELDA, France
  • Patrick Paroubek, LIMSI, France
  • Anders Søgaard, University of Copenhagen, Denmark
Looking into the Future of Evaluation @ LREC 2008 - 27 May 2008, Marrakech

Held in conjunction with the 6th International Language Resources and Evaluation Conference

View/download the Workshop’s Proceedings (PDF)


Workshop Chairing Team

  • Gregor Thurmair, Linguatec Sprachtechnologien GmbH, Germany - chair
  • Khalid Choukri, ELDA - Evaluations and Language resources Distribution Agency, France - co-chair
  • Bente Maegaard, CST, University of Copenhagen, Denmark - co-chair


Workshop Programme

09:00 Welcome and Introduction

09:15 Technology Advancement has Required Evaluations to Change Data and Tasks — and now Metrics Mark Przybocki; NIST, USA

10:00 Explicit and Implicit Requirements of Technology Evaluations: Implications for Test Data Creation Lauren Friedman, Stephanie Strassel, Meghan Lammie Glenn; Linguistic Data Consortium, USA

10:35 Coffee break

11:00 Automated MT Evaluation for Error Analysis: Automatic Discovery of Potential Translation Errors for Multiword Expressions Bogdan Babych, Anthony Hartley; Centre for Translation Studies, University of Leeds, United Kingdom

11:35 Discussion

12:15 Reference-based vs. Task-based Evaluation of Human Language Technology Andrei Popescu-Belis; IDIAP Research Institute, Switzerland

12:50 FEIRI: Extending ISLE’s FEMTI for the Evaluation of a Specialized Application in Information Retrieval Keith J. Miller; The MITRE Corporation, USA

13:25 Lunch

14:30 Discussion

15:15 Evaluating a Natural Language Processing Approach in Arabic Information Retrieval Nasredine Semmar, Laib Meriama, Christian Fluhr; CEA, LIST, Laboratoire d’ingénierie de la Connaissance Multimédia Multilingue, France, NewPhenix, France

15:50 Coffee break

16:20 A Review of the Benefits and Issues of Speaker Verification Evaluation Campaigns Asmaa El Hannani, Jean Hennebert; Department of Computer Science, University of Sheffield, UK, HESSO, Business Information Systems, Switzerland and University of Fribourg, Switzerland

16:55 Field Testing of an Interactive Question-Answering Character Ron Artstein, Sudeep Gandhe, Anton Leuski and David Traum; Institute for Creative Technologies, University of Southern California, USA

17:30 Discussion and conclusions

18:15 Close

MT Summit XI, Copenhagen - 11 September 2007


Workshop organised by the ELRA Evaluation Committee

  • Gregor Thurmair, Linguatec
  • Khalid Choukri, ELDA
  • Bente Maegaard, University of Copenhagen

The purpose of this workshop was to discuss automatic evaluation procedures in MT. Among the discussion points were:

  • What do the scores really measure?
  • What kind of implicit assumptions do they make?
  • What kind of initial effort do they require (e.g. pre-translating the test corpus)?
  • What kind of resources do they need (e.g. third-party grammars)?
  • Are they biased towards specific MT technologies?
  • What kind of diagnostic support can they give (i.e. where to improve the system)?
  • What kind of evaluation criteria (e.g. related to the FEMTI framework) do they support (adequacy, fluency, …)?

The objective of the workshop was to have a better understanding of the strengths and limitations of the respective approaches, and perhaps make steps towards defining a common methodology for MT output evaluation.
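To illustrate the kind of automatic procedure under discussion, a word-level edit distance (the basis of metrics such as WER and TER) can be sketched in a few lines. This is a minimal illustration, not any system presented at the workshop; the function names and example sentences are invented for the sketch.

```python
def word_edit_distance(hypothesis, reference):
    """Levenshtein distance over word tokens: the minimum number of
    substitutions, insertions and deletions (each costing 1) needed
    to turn the hypothesis into the reference."""
    hyp, ref = hypothesis.split(), reference.split()
    # dp[i][j] = edit distance between hyp[:i] and ref[:j]
    dp = [[0] * (len(ref) + 1) for _ in range(len(hyp) + 1)]
    for i in range(len(hyp) + 1):
        dp[i][0] = i
    for j in range(len(ref) + 1):
        dp[0][j] = j
    for i in range(1, len(hyp) + 1):
        for j in range(1, len(ref) + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # match/substitution
    return dp[-1][-1]

def word_error_rate(hypothesis, reference):
    """Edit distance normalised by reference length (WER-style score)."""
    return word_edit_distance(hypothesis, reference) / len(reference.split())

# One substitution ("the" -> "a") and one insertion ("down"):
print(word_edit_distance("the cat sat", "a cat sat down"))  # 2
print(word_error_rate("the cat sat", "a cat sat down"))     # 0.5
```

Such a score makes the workshop's questions concrete: it implicitly assumes a single reference translation, counts all word-level errors equally regardless of their effect on meaning, and offers no diagnostic breakdown of where a system should improve.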


(Click the title to view/download the presentation)

9.00 Welcome and introduction

9.20 The place of automatic evaluation metrics in external quality models for machine translation (PDF, 104 KB, 19 slides)
Andrei Popescu-Belis, University of Geneva

10.00 Evaluating Evaluation - Lessons from the WMT’07 Shared Task (PDF, 420 KB, 38 slides)
Philipp Koehn, University of Edinburgh

10.30 Coffee break

11.00 Investigating Why BLEU Penalizes Non-Statistical Systems (PDF, 261 KB, 10 slides)
Eduard Hovy, University of Southern California

11.30 Edit distance as an evaluation metric (PDF, 997 KB, 34 slides)
Christopher Cieri, Linguistic Data Consortium

12.00 Experience and conclusions from the CESTA evaluation project (PDF, 102 KB, 22 slides)
Olivier Hamon, ELDA

12.30 Lunch

13.30 Automatic Evaluation in MT system production (PDF, 147 KB, 28 slides)
Gregor Thurmair, Linguatec

14.00 Sensitivity of performance-based and proximity-based models for MT evaluation (PDF, 144 KB, 22 slides)
Bogdan Babych, Univ. Leeds

14.30 Automatic & human Evaluations of MT in the framework of a speech to speech communication (PDF, 178 KB, 33 slides)
Khalid Choukri, ELDA

15.00 Coffee break

15.30 Discussion and conclusions

17.00 Close

HLT Evaluation Workshop in Malta - 1st and 2nd December 2005


To celebrate ELRA’s 10th anniversary, a 2-day workshop dedicated to Human Language Technologies (HLT) Evaluation was held in Malta on December 1st and 2nd 2005.

In organizing this workshop, ELRA, whose main missions over the past decade have been to promote language resources for the HLT sector and to promote the evaluation of language engineering technologies, intended to bring together the key players in HLT evaluation to discuss it from various perspectives: general principles and purposes, technologies, past and ongoing evaluation projects, worldwide initiatives, etc.

The main objective of the workshop was to allow fruitful brainstorming on HLT evaluation, starting from what is being done today and what should be done better, differently, etc. All sectors of HLT were addressed (speech technologies, grammars and parsing, machine translation and speech-to-speech translation, information retrieval/filtering, multimodal interfaces, etc.). The number of participants was limited to about 50 to make the event very productive.
