Standard & Best Practices
Over the past years, a great deal of work has been carried out on the validation of language resources, especially speech databases. Below is an overview of the documents produced in the relevant projects. The documents are grouped by type of data (written, speech). Each part includes a general document that describes ELRA’s validation guidelines.
The various links below give access to information about the validation of Written Language Resources:
QQC of WLR-lexica | Methodology for a Quick Quality Check of WLR-lexica Deliverable D1.2 Hanne Fersøe, Sussi Olsen |
Validation manual for lexica | Validation Manual for Lexica Release 2.0, January 2004 Hanne Fersøe |
Towards a Standard for the Creation of Lexica | Towards a Standard for the Creation of Lexica May 2003 Monica Monachini, Francesca Bertagna, Nicoletta Calzolari, Nancy Underwood, Costanza Navarretta |
Validation of corpora | Validation of Linguistic Corpora 28 April 1998 Tony McEnery, Lou Burnard, Andrew Wilson and Baker |
EAGLES/ISLE | EAGLES/ISLE Meta Data Initiative web site |
The following documents present guidelines for the validation checks to be carried out in order to ensure a given quality standard for Spoken Language Resources (SLR). The proposed methods have been chosen to provide a good balance between achievable quality standards and the associated costs of the validation procedure.
Methodology for a Quick Quality Check of SLR and phonetic lexicons (doc) | Methodology for a Quick Quality Check of SLR and Phonetic Lexicons (VCom) Deliverable D1.2, 18 August 2004 Henk van den Heuvel |
QQC Report for LEX (doc) | SPEX Quick Quality Check QQC Report for Lexica Henk van den Heuvel |
QQC Report for SLR (doc) | SPEX Quick Quality Check QQC Report for SLR Henk van den Heuvel |
“Validation criteria” of OrienTel (pdf) | Specification of Validation Criteria (OrienTel project) Deliverable D6.2, 25 August 2002 Dorota Iskra, Henk van den Heuvel, Oren Gedge, Shaunie Shammass |
“Validation criteria” of SpeeCon (pdf) | Definition of Validation Criteria (SpeeCon project) Deliverable D41, 25 April 2002 Henk van den Heuvel, Shaunie Shammass, Ami Moyal, Oren Gedge |
ELRA’s Validation manual for SLR (pdf) | Validation of Content and Quality of Existing SLR: Overview and Methodology Deliverable 1.1, 21 January 2000 Henk van den Heuvel, Louis Boves, Eric Sanders |
“Validation criteria” of SpeechDat-Car | Validation criteria (SpeechDat-Car project) Deliverable D1.3.1, 12 September 2000 Henk van den Heuvel |
“Validation criteria” of SpeechDat(E) | Validation criteria (SpeechDat(E) project) Deliverable ED1.4.2, 27 October 1999 Henk van den Heuvel |
“Validation criteria” of SpeechDat(II) (pdf) | Validation criteria (SpeechDat(II) project) Deliverable SD1.3.3, 5 November 1997 Henk van den Heuvel |