Standard & Best Practices

In recent years, a great deal of work has been carried out on the validation of language resources, especially speech databases. Below is an overview of the documents produced in the relevant projects, split according to the type of data (spoken, written). Each part includes a general document that describes ELRA’s general validation guidelines.

The various links below give access to information about the validation of Written Language Resources:

QQC of WLR-lexica Methodology for a Quick Quality Check of WLR-lexica
Deliverable D1.2
Hanne Fersøe, Sussi Olsen
Validation Manual for Lexica
Release 2.0, January 2004
Hanne Fersøe
Towards a Standard for the Creation of Lexica
May 2003
Monica Monachini, Francesca Bertagna, Nicoletta Calzolari, Nancy Underwood, Costanza Navarretta
Validation of corpora Validation of Linguistic Corpora
28 April 1998
Tony McEnery, Lou Burnard, Andrew Wilson and Baker
EAGLES/ISLE Meta Data Initiative web site

The following documents present guidelines for the validation checks to be carried out to verify that Spoken Language Resources (SLR) meet a given quality standard. The proposed methods were chosen to balance achievable quality standards against the cost of the validation procedure.

Methodology for a Quick Quality Check of SLR and Phonetic Lexicons (VCom) (doc)
Deliverable D1.2, 18 August 2004
Henk van den Heuvel

QQC Report for LEX (doc) SPEX Quick Quality Check QQC Report for Lexica
Henk van den Heuvel
QQC Report for SLR (doc) SPEX Quick Quality Check QQC Report for SLR
Henk van den Heuvel
“Validation criteria” of OrienTel (pdf) Specification of Validation Criteria (OrienTel project)
Deliverable D6.2, 25 August 2002
Dorota Iskra, Henk van den Heuvel, Oren Gedge, Shaunie Shammass
“Validation criteria” of SpeeCon (pdf) Definition of Validation Criteria (SpeeCon project)
Deliverable D41, 25 April 2002
Henk van den Heuvel, Shaunie Shammass, Ami Moyal, Oren Gedge
ELRA’s Validation manual for SLR (pdf) Validation of Content and Quality of Existing SLR: Overview and Methodology
Deliverable 1.1, 21 January 2000
Henk van den Heuvel, Louis Boves, Eric Sanders

“Validation criteria” of SpeechDat-Car Validation criteria (SpeechDat-Car project)
Deliverable D1.3.1, 12 September 2000
Henk van den Heuvel
“Validation criteria” of SpeechDat(E) Validation criteria (SpeechDat(E) project)
Deliverable ED1.4.2, 27 October 1999
Henk van den Heuvel
“Validation criteria” of SpeechDat(II) (pdf) Validation criteria (SpeechDat(II) project)
Deliverable SD1.3.3, 5 November 1997
Henk van den Heuvel