Projects

We are currently involved, or were recently involved, in the following projects:

LC-Star & LC-Star II
LC-Star (2002-2006) was an EU-funded project and LC-Star II (2006-) is a corporate-funded project, both with the goal of creating language resources (phonetic lexicons and text corpora) which can be used for transferring speech-to-speech translation technology in multi-lingual environments. SPEX is responsible for the validation of the phonetic lexicons.

LILA
LILA (2005-) is a corporate-funded project with the aim of collecting speech databases for mobile telephony in Asia. SPEX is responsible for the validation of all the databases at different stages in the production.

Validation for ELRA
Together with the European Language Resources Association (ELRA), SPEX aims to maximise the quality of the spoken language resources in ELRA's catalogue. This is achieved through tailored validation procedures for both the existing resources and new resources entering the catalogue.

TC-Star
TC-Star (2004-2007) was an EU-funded project aimed at developing technology and corpora for speech-to-speech translation. SPEX was responsible for the quality checks of the corpora produced within the project.

Spoken Dutch Corpus (CGN)
CGN is the largest corpus of contemporary Dutch as spoken by adults in Flanders and the Netherlands. It was collected between 1998 and 2004 and consists of 9 million words (800 hours of speech). SPEX was responsible for the orthographic and manual phonetic transcriptions of the part collected in the Netherlands.

OrienTel
OrienTel (2001-2004) was an EU-funded project that focused on the development of language resources for speech-based telephony applications across the Mediterranean and the Middle East, with a special emphasis on Arabic. As in SpeeCon, SPEX was involved in creating database specifications and was responsible for the validation of all the databases at different stages in the production.

SALA II
SALA II (2002-2004) was a corporate-funded project with the aim of collecting speech databases for mobile telephony in South and North America. SPEX was responsible for the validation of all the databases at different stages in the production.

SpeeCon
SpeeCon (2000-2003) was an EU-funded project in which speech databases were collected for speech-driven consumer applications in almost 30 languages. SPEX was involved in creating specifications for the production of the databases and was responsible for the validation of all the databases at different stages in the production.

Database collections
In the recent past we have also carried out data collection and lexicon creation for customers such as Temic, Philips and KPN. We have also been involved in the creation of speech databases in various SpeechDat projects.