2020
The CLARIN Knowledge Centre for Atypical Communication Expertise
Henk van den Heuvel | Nelleke Oostdijk | Caroline Rowland | Paul Trilsbeek
Proceedings of The 12th Language Resources and Evaluation Conference
This paper introduces a new CLARIN Knowledge Centre, the K-Centre for Atypical Communication Expertise (ACE for short), which has been established at the Centre for Language and Speech Technology (CLST) at Radboud University. Atypical communication is an umbrella term used here to denote language use by second language learners, people with language disorders or those suffering from language disabilities, but also more broadly by bilinguals and users of sign languages. It involves multiple modalities (text, speech, sign, gesture) and encompasses different developmental stages. ACE closely collaborates with The Language Archive (TLA) at the Max Planck Institute for Psycholinguistics in order to safeguard GDPR-compliant data storage and access. We explain the mission of ACE and illustrate its potential with a number of showcases and a use case.
Corpora of Disordered Speech in the Light of the GDPR: Two Use Cases from the DELAD Initiative
Henk van den Heuvel | Aleksei Kelli | Katarzyna Klessa | Satu Salaasti
Proceedings of The 12th Language Resources and Evaluation Conference
Corpora of disordered speech (CDS) are costly to collect and difficult to share due to personal data protection and intellectual property (IP) issues. In this contribution we discuss the legal grounds for processing CDS in the light of the GDPR, and illustrate these with two use cases from the DELAD context. One use case deals with clinical datasets and the other with legacy data from Polish hearing-impaired children. For both cases, processing based on consent and processing based on public interest are taken into consideration.
A CLARIN Transcription Portal for Interview Data
Christoph Draxler | Henk van den Heuvel | Arjan van Hessen | Silvia Calamai | Louise Corti
Proceedings of The 12th Language Resources and Evaluation Conference
In this paper we present a first version of a transcription portal for audio files based on automatic speech recognition (ASR) in various languages. The portal is implemented in the CLARIN resources research network and intended for use by non-technical scholars. We explain the background and interdisciplinary nature of interview data, the perks and quirks of using ASR for transcribing the audio in a research context, the dos and don’ts for optimal use of the portal, and future developments foreseen. The portal is promoted in a range of workshops, but there are a number of challenges that have to be met. These challenges concern privacy issues, ASR quality, and cost, amongst others.
Crossing the SSH Bridge with Interview Data
Henk van den Heuvel
Proceedings of the Workshop about Language Resources for the SSH Cloud
Spoken audio data, such as interview data, is a scientific instrument used by researchers in various disciplines crossing the boundaries of social sciences and humanities. In this paper, we take a closer look at a portal designed to perform speech-to-text conversion on audio recordings through Automatic Speech Recognition (ASR) in the CLARIN infrastructure. Within the cross-domain EU cluster project SSHOC, the potential value of such a linguistic toolkit for processing spoken language recordings has found uptake in a webinar about the topic, and in a task addressing audio analysis of panel survey data. The objective of this contribution is to show that the processing of interviews as a research instrument has opened up a fascinating and fruitful area of collaboration between Social Sciences and Humanities (SSH).
CLARIN: Distributed Language Resources and Technology in a European Infrastructure
Maria Eskevich | Franciska de Jong | Alexander König | Darja Fišer | Dieter Van Uytvanck | Tero Aalto | Lars Borin | Olga Gerassimenko | Jan Hajic | Henk van den Heuvel | Neeme Kahusk | Krista Liin | Martin Matthiesen | Stelios Piperidis | Kadri Vider
Proceedings of the 1st International Workshop on Language Technology Platforms
CLARIN is a European Research Infrastructure providing access to digital language resources and tools from across Europe and beyond to researchers in the humanities and social sciences. This paper focuses on CLARIN as a platform for the sharing of language resources. It zooms in on the service offer for the aggregation of language repositories and the value proposition for a number of communities that benefit from the enhanced visibility of their data and services as a result of integration in CLARIN. The enhanced findability of language resources serves the social sciences and humanities (SSH) community at large and supports research communities that aim to collaborate based on virtual collections for a specific domain. The paper also addresses the wider landscape of service platforms based on language technologies, which has the potential to become a powerful set of interoperable facilities for a variety of communities of use.