Mining the Spoken Wikipedia for Speech Data and Beyond

Posted on Mo 30 Mai 2016 in Publications • Tagged with corpus

Our paper Mining the Spoken Wikipedia for Speech Data and Beyond has been accepted at LREC. Timo presented it and the reception seemed to be rather good. You can find our paper about hours and hours of time-aligned speech data generated from the Spoken Wikipedia at the Spoken Wikipedia Corpora …


Continue reading Comments

What’s in an Embedding? Analyzing Word Embeddings through Multilingual Evaluation

Posted on Di 15 September 2015 in Publications

I'm presenting my paper What’s in an Embedding? Analyzing Word Embeddings through Multilingual Evaluation at EMNLP. You can have a look at the data, code, and examples.

Hopefully, the EMNLP video recordings will be online at some point. As of now (2016-04), they are not.


Continue reading Comments

My Bachelor thesis

Posted on Do 31 Dezember 2009 in Publications

My bachelor thesis is in German, you can find it at the open access repository of our department: Inkrementelle Part-of-Speech Tagger

I also made an overview in English with the relevant results.


Continue reading Comments