Leaving Academia

Posted on Fri 27 January 2023 in misc

This has been on my phone(!) for far too long, so I might as well post it.

Before talking about why I left academia and how I feel about the transition, let me first describe how I entered academia and worked in it.

University was the first time I felt like I …



Notes for a Scientific Writing Workshop

Posted on Mon 09 March 2020 in 2020-scientific-writing • Tagged with teaching

Lucia and I hold a scientific writing workshop for students. No credits, four days split into two blocks. These are (some of) the notes, mainly about LaTeX. This document is incomplete and subject to updates. We also uploaded the slides for the workshop.

Wikibooks has an excellent book on LaTeX …



We need to talk about significance tests

Posted on Thu 24 October 2019 in misc • Tagged with nlp

At ACL 2019, "We Need to Talk about Standard Splits" by Kyle Gorman and Steven Bedrick was honored as one of five outstanding papers. The authors perform a replication study on PoS taggers to evaluate whether the reported accuracies can be reproduced and whether those accuracies hinge on using the …



Some tips on writing software for research

Posted on Wed 17 July 2019 in misc • Tagged with programming, nlp

These are my notes for a presentation in our group at Saarland University. The presentation was mainly about software written as part of experiments in NLP, but most of the tips are not specific to NLP; they concern writing code for reproducible experiments that involve processing data sets. This …



Why we chose XML for the SWC annotations

Posted on Wed 29 November 2017 in misc • Tagged with corpora

I was asked why we use XML instead of JSON for the Spoken Wikipedia Corpora:

As mentioned, we actually started with JSON. The first version of …



GamersGlobal Comment Corpus released

Posted on Sat 18 November 2017 in nlp • Tagged with corpus

Today I'm releasing the GamersGlobal comment corpus. GamersGlobal is a German computer gaming site (and my favorite one!) with a fairly active comment section below each article. This corpus contains all comments by the 20 most active users up to November 2016.

I use this corpus for teaching, mainly author …



abgaben.el: assignment correction with emacs

Posted on Mon 13 November 2017 in software • Tagged with emacs, teaching

Part of my job at the university is teaching, and that entails correcting assignments. In the old days, I would receive the assignments by email, print them, write comments in the margins, award points and hand them back. This approach has two downsides:

  • assignments are done by …


ESSLLI Course on Incremental NLP

Posted on Mon 03 October 2016 in nlp

Timo and I held a course on incremental processing at ESSLLI 2016. If you have a look at (most of) our publications, you will see that Timo works on incremental speech processing and I work on incremental text processing. The course was about incremental NLP in general and I hope we …



GPS track visualization for videos

Posted on Mon 03 October 2016 in misc

We recently went for a ride in the very nice Alsterquellgebiet just north of Hamburg. We had a camera mounted, and from time to time I shot a short video.

Back home, I wanted to visualize where we were during each video to make a short clip using kdenlive. The …



Evaluating Embeddings using Syntax-based Classification Tasks as a Proxy for Parser Performance

Posted on Sun 19 June 2016 in Publications

My paper about the correlation between syneval and parsing performance has been accepted at RepEval 2016. You can find the code, data, etc. here. I am looking forward to Berlin (a 1.5-hour train ride from Hamburg).

