Sunday, November 30, 2014

publication on sentence boundary detection

My first publication was on
Automatic sentence boundary detection in German broadcast news, which came out of my master's thesis at the Fraunhofer institute for applied research.


Abstract:
In this work we aim at enriching the transcript of an automatic
speech recognition system with punctuation by automatically
detecting sentence ends. We make use of a
simple word-based language model and combine it with
a decision tree for the acoustic features of speech. The
focus lies on selecting robust acoustic features that best
reflect the prosodic characteristics of the German language.
We arrive at a Sentence Unit Error Rate
of 54 compared to the state-of-the-art rate for English of
61, by applying a comparable detection system. This is a
sound indication that prosody is a stronger cue for the perception
of sentence boundaries in German than in English.
Our work is, to our knowledge, the first system developed
for sentence boundary detection in the broadcast news domain
for the German language. Our results can therefore serve
as a baseline for further studies in this scenario.
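
For readers unfamiliar with the metric, here is a minimal sketch of how a Sentence Unit Error Rate can be computed. The abstract does not spell out the formula, so this assumes the definition commonly used in the speech literature: missed boundaries plus falsely inserted boundaries, divided by the number of reference boundaries, expressed as a percentage. The function name and the toy boundary positions are made up for illustration and are not taken from the paper.

```python
def su_error_rate(reference, hypothesis):
    """Boundary error rate in percent.

    Assumed definition: (misses + false alarms) / reference boundaries.
    Both arguments are collections of word indices after which a
    sentence ends.
    """
    reference, hypothesis = set(reference), set(hypothesis)
    if not reference:
        raise ValueError("reference must contain at least one boundary")
    misses = len(reference - hypothesis)        # boundaries the system did not find
    false_alarms = len(hypothesis - reference)  # boundaries the system invented
    return 100.0 * (misses + false_alarms) / len(reference)


# Toy example with invented boundary positions (not data from the paper).
ref = {4, 9, 15, 22}
hyp = {4, 10, 15}
rate = su_error_rate(ref, hyp)  # 2 misses + 1 false alarm over 4 reference boundaries
```

Under this definition the rate can exceed 100 when a system inserts many spurious boundaries, which is one reason it reads as a stricter measure than plain classification accuracy.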
