We propose a method for automatically identifying rhetorical relations. We use supervised machine learning but exploit cue phrases to automatically extract and label the training data. Our models draw on a variety of linguistic cues to distinguish between the relations. We show that these feature-rich models outperform the previously suggested bigram models by more than 20%, at least for small training sets. Our approach is therefore better suited to relations for which large amounts of training data are difficult to label automatically, because they are rarely signalled by unambiguous cue phrases (e.g., "continuation").
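The idea of using cue phrases to label training data automatically can be illustrated with a minimal sketch. The abstract does not list the cue phrases or the relation inventory, so the `CUE_PHRASES` mapping and the `label_example` helper below are hypothetical. Stripping the cue from the labelled span is likewise an assumption of this sketch, made so that a classifier would have to rely on other linguistic cues.

```python
import re
from typing import Optional, Tuple

# Hypothetical cue-phrase inventory, for illustration only.
# The actual phrases and relations used in the paper are not given in the abstract.
CUE_PHRASES = {
    "because": "explanation",
    "as a result": "result",
    "in short": "summary",
    "although": "contrast",
}


def label_example(left: str, right: str) -> Optional[Tuple[str, str, str]]:
    """Label a pair of adjacent spans by the cue phrase that opens the second span.

    Returns (left, right_without_cue, relation), or None if no known cue is found.
    """
    stripped = right.strip()
    lowered = stripped.lower()
    # Try longer cues first so that multi-word phrases are not shadowed by shorter ones.
    for cue in sorted(CUE_PHRASES, key=len, reverse=True):
        if re.match(rf"{re.escape(cue)}\b", lowered):
            # Assumption: drop the cue (and a following comma) from the training example.
            remainder = stripped[len(cue):].lstrip(" ,")
            return left.strip(), remainder, CUE_PHRASES[cue]
    return None


if __name__ == "__main__":
    print(label_example("The match was cancelled.", "As a result, fans went home."))
    print(label_example("It rained.", "We stayed in."))
```

Pairs for which the helper returns `None` carry no automatic label. This is the situation the abstract describes for relations such as "continuation", which are rarely signalled by an unambiguous cue.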
|Title of host publication|Proceedings of Recent Advances in Natural Language Processing (RANLP-05)|
|Place of Publication|Borovets, Bulgaria|
|Number of pages|8|
|Publication status|Published - 2005|