Open Universiteit

Title: Classifying Written Texts Through Rhythmic Features
Authors: Balint, Mihaela; Dascalu, Mihai; Trausan-Matu, Stefan
Keywords: rhythm; text classification; natural language processing; discourse analysis
Issue Date: 2016
Publisher: Springer
Citation: Balint, M., Dascalu, M., & Trausan-Matu, S. (2016). Classifying Written Texts through Rhythmic Features. In 15th Int. Conf. on Artificial Intelligence: Methodology, Systems, and Applications (AIMSA 2016) (pp. 121–129). Varna, Bulgaria: Springer
Abstract: Rhythm analysis of written texts focuses on literary analysis and mainly considers poetry. In this paper we investigate the relevance of rhythmic features for categorizing prose texts of different genres. Our contribution is threefold. First, we define a set of rhythmic features for written texts. Second, we extract these features from three corpora: speeches, essays, and newspaper articles. Third, we perform feature selection by means of statistical analyses and determine a subset of features that efficiently discriminates between the three genres. We find that, using as few as eight rhythmic features, documents can be assigned to a given genre with an accuracy of around 80%, significantly higher than the 33% baseline that results from random assignment.
Appears in Collections: 1. RAGE Publications

Files in This Item:
File: balint2016.pdf
Size: 348.16 kB
Format: Adobe PDF

This item is licensed under a Creative Commons License.