Efficient term cloud generation for streaming web content

Odysseas Papapetrou, George Papadakis, Ekaterini Ioannou, Dimitrios Skoutas

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

Large amounts of information are posted daily on the Web, such as articles published online by traditional news agencies or blog posts referring to and commenting on various events. Although the users sometimes rely on a small set of trusted sources from which to get their information, they often also want to get a wider overview and glimpse of what is being reported and discussed in the news and the blogosphere. In this paper, we present an approach for supporting this discovery and exploration process by exploiting term clouds. In particular, we provide an efficient method for dynamically computing the most frequently appearing terms in the posts of monitored online sources, for time intervals specified at query time, without the need to archive the actual published content. An experimental evaluation on a large-scale real-world set of blogs demonstrates the accuracy and the efficiency of the proposed method in terms of computational time and memory requirements.
Original language: English
Title of host publication: International Conference on Web Engineering
Subtitle of host publication: ICWE 2010 Web Engineering
Place of Publication: Berlin
Publisher: Springer
Pages: 385-399
Number of pages: 15
DOIs
Publication status: Published - 2010
Externally published: Yes
Event: 10th International Conference on Web Engineering - Vienna, Austria
Duration: 5 Jul 2010 → 9 Jul 2010

Publication series

Name: Lecture Notes in Computer Science
Volume: 6189

Conference

Conference: 10th International Conference on Web Engineering
Abbreviated title: ICWE 2010
Country/Territory: Austria
City: Vienna
Period: 5/07/10 → 9/07/10
