Discovering Clusters of Arbitrary Shapes and Densities in Data Streams: A density-based and grid-based approach to discover clusters in data streams - Softcover

Magdy, Amr; M. El-Makky, Nagwa; A. Yousri, Noha

 
ISBN: 9783846524343

Synopsis

The sheer volume of continuously flowing data poses a number of challenges for data stream analysis. Exploring the structure of streamed data is a major challenge that has led to the introduction of various clustering algorithms. However, current clustering algorithms still cannot efficiently discover clusters of arbitrary densities in data streams. In this thesis, a new grid-based and density-based algorithm is proposed for clustering data streams. It addresses the drawbacks of recent algorithms in discovering clusters of arbitrary densities. The algorithm uses an online component to map the input data to grid cells. An offline component then clusters the grid cells based on density information. Relative density relatedness measures and a dynamic range neighborhood are proposed to differentiate clusters of arbitrary densities. The experimental evaluation shows considerable improvements over state-of-the-art algorithms in both clustering quality and scalability across different stream sizes and higher dimensions. In addition, the output quality of the proposed algorithm is less sensitive to parameter selection errors.
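The online/offline split described in the synopsis can be illustrated with a minimal sketch. This is not the algorithm proposed in the book: it is a generic grid-and-density toy that uses a fixed cell size and a fixed count threshold in place of the relative density relatedness measures and dynamic range neighborhood the authors describe. The class name `GridStreamSketch` and the parameters `cell_size` and `min_count` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict, deque
from itertools import product
from math import floor


class GridStreamSketch:
    """Toy grid/density stream clusterer (illustrative only, not the book's method)."""

    def __init__(self, cell_size, min_count):
        self.cell_size = cell_size      # hypothetical: side length of one grid cell
        self.min_count = min_count      # hypothetical: points needed for a "dense" cell
        self.counts = defaultdict(int)  # grid cell -> number of points seen so far

    def insert(self, point):
        """Online step: map one incoming point to its grid cell and count it."""
        cell = tuple(floor(x / self.cell_size) for x in point)
        self.counts[cell] += 1
        return cell

    def clusters(self):
        """Offline step: group adjacent dense cells with a breadth-first search."""
        dense = {c for c, n in self.counts.items() if n >= self.min_count}
        seen, result = set(), []
        for start in dense:
            if start in seen:
                continue
            seen.add(start)
            queue, group = deque([start]), set()
            while queue:
                cell = queue.popleft()
                group.add(cell)
                # Neighbors: cells whose index differs by at most 1 in every dimension.
                for offset in product((-1, 0, 1), repeat=len(cell)):
                    nb = tuple(c + o for c, o in zip(cell, offset))
                    if nb in dense and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
            result.append(group)
        return result


# Usage: feed points one at a time, then cluster the cells on demand.
sketch = GridStreamSketch(cell_size=1.0, min_count=2)
stream = [(0.1, 0.2), (0.4, 0.9), (1.2, 0.3), (1.7, 0.8),
          (5.5, 5.5), (5.1, 5.9), (9.0, 0.5)]
for p in stream:
    sketch.insert(p)
groups = sketch.clusters()
```

With a single global threshold such as `min_count`, a sparse but genuine cluster can be discarded as noise while a dense one survives. That limitation is the kind of problem with clusters of differing densities that the synopsis says the proposed algorithm targets.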

The information provided in the "Synopsis" section may refer to another edition of this title.

About the author

Amr Magdy completed his M.Sc. degree in Computer and System Engineering at Alexandria University, Egypt, under the supervision of Prof. Dr. Nagwa M. El-Makky and Assistant Prof. Noha A. Yousri. Their research interests include Data Mining, Data Stream Management Systems, Machine Intelligence, Recommendation Systems and related areas.

The information provided in the "About this book" section may refer to another edition of this title.