Time Based Cluster Analysis for Automatic Blog Generation

Published on April 25, 2008

Author: lukostaz

Source: slideshare.net

Description

Presented at the Social Web Search and Mining Workshop, WWW2008 in Beijing

Time Based Context Cluster Analysis for Automatic Blog Generation

Luca Costabello and Laurent-Walter Goix

Telecom Italia, Italy

Context as Blog Content

User context is gaining importance

Location info

Nearby buddies

The surrounding environment in general

We mine context data to detect daily user actions

User actions are converted into natural text

Blog posts describing the user's days enable the detection of a community of users with similar behavioral patterns.

Context-Based Blog Generation

1) Raw data gathering: daily actions

2) Offline cluster analysis

3) Blog post generation

System Architecture

Cluster Analysis: Detecting User Actions

Timestamp            Cell ID                Virtual Place
2007-10-03 11:02:33  222-1-61101-72162201   office,tilab
2007-10-03 10:59:09  222-1-61101-72162201   office,tilab
2007-10-03 10:55:46  222-1-61101-72162201   office,tilab
2007-10-03 10:52:41  222-1-61101-64530928   n/a,n/a
2007-10-03 10:48:59  222-1-61101-72162201   office,tilab
2007-10-03 10:45:34  222-1-61101-72162201   office,tilab
2007-10-03 10:42:11  222-1-61101-64530928   n/a,n/a
2007-10-03 10:38:47  222-1-61101-72162201   office,tilab
2007-10-03 10:37:47  222-1-61101-72162201   office,tilab
2007-10-03 09:27:01  222-1-61101-72157899   office,tilab
2007-10-03 08:58:11  222-1-61104-72386176   n/a,n/a
2007-10-03 08:56:28  222-1-24650-121        n/a,n/a
2007-10-03 08:56:05  222-1-24650-122        n/a,n/a
2007-10-03 08:54:20  222-1-54650-923        n/a,n/a
2007-10-03 08:51:31  222-1-61104-72395762   n/a,n/a
2007-10-03 08:49:16  222-1-61104-72384437   n/a,n/a
2007-10-03 08:48:47  222-1-61104-72395762   n/a,n/a
2007-10-03 08:48:18  222-1-61104-72384437   n/a,n/a
2007-10-03 08:47:50  222-1-61104-72395762   n/a,n/a
2007-10-03 08:47:21  222-1-61104-72395762   n/a,n/a
2007-10-03 08:46:51  222-1-61104-72384437   n/a,n/a
2007-10-03 08:46:20  222-1-61104-72376116   n/a,n/a
2007-10-03 08:45:15  222-1-61104-72395763   n/a,n/a
2007-10-03 08:44:02  222-1-61104-72400263   n/a,n/a
2007-10-03 08:42:33  222-1-61104-72395770   n/a,n/a
2007-10-03 08:42:02  222-1-61104-72400262   n/a,n/a
2007-10-03 08:40:08  222-1-24650-1281       residence,home
2007-10-03 08:36:26  222-1-24650-1281       residence,home
2007-10-03 08:33:02  222-1-24650-1281       residence,home

Cluster 1 (Static)
Start: 08:58
End: 11:02
CGI: 222-1-61101-162201
VP CGI: Office, TILab
VP Bth: Not available

Cluster 2 (Movement)
Start: 08:42
End: 08:56
CGI From: 222-1-24550-1281
CGI To: 222-1-24650-121
VP CGI From: Residence,home
VP CGI To: Office, TILab
VP Bth: Not available
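Each context record in the slide carries a timestamp, a cell identifier, and a "type,name" virtual place. A minimal parsing sketch follows. It assumes the four dash-separated cell fields can be read as MCC, MNC, LAC, and cell identity, and that "n/a" means no user label; the `ContextRecord` type and `parse_record` function are hypothetical names, not part of the presented system.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class ContextRecord(NamedTuple):
    timestamp: datetime
    mcc: str                    # assumed: mobile country code
    mnc: str                    # assumed: mobile network code
    lac: str                    # assumed: location area code
    cell: str                   # assumed: cell identity
    place_type: Optional[str]   # e.g. "office", or None when "n/a"
    place_name: Optional[str]   # e.g. "tilab", or None when "n/a"


def parse_record(line: str) -> ContextRecord:
    """Parse one 'date time cell-id type,name' line of the context history."""
    date, time, cell_id, place = line.split()
    mcc, mnc, lac, cell = cell_id.split("-")
    place_type, place_name = place.split(",")
    return ContextRecord(
        timestamp=datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S"),
        mcc=mcc, mnc=mnc, lac=lac, cell=cell,
        place_type=None if place_type == "n/a" else place_type,
        place_name=None if place_name == "n/a" else place_name,
    )


record = parse_record("2007-10-03 08:40:08 222-1-24650-1281 residence,home")
print(record.timestamp, record.lac, record.cell, record.place_name)
```

Keeping the identifier fields as strings avoids losing information if a field ever carries leading zeros.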

Clustering Algorithms

Dimensions

Location: GSM/UMTS Cell IDs and user-defined Cell ID labels

Time: the chronological order of actions must be respected

Categorical attributes: Euclidean distance is not available

Time must be evaluated according to "temporal distance"

Ad-hoc algorithms had to be designed
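Because cell IDs and labels are categorical, two records can only be compared for equality, while time can be compared by the gap between timestamps. The sketch below illustrates that idea only. The function names, the tuple layout `(timestamp, cell_id, label)`, and the 600-second default gap are assumptions made for illustration; the slides do not give the actual distance definition.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def temporal_distance(ts_a: str, ts_b: str) -> float:
    """Absolute gap in seconds between two timestamps."""
    delta = datetime.strptime(ts_a, FMT) - datetime.strptime(ts_b, FMT)
    return abs(delta.total_seconds())


def same_place(rec_a, rec_b) -> bool:
    """Categorical comparison: equal labels if both exist, else equal cell IDs."""
    _, cell_a, label_a = rec_a
    _, cell_b, label_b = rec_b
    if label_a and label_b:
        return label_a == label_b
    return cell_a == cell_b


def can_join(rec_a, rec_b, max_gap_s: float = 600) -> bool:
    """Two records may share a cluster if they match in place and are close in time."""
    return same_place(rec_a, rec_b) and temporal_distance(rec_a[0], rec_b[0]) <= max_gap_s


a = ("2007-10-03 10:59:09", "222-1-61101-72162201", "office,tilab")
b = ("2007-10-03 11:02:33", "222-1-61101-72162201", "office,tilab")
print(temporal_distance(a[0], b[0]), can_join(a, b))
```

Two different cells with the same user label count as the same place here, which is one way the user-defined labels can compensate for handovers between neighbouring cells.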

Cell-Based Location Data Issues

Context updates occur with variable frequency

Detecting static situations VS detecting movement

Base station concentration affects context data patterns

Frequent cell handovers during static actions

Compare&Merge Algorithm

Figure: the context history sample (the 2007-10-03 records from 08:48:18 to 11:02:33) with the stages Context History, Preliminary Context Scan, Long Temporary Cluster and Short Temporary Clusters, Temporary Clusters Merge, and the resulting Static Cluster, Movement Cluster, Static Cluster.
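The slide names the stages of Compare&Merge but not their rules, so the following is only one possible reading, not the authors' implementation. It assumes that the preliminary scan groups consecutive records at the same place into temporary clusters, that a short cluster sandwiched between two clusters of the same place is handover noise, and that runs of remaining short clusters form a movement. The 300-second threshold and all function names are hypothetical.

```python
from datetime import datetime
from itertools import groupby

FMT = "%Y-%m-%d %H:%M:%S"


def place_key(record):
    """Prefer the user-defined label; fall back to the raw cell ID."""
    _, cell_id, label = record
    return label or cell_id


def duration_s(cluster):
    return (cluster["end"] - cluster["start"]).total_seconds()


def preliminary_scan(records):
    """Group consecutive records at the same place into temporary clusters."""
    ordered = sorted(records, key=lambda r: r[0])  # chronological order
    clusters = []
    for place, group in groupby(ordered, key=place_key):
        group = list(group)
        clusters.append({"place": place,
                         "start": datetime.strptime(group[0][0], FMT),
                         "end": datetime.strptime(group[-1][0], FMT)})
    return clusters


def compare_and_merge(records, min_static_s=300):
    temp = preliminary_scan(records)
    # Compare: absorb a short cluster lying between two clusters of the same place.
    merged, i = [], 0
    while i < len(temp):
        cur = dict(temp[i])
        while (i + 2 < len(temp)
               and temp[i + 2]["place"] == cur["place"]
               and duration_s(temp[i + 1]) < min_static_s):
            cur["end"] = temp[i + 2]["end"]
            i += 2
        merged.append(cur)
        i += 1
    # Merge: long clusters are static; runs of short clusters become one movement.
    result = []
    for c in merged:
        kind = "static" if duration_s(c) >= min_static_s else "movement"
        if kind == "movement" and result and result[-1]["kind"] == "movement":
            result[-1]["end"] = c["end"]
            result[-1]["to"] = c["place"]
        else:
            result.append({"kind": kind, "start": c["start"], "end": c["end"],
                           "from": c["place"], "to": c["place"]})
    return result


# A reduced sample of the context history shown in the slides (None = "n/a").
SAMPLE = [
    ("2007-10-03 08:33:02", "222-1-24650-1281", "residence,home"),
    ("2007-10-03 08:40:08", "222-1-24650-1281", "residence,home"),
    ("2007-10-03 08:42:02", "222-1-61104-72400262", None),
    ("2007-10-03 08:44:02", "222-1-61104-72400263", None),
    ("2007-10-03 08:56:28", "222-1-24650-121", None),
    ("2007-10-03 10:37:47", "222-1-61101-72162201", "office,tilab"),
    ("2007-10-03 10:42:11", "222-1-61101-64530928", None),
    ("2007-10-03 10:45:34", "222-1-61101-72162201", "office,tilab"),
    ("2007-10-03 11:02:33", "222-1-61101-72162201", "office,tilab"),
]

for cluster in compare_and_merge(SAMPLE):
    print(cluster["kind"], cluster["start"].time(), cluster["end"].time())
```

On this reduced sample the unlabeled record at 10:42:11 is absorbed into the surrounding office cluster, and the three unlabeled cells between home and office collapse into a single movement.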

MultiLevel Sliding Window Algorithm

For each window iteration:

Check if any user-defined label is available.

Detect user movement

Detect the most frequent position

Merge window data with previous window iteration (if detected position is the same)
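The four per-window steps can be sketched as a single-level loop over fixed-width windows. This is a simplified illustration under stated assumptions, not the presented algorithm: the 30-minute width, the rule that three or more distinct unlabeled cells in a window mean movement, and every name in the code are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"


def sliding_window_clusters(records, window_min=30, min_distinct_cells=3):
    """records: chronologically sorted (timestamp, cell_id, label_or_None) tuples."""
    if not records:
        return []
    parsed = [(datetime.strptime(ts, FMT), cell, label) for ts, cell, label in records]
    width = timedelta(minutes=window_min)
    clusters = []
    win_start, last = parsed[0][0], parsed[-1][0]
    while win_start <= last:
        window = [r for r in parsed if win_start <= r[0] < win_start + width]
        if window:
            # Step 1: use user-defined labels when the window contains any.
            labels = [label for _, _, label in window if label]
            positions = labels or [cell for _, cell, _ in window]
            # Step 2: assumed movement rule, many distinct unlabeled cells.
            moving = not labels and len(set(positions)) >= min_distinct_cells
            # Step 3: most frequent position in the window.
            position = "movement" if moving else Counter(positions).most_common(1)[0][0]
            # Step 4: merge with the previous cluster if the position is unchanged.
            if clusters and clusters[-1]["position"] == position:
                clusters[-1]["end"] = window[-1][0]
            else:
                clusters.append({"position": position,
                                 "start": window[0][0], "end": window[-1][0]})
        win_start += width
    return clusters


SAMPLE = [
    ("2007-10-03 08:33:02", "222-1-24650-1281", "residence,home"),
    ("2007-10-03 08:40:08", "222-1-24650-1281", "residence,home"),
    ("2007-10-03 08:42:02", "222-1-61104-72400262", None),
    ("2007-10-03 08:44:02", "222-1-61104-72400263", None),
    ("2007-10-03 08:56:28", "222-1-24650-121", None),
    ("2007-10-03 10:37:47", "222-1-61101-72162201", "office,tilab"),
    ("2007-10-03 10:42:11", "222-1-61101-64530928", None),
    ("2007-10-03 10:45:34", "222-1-61101-72162201", "office,tilab"),
    ("2007-10-03 11:02:33", "222-1-61101-72162201", "office,tilab"),
]

for cluster in sliding_window_clusters(SAMPLE):
    print(cluster["position"], cluster["start"].time(), cluster["end"].time())
```

With a 30-minute window the commute records fall into the same window as the home records and are reported as part of the home cluster, which shows how the window width bounds the timing error of this approach.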

Algorithms Comparison

Precision
MultiLevel Sliding Window: lower precision than C&M (a 30-minute window leads to an error of less than 30 minutes)
Compare&Merge: very high in optimal situations (less than 2-5 minutes)

Optimal usage
MultiLevel Sliding Window: non-labeled areas; frequent cell handovers
Compare&Merge: good user labeling; cells with low handover issues

Critical situations
MultiLevel Sliding Window: none
Compare&Merge: frequent cell handovers

Cluster Analysis Accuracy VS User Perception

From Clusters To Blog Post

Figure components: Context Clusters, User Preferences, Action Detector, NLG (Natural Text Generation)
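The step from clusters to text can be illustrated with a small template-based sketch. The slides do not describe how the Action Detector or the NLG component work, so the cluster dictionaries, the sentence templates, and the `hidden_places` preference below are all assumptions made for illustration.

```python
def detect_action(cluster):
    """Map a context cluster to a coarse action (hypothetical rule set)."""
    if cluster["kind"] == "movement":
        return {"type": "move", "origin": cluster["from"], "destination": cluster["to"]}
    return {"type": "stay", "place": cluster["place"]}


def to_sentence(action, start, end):
    """Fill a fixed sentence template for one action."""
    if action["type"] == "move":
        return (f"From {start} to {end} I travelled from "
                f"{action['origin']} to {action['destination']}.")
    return f"From {start} to {end} I was at {action['place']}."


def generate_post(clusters, preferences=None):
    """Turn a day's clusters into text, skipping places the user chose to hide."""
    hidden = set((preferences or {}).get("hidden_places", ()))
    sentences = []
    for cluster in clusters:
        action = detect_action(cluster)
        if action.get("place") in hidden:
            continue
        sentences.append(to_sentence(action, cluster["start"], cluster["end"]))
    return " ".join(sentences)


# Times follow the two clusters shown in the slides; place wording is illustrative.
CLUSTERS = [
    {"kind": "movement", "start": "08:42", "end": "08:56",
     "from": "home", "to": "the TILab office"},
    {"kind": "static", "start": "08:58", "end": "11:02",
     "place": "the TILab office"},
]

print(generate_post(CLUSTERS))
```

Keeping action detection separate from sentence rendering means the wording can change without touching the clustering output.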

Results

Mining context history leads to user pattern discovery

Daily actions sharing

Detection of user communities, according to daily behaviors

Clustering accuracy VS personal memories perception

Movement detection

Location-labeling importance

Any Questions? Thank You!

Email: luca.costabello@guest.telecomitalia.it
