Unsupervised Clickstream Clustering For User Behavior Analysis

Gang Wang
Xinyi Zhang
Shiliang Tang
Haitao Zheng
Ben Y. Zhao

Proc. of the 34th CHI Conference on Human Factors in Computing Systems (CHI 2016)


Paper Abstract

Online services are increasingly dependent on user participation. Whether for online social networks or crowdsourcing services, understanding user behavior is important yet challenging. In this paper, we build an unsupervised system to capture dominating user behaviors from clickstream data (traces of users' click events), and visualize the detected behaviors in an intuitive manner. Our system identifies "clusters" of similar users by partitioning a similarity graph (nodes are users; edges are weighted by clickstream similarity). The partitioning process leverages iterative feature pruning to capture the natural hierarchy within user clusters and produce intuitive features for visualizing and understanding captured user behaviors. For evaluation, we present case studies on two large-scale clickstream traces (142 million events) from real social networks. Our system effectively identifies previously unknown behaviors, such as dormant users and hostile chatters. In addition, our user study shows that people can easily interpret the identified behaviors using our visualization tool.
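
To make the similarity-graph idea concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes, purely for illustration, that each clickstream is summarized by counts of consecutive event pairs, that edge weights are cosine similarities between those counts, and that clusters are the connected components left after dropping edges below a hypothetical threshold. The abstract does not specify the similarity metric or the partitioning algorithm, and the iterative feature pruning step is omitted. All function names, user names, and event names are made up.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def kgram_counts(events, k=2):
    """Count runs of k consecutive click events (an assumed feature choice)."""
    return Counter(tuple(events[i:i + k]) for i in range(len(events) - k + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity_graph(clickstreams, k=2):
    """Nodes are users; each user pair gets an edge weighted by similarity."""
    feats = {user: kgram_counts(events, k) for user, events in clickstreams.items()}
    return {(u, v): cosine(feats[u], feats[v])
            for u, v in combinations(sorted(feats), 2)}


def threshold_clusters(users, edges, threshold=0.5):
    """Stand-in for graph partitioning: connected components over strong edges."""
    parent = {u: u for u in users}

    def find(u):
        # Follow parent links to the component root, compressing the path.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for (u, v), weight in edges.items():
        if weight >= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for u in users:
        groups.setdefault(find(u), set()).add(u)
    return sorted(groups.values(), key=sorted)


# Toy traces with made-up users and event names.
clickstreams = {
    "alice": ["login", "browse", "browse", "logout"],
    "bob": ["login", "browse", "logout"],
    "carol": ["login", "chat", "chat", "chat"],
    "dave": ["chat", "chat", "chat", "logout"],
}
edges = similarity_graph(clickstreams)
clusters = threshold_clusters(clickstreams, edges)
print(clusters)
```

On this toy input the two browsing-heavy users end up in one cluster and the two chat-heavy users in another. A real system would need a partitioning method that scales to millions of events and, as the abstract describes, a pruning step that exposes the hierarchy within each cluster.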