Link based limited session reconstruction method for mining web usage data

Thesis Type: Postgraduate

Institution Of The Thesis: Orta Doğu Teknik Üniversitesi, Faculty of Engineering, Department of Computer Engineering, Turkey

Approval Date: 2013




The Web is growing rapidly and now serves a huge amount of information. Many users try to access this information every day, so it needs to be organized efficiently. Traditional web usage mining methods exist in the literature, but detecting user sessions and understanding users' common browsing behavior from web logs remain difficult problems. In this work, we propose a link-based model for the session construction problem in order to find users' common behaviors on the web. The model aims to find user sessions and frequent patterns by using the web site's topology together with its session logs. To detect sessions more accurately, we present a new algorithm, the Limited Session Reconstruction Algorithm. For the pattern discovery phase, an efficient version of the Apriori-All technique is used. A web agent simulator, based on previous work on the link-based approach, is used to produce web usage logs and site topology. A web tracker tool is designed to capture visitors' sessions. Experimental results on the simulated data show that the algorithm is more accurate than the classical time-oriented and navigation-oriented approaches and slightly more accurate than other link-based approaches. On real data, however, although it shows some improvement, its accuracy is lower than that of some other heuristics because the web log data and the web tracker tool were incompatible. We expect the new approach to give better results on web sites with long user sessions, such as e-commerce and shopping sites.
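To make the contrast between time-based and link-based session detection concrete, the following is a minimal Python sketch. It is not the Limited Session Reconstruction Algorithm, whose details are not given in this abstract. The function names, the 600-second gap threshold, the toy topology, and the rule of checking only the last page of each open session are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# One logged request from a single user: (page, timestamp in seconds).
Request = Tuple[str, float]


def time_oriented_sessions(requests: List[Request],
                           max_gap: float = 600.0) -> List[List[str]]:
    """Split requests into sessions whenever the gap between two
    consecutive requests exceeds max_gap (an assumed threshold)."""
    sessions: List[List[Request]] = []
    for page, ts in requests:
        if sessions and ts - sessions[-1][-1][1] <= max_gap:
            sessions[-1].append((page, ts))
        else:
            sessions.append([(page, ts)])
    return [[p for p, _ in s] for s in sessions]


def link_based_sessions(requests: List[Request],
                        links: Dict[str, Set[str]],
                        max_gap: float = 600.0) -> List[List[str]]:
    """Append a request to a session only if the session's last page
    links to the requested page in the site topology and the time gap
    is within max_gap; otherwise start a new session.
    Simplification (assumption): only the last page of each session is
    checked, most recently created session first."""
    sessions: List[List[Request]] = []
    for page, ts in requests:
        for s in reversed(sessions):
            last_page, last_ts = s[-1]
            if ts - last_ts <= max_gap and page in links.get(last_page, set()):
                s.append((page, ts))
                break
        else:
            # No existing session can be extended through a hyperlink.
            sessions.append([(page, ts)])
    return [[p for p, _ in s] for s in sessions]


# Hypothetical site topology: page -> pages it links to.
topology = {"A": {"B", "C"}, "B": {"D"}, "C": {"E"}}
# Hypothetical log for one user, sorted by timestamp.
log = [("A", 0.0), ("B", 30.0), ("C", 60.0), ("E", 90.0)]

time_result = time_oriented_sessions(log)
link_result = link_based_sessions(log, topology)
```

On this toy log the time-based rule keeps all four requests in one session, while the link-based rule separates the request for C because B does not link to C in the assumed topology. A fuller link-based method would also have to account for backward navigation through cached pages, which this sketch ignores.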