Advances in Environmental Biology
Advances in Environmental Biology, 8(17) September 2014, Pages: 1245-1250
AENSI Journals
ISSN-1995-0756
EISSN-1998-1066
Journal home page: http://www.aensiweb.com/AEB/
A Recommendation System Presenting to Improve Performance of Search Engines
Based on Web Mining and Formal Models
1Maryam Jafari, 2Ali Harounabadi and 3Seyyed Javad Mirabedini
1 Department of computer engineering, Bushehr branch, Islamic Azad University, Bushehr, IRAN
2 Department of computer engineering, Islamic Azad University, Central Tehran branch, IRAN
3 Department of computer engineering, Islamic Azad University, Central Tehran branch, IRAN
ARTICLE INFO
Article history:
Received 25 September 2014
Received in revised form 26 October 2014
Accepted 25 November 2014
Available online 29 December 2014
Key words:
Color Petri Net, Recommendation System, Web Mining, Web Personalization
ABSTRACT
Web personalization plays an important role in improving the interaction between users and the web,
with the aim of providing results that match users' interests. Search engines are now the main tools
for finding information on the web, yet despite their extensive use they cannot return results that
suit most users. Personalizing search engines is therefore considered essential for helping users
find information that matches their interests. In this paper, in addition to the frequency and
observation duration of web pages, the observation date of web pages stored in the log files is used
to analyze web users' behavior and to build user profiles. A personalized search engine is also
presented and modeled with color Petri nets. After a period of time, the proposed method identifies
the degree of a user's interest in the various categories of pages based on the user's search record
and recommends pages related to those interests. The results obtained with the color Petri net show
an improvement in the accuracy criterion compared with previous methods.
© 2014 AENSI Publisher All rights reserved.
To Cite This Article: Maryam Jafari, Ali Harounabadi and Seyyed Javad Mirabedini., A recommendation system presenting to improve
performance of search engines based on web mining and formal models. Adv. Environ. Biol., 8(17), 1245-1250, 2014
INTRODUCTION
Recommendation systems use the ideas and opinions of groups of users to help individual members of
the group identify and select content more effectively and efficiently. The aim of a web
personalization system is to provide the information users want or need without their explicit
request. In fact, a recommendation system is a powerful mechanism for information filtering.
Providing intelligent, personalized online systems such as web-based recommendation systems
generally requires modeling users' behavior. Web mining has recently attracted more attention
because it provides what web personalization requires.
This study covers web mining, recommendation systems, user sessions and Petri nets. Each is reviewed
in this paper.
Web mining:
Web mining is the use of data mining methods to automatically discover and extract information from
web documents and services [11].
Web recommendation systems:
A web recommendation system predicts users' requirements and presents them as recommendations to
guide users. These systems are expected to have a bright future in e-commerce and search engines.
Recommended items can be products such as books, movies and music, online resources such as web
pages, or online activities such as path prediction. In general, a web recommendation system
consists of two modules: offline and online. The offline module preprocesses data to produce user
models, while the online module uses these models at run time to identify the user's goals and to
predict and update a list of recommendations. In its simplest and most common form, the
recommendation problem can be viewed as estimating a user's degree of interest in items that he has
not visited yet [17].
Session:
A user session is the set of pages visited by that user during a particular visit to a website [5].
A set of pages visited by a particular user is considered one session when those pages were
requested within a time interval less than or equal to a specified time [12].
Corresponding Author: Maryam Jafari, Department of computer engineering, Bushehr branch, Islamic Azad University,
Bushehr, IRAN
E-mail: [email protected]
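The session definition above can be illustrated with a small sketch. It is not taken from the paper: it assumes that the "specified time" is a limit on the gap between two consecutive requests by the same user, and the function name `split_sessions` and the 1800-second default are hypothetical choices made only for illustration.

```python
def split_sessions(requests, timeout_seconds=1800):
    """Group one user's page requests into sessions.

    requests: list of (timestamp_in_seconds, page) pairs for a single user.
    Assumption (not from the paper): a new session starts when the gap
    between two consecutive requests exceeds timeout_seconds.
    """
    sessions = []
    current = []
    last_time = None
    for timestamp, page in sorted(requests):
        # A gap larger than the timeout closes the current session.
        if last_time is not None and timestamp - last_time > timeout_seconds:
            sessions.append(current)
            current = []
        current.append(page)
        last_time = timestamp
    if current:
        sessions.append(current)
    return sessions


log = [(0, "home"), (600, "news"), (5000, "sports")]
sessions = split_sessions(log)
```

With this log, the first two requests fall into one session and the third starts a new one, because the gap of 4400 seconds exceeds the assumed timeout.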
Petri Nets:
A Petri net is a tool for studying systems. Petri net theory allows a system to be modeled by a Petri
net, which is in fact a mathematical representation of that system. Analyzing the Petri net yields
useful information about the structure and dynamic behavior of the modeled system, and this
information can be used to evaluate the system and to suggest improvements or changes [1].
Color Petri net:
These nets were extended with the purpose of creating a modeling language. Such nets are called
“color Petri nets” because they allow the use of tokens that carry different data values and can be
distinguished from each other. A color Petri net model is represented graphically as a directed
bipartite graph. In the model, places are drawn as circles and transitions as rectangles. Tokens are
shown inside the places as black tokens.
Related works:
In recent years there has been a growing focus on building recommendation systems and, more
generally, personalization methods that do not need explicit information from users [6]. Systems of
this type obtain the user's behavior model and interests implicitly. The degree of the user's
interest in different items is not stated explicitly; instead, the system builds a model of users'
interests by observing their behavior. In fact, the input of these systems consists only of users'
transactions or, in simple terms, the items observed by each user.
In his thesis, Yektaparast builds user profiles using a page weight criterion and places users with
the same interests in the same clusters by means of clustering techniques. Then, according to the
degree of users' interest in each cluster, advertisements are proposed to them. Petri nets are used
to model this method: each user is a member of one or more clusters with a degree of membership,
which is represented using a fuzzy colored Petri net [2].
Chen presents a method for modeling web structure using Petri nets. In this method, places represent
the web pages of a site and transitions represent the links between pages. The paper emphasizes how
a parsing algorithm is used to retrieve the content of website pages, analyze that content and find
the intersection matrix that represents the structure of the web. It also shows how the main page
and the evolution path can be identified by means of the availability feature [3].
Data mining methods for the web are divided into three categories according to the kind of data they
examine: web content mining, web usage mining and web structure mining [16]. Recently, web usage
mining techniques have been used extensively to discover users' navigation patterns. Zhou used
sequential pattern mining [18], Lin association rule mining [8], and Phatak and Mobasher clustering
of the different available patterns [13], [10] to discover patterns that can be used to personalize
web recommendation systems.
Most activity-based recommender systems are based on web usage mining. These systems work on web
servers and have access to the information of the users who use the website [4].
Moatamedimehr proposed a hybrid algorithm based on distributed learning automata, graph
partitioning and the PageRank ranking algorithm. The algorithm combines users' usage data with the
structural data of web pages [9].
MATERIALS AND METHODS
Like other recommendation systems, the proposed system includes offline and online phases. In the
offline phase, we first store the attributes of the website's pages, which are categorized by
content; these attributes are the observation duration, the observation date and the frequency of
each page per user. The pages of each category are then ranked on the basis of these attributes. In
the online phase, the user enters the site, selects a category of pages and views the desired pages.
The system then follows the user's session, determines his or her interest in each category from the
average duration and frequency of the pages in that category, and recommends a list of pages to the
user. Figure 1 shows the general schematic of the proposed recommendation system.
Previous research used criteria such as the observation duration of each page and its frequency to
compute the user's degree of interest in pages. The proposed method also uses the observation date
of pages in the computation.
The reasons for selecting these criteria are as follows:
Page frequency:
It indicates how many times the web page is visited. The frequency is normalized by the number of
all pages visited in a session. fp(P) is the page frequency [7], given in equation (1):
(1)
Fig. 1: General schematic of the proposed recommendation system.
Observation duration of page by user:
It is defined as the time spent on a page. A quick jump away from a page may simply be due to the
small length of the page. Therefore, the observation duration of a page should be normalized by the
length of the page, that is, its number of bytes [14]. dp(P) is the observation duration of the
page, shown in equation (2):
(2)
Since the two criteria are equally important, the system uses the harmonic mean of the frequency and
the duration to represent the user's degree of interest:
$w(P) = \frac{2 \times f_p(P) \times d_p(P)}{f_p(P) + d_p(P)}$
(3)
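As a rough illustration of how these three quantities fit together, the sketch below follows only the verbal definitions in the text. Equations (1) and (2) are not legible in this copy, so the two ratios are assumptions rather than the authors' exact formulas, and all function names are hypothetical. The sketch also assumes that frequency and duration have been brought to comparable scales before they are combined.

```python
def page_frequency(visit_counts, page):
    """Visits to `page` divided by all page visits in the session (assumed form of eq. 1)."""
    total = sum(visit_counts.values())
    return visit_counts[page] / total if total else 0.0


def page_duration(seconds_on_page, page_size_bytes):
    """Viewing time divided by page length in bytes (assumed form of eq. 2)."""
    return seconds_on_page / page_size_bytes if page_size_bytes else 0.0


def interest(frequency, duration):
    """Harmonic mean of frequency and duration, as in equation (3)."""
    if frequency + duration == 0:
        return 0.0
    return 2 * frequency * duration / (frequency + duration)


visits = {"news_1": 3, "news_2": 1}
f = page_frequency(visits, "news_1")   # 3 of 4 visits
w = interest(f, 0.25)                  # duration value assumed already normalized
```

The harmonic mean stays low unless both inputs are high, so a page that is opened often but abandoned quickly does not receive a high interest value.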
Date:
The date matters because users are more likely to request new pages than old ones, since users seek
new information. To reflect the importance of new pages, their values are multiplied by larger
numbers. The resulting degree of interest is also called the weight of the page.
[Equation (4) is garbled in the source; the surviving terms are W, w_i and (i+1).]
(4)
Mapping the elements of proposed system to Petri components:
When the data are of different types, we model them by placing colored tokens in the color Petri
net. A Petri net has two main components: a set of places (P) and a set of transitions (T). The
relation between places and transitions is determined by two functions, input and output, which
connect transitions to places.
• Places: in the proposed method, pages are represented as places in the color Petri net simulation tool.
• Transitions: hypertext links create the interaction between pages and are represented as transitions.
• Arcs: arcs show the direction of the interaction between pages.
• Tokens: each token carries the information of a user's session.
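The mapping above can be sketched in a few lines of Python. This is only an illustrative toy, not the authors' CPN Tools model: the class names `Token` and `PageNet`, the page names and the link name are all hypothetical, and each transition is assumed to have exactly one input place and one output place.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """A colored token: it carries session data instead of being a plain marker."""
    user: str
    visited: list = field(default_factory=list)


@dataclass
class PageNet:
    places: dict = field(default_factory=dict)       # page name -> tokens currently there
    transitions: dict = field(default_factory=dict)  # link name -> (source page, target page)

    def add_page(self, page):
        self.places[page] = []

    def add_link(self, name, source, target):
        self.transitions[name] = (source, target)

    def enabled(self, name):
        # A transition is enabled when its input place holds at least one token.
        source, _ = self.transitions[name]
        return len(self.places[source]) > 0

    def fire(self, name):
        # Firing moves one token along the arc and records the visited page.
        if not self.enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        source, target = self.transitions[name]
        token = self.places[source].pop(0)
        token.visited.append(target)
        self.places[target].append(token)
        return token


net = PageNet()
for page in ("home", "news"):
    net.add_page(page)
net.add_link("home_to_news", "home", "news")
net.places["home"].append(Token(user="u1", visited=["home"]))
moved = net.fire("home_to_news")
```

After the transition fires, the token sits in the "news" place and its `visited` list records the path taken, which mirrors the idea that each token carries the user's session information.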
Proposed method procedure:
After entering the site, the user can choose one of the page categories: political, social,
entertainment or commercial. Each category of pages is a subnet consisting of places (pages). These
places are connected by transitions (links between pages).
After the user enters one of the categories, some pages are viewed and the attributes of those pages
are taken into account in the user's interest. The observation duration is measured by a timer. The
frequency and the last observation date of the page are also sent as tokens to the page/attribute
matrix, and the information in this matrix is updated. Figure 2 shows the page/attribute matrix.
          Observation duration of page    Page's frequency    The last observation date of page
Page 1
Page 2
.
.
Page N
Fig. 2: Page/attribute matrix.
Because the number of web pages keeps growing, keeping all attributes of all pages observed by users
would increase the time complexity of the proposed method. Therefore, only the attributes of
recently observed pages are kept in the page/attribute matrix.
The weights of all pages are computed according to formula (4) and arranged in descending order.
Then the sum of the observation durations and the sum of the observation frequencies of each
category, accumulated by counters, are used to calculate the user's degree of interest in the
current category with formula (3).
The number of pages proposed to the user at each step for each category is determined as follows.
Assume that the user's degrees of interest in the 4 page categories, obtained with formula (3), are
A, B, C and D, respectively, and that at most M new pages of each category are to be proposed to the
user. Each of these numbers is multiplied by M and the result is rounded to an integer. The
resulting number is the maximum number of pages that should be chosen from the page/attribute matrix
of each category and shown to the user. Each category of pages has a plan subnet that sorts pages by
weight, computes the number of proposed pages and presents the pages of interest to the user.
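The selection step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' plan subnet: interest values are assumed to lie between 0 and 1, "round off" is interpreted as rounding half up, and the function names and sample numbers are hypothetical.

```python
import math


def pages_to_recommend(interest_by_category, max_pages):
    """Multiply each interest value by M and round to get a per-category page count."""
    counts = {}
    for category, rate in interest_by_category.items():
        n = math.floor(rate * max_pages + 0.5)        # round half up (assumed reading)
        counts[category] = max(0, min(max_pages, n))  # never exceed M
    return counts


def top_pages(page_weights, n):
    """Return the n pages with the largest weights, in descending order of weight."""
    ranked = sorted(page_weights, key=page_weights.get, reverse=True)
    return ranked[:n]


interest = {"political": 0.62, "social": 0.25, "entertainment": 0.08, "commercial": 0.87}
counts = pages_to_recommend(interest, max_pages=5)
picked = top_pages({"p1": 0.2, "p2": 0.9, "p3": 0.5}, 2)
```

`math.floor(x + 0.5)` is used instead of the built-in `round` because `round` rounds ties to the nearest even integer, which may not match the intended "round off" behavior.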
Results:
CPN Tools software was used to model the proposed method with a color Petri net. Here, color denotes
the various page categories.
Figures 3 and 4 relate to the social group and are two samples of the 8 modeled subnets. The
proposed method assumes that the number of web pages is fixed every day. The system was run over a
100-day period for various users.
Fig. 3: Social subnet modeled with CPN Tools.
Fig. 4: Plan Social subnet modeled with CPN Tools.
To evaluate a recommendation system, the retrieval precision criterion is used; it compares the
precision of the proposed system with the precision of static algorithms [15].
Precision=
(5)
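The body of equation (5) is not legible in this copy. As a hedged illustration only, the sketch below uses the standard information-retrieval definition of precision, the fraction of recommended pages that the user actually found relevant; whether the authors used exactly this form is an assumption, and the function name and sample data are hypothetical.

```python
def precision(recommended, relevant):
    """Fraction of recommended pages that are relevant (standard definition, assumed)."""
    unique = list(dict.fromkeys(recommended))  # drop duplicates, keep order
    if not unique:
        return 0.0
    relevant = set(relevant)
    hits = sum(1 for page in unique if page in relevant)
    return hits / len(unique)


score = precision(["a", "b", "c", "d"], {"a", "c", "x"})
```

Here two of the four recommended pages are relevant, so the score is 0.5.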
In static algorithms, the recommended items are chosen only according to the observation frequency
of previous users, and they are the same for all users.
Fig. 5: Comparing precision of proposed method with static algorithm.
Figure 5 shows that after some days the system learns the user's interests and proposes pages that
he likes.
Fig. 6: Comparing proposed method with previous algorithms.
Figure 6 compares the proposed method with previous algorithms that use only the frequency and
observation duration parameters for weighting.
Conclusions:
Given the development of science in various fields, the time consumed in applied research, and the
probability of problems or runtime errors when research is implemented, simulation is needed instead
of implementation. The aim of this project is to introduce a method that improves the performance of
web recommendation systems by using the formal model of the color Petri net. Analyses show that,
because the introduced method uses the history parameter, it has high accuracy and is appropriate
for web personalization.
REFERENCES
[1] Bahadori, M., 2013. Modeling of operator's geodesic behavior in website by color Petri nets with the aim of
identifying operator's interest to present dynamic web pages recommendation system, M.Sc. thesis,
Islamic Azad University, Dezfool branch.
[2] Yektaparast, A., 2012. A colored petri net behavior of web users using clustering to targeted advertising,
M.Sc. thesis, Islamic Azad University, Science and Research branch of Khouzestan.
[3] Chen, P., C. Sun and S. Yang, 2008. Modeling and Analysis the Web Structure Using Stochastic Timed
Petri Nets, Journal of Software, Academy Publisher, 3(8): 19-26.
[4] Cule, B. and B. Goethals, 2010. Mining Association Rules in Long Sequences, in: Advances in Knowledge
Discovery and Data Mining, 6118/2010, Springer Berlin/Heidelberg, 300-309.
[5] Ghaderyan, M., 2008. Improving of operator model in website as automatic by using of semantics with
special meaning of domain, M.Sc. thesis, Amirkabir University of Technology.
[6] Godoy, D. and A. Amandi, 2009. Interest Drifts in User Profiling: A Relevance-Based Approach and
Analysis of Scenarios, in: The Computer Journal Advance, 52(7): 771-788.
[7] Khademali, Z., 2012. Creating a profile to operator according to his interconnection in web by using of
intelligent algorithm, M.Sc. thesis, Islamic Azad University, Dezfool branch.
[8] Lin, W., S.A. Alvarez and C. Ruiz, 2000. Collaborative Recommendation Via Adaptive Association Rule
Mining, in: Second International Workshop: Web Mining for ECommerce, Boston, MA, USA.
[9] Moatamedimehr, S., M. Taran, A. BaradaranHashemi and M. Mabidi, 2010. Web recommender systems
using a Distributed Learning Automata and Partitioning Graph, Fourth Iranian Conference of Data Mining,
Sharif University of Technology.
[10] Mobasher, B., H. Dai, T. Luo and M. Nakagawa, 2001. Effective Personalization Based On Association
Rule Discovery From Web Usage Data, in: Proc. of the 3rd International Workshop on Web Information
and Data Management, ACM Press, Atlanta, GA, USA, 9-15.
[11] Mustapasa, O., D. Karahoca, A. Karahoca, A. Yucel and H. Uzunboylu, 2010. Implementation of
Semantic Web Mining on E-Learning, in: Proc. of Social and Behavioral Sciences, 2: 5820-5823.
[12] Pierrakos, D., G. Paliouras, C. Papatheodorou and C. Spyropoulos, 2003. Web Usage Mining as a Tool for
Personalization: A Survey, in: User Modeling and User-Adapted Interaction, 13: 311-372.
[13] Phatak, D.S. and R. Mulvaney, 2002. Clustering For Personalized Mobile Web Usage, in: Proc. of the
IEEE FUZZ’02, Hawaii, USA, 705-710.
[14] Rahmani, S. and M. Meybodi, 2009. Web personalization by using of developed council, third conference
about Iran data mining, Amirkabir University of Technology.
[15] Rashidi, F., 2012. Constructing a Profile for User According to his Web Interactions Using Intelligent
Algorithms, M.Sc. thesis, Islamic Azad University, Science and Research branch of Khozestan.
[16] Sugiyama, K., K. Hatano and M. Yoshikawa, 2004. Adaptive Web Search Based on User Profile
Constructed without Any Effort from Users, 13th International Conference on World Wide Web, ACM, New
York, USA, 675-684.
[17] Zhan, S., F. Gao, C. Xing and L. Zhou, 2006. Addressing Concept Drift Problem in Collaborative, in: Proc.
of the ECAI, Workshop on Recommender Systems in conjunction with the 17th European Conference on
Artificial Intelligence, Riva del Garda, Italy, 34-39.
[18] Zhou, B., S.C. Hui and K. Chang, 2004. An Intelligent Recommender System Using Sequential Web
Access Patterns, in: Proc. of the 2004 IEEE Conference on Cybernetics and Intelligent Systems, Singapore,
1-3.