JWE Abstracts 

Vol.13 No.5&6 November 1, 2014

Research Articles:

Towards Automatic Construction of Skyline Composite Services (pp361-377)
       
Shiting Wen, Qing Li, Liwen He, An Liu, Jianwen Tao, and Longjin Lv
Due to the rapid increase of web services available over the Internet, service-oriented architecture has been regarded as one of the most promising web technologies. Moreover, enterprises can employ outsourced software to build and publish their business applications as services, which other people or organizations can then access via the Web. Although a large number of web services are available, often no single web service can satisfy a concrete user request, so one has to compose multiple basic services to fulfill a complex requirement. Web service composition enables dynamic and seamless integration of business applications on the Web. Traditional composition methods select the best composite service by defining a simple weight-additive utility function. However, a service has multiple non-functional dimensions, so assigning a weight to each QoS dimension is a non-trivial issue. In this article, we propose algorithms to automatically compose skyline or top-k composite services for a given user request. Experimental results show that our approach finds all skyline composite services, or a set of top-k composite services, effectively and efficiently.
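
A minimal sketch of the skyline idea applied to composite services, assuming each candidate is described by a QoS vector where lower values are better (e.g., latency, cost); the service names and QoS figures below are hypothetical, not from the paper.

    def dominates(a, b):
        """True if QoS vector a dominates b: no worse in every dimension,
        strictly better in at least one (lower is better)."""
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    def skyline(candidates):
        """Keep only composite services not dominated by any other candidate."""
        return [
            (name, qos) for name, qos in candidates
            if not any(dominates(other, qos) for _, other in candidates if other != qos)
        ]

    # Hypothetical composite services with (latency_ms, cost) vectors.
    candidates = [
        ("s1->s4", (120, 0.9)),
        ("s2->s4", (200, 0.4)),
        ("s1->s5", (150, 1.2)),   # dominated by s1->s4 in both dimensions
        ("s3->s6", (90, 1.5)),
    ]
    print(skyline(candidates))   # s1->s5 is pruned; the other three are skyline services

Because no weight assignment is needed, the skyline contains every composition that could be optimal under some weighting of the QoS dimensions.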

Gathering Web Pages of Entities with High Precision (pp378-404)
       
Byung-Won On, Muhammad Omar, Gyu Sang Choi, and Junbeom Kwon
A search engine like Yahoo looks for entities such as specific people, places, or things on web pages in response to search queries. Depending on the granularity of the query keywords and the performance of the search engine, the retrieved result set may be very large, contain many irrelevant web pages, and be poorly ordered. Manually deciding the relevance of each web page is infeasible given the large number of retrieved pages. Another challenge is to develop a language-independent relevance classification of the results provided by a search engine. To improve the quality of a search engine, it is desirable to automatically evaluate its results and decide the relevance of each retrieved web page with respect to the user query and the intended entity the query is about. A step towards this improvement is to prune irrelevant web pages by understanding the user's needs, in order to discover knowledge of entities in a particular domain. We propose a novel method to improve the precision of a search engine that is language independent and does not rely on search engine query logs or user click-through data (both widely used in recent times). We devise novel language-independent features to build a support vector machine relevance classification model, with which we can automatically classify whether a web page retrieved by a search engine is relevant to the desired entity.
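
A minimal sketch of the SVM relevance-classification step, assuming language-independent numeric features extracted per retrieved page; the three feature names below are illustrative placeholders, not the paper's actual feature set.

    from sklearn.svm import SVC
    import numpy as np

    # Hypothetical training data: one row per web page, e.g.
    # [query-term overlap with URL, query-term overlap with title, link depth].
    X_train = np.array([[0.9, 0.8, 1], [0.7, 0.9, 2], [0.1, 0.0, 5], [0.0, 0.2, 4]])
    y_train = np.array([1, 1, 0, 0])   # 1 = relevant to the entity, 0 = irrelevant

    model = SVC(kernel="rbf")          # SVM relevance classifier
    model.fit(X_train, y_train)

    # Classify a newly retrieved page before presenting it to the user.
    new_page = np.array([[0.8, 0.7, 2]])
    print("relevant" if model.predict(new_page)[0] == 1 else "irrelevant")

Since the features are numeric rather than word-based, the same trained model can score pages regardless of the page language.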

Finding News-Topic Oriented Influential Twitter Users Based on Topic Related Hashtag Community Detection (pp405-429)
       
Feng Xiao, Tomoya Noro, and Takehiro Tokuda
Recently, more and more users collect and provide information about news topics on Twitter, one of the most popular microblogging services. Virtual communities defined by hashtags in Twitter are created for exchanging information about a news topic. Finding influential Twitter users in the communities related to a news topic would help us understand why some opinions are popular, and obtain valuable and reliable information about the topic. In this paper, we propose a new approach to detecting news-topic-related user communities defined by hashtags, based on characteristic co-occurrence word detection. We also propose RetweetRank and MentionRank to find two types of influential Twitter users in these news-topic-related communities, based on users' retweet and mention activities. Experimental results show that our characteristic co-occurrence word detection methods detect words that are highly relevant to the news topic. RetweetRank finds influential Twitter users whose tweets about the news topic are valuable and more likely to interest others, while MentionRank finds influential Twitter users who have high authority on the news topic. Our methods also outperform related methods in evaluations.
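
A minimal sketch in the spirit of RetweetRank: a PageRank-style score over a retweet graph, where a retweet transfers influence to the user being retweeted. The damping factor, iteration count, and toy graph are assumptions for illustration, not the paper's exact formulation.

    def retweet_rank(edges, d=0.85, iters=50):
        """edges: (retweeter, original_author) pairs within one hashtag community."""
        users = {u for e in edges for u in e}
        rank = {u: 1.0 / len(users) for u in users}
        out = {u: [b for a, b in edges if a == u] for u in users}
        for _ in range(iters):
            new = {u: (1 - d) / len(users) for u in users}
            for u in users:
                targets = out[u] or list(users)   # dangling users spread evenly
                share = d * rank[u] / len(targets)
                for v in targets:
                    new[v] += share
            rank = new
        return rank

    # Hypothetical community: alice is retweeted by everyone else.
    edges = [("bob", "alice"), ("carol", "alice"), ("dave", "alice"), ("alice", "bob")]
    ranks = retweet_rank(edges)
    print(max(ranks, key=ranks.get))   # alice: most influential in this toy graph

A MentionRank analogue could be obtained by replacing the retweet edges with (mentioner, mentioned_user) pairs.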

An Optimal Constraint Based Web Service Composition Using Intelligent Backtracking (pp430-449)
       
M. Suresh Kumar and P. Varalakshmi
Composition of web services involves the complex task of analyzing the various services available and deducing the most optimal solution from the list of candidate service sequences. The web services are viewed as layers interlinked with each other based on certain conditions, forming a service composition graph dynamically. Layering of the web services follows the sequential arrangement of the services as designed by the web service provider. From the numerous service sequences available, the optimal composition is computed dynamically from the start to the end of the web service composition. The optimal solution set, consisting of a number of services, is the path with the least total weight from the start to the end of the composition. Anomalies that arise during the search for the optimal solution are resolved using an Intelligent Backtracking technique, which makes the optimization more efficacious. Dependency-directed backtracking is used so that past transaction records are saved, making it easier to track the flow of web service selection. A log file is introduced to record the service transactions at each level, so that user constraints can be satisfied in the best possible way. If the user constraints are not feasible enough to complete the composition of services, then, based on the data in the log files, negotiation can be carried out with the user to reselect the anomalous web services. Negotiation is a process of dynamic mediation with the user when his or her requirement constraints cannot be satisfied by the services offered by the provider. This approach not only helps achieve optimization but also enriches the QoS constraints for user satisfaction.
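
A minimal sketch of a layered composition search with backtracking under a QoS budget; the layers, weights, and budget are hypothetical, and the "log" is a plain list standing in for the paper's log-file records of rejected selections.

    def compose(layers, budget):
        log = []                       # records rejected partial compositions

        def search(depth, path, cost):
            if depth == len(layers):
                return path, cost      # reached the end: a feasible composition
            best = None
            for service, weight in sorted(layers[depth], key=lambda s: s[1]):
                if cost + weight > budget:                       # constraint violated:
                    log.append((path + [service], cost + weight))  # log and backtrack
                    continue
                found = search(depth + 1, path + [service], cost + weight)
                if found and (best is None or found[1] < best[1]):
                    best = found
            return best

        return search(0, [], 0.0), log

    # Hypothetical three-layer service graph: (service name, weight).
    layers = [[("a1", 2.0), ("a2", 1.0)], [("b1", 3.0)], [("c1", 1.5), ("c2", 4.0)]]
    result, log = compose(layers, budget=6.0)
    print(result)   # least-weight feasible path: (['a2', 'b1', 'c1'], 5.5)
    print(log)      # rejected candidates, available for renegotiation with the user

If no feasible path exists within the budget, the logged entries show where the constraints failed, which is exactly the information needed to negotiate a relaxed constraint with the user.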

Learning-Based Web Service Composition in Uncertain Environment (pp450-468)
       
Lei Yu, Zhili Wang, Luo-Ming Meng, Xuesong Qiu, and Jian-Tao Zhou
Web service composition involves two kinds of uncertainty: uncertain invocation results and uncertain quality of services. These uncertain factors affect the success rate of service composition, so the web service composition problem should be treated as an uncertain planning problem. This paper uses a Partially Observable Markov Decision Process (POMDP) to model the uncertain planning problem for service composition. Based on this uncertainty model, we propose a fast learning method, an uncertainty planning method, to compose web services. The method views invocations of web services as uncertain actions and service quality as partially observable variables. It does not need complete information; instead, it uses an estimated value function to approximate the real value function and obtain a composite service. Simulation experiments verify the validity of the algorithm, and the results show that our method improves the success rate of service composition and reduces computing time.
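
A minimal sketch of the underlying idea of learning an estimated value function from uncertain invocation outcomes. The paper's method is POMDP-based; the simplification here to a single-state value estimate, and the success probabilities, are assumptions for illustration only.

    import random

    SERVICES = {"s1": 0.9, "s2": 0.6}   # hypothetical invocation success rates
    Q = {s: 0.0 for s in SERVICES}      # estimated value of invoking each service
    alpha = 0.1                         # learning rate

    for episode in range(2000):
        s = random.choice(list(SERVICES))              # explore candidate services
        reward = 1.0 if random.random() < SERVICES[s] else 0.0   # uncertain outcome
        Q[s] += alpha * (reward - Q[s])                # update the estimated value

    # After learning, the composer picks the service with the highest estimate.
    print(max(Q, key=Q.get), Q)   # expected: s1, whose estimate approaches 0.9

The estimate converges toward the true success rate without the planner ever knowing it in advance, which is the sense in which complete information is not required.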

The Roles of Decision Making and Empowerment in Jordanian Web-Based Development Organisations (pp469-482)
       
Thamer Al-Rousan, Ayad Al-Zobaydi, and Osama Al-Haj Hassan
This study aims to explore how empowerment is enabled in Web-based project teams. It also aims to identify differences in empowering practices and in levels of individual empowerment across different types of Web-based project development methods. The point of departure is the assumption that the relationship between two important disciplines in Web-based project development, namely Web-based project development methods and empowerment, is not clear in industrial Web-based projects. Through survey data collected from 123 Web-based projects in Jordan, the study assesses whether empowerment differs across types of Web application development methodologies. The findings show that the level of participation in decisions and empowerment differs across Web-based project development teams, and there are clear signs that this can be attributed to the different organizations and methodologies chosen. The implications of these findings are discussed, and suggestions for future research are identified and proposed.

Web Event State Prediction Model: Combining Prior Knowledge with Real Time Data (pp483-506)
       
Xiangfeng Luo, Junyu Xuan, and Huimin Liu
State prediction plays a key role in the evolution analysis of web events. There are two issues in predicting the state of a web event: what factors affect its state transitions, and how prior knowledge can guide those state transitions. For the first issue, we discuss two types of temporal features observed from the real-time web pages covering an event: statistical features and knowledge-structural features. For the second issue, a Fuzzy Cognitive Map (FCM) and a conditional dependency matrix are mined from the training web events. As prior knowledge, they represent the relations between state transitions, as well as the relations between the unobserved space (i.e., the six states of web events) and the observed space (i.e., the two types of features). On this basis, an improved hidden Markov model is developed to predict the state transitions of web events. Experimental results show that the model has good performance and robustness because it combines prior knowledge with the real-time data of web events.
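
A minimal sketch of next-state prediction with a standard HMM forward pass, which is the base model the paper improves upon. The six event states are abstracted to three here, and all matrices are invented for illustration (the paper mines the corresponding prior knowledge from training web events via FCM and conditional dependency mining).

    import numpy as np

    states = ["emerging", "peak", "fading"]           # stand-ins for the six states
    A = np.array([[0.6, 0.3, 0.1],                    # state transition matrix (prior)
                  [0.2, 0.5, 0.3],
                  [0.1, 0.2, 0.7]])
    B = np.array([[0.7, 0.3],                         # P(observed feature | state)
                  [0.4, 0.6],
                  [0.2, 0.8]])
    belief = np.array([1.0, 0.0, 0.0])                # start in "emerging"

    for obs in [0, 1, 1]:                             # observed feature indices over time
        belief = (belief @ A) * B[:, obs]             # forward step: predict then correct
        belief /= belief.sum()                        # renormalize to a distribution

    next_state = belief @ A                           # one-step-ahead state prediction
    print(states[int(np.argmax(next_state))])

The real-time features update the belief at every step, while the mined transition matrix carries the prior knowledge, which is the combination the abstract describes.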

Web Page Prediction Enhanced with Confidence Mechanism (pp507-524)
       
Arpad Gellert and Adrian Florea
In this work we comparatively present and evaluate different prediction techniques used to anticipate and prefetch the web pages and files accessed via browsers. The goal is to reduce the delays necessary to load the web pages and files visited by users. Our analysis covers Markov chains, Hidden Markov Models, and graph algorithms. We have enhanced all these predictors with a confidence mechanism that dynamically classifies web pages as predictable or unpredictable. A prediction is generated only if the confidence counter attached to the current web page is in a predictable state, thus improving accuracy. Based on the results, we have also developed a hybrid predictor consisting of a Hidden Markov Model and a graph-based predictor. The experiments show that this hybrid predictor provides the best prediction accuracy, averaging 85.45% on the "Ocean Group Research" dataset from Boston University and 87.28% on the dataset collected from the educational web server of our university, and is thus the most appropriate for efficiently predicting and prefetching web pages.
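
A minimal sketch of the confidence-gating idea: a saturating counter per web page, incremented on correct predictions and decremented on wrong ones, with predictions issued only in the high-confidence range. The two-bit width and threshold are assumptions for illustration; the paper's counter parameters may differ.

    from collections import defaultdict

    MAX_COUNT, THRESHOLD = 3, 2          # 2-bit saturating counter, assumed width
    confidence = defaultdict(int)        # one counter per web page

    def should_predict(page):
        """Generate a prediction only when the page is in a predictable state."""
        return confidence[page] >= THRESHOLD

    def update(page, prediction_was_correct):
        if prediction_was_correct:
            confidence[page] = min(MAX_COUNT, confidence[page] + 1)
        else:
            confidence[page] = max(0, confidence[page] - 1)

    # Hypothetical access stream: two correct predictions promote the page.
    for correct in [True, True]:
        update("/index.html", correct)
    print(should_predict("/index.html"))   # True: counter reached the threshold

Gating this way trades a little coverage (no prefetch for unpredictable pages) for higher accuracy on the predictions that are actually issued.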

The Modified Concept based Focused Crawling using Ontology (pp525-538)
       
S. Thenmalar and T.V. Geetha
The major goal of focused crawlers is to crawl web pages that are relevant to a specific topic. One of the important issues for focused crawlers is the difficulty of determining which web pages are relevant to the desired topic. An ontology-based web crawler uses domain ontology to estimate the semantic content of a URL, and the relevancy of the URL is determined by an association metric. In concept-based focused crawling, a topic is represented by an overall concept vector, determined by combining the concept vectors of the individual pages associated with the seed URLs. Pages are ranked by comparing concept vectors at each depth, across depths, and against the overall topic-indicating concept vector. In this work, we determine and rank the seed page set from the seed URLs, and we rank and filter the page sets at the succeeding depths of the crawl. We propose a method to include relevant concepts from the ontology that were missed by the initial set of seed URLs. The performance of the proposed work is evaluated using two new evaluation metrics: convergence and density contour. The modified concept-based focused crawling process achieves a convergence value of 0.82 and, with the inclusion of missing concepts, a density contour value of 0.58.
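
A minimal sketch of ranking crawled pages against an overall topic concept vector. Cosine similarity is a common choice for comparing concept vectors and is used here as an assumption; the concept dimensions and scores below are invented for illustration.

    import math

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm if norm else 0.0

    # Hypothetical concept vectors over dimensions (sports, cricket, politics).
    topic = [0.8, 0.6, 0.0]                     # combined from the seed-URL pages
    pages = {"p1": [0.9, 0.5, 0.1], "p2": [0.1, 0.0, 0.9]}

    ranked = sorted(pages, key=lambda p: cosine(topic, pages[p]), reverse=True)
    print(ranked)   # ['p1', 'p2']: p1 is kept for the next crawl depth

Adding ontology concepts missed by the seed URLs amounts to widening the topic vector with extra dimensions before this comparison is made.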
