Abstract: Uncertain data is inherent in a few important applications, such as environmental surveillance and mobile object tracking. Top-k queries (also known as ranking queries) are often natural and useful in analyzing uncertain data in those applications. In this paper, we study the problem of answering probabilistic threshold top-k queries on uncertain data. Such a query returns the uncertain records whose probability of being in the top-k list is at least p, where p is a user-specified probability threshold. We present an efficient exact algorithm, a fast sampling algorithm, and a Poisson approximation based algorithm. An empirical study using real and synthetic data sets verifies the effectiveness of probabilistic threshold top-k queries and the efficiency of our methods.
Keywords: Dimension incomplete database, similarity search, whole sequence query.
Title: Searching Measurement for Imperfect Databases
Authors: Sreeja VS, G. Thilagavathi
International Journal of Thesis Projects and Dissertations (IJTPD)
Research Publish Journals
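
Illustrative sketch (not taken from the paper): the probabilistic threshold top-k query defined in the abstract can be approximated by sampling possible worlds. The Python below is a minimal sketch under stated assumptions: each record is a hypothetical (record_id, score, membership_probability) triple, records exist independently of one another, and higher scores rank first. The function name ptk_sample and its parameters are made up for illustration; the paper's own sampling algorithm may differ in its details.

```python
import random


def ptk_sample(records, k, p, n_samples=10000, seed=0):
    """Estimate which records are in the top-k with probability >= p.

    records: list of (record_id, score, membership_probability) triples.
    Assumption: records appear independently of each other.
    """
    rng = random.Random(seed)
    # Rank once by score, highest first; every sampled world reuses this order.
    ranked = sorted(records, key=lambda r: r[1], reverse=True)
    counts = {rid: 0 for rid, _, _ in ranked}

    for _ in range(n_samples):
        taken = 0
        for rid, _, prob in ranked:
            # The record is present in this sampled world with probability prob.
            if rng.random() < prob:
                counts[rid] += 1
                taken += 1
                if taken == k:
                    break  # the top-k list of this world is complete

    # Keep the records whose estimated top-k probability reaches the threshold.
    estimates = {rid: c / n_samples for rid, c in counts.items()}
    return {rid: est for rid, est in estimates.items() if est >= p}
```

The returned dictionary maps each qualifying record id to its estimated top-k probability. Increasing n_samples tightens the estimates at the cost of running time.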