Efficient and effective processing of massive stream data has attracted much attention from the database community, as it is useful in many real applications such as sensor data monitoring and network intrusion detection. In practice, due to malfunctioning sensing devices or imperfect data collection techniques, real-world stream data often contain missing or incomplete attributes. In this Ph.D. dissertation study, we have formalized and tackled a novel and important problem, named query processing over incomplete data streams (P-iDS), which retrieves desired objects (in the presence of missing attributes) with high confidence from incomplete data streams. To tackle the P-iDS problem, we have designed efficient approaches that impute the missing attributes of objects from incomplete data streams via different data imputation rules. We have proposed effective pruning strategies to reduce the search space of the P-iDS problem, devised cost-model-based index structures that facilitate data imputation and query computation at the same time, and integrated the proposed techniques into an efficient P-iDS query answering algorithm. To evaluate the efficiency and effectiveness of our P-iDS processing approach, we have conducted extensive experiments on representative query operators (e.g., entity resolution, skyline, join, and top-k) of the P-iDS problem over both real and synthetic data sets.
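As a rough illustration of the impute-then-query idea summarized above, the following minimal sketch imputes a missing attribute as a probability distribution over candidate values and keeps only the stream objects that satisfy a query predicate with confidence at least a threshold. It is not the dissertation's algorithm: the voting rule, the function names (`impute_candidates`, `confident_matches`), the threshold `alpha`, and the toy sensor data are all assumptions made for this example, and pruning and indexing are omitted entirely.

```python
from collections import Counter

def impute_candidates(obj, attr, determinant, repository):
    """Return {candidate_value: probability} for a missing attribute.

    Hypothetical rule: complete samples in `repository` that share the
    object's value on `determinant` vote for the missing value of `attr`.
    """
    votes = Counter(
        s[attr] for s in repository
        if s[determinant] == obj[determinant] and s[attr] is not None
    )
    total = sum(votes.values())
    return {v: c / total for v, c in votes.items()} if total else {}

def confident_matches(stream, predicate, attr, determinant, repository, alpha):
    """Yield (id, confidence) for objects whose `attr` satisfies
    `predicate` with probability at least `alpha`."""
    for obj in stream:
        if obj[attr] is not None:
            dist = {obj[attr]: 1.0}  # complete object: value is certain
        else:
            dist = impute_candidates(obj, attr, determinant, repository)
        conf = sum(p for v, p in dist.items() if predicate(v))
        if conf >= alpha:
            yield obj["id"], conf

# Toy data (assumed): historical complete readings and an incoming stream.
repository = [
    {"room": "A", "temp": 30},
    {"room": "A", "temp": 30},
    {"room": "A", "temp": 20},
    {"room": "A", "temp": 31},
    {"room": "B", "temp": 18},
]
stream = [
    {"id": 1, "room": "A", "temp": None},  # missing attribute
    {"id": 2, "room": "B", "temp": None},  # missing attribute
    {"id": 3, "room": "B", "temp": 35},    # complete object
]

results = list(
    confident_matches(stream, lambda t: t > 25, "temp", "room", repository, alpha=0.7)
)
print(results)
```

In this toy run, object 1 is returned because three of the four room-A samples exceed 25, giving a confidence of 0.75. Object 2 is dropped because its only candidate value does not satisfy the predicate, and object 3 is returned with certainty because it is complete.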
Dr. Weilong Ren obtained his Ph.D. degree from the Department of Computer Science, Kent State University, in December 2021, under the co-supervision of Dr. Xiang Lian and Dr. Kambiz Ghazinour. His current research interests include incomplete data management, data integration, and data privacy.