Concept drift in smart city scenarios

Thesis event information

Date and time of the thesis defence

Place of the thesis defence

L10

Topic of the dissertation

Concept drift in smart city scenarios

Doctoral candidate

Master of Science Hassan Mehmood

Faculty and unit

University of Oulu Graduate School, Faculty of Information Technology and Electrical Engineering, Center for Ubiquitous Computing UBICOMP

Subject of study

Computer Science

Opponent

Professor Francisco Camara Pereira, Technical University of Denmark

Custos

Docent Susanna Pirttikangas, University of Oulu

Concept drift in smart city scenarios

Exponential population growth and urbanisation pose challenges to the mobility, governance, well-being, environment, and safety of modern cities. Achieving sustainable societal goals therefore demands data-driven prediction and decision-making systems. Smart city data are being employed to improve citizens’ lifestyles, derive climate initiatives, provide quality health care and education, achieve better governance, and design better urban policies. However, smart city data are vast and heterogeneous, requiring efficient and fault-tolerant data platforms that support continuous data collection, storage, analysis, visualisation, and delivery of results in both batch and real-time fashion. In addition, real-world data bring their own challenges, as they may come from malfunctioning, replaced or differently calibrated devices. Concept drift is a crucial barrier to relying on predictions from real-world data streams. It emerges when the statistical properties or the context of the data change, often for unforeseen reasons, while predictive modelling is being performed.

This thesis addresses these challenges by investigating and proposing efficient concept drift detection approaches, providing distributed data pipeline architectures, and highlighting the challenges that concept drift poses for real-world applicability. First, two algorithms are proposed that perform predictive modelling using machine learning methods integrated with concept drift detection and adaptation methods. The experiment showed that integrating concept drift detection with predictive models increases the effectiveness of the resulting predictions. Secondly, a cloud computing-based distributed data pipeline architecture is provided to support data collection, data analysis, concept drift detection, and related tasks. Similarly, an edge computing-based distributed data pipeline is proposed for edge micro data centres to perform computationally demanding processes. The proposed data pipelines are fault-tolerant, scale seamlessly, and support batch and real-time processing as well as third-party application integration, among other features. The overall work contributes to the existing knowledge base and outperforms current state-of-the-art solutions in real-world use case implementations. Finally, the open issues and challenges of concept drift detection and real-world applicability are discussed.
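
As an illustration only, and not one of the two algorithms proposed in the thesis, the idea of coupling a predictive model with drift detection and adaptation can be sketched as follows. The sketch assumes a well-known detector, the Page-Hinkley test, applied to the absolute prediction error of a deliberately simple predictor (the running mean of the values seen so far). The names `PageHinkley` and `run_stream`, and the values of `delta` and `threshold`, are hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream."""
    delta: float = 0.05      # tolerated change magnitude (assumed value)
    threshold: float = 5.0   # alarm threshold (assumed value)
    n: int = 0               # number of observations since the last reset
    mean: float = 0.0        # running mean of the observations
    cum: float = 0.0         # cumulative deviation from the running mean
    cum_min: float = 0.0     # smallest cumulative deviation seen so far

    def update(self, x: float) -> bool:
        """Add one observation; return True if drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold

    def reset(self) -> None:
        """Forget all accumulated statistics."""
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0
        self.cum_min = 0.0


def run_stream(values):
    """Predict each value with a running mean; adapt when drift is detected."""
    detector = PageHinkley()
    count, total = 0, 0.0
    drift_points, predictions = [], []
    for i, x in enumerate(values):
        # Predict before seeing x; with no history, fall back to x itself.
        pred = total / count if count else x
        predictions.append(pred)
        # Monitor the absolute prediction error for an upward shift.
        if detector.update(abs(x - pred)):
            drift_points.append(i)
            # Adaptation step: discard the outdated model and detector state.
            detector.reset()
            count, total = 0, 0.0
        count += 1
        total += x
    return drift_points, predictions


if __name__ == "__main__":
    # Synthetic stream whose level changes abruptly halfway through.
    stream = [10.0] * 50 + [20.0] * 50
    drifts, preds = run_stream(stream)
    print("drift detected at indices:", drifts)
    print("last prediction:", preds[-1])
```

Resetting the predictor on an alarm is the simplest possible adaptation strategy; a real deployment would more likely retrain or reweight a machine learning model on recent data, and the detector parameters would need tuning to the data stream at hand.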
Last updated: 23.1.2024