CEU eTD Collection

CEU Electronic Theses and Dissertations, 2024

Author	Kiss, Oliver
Title	Essays in Machine Learning and Networks
Summary	Chapter 1: Machine Learning on Networks. This chapter consists of three published papers, all developing and examining machine learning tools for graph-structured data. Graphs encode important structural properties of complex systems, and machine learning on graphs has therefore emerged as an important technique in research and applications. We present Karate Club, a Python framework combining more than 30 state-of-the-art graph mining algorithms. We show that Karate Club is efficient in terms of learning performance on a wide range of real-world clustering and classification tasks, and we provide supporting evidence of its competitive speed. Sampling graphs is an important task in data mining. We propose Little Ball of Fur, a Python library that includes more than twenty graph sampling algorithms. Our experiments demonstrate that Little Ball of Fur can speed up node and whole-graph embedding techniques considerably while only mildly deteriorating the predictive value of distilled features. We present PyTorch Geometric Temporal, a deep learning framework combining state-of-the-art machine learning algorithms for neural spatiotemporal signal processing. Experiments demonstrate the predictive performance of the models implemented in the library on real-world problems such as epidemiological forecasting, ride-hail demand prediction, and web traffic management. Our sensitivity analysis of runtime shows that the framework can potentially operate on web-scale datasets with rich temporal features and spatial structure.
Chapter 2: The Shapley Value in Machine Learning. Over the last few years, the Shapley value, a solution concept from cooperative game theory, has found numerous applications in machine learning. In this paper, we first discuss fundamental concepts of cooperative game theory and axiomatic properties of the Shapley value. Then, we give an overview of the most important applications of the Shapley value in machine learning: feature selection, explainability, multi-agent reinforcement learning, ensemble pruning, and data valuation. We examine the most crucial limitations of the Shapley value and point out directions for future research.
Chapter 3: Peer Effects in Multiplex Networks. How social and economic networks govern real-life phenomena has been studied extensively in the past decades. While constructing randomized experiments is possible in some settings, most social ties are difficult or impossible to randomize. In such cases, the estimation of peer effects must rely on observational data. Such estimation is, however, hindered by a range of difficulties posed by the specific data structure. This paper proposes using spatial models to address the problem of omitted variable bias caused by correlated social networks and to disentangle peer effects in more complex social situations. The validity of the approach is verified using Monte Carlo simulations and synthetic graphs. The difference arising from the proposed simultaneous peer effects estimation is further illustrated using data from a technology adoption field experiment in Uganda.
Supervisor	Szeidl, Adam
Department	Economics PhD
Full text	https://www.etd.ceu.edu/2024/kiss_oliver.pdf


CEU eTD Collection (2024); Kiss, Oliver: Essays in Machine Learning and Networks