Since its inception, the Internet has been inherently insecure. Over the years, much progress has been made in information encryption and authentication. However, protecting infrastructure and resources against anomalous and attack behaviour remains a major open challenge. The problem is exacerbated by the advent of Cloud Computing, where resources are collocated over virtualised data centre infrastructures and the number and magnitude of security threats are amplified.
Current techniques for statistical and learning-based network-wide anomaly detection are offline and static, relying on the classical Machine Learning paradigm of collecting a corpus of training data with which to train the system. There is thus no ability to adapt to changing network and traffic characteristics without collecting a new corpus and re-training the system. Assumptions about the characteristics of the data are crude: for example, that measured features are independent (as in a Naïve Bayes classifier), or that projections that maximise the variance within the features (as in PCA) will naturally reveal anomalies. Moreover, there is currently no framework for profiling the evolving normal behaviour of networked infrastructures and identifying anomalies as deviations from such normality.
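To make the contrast with offline training concrete, the following minimal sketch profiles a single traffic feature online and flags samples that deviate strongly from the evolving baseline. It is an illustration only, not the method the project will develop: the class name OnlineProfile, the parameters alpha, threshold and warmup, and the use of an exponentially weighted mean and variance with a z-score test are all assumptions introduced here.

```python
import math


class OnlineProfile:
    """Exponentially weighted running mean and variance of one feature.

    Hypothetical illustration: names and default values are assumptions.
    """

    def __init__(self, alpha=0.05, threshold=4.0, warmup=10):
        self.alpha = alpha          # weight given to each new observation
        self.threshold = threshold  # z-score above which a sample is flagged
        self.warmup = warmup        # samples absorbed before anything is flagged
        self.mean = 0.0
        self.var = 0.0
        self.count = 0

    def update(self, x):
        """Score x against the current profile, then fold x into the profile."""
        self.count += 1
        if self.count == 1:
            # The first sample only initialises the baseline.
            self.mean = x
            return False

        deviation = x - self.mean
        std = math.sqrt(self.var)
        is_anomaly = (
            self.count > self.warmup
            and std > 0
            and abs(deviation) / std > self.threshold
        )

        # Incremental update: no stored corpus and no re-training step.
        # Every sample is folded in, so the baseline follows gradual change.
        increment = self.alpha * deviation
        self.mean += increment
        self.var = (1 - self.alpha) * (self.var + deviation * increment)
        return is_anomaly


if __name__ == "__main__":
    profile = OnlineProfile()
    # Synthetic per-interval packet counts with one injected burst at the end.
    samples = [100 + (i % 5) for i in range(50)] + [1000]
    for t, value in enumerate(samples):
        if profile.update(value):
            print(f"interval {t}: value {value} deviates from the profile")
```

Because every sample updates the baseline, a sustained shift in traffic is eventually absorbed as the new normal. A single-feature z-score is itself a crude assumption of the kind criticised above, which is why richer probabilistic models are of interest.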
The overarching objective of this PhD project is to design in-network, learning-based anomaly detection mechanisms that can operate on (and integrate) partial data, work on short timescales, and detect previously unseen anomalies. The work will bridge Machine and Reinforcement Learning with experimental systems research, and will evaluate the devised mechanisms over real-world virtualised networked environments and traffic workloads.
The student can focus on advancing the state-of-the-art in the learning processes, the requisite network programmability mechanisms, or both. For example, the project can focus on exploring recent advances in statistical ML to develop flexible probabilistic models that can capture the rapidly evolving view of the network. Or, it can focus on designing programmable dataplanes and application acceleration/offload frameworks that can support such advanced functionality running in-network and sustaining line-rate performance.
The research will be conducted as part of the Networked Systems Research Laboratory (netlab) at the School of Computing Science. The student will be given access to real Internet traffic traces and to a state-of-the-art networking testbed with fully programmable platforms at all software and hardware layers. The work will span several vibrant, cross-disciplinary research areas, and the student will gain highly sought-after skills in Machine Learning, cybersecurity and next-generation network architectures.