Network Measurement as a Service (NMaaS)

EPSRC-funded Project (EP/N033957/1)

Project Summary

Recent advances in server and network virtualisation have given rise to the Infrastructure-as-a-Service paradigm, in which businesses lease resources from cloud data centre operators and thereby outsource their ICT. Such businesses can themselves be application and service providers who act as tenants of a shared data centre infrastructure. Tenants resize their ICT footprint through the pay-as-you-go pricing model, thereby keeping capital (and operational) expenditure low and increasing their profit margin. This infrastructural abstraction allows tenants to focus solely on their business delivery model while leaving infrastructure maintenance to the operators. However, the resulting lack of visibility into the dynamic state of the underlying infrastructure can severely hurt tenants' services when infrastructure performance fluctuates over short timescales. This deters more businesses from migrating to the cloud; they are instead forced to maintain their own in-house infrastructures.

Adding to the problem, security risks are more acute in the cloud. Attackers can leverage cloud servers to launch DDoS attacks against other tenants or to run faster port scans that identify vulnerable services. Tenants in particular are completely excluded from detecting security threats and from taking remedial action autonomously as incidents unfold. Vulnerable services can end up consuming immense amounts of compute and network resources, leading to unsustainable bills for tenants, who may ultimately have to withdraw their services from the cloud.

Existing measurement and monitoring approaches are inadequate because they are architected specifically for accounting, traffic engineering or offline debugging. Measurements from these approaches give no indication of whether an application is suffering from self-induced congestion or a cyber-attack, whether other offending flows or applications are responsible, whether unacceptable latencies are due to long queueing delays at a particular switch or application component, or how many flows are affected. While addressing these problems is in itself important to cloud operators, doing so in a timely fashion is often simply impossible, because software and hardware updates take time and new pathological traffic patterns may arise as applications evolve.

The overarching goal of this project is to design and develop a native Network Measurement-as-a-Service (NMaaS) framework that will allow tenants to express their measurement needs, and to subsequently synthesise the corresponding complex service-level performance functions out of simple monitoring primitives. The required primitive measurement components will be dynamically and transparently instantiated when and where required throughout the infrastructure, exploiting the temporarily available capacity of servers and network nodes. In particular, we aim to:
  • devise novel server and switch instrumentation capabilities for traffic monitoring and make them a native part of the underlying infrastructure so that they can support diverse measurement functions while alleviating measurement errors and uncertainties
  • develop a network-wide, centrally-orchestrated algorithm for the synthesis of complex metrics through the optimal placement of server-based and switch-based measurement functions in virtual and physical network components
  • design and develop measurement requirement description APIs to parse high-level measurement specifications issued by tenants and transform them into low-level measurement indicators.
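
As a rough illustration of the last aim, the sketch below shows one way a high-level tenant specification might be expanded into low-level measurement indicators. It is only a sketch under stated assumptions: the metric names, the primitive names, the `PRIMITIVES` catalogue, the `Indicator` type and the `compile_spec` function are all hypothetical and are not taken from the project.

```python
from dataclasses import dataclass

# Hypothetical catalogue: the primitives each high-level metric is built from.
# All names here are illustrative assumptions, not part of the NMaaS project.
PRIMITIVES = {
    "flow_latency": [("switch", "queue_delay"), ("server", "packet_timestamp")],
    "loss_rate": [("switch", "packet_counter"), ("server", "packet_counter")],
}


@dataclass(frozen=True)
class Indicator:
    """One low-level measurement to instantiate at a server or a switch."""
    tenant: str
    metric: str
    location: str   # "server" or "switch"
    primitive: str


def compile_spec(tenant, spec):
    """Expand a tenant's list of high-level metrics into low-level indicators."""
    indicators = []
    for metric in spec["metrics"]:
        if metric not in PRIMITIVES:
            raise ValueError(f"unknown metric: {metric}")
        for location, primitive in PRIMITIVES[metric]:
            indicators.append(Indicator(tenant, metric, location, primitive))
    return indicators


if __name__ == "__main__":
    # A tenant asks for one high-level metric; the sketch lists what to deploy.
    for indicator in compile_spec("tenant-a", {"metrics": ["flow_latency"]}):
        print(indicator)
```

In this sketch the placement of each primitive is fixed by the catalogue; the second aim above concerns choosing such placements with a centrally orchestrated algorithm, which the sketch does not attempt.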

Ultimately, we aim to demonstrate that the proposed framework will contribute significantly to maintaining the desired application performance while at the same time improving the utilisation of cloud resources. Given that the cloud is still a rapidly growing global business, we anticipate that the research outcome will greatly benefit the wider IT industry.