Triple Monitoring Engine, our response to variable resource requests

BigDataStack Innovation Potential: Initial Plan and Activities

As Cloud offerings for Big Data grow, the number of possible configurations for running a computation becomes practically unbounded. While this nearly endless set of possibilities is, at first sight, great news for the user, the complexity of Big Data infrastructures makes it impractical for the average user to keep track of the status of the system and adapt dynamically.

One of the main functionalities needed to address this problem is for the user to provide high-level Quality of Service (QoS) requirements that the system can monitor and evaluate, alerting both the system and the user when they are not met. This is of particular importance for Big Data applications, where both the amount of information to be processed and the required response time vary (for example, some services can run in the background as batch computations, while others need to be processed in real time).

The new component (to be) developed by BigDataStack

In BigDataStack we have identified this need, and we propose a novel solution that allows users to monitor their applications at three levels:

  1. Application level: The system identifies, through hooks inserted in the application’s code, different performance metrics of an application (response time, throughput, etc.).
  2. System level: The system can monitor itself and the performance of the resources leased by the user. If it discovers that the performance of these resources is becoming a bottleneck (e.g. the CPU is overloaded due to a computational peak), it raises the corresponding alerts.
  3. Network level: Finally, the system can keep track of the network performance, and raise an alert when this performance is lower than expected by the user.
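
The three levels above can be illustrated with a minimal Python sketch of how user-provided QoS requirements might be checked against observed metrics. This is not the Triple Monitoring Engine's actual implementation or API: the class names, metric names, thresholds and observed values are all hypothetical and chosen only to show the idea of one requirement per level raising an alert when it is not met.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    """The three monitoring levels described in the text."""
    APPLICATION = "application"
    SYSTEM = "system"
    NETWORK = "network"


@dataclass(frozen=True)
class QoSRequirement:
    """A hypothetical high-level QoS requirement set by the user."""
    level: Level
    metric: str
    threshold: float
    # True for throughput-like metrics, False for latency-like ones.
    higher_is_better: bool

    def is_violated(self, observed: float) -> bool:
        if self.higher_is_better:
            return observed < self.threshold
        return observed > self.threshold


def evaluate(requirements, observations):
    """Return one alert message per violated requirement.

    `observations` maps (level, metric) to the latest observed value;
    requirements without an observation are skipped.
    """
    alerts = []
    for req in requirements:
        observed = observations.get((req.level, req.metric))
        if observed is not None and req.is_violated(observed):
            alerts.append(
                f"[{req.level.value}] {req.metric}={observed} "
                f"violates threshold {req.threshold}"
            )
    return alerts


# Illustrative requirements and measurements (all values are made up).
requirements = [
    QoSRequirement(Level.APPLICATION, "response_time_ms", 200.0, higher_is_better=False),
    QoSRequirement(Level.SYSTEM, "cpu_utilisation", 0.9, higher_is_better=False),
    QoSRequirement(Level.NETWORK, "bandwidth_mbps", 100.0, higher_is_better=True),
]

observations = {
    (Level.APPLICATION, "response_time_ms"): 350.0,
    (Level.SYSTEM, "cpu_utilisation"): 0.45,
    (Level.NETWORK, "bandwidth_mbps"): 80.0,
}

alerts = evaluate(requirements, observations)
for alert in alerts:
    print(alert)
```

In this sketch the application-level and network-level requirements are violated, while the system-level one is met. A real deployment would presumably obtain the observations from the monitoring tools rather than from a hard-coded dictionary.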

To achieve this, in BigDataStack we have developed the Triple Monitoring Engine, a component that leverages two widely used open-source tools to monitor and react to the different events. This component integrates with the platform and, through a simple GUI, lets the user choose among multiple predetermined KPIs to obtain real-time information about the QoS of their applications relative to their expectations. It also integrates with the Dynamic Orchestrator component, which uses the information provided by the Triple Monitoring Engine to learn from the different configurations and propose new ones.

The expected innovation

The main innovation of the Triple Monitoring Engine is providing the user with a high-level tool to describe the QoS requirements of the different applications, easing the choice of configuration patterns. Furthermore, it gives the infrastructure provider a better understanding of the requirements of the applications running on top of their infrastructure, and gives the system the ability to adapt dynamically to the user’s needs. While this component can operate on its own, it should be considered part of a whole: its potential is best exploited through interaction with the other functionalities in the system, such as the Dynamic Orchestrator.