The cloud computing industry has grown massively over the last decade, and new areas of application have arisen with it. Some of these areas require specialized hardware that must be placed close to the user. User requirements such as ultra-low latency, security and location awareness are becoming increasingly common, for example in Smart Cities, industrial automation and data analytics. Modern cloud applications have also become more complex, as they usually run on distributed computer systems and are split into components that must run with high availability.
Unifying such diverse systems into centrally controlled compute clusters and making sophisticated scheduling decisions across them are two major challenges in this field. Scheduling decisions for a cluster consisting of cloud and edge nodes must consider unique characteristics such as variability in node and network capacity. The common solution for orchestrating large clusters is Kubernetes; however, it is designed for reliable, homogeneous clusters. Many applications and extensions are available for Kubernetes. Unfortunately, none of them optimizes both performance and energy or addresses data and job locality.
Duration
12/2022 to 11/2025
Programme
Horizon Europe
HORIZON-CL4-2022-DATA-01-02
Research & Innovation Action
Reference
101092582
Project Concept
AI-based, open and portable cloud management
DECICE aims to develop an AI-based, open and portable cloud management framework for the automatic and adaptive optimization and deployment of applications in a federated infrastructure, spanning compute resources from the very large (e.g., HPC systems) to the very small (e.g., IoT sensors connected at the edge).
Digital Twin
Working at such vastly different scales requires an intelligent management plane with advanced capabilities that allow it to proactively adjust workloads within the system based on their needs, such as latency, compute power and power consumption. Therefore, we envision an AI model that uses a digital twin of the available resources to make real-time scheduling decisions based on telemetry data from those resources.
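To make the idea more concrete, the following minimal Python sketch shows how a digital twin fed by telemetry could drive a placement decision. It is an illustration only and not the DECICE implementation: all class, field and function names (NodeState, Workload, DigitalTwin, schedule) are hypothetical, and the simple rule used here (filter by latency and free cores, then pick the lowest estimated power draw) merely stands in for the envisioned AI model.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class NodeState:
    """Twin-side view of one node, refreshed from telemetry (hypothetical fields)."""
    name: str
    free_cpu_cores: float
    latency_ms: float      # network latency between the node and the user
    watts_per_core: float  # estimated extra power draw per busy core


@dataclass
class Workload:
    """Requirements of a job to be placed (hypothetical fields)."""
    name: str
    cpu_cores: float
    max_latency_ms: float


class DigitalTwin:
    """Keeps the latest known state of every node in the federation."""

    def __init__(self) -> None:
        self.nodes: Dict[str, NodeState] = {}

    def ingest(self, sample: NodeState) -> None:
        # A new telemetry sample replaces the previous state of that node.
        self.nodes[sample.name] = sample


def schedule(twin: DigitalTwin, job: Workload) -> Optional[str]:
    """Return the name of the chosen node, or None if no node is feasible.

    Assumption: a rule-based stand-in for the AI model. Nodes that cannot
    meet the latency or CPU requirement are discarded; among the rest, the
    node with the lowest estimated power draw wins (ties: lower latency).
    """
    feasible = [
        node for node in twin.nodes.values()
        if node.free_cpu_cores >= job.cpu_cores
        and node.latency_ms <= job.max_latency_ms
    ]
    if not feasible:
        return None
    best = min(
        feasible,
        key=lambda node: (node.watts_per_core * job.cpu_cores, node.latency_ms),
    )
    # Reflect the placement in the twin so the next decision sees it.
    best.free_cpu_cores -= job.cpu_cores
    return best.name


# Example federation: one HPC node, one edge node, one IoT device.
twin = DigitalTwin()
twin.ingest(NodeState("hpc-1", free_cpu_cores=64.0, latency_ms=40.0, watts_per_core=3.0))
twin.ingest(NodeState("edge-1", free_cpu_cores=4.0, latency_ms=5.0, watts_per_core=6.0))
twin.ingest(NodeState("iot-1", free_cpu_cores=0.5, latency_ms=2.0, watts_per_core=1.0))

placement = schedule(twin, Workload("video-analytics", cpu_cores=2.0, max_latency_ms=10.0))
```

In this example the latency-sensitive job can only be placed on the edge node, because the HPC node is too far away and the IoT device is too small. A learned model would replace the fixed rule in schedule while still reading its inputs from the twin.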
DECICE framework
The DECICE framework will be able to dynamically balance different workloads, optimize the throughput and latency of the system resources (compute, storage, and network) with respect to performance and energy efficiency, and quickly adapt to changing conditions. The framework also gives administrators and deployment experts the tools and interfaces they need to interact with all infrastructure components and control them to achieve the desired result.
Open standard APIs
The integration of the DECICE framework with orchestration systems will be done through open standard APIs to make it portable, modular and extensible. The DECICE framework will be evaluated through established use cases.
Project Impacts
Impact 01
Europe’s open strategic autonomy by sustaining first-mover advantages in strategic areas including AI, data, robotics, quantum computing, and graphene, and by investing early in emerging enabling technologies
Impact 02
Reinforced European industry leadership across the digital supply chains
Impact 03
Robust European industrial and technology presence in all key parts of a greener digital supply chain, from low-power components to advanced systems, future networks, new data technologies, and platforms