About

The cloud computing industry has grown massively over the last decade, and with it new areas of application have arisen. Some of these require specialized hardware placed close to the user. Requirements such as ultra-low latency, security, and location awareness are becoming increasingly common, for example in Smart Cities, industrial automation, and data analytics. Modern cloud applications have also become more complex: they usually run on a distributed computer system, split into components that must run with high availability.

Unifying such diverse systems into centrally controlled compute clusters and making sophisticated scheduling decisions across them are two major challenges in this field. Scheduling decisions for a cluster consisting of cloud and edge nodes must account for characteristics such as variability in node and network capacity. The common solution for orchestrating large clusters is Kubernetes; however, it is designed for reliable, homogeneous clusters. Many applications and extensions are available for Kubernetes, but none of them optimizes for both performance and energy or addresses data and job locality.
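
To make this concrete, the sketch below illustrates the kind of multi-objective node scoring such a scheduler needs, weighing performance, energy, and data locality together when placing a workload. The class, field names, and weights are illustrative assumptions, not part of DECICE or Kubernetes.

    # Illustrative only: a toy multi-objective node score for cloud-edge
    # scheduling. All names and weights are hypothetical.
    from dataclasses import dataclass

    @dataclass
    class Node:
        name: str
        free_cpu: float           # available CPU cores
        latency_ms: float         # network latency to the data source
        power_per_core_w: float   # marginal power draw per core
        holds_input_data: bool    # job/data locality

    def score(node: Node, requested_cpu: float) -> float:
        """Higher is better; combines performance, energy, and locality."""
        if node.free_cpu < requested_cpu:
            return float("-inf")                    # infeasible node
        perf = 1.0 / (1.0 + node.latency_ms)        # favour low latency
        energy = 1.0 / (1.0 + node.power_per_core_w * requested_cpu)
        locality = 1.0 if node.holds_input_data else 0.0
        return 0.5 * perf + 0.3 * energy + 0.2 * locality

    def pick_node(nodes: list[Node], requested_cpu: float) -> Node:
        return max(nodes, key=lambda n: score(n, requested_cpu))

The stock Kubernetes scheduler does not weigh energy consumption or data locality in this way; a combined score of this kind is what the paragraph above identifies as missing.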

LEVERAGE A COMPUTE CONTINUUM ranging from Cloud and HPC to Edge and IoT.
AI-SCHEDULER supporting dynamic load balancing for energy-efficient compute orchestration, improved use of Green Energy, and automated deployment.
API that increases control over network, computing and data resources.
DYNAMIC DIGITAL TWIN of the system with AI-based prediction capabilities.
REAL-LIFE USE CASES of the DECICE framework (usability and benefits).
SERVICE DEPLOYMENT with a high level of trustworthiness and compliance with relevant security frameworks.

Duration

11/2022 to 10/2025

Programme

Horizon Europe

HORIZON-CL4-2022-DATA-01-02

Research & Innovation Action

Reference

101092582

Project Concept

AI-based, open and portable cloud management

DECICE aims to develop an AI-based, open and portable cloud management framework for the automatic and adaptive optimization and deployment of applications in a federated infrastructure, spanning computing from the very large (e.g., HPC systems) to the very small (e.g., IoT sensors connected at the edge).

Digital Twin

Working at such vastly different scales requires an intelligent management plane with advanced capabilities that allow it to proactively adjust workloads within the system based on their needs, such as latency, compute power, and power consumption. We therefore envision an AI model that uses a digital twin of the available resources to make real-time scheduling decisions based on telemetry data from those resources.
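
As a rough illustration of that idea, the sketch below keeps an in-memory twin of node telemetry and answers scheduling queries against it rather than against the live cluster. The class name, the naive averaging "prediction", and the node-selection rule are assumptions made for this example only, not DECICE code.

    # Minimal digital-twin sketch: mirror node telemetry in memory and let
    # a scheduler query predicted state instead of polling the real cluster.
    import time
    from collections import defaultdict

    class DigitalTwin:
        def __init__(self):
            # node name -> list of (timestamp, cpu_utilisation) samples
            self.telemetry = defaultdict(list)

        def ingest(self, node: str, cpu_util: float) -> None:
            """Record a telemetry sample streamed from the real cluster."""
            self.telemetry[node].append((time.time(), cpu_util))

        def predicted_util(self, node: str) -> float:
            """Naive prediction: average of the last few samples.
            A real system would plug an AI model in here."""
            samples = [u for _, u in self.telemetry[node][-5:]]
            return sum(samples) / len(samples) if samples else 0.0

        def best_node(self, candidates: list[str]) -> str:
            """Scheduling decision taken against the twin, not the cluster."""
            return min(candidates, key=self.predicted_util)

    if __name__ == "__main__":
        twin = DigitalTwin()
        twin.ingest("edge-1", 0.9)
        twin.ingest("edge-1", 0.8)
        twin.ingest("cloud-1", 0.3)
        print(twin.best_node(["edge-1", "cloud-1"]))   # -> cloud-1

Keeping decisions against the twin is what allows the scheduler to evaluate "what if" placements quickly, without disturbing the running system.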

DECICE framework

The DECICE framework will dynamically balance different workloads, optimize system resources (compute, storage, and network) for throughput, latency, and energy efficiency, and quickly adapt to changing conditions. It also gives administrators and deployment experts the necessary tools and interfaces to interact with all infrastructure components and control them to achieve the desired result.
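
A minimal sketch of the adaptive behaviour described above, under assumed names: a periodic control loop re-evaluates placements and only migrates a workload when a clearly better node exists, so the system keeps adapting without thrashing. The Workload type, the score and migrate callables, and the thresholds are placeholders, not the DECICE interfaces.

    # Hedged sketch of a rebalancing control loop; all names are assumptions.
    import time
    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Workload:
        name: str
        node: str   # node it currently runs on

    def control_loop(workloads: list[Workload],
                     nodes: list[str],
                     score: Callable[[str, Workload], float],
                     migrate: Callable[[Workload, str], None],
                     gain: float = 0.2,
                     interval_s: int = 30) -> None:
        """Periodically re-evaluate placements; move a workload only when a
        clearly better node exists (the gain threshold avoids thrashing)."""
        while True:
            for wl in workloads:
                best = max(nodes, key=lambda n: score(n, wl))
                if score(best, wl) - score(wl.node, wl) > gain:
                    migrate(wl, best)   # e.g. reschedule the pod elsewhere
                    wl.node = best
            time.sleep(interval_s)

The hysteresis threshold is the design choice worth noting: without it, small telemetry fluctuations would cause constant migrations and negate any energy savings.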

Open standard APIs

The DECICE framework will integrate with orchestration systems through open standard APIs, making it portable, modular, and extensible. It will be evaluated through established use cases.
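
As an example of what integration through an open standard API can look like, the snippet below reads node information via the standard Kubernetes API using the official Python client. The edge-node label is an assumption used for illustration; this is not the DECICE integration code.

    # Reading cluster state through the standard Kubernetes API.
    from kubernetes import client, config

    def list_edge_nodes(label: str = "node-role.kubernetes.io/edge"):
        """Return (name, allocatable CPU) for nodes carrying an edge label
        (the label key is an assumed convention)."""
        config.load_kube_config()            # or load_incluster_config()
        v1 = client.CoreV1Api()
        edge_nodes = []
        for node in v1.list_node().items:
            if label in (node.metadata.labels or {}):
                edge_nodes.append((node.metadata.name,
                                   node.status.allocatable.get("cpu")))
        return edge_nodes

    if __name__ == "__main__":
        for name, cpu in list_edge_nodes():
            print(f"{name}: {cpu} allocatable CPU")

Because everything goes through the standard API, the same code works against any conformant Kubernetes distribution, which is the portability argument made above.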

Project Impacts

Impact 01

Europe’s open strategic autonomy by sustaining first-mover advantages in strategic areas including AI, data, robotics, quantum computing, and graphene, and by investing early in emerging enabling technologies.

Impact 02

Reinforced European industry leadership across the digital supply chains.

Impact 03

Robust European industrial and technology presence in all key parts of a greener digital supply chain, from low-power components to advanced systems, future networks, new data technologies, and platforms.

Project Structure

WP1 aims to organize the overall project administration, finances, and project management, including the definition and coordination of quality and risk management.

WP2 aims to implement the scheduling agent responsible for the efficient orchestration of the application workload on the cloud-edge infrastructure.

WP3 seeks to integrate different backend solutions into a portable framework that can be plugged into arbitrary cloud frameworks, and to provide a training environment.

WP4 targets the majority of tasks concerning the integration of extensions into Kubernetes.

WP5 contains activities revolving around deployment and validation.

WP6 seeks to enhance the long-term impact of the project through strategic planning of dissemination, communication, and stakeholder engagement activities.

Deliverables

D2.1 Specification of the Optimization Scope

D2.2 Digital Twin

D2.3 AI-Scheduler Prototypes for Storage and Compute

D2.4 Integrated AI-Scheduler Prototype

D2.5 Final Scheduler and Digital Twin

D3.1 Synthetic Test Environment

D3.2 Final Architecture and Interfaces

D3.3 Final Implementation

D3.4 Security and Trustworthiness

D4.1 Implementation Report of CI/CD Environment

D4.2 Integration of Monitoring Framework

D4.3 Final Integration of DECICE APIs

D4.4 Final Integration of HPC and AI Services

D5.1 Use Case Requirements

D5.2 Development Environment Specification

D5.3 Project Development Environment Deployed for Phase 1 and 2

D5.4 Project Development Environment Deployed for Phase 3

D5.5 Performance Evaluation Report

D6.1 Dissemination & Communication Plan

D6.3 Online and Media Presence

D6.4 Engagement Summary Report
