Introduction

In this handbook we describe how to integrate computing and storage infrastructure so that it complies with current guidelines and best practices of the European Open Science Cloud (EOSC). We also describe how it may be used, operated and extended, with an additional focus on the system administrator's perspective. We present a set of example applications that were adapted to profit from the services supported by the infrastructure. This includes the services used, the general architecture of EOSC, the tools we chose to support the applications, and how computer centres can contribute their own resources to the federated cloud.

Our starting point was:

  • We have 10 demanding Thematic Services that need infrastructure resources (CPU/GPU, storage, network, accounting, and monitoring) to provide their services and/or results to the users

  • We have an architecture for the European Open Science Cloud (EOSC) with definitions of the core services / federating core, ...

  • We have an EOSC Marketplace that provides a large choice (320+) of different solutions and tools, many of which are in an unknown state.

  • We have a team of experts in distributed computing who (happen to) operate the first working prototype of an EOSC infrastructure: the EGI Federated Cloud

The challenge: bring users and infrastructures together in a scalable way that avoids vertical solutions. The EOSC-Synergy approach was to introduce tools that provide a natural separation between the different roles and their requirements on the infrastructure. Users are supported by the Community Manager or by their Community Developers. These two representatives of the community can request changes to the infrastructure services. Site Administrators are responsible for operating the physical machines that provide CPU cycles and storage. They are in contact with Community Administrators, who in turn liaise with Community Managers to request the capacity at which these services are provided. Details are described in our Deliverable [D2.2].

All tools and solutions used are exclusively open source software. This ensures a long-term perspective for a sustainable infrastructure. Furthermore, the open approach means that third parties can always implement custom extensions to bridge any gaps that are identified.

One example of this is the set of three different tools that may be used to access the infrastructure. Two web tools and one Command Line Interface (CLI) tool cover the whole range of user experience. These tools, the Infrastructure Manager (IM) with its dashboard, the OpenStack Dashboard, and the fedcloudclient, are described together with many others in Section 4.

Federated is not just distributed. The federated nature of the infrastructure brings several challenges that need to be addressed in order to build a sustainable solution. One particular challenge stems from the fact that our users are identified by entirely different legal entities on different continents. Modern Identity and Access Management solutions (often called AAI, for Authentication and Authorisation Infrastructure) allow us to offer services to reliably identified users without the attached cost of user management.

Like our users, the computer centres that provide capacity to the cloud are federated across different legal entities in different countries, most of which are currently in Europe.

One essential concept for addressing this is that of Virtual Organisations (VOs). VO management is delegated to a user community. Community Managers negotiate quotas for their VOs with individual resource centres. The infrastructure provides usage statistics (accounting) and monitoring at the granularity of the VO. VOs are probably the most cost-efficient and scalable way of addressing federated users on federated infrastructures at large.
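The relationship between VO membership, per-site quotas, and VO-level accounting can be sketched as a small data model. This is only an illustration under stated assumptions: the class and function names (`VirtualOrganisation`, `UsageRecord`, `usage_by_vo`, `may_allocate`), the site and VO names, and the choice of CPU cores and core-hours as units are all hypothetical and do not reflect the actual EGI Federated Cloud implementation.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class VirtualOrganisation:
    """Hypothetical model of a VO; membership is managed by the community."""
    name: str
    members: Set[str] = field(default_factory=set)
    # Quota negotiated with each resource centre, in CPU cores (illustrative unit).
    quotas: Dict[str, int] = field(default_factory=dict)


@dataclass(frozen=True)
class UsageRecord:
    """One accounting record: core-hours used by a user at a site under a VO."""
    vo: str
    site: str
    user: str
    core_hours: float


def usage_by_vo(records):
    """Aggregate usage per (VO, site), dropping the per-user detail."""
    totals = defaultdict(float)
    for record in records:
        totals[(record.vo, record.site)] += record.core_hours
    return dict(totals)


def may_allocate(vo, site, user, cores_in_use, cores_requested):
    """Allow a request only for VO members and only within the site's quota."""
    if user not in vo.members:
        return False
    quota = vo.quotas.get(site, 0)  # no negotiated quota means no capacity
    return cores_in_use + cores_requested <= quota


# Illustrative data: one VO with quotas at two (hypothetical) resource centres.
vo = VirtualOrganisation(
    name="vo.example.org",
    members={"alice", "bob"},
    quotas={"site-a": 64, "site-b": 16},
)
records = [
    UsageRecord("vo.example.org", "site-a", "alice", 10.0),
    UsageRecord("vo.example.org", "site-a", "bob", 20.0),
    UsageRecord("vo.example.org", "site-b", "alice", 5.0),
]
totals = usage_by_vo(records)
```

In this sketch the resource centre only needs to know the VO and its quota; who belongs to the VO is decided by the community, which is the delegation described above.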

VOs as implemented on the EGI Federated Cloud provide a balance between freedom in the authorisation decision and strict governance of technology and policies (which software, which regulations, who is responsible). Without VOs, advancing endeavours such as the European Open Science Cloud (EOSC) does not seem feasible.

The overall organisation of this handbook is as follows. We start with an introduction to the Thematic Services (TS) supported by EOSC-Synergy and their requirements. Each Thematic Service is described in Section 2. Then we describe the general components of the EOSC architecture in Section 3.

The more technical Sections 4 and 5 describe how to integrate new resource centres (i.e. CPU, GPU, or storage hardware to provide cloud services) into the federated cloud, followed by a list of the components and tools used within the EOSC-Synergy project. We close with the summary in Section 6.