Thematic Services

EOSC-Synergy supports ten thematic services in four scientific areas (Earth Observation, Biomedicine, Astrophysics and Climate Change; see here for a detailed list). Each service corresponds to a separate community with its own requirements, software tools, and access patterns. All of these thematic services require access to infrastructure services such as CPU, storage and network, but the way the infrastructure is allocated, accessed, and used differs from one thematic service to the next.

Many services provide access to research data and benefit from wider availability. They therefore need to act as clients and servers simultaneously. Often, the underlying user management (the authentication and authorisation infrastructure, AAI) is shared between both roles.

The thematic services serve as examples because they address a larger number of issues than many other services. This gives us the chance either to prove that EOSC-Synergy is ready to access federated datasets in clusters distributed across Europe, or to develop additional tools for the ecosystem where they are still missing.

During the project lifetime, the communities have progressed towards best practices for adopting common EOSC guidelines, tools, interfaces and services. This includes helping the communities to increase the capacity, performance, reliability and/or functionality of their thematic services by integrating them in EOSC. This was especially important for substantially increasing the number of users of these thematic services.

EOSC-Synergy’s Work Package 4 (WP4) will publish a detailed report that describes the requirements and solutions of the thematic services. [WP4-IS]

Thematic Services Challenges

Here we provide a short overview of the specific challenges faced by each thematic service, in particular those concerning access to computing and storage. For more information about each thematic service, we provide references to a publication that describes the service in more detail. The following information was originally collected in the paper “A survey of the European Open Science Cloud services for expanding the capacity and capabilities of multidisciplinary scientific applications” by Ignacio Blanquer et al. [WP4-IS]

Thematic service: limitations and needs

WORSICA
- Improve download speed and number of concurrent downloads of satellite images.
- Increase storage of the images needed for the algorithm.
- Increase computational resources: GPU and RAM to speed up the image processing.
- Seamless authentication and authorization for end users.

SAPS
- Need for a larger-scale deployment: computing, storage and data access.
- Scalability and standardisation of services.
- Integrated and widely supported AAI.

GCore
- Overcome limited access to data repository due to network bandwidth restrictions.
- Infrastructure resources for processing and reprocessing large data sets.
- Data delivery volume: increasing size of files to be delivered to users.

SCIPION
- Insufficient cloud resources for the workflow: GPUs, CPUs and RAM.
- Need for resource management able to optimize the use of cloud resources.
- Storage limitations and data transfer performance: 1-3 TB raw data.
- Distributed and shared file system.

OpenEBench
- Need to work on heterogeneous systems to reach Life Sciences communities.
- Need to efficiently store processed data and workflows in a FAIR manner.

LAGO
- Limitations on data preprocessing.
- Needs data storage that copes with FAIR, curation and harvesting.
- Need for computing power for simulations, together with optimal scheduling.

SDS-WAS
- Lack of services needed for data storage and curation.
- Lack of computing power for on-demand data analysis.
- Lack of reliability of data sources, especially for observations.

UMSA
- Long-term data storage is required, together with appropriate data curation.
- Tracking provenance of the secondary (derived) datasets.
- Need for reimplementing UMSA algorithms to deal with sparse data.

MSWSS
- Needs data protection measures because of the usage of confidential data.
- The data has to be stored in private storage only.
- Implement security policies to protect VMs.

O3AS
- Requires larger storage resources, especially for improving data availability.
- Fast handling of big data.

Table 1: The thematic services and the challenges that need to be addressed in EOSC-Synergy

Thematic service technology choices

To address the identified challenges, WP4 of EOSC-Synergy analysed the services offered via the EOSC Marketplace, where more than 320 services are available. In [WP4-IS] these services are organised into six categories, of which “Access physical & eInfrastructures” is the one relevant here. Table 2 shows the services chosen by each thematic service to address its needs in the different domains.

More details are available in the corresponding WP4 Deliverable [D4.3].

| Service | AAI | Workload Mng. | Resource Mng. | Data Storage |
| --- | --- | --- | --- | --- |
| WORSICA | EGI Check-in | ArcCE, Batch (SLURM) | IM (TOSCA) | Nextcloud, Dataverse |
| G-Core | CAS User/pwd & EGI Check-in | GCore + K8s | IM / EC3 | ElasticSearch |
| SAPS | EGI Check-in | K8s | IM / EC3 | OpenStack Swift |
| Scipion | EGI Check-in | Batch (SLURM) | IM / EC3 | Local + EGI DataHub |
| OpenEBench | Life Sciences AAI | WfExS + NextFlow | OpenNebula | Local + B2SHARE |
| LAGO | eduTEAMS + EGI Check-in | Batch (SLURM) | Local clusters + IM / EC3 | EGI DataHub ONEDATA |
| SDS-WAS | B2ACCESS | Batch (SLURM) | Local clusters | B2HANDLE / B2SAFE |
| UMSA | EGI Check-in & Life Sciences AAI | Batch (SLURM) in IM/EC3 (in Galaxy) | IM / EC3 | Local + S3 |
| MSWSS | EGI Check-in | Batch (SLURM) in EC3 (in Galaxy) | IM / EC3 | Local + Dataverse |
| O3AS | EGI Check-in | Batch (SLURM) & K8s cluster | | Local + WebDAV |

Table 2: The solutions used by the thematic services in the different domains (from [D4.3])
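
To compare the choices in Table 2 across services, the table can be transcribed into a small lookup structure. The following is only a minimal sketch: it covers just the AAI and Data Storage columns, the names `CHOICES` and `services_by` are hypothetical, and the spellings are normalised on the assumption that "EGI Check in" means EGI Check-in, "Datavers" means Dataverse, and "Life- science AAI" means Life Sciences AAI.

```python
from collections import defaultdict

# AAI and Data Storage columns transcribed from Table 2.
# Spellings are normalised (assumption): "EGI Check in" -> "EGI Check-in",
# "Datavers" -> "Dataverse", "Life- science AAI" -> "Life Sciences AAI".
CHOICES = {
    "WORSICA": {"aai": ["EGI Check-in"], "storage": ["Nextcloud", "Dataverse"]},
    "G-Core": {"aai": ["CAS User/pwd", "EGI Check-in"], "storage": ["ElasticSearch"]},
    "SAPS": {"aai": ["EGI Check-in"], "storage": ["OpenStack Swift"]},
    "Scipion": {"aai": ["EGI Check-in"], "storage": ["Local", "EGI DataHub"]},
    "OpenEBench": {"aai": ["Life Sciences AAI"], "storage": ["Local", "B2SHARE"]},
    "LAGO": {"aai": ["eduTEAMS", "EGI Check-in"], "storage": ["EGI DataHub ONEDATA"]},
    "SDS-WAS": {"aai": ["B2ACCESS"], "storage": ["B2HANDLE", "B2SAFE"]},
    "UMSA": {"aai": ["EGI Check-in", "Life Sciences AAI"], "storage": ["Local", "S3"]},
    "MSWSS": {"aai": ["EGI Check-in"], "storage": ["Local", "Dataverse"]},
    "O3AS": {"aai": ["EGI Check-in"], "storage": ["Local", "WebDAV"]},
}


def services_by(column):
    """Invert the table: map each solution in `column` to the services using it."""
    index = defaultdict(list)
    for service, row in CHOICES.items():
        for solution in row[column]:
            index[solution].append(service)
    return dict(index)


if __name__ == "__main__":
    # Print each AAI solution with the thematic services that rely on it.
    for solution, services in services_by("aai").items():
        print(f"{solution}: {', '.join(services)}")
```

Inverting the table this way makes shared dependencies visible: for example, `services_by("aai")["EGI Check-in"]` lists eight of the ten thematic services.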