News

Virtual Research Environments for Clinical Metabolomics


PhenoMeNal is a 3-year EU Horizon 2020 project (2015-2018) that will develop a standardised e-infrastructure for analysing medical metabolic phenotype data. This comprises standards for data exchange, as well as pipelines, computational frameworks and resources for processing, analysing and mining the massive amounts of medical molecular phenotyping and genotyping data that will be generated as metabolomics applications now enter research and the clinic.

In the Spjuth research group we lead WP5, “Operation and maintenance of PhenoMeNal grid/cloud”. Our aim is to provide PhenoMeNal and researchers with the capability to spawn secure Virtual Research Environments (VREs) with easy access to scalable, interoperable data and tools for data analysis. These virtual environments should be able to run on most hardware architectures, ranging from single laptops and workstations to private and public cloud (IaaS) providers.

We use MANTL to set up and provide a microservice-oriented virtual infrastructure. In PhenoMeNal, all partners provide tools as Docker images that are automatically built, tested and pushed to DockerHub by a continuous integration system (Jenkins). Within MANTL we provide long-running services using Marathon, including the Jupyter and Galaxy workflow systems, which can orchestrate microservice-based pipelines using e.g. Chronos or Kubernetes.
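As a rough sketch of how a partner might package a tool as a Docker image, consider the following Dockerfile. The tool name and base image here are hypothetical illustrations, not taken from the actual PhenoMeNal tool catalogue:

```dockerfile
# Hypothetical example of containerising a metabolomics tool.
# "some-metabolomics-tool" is a placeholder, not a real package.
FROM ubuntu:16.04

# Install the (hypothetical) analysis tool and its dependencies
RUN apt-get update && \
    apt-get install -y some-metabolomics-tool && \
    rm -rf /var/lib/apt/lists/*

# Run the tool by default so the container acts as a command-line utility
ENTRYPOINT ["some-metabolomics-tool"]
```

A CI job such as Jenkins would then, in essence, run `docker build`, invoke the container on test data, and `docker push` the image to DockerHub once the tests pass.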


So far we have successfully provisioned the PhenoMeNal VRE on Google Cloud Platform, EBI Embassy Cloud (OpenStack) and SNIC Science Cloud (OpenStack). We are currently experimenting with Packer to speed up the provisioning of virtual machines within the VRE, and with Consul for federating multiple VREs. Another ongoing project is to use Apache Spark for distributed data analysis within the VRE.
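As an illustration of the Packer approach, a minimal template for pre-baking an OpenStack image could look roughly like this. The image names, flavour and provisioning steps are placeholders, not our actual configuration:

```json
{
  "builders": [{
    "type": "openstack",
    "image_name": "phenomenal-vre-node",
    "source_image_name": "ubuntu-14.04",
    "flavor": "m1.medium",
    "ssh_username": "ubuntu"
  }],
  "provisioners": [{
    "type": "shell",
    "inline": [
      "sudo apt-get update",
      "sudo apt-get install -y docker.io"
    ]
  }]
}
```

Baking Docker and other common dependencies into the image this way means that each new VM only needs a short boot-time configuration step instead of a full software install.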

Links:

http://www.farmbio.uu.se/forskning/researchgroups/pb/PhenoMeNal/

http://www.farmbio.uu.se/forskning/researchgroups/pb/Data-intensive/

http://phenomenal-h2020.eu/

Applied Cloud Computing Workshop (Spring 2016)

Overview:

Instructor: Salman Toor.
Level: Basic.

Location: SciLifeLab Gamma Level 6 (Pascal), Stockholm.

Visiting address: Science for Life Laboratory, Tomtebodavägen 23A, 17165 Solna, Sweden.

Infrastructure: SNIC Science Cloud (OpenStack based Community Cloud).

Date & duration: 30th March (10:00 – 16:00).

Audience: Users and potential users of SNIC Science Cloud resources with no previous cloud experience.


Registration:

Registration has closed.


Topics:

  • Brief overview of cloud computing.
  • Cloud offerings: compute, storage and network as a service (*aaS).
  • Brief descriptions of IaaS, PaaS, SaaS etc.
  • How to access cloud resources.
  • Introduction to the SNIC Science Cloud initiative.

Hands-on session topics:

1 – How the Horizon dashboard works
2 – How to start a virtual machine (VM)
3 – Instance snapshots
4 – Access to cloud storage (volumes and object store)
5 – Storage snapshots
6 – Network information
7 – Basic system interaction with APIs
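For orientation, the VM and storage exercises above map onto the classic OpenStack command-line clients roughly as follows. All names, flavours and IDs are placeholders; the workshop itself uses the Horizon dashboard and the lab document for the exact steps:

```shell
# Placeholders throughout; requires sourced OpenStack project credentials.
# Boot a VM from an image
nova boot --image ubuntu-14.04 --flavor m1.small --key-name mykey myvm

# Take an instance snapshot
nova image-create myvm myvm-snapshot

# Create a 10 GB volume, attach it, and snapshot it
cinder create --display-name myvolume 10
nova volume-attach myvm <volume-id>
cinder snapshot-create myvolume

# List available networks
neutron net-list
```

The same operations are available through the Horizon dashboard; the CLI is useful once you want to script or automate your workflow.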

Lab-Document


Schedule:

First half (10:15 – 12:00): Lectures
Second half (13:00 – 15:00): Lab session



Plans for first half of 2016

Part of the SNIC Cloud Team 2016. From left: Lars Viklund (HPC2N), Daniel Nilsson (C3SE), Andreas Hellander (UU), Salman Toor (UU, UPPMAX), Pontus Freyhult (UPPMAX) and Mathias Lindberg (C3SE). Missing from picture: Ingemar Fällman (HPC2N) and Henric Zazzi (PDC).

Last week we held our first all-hands meeting for 2016. Many of us were able to meet at HPC2N in Umeå for almost two days of brainstorming and technical work. Since we now have a functioning (but not yet production-grade) IaaS cloud up and running, serving approximately 40 projects and 110 users, the focus of this meeting was on monitoring (to increase stability), metering and accounting. Like all SNIC-supported projects, we rely on the SUPR system for managing projects and users, but we haven’t yet developed a custom entry point for the cloud resources (we have been using “UPPMAX Small” templates, for those of you who know what that is). During the meeting, we completed a draft of the SUPR/SAMS workflows for cloud projects, in collaboration with representatives of the SAMS team. This is now to be handed off to those teams for feedback, and hopefully quick implementation.

Some other highlights from our all-hands meeting:

  • We decided to host 3 training workshops this semester targeted at users new to the cloud resources, tentatively at KI (end of February), Chalmers (late March) and Umeå University (May). We will then follow this up with a more advanced workshop, covering more advanced concepts and tools, in Uppsala early next semester.
  • We are in good shape to start accepting more users, now that we have two regions online. If you are interested, go ahead and make a project request.
  • We spent a lot of time discussing the incentive for users to make sensible use of the IaaS resources when developing applications. We will implement some form of pay-as-you-go model to promote dynamic use of resources. More information will follow.
  • We are planning to harden the systems, so as a user you will see an increasingly stable system over the next couple of months. One step in that direction will be taken during the next large service window in the UPPMAX region, Feb 15 – Feb 29.
  • A third region at C3SE is well on its way.

SNIC Science Cloud – A Community Cloud and a Community Effort

We are happy to announce SNIC Science Cloud (SSC), a community cloud offering Infrastructure as a Service (IaaS), and, in the near future, selected Platform as a Service (PaaS) offerings free of charge to individual researchers at Swedish universities. We will start taking on more users during the next couple of months so let us know if you have a need for cloud computing infrastructure.

Open source. We are building SSC on the open source OpenStack cloud suite. Currently, we are hardening the system for sustained production. We are also scaling it to multiple regions with participation from the HPC centres UPPMAX (Uppsala), C3SE (Göteborg), HPC2N (Umeå) and PDC (Stockholm) to ensure that we can meet an increasing demand.

A community effort. Our goal is to provide a modern, flexible and open infrastructure that complements existing HPC resources. We strive for a community effort that evolves with and for researchers. We would love to hear from potential users regarding the needs for platform level services, such as Apache Hadoop/Spark, Kubernetes or other toolchains so that we focus efforts where they are most needed. What large datasets would you like to process?

Transparency to help others follow. In taking on the challenge of deploying and operating an OpenStack community cloud on a national scale, over several hundred servers and many thousands of physical cores, we hope to lead the way for other institutions that are considering similar initiatives. This is why we aim for transparency with architecture planning, with operational practices (e.g. sharing code for testing and evaluation), and with data regarding usage patterns.

Open science, open data. With SSC we hope to take a leap towards an infrastructure for open science and open data, with cloud technology facilitating shareability and reproducibility of complex and computationally demanding experiments. We aim at making computations and data analysis more accessible for research communities with little previous experience of advanced and large scale computing resources. We are always interested in discussing these issues and in sharing and sharpening our vision.

You can help. Finally, there is a lot of work to do! If you are involved with academia in Sweden and you are an OpenStack operator, have experience of e.g. software stacks for large scale data processing, microservices orchestration, automation, or if you belong to a community that is using some specific SaaS that you would like to provide for research groups, we want your help!