NORTH-1 will replace the HPC2N region

The HPC2N region will be removed on 8/2, so make sure that all instances and data are moved to either EAST-1 or WEST-1 before 7/2.

If you need assistance, please submit a support ticket.

We will be replacing both storage and compute. Since the setup in the HPC2N region dates from the pilot cloud of 2015, we unfortunately could not do an in-place upgrade.

The new NORTH-1 region will soon be available. It will use AMD 2.5 GHz CPUs, and the boot disks will now use flash storage.

Proteomics analysis using cloud infrastructure

Proteomics is the study of the global protein expression of cells and tissues. In proteomics, measurements are often carried out using mass spectrometers, and the resulting data is both complex and large in volume. Proteins are complex macromolecules consisting of hundreds or thousands of amino acids drawn from 20 types. Each amino acid can also undergo modifications; as a result, an estimated 1 million different protein types exist in complex organisms such as humans, and their abundance varies over 7 orders of magnitude.

Computational proteomics aims at generating interpretable information from the thousands of mass spectra produced each hour. In general, the computational workflows need to be adapted to new data acquisition strategies, and sometimes even to individual projects. To accommodate this, typical workflows consist of many tools produced by research groups, consortia or companies. Below, we describe the technology stack we use to provide stable workflows to both experienced and novice users while remaining flexible enough to accommodate special analysis cases.

All produced data, both measured and derived, is ingested into a data manager, openBIS (Bauch et al. 2011), and is ultimately stored on Swestore. Workflows can automatically stage data on the computation infrastructure in use. GC3PIE is used to manage the workflow and to interact with the computational resources as follows: a user submits a new workflow, the GC3PIE head node downloads the data and creates cloud workers, and the workers then execute the various tools that constitute the workflow. The final result data is registered in the data manager in relation to the input data. The results consist of both result data and interactive reports.
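
The flow described above can be illustrated with a small, self-contained sketch. It does not use the openBIS or GC3PIE APIs; the classes DataManager and Dataset, the function run_workflow, the tool functions and the file name are all hypothetical stand-ins chosen for illustration. It only shows the order of the steps: download the input, run the tools in sequence, and register the result in relation to the input.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Dataset:
    """A data set in the data manager (hypothetical model)."""
    name: str
    parent: Optional[str] = None  # name of the input it was derived from


@dataclass
class DataManager:
    """In-memory stand-in for the data manager; not the openBIS API."""
    datasets: Dict[str, Dataset] = field(default_factory=dict)

    def register(self, dataset: Dataset) -> None:
        self.datasets[dataset.name] = dataset

    def download(self, name: str) -> Dataset:
        return self.datasets[name]


def run_workflow(manager: DataManager, input_name: str,
                 tools: List[Callable[[str], str]]) -> Dataset:
    """Head-node logic: stage the input, run each tool, register the result."""
    staged = manager.download(input_name)  # head node downloads the data
    current = staged.name
    for tool in tools:                     # each tool would run on a cloud worker
        current = tool(current)
    result = Dataset(name=current, parent=staged.name)
    manager.register(result)               # result is linked to its input
    return result


# Hypothetical tools: each maps an input name to an output name.
def search(name: str) -> str:
    return name + ".identified"


def quantify(name: str) -> str:
    return name + ".quantified"


manager = DataManager()
manager.register(Dataset("run42.mzML"))
result = run_workflow(manager, "run42.mzML", [search, quantify])
print(result.name, "<-", result.parent)
```

Keeping the parent name on each registered result is what allows derived data and reports to be traced back to the measurement they came from.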

Johan Malmström (Lund University) and Lars Malmström (ETH Zurich)

Dedicated support channel

As part of our efforts to move towards a production-grade setup, both for the infrastructure and the surrounding administration, we have now set up a dedicated support email address:

support@cloud.snic.se

Please direct your support requests there so that they are seen by all members of the cloud team.