smurf – Swedish Science Cloud

Upgrading the WEST-1 region of Swedish Science Cloud in September

In September we will begin upgrading the WEST-1 region, hosted by Chalmers e-Commons, to the latest version of OpenStack and adding new hardware.

This will improve the capacity and functionality of the WEST-1 region.

Unfortunately, it also means that the WEST-1 region will be down and unavailable for some time this fall, and that everything, including any data currently stored there, will be removed.

If you are currently using WEST-1 you must make sure to:

Back up your data.
Move your workloads and data from WEST-1 to either EAST-1 or NORTH-1.
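
As a rough illustration of the backup step, the sketch below builds (but does not run) the OpenStack command-line calls that snapshot one instance and download the snapshot to a local file. The server name, snapshot name, file name and the use of "WEST-1" as the value for --os-region-name are assumptions made for this example. It covers only the instance's root disk, not attached volumes or object storage, so treat it as a starting point rather than a complete procedure.

```python
import shlex


def backup_commands(server, snapshot_name, local_file, region="WEST-1"):
    """Build the OpenStack CLI commands that back up one instance.

    Nothing is executed here; the caller decides whether to run the
    commands. All names passed in are hypothetical examples.
    """
    return [
        # 1. Snapshot the instance into an image in the source region.
        ["openstack", "--os-region-name", region,
         "server", "image", "create", "--name", snapshot_name, server],
        # 2. Download that image to a local file; this file is the backup.
        ["openstack", "--os-region-name", region,
         "image", "save", "--file", local_file, snapshot_name],
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run by hand.
    for cmd in backup_commands("my-server", "my-server-backup",
                               "my-server-backup.img"):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to check the names and region before anything is changed in the cloud. If you are unsure whether this approach fits your setup, contact support.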

If you have any questions or if you need assistance, do not hesitate to contact support@cloud.snic.se and we will help you.

EAST-1 power failure (resolved)

At 00:57 CEST on Monday, May 29th, a power outage caused the cooling system at Ångström Laboratory to shut down, leading to a rapid increase in temperature within the compute hall. To prevent further temperature escalation and safeguard the equipment, all systems in the compute hall were forcibly powered off. The cooling system was restored at approximately 05:00.

Due to the elevated temperatures experienced during the outage, additional inspections are required to ensure the compute hall, compute, storage, and network hardware are functioning as expected. Currently, we have identified an issue with one of the two UPS units.

Throughout the day, we will provide regular updates regarding the progress of the recovery efforts and the status of the affected equipment. We are working diligently to resolve any issues and restore normal operations as soon as possible.

Update 2023-05-29 11:00

The compute hall is fully operational again. We are now working on restoring systems.

Shutdown of all systems on 2 February at 07:00 CET

The UPPMAX compute hall hosting EAST-1 will be partially shut down on 2 February between 07:00 and 11:00 CET while Akademiska Hus performs work on the cooling circuit. The shutdown has been planned to coincide with our February maintenance day. We will try to provide some level of access, but expect all compute capability to be unavailable until the work is completed.

If you have any questions, please contact us at support@uppmax.uu.se.

Best regards, UPPMAX

Serious vulnerability in pkexec (PwnKit, CVE-2021-4034)

pkexec, the program affected by the PwnKit vulnerability, is installed by default in most Linux distributions. There is no permanent fix yet, but there is a workaround: remove the SUID bit from the binary using chmod 0755 /usr/bin/pkexec, which makes it impossible to exploit this bug.

Pkexec is installed by default on all major Linux distributions.
Pkexec has been vulnerable since its creation in May 2009.
Any unprivileged local user can exploit this vulnerability to get full root privileges.
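
To verify that the workaround is in place on an instance, a small check like the following can be used. It is a minimal sketch that only inspects the file mode of /usr/bin/pkexec (the path from the workaround above); the function name has_setuid is made up for this example, and the script does not change anything on the system.

```python
import os
import stat


def has_setuid(path):
    """Return True if the file at `path` has the setuid bit set."""
    return bool(os.stat(path).st_mode & stat.S_ISUID)


if __name__ == "__main__":
    pkexec = "/usr/bin/pkexec"  # path taken from the workaround above
    if not os.path.exists(pkexec):
        print(f"{pkexec} not found; nothing to do")
    elif has_setuid(pkexec):
        print(f"{pkexec} is still setuid; run: chmod 0755 {pkexec}")
    else:
        print(f"{pkexec} is not setuid; the workaround is in place")
```

Note that a later package update may restore the setuid bit, so it can be worth re-running the check after upgrading.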

http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-4034

Issues with two hypervisor hosts in the NORTH-1 region

We are experiencing problems with two of the hypervisor nodes in the NORTH-1 region, and the instances running on these nodes are currently unavailable.
We are working to resolve this issue.

Serious vulnerability in sudo (CVE-2021-3156)

Make sure to install the latest security updates in your instances to fix a serious vulnerability in sudo (CVE-2021-3156) that lets any user run any command as root without entering a password.

In combination with other, less severe security exploits, this can in some cases be used to compromise your instances remotely.
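
One commonly cited heuristic for this vulnerability, described in the Qualys advisory, is to run sudoedit -s / as a non-root user: a vulnerable sudo answers with an error starting with "sudoedit:", while a patched one prints a message starting with "usage:". The sketch below classifies such output. The function name and the sample strings are illustrative assumptions, and the result is only a hint; installing your distribution's security updates remains the actual fix.

```python
def classify_sudoedit_output(output):
    """Classify text printed by `sudoedit -s /` run as a non-root user.

    Returns "patched", "vulnerable" or "unknown". This is a heuristic
    only and does not replace installing security updates.
    """
    lines = [line.strip().lower() for line in output.splitlines()]
    if any(line.startswith("usage:") for line in lines):
        return "patched"
    if any(line.startswith("sudoedit:") for line in lines):
        return "vulnerable"
    return "unknown"


if __name__ == "__main__":
    # Abbreviated sample outputs, used here for illustration only.
    samples = [
        "usage: sudoedit ...",
        "sudoedit: /: not a regular file",
        "",
    ]
    for sample in samples:
        print(repr(sample), "->", classify_sudoedit_output(sample))
```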

NORTH-1 will replace the HPC2N-region.

The HPC2N region will be removed on 8 February, so make sure that all instances and data are moved to either EAST-1 or WEST-1 before 7 February.

If you need assistance, please send a support ticket and let us know.

We will be replacing both storage and compute, and since the setup in the HPC2N region dates from the 2015 pilot cloud, we unfortunately could not do an in-place upgrade.

The new NORTH-1 region will soon be available. It will use 2.5 GHz AMD CPUs, and the boot disks will now use flash storage.

From Pilot to Production

As SNIC Science Cloud has gone from a pilot to a production resource, the pilot regions in the cloud will be replaced by new regions with production hardware.

The region at C3SE has already been replaced by the new WEST-1 region, running OpenStack Rocky on new hardware.

The other pilot cloud regions at UPPMAX and HPC2N will soon be replaced by the EAST-1 and NORTH-1 regions.

If you are starting new projects in the cloud, we suggest that you use the WEST-1 region until the other regions become available; otherwise you will soon have to migrate your workloads to the new regions.

The compute and storage of the HPC2N region are temporarily down

Update: The downtime will last until 22 April, exact time unknown.

The electrical work did not go as smoothly as planned, resulting in a cooling outage for the compute nodes and storage in the HPC2N region.

Maintenance with downtime in the HPC2N region.

There will be planned downtime in the HPC2N region on Monday the 20th of April between 06:00 and 12:00 and on Tuesday the 21st of April between 11:00 and 17:00, due to urgent electrical work. All running instances will be suspended before the outage and restarted afterwards.

The other regions will not be affected, so if you can, we suggest that you move your workloads to the new WEST-1 region, which is running a much more recent version of OpenStack on new hardware.