AI upgrade for CSIRO’s computing cluster

Joseph Brookes
Senior Reporter

The national science agency will replace its current GPU cluster next year with a new high-performance accelerator computing system better suited to machine learning and artificial intelligence workloads across its various research areas.

The CSIRO on Friday went to market for the solution, offering up to $14.5 million over five years, and wants the cluster up and running on its technical floor space at Canberra Data Centres by September next year.

The contract is for the supply, complete installation, maintenance, and ongoing support of the hardware and software for what will be known as the Advanced Scientific Accelerator Cluster (ASAC).

The CSIRO is seeking to minimise risk by selecting a vendor with a “requisite track record” and financial viability.

The ASAC will replace the existing functionality of the current GPU cluster known as Bracewell. Bracewell is a homogeneous cluster consisting of 114 Dell PowerEdge C4130 servers with dual sockets, quad P100 GPUs, 256GB of memory and 960GB local disk with EDR interconnect.

The current solution launched in 2017 and was named after Ronald N Bracewell, an Australian astronomer and engineer who worked in the CSIRO Radiophysics Laboratory during World War II.

The CSIRO partnered with Dell on Bracewell with a much smaller budget of $4 million, touting the computer system as a deep learning and artificial intelligence capability. It was used by Data61 to develop a bionic vision solution.

The replacement ASAC will also support various scientific workloads and CSIRO’s large research database in areas like astronomy, manufacturing, climate sciences, health and biosecurity.

The ASAC will need to be interoperable with CSIRO’s current CPU cluster, Petrichor, and support research applications including PyTorch, TensorFlow, JAX, COMSOL, Open MPI, RELION, VirtualGL, MATLAB, NAMD, Qiskit and OpenFOAM.

Tenderers will need to provide information on the total number and mix of nodes to be supplied, node configuration, hardware types and performance measures, as well as support for existing CSIRO systems.

The tender closes on December 1, with a contract expected to be signed by mid-February 2023. The build will be completed, tested and accepted by September 2023.

The contractor will need to follow Canberra Data Centres’ health, safety and environment protocols. Canberra Data Centres was among the first cohort of certified providers under the federal government’s Hosting Certification Framework (HCF) last year.

The data sovereignty and security scheme applies to all service providers that deliver hosting services for Australian Government customers, including the facilities that host government data, their systems and supply chains.
