ARC Systems

At the centre of the ARC service are two high performance compute clusters - arc and htc.

  • arc is designed for multi-node parallel computation

  • htc is designed for high-throughput operation (lower core count jobs).

htc is also a more heterogeneous system, offering different types of resources such as GPGPU computing and high-memory systems; nodes on arc are uniform. Users get access to both clusters automatically when they obtain an ARC account, and can use either or both.

For more detailed information on the hardware specifications of these clusters, see the tables below:

| Cluster | Description | Login node | Compute nodes | Minimum job size | Notes |
| --- | --- | --- | --- | --- | --- |
| arc | Our largest compute cluster. Optimised for large parallel jobs spanning multiple nodes. Scheduler prefers large jobs. Offers low-latency interconnect (Mellanox HDR 100). | arc-login | CPU: 48 core Cascade Lake (Intel Xeon Platinum 8268 CPU @ 2.90GHz), memory: 384 GB; or CPU: 288 core Zen 5c/Turin (AMD EPYC 9825 @ 2.2GHz), memory: 2.3 TB | 1 core | Non-blocking island size is 2112 cores |
| htc | Optimised for single core jobs, and SMP jobs up to one node in size. Scheduler prefers small jobs. Also caters for jobs requiring resources other than CPU cores (e.g. GPUs). | htc-login | CPUs: mix of Broadwell, Haswell, Cascade Lake, Sapphire/Emerald Rapids, AMD Genoa/Rome; GPUs: P100, V100, A100, RTX, H100, L40s | 1 core | Jobs will only be scheduled onto a GPU node if they request a GPU resource. |

Operating system

The ARC systems use the Linux operating system (specifically AlmaLinux 9), which is commonly used in HPC. We do not have any HPC systems running Windows or macOS. If you are unfamiliar with using Linux, please consider:

  • Finding introduction to Linux resources online (through Google/Bing/Yahoo etc).

  • Working through our brief Introduction to Linux course.

  • Attending our Introduction to ARC training course (this does not teach you how to use Linux but the examples will help you gain a greater understanding).

Capability cluster (arc)

The capability system - cluster name arc - has a total of 262× 48 core and 10× 288 core worker nodes, some of which are co-investment hardware. These machines are available for general use, but may be subject to job time limits and/or may occasionally be reserved for exclusive use of the entity that purchased them.

The ARC system offers a total of 15,456 CPU cores.

The arc cluster consists of two different types of node:

  • 262 worker nodes with:

    • 2x Intel Platinum 8268 CPU. The Platinum 8268 is a 24 core 2.90 GHz Cascade Lake CPU, so each of these nodes has 48 CPU cores.

    • 384 GB memory

    • HDR 100 InfiniBand interconnect. The fabric has a 4:1 blocking factor with non-blocking islands of 44 nodes (2112 cores).

  • 10 worker nodes with:

    • 2x AMD EPYC 9825 (Zen5c/Turin) CPU (144 cores @ 2.20 GHz per CPU)

    • 2.3 TB of DDR5 ECC Registered memory (equivalent to 8 GB per core)

    • NDR 400 InfiniBand interconnect
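The core and memory figures quoted for the arc cluster can be cross-checked with a few lines of shell arithmetic. This is only a sanity check of the numbers stated above; the variable names are illustrative and not part of any ARC tooling.

```shell
#!/bin/bash
# Figures copied from the node list above.
intel_nodes=262
intel_cores_per_node=48      # 2 CPUs x 24 cores
amd_nodes=10
amd_cores_per_node=288       # 2 CPUs x 144 cores
island_nodes=44              # nodes in one non-blocking island

# Total cores across both node types.
total_cores=$(( intel_nodes * intel_cores_per_node + amd_nodes * amd_cores_per_node ))
# Cores reachable without crossing the blocking part of the fabric.
island_cores=$(( island_nodes * intel_cores_per_node ))
# Memory of a 288-core node at 8 GB per core.
amd_mem_gb=$(( amd_cores_per_node * 8 ))

echo "Total CPU cores:            ${total_cores}"
echo "Cores per island:           ${island_cores}"
echo "Memory per 288-core node:   ${amd_mem_gb} GB"
```

Run as written, the script reports 15456 total cores, 2112 cores per island and 2304 GB (about 2.3 TB) per 288-core node, matching the figures in the text.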

The cluster runs AlmaLinux 9.4 OS and is scheduled by SLURM.

The login node for the system is arc-login.arc.ox.ac.uk, which allows logins from the University network range (including VPN).

The generally available partitions are:

| Partition | Nodes / cores | Nodes | Run time (default / maximum) |
| --- | --- | --- | --- |
| short | 267 / 15,216 | arc-c[046-293], arc-c[299-301], arc-c[306-311], arc-c[312-321] | 1 hour / 12 hours |
| medium | 261 / 14,928 | arc-c[046-293], arc-c[299-301], arc-c[312-321] | 12 hours / 2 days |
| long | 261 / 14,928 | arc-c[046-287], arc-c[299-301], arc-c[312-321] | 1 day / unlimited |
| legacy | 4 / 192 | arc-c[302-305] | 10 minutes |
| devel | 2 / 96 | arc-c[294-295] | 10 minutes |
| interactive | 3 / 144 | arc-c[296-298] | 1 hour / 4 hours |
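As an illustration of how the partition limits translate into a submission script, the following is a minimal sketch of a two-node job on the short partition. The `#SBATCH` options are standard SLURM; the job name, the two-hour time request and the program `./my_mpi_program` are hypothetical, and any further site-specific options ARC may require are not shown.

```shell
#!/bin/bash
#SBATCH --job-name=arc_example     # hypothetical job name
#SBATCH --partition=short          # default 1 hour, maximum 12 hours
#SBATCH --nodes=2                  # arc is intended for multi-node jobs
#SBATCH --ntasks-per-node=48       # one task per core on a 48-core node
#SBATCH --time=02:00:00            # must stay within the partition maximum

# SLURM sets these variables inside a job; the values after ":-" are
# fallbacks so the script can be dry-run outside the scheduler.
nodes=${SLURM_JOB_NUM_NODES:-2}
tasks_per_node=${SLURM_NTASKS_PER_NODE:-48}
total_tasks=$(( nodes * tasks_per_node ))
echo "Running ${total_tasks} tasks on ${nodes} nodes"

# Launch the (hypothetical) MPI program only when running under SLURM.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    srun ./my_mpi_program
fi
```

Such a script would typically be submitted from the login node with `sbatch`. A job that does not set `--time` receives the partition's default run time, so a longer job needs an explicit request.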

Throughput cluster (htc)

The throughput system - cluster name htc - currently has 124 worker nodes, some of which are co-investment hardware. These machines are available for general use, but may be subject to job time limits and/or may occasionally be reserved for exclusive use of the entity that purchased them. The hardware on the HTC system is more heterogeneous than on the ARC system.

51 of the nodes are GPGPU nodes. More information on how to access GPU nodes is available.

The cluster runs AlmaLinux 9.4 OS and is scheduled by SLURM.

The login node for the system is htc-login.arc.ox.ac.uk, which allows logins from the University network range (including VPN).

Details on the partitions are:

| Partition | Nodes / cores, GPUs | Nodes | Run time (default / maximum) |
| --- | --- | --- | --- |
| short | 124 / 7,872; 76x V100, 16x A100, 24x RTX8000, 12x RTXA6000, 20x P100, 52x Titan RTX, 92x L40s | htc-c[005-046,048-073], htc-g[009-018], htc-g[032-035,037-038], htc-g[041-043,050-055], htc-g[058-060], htc-g[061-084] | 1 hour / 12 hours |
| medium | 101 / 6,888; 48x V100, 16x A100, 24x RTX8000, 92x L40s | htc-c[006-046,048-073], htc-g[009-018], htc-g[061-084] | 12 hours / 2 days |
| long | 101 / 6,888; 48x V100, 16x A100, 24x RTX8000, 92x L40s | htc-c[006-046,048-073], htc-g[009-018], htc-g[061-084] | 1 day / unlimited |
| devel | 2 / 80; 16x V100 | htc-g[046-047] | 10 minutes |
| interactive | 2 / 80; 16x V100 | htc-g[048-049] | 1 hour / 4 hours |

Node CPU details are:

| Nodes | CPU | Cores per node | Memory per node | Interconnect |
| --- | --- | --- | --- | --- |
| htc-c[005-006] | Intel Platinum 8268 (Cascade Lake), 2.90GHz | 96 | 3TB | HDR100 |
| htc-c[007-046] | Intel Platinum 8268 (Cascade Lake), 2.90GHz | 48 | 384GB | |
| htc-c[048-049] | AMD EPYC 9634 (Genoa), 2.25GHz | 168 | 2.3TB | |
| htc-c[050-055] | AMD EPYC 9634 (Genoa), 2.25GHz | 168 | 1.5TB | |
| htc-c[056-075] | AMD EPYC 9634 (Genoa), 2.25GHz | 84 | 1.1TB | |
| htc-g[009-018] | Intel Platinum 8268 (Cascade Lake), 2.90GHz | 48 | 384GB | HDR100 |
| htc-g019 | AMD EPYC 7452 (Rome), 2.35GHz | 64 | 1TB | |
| htc-g[032-040] | Intel Gold 5120 (Skylake), 2.20GHz | 28 | 384GB | |
| htc-g[041-043] | Intel Silver 4112 (Skylake), 2.60GHz | 8 | 192GB | |
| htc-g[045-049] | Intel E5-2698 v4 (Broadwell), 2.20GHz | 40 | 512GB | |
| htc-g[050-052] | Intel Silver 4208 (Cascade Lake), 2.10GHz | 16 | 128GB | HDR100 |
| htc-g[053-055] | Intel Gold 6342 (Ice Lake), 2.80GHz | 16 | 500GB | HDR100 |
| htc-g056 | Intel Gold 6342 (Ice Lake), 2.80GHz | 48 | 512GB | HDR100 |
| htc-g057 | NVIDIA Grace Hopper AArch64 3.5GHz | 72 | 580GB | |
| htc-g058 | Intel Gold 5418Y (Sapphire Rapids), 2.0GHz | 48 | 1.5TB | |
| htc-g[059-060] | Intel Platinum 8468 (Sapphire Rapids), 2.1GHz | 96 | 1TB | HDR100 |
| htc-g[061-084] | Intel Gold 6548N (Emerald Rapids), 2.8GHz | 64 | 500GB | NDR400 |

GPU Resources

ARC has a number of GPU nodes in the “htc” cluster.

Node GPU details are:

| Nodes | GPU model | GPUs per node | GPU memory | ECC | CUDA cores | CUDA compute capability | NVLink |
| --- | --- | --- | --- | --- | --- | --- | --- |
| htc-g[009-014] | RTX8000 | 4 | 40GB | yes | 4608 | 7.5 | no |
| htc-g[015-019] | A100 | 4 | 40GB | yes | 6912 | 8.0 | no |
| htc-g[032-034] | P100 | 4 | 16GB | yes | 3584 | 6.0 | no |
| htc-g035 | V100 | 4 | 16GB | yes | 5120 | 7.0 | no |
| htc-g[037-038] | V100 | 4 | 32GB | yes | 5120 | 7.0 | yes |
| htc-g[041-043] | Titan RTX | 4 | 24GB | yes | 4608 | 7.5 | pairwise |
| htc-g[045-049] | V100-LS | 8 | 32GB | yes | 5120 | 7.0 | yes |
| htc-g[050-052] | RTXA6000 | 4 | 48GB | yes | 10,752 | 8.6 | yes |
| htc-g[053-055] | H100 | 4 | 82GB | yes | 10,752 | 9.0 | no |
| htc-g056 | MI210 | 4 | 64GB | yes | | | |
| htc-g057 | GH200 | 1 | 96GB | yes | 10,752 | 9.0 | no |
| htc-g058 | H100 | 4 | 96GB | yes | 10,752 | 9.0 | yes |
| htc-g[059-060] | H100 | 8 | 80GB | yes | 10,752 | 9.0 | yes |
| htc-g[061-084] | L40S | 4 | 46GB | yes | 18,176 | 8.9 | no |
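Jobs are only scheduled onto a GPU node if they request a GPU resource. The sketch below shows one way to do that using the generic SLURM `--gres` option. It assumes the plain `gpu:1` form (one GPU of any type) is accepted on htc; requests for a specific GPU model use site-defined names and are not shown here.

```shell
#!/bin/bash
#SBATCH --partition=short          # default 1 hour, maximum 12 hours
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --gres=gpu:1               # generic SLURM request for one GPU
#SBATCH --time=01:00:00

# Inside a GPU job SLURM normally exports CUDA_VISIBLE_DEVICES with the
# indices of the allocated GPUs; "none" is a fallback for dry runs.
gpus=${CUDA_VISIBLE_DEVICES:-none}
echo "GPUs visible to this job: ${gpus}"

# Report the GPU model and memory when the NVIDIA tools are available.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,memory.total --format=csv
fi
```

Checking the reported model against the table above is a quick way to confirm which node type the job landed on.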

Memory

On the HTC cluster, there are several generally available high memory nodes:

  • 2 nodes with 96 CPU cores & 3 TB memory

  • 4 nodes with 168 CPU cores & 2.2 TB memory

  • 4 nodes with 168 CPU cores & 1.5 TB memory

  • 18 nodes with 84 CPU cores & 1.1 TB memory

You can use the high-memory nodes by setting the --mem option in your submission script to a value between 400G and 3000G, e.g.:

#SBATCH --mem=1500G

to request 1.5 TB.
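For context, a complete single-node submission script built around that directive might look like the sketch below. The partition, task count and time request are illustrative choices rather than ARC recommendations, and the script assumes SLURM reports `SLURM_MEM_PER_NODE` in megabytes, which is its usual behaviour.

```shell
#!/bin/bash
#SBATCH --partition=short
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --mem=1500G                # 1.5 TB, inside the 400G-3000G range
#SBATCH --time=01:00:00

# SLURM_MEM_PER_NODE mirrors the --mem request (normally in megabytes).
# The fallback of 1536000 MB (1500 x 1024) allows a dry run outside SLURM.
mem_mb=${SLURM_MEM_PER_NODE:-1536000}
mem_gb=$(( mem_mb / 1024 ))
echo "Memory requested for this job: ${mem_gb} GB"
```

Requests below 400G fit on the standard nodes, so only larger values steer a job to the high-memory nodes.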

Storage

Our cluster systems share 2 PB of high-performance OnTAP filesystem for project data storage, as well as 1 PB of ultra-high-performance, low-latency Weka filesystem for shared scratch storage.

Project data storage is mounted via NFS on all nodes. On nodes with NDR/HDR interconnect, the scratch filesystem uses that fabric instead.

For more information about the storage, please refer to the ARC Storage page.

Software

Users may find that the application they are interested in running has already been installed on at least one of the systems. Users are welcome to request the installation of new applications and libraries, or updates to already installed applications, via our software request form.

For more information, please refer to our ARC Software Guide.