ARC Systems

At the centre of the ARC service are two high performance compute clusters - arc and htc.

arc is designed for multi-node parallel computation
htc is designed for high-thoughput operation (lower core count jobs).

htc is also a more heterogeneous system offering different types of resources, such as GPGPU computing and high memory systems; nodes on arc are uniform. Users get access to both both clusters automatically as part of the process of obtaining an account with ARC, and can use either or both.

For more detailled information on the hardware specifications of these clusters, see the tables below:

Cluster	Description	Login Node	Compute Nodes	Minimum Job Size	Notes:
arc	Our largest compute cluster. Optimised for large parallel jobs spanning multiple nodes. Scheduler prefers large jobs. Offers low-latency interconnect (Mellanox HDR 100).	arc-login	CPU: 48 core Cascade Lake (Intel Xeon Platinum 8268 CPU @ 2.90GHz) Memory: 384 GB or CPU: 288 core Zen 5c/Turin (AMD EPYC 9825 @ 2.2GHz) Memory: 2.3 TB	1 core	Non-blocking island size is 2212 cores
htc	Optimised for single core jobs, and SMP jobs up to one node in size. Scheduler prefers small jobs. Also catering for jobs requiring resources other than CPU cores (e.g. GPUs).	htc-login	CPUs: mix of Broadwell, Haswell, Cacade Lake, Sapphire/Emerald Rapids, AMD Genoa/Rome GPU: P100, V100, A100, RTX, H100, L40s	1 core	Jobs will only be scheduled onto a GPU node if requesting a GPU resource.

Operating system

The ARC systems use the Linux Operating System (specifically AlmaLinux 9) which is commonly used in HPC. We do not have any HPC systems running Windows (or MacOS). If you are unfamiliar with using Linux, please consider:

Finding introduction to Linux resources online (through Google/Bing/Yahoo etc).
Working through our brief Introduction to Linux course.
Attending our Introduction to ARC training course (this does not teach you how to use Linux but the examples will help you gain a greater understanding).

Capability cluster (arc)

The capability system - cluster name arc - has a total of 262× 48 core and 10× 288 core worker nodes, some of which are co-investment hardware. These machines are available for general use, but may be subject to job time limits and/or may occasionally be reserved for exclusive use of the entity that purchased them.

The ARC system offers a total of 15,456 CPU cores.

ARC nodes consist of 2 difference type of node:

262 worker nodes with:
- 2x Intel Platinum 8628 CPU. The Platinum 8628 is a 24 core 2.90 GHz Cascade Lake CPU. Thus all nodes have 48 CPU cores per node.
- 384 GB memory
- HDR 100 infiniband interconnect. The fabric has a 4:1 blocking factor with non-blocking islands of 44 nodes (2112 cores).
10 worker nodes with:
- 2x AMD EPYC 9825 (Zen5c/Turin) CPU (144 cores @ 2.20 GHz per CPU)
- 2.3 TB of DDR5 ECC Registered memory (equivalent to 8GB per core)
- NDR 400 Infiniband interconnect

The cluster runs AlmaLinux 9.4 OS and is scheduled by SLURM.

Login node for the system is ‘arc-login.arc.ox.ac.uk’, which allows logins from the University network range (including VPN).

The generally available partitions are:

Partition	Nodes / cores	Nodes	Default run time	Maximum run time
short	267 / 15,216	arc-c[046-293] arc-c[299-301] arc-c[306-311] arc-c[312-321]	1 hour	12 hours
medium	261 / 14,928	arc-c[046-293] arc-c[299-301] arc-c[312-321]	12 hours	2 days
long	261 / 14,928	arc-c[046-287] arc-c[299-301] arc-c[312-321]	1 day	unlimited
legacy	4 / 192	arc-c[302-305]		10 minutes
devel	2 / 96	arc-c[294-295]		10 minutes
interactive	3 / 144	arc-c[296-298]	1 hour	4 hours

Throughput cluster (htc)

The throughput system - cluster name htc - currently 124 worker nodes, some of which are co-investment hardware. These machines are available for general use, but may be subject to job time limits and/or may occasionally be reserved for exclusive use of the entity that purchased them. The hardware on the HTC system is more heterogeneous than on the ARC system.

51 of the nodes are GPGPU nodes. More information on how to access GPU nodes is available.

OS is AlmaLinux 9.4. Scheduler is SLURM.

Login node for the system is htc-login.arc.ox.ac.uk, which allows logins from the University network range (including VPN).

Details on the partitions are:

Partition	Nodes / cores, GPUs	Nodes	Default run time	Maximum run time
short	124 / 7,872 - 76x V100 - 16x A100 - 24x RTX8000 - 12x RTXA6000 - 20x P100 - 52x Titan RTX - 92x L40s	htc-c[005-046,048-073] htc-g[009-018] htc-g[032-035,037-38] htc-g[041-043,050-055] htc-g[058-060] htc-g[061-084]	1 hour	12 hours
medium	101 / 6,888 - 48x V100 - 16x A100 - 24x RTX8000 - 92x L40s	htc-c[006-046,048-073] htc-g[009-018] htc-g[061-084]	12 hours	2 days
long	101 / 6,888 - 48x V100 - 16x A100 - 24x RTX8000 - 92x L40s	htc-c[006-046,048-073] htc-g[009-018] htc-g[061-084]	1 day	unlimited
devel	2 / 80 - 16x V100	htc-g[046-047]		10 minutes
interactive	2 / 80 - 16x V100	htc-g[048-049]	1 hour	4 hours

Node CPU details are:

Nodes	CPU	Cores per node	memory per node	interconnect
htc-c[005-006]	Intel Platinum 8628 (Cascade Lake), 2.90GHz	96	3TB	HDR100
htc-c[007-046]	Intel Platinum 8628 (Cascade Lake), 2.90GHz	48	384GB
htc-c[048-049]	AMD EPYC 9634 (Genoa), 2.25GHz	168	2.3TB
htc-c[050-055]	AMD EPYC 9634 (Genoa), 2.25GHz	168	1.5TB
htc-c[056-075]	AMD EPYC 9634 (Genoa), 2.25GHz	84	1.1TB
htc-g[009-018]	Intel Platinum 8628 (Cascade Lake), 2.90GHz	48	384GB	HDR100
htc-g019	AMD Epyc 7452 (Rome), 2.35GHz	64	1TB
htc-g[032-040]	Intel Gold 5120 (Skylake), 2.20GHz	28	384GB
htc-g[041-043]	Intel Silver 4112 (Skylake), 2.60GHz	8	192GB
htc-g[045-049]	Intel E5-2698 v4 (Broadwell), 2.20GHz	40	512GB
htc-g[050-052]	Intel Silver 4208 (Cascade Lake), 2.10GHz	16	128GB	HDR100
htc-g[053-055]	Intel Gold 6342 (Ice Lake), 2.80GHz	16	500GB	HDR100
htc-g056	Intel Gold 6342 (Ice Lake), 2.80GHz	48	512GB	HDR100
htc-g057	NVidia Grace Hopper AArch64 3.5GHz	72	580GB
htc-g058	Intel Gold 5418Y (Sapphire Rapids), 2.0GHz	48	1.5TB
htc-g[059-060]	Intel Platinum 8468 (Sapphire Rapids), 2.1GHz	96	1TB	HDR100
htc-g[061-084]	Intel Gold 6548N (Emerald Rapids), 2.8GHz	64	500GB	NDR400

GPU Resources

ARC has a number of GPU nodes in the “htc” cluster.

Node GPU details are:

Nodes	GPUs	#GPUs	GPU memory	ECC	CUDA cores	Compute capability	nvlink
htc-g[009-014]	RTX8000	4	40GB	yes	4608	7.5	no
htc-g[015-019]	A100	4	40GB	yes	6912	8.0	no
htc-g[032-034]	P100	4	16GB	yes	3584	6.0	no
htc-g[035]	V100	4	16GB	yes	5120	7.0	no
htc-g[037-038]	V100	4	32GB	yes	5120	7.0	yes
htc-g[041-043]	Titan RTX	4	24GB	yes	4606	7.5	pairwise
htc-g[045-049]	V100-LS	8	32GB	yes	5120	7.0	yes
htc-g[050-052]	RTXA6000	4	48GB	yes	10,752	8.6	yes
htc-g[053-055]	H100	4	82GB	yes	10,752	9.0	no
htc-g056	MI210	4	64GB	yes		gfx90a
htc-g057	GH200	1	96GB	yes	10,752	12.6	no
htc-g058	H100	4	96GB	yes	10,752	9.0	yes
htc-g[059-060]	H100	8	80GB	yes	10,752	9.0	yes
htc-g[061-084]	L40S	4	46GB	yes	18,176	8.9	no

Memory

On the HTC cluster, there are several generally available high memory nodes:

2 nodes with 96 CPU cores & 3 TB memory
4 nodes with 168 CPU cores & 2.2 TB memory
4 nodes with 168 CPU cores & 1.5 TB memory
18 nodes with 84 CPU cores & 1.1 TB memory

You can use the high-memory nodes by adding a value between 400G and 3000G in the --mem option in your submission script, e.g.:

#SBATCH --mem=1500G

to request 1.5 TB

Storage

Our clusters systems share 2 PB of high-performance OnTAP filesystem for project data storage, as well as 1 PB of ultra high performance/low latency Weka filesystem for shared scratch storage.

Project data storage is mounted via NFS on all nodes. On nodes with NDR/HDR interconnect, the scratch filesystem uses that fabric instead.

For more information about the storage, please refer to the ARC Storage page.

Software

Users may find the application they are interested in running is already been installed on at least one of the systems. Users are welcome to request the installation of new applications and libraries or updates to already installed applications via our software request form.

For more information, please refer to our ARC Software Guide.