AMAX ClusterMax® SuperG: high-powered HPC cluster driven by the NVIDIA® A100 Tensor Core GPU.

ClusterMax® SuperG | NVIDIA® A100 GPU Cluster

The fastest, most efficient application performance: scalable compute, networking, storage, power, and cooling with compute clusters powered by NVIDIA® A100

Specifications:

  • Incorporates the latest 3rd Generation Intel® Xeon® Scalable Processors or AMD EPYC™ 7003 Series Processors
  • Delivers up to 72 NVIDIA A100 SXM4 40GB/80GB GPUs per 42U cluster: 497,664 FP32 CUDA cores / INT32 cores, 248,832 FP64 cores, 31,104 Tensor Cores, 698 TFLOPS of peak FP64 performance, 1,404 TFLOPS of peak FP64 Tensor Core performance, and 1,404 TFLOPS of peak FP32 performance
  • Up to 5,760GB of GPU memory per rack (72x 80GB GPUs)
  • Supports HDR InfiniBand fabric and real-time InfiniBand diagnostics
  • Cluster management and GPU monitoring software, including GPU temperature, fan speed, and power monitoring, providing exclusive access to GPUs in a cluster (see the monitoring sketch after this list)
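As an illustration of the GPU monitoring capability above, the sketch below polls per-GPU temperature, fan speed, and power through NVIDIA's NVML Python bindings (pynvml); it is a minimal example of the telemetry involved, not the AMAX management software itself.

```python
# Minimal monitoring sketch using pynvml (install with `pip install pynvml`).
# Reads temperature, power draw, and fan speed for every GPU on the local node.
import pynvml

pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        name = pynvml.nvmlDeviceGetName(handle)
        if isinstance(name, bytes):          # older pynvml versions return bytes
            name = name.decode()
        temp = pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU)
        power_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0   # reported in mW
        try:
            fan = f"{pynvml.nvmlDeviceGetFanSpeed(handle)}%"
        except pynvml.NVMLError:
            fan = "n/a"   # SXM4 modules are chassis-cooled and may report no fan
        print(f"GPU{i} {name}: {temp} C, {power_w:.0f} W, fan {fan}")
finally:
    pynvml.nvmlShutdown()
```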


The ClusterMax® SuperG GPU computing clusters are built on the NVIDIA® GPU computing platform. Based on the NVIDIA® A100 Tensor Core GPU, they deliver unprecedented acceleration at every scale and power the world’s highest-performing elastic data centers for AI, data analytics, and HPC. These cluster solutions provide up to 20X higher performance than the prior NVIDIA Volta™ generation, allowing researchers to deliver real-world results and deploy solutions into production at scale.

Complete Cluster Assembly and Setup Services:

  • Fully integrated and pre-packaged turnkey HPC solution, including HPC professional services and support, expert installation and setup of rack-optimized cluster nodes, cabling, rails, and other peripherals
  • Configuration of cluster nodes and the network
  • Installation of applications and client computers to offer a comprehensive solution for your IT needs
  • Rapid deployment
  • Server management options include standards-based IPMI or AMAX remote server management (see the sketch after this list)
  • Seamless standard and custom application integration and cluster installation
  • Cluster management options include a choice of commercial and open source software solutions
  • Supports a variety of UPS and PDU configurations and interconnect options, including InfiniBand (EDR/HDR), Fibre Channel, and Ethernet (Gigabit, 10GbE, 25GbE, 40GbE, 100GbE, 200GbE)
  • Energy-efficient cluster cabinets and high-performance UPS and power distribution units, with expert installation and setup of rack-optimized nodes, cabling, rails, and other peripherals
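For the standards-based IPMI management option noted above, the sketch below shows one way to poll node BMCs out-of-band with the stock ipmitool CLI; the BMC hostnames and credentials are placeholders, and the actual AMAX remote-management tooling may differ.

```python
# Hypothetical out-of-band health poll over standards-based IPMI using ipmitool.
# BMC hostnames and credentials are placeholders for illustration only.
import subprocess

BMC_HOSTS = ["node01-bmc", "node02-bmc"]   # placeholder BMC addresses
USER, PASSWORD = "admin", "changeme"       # placeholder credentials

def ipmi(host: str, *args: str) -> str:
    """Run one ipmitool command against a BMC over the lanplus interface."""
    cmd = ["ipmitool", "-I", "lanplus", "-H", host, "-U", USER, "-P", PASSWORD, *args]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

for host in BMC_HOSTS:
    # Chassis power state, then the sensor data repository (temps, fans, PSUs).
    print(host, ipmi(host, "chassis", "power", "status").strip())
    print(ipmi(host, "sdr", "list"))
```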


Rack Level Verification

  • Performance and Benchmark Testing (HPL)
  • ATA rack level stress test
  • Rack Level Serviceability
  • Ease of Deployment Review
  • MPI jobs over InfiniBand for HPC
  • GPU stress test using CUDA (see the sketch after this list)
  • Cluster management
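A minimal sketch of how the MPI-over-InfiniBand and CUDA stress items above can be exercised together, assuming CuPy and mpi4py are available on the cluster image; the matrix size, iteration count, and launch command are illustrative, not the AMAX test procedure.

```python
# Hypothetical combined check: a CUDA matrix-multiply burn on every GPU in the node,
# then an MPI allreduce so all ranks confirm completion over the InfiniBand fabric.
# Requires CuPy and an InfiniBand-aware MPI with mpi4py; sizes are illustrative.
import cupy as cp
from mpi4py import MPI

comm = MPI.COMM_WORLD
N, ITERS = 8192, 50          # matrix size and passes per GPU (tune for burn length)

local_sum = 0.0
for dev in range(cp.cuda.runtime.getDeviceCount()):
    with cp.cuda.Device(dev):
        a = cp.random.random((N, N), dtype=cp.float32)
        b = cp.random.random((N, N), dtype=cp.float32)
        for _ in range(ITERS):
            c = a @ b                        # keeps the SMs busy
        cp.cuda.Device(dev).synchronize()
        local_sum += float(c.sum())

# The allreduce exercises the interconnect; with an IB-aware MPI it runs over HDR IB.
total = comm.allreduce(local_sum, op=MPI.SUM)
if comm.Get_rank() == 0:
    print(f"{comm.Get_size()} ranks finished; aggregate checksum {total:.3e}")
```

Launched, for example, as `mpirun -np <number of nodes> python gpu_mpi_burn.py` (the script name is a placeholder), with one rank per GPU node so the allreduce traverses the HDR InfiniBand fabric.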


Large Scale Rack Deployment Review

  • Scalability Process
  • Rack to Rack Connectivity (see the sketch after this list)
  • Multi-Cluster Testing
  • Software/Application Load
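One possible spot check for the rack-to-rack connectivity item above, assuming passwordless SSH from the head node and the standard `ibstat` utility from infiniband-diags on each node; hostnames are placeholders.

```python
# Hypothetical connectivity spot check: run `ibstat` (from infiniband-diags) on each
# node over SSH and confirm every HCA port reports an Active link state and its rate.
# Node hostnames are placeholders; assumes passwordless SSH from the head node.
import subprocess

NODES = ["gpu-node01", "gpu-node02"]   # placeholder node hostnames

for node in NODES:
    out = subprocess.run(["ssh", node, "ibstat"],
                         capture_output=True, text=True).stdout
    states = [l.split(":", 1)[1].strip() for l in out.splitlines()
              if l.strip().startswith("State:")]
    rates = [l.split(":", 1)[1].strip() for l in out.splitlines()
             if l.strip().startswith("Rate:")]
    ok = bool(states) and all(s == "Active" for s in states)
    print(f"{node}: {'OK' if ok else 'CHECK LINKS'} states={states} rates={rates}")
```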


Optional Cluster System Software Installed:

  • Microsoft Windows Server 2019
  • Bright Computing Cluster Manager
  • SUSE / Red Hat Enterprise Linux
  • C-based software development tools, CUDA Toolkit and SDK, and various libraries for CPU/GPU clusters (see the sanity check after this list)
  • Deep learning software
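A quick post-install sanity check, assuming PyTorch is among the installed deep learning software: it confirms the CUDA toolkit/runtime is visible to the framework and that all GPUs on a node enumerate correctly.

```python
# Post-install sanity check assuming PyTorch is the installed deep learning framework:
# confirms the CUDA runtime is visible and enumerates every GPU on the node.
import torch

assert torch.cuda.is_available(), "CUDA runtime not visible to the framework"
count = torch.cuda.device_count()
print(f"CUDA {torch.version.cuda}, {count} GPU(s) visible")
for i in range(count):
    props = torch.cuda.get_device_properties(i)
    print(f"  GPU{i}: {props.name}, {props.total_memory // 2**30} GiB")
```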

ClusterMax® SuperG NVIDIA A100 GPU Computing Cluster Specifications, with 3rd Generation Intel® Xeon® Scalable Processors:

| Model # | ClusterMax® SuperG-142.X100S | ClusterMax® SuperG-244.X100S | ClusterMax® SuperG-426.X100S | ClusterMax® SuperG-42U9.X100S |
|---|---|---|---|---|
| Rack Height | 14U | 24U | 42U | 42U |
| # of 4U 8x A100 SXM4 GPU Nodes per Rack | 2 | 4 | 6 | 9 |
| # of A100 SXM4 GPUs per Rack (8 GPUs per node) | 16 | 32 | 48 | 72 |
| GPU Memory Capacity per Rack (40GB per GPU) | 640GB | 1,280GB | 1,920GB | 2,880GB |
| GPU Memory Capacity per Rack (80GB per GPU) | 1,280GB | 2,560GB | 3,840GB | 5,760GB |
| GPU Node Processor Support | 2x 3rd Generation Intel® Xeon® Scalable Processors per node | 2x 3rd Generation Intel® Xeon® Scalable Processors per node | 2x 3rd Generation Intel® Xeon® Scalable Processors per node | 2x 3rd Generation Intel® Xeon® Scalable Processors per node |
| # of Processors per Rack (2 per node) | 4 | 8 | 12 | 18 |
| Maximum # of CPU Cores per Rack (40 cores per processor) | 160 Cores | 320 Cores | 480 Cores | 720 Cores |
| Maximum Compute Node Memory Capacity per Rack (8TB per system) | 16TB | 32TB | 48TB | 72TB |
| # of FP32 CUDA Cores per Rack (6,912 per GPU) | 110,592 | 221,184 | 331,776 | 497,664 |
| # of FP64 Cores per Rack (3,456 per GPU) | 55,296 | 110,592 | 165,888 | 248,832 |
| # of INT32 Cores per Rack (6,912 per GPU) | 110,592 | 221,184 | 331,776 | 497,664 |
| # of Tensor Cores per Rack (432 per GPU) | 6,912 | 13,824 | 20,736 | 31,104 |
| Peak FP64 Performance per Rack (9.7 TF per GPU) | 155 TFLOPS | 310 TFLOPS | 466 TFLOPS | 698 TFLOPS |
| Peak FP64 Tensor Core Performance per Rack (19.5 TF per GPU) | 312 TFLOPS | 624 TFLOPS | 936 TFLOPS | 1,404 TFLOPS |
| Peak FP32 Performance per Rack (19.5 TF per GPU) | 312 TFLOPS | 624 TFLOPS | 936 TFLOPS | 1,404 TFLOPS |
| Tensor Float 32 (TF32) Performance per Rack (156 TF per GPU) | 2,496 TFLOPS | 4,992 TFLOPS | 7,488 TFLOPS | 11,232 TFLOPS |
| Tensor Float 32 (TF32) Performance per Rack, with Sparsity (312 TF per GPU) | 4,992 TFLOPS | 9,984 TFLOPS | 14,976 TFLOPS | 22,464 TFLOPS |
| Peak BFLOAT16/FP16 Tensor Core Performance per Rack (312 TF per GPU) | 4,992 TFLOPS | 9,984 TFLOPS | 14,976 TFLOPS | 22,464 TFLOPS |
| Peak BFLOAT16/FP16 Tensor Core Performance per Rack, with Sparsity (624 TF per GPU) | 9,984 TFLOPS | 19,968 TFLOPS | 29,952 TFLOPS | 44,928 TFLOPS |
| Peak INT8 Tensor Core Performance per Rack (624 TOPS per GPU) | 9,984 TOPS | 19,968 TOPS | 29,952 TOPS | 44,928 TOPS |
| Peak INT8 Tensor Core Performance per Rack, with Sparsity (1,248 TOPS per GPU) | 19,968 TOPS | 39,936 TOPS | 59,904 TOPS | 89,856 TOPS |
| Peak INT4 Tensor Core Performance per Rack (1,248 TOPS per GPU) | 19,968 TOPS | 39,936 TOPS | 59,904 TOPS | 89,856 TOPS |
| Peak INT4 Tensor Core Performance per Rack, with Sparsity (2,496 TOPS per GPU) | 39,936 TOPS | 79,872 TOPS | 119,808 TOPS | 179,712 TOPS |
| GPU Node Interconnectivity | 10GbE | HDR InfiniBand | HDR InfiniBand | HDR InfiniBand |
| GPU Node Storage | 6x U.2 NVMe bays & 2x M.2 NVMe bays | 6x U.2 NVMe bays & 2x M.2 NVMe bays | 6x U.2 NVMe bays & 2x M.2 NVMe bays | 6x U.2 NVMe bays & 2x M.2 NVMe bays |
| Storage Node | None | 1x 1U Storage Node | 1x 1U Storage Node | 1x 1U Storage Node |
| Storage Node Processor Support | N/A | 2x 3rd Generation Intel® Xeon® Scalable Processors | 2x 3rd Generation Intel® Xeon® Scalable Processors | 2x 3rd Generation Intel® Xeon® Scalable Processors |
| Storage Node Memory Support | N/A | 8TB Registered ECC DDR4 3200MHz | 8TB Registered ECC DDR4 3200MHz | 8TB Registered ECC DDR4 3200MHz |
| Storage Node Drive Bays | N/A | 12x hot-swap 2.5″ U.2 NVMe drive bays | 12x hot-swap 2.5″ U.2 NVMe drive bays | 12x hot-swap 2.5″ U.2 NVMe drive bays |
| Storage Node Interconnectivity | N/A | HDR InfiniBand | HDR InfiniBand | HDR InfiniBand |
| Network Switch | 1x 24-port 10GbE Ethernet | 1x 24-port 10GbE Ethernet; 1x HDR InfiniBand | 1x 52-port 10GbE Ethernet; 1x 40-port EDR/HDR InfiniBand | 1x 52-port 10GbE Ethernet; 1x 40-port EDR/HDR InfiniBand |
| Cluster Management Software | Optional Bright Cluster Manager software | Optional Bright Cluster Manager software | Optional Bright Cluster Manager software | Optional Bright Cluster Manager software |

Software Options

Bright Cluster Manager software automates the process of building and managing modern high-performance Linux clusters, eliminating complexity and enabling flexibility.

Excelero

NVMesh enables shared NVMe across any network and supports any local or distributed file system. The solution features an intelligent management layer that abstracts underlying hardware with CPU offload, creates logical volumes with redundancy, and provides centralized, intelligent management and monitoring.

With QuantaStor’s unique Storage Grid architecture, organizations are able to manage multiple clusters across sites as a unified storage platform that is easily configured and maintained through the web user interface and automated via advanced CLI and REST APIs, enabling data centers to easily transform themselves into a flexible cloud infrastructure with the performance and reliability needed to run enterprise applications.

ClusterMax® SuperG NVIDIA A100 GPU Computing Cluster Specifications, with AMD EPYC™ 7003 Series Processors:

| Model # | ClusterMax® SuperG-142.A100S | ClusterMax® SuperG-244.A100S | ClusterMax® SuperG-426.A100S | ClusterMax® SuperG-42U9.A100S |
|---|---|---|---|---|
| Rack Height | 14U | 24U | 42U | 42U |
| # of 4U 8x A100 SXM4 GPU Nodes per Rack | 2 | 4 | 6 | 9 |
| # of A100 SXM4 GPUs per Rack (8 GPUs per node) | 16 | 32 | 48 | 72 |
| GPU Memory Capacity per Rack (40GB per GPU) | 640GB | 1,280GB | 1,920GB | 2,880GB |
| GPU Memory Capacity per Rack (80GB per GPU) | 1,280GB | 2,560GB | 3,840GB | 5,760GB |
| GPU Node Processor Support | 2x AMD EPYC™ 7003 Series Processors per node | 2x AMD EPYC™ 7003 Series Processors per node | 2x AMD EPYC™ 7003 Series Processors per node | 2x AMD EPYC™ 7003 Series Processors per node |
| # of Processors per Rack (2 per node) | 4 | 8 | 12 | 18 |
| Maximum # of CPU Cores per Rack (64 cores per processor) | 256 Cores | 512 Cores | 768 Cores | 1,152 Cores |
| Maximum Compute Node Memory Capacity per Rack (8TB per system) | 16TB | 32TB | 48TB | 72TB |
| # of FP32 CUDA Cores per Rack (6,912 per GPU) | 110,592 | 221,184 | 331,776 | 497,664 |
| # of FP64 Cores per Rack (3,456 per GPU) | 55,296 | 110,592 | 165,888 | 248,832 |
| # of INT32 Cores per Rack (6,912 per GPU) | 110,592 | 221,184 | 331,776 | 497,664 |
| # of Tensor Cores per Rack (432 per GPU) | 6,912 | 13,824 | 20,736 | 31,104 |
| Peak FP64 Performance per Rack (9.7 TF per GPU) | 155 TFLOPS | 310 TFLOPS | 466 TFLOPS | 698 TFLOPS |
| Peak FP64 Tensor Core Performance per Rack (19.5 TF per GPU) | 312 TFLOPS | 624 TFLOPS | 936 TFLOPS | 1,404 TFLOPS |
| Peak FP32 Performance per Rack (19.5 TF per GPU) | 312 TFLOPS | 624 TFLOPS | 936 TFLOPS | 1,404 TFLOPS |
| Tensor Float 32 (TF32) Performance per Rack (156 TF per GPU) | 2,496 TFLOPS | 4,992 TFLOPS | 7,488 TFLOPS | 11,232 TFLOPS |
| Tensor Float 32 (TF32) Performance per Rack, with Sparsity (312 TF per GPU) | 4,992 TFLOPS | 9,984 TFLOPS | 14,976 TFLOPS | 22,464 TFLOPS |
| Peak BFLOAT16/FP16 Tensor Core Performance per Rack (312 TF per GPU) | 4,992 TFLOPS | 9,984 TFLOPS | 14,976 TFLOPS | 22,464 TFLOPS |
| Peak BFLOAT16/FP16 Tensor Core Performance per Rack, with Sparsity (624 TF per GPU) | 9,984 TFLOPS | 19,968 TFLOPS | 29,952 TFLOPS | 44,928 TFLOPS |
| Peak INT8 Tensor Core Performance per Rack (624 TOPS per GPU) | 9,984 TOPS | 19,968 TOPS | 29,952 TOPS | 44,928 TOPS |
| Peak INT8 Tensor Core Performance per Rack, with Sparsity (1,248 TOPS per GPU) | 19,968 TOPS | 39,936 TOPS | 59,904 TOPS | 89,856 TOPS |
| Peak INT4 Tensor Core Performance per Rack (1,248 TOPS per GPU) | 19,968 TOPS | 39,936 TOPS | 59,904 TOPS | 89,856 TOPS |
| Peak INT4 Tensor Core Performance per Rack, with Sparsity (2,496 TOPS per GPU) | 39,936 TOPS | 79,872 TOPS | 119,808 TOPS | 179,712 TOPS |
| GPU Node Interconnectivity | 10GbE | HDR InfiniBand | HDR InfiniBand | HDR InfiniBand |
| GPU Node Storage | 6x U.2 NVMe bays & 2x M.2 NVMe bays | 6x U.2 NVMe bays & 2x M.2 NVMe bays | 6x U.2 NVMe bays & 2x M.2 NVMe bays | 6x U.2 NVMe bays & 2x M.2 NVMe bays |
| Storage Node | None | 1x 1U Storage Node | 1x 1U Storage Node | 1x 1U Storage Node |
| Storage Node Processor Support | N/A | 2x AMD EPYC™ 7003 Series Processors | 2x AMD EPYC™ 7003 Series Processors | 2x AMD EPYC™ 7003 Series Processors |
| Storage Node Memory Support | N/A | 8TB Registered ECC DDR4 3200MHz | 8TB Registered ECC DDR4 3200MHz | 8TB Registered ECC DDR4 3200MHz |
| Storage Node Drive Bays | N/A | 12x hot-swap 2.5″ U.2 NVMe drive bays | 12x hot-swap 2.5″ U.2 NVMe drive bays | 12x hot-swap 2.5″ U.2 NVMe drive bays |
| Storage Node Interconnectivity | N/A | HDR InfiniBand | HDR InfiniBand | HDR InfiniBand |
| Network Switch | 1x 24-port 10GbE Ethernet | 1x 24-port 10GbE Ethernet; 1x HDR InfiniBand | 1x 52-port 10GbE Ethernet; 1x 40-port EDR/HDR InfiniBand | 1x 52-port 10GbE Ethernet; 1x 40-port EDR/HDR InfiniBand |
| Cluster Management Software | Optional Bright Cluster Manager software | Optional Bright Cluster Manager software | Optional Bright Cluster Manager software | Optional Bright Cluster Manager software |
