GPU Cluster - FPT AI Factory

Everything your GPU workload demands

Flexible GPU Deployment

Run Kubernetes on GPU Bare Metal for peak performance, or scale with GPU VMs for on-demand workloads

High-performance Shared Storage

NVMe block storage with up to 50 GB/s throughput for training, plus distributed file storage with a unified namespace across all nodes

Production-Ready in Under 30 Minutes

Pre-configured clusters with everything you need to launch your first GPU job in under 30 minutes

Fully Managed Control Plane

The Kubernetes control plane is fully managed with automated updates, high availability, and a 99.99% SLA
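
To make "launch your first GPU job" concrete, here is a minimal sketch of what such a job could look like on a Kubernetes cluster. It assumes the cluster exposes GPUs under the `nvidia.com/gpu` resource name (the name used by the NVIDIA device plugin); the job name, container image, and the `gpu_job_manifest` helper are hypothetical placeholders, not part of the FPT AI Factory product.

```python
import json


def gpu_job_manifest(name: str, image: str, command: list[str], gpus: int = 1) -> dict:
    """Build a minimal Kubernetes batch/v1 Job that requests `gpus` GPUs.

    Assumption: GPUs are advertised as the extended resource `nvidia.com/gpu`.
    """
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": 0,  # do not retry a failed smoke test
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "command": command,
                            # Extended resources such as GPUs are requested via limits.
                            "resources": {"limits": {"nvidia.com/gpu": gpus}},
                        }
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    # Placeholder image and command: print the GPUs visible inside the container.
    manifest = gpu_job_manifest(
        name="gpu-smoke-test",
        image="nvidia/cuda:12.4.1-base-ubuntu22.04",
        command=["nvidia-smi"],
    )
    print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON manifests, the printed output could be piped into `kubectl apply -f -`, assuming kubectl is already configured for the cluster.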

Scalable infrastructure designed to power next-generation AI workloads

No setup burden with a fully pre-configured stack (GPU drivers, InfiniBand, Kubernetes, monitoring)

Reschedule workloads in seconds with automated health checks and self-healing

Sync instantly across all nodes through a shared distributed file system

Autoscale Kubernetes nodes to match capacity to workload in real time
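
Matching capacity to workload comes down to simple arithmetic: divide the GPUs that are requested by the GPUs each node provides, round up, and clamp to the allowed node range. The sketch below illustrates that calculation only; it is not the platform's autoscaler, and the function name, parameters, and default limits are assumptions made for illustration.

```python
import math


def nodes_needed(
    pending_gpus: int,
    running_gpus: int,
    gpus_per_node: int,
    min_nodes: int = 0,
    max_nodes: int = 64,
) -> int:
    """Return the node count that fits current GPU demand.

    Hypothetical helper: demand is the GPUs already in use plus the GPUs
    requested by workloads that are still waiting to be scheduled.
    """
    if gpus_per_node < 1:
        raise ValueError("gpus_per_node must be at least 1")
    demand = pending_gpus + running_gpus
    wanted = math.ceil(demand / gpus_per_node)
    # Clamp to the configured bounds of the node group.
    return max(min_nodes, min(max_nodes, wanted))


if __name__ == "__main__":
    # 6 GPUs in use, 10 more requested, 8 GPUs per node.
    print(nodes_needed(pending_gpus=10, running_gpus=6, gpus_per_node=8))
```

This simplification ignores fragmentation: a job that needs 8 GPUs on a single node may not fit even when 8 GPUs are free across several nodes, so a real scheduler has to account for per-node placement as well.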

Designed for High-Performance and AI-Driven Workloads

Scale LLM & Foundation Model Training

Run distributed training at full GPU speed, with fast communication and storage that avoid performance bottlenecks

Production-Ready Model Serving

Serve AI models on GPU VMs with autoscaling that adapts to demand, ensuring efficiency without over-provisioning or dropped requests

HPC & Scientific Computing

Run HPC workloads and ML pipelines in one cluster, with seamless integration that supports existing workflows without disruption