Best-in-class HPC cloud combines price-performance of on-premises with flexibility of cloud
Developers and data scientists can boost AI, data science, HPC, and machine learning performance by 2.5-6x, at lowest cost
Oracle announced that it is the first major cloud provider to make NVIDIA A100 Tensor Core GPUs on bare metal instances generally available. Oracle’s latest GPU instances enable customers in industries such as automotive and aerospace to run complex, data-intensive, high-performance applications like modeling and simulations more efficiently and at a lower cost than ever before.
Oracle Cloud Infrastructure, running the NVIDIA A100 Tensor Core GPUs on bare metal instances, can run complex AI models and deep learning systems between two-and-a-half and six times faster than instances featuring previous generations of GPUs. When running on Oracle Cloud, the new A100 GPU can help enterprises unlock more value from their data and innovate faster, enabling important breakthroughs such as testing and developing new medications, building safer airplanes, and quickly sourcing natural resources. Additionally, customers can for the first time run their complex HPC applications using GPUDirect over NVIDIA Mellanox RDMA networking, which enables clusters of thousands of GPUs, connected with microsecond latency, to deliver massive computational power on demand.
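As a rough illustration of what a 2.5x to 6x speedup means in wall-clock terms, the sketch below converts a baseline runtime into an accelerated one. The 12-hour baseline and the function name are hypothetical, chosen only for illustration; actual gains depend on the workload.

```python
def accelerated_runtime(baseline_hours: float, speedup: float) -> float:
    """Return the runtime after applying a speedup factor to a baseline runtime."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_hours / speedup


# Hypothetical job that takes 12 hours on a previous-generation GPU instance.
BASELINE_HOURS = 12.0

# Speedup range quoted in the announcement: 2.5x to 6x.
slowest_case = accelerated_runtime(BASELINE_HOURS, 2.5)
fastest_case = accelerated_runtime(BASELINE_HOURS, 6.0)

print(f"Estimated runtime: {fastest_case:.1f} to {slowest_case:.1f} hours")
```

Under these assumptions, the hypothetical 12-hour job would finish in roughly 2 to 4.8 hours.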
“Since its inception, Oracle Cloud Infrastructure has standardized on NVIDIA’s GPUs, beginning with the Pascal generation, moving to Volta and now with Ampere. Our customers demand the best of on-premises with all the benefits of the cloud, which is what we’re delivering with our latest GPU instance running on NVIDIA’s latest A100 GPU,” said Karan Batta, vice president, Oracle Cloud Infrastructure. “We have the largest, most performant, and most cost-effective A100 offering in the cloud because we offer double the memory and more local storage than competitors. This is the GPU instance customers have been waiting for to move to the cloud and deliver important breakthroughs.”
“Accelerated computing is essential to driving research breakthroughs for enterprises across all industries,” said Ian Buck, general manager and vice president of Accelerated Computing at NVIDIA. “By bringing the NVIDIA A100 Tensor Core GPUs into its cloud service, and offering the ability to scale to more than 500 GPUs interconnected with Mellanox networking, Oracle is providing the computing performance needed to accelerate the most critical work being done today in AI and high performance computing.”
NVIDIA A100 Tensor Core GPUs running on Oracle Cloud Infrastructure
The new bare metal instance, GPU4.8, features eight NVIDIA A100 Tensor Core GPUs with 40 GB of memory each, all interconnected via NVIDIA NVLink™. The instance has 64 physical AMD Rome CPU cores running at 2.9 GHz, supported by 2,048 GB of RAM and 24 TB of NVMe storage. Oracle’s new bare metal GPU instance joins the high-speed, low-latency Cluster Network architecture, enabling customers to scale to clusters of more than 500 GPUs with NVIDIA Mellanox RDMA over Converged Ethernet (RoCE) for large-scale distributed workloads that require RDMA, with up to 1.6 Tbps of network bandwidth per bare metal node.
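The per-node figures above can be combined into a simple sizing estimate. The following is a minimal sketch, not an Oracle tool: the constants come from the announcement, while the function names and the 500-GPU target are illustrative assumptions.

```python
import math

# Per-node figures quoted in the announcement for the GPU4.8 bare metal instance.
GPUS_PER_NODE = 8
GPU_MEMORY_GB = 40


def node_gpu_memory_gb() -> int:
    """Total GPU memory on one bare metal node, in GB."""
    return GPUS_PER_NODE * GPU_MEMORY_GB


def nodes_needed(target_gpus: int) -> int:
    """Smallest number of whole nodes that provides at least target_gpus GPUs."""
    if target_gpus <= 0:
        raise ValueError("target_gpus must be positive")
    return math.ceil(target_gpus / GPUS_PER_NODE)


# Hypothetical target: a 500-GPU cluster.
nodes = nodes_needed(500)
total_gpus = nodes * GPUS_PER_NODE
total_gpu_memory_gb = total_gpus * GPU_MEMORY_GB

print(f"GPU memory per node: {node_gpu_memory_gb()} GB")
print(f"Nodes for 500 GPUs: {nodes} ({total_gpus} GPUs, {total_gpu_memory_gb} GB GPU memory)")
```

Because nodes are allocated whole, a 500-GPU target rounds up to 63 nodes, which supply 504 GPUs in total.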
In addition to the bare metal instance, organizations will be able to deploy one, two, or four GPUs per virtual machine in the coming months. These instance shapes will also give customers access to all the existing toolsets, like pre-configured Data Science VMs optimized for GPUs, to run any HPC or deep learning containers from NVIDIA NGC, a hub of cloud-native, GPU-optimized containers, models and industry-specific SDKs.
Expanding GPU Ecosystem
As part of today’s news, Oracle is announcing images, solution stacks, and services that expand users’ ability to extract value from their data, including:
AI Engineered MLOps Solution Stack – Automates the end-to-end workflow using Apache Airflow and instantiates a cluster of bare metal NVIDIA A100 shapes for distributed training and inference.
Media Ops Packaged Solution Stacks – Automates workflows using the Apache Airflow framework to operate a self-managed media operations pipeline on Oracle Cloud. This enhances customers’ ability to compress, package, and distribute content.
Cloud Native MLOps Orchestration Package – Allows data scientists to focus on innovation by using open source Kubeflow along with Oracle Cloud Infrastructure-engineered images and the NGC cloud image to automate the movement of data and the creation of compute instances.
Julia AI HPC Image – An automation stack for an engineered HPC image with Julia, Jupyter Notebook, and the IJulia development environment installed, tested and optimized for NVIDIA A100 GPUs. Paired with NVIDIA A100 Multi-Instance GPU technology and Oracle HPC shapes, the environment is proving to be faster than older systems running Python.
Pre-configured Data Science and AI Image – Includes NVIDIA’s Deep Neural Network libraries, common ML/deep learning frameworks, Jupyter Notebooks and common Python/R integrated development environments. Available in the Oracle Cloud Marketplace.
IDenTV provides advanced video analytics based on AI capabilities powered by computer vision, automated speech recognition and textual semantic classifiers. “The amount of streaming video data being created is growing exponentially. To deliver real-time analytics and insights demands the highest level of graphics processing units,” said Amro Shihadah, founder and COO, IDenTV. “Oracle Cloud Infrastructure delivers that with the new NVIDIA A100 GPU where we expect an immediate performance gain of 35 percent.”
DeepZen produces digital voice solutions for audiobooks, advertising, marketing, brand voices and other types of voice content, including podcasting, gaming and virtual assistants. “Replicating the human voice with AI is highly dependent on processing power and Oracle Cloud Infrastructure delivers that with the new NVIDIA A100 GPU,” said Kerem Sozugecer, co-founder and CTO, DeepZen Limited. “This provided an immediate performance increase of 37 percent, enabling us to scale our business.”
Oracle for Startups
With machine learning and compute-intensive applications exploding, companies are already benefiting from Oracle and NVIDIA’s partnership. Through the partnership between Oracle for Startups and NVIDIA Inception, Oracle is accelerating AI startups with NVIDIA’s deep technical expertise and industry-leading GPU technology on Oracle Cloud. Startups are also benefiting from both companies’ ability to connect them with potential customers and from access to the vast marketing and business-building resources essential to their growth.