Comments on: H100 GPU Instance Pricing On AWS: Grin And Bear It https://www.nextplatform.com/2023/07/27/h100-gpu-instance-pricing-on-aws-grin-and-bear-it/ In-depth coverage of high-end computing at large enterprises, supercomputing centers, hyperscale data centers, and public clouds. Fri, 01 Sep 2023 14:12:45 +0000
By: I_M https://www.nextplatform.com/2023/07/27/h100-gpu-instance-pricing-on-aws-grin-and-bear-it/#comment-211836 Sun, 30 Jul 2023 15:32:31 +0000 https://www.nextplatform.com/?p=142706#comment-211836 How can an AMD Milan server with 8 NVMe drives and NVSwitch possibly have 3200 Gb/s of network bandwidth (unless that means 1600 Gb/s in each direction)? Out of 160 PCIe 4.0 lanes, 32 are used by the NVMe drives and at least 32 by the CPU interfaces to NVSwitch, and the remaining 96 lanes (or fewer), i.e. 6 x16 slots, can’t provide enough bandwidth (unless it’s 2×1600 Gb/s). Am I missing something?
There’s no way the hw cost for this AWS instance is the almost 470K$ listed above, because the retail cost (qty 1) for a more modern and faster platform (AMD Genoa or Intel 4th Gen) with the same config (2x48core CPUs / 8 H100 SXM5 / 2TB RAM / 8×3.84TB NVMe / 8 dual port 200GbE) is between 310K$ and 340K$ (or less). So in my view it’s highly unlikely that AWS (given their purchase volumes) is paying more than 250K$ per server.
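
The lane arithmetic in the question above can be checked with a rough sketch. The lane counts (160 total, 32 for NVMe, 32 for the CPU links to NVSwitch) are the commenter's figures, not confirmed specs for the AWS instance; the PCIe 4.0 rate of 16 GT/s per lane with 128b/130b encoding is the standard figure, and protocol overhead beyond line encoding is ignored, so real throughput would be somewhat lower.

```python
# Rough PCIe 4.0 bandwidth check for the lane budget described in the comment.
# Lane counts are the commenter's assumptions, not verified hardware specs.

GT_PER_LANE = 16.0        # PCIe 4.0 raw rate per lane, in GT/s
ENCODING = 128 / 130      # 128b/130b line encoding efficiency


def pcie4_gbps(lanes: int) -> float:
    """Usable bandwidth in Gb/s, per direction, for a given lane count."""
    return lanes * GT_PER_LANE * ENCODING


total_lanes = 160         # assumed total for a two-socket Milan server
nvme_lanes = 32           # 8 NVMe drives at x4 each
nvswitch_lanes = 32       # "at least 32" for the CPU links, per the comment

nic_lanes = total_lanes - nvme_lanes - nvswitch_lanes
per_direction = pcie4_gbps(nic_lanes)
both_directions = 2 * per_direction

print(f"Lanes left for networking: {nic_lanes} ({nic_lanes // 16} x16 slots)")
print(f"Per direction:   {per_direction:.0f} Gb/s")
print(f"Both directions: {both_directions:.0f} Gb/s")
```

Under these assumptions the remaining lanes carry roughly 1,500 Gb/s in each direction, which is below 1,600 Gb/s per direction and well short of 3,200 Gb/s one way, consistent with the commenter's doubt.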

By: EC https://www.nextplatform.com/2023/07/27/h100-gpu-instance-pricing-on-aws-grin-and-bear-it/#comment-211792 Sat, 29 Jul 2023 05:55:29 +0000 https://www.nextplatform.com/?p=142706#comment-211792 >>you can see now why AWS is very committed to its homegrown Trainium AI training compute engine, and also why Meta Platforms will etch its own training chips, too. At these prices, you might as well.<<

Two factors to consider: the SW ecosystem, where CUDA is the default, and the cost to develop and deploy.

H100 is already pretty formidable. Then add the Grace CPU as a mid-life kicker to raise the bar another notch. Nobody wants to be beholden to a single vendor. But this is exactly the way Nvidia destroyed 3DFX and every other 3D chip guy in the early 2000s: "the relentless pace of new products," if I recall the quote from Scott Sellers, 3DFX's founder, correctly.

By: hoohoo https://www.nextplatform.com/2023/07/27/h100-gpu-instance-pricing-on-aws-grin-and-bear-it/#comment-211784 Sat, 29 Jul 2023 00:30:21 +0000 https://www.nextplatform.com/?p=142706#comment-211784 Don’t forget how difficult it is to get AWS to just rent you a single big machine. You can’t just ask for quota sufficient to run a single H100 machine: you have to spend some real money renting lesser systems until AWS thinks you are credible enough to use an H100 system.

By: DEVI GUACA https://www.nextplatform.com/2023/07/27/h100-gpu-instance-pricing-on-aws-grin-and-bear-it/#comment-211780 Fri, 28 Jul 2023 21:24:17 +0000 https://www.nextplatform.com/?p=142706#comment-211780 Auroxo hybrid xvd platform
Amnenegeziana 2
WhiTch hetch
Fyi
Laah
