Comments on: Frontier: Step By Step, Over Decades, To Exascale
https://www.nextplatform.com/2022/05/30/frontier-step-by-step-over-decades-to-exascale/

By: will https://www.nextplatform.com/2022/05/30/frontier-step-by-step-over-decades-to-exascale/#comment-192718 Fri, 17 Jun 2022 08:14:54 +0000 In reply to Timothy Prickett Morgan.

Is it possible that the 95.7-teraflop matrix rating is unreachable under a reasonable TDP? That is, the GPU cannot hold the peak 1.7 GHz clock for full FP64 matrix work and instead drops to around 800 MHz (which would put the Rpeak of each GPU at around 45 teraflops). It is well known that the matrix engine is much more power efficient than the vector engine for DGEMM, so there is no reason not to use the matrix engine, right?
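(For what it is worth, here is a quick back-of-the-envelope check of that clock-throttling guess, sketched in Python. The 1.7 GHz and 95.7-teraflop figures are AMD's published MI250X matrix peaks, the 800 MHz clock is the one suggested above, and the linear clock-to-flops scaling is an assumption.)

```python
# Sketch: if FP64 matrix throughput scaled linearly with engine clock,
# what would the 95.7 TF peak become at a throttled ~800 MHz clock?
matrix_peak_tf = 95.7      # MI250X FP64 matrix peak at boost clock (teraflops)
peak_clock_ghz = 1.7       # published peak engine clock
reduced_clock_ghz = 0.8    # hypothetical throttled clock from the comment above

throttled_tf = matrix_peak_tf * reduced_clock_ghz / peak_clock_ghz
print(round(throttled_tf, 1))  # ~45.0 teraflops, close to the ~45 T per-GPU figure cited
```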

By: Rebecca Lewington https://www.nextplatform.com/2022/05/30/frontier-step-by-step-over-decades-to-exascale/#comment-192152 Thu, 02 Jun 2022 23:28:49 +0000 In reply to PAUL BERRY.

“Wafer-scale-integration is always just out of reach.”

Ahem.

https://www.cerebras.net/

Yes, I work at Cerebras 🙂

By: Feiyi W. https://www.nextplatform.com/2022/05/30/frontier-step-by-step-over-decades-to-exascale/#comment-192145 Thu, 02 Jun 2022 13:23:32 +0000 Summit’s HPL-AI result has been dramatically improved since the first submission: it is no longer 445 petaflops but 1.41 exaflops, obtained and submitted last year by OLCF’s HPL-AI dev team.

By: PAUL BERRY https://www.nextplatform.com/2022/05/30/frontier-step-by-step-over-decades-to-exascale/#comment-192082 Tue, 31 May 2022 16:07:27 +0000 The history of supercomputing since about the Paragon is interesting to me.
It swings back and forth between very large networks of small, low-cost nodes and networks of expensive “fat” nodes. The performance, technical limits, and economics of that battle must be transitory, as there never seems to be a real winner. What’s most interesting to me is how rarely we’ve really seen anything outside of that duopoly. Every five years I hear someone bring up processor-in-memory, but it never goes anywhere. Wafer-scale-integration is always just out of reach. Blue Gene was an interesting variation on the theme, using embedded processors instead of server CPUs. Well done AMD/Cray, but nothing really new here.

By: Timothy Prickett Morgan https://www.nextplatform.com/2022/05/30/frontier-step-by-step-over-decades-to-exascale/#comment-192078 Tue, 31 May 2022 11:33:15 +0000 In reply to will.

That’s just math. See the specs: https://www.amd.com/en/products/server-accelerators/instinct-mi250x

The MI250X is rated at 47.9 teraflops on the FP64 vector units and 95.7 teraflops on the FP64 matrix units. If they are only getting 45.6 teraflops, it ain’t in matrix mode. If it were, it would be showing an effective 2.2 exaflops. With HPL-AI, I think they are using matrix mode, hence the higher ratio between HPL and HPL-AI. I inferred that, but no one has confirmed it. I can’t think of another way this ratio could happen. The math works.
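(A sketch of that arithmetic in Python. The 47.9 and 95.7 teraflop ratings are from AMD's MI250X spec sheet, the 45.6 teraflops per GPU is the figure cited above, and the roughly 1.1 exaflop HPL result for Frontier is assumed here only to show where the 2.2 exaflop number comes from.)

```python
vector_peak_tf = 47.9   # MI250X FP64 vector peak, teraflops
matrix_peak_tf = 95.7   # MI250X FP64 matrix peak, teraflops
per_gpu_tf     = 45.6   # per-GPU figure cited in the comment above

# Which rating is the per-GPU number consistent with?
print(round(per_gpu_tf / vector_peak_tf, 2))  # 0.95 -> plausible for vector mode
print(round(per_gpu_tf / matrix_peak_tf, 2))  # 0.48 -> implausibly low for matrix mode

# If the same run used the matrix engines at the same efficiency, the headline
# number would roughly double -- which is where the ~2.2 exaflop figure comes from.
hpl_exaflops = 1.1  # assumption: Frontier's reported HPL result, roughly
print(round(hpl_exaflops * matrix_peak_tf / vector_peak_tf, 1))  # ~2.2 exaflops
```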

By: will https://www.nextplatform.com/2022/05/30/frontier-step-by-step-over-decades-to-exascale/#comment-192072 Tue, 31 May 2022 08:44:07 +0000 Nice analysis! Could you verify the source of “The MI250X is rated twice that on the matrix engines not used in the HPL test”? I thought the matrix engine is used in HPL; otherwise the power efficiency could not be so impressive.

By: Timothy Prickett Morgan https://www.nextplatform.com/2022/05/30/frontier-step-by-step-over-decades-to-exascale/#comment-192068 Tue, 31 May 2022 02:45:26 +0000 In reply to Ben.

Thank you very much. I asked about this, and no one at HPE seemed to know what I was talking about. This clears it up.

By: HuMo https://www.nextplatform.com/2022/05/30/frontier-step-by-step-over-decades-to-exascale/#comment-192061 Tue, 31 May 2022 01:08:02 +0000 Nice analysis! I agree that the power efficiency seems to be the big story here, at approximately 20 MJ/exaflop (or 50 to 60 GFlop/s/W). It will be interesting to find out whether this comes from the EPYC, the MI250X, the Slingshot, the process node (7, 5, or 3 nm?), or the liquid cooling. The other two EPYC-MI250X-Slingshot systems in the top 10 have the same power profile, but the remaining seven systems are mostly around 60 MJ/exaflop, it seems.
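(A quick units check in Python showing why roughly 20 MJ per exaflop and roughly 50 GFlop/s/W are the same statement; the 20 MJ figure is the one quoted above.)

```python
# 20 MJ to perform 10**18 floating point operations
joules_per_exaflop = 20e6

# flops per joule is numerically the same as flop/s per watt
flops_per_joule = 1e18 / joules_per_exaflop
gigaflops_per_watt = flops_per_joule / 1e9
print(gigaflops_per_watt)  # 50.0 GFlop/s/W
```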

(P.S. apologies for initially posting this in the wrong story — from Nov. 16)

By: Ben https://www.nextplatform.com/2022/05/30/frontier-step-by-step-over-decades-to-exascale/#comment-192057 Mon, 30 May 2022 23:00:27 +0000 The node diagram is here: https://docs.olcf.ornl.gov/_images/Crusher_Node_Diagram.jpg
The connection between each GPU and NIC is marked as ‘PCIe Gen4 ESM’, which appears to be a reference to the ‘extended speed mode’ of CCIX over PCIe.
