Comments on: Los Alamos Pushes The Memory Wall With “Venado” Supercomputer

By: Timothy Prickett Morgan

Timothy Prickett Morgan — Tue, 30 Apr 2024 15:08:26 +0000

In reply to Tanj Bennett. Good questions. I assumed there was ECC on HBM memory.

By: Tanj Bennett

Tanj Bennett — Tue, 30 Apr 2024 15:01:32 +0000

Neither of the memory types used in Grace or Hopper, LPDDR5x or HBM2E, has error correction suitable for 2 PB-scale systems. HBM2E has no ECC, and LPDDR5x has weak 1-bit correction with no useful ability to report uncorrectable events.

How will the LANL system add strong ECC and how much perf loss will occur due to that? This is a special concern since the article cites they have an interest in problems with random access patterns, which are awkward to pair with schemes based on separate checksums. Supercomputers are notoriously demanding on error correction and detection.

By: Calamity Jim

Calamity Jim — Tue, 16 Apr 2024 22:51:01 +0000

In reply to Timothy Prickett Morgan.

“Monte Carlo simulations, which use randomness to simulate what are actually deterministic processes”

Yes, it is pretty much used like that at Sandia and LANL, as a technique for uncertainty and sensitivity analysis in cases where parameters, driving forces, or initial conditions are known “imprecisely” in what are otherwise deterministic processes (say if a key model parameter’s value is known to be 4.57 ± 0.03). They also developed a Latin Hypercube Sampling (LHS) method and software to make such analyses more efficient (SAND2001-0417 at: https://www.osti.gov/biblio/806696/ ).
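For readers who have not met Latin Hypercube Sampling, here is a minimal plain-Python sketch of the idea. It is not the Sandia LHS software from the report above. The function names `latin_hypercube` and `scale` are made up for illustration, and reading "4.57 ± 0.03" as a uniform range from 4.54 to 4.60 is an assumption. The point is that each parameter's range is cut into as many equal strata as there are samples, and every stratum is hit exactly once, so a small number of model runs still covers the whole range.

```python
import random

def latin_hypercube(n_samples, n_dims, seed=0):
    """Return n_samples points in the unit hypercube [0, 1)^n_dims.

    Each dimension is split into n_samples equal-width strata and
    every stratum receives exactly one point (hypothetical helper).
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # One uniformly placed value inside each stratum [i/n, (i+1)/n).
        col = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so strata are paired at random across dimensions.
        rng.shuffle(col)
        columns.append(col)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

def scale(u, low, high):
    """Map a unit-interval value u onto the range [low, high)."""
    return low + u * (high - low)

if __name__ == "__main__":
    # Assumption: the uncertain parameter is uniform on 4.57 +/- 0.03.
    for (u,) in latin_hypercube(5, 1, seed=1):
        print(round(scale(u, 4.54, 4.60), 4))
```

Each printed value would then be fed to one run of the deterministic model, and the spread of the outputs gives the uncertainty estimate.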

By: Paul Berry

Paul Berry — Tue, 16 Apr 2024 15:44:32 +0000

In reply to Timothy Prickett Morgan. If HPC and making LANL happy were Intel's objective, they obviously would have this. LANL has been asking for memory bandwidth ahead of flops for decades. The problem is they're willing to spend hundreds of millions on these sorts of machines, but not tens of billions. Can Intel (or AMD) develop something like that for 2-3 customers?

By: Timothy Prickett Morgan

Timothy Prickett Morgan — Tue, 16 Apr 2024 12:24:33 +0000

In reply to HuMo.

It is about having more memory per core and much lower cost per memory bandwidth, I think.

Maybe Intel and AMD should have LPDDR5 Xeons and Epycs with 16 or 24 memory controllers as an option? Or even 32 or 48 using Eliyan PHYs? How much work could a CPU with a proper memory configuration and lots of vector and matrix engines do? I am gonna have a think about this….

By: Timothy Prickett Morgan

Timothy Prickett Morgan — Tue, 16 Apr 2024 12:21:31 +0000

In reply to Matt. I do believe that is what I said originally without being anywhere near as precise. Monte Carlo is a kind of magic as far as I am concerned. It is an approximation, and sometimes nature is perhaps a bit more precise than we are. But that you can take randomness and do predictions at all is the amazing thing.

By: HuMo

HuMo — Tue, 16 Apr 2024 09:07:06 +0000

Great to see that Santa’s red-nosed rocket sleigh is now officially gambolling about, even as LANL elves get it ready for acceptance of upcoming Christmas-present workloads! Its trendy heterogeneous configuration is wildly reminiscent of recent partitioned animals grazing in this field, like the Mare Nostrum 5 GPP (CPU only) and ACC (CPU+GPU). It does seem a bit smaller though (as wild deer can be relative to mares?), with its 316,800-strong slender-yet-gracious core (vs 725,760 for the GPP mare’s Xeons; and also 660,800 Xeon Max cores at the LANL Crossroads).

Its PF/s 101 (15.6+85.8) could prove an interesting intro into heterogeneous gift delivery, seeing (for example) how a hopper full of Oh’Henri’s chocolate bars was the #1 Green500 favorite last year (yummy!). More to the point though, its targeted ability to efficiently graze through sparsely distributed North Pole tundra lichens should directly translate into great performance at dispatching Christmas lumps of coal to the equally sparse naughty kids of the world! With 1.5x as many cores as Japan’s meteorological twins (PRIMEHPC FX1000), one may expect a top 20 or better 0.7 PF/s standing for this HPCG task, somewhat similar to the Wisteria’s Odyssey, but with a potential boost from the rocket engines of its matrix hoppers (helping it hop-along quicker!).

All in all a quite interesting animal, sure to be more power-thrifty than its upcoming Swiss cheese cousin from the Alps, itself better adapted to the dense grassy pastures found there (running at 445 PF/s in this ecosystem), but likely somewhat equivalent in sparse tundras (IMHO)! Can’t wait to see its blinking red nose in the sky! 8^b

By: Matt

Matt — Tue, 16 Apr 2024 03:26:44 +0000

A Monte Carlo method may be used to find a numerical approximation for a deterministic model. An example: Pi is a mathematical constant. The area of a circle is deterministic. The area, A, of a circle is uniquely determined by its radius, r, with the formula A = pi*r^2. By (pseudo)randomly sampling points from a uniform distribution over a square with side length 2r and testing whether they are inside or outside the circle of radius r inscribed in the square, one can find a numerical approximation of the area of the circle by multiplying the area of the square, (2r)^2 = 4*r^2, by the fraction of points found to lie inside the circle. Noting the area formula for the circle, A = pi*r^2, four times the fraction of points found to lie within the circle yields an approximation of pi.
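The circle-in-a-square example can be sketched in a few lines of Python. This is only an illustration using the standard library's `random` module. The function name `estimate_pi`, the default seed, and the choice r = 1 are assumptions, not anything from the article. The square runs from -r to r on each axis (side length 2r), so the fraction of points landing inside the circle approaches pi/4.

```python
import random

def estimate_pi(n_samples, seed=0):
    """Estimate pi by sampling points uniformly over a square of side 2r
    and counting how many fall inside the inscribed circle of radius r."""
    rng = random.Random(seed)  # seeded, so the run is repeatable
    r = 1.0
    inside = 0
    for _ in range(n_samples):
        x = rng.uniform(-r, r)
        y = rng.uniform(-r, r)
        if x * x + y * y <= r * r:
            inside += 1
    # Circle area / square area = pi*r^2 / (4*r^2) = pi/4.
    return 4.0 * inside / n_samples

if __name__ == "__main__":
    for n in (1_000, 100_000):
        print(n, estimate_pi(n))
```

The error shrinks only as one over the square root of the sample count, so each extra digit of pi costs roughly a hundred times more samples. That slow convergence is why this is a teaching example and not a practical way to compute pi.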

By: Ryan

Ryan — Mon, 15 Apr 2024 19:55:55 +0000

Monte Carlo is used to simulate random processes. In fact, it could be argued that you have it exactly backwards, as the numbers used in Monte Carlo are usually not random at all, but pseudo-random (aka deterministic).
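The point about pseudo-random numbers being deterministic is easy to see with Python's standard `random` module: the same seed always reproduces the same sequence. The helper name `draw` and the seed values below are arbitrary choices for illustration.

```python
import random

def draw(seed, n=5):
    """Return the first n pseudo-random floats produced from a given seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

if __name__ == "__main__":
    first = draw(42)
    second = draw(42)   # same seed, so the same "random" numbers
    other = draw(43)    # different seed, so a different sequence
    print(first == second)
    print(first == other)
```

This repeatability is what lets a Monte Carlo run be rerun and debugged, even though the samples behave statistically like random draws.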