Comments on: The Power Of Power10’s Memory Inception Clustering https://www.nextplatform.com/2021/08/13/the-power-of-power10s-memory-inception-clustering/ In-depth coverage of high-end computing at large enterprises, supercomputing centers, hyperscale data centers, and public clouds. Tue, 24 Aug 2021 07:38:01 +0000 By: Markos Malliarakis https://www.nextplatform.com/2021/08/13/the-power-of-power10s-memory-inception-clustering/#comment-165552 Tue, 24 Aug 2021 07:38:01 +0000 https://www.nextplatform.com/?p=138963#comment-165552 In reply to Eric Olson.

The concept sends existing neural network frameworks straight to dust … everything in memory, with hybrid connected hardware pushing things there. Intel DPC++ is not far off, and C++20 is closer, with programs running solely in the compiler.
Your comment is spot on: ML and AI automated in the compiler for unsupervised ML … getting closer to Star Trek.

By: Axel Koester https://www.nextplatform.com/2021/08/13/the-power-of-power10s-memory-inception-clustering/#comment-165423 Wed, 18 Aug 2021 13:01:43 +0000 https://www.nextplatform.com/?p=138963#comment-165423 In reply to Paul Berry.

One major difference is the advent of in-memory computation, which is also being pioneered at IBM labs: https://www.zurich.ibm.com/sto/memory/
Considering the memory attached to the Power10 cores could be not just DRAM, but also compute-capable PCRAM with neural processing embedded into the PCRAM matrix, it’d make sense for several cores to be able to access that portion of memory in a shared way. Maybe another good reason to call it “memory inception”?

By: Erik Scott https://www.nextplatform.com/2021/08/13/the-power-of-power10s-memory-inception-clustering/#comment-165398 Tue, 17 Aug 2021 03:39:17 +0000 https://www.nextplatform.com/?p=138963#comment-165398 In reply to William Kelley.

It’s been so long that I don’t trust my memory, but didn’t the KSR approach require operating system support? I seem to remember it being “page faulting over the network instead of to disk”. That’s a gross oversimplification, and there must have been some elegant locking. I saw a KSR-1, once. Running. 🙂

By: Mark Funk https://www.nextplatform.com/2021/08/13/the-power-of-power10s-memory-inception-clustering/#comment-165394 Mon, 16 Aug 2021 21:02:36 +0000 https://www.nextplatform.com/?p=138963#comment-165394 In reply to v.ang.

Interesting observation. I’ve been picturing quite a few higher-level comm architectures (along with variations on inter-process shared-memory architectures) built on top of something like this. But I also wonder at what level, and how different. For example, with RDMA, even for simplex communications, there is still a source buffer and a target buffer, each securely addressed in its own system. It strikes me that RDMA assumes exactly that (but I would be pleased to learn otherwise). In RDMA, each system provides linkage into its own memory windows, along with all of the associated lower-level enablement, and seems to expose that to the RDMA user. With this clustered memory, though, simplex communication seems to be built upon a single shared memory buffer. Both systems, and the processes on each, know of that one shared buffer. The lower-level enablement sets things up so that the two systems and these processes see the same thing using their own higher-level addresses, again allowing cores on both systems to access the shared buffer. I’m just not sure that this underlying difference can be hidden from the RDMA user. Move it up a level, though, as in “I want that other system to see my data”, then sure.
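
The two-buffer versus one-buffer distinction drawn above can be illustrated with a toy sketch. It is only a model under stated assumptions: plain Python `bytearray` objects stand in for memory on two systems, and the function names (`rdma_style_send`, `two_buffer_demo`, `shared_buffer_demo`) are hypothetical. It does not use any real RDMA verbs or Power10 interface.

```python
# Toy model only: Python objects stand in for memory on two systems.
# Nothing here talks to real RDMA hardware or to Power10 clustered memory.


def rdma_style_send(source_buffer: bytearray, target_buffer: bytearray) -> None:
    """Copy from a sender-owned buffer into a separate receiver-owned buffer."""
    target_buffer[:len(source_buffer)] = source_buffer


def two_buffer_demo() -> tuple:
    source = bytearray(b"hello")  # stands in for a buffer in system A's memory
    target = bytearray(5)         # stands in for a buffer in system B's memory
    rdma_style_send(source, target)
    source[0:1] = b"J"            # a later write on A is not seen by B
    return bytes(source), bytes(target)


def shared_buffer_demo() -> tuple:
    shared = bytearray(b"hello")  # the one buffer both systems know about
    view_a = memoryview(shared)   # system A's mapping of that buffer
    view_b = memoryview(shared)   # system B's mapping of the same buffer
    view_a[0:1] = b"J"            # a store through A's mapping...
    return bytes(view_a), bytes(view_b)  # ...is visible through B's mapping
```

In the two-buffer model the receiver only sees what was explicitly transferred, whereas in the single-buffer model both sides observe the same bytes. That is the difference that might be hard to hide behind an RDMA-style interface.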

By: Mark Funk https://www.nextplatform.com/2021/08/13/the-power-of-power10s-memory-inception-clustering/#comment-165391 Mon, 16 Aug 2021 20:11:55 +0000 https://www.nextplatform.com/?p=138963#comment-165391 In reply to William Kelley.

Right. NUMA (Non-Uniform Memory Access) is a characteristic of pretty much any multi-socket SMP-based system. Such a cache-coherent SMP system is also called ccNUMA. It’s non-uniform in the sense that a core’s access to the memory hung off of its own socket tends to be faster than an identical access to memory hung off of another socket. NUMA, then, is a performance issue, not a functional issue. As long as a core can directly access memory in a cache-coherent manner, it is an SMP; again, a functional issue. I had considered folding NUMA into this article, but it was getting quite long as it was, and for now I wanted to stick with the functional aspects of it. But as a hopefully straightforward answer, notice that IBM’s PowerAXON is being used for cross-socket accesses within their (NUMA) SMPs, even for up to their 16-socket system. Now also notice that PowerAXON is being used for clustered memory as well.
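
The "performance issue, not a functional issue" point can be sketched with a small toy model. Everything in it is an assumption made for illustration: the class `ToySmp`, its methods, and the two latency constants are hypothetical and are not measured Power10 or PowerAXON figures.

```python
# Toy model: every core can load any address (the functional, SMP property);
# only the cost of the load depends on which socket the memory hangs off of
# (the NUMA, performance property).

LOCAL_NS = 100   # hypothetical latency for memory on the core's own socket
REMOTE_NS = 250  # hypothetical latency for memory on another socket


class ToySmp:
    def __init__(self, sockets: int, words_per_socket: int) -> None:
        self.words_per_socket = words_per_socket
        self.memory = [0] * (sockets * words_per_socket)

    def home_socket(self, address: int) -> int:
        """Return the socket whose local memory holds this address."""
        return address // self.words_per_socket

    def store(self, address: int, value: int) -> None:
        self.memory[address] = value

    def load(self, core_socket: int, address: int) -> tuple:
        """Return (value, modeled latency in ns) for a load from a core."""
        is_local = self.home_socket(address) == core_socket
        return self.memory[address], LOCAL_NS if is_local else REMOTE_NS
```

With `smp = ToySmp(sockets=2, words_per_socket=4)` and a value stored at address 5, a core on either socket loads the same value; only the modeled latency differs.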

By: William Kelley https://www.nextplatform.com/2021/08/13/the-power-of-power10s-memory-inception-clustering/#comment-165388 Mon, 16 Aug 2021 17:14:59 +0000 https://www.nextplatform.com/?p=138963#comment-165388 I kept expecting to see an explanation for how this architecture differs from NUMA (Non-Uniform Memory Access) or the Kendall Square Research (KSR-1) machine’s “All Cache Memory” architecture.

By: Paul Berry https://www.nextplatform.com/2021/08/13/the-power-of-power10s-memory-inception-clustering/#comment-165386 Mon, 16 Aug 2021 15:08:42 +0000 https://www.nextplatform.com/?p=138963#comment-165386 This is all really impressive. Is it substantively different from the Cray X1 or SGI Origin of 20 years ago? Obviously there are a lot more bytes and more bytes per second.

By: v.ang https://www.nextplatform.com/2021/08/13/the-power-of-power10s-memory-inception-clustering/#comment-165359 Sun, 15 Aug 2021 09:23:07 +0000 https://www.nextplatform.com/?p=138963#comment-165359 It reads like putting RDMA under the hood (or adding an ‘RDMA accelerator’), combined with smoothing out some network programming and process synchronization tasks that are complex for new engineers to grasp. No small feat by any means.

By: Eric Olson https://www.nextplatform.com/2021/08/13/the-power-of-power10s-memory-inception-clustering/#comment-165345 Sat, 14 Aug 2021 15:34:25 +0000 https://www.nextplatform.com/?p=138963#comment-165345 It would be nice, as future analysis, to see how the IBM memory inception architecture fits with modern Fortran’s notion of a coarray. Will there be compilers that support this in a natural way at launch?
