Comments on: The Bespoke Supercomputing Architecture That Stood the Test of Time https://www.nextplatform.com/2023/12/04/the-bespoke-supercomputing-architecture-that-stood-test-of-time/ In-depth coverage of high-end computing at large enterprises, supercomputing centers, hyperscale data centers, and public clouds. Wed, 06 Dec 2023 05:10:23 +0000 hourly 1 https://wordpress.org/?v=6.5.5 By: Slim Albert https://www.nextplatform.com/2023/12/04/the-bespoke-supercomputing-architecture-that-stood-test-of-time/#comment-217217 Wed, 06 Dec 2023 05:10:23 +0000 https://www.nextplatform.com/?p=143347#comment-217217 In reply to Nicole Hemsoth Prickett.

This is a super-fast moving field with enormous stakes and outsized competition between AIslingers — a wild-wild west exploration of new drug discovery frontiers, with associated fevers, snake oils, and journalistic opportunities (so we, readers, can keep up, without being trampled by the stampede!). In this rodeo atmosphere, one can expect wild claims and counterclaims by competing vendors (look! gold!?), and for academia and boffins in gov labs to offer a more objective and measured perspective.

For example, from Facebook folks (2021: https://www.pnas.org/doi/10.1073/pnas.2016239118 ):

“Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences” (wowee!)

and, from Meta folks (2022: https://www.biorxiv.org/content/10.1101/2022.07.20.500902v1.full ):

“Language models of protein sequences at the scale of evolution enable accurate structure prediction” (yippee!)

But, from 58 highly sedated academics (Sept. 2023, paywalled: https://www.nature.com/articles/s41573-023-00774-7 ), on “Artificial intelligence for natural product drug discovery” (a review paper), we find in the abstract:

“We also discuss how to address key challenges […] such as the need for high-quality datasets to train deep learning algorithms and […] algorithm validation”

Geez Louise, what party poopers! They probably just forgot to read the industry papers! You snooze, you lose! (eh-eh-eh!)

Interpolating between these suggests that there is great potential, but kinks still need ironing out …

]]>
By: Slim Jim https://www.nextplatform.com/2023/12/04/the-bespoke-supercomputing-architecture-that-stood-test-of-time/#comment-217208 Wed, 06 Dec 2023 01:14:33 +0000 https://www.nextplatform.com/?p=143347#comment-217208 In reply to Nicole Hemsoth Prickett.

I’m pretty much with Fabio on this, in that ML-oriented AI approaches require a lot of training data which may not be readily available for drug discovery applications at present (unlike for LLMs). Dr. Matysiak (former UMCP colleague: https://bioe.umd.edu/clark/faculty/169/Silvina-Matysiak ) is a folding expert who could likely provide a more nuanced answer on some of this.

I’ll also note UMCP’s recent DOE INCITE award (Nov. 29) of hundreds of thousands of node-hours on Aurora, Frontier, and Polaris, to advance AI/ML training and application on HPC machines, including healthcare and physician support (if I read it right: https://www.umiacs.umd.edu/about-us/news/umd-team-wins-doe-award-advance-ai-using-supercomputers ). Drug discovery might be a component of this I guess, if there is sufficient faculty interest and availability.

Meanwhile, I would expect that some AI approaches that are not particularly rooted in ML (or in the way that ML is currently applied) could find uses in drug discovery. For example, AI-oriented cognitive modeling and optimization strategies could surely be useful here, especially in combination with MD (IMHO).

From the angle of computational processes and architecture: in AI we have many artificial neurons acting in parallel and interacting through synapses (linear vector biasing, linear matrix-vector multiplication, nonlinear activation independent of other neurons), and in MD we have many atoms or molecules acting partly in parallel (momentum) and partly in interaction with others (e.g. the distance-dependent nonlinear Lennard-Jones potential), which is slightly more complex (symplectic integration could be a must?). So I think an arch designed for MD could work well in AI, but not necessarily the reverse. Also, AI can benefit from lower precision, but I’m not sure if MD can too (maybe mixed-precision?).
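
To make the contrast concrete, here is a toy 1-D sketch (not any real system’s pipeline; all numbers and function names are made up for illustration): on the AI side, one dense layer where the nonlinearity is applied per neuron, independently; on the MD side, pairwise Lennard-Jones forces fed into one velocity-Verlet step, a standard symplectic integrator.

```python
# --- AI side: one dense layer; after the shared matrix-vector product,
#     each neuron's nonlinearity is applied independently of the others ---
def dense_layer(W, b, x):
    # y_i = relu( sum_j W[i][j] * x[j] + b[i] )
    return [max(0.0, sum(w * xj for w, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

# --- MD side: pairwise forces couple every particle to every other one ---
def lj_force(r, eps=1.0, sigma=1.0):
    # Magnitude of the Lennard-Jones force, F(r) = -dU/dr
    #   = 24*eps*(2*(sigma/r)**12 - (sigma/r)**6)/r
    # positive = repulsive (short range), negative = attractive
    sr6 = (sigma / r) ** 6
    return 24.0 * eps * (2.0 * sr6 * sr6 - sr6) / r

def forces(xs):
    # O(N^2) all-pairs force sum in 1-D, with Newton's third law
    f = [0.0] * len(xs)
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            r = xs[j] - xs[i]
            fij = lj_force(abs(r)) * (1.0 if r > 0 else -1.0)
            f[i] -= fij
            f[j] += fij
    return f

def verlet_step(xs, vs, dt=1e-3):
    # One velocity-Verlet (symplectic) step, unit masses
    f0 = forces(xs)
    xs = [x + v * dt + 0.5 * fi * dt * dt for x, v, fi in zip(xs, vs, f0)]
    f1 = forces(xs)
    vs = [v + 0.5 * (a + b) * dt for v, a, b in zip(vs, f0, f1)]
    return xs, vs
```

The structural difference shows up right in the code: the layer is one matrix-vector product plus independent per-neuron activations, while the MD step needs two all-pairs force evaluations per timestep, which is the part a bespoke MD machine pipelines.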

In any case, if genAI can somehow speed up drug discovery, then so much the better (IMHO)!

]]>
By: Nicole Hemsoth Prickett https://www.nextplatform.com/2023/12/04/the-bespoke-supercomputing-architecture-that-stood-test-of-time/#comment-217198 Tue, 05 Dec 2023 19:37:35 +0000 https://www.nextplatform.com/?p=143347#comment-217198 In reply to Fabio.

Great, useful answer. Thanks for the insight.

]]>
By: Fabio https://www.nextplatform.com/2023/12/04/the-bespoke-supercomputing-architecture-that-stood-test-of-time/#comment-217194 Tue, 05 Dec 2023 19:11:48 +0000 https://www.nextplatform.com/?p=143347#comment-217194 In reply to Nicole Hemsoth Prickett.

Currently there is basically no public data (databases) for the conformational landscape of proteins, so there is no training data to use AI to do what they’re doing with MD. It’s not impossible to create it, e.g. with active learning using MD, but I’d say this is not going to happen in the short term. Five years from now, maybe.
AlphaFold-like models already have big difficulties with the lack of training data; see RoseTTAFoldNA with nucleic acid-protein complexes…
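
For readers unfamiliar with the active-learning idea mentioned above, here is a toy sketch of the loop: a surrogate model is trained on a few labeled conformations, and the expensive MD computation is only run on the conformations the surrogate is least certain about. Everything here is invented for illustration (the 1-D “energy landscape”, the nearest-neighbor uncertainty measure); real pipelines would use actual MD trajectories and a real uncertainty estimate.

```python
import random

def md_oracle(x):
    # Stand-in for an expensive MD computation: a made-up 1-D double-well
    # "energy landscape" (purely illustrative, not real physics)
    return (x * x - 1.0) ** 2

def uncertainty(x, labeled):
    # Toy uncertainty: distance to the nearest already-labeled point
    return min(abs(x - xl) for xl, _ in labeled)

def active_learning(pool, n_rounds=5, seed=0):
    rng = random.Random(seed)
    # Seed the dataset with two randomly chosen, MD-labeled conformations
    labeled = [(x, md_oracle(x)) for x in rng.sample(pool, 2)]
    for _ in range(n_rounds):
        seen = {xl for xl, _ in labeled}
        # Pick the unlabeled conformation the surrogate knows least about...
        x = max((c for c in pool if c not in seen),
                key=lambda c: uncertainty(c, labeled))
        # ...and pay for one MD evaluation only there
        labeled.append((x, md_oracle(x)))
    return labeled
```

The point of the loop is budget: each round buys exactly one oracle (MD) call, placed where it is expected to teach the model the most, instead of labeling the whole pool.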

]]>
By: Nicole Hemsoth Prickett https://www.nextplatform.com/2023/12/04/the-bespoke-supercomputing-architecture-that-stood-test-of-time/#comment-217188 Tue, 05 Dec 2023 18:12:07 +0000 https://www.nextplatform.com/?p=143347#comment-217188 In reply to Slim Jim.

I didn’t include this in the piece, but I wonder if even this MD powerhouse will be less effective than GenAI for drug discovery. I suspect you might know a thing or two about this given your background in MD. What do you think?

]]>
By: Slim Jim https://www.nextplatform.com/2023/12/04/the-bespoke-supercomputing-architecture-that-stood-test-of-time/#comment-217166 Tue, 05 Dec 2023 06:05:59 +0000 https://www.nextplatform.com/?p=143347#comment-217166 Fantastic article (IMHO)! It’s great to read about this “priceless” “fire-breathing monster for molecular dynamics simulations” (from first link, “Anton”, with many diagrams and photos). The second link, “NON-VON”, from 1982, is also superb; on page 6 we find:

“In the NON-VON primary processing Subsystem (PPS) […] a large number of very simple […] processing elements (PE’s) are, in effect, distributed throughout the memory.”

Right on! That’s near-memory (or in-memory) dataflow-style compute right there (as you certainly noted: “The data flow nature of the Anton systems […]”)!

Now, if this arch could be merged with that of AI’s fine-grained reconfigurable dataflow accelerators (usual suspects; not the chunky ones), it would make a real AI/MD multipurpose winner (I think)!

]]>