Comments on: Tearing Apart Google’s TPU 3.0 AI Coprocessor
https://www.nextplatform.com/2018/05/10/tearing-apart-googles-tpu-3-0-ai-coprocessor/

By: Mike Mallen https://www.nextplatform.com/2018/05/10/tearing-apart-googles-tpu-3-0-ai-coprocessor/#comment-106645 Thu, 18 Oct 2018 03:26:34 +0000
In reply to vangelis angelakos.

What is the utilization rate? That is the key question behind all of this hardware.

By: ErichF https://www.nextplatform.com/2018/05/10/tearing-apart-googles-tpu-3-0-ai-coprocessor/#comment-100110 Mon, 09 Jul 2018 08:58:20 +0000
Nice article, thank you! One detail is hard to believe:
“Last year, we estimated that TPUv2 consumed 200 watts to 250 watts per chip. We now know that includes 16 GB of HBM in each package, with 2.4 TB/sec bandwidth between the MXU and HBM.”

A single HBM2 stack can do at most 256 GB/s, and in practice less. That’s why the V100 has “only” 900 GB/s with four HBM2 stacks. The two HBM2 stacks of TPUv2 would therefore give us roughly ~500 GB/s.
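A rough sanity check of that claim in Python, assuming the nominal 256 GB/s HBM2 peak per stack and the derating implied by the published V100 number:

# Rough sanity check of HBM2 bandwidth. Assumes the nominal 256 GB/s
# peak per stack; real stacks run slower, as the V100 spec implies.
HBM2_PEAK_PER_STACK = 256                # GB/s, nominal peak for one HBM2 stack

v100_nominal = 4 * HBM2_PEAK_PER_STACK   # 1024 GB/s for four stacks
v100_actual = 900                        # GB/s, Nvidia's published V100 figure
derate = v100_actual / v100_nominal      # ~0.88

tpuv2_estimate = 2 * HBM2_PEAK_PER_STACK * derate
print(f"TPUv2 estimate: ~{tpuv2_estimate:.0f} GB/s")  # ~450 GB/s, nowhere near 2.4 TB/s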

By: Guangbin https://www.nextplatform.com/2018/05/10/tearing-apart-googles-tpu-3-0-ai-coprocessor/#comment-95527 Wed, 23 May 2018 10:31:05 +0000
Is every TPU rack built by combining two open racks side by side and removing the adjacent brace and side plate?

By: vangelis angelakos https://www.nextplatform.com/2018/05/10/tearing-apart-googles-tpu-3-0-ai-coprocessor/#comment-95491 Sun, 20 May 2018 19:16:18 +0000
Is there a public reference for this: ‘..Cloud TPU using 32 lanes of cabled PCI-Express 3.0 (28 GB/s for each link)’?
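For context, the per-link figure is at least consistent with PCIe 3.0 lane arithmetic; a quick sketch, assuming standard 8 GT/s signaling with 128b/130b encoding:

# PCIe 3.0 lane arithmetic: 8 GT/s signaling, 128b/130b encoding
gb_per_lane = 8 * (128 / 130) / 8   # ~0.985 GB/s per lane, per direction
lanes = 32

raw = lanes * gb_per_lane           # ~31.5 GB/s raw
print(f"raw: {raw:.1f} GB/s")       # protocol overhead brings effective
                                    # throughput down toward ~28 GB/s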

Thank you!

By: Blade Meng https://www.nextplatform.com/2018/05/10/tearing-apart-googles-tpu-3-0-ai-coprocessor/#comment-95490 Wed, 16 May 2018 19:43:45 +0000
In reply to jimmy.

Imagine needing professional knowledge, reading tons of tech documents, and spending extra time on configuration; no one wants to waste time on infrastructure.

That’s why people use the cloud to rent TPUs or GPUs rather than buying them when doing their own training as a startup or for research. The highly efficient training capability saves people a lot of time, with no configuration or optimization work required.

People buy GPUs to build their own training centers for two reasons: 1. The scale is too large to rent from the cloud, and renting that many GPUs is also too expensive. 2. They want to keep their training data private. No one can be sure that the data fed into a GPU cloud stays theirs alone; the cloud platform could use it too (illegal, but possible).

Another reason people will not buy TPUs for now is that they only support TensorFlow: no Caffe, no PyTorch, no MXNet, no CNTK. No one wants to be locked into a platform.

So it is reasonable that Google will never sell TPUs as products on the market, only as a cloud service.

By: Paul Teich https://www.nextplatform.com/2018/05/10/tearing-apart-googles-tpu-3-0-ai-coprocessor/#comment-95489 Mon, 14 May 2018 14:35:22 +0000
In reply to Jack Smith.

Every time someone says this I cringe.
Yes, pricing is inexpensive when there is no guarantee of service. Google’s Cloud TPU instances are great to benchmark with, but they are pre-production (beta) and not yet priced as a supported production service. The benchmarked GPU instances are priced with an SLA as a production-worthy service. (SLA = Service Level Agreement.)

From Google’s Cloud TPU Pricing page
https://cloud.google.com/tpu/docs/pricing
(as of 14 May 2018):

Beta
This is a Beta release of Cloud TPU. This release is not covered by any SLA or deprecation policy and is not intended for real-time use in critical applications.

By: Blade Meng https://www.nextplatform.com/2018/05/10/tearing-apart-googles-tpu-3-0-ai-coprocessor/#comment-95487 Mon, 14 May 2018 14:10:58 +0000
Quote: “However, Google also restated that its TPUv2 pod clocks in at 11.5 petaflops. An 8X improvement should land a TPUv3 pod at a baseline of 92.2 petaflops, but 100 petaflops is almost 9X. We can’t believe Google’s marketing folks didn’t round up, so something is not quite right with the math. This might be a good place to insert a joke about floating point bugs, but we’ll move on.”

I asked Zak Stone about this question; he said the 8x performance increase between the TPUv3 pod and the TPUv2 pod is the minimum seen in some benchmarks, and the better numbers will be 10x or more.

And I guess the 100+ petaflops Google mentioned is from a Linpack benchmark (or a tensor benchmark, etc.): the pure peak performance.
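The arithmetic behind the quoted discrepancy is easy to reproduce; a quick sketch using Google’s stated pod figures:

# Reproducing the 8x-vs-almost-9x discrepancy from the quoted passage
tpuv2_pod_pf = 11.5                 # petaflops, Google's stated TPUv2 pod figure
tpuv3_pod_pf = 100.0                # petaflops, Google's "100+" TPUv3 pod figure

print(8 * tpuv2_pod_pf)             # 92.0 -> a straight 8x baseline
print(tpuv3_pod_pf / tpuv2_pod_pf)  # ~8.7x -> the ratio "100 petaflops" implies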

----

BTW, can I get your permission to translate and quote parts of your article in Chinese? I want to publish these good pictures and this analysis through my we-media account on Tencent WeChat.

By: jimmy https://www.nextplatform.com/2018/05/10/tearing-apart-googles-tpu-3-0-ai-coprocessor/#comment-95486 Sun, 13 May 2018 07:27:23 +0000
In reply to Jack Smith.

The GPUs are cheaper if you buy them yourselves.

You can’t buy TPUs.

GPU rental pricing is currently pushed up by huge demand.

By: Jack Smith https://www.nextplatform.com/2018/05/10/tearing-apart-googles-tpu-3-0-ai-coprocessor/#comment-95485 Sat, 12 May 2018 02:51:43 +0000
In reply to jimmy.

Google could not do WaveNet and offer it at a competitive price without the TPUs. Plus, from the pricing we can see the TPUs are about half the cost of using Nvidia.

https://medium.com/@8fee9a760280/c2bbb6a51e5e

By: Jack Smith https://www.nextplatform.com/2018/05/10/tearing-apart-googles-tpu-3-0-ai-coprocessor/#comment-95484 Sat, 12 May 2018 02:44:01 +0000
The TPUv2 was about half the cost of using Nvidia for the same work.

https://medium.com/@8fee9a760280/c2bbb6a51e5e
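For a rough sense of where “about half” could come from, here is a back-of-the-envelope sketch; the hourly rates are assumed 2018 GCP list prices, and throughput parity between one Cloud TPU and four V100s is taken from the linked benchmark rather than measured here:

# Back-of-the-envelope cost comparison (assumed 2018 GCP list prices;
# TPU-vs-4xV100 throughput parity is an assumption from the linked post)
tpu_per_hour = 6.50                       # USD/hr, one Cloud TPU (4 TPUv2 chips)
v100_per_hour = 2.48                      # USD/hr per on-demand V100
gpu_setup_per_hour = 4 * v100_per_hour    # 9.92 USD/hr for four V100s

print(tpu_per_hour / gpu_setup_per_hour)  # ~0.66 per hour; if the TPU also
                                          # trains faster, cost per run
                                          # approaches half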

It will be interesting to see how much further ahead Google is now with TPUv3.

But most impressive is being able to offer WaveNet at a cost competitive with the old TTS technique used by everyone else.

Pushing 16,000 samples per second through a neural network in real time is just hard to believe. Nvidia has their work cut out for them.
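To put that in perspective, here is the per-sample time budget that real-time 16 kHz generation implies, assuming one forward pass per sample as autoregressive synthesis requires:

# Per-sample time budget for real-time autoregressive audio at 16 kHz
sample_rate = 16_000                 # samples per second
budget_us = 1e6 / sample_rate        # microseconds available per sample

print(f"{budget_us:.1f} us per forward pass")  # 62.5 us: every network
                                               # evaluation must finish inside
                                               # this window to keep up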
