Initial post from LinkedIn:
https://www.linkedin.com/posts/lukenorris_lots-of-people-are-talking-about-the-newly-activity-7235084404799201280-uPQP?utm_source=share&utm_medium=member_desktop
Lots of people are talking about the newly released MLPerf results from NVIDIA, but somewhat muted is the continuing proof of 18-month cycles of PLANNED OBSOLESCENCE for the LARGEST CAPEX purchases in technology history!!!
The NVIDIA B200, as highlighted in the image, delivers nearly 4x the inferencing performance of the previous-generation H100, achieving an impressive 11,264 tokens per second compared to the H100’s 3,065 tokens per second.
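A quick sanity check of the “nearly 4x” figure, using only the per-card tokens-per-second numbers quoted above:

```python
# Per-card MLPerf inference throughput figures quoted in the post (tokens/second).
h100_tokens_per_s = 3_065
b200_tokens_per_s = 11_264

speedup = b200_tokens_per_s / h100_tokens_per_s
print(f"B200 vs. H100 per-card speedup: {speedup:.2f}x")  # ~3.68x, i.e. "nearly 4x"
```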
This performance leap is not just the result of tweaks or optimized models. The B200 supports FP4 (4-bit floating point) inferencing with a fidelity that meets modern AI workload demands, achieved through NVIDIA Quasar Quantization, a software feature that leverages the B200’s tensor cores optimized for FP4. Importantly, this capability cannot be retrofitted to earlier generations of NVIDIA hardware, which further underscores the planned obsolescence of each NVIDIA chipset generation.
This post extends my previous discussion of the planned obsolescence strategy employed by NVIDIA.
It’s clear that this is not just a marketing narrative: NVIDIA’s consistent generational advances in hardware and software, proven by the math, quickly render previous models like the H100 obsolete in the face of newer innovations.
Let’s break down the numbers. Suppose the B200 hits the market at the same price point as the H100, around $40,000 per card. Both cards operate within nearly identical power envelopes of approximately 1,000 watts, so an 8-way system draws about 10 kW once host CPUs, networking, and cooling overhead are included.
For simplicity, assume the monthly operational cost of running an 8-way B200 system is $3,500. To match the B200’s performance with H100s, you’d need four H100 systems, which would cost around $14,000 per month in power alone. That delta in operating costs lets you recoup the capital expenditure (CapEx) of upgrading to the B200 within roughly 18 months, depending on actual power and colocation pricing, and that is before even considering the PERFORMANCE differential!
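As a rough back-of-envelope sketch, here is that comparison with the post’s figures treated purely as assumptions ($40,000 per card, ~$3,500 per month to run an 8-way system, four H100 systems to match one B200 system):

```python
# Back-of-envelope payback calculation using the post's illustrative figures.
# All inputs are assumptions from the post, not measured costs.
cards_per_system = 8
b200_card_price = 40_000                  # assumed B200 price, same as the H100
b200_system_capex = cards_per_system * b200_card_price               # $320,000

monthly_cost_per_system = 3_500           # assumed monthly power cost per 8-way system
h100_systems_to_match = 4                 # H100 systems needed to match one B200 system

b200_monthly_cost = monthly_cost_per_system                          # one B200 system
h100_monthly_cost = h100_systems_to_match * monthly_cost_per_system  # ~$14,000

monthly_delta = h100_monthly_cost - b200_monthly_cost                # ~$10,500
payback_months = b200_system_capex / monthly_delta

print(f"Monthly operating delta: ${monthly_delta:,.0f}")
print(f"Payback on power savings alone: {payback_months:.0f} months")
```

With these exact inputs the power-only payback works out closer to 30 months; higher power or colocation costs, resale of the displaced H100s, or monetizing the extra throughput would pull it toward the 18-month mark.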
The B200 provides nearly 4x more compute per card than the H100 at roughly the same power, which means that within the power restrictions of your data center you could achieve roughly 4x more aggregate inferencing capacity by upgrading all H100s to B200s.
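The same point as a sketch for a fixed facility power budget (the 100 kW budget is an arbitrary example; per-system power and per-card throughput are the post’s figures):

```python
# Aggregate inference throughput at a fixed facility power budget.
facility_power_kw = 100        # arbitrary example budget
system_power_kw = 10           # ~10 kW per 8-way system, either generation (per the post)
cards_per_system = 8

h100_tokens_per_s = 3_065      # per-card throughput quoted in the post
b200_tokens_per_s = 11_264

systems = facility_power_kw // system_power_kw   # same system count for both fleets
h100_fleet = systems * cards_per_system * h100_tokens_per_s
b200_fleet = systems * cards_per_system * b200_tokens_per_s

print(f"H100 fleet: {h100_fleet:,} tokens/s")
print(f"B200 fleet: {b200_fleet:,} tokens/s ({b200_fleet / h100_fleet:.1f}x)")
```

Because the per-system power envelope is essentially unchanged, the capacity gain inside a fixed power budget tracks the per-card speedup.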
I also wonder whether anyone is taking this into consideration: the impact on suppliers such as Dell Technologies and Supermicro, who will see consistent upgrade purchases, or on service providers such as CoreWeave and Denvr Dataworks, which need to recoup capital and turn a profit before buying the next generation. This is FAR faster than normal depreciation cycles, and asset turnover ratios would need to be 5 or 6 to make sense!
KamiwazaAI, Daniel Newman, Keith Townsend, Patrick Moorhead, Ryan Shrout, Matt Baker, Guido Appenzeller, Val Bercovici, David Nicholson, Rob Parish