Replace Internet with AI in the following quote from the New York Times, November 11, 1996[0]:
"For many people, AI could replace the functions of a broker, whose stock in trade is information, advice and execution of transactions, all of which may be cheap and easy to find on line. AI also is prompting some business executives to wonder whether they really need a high-priced Wall Street investment bank to find backers for their companies when they may now be able to reach so many potential investors directly over the Net. And ultimately, AI's ability to bring buyers and sellers together directly may change the very nature of American financial markets."
It's a cautionary tale. Obviously, the Internet did live up to the hype. Just after it wiped out millions of retail investors...
[0]https://www.nytimes.com/1996/11/11/business/slow-transition-...
The .com boom of the late 90s was different. Companies who had very little to do with the internet were adding ".com" to their name. I was a penny stock trader and that was one of the fastest ways companies would increase value -- add ".com" and issue a press release about how they plan to be "internet enabled" or "create a web presence".
Today most companies aren't getting a bump by talking about AI. You don't see Spotify changing their name to Spotify.AI. Companies are dabbling in offering AI, e.g., SnapChat, but they aren't shifting their whole focus to AI.
Now there is an industry of small companies building on AI, and I think that's healthy. A handful will find something of value. Going back to the early .com days -- I remember companies doing video/movie streaming over the web and voice chats. None of those early companies, AFAIK, still exist. But the market for this technology is bigger than it's ever been.
Intel peaked at $73.94 in September 2000. It's currently $28.99. In the interim there have been no splits.
NVidia has split 5 times (cumulative 48x) since 2000. It closed 2000 around $2.92. It is currently $389.93. Total gain ~6400x. If you ignore the last 12 months, NVidia's last peak was $315 in 2021, for a total gain of 5178x-ish.
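Quick sanity check of that arithmetic (a rough sketch, treating the $2.92 close as unadjusted and 48x as the cumulative split factor, per the numbers above):

    # Back-of-envelope check of the split-adjusted return figures quoted above.
    # Inputs are the numbers from this comment, not verified market data.
    close_2000 = 2.92        # unadjusted year-2000 close (per the comment)
    split_factor = 48        # cumulative effect of 5 splits (per the comment)
    price_now = 389.93
    price_2021_peak = 315.00

    def total_gain(price_today: float) -> float:
        """Value today of one year-2000 share (now split_factor shares), divided by its cost."""
        return price_today * split_factor / close_2000

    print(f"Gain to today:     {total_gain(price_now):,.0f}x")        # ~6,410x
    print(f"Gain to 2021 peak: {total_gain(price_2021_peak):,.0f}x")  # ~5,178x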
Meanwhile the hedge funds and institutional investors are just trying to ride the momentum while it lasts, which could be for a while.
Nvidia is pricing on actual revenue growth (~16%?) and projected growth (~20%). Since 2016 they've been killing it.
AI will turn into a bubble when unrelated companies begin being priced like that, without historical or current revenue growth to back up their projections, simply by virtue of being AI-associated.
Nvidia is different in that they're the ones selling the hardware. AI isn't going anywhere imo, and while the spike Nvidia is seeing atm may subside a little, I doubt it: as minor players give up, stronger players will still need more hardware anyway.
Tbh imagine being Nvidia:
* Known for dominating the gaming market; consumers buy plenty of Nvidia cards and always will
* Workstation cards have always been used for CAD/rendering digital media and always will be
* Nvidia hardware is used in plenty of supercomputers
* The crypto craze hit and Nvidia sold a bajillion cards for that. I imagine 2nd-hand mining cards have impacted the consumer arm of their biz, but probably not too much; I've seen people avoid buying crypto cards unless they're offered at a very low price
* Nvidia has sold cards to people doing AI for a long time, but now the AI boom has started and they're making bank
Basically they've enjoyed the crypto boom and are now enjoying the AI boom, but even if the AI boom declines to zero (it won't) they can still fall back on their workstation/consumer hardware.
The reason I don't think the AI boom will end is that, besides companies smashing AI in for no reason, actual applications of it are incredibly useful. I still remember friends being amazed that they could search their Google Photos by "dog" or "cat" (which, as furries, we find hilarious because those searches turn up fursuiters).
Did Intel ever 'grow' into their massively overvalued valuation? No. Their stock has never even reached its September 2000 peak.
There is a chance that AMD, Intel, maybe Google etc. catch up with Nvidia in a year or two and data center GPUs become a commodity (clearly the entry bar should be lower than it was for x86 CPUs back in the day), and what happens then?
Realistically, there is next to zero chance Intel (especially given the Arc catastrophe and their foundry capabilities) or AMD (laundry list of reasons) catch up within 2 years.
Safe bet Google's TPUv5 will be competitive with the H100, as the v4 was with the A100, but their offering clearly hasn't impacted market share thus far and there is no indication Google intends to make their chips available outside of GCP.
With that said I also agree the current valuation seems too high, but I highly doubt there is a serious near-term competitor. I think it is more likely that current growth projections are too aggressive and demand will subside before they grow into their valuation, especially as the space evolves with open source foundation models and techniques come out (like LoRA/PEFT) that substantially reduce demand for the latest chips.
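To make the LoRA/PEFT point concrete, here is roughly what that looks like with the Hugging Face peft library; a minimal sketch only, where the base model name is just an illustrative pick and the hyperparameters are typical values rather than anything canonical:

    # Why LoRA/PEFT-style fine-tuning eases hardware demand: only a small set of
    # adapter weights is trained, so far less GPU memory is needed than for full
    # fine-tuning. Model choice and settings below are illustrative assumptions.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, TaskType, get_peft_model

    base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

    lora_cfg = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        r=8,              # low-rank dimension of the adapters
        lora_alpha=16,
        lora_dropout=0.05,
    )
    model = get_peft_model(base, lora_cfg)

    # Prints something like: "trainable params: ... || all params: ... || trainable%: <1%"
    model.print_trainable_parameters()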
1. You can buy mini versions of their chips through Coral (coral.ai). But yea, they’d never sell them externally as long as there exists a higher-margin advantage to selling software on top of them, and chips have supply constraints.
2. Google can sell VMs with the tensor chips attached, like GPUs. Most organizations with budgets that’d impact things will be using the cloud. If Apple/MSFT/AWS/Goog/Meta start serious building their own chips, NVidia could be left out of the top end.
They have already been doing this for quite a while now and even when offered free via TRC barely anyone uses TPUs. There is nothing to suggest that Google as an organization is shifting focus to be the HPC cloud provider for the world.
As it stands TPU cloud access really seems ancillary to their own internal needs.
> If Apple/MSFT/AWS/Goog/Meta start serious building their own chips, NVidia could be left out of the top end.
That's a big "if", especially within two years, given that this chip design/manufacturing isn't really a core business interest for any of those companies (other than Google which has massive internal need and potentially Apple who have never indicated interest in being a cloud provider).
They certainly could compete with Nvidia for the top-end, but it would be really hard and how much would the vertical integration actually benefit their bottom line? A 2048 GPU SuperPOD is what, like 30M?
There's also the risk that the not-always-friendly DoJ gets anti-trusty if a cloud provider has a massive advantage and is locking the HW in their walled garden.
What are you basing that on? I'm not aware of GCP having released any numbers on their usage.
I'm making that statement from my experience (easily several hundred publications read or reviewed over 3 years): it is very uncommon to see TPUs mentioned or TRC acknowledged in any non-Google transformer paper (especially major publications) dating back to the early BERT family of models, despite the fact that Google is very generous with research credits (they'll give out preemptible v3-32s and v3-64s for 14 days with little question, presumably upgraded now as I haven't asked for credits in a while).
Fully acknowledge this isn't quality evidence to back my claim and I'm happy to be proven wrong but I'm very confident a literature review would support this as when I tried to use TPUs myself I couldn't find much.
This doesn't account for industry use; there is probably a non-insignificant number of enterprise customers still using AutoML (I can think of a few at least), which I believe uses the TPU cloud, but I would be surprised if many use TPU nodes directly outside of Jax shops like Cohere and anyone still using TF.
PyTorch XLA just breaks too much otherwise, and when I last tried to use it in January of this year there was still quite a significant throughput reduction on TPUs. Additionally, when using nodes there is a steeper learning curve on the ops side (VM, storage, Stackdriver logging) that makes working with them harder than spinning up an A100x8, which is relatively cheap, cheaper than the GCP learning curve for sure.
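For anyone who hasn't touched it, a minimal sketch of what the PyTorch/XLA path adds on top of an ordinary training loop (assuming torch_xla is installed and you're on a TPU VM; the toy model and data are only for illustration):

    # Minimal PyTorch/XLA loop. The XLA-specific pieces are the device handle
    # and the XLA-aware optimizer step that triggers graph compilation/execution.
    import torch
    import torch_xla.core.xla_model as xm

    device = xm.xla_device()                      # TPU core instead of "cuda"
    model = torch.nn.Linear(128, 10).to(device)   # toy model, just for illustration
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

    x = torch.randn(64, 128).to(device)
    y = torch.randint(0, 10, (64,)).to(device)

    for _ in range(3):
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(x), y)
        loss.backward()
        xm.optimizer_step(optimizer, barrier=True)  # marks the step boundary so XLA executes the graph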
Isn't Medical Informatics inherently biased against the cloud? That's my uninformed guess as an outsider.
A) Nvidia's TAM is not really what the stock is priced for
B) Google will try to enter this market and compete
Either way NVDA looks perilously pricey, not that that is very predictive of anything (see TSLA).
The Intel® Data Center GPU Max Series outperforms the Nvidia H100 PCIe card by an average of 30% on diverse workloads [1], while independent software vendor Ansys shows a 50% speedup for the Max Series GPU over H100 on AI-accelerated HPC applications. [2] The Xeon Max Series CPU, the only x86 processor with high bandwidth memory, exhibits a 65% improvement over AMD's Genoa processor on the High Performance Conjugate Gradients (HPCG) benchmark [1], using less power. High memory bandwidth has been noted as among the most desired features for HPC customers. [3] 4th Gen Intel Xeon Scalable processors – the most widely used in HPC – deliver a 50% average speedup over AMD's Milan [4], and energy company BP's newest 4th Gen Xeon HPC cluster provides an 8x increase in performance over its previous-generation processors with improved energy efficiency. [2] The Gaudi2 deep learning accelerator performs competitively on deep learning training and inference, with up to 2.4x faster performance than Nvidia A100.
https://www.intel.com/content/www/us/en/newsroom/news/intel-...
Arc is manufactured using TSMC N6.
Intel originally wanted to use Intel 4 but it wasn’t ready yet. Maybe the next batch of GPUs assuming Meteor Lake and their other CPUs don’t consume all the Intel 4 capacity.
Also Arc hardware-wise is fine for what it is and the process node it’s using - N6 isn’t a leading edge node to my knowledge. Drivers are unfortunately something that’s going to take time to fix up - there is no way around this.
Given Intel 4 is launching at the end of the year, I would expect their focus will be on catching up with AMD on CPUs and the next-gen Arc GPUs. Assuming everything goes well with their yields and they have extra foundry time (which they won't be using as part of IFS), will they have the institutional energy/capital/will to open a new software+hardware battle in a market the entrenched Nvidia will fight to the death for?
It seems extremely unlikely to me within 1-2 years.
People have said AAPL was overvalued perennially for as long as I can remember, yet their market performance seems to ignore these opinions.
On the other hand, a big part of it also comes down to the tool chain, and NVIDIA owns CUDA. Until OpenCL or other gpu platforms catch up, it seems like NVIDIA can continue to corner the gpu market at large.
> People have said AAPL was overvalued perennially
Yes, but they were saying this when AAPL's P/E ratio was in the low teens, and now it's near 30. It was never near the insanity that is NVDA. I will grant that there's a lot of uncertainty about the future, but there's immense optimism baked in right now. It will be hard to live up to.
The A750 and A770 are tremendous GPUs and compete very well with anything Nvidia has in those brackets (and Intel is willing to hammer Nvidia on price, as witnessed by the latest price cuts on the A750). Drivers have rapidly improved in the past few quarters. It's likely given how Nvidia has chosen to aggressively mistreat its customers that Intel will surpass them on value proposition with Battlemage.
The reason is that they can get away with it, because there's so much demand for their product. Were Nvidia to see AMD release a 4090 equivalent at half the price, they need only reduce their own ridiculous prices and take less of a profit margin.
There is no significant competition to the NVIDIA A100 and H100 for machine learning.
This being the operative part of the statement. If we're talking top-end GPUs it's not even close.
> Intel is willing to hammer Nvidia on price
They also have no choice, Intel's spend on Arc has been tremendous (which is what I mean by catastrophe, everything I've read suggests this will be a huge loss for Intel). I doubt they have much taste for another loss-leader in datacenter-level GPUs right now, if they even have the manufacturing capacity.
Most likely, all their prices go up...
I mean, your first instinct is to say, "but how could all their prices go up, they'll steal value from each other", but that's not necessarily true. If AI starts solving useful problems, and especially if it starts requiring multi-modality to do so, I would expect total GPU processing demand to increase by 10,000-100,000x over what we have now.
Now, you're going to say "What's going to pay for this massive influx of GPU power by corporations". And my reply would be "Corporations not having to pay for your health insurance any longer".
I mean, maybe it's not a fair comparison but I don't see why the datacenter/GPGPU market won't end up the same way. Nvidia is notorious for trying to lock in users with proprietary tech too, though people don't seem to mind.
If you take dividends into account it did break even a few years ago, at least in nominal terms.
Cisco and Sun Microsystems may be even better comparables though.
"Figures show its [NVidia] AI business generated around $15bn (£12bn) in revenue last year, up about 40% from the previous year and overtaking gaming as its largest source of income"
Oddly, in updates to the article they rewrote a lot of it, and that line is missing, but you can still see it if you search for it.
"The highest analyst price target for Cisco stock before the dot-com crash was $125 per share. This target was set by Merrill Lynch analyst Henry Blodget in April 2000, just as the dot-com bubble was beginning to burst. Blodget's target was based on his belief that Cisco was well-positioned to benefit from the continued growth of the Internet."
I was looking to compare with analyst targets set for NVDA yesterday. Analysts now are saying the exact thing about Nvidia being able to capture the continued growth of AI:
"JPMorgan set its price target to $500 Wednesday, double its previous estimate and among the highest out of the big banks. Analyst Harlan Sur said this is the “first massive wave of demand in generative AI,” with more gains to follow. He reiterated his overweight rating on the stock."
The ironic bit of course is that my own research here is powered by Bard which probably used an NVDA gpu to train it. But even those dot-com analyst calls were probably emailed around on equipment sold by Cisco.
If I were holding that stock right now, regardless of how right these analysts end up being over the next year or so, I would sell today.
Google uses in-house TPUs for Bard.
https://www.hpcwire.com/2023/04/10/google-ai-supercomputer-s...
The current AI wave is 95% hype (ultimately useless/broken crap invoking LLM APIs or AI art app du jour) but some of the companies are clearly useful (transcription, summarization, categorization, code generation, next-gen search engine, etc.) and will disrupt traditional services and scale large.
And AI infra companies (AI hardware, AI software on top of hardware, and generic AI model SaaS) will make tons of money as those app companies scale.
In addition, Two Minute Papers viewers have seen that AI-generated media is coming fast. Soon we'll go from Unity/Unreal having an AI "assistant" that can generate models ("make a chair for two characters in the same style as this single-person chair") to "based on the current information you know about this game world, generate a new zone for the player that includes x, y, z challenges and resources; create models, textures, and animations for all of this", etc. And this is only the implications for making games, let alone all the other stuff we could get it to do.
The video on automatic animations (https://www.youtube.com/watch?v=wAbLsRymXe4 and others) is super cool. Once refined, it's going to be possible to have a system that can generate a character model, texture it, automatically animate it for that particular character (young, old, how many limbs) and adjust as needed ("right foot becomes injured, so limp"), with generated voices and unique dialogue set within the realm of the overall game world. I think main plots will still be controlled by game makers, but interaction with rando NPCs/side-quests could be totally organic.
Nonetheless, the AI news cycle is continuous (like .COM was) and the attribution of NVDA's +25% romp to the prospects of AI grabs the attention of retail investors, who tuned in to see AVGO +20% and the likes of MSFT, TSLA, NFLX and GOOG add 5% in 2 days. The longer that goes on, the more we'll see investors looking for reasons that companies will benefit from AI and want to buy in. Then companies that don't have a strong AI story will need to get on the train and start buying all the AI startups that have materialized over the last couple of years. Then we start seeing AI IPOs with increasingly sketchy histories. (Sorry, .COM PTSD kicking in...)
All this could happen in a weak market. In fact, strong returns in AI during a weak overall market will simply call more attention to it.
https://www.bloomberg.com/news/articles/2002-03-31/a-talk-wi...
A very good rule of thumb: if someone mentions dividends when discussing valuation, they are clueless. It doesn't always work (paying high dividends has implications ranging from clueless management to political pressure on the company), but it's a very good rule that the argument is nonsense.
The fact someone is willing to pay $100 for one share doesn't mean every share is worth $100.
The fair value of a stock should always depend on the expected cash flow you can receive by holding the stock in perpetuity. Nobody can predict the future, so nobody really knows what the fair value is.
But, if you had 1 trillion dollars and still wouldn't want to pay 1 trillion to acquire an entire company, because you feel you very likely can't make that 1 trillion back, then it's fair to say the company is not worth 1 trillion to you.
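A toy sketch of that "expected cash flow in perpetuity" idea, with completely made-up cash flows and discount rate, purely to show the mechanics:

    # Toy discounted-cash-flow calculation: a few explicit years of cash flow,
    # then a growing perpetuity (Gordon growth) for everything after that.
    # All numbers are illustrative assumptions, not estimates for any company.
    def present_value(cash_flows, discount_rate, terminal_growth):
        """Discount yearly cash flows plus a growing perpetuity after the last year."""
        pv = sum(cf / (1 + discount_rate) ** (t + 1) for t, cf in enumerate(cash_flows))
        terminal = cash_flows[-1] * (1 + terminal_growth) / (discount_rate - terminal_growth)
        pv += terminal / (1 + discount_rate) ** len(cash_flows)
        return pv

    # e.g. five years of projected free cash flow (in $bn), then 3% growth forever,
    # discounted at 10%:
    print(present_value([5, 7, 9, 11, 13], discount_rate=0.10, terminal_growth=0.03))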
Sun Microsystems CEO Scott McNealy in 2002 (source https://smeadcap.com/missives/the-mcnealy-problem/#:~:text=A....)
We're on the precipice of obviating 80% of white collar work, and 99% of Graeber's Bullshit Jobs.
But give it a few years and I'm really curious how regulatory and licensing bodies react because they have almost always moved uniformly in whichever direction is necessary to suppress wages. There are few exceptions to this (e.g. physicians). The output benefits of worker + AI could potentially lead to some professional services becoming dirt cheap, while others become ludicrously expensive.
I'm also curious what this means for immigration. For the West, the primary justification to siphon the world's talent fundamentally vanishes. That's talent that potentially stays put and develops in non-Western countries. For countries where the entire country is a demographic ponzi using immigrants to prevent collapse, it's potentially an existential problem.
"humans as agents who are consistently rational and narrowly self-interested, and who pursue their subjectively defined ends optimally."
We will figure out new, irrational and suboptimal ways to make new bullshit jobs.
The AI ethics department will be hiring a ton of people.
Personally, I did really wish this would have been a new-era moment where society would take a step back and evaluate how we are organizing ourselves (and living, even), but I fear that AI comes too late for us, in the sense that we're so rusted and backwards now that we cannot accept it. Or any important change, in fact. It's pretty depressing.
To avoid this, countries need to plan for and mitigate the social effects of economic dislocation, such as UBI. Unfortunately that ain't gonna happen. Brace yourselves.
About two months ago, I bought three shares of Nvidia stock. I noticed that no one appears to be doing serious AI/ML research with AMD hardware, and I also noticed that Nvidia's stock hadn't spiked yet with the rise of ChatGPT and Stable Diffusion.
For once I was actually right about something in the stock market...About a dozen more accurate predictions and I'll finally make up the money I lost from cryptocurrency.
I think plenty have noticed, but can't get their heads around investing in a company with a 150x P/E.
NVDA forward P/E is still eye watering, but it's much lower.
Nvidia's next-quarter guidance (which some media misinterpreted as 2H FY24) is $11B, at least ~50% ahead of the current forecast and the best market expectation, and they are TSMC capacity constrained. Revenue expectations for FY2024 are anywhere between 50% and 80% above FY23. Considering their revenue will be mostly from data center, margins are expected to be better, and forward earnings at the higher end could be double FY23's. Meaning the forward P/E (taking the current ~220 P/E from Yahoo, assuming that is correct) would be ~110. Some sites are putting the forward P/E at 50; I have no idea where that figure came from. So someone correct me if I am wrong.
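Spelling out my arithmetic so it's easy to correct (all inputs are the rough figures above, not authoritative numbers):

    # Back-of-envelope version of the forward P/E reasoning in this comment.
    current_pe = 220          # trailing P/E quoted from Yahoo (assumed correct)

    # If forward earnings roughly double versus FY23, the price stays the same
    # while E doubles, so the forward multiple is halved:
    forward_pe_if_earnings_double = current_pe / 2
    print(forward_pe_if_earnings_double)   # ~110, not the ~50 some sites quote

    # To reach a forward P/E of ~50 at today's price, earnings would need to grow:
    required_growth = current_pe / 50
    print(required_growth)                 # ~4.4x trailing earnings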
But the market currently (at least as I read it in the news and some comments here) is expecting them to DOUBLE again in FY25.
Data centers currently spend about $25B in total on Intel and AMD. Nvidia is looking at around $30B this year.
But again, Nvidia's GPUs are only used for training. The actual mass market and volume are still in inference. I don't see how $30B a year is sustainable in the long term. But I don't think this year and next will be a problem; I guess that is what the market wants to see anyway.
It is interesting at this sort of margin and volume. Now the tide has turned and Nvidia will be able to use and afford leading-edge nodes for their GPUs. Apple will no longer be TSMC's only option, although still preferred due to the smaller die size of phone SoCs.
It is only a matter of time before Nvidia further grows their market with ARM CPUs, which they haven't put too much effort into yet. They already have network cards from Mellanox, and those are doing great.
But here is a wild idea: acquisition. If Nvidia were to spend money to buy, which company would it be? Qualcomm is currently only worth $120B. Nvidia would get access to Qualcomm's patents and their SoC line, finally setting foot in mobile.
I don't think any company in history has achieved that when it was already at Nvidia's scale. Obviously when a company is much smaller it can achieve greater growth.
Forward P/E is just current P divided by some estimated future E. You can use your own earnings estimate, or commonly you can use sell side analyst estimates.
Whenever a PE looks expensive people are expecting very large increases in E. They aren't just trying to buy expensive stuff.
You may think it's a bubble. It's not obviously a bubble to anyone who understands the current capabilities and how locked in nvidia is with their gpus and cuda. It might end up being expensive in retrospect. It might not.
Two months ago the stock was 60% up since ChatGPT was released and 150% up since October’s low.
You cannot look at the prior price to make a purchase decision. You have to look at the future projected revenues per share, and apply an industry standard multiple.
If you hesitate to buy a stock you like because you feel annoyed that you didn't buy it for cheaper the day/week/month before, you will nerf yourself.
“You have to look at the future projected revenues per share, and apply an industry standard multiple.”
Work out to?
The stock was (and is) extremely expensive. Good luck to all buying this thing at 30x sales. It doesn't make any sense.
- NVIDIA: https://www.google.com/finance/quote/NVDA:NASDAQ?sa=X&ved=2a...
- AMD: https://www.google.com/finance/quote/AMD:NASDAQ?sa=X&ved=2ah...
At the end of the day doubling your money is exceedingly rare, especially on any single security, no sense feeling bad you didn't 10x it.
Don't be like me: never, never, never, never feel bad about selling shares for a profit. Sell it, go on about your day (IOW, quit looking at $STOCK, you don't own it anymore), take the spouse/SO out for a nice dinner if you made some serious bank.
Which reminds me that now might be a good time to unload that NVDA I've been holding. I'm not completely unteachable.
[0] Somewhat oversimplified for discussion purposes.
This forum is filled with people who sold all sorts of tech stocks way too early (or too late), and people nerding out over things, tossing them, and then watching them magically gain tons of value over time - I'm thinking about all of my super early CCGs that I tossed when cleaning house, the 20 bitcoin I mined for fun way back in 2012 or whenever and then deleted from my laptop (that I then sold on eBay for $100), the 10k of AAPL I bought for like $5 and sold for $10, etc. etc.
Same with all the early job opps and what not too - but we're the sum of our life choices till now and that's OK. :)
A $300k return on a $900 investment.
What's funny is that we on HN know there's no magic inside these chips, a sufficiently smart foundry could easily cripple Nvidia overnight... yet where's the nearest VC fund for ASICs??
If you meant outsmarting Nvidia: Google's TPU is already more efficient, but a GPU is much more than an efficient design.
NVDA has a trailing twelve months (TTM) price-to-earnings (P/E) ratio of 175x. Based on the latest quarter and forward guidance they have a forward-looking P/E ratio of 50x, so the market is already expecting (and has priced in) even more growth on top of where the stock already is.
NVDA is expected to at least double their already great growth (to get to P/E of 25x) according to the market. I have my doubts.
You can compare this to the historical averages of the S&P 500: https://www.multpl.com/s-p-500-pe-ratio
I may have missed the news. Where did they mention they are going to make 3.5X the profits in their forward guidance or forward looking P/E ?
Assuming consumer revenue stays roughly the same (crypto usage being the largest variable), the data center segment has to grow at least 6x in revenue.
The TTM price/earnings ratio is even crazier, as the market is expecting them to grow earnings 9x from what they made in the last year (to get back to a 20x P/E).
They expect NVDA to not only dominate the GPU market, but to have a breakthrough in AI or contribute to one, which would lead to way more money.
Also have to look at the fact that any "AI" portfolio is going to be heavily weighted toward NVDA stock. And people who may be hedging against a rise in AI, or buying into said rise, are investing in AI portfolios/ETFs, and thereby in a portion of that NVDA.
It's not as simple as how the people above are explaining it.
I'm really curious to see where NVDA stands on Tuesday morning.
Microsoft Office rode the same type of paradigm to dominate the desktop app market.
Comparatively few people have “deep” experience with CUDA (basically Tensorflow/Pytorch maintainers, some of whom are NVIDIA employees, and some working in HPC/supercomputing).
CUDA is indeed sticky, but the reason is probably because CUDA is supported on basically every NVIDIA GPU, whereas AMD’s ROCm was until recently limited to CDNA (datacenter) cards, so you couldn’t run it on your local AMD card. Intel is trying the same strategy with oneAPI, but since no one has managed to see a Habana Gaudi card (let alone a Gaudi2), they’re totally out of the running for now.
Separately, CUDA comes with many necessary extensions like cuSparse, cuDNN, etc. Those exist in other frameworks but there’s no comparison readily available, so no one is going to buy an AMD CDNA card.
AMD and Intel need to publish a public accounting of their incompatibilities with PyTorch (no one cares about Tensorflow anymore), even if the benchmarks show that their cards are worse. If you don’t measure in the public no one will believe your vague claims about how much you’re investing into the AI boom. Certainly I would like to buy an Intel Arc A770 with 16GB of VRAM for $350, but I won’t, because no one will tell me that it works with llama.
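For what it's worth, about the most anyone can easily check today is what their PyTorch build was compiled against, which is nowhere near the accounting I'm asking for. A quick sanity check using only the standard torch API:

    # Which accelerated backend does this PyTorch build actually see?
    # This only reports the build/runtime situation; it says nothing about
    # whether a specific model (e.g. a llama variant) runs without op gaps,
    # which is the real question.
    import torch

    print("torch:", torch.__version__)
    print("CUDA build:", torch.version.cuda)      # None on non-CUDA builds
    print("ROCm/HIP build:", torch.version.hip)   # None on non-ROCm builds
    print("GPU visible:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("device:", torch.cuda.get_device_name(0))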
It would seem a great startup idea, with the intent of getting acqui-hired by AMD or Intel, to get into the details of these incompatibilities and/or performance differences.
At worst it seems you could pivot into some sort of passive income AI benchmarking website/YT channel similar to the ones that exist for Gaming GPU benchmarks.
For example, I recently went looking into Numba for AMD GPUs. The answer was basically, "it doesn't exist". There was a version, it got deprecated (and removed), and the replacement never took off. AMD doesn't appear to be investing in it (as far as anyone can tell from an outsider's perspective). So now I've got a code that won't work on AMD GPUs, even though in principle the abstractions are perfectly suited to this sort of cross-GPU-vendor portability.
NVIDIA is years ahead not just in CUDA, but in terms of all the other libraries built on top. Unless I'm building directly on the lowest levels of abstraction (CUDA/HIP/Kokkos/etc. and BLAS, basically), chances are the things I want will exist for NVIDIA but not for the others. Without a significant and sustained ecosystem push, that's just not going to change quickly.
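To make the Numba example concrete, this is the kind of kernel I mean; the abstraction itself isn't NVIDIA-specific, but in practice numba.cuda is the only maintained GPU target, so code like this simply won't run on AMD hardware today (a small illustrative sketch, not taken from my actual codebase):

    # A vendor-neutral-looking kernel written against Numba's GPU abstraction.
    # It expresses "out = a*x + y" per element, yet only compiles for CUDA targets.
    import numpy as np
    from numba import cuda

    @cuda.jit
    def axpy(a, x, y, out):
        i = cuda.grid(1)             # global thread index
        if i < out.shape[0]:
            out[i] = a * x[i] + y[i]

    n = 1 << 20
    x = np.random.rand(n).astype(np.float32)
    y = np.random.rand(n).astype(np.float32)
    out = np.zeros_like(x)

    threads = 256
    blocks = (n + threads - 1) // threads
    # NumPy arrays are copied to/from the device automatically for convenience.
    axpy[blocks, threads](np.float32(2.0), x, y, out)

    assert np.allclose(out, 2.0 * x + y)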
How big an effort would it take to get those libraries to work with AMD drivers?
To give some perspective, see @ngimel’s comments and PRs in Github. That’s what AMD and Intel are competing against, along with confidence that optimizing for ML customers will pay off (clearly NVIDIA can justify the investment already).
On top of that, Intel is making a serious effort to get into this space and they have a better history of making usable libraries. OpenVINO is already pretty good. It's especially good at having implementations in both Python and not-Python, the latter of which is a huge advantage for open source development because it gets you out of Python dependency hell. There's a reason the thing that caught on is llama.cpp and not llama.py.
CUDA compiles to hardware-agnostic intermediate binaries which can run on any hardware as long as the target feature level is compatible, and you can target multiple feature levels with a single binary.
CUDA code compiled 10 years ago still runs just fine; ROCm requires recompilation every time the framework is updated and every time new hardware is released.
First, there is no forward-compatibility guarantee when compiling, and based on current history it always breaks.
Secondly, even if the code is available, a design that breaks software on other users' machines is stupid and anti-user.
Plenty of projects could import libraries and then themselves be upstream dependencies for other projects, many of which may not be supported.
CUDA is king because people can and still do run 15-year-old compiled CUDA code on a daily basis, and they know that what they produce today is guaranteed to work on all current and future hardware.
With ROCm you have no guarantee that it would work on even the hardware from the same generation and you pretty much have a guarantee that the next update will break your stuff.
This was a problem with all AMD compilers for GPGPU and ROCm should’ve tried to solve it from day 1 but it still adopted a poor design and that has nothing to do with how many people are working on it.
Most things work like this. You can't natively run ARM programs on x86 or POWER or vice versa, but in most languages you can recompile the code. If you have libraries then you recompile the libraries. All it takes is distributing the code instead of just a binary. Not distributing the code is stupid and anti-user.
> This was a problem with all AMD compilers for GPGPU and ROCm should’ve tried to solve it from day 1 but it still adopted a poor design and that has nothing to do with how many people are working on it.
It isn't even a design decision. Compilers will commonly emit machine code that checks for hardware features like AVX and branch to different instructions based on whether the machine it's running on supports that. That feature can be added to a compiler at any time.
The compiler is open source, isn't it? You could add it yourself, absent any resource constraints.
Also if you expect anyone to compile anything you probably haven’t shipped anything in your life.
ROCm is a pile of rubbish; until they throw it out and actually have a model that guarantees forward and backward compatibility, it will remain useless for anyone who actually builds software other people use.
Your x86 program doesn't work on Apple Silicon without something equivalent to a recompile. Old operating systems very commonly can't run on bare metal new hardware because they don't have drivers for it.
Even the IR isn't actually machine code, it's just a binary format of something that gets compiled into actual machine code right before use.
> Also if you expect anyone to compile anything you probably haven’t shipped anything in your life.
Half the software people run uses JIT compilation of some kind.
The "NVIDIA might leave graphics and just do AI in the future!" take that people sometimes trot out is just batshit, because it's graphics that opens the door to all these platforms, and it's graphics that a lot of these accelerators center around. What good is DLSS without a graphics platform? Do you sign the Mediatek deal without a graphics platform? Do you give up workstation graphics and OptiX and raysampling and all these other raytracing techs they've spent billions developing, or do you just choose to do all the work of making Quadros and all this graphics tech but then not do gaming drivers and give up that gaming revenue and all the market access that comes with it? It's faux-intellectualism and ayymd wish-casting at its finest; it makes zero sense when you consider the leverage they get from this R&D spend across multiple fields.
CUDA is unshakeable precisely because NVIDIA is absolutely relentless in getting their foot in the door, then using that market access to build a better mousetrap with software that everyone else is constantly rushing to catch up to. Every segment has some pain points and NVIDIA figures out what they are and where the tech is going and builds something to address that. AMD's approach of trying to surgically tap high-margin segments before they have a platform worth caring about is fundamentally flawed, they're putting the cart before the horse, and that's why they keep spinning their wheels on GPGPU adoption for the last 15 years. And that's what people are clamoring for NVIDIA to do with this idea of "abandon graphics and just do AI" and it's completely batshit.
Intel gets it, at least. OneAPI is focused on being a viable product and they'll move on from there. ROCm is designed for supercomputers where people get paid to optimize for it - it's an embedded product, not a platform. You can't even use the binaries you compile on anything except one specific die (not even a generation: "this binary is for Navi 21, you need the Navi 23 binary"). CUDA is an ecosystem that people reach for because there's tons of tools and libraries and support, and it works seamlessly and you can deliver an actual product that consumers can use. ROCm is something that your boss tells you you're going to be using because it's cheap, you are paying to engineer it from scratch, you'll be targeting your company's one specific hardware config, and it'll be inside a web service so it'll be invisible to end-users anyway. It's an embedded processor inside some other product, not a product itself. That's what you get from the "surgically tap high-margin segments" strategy.
But the Mediatek deal is big news. When we were discussing the ARM acquisition etc people totally scoffed that NVIDIA would ever license GeForce IP. And when that fell through, they went ahead and did it anyway. Because platform access matters, it's the foot in the door. The ARM deal was never about screwing licensees or selling more tegras, that would instantly destroy the value of their $40b acquisition. It was 100% always about getting GeForce as the base-tier graphics IP for ARM and getting that market access to crack one of the few remaining segments where CUDA acceleration (and other NVIDIA technologies) aren't absolutely dominant.
And graphics is the keystone of all of it. Market access, software, acceleration, all of it falls apart without the graphics. They'd just be ROCm 2.0 and nobody wants that, not even AMD wants to be ROCm. AMD is finally starting to see it and move away from it, it would be wildly myopic for NVIDIA to do that and Jensen is not an idiot.
Not entirely a direct response to you but I've seen that sentiment a ton now that AI/enterprise revenue has passed graphics and it drives me nuts. Your comment about "what would it take to get Radeon ahead of CUDA mindshare" kinda nailed it, CUDA literally is winning so hard that people are fantasizing about "haha but what if NVIDIA got tired of winning and went outside to ride bikes and left AMD to exploit graphics in peace" and it's crazy to think that could ever be a corporate strategy. Why would they do that when Jensen has spent the last 25 years building this graphics empire? Complete wish-casting, “so dominant that people can’t even imagine the tech it would take to break their ubiquity” is exactly where Jensen wants to be, and if anything they are still actively pushing to be more ubiquitous. That's why their P/E are insane (probably overhyped even at that, but damn are they good).
If there is a business to be made doing only AI hardware and not a larger platform (and I don’t think there is, at that point you’re a commodity like dozens of other startups) it certainly looks nothing like the way nvidia is set up. These are all interlocking products and segments and software, you can’t cut any one of them away without gutting some other segment. And fundamentally the surgical revenue approach doesn’t work, AMD has continuously showed that for the last 15 years.
Being unwilling to catch a falling knife by cutting prices to the bone doesn't mean they don't want to be in graphics. The consumer GPU market is just unavoidably soft right now, almost regardless of actual value (see: 4070 for $600 with a $100 GC at microcenter still falling flat). Even $500 for a 4070 is probably flirting with being unsustainably low (they need to fund R&D for the next gen out of these margins), but if a de-facto $500 price doesn't spark people's interest/produce an increase in sales, they're absolutely not going any lower than that this early in the cycle. They'll focus on margin on the sales they can actually make, rather than chasing the guy who is holding out for the 4070 to be $329. People don't realize it, but obstinately refusing to buy at any price (even a good deal) is paradoxically creating an incentive to just ignore them and chase margins.
It doesn’t mean they don’t want to be in that market but they’re not going to cut their own throat, mis-calibrate consumer expectations, etc.
Just as AMD is finding out with the RX 7600 launch - if you over-cut on one generation, the next generation becomes a much harder sell. Which is the same lesson Nvidia learned with the 1080 Ti and 20-series. AMD is having their 20-series moment right now: they over-cut on the old stuff and the new stuff is struggling to match the value. And the expectation of future cuts is only going to dampen demand further; they're Osborne Effect'ing themselves with price cuts everyone knows are coming. Nvidia smartened up - if the market is soft and the demand just isn't there… make fewer gaming cards and shift to other markets in the meantime. Doesn't mean they don't want to be in graphics.
My sibling commenter is shadowbanned, but if you look into their comment history, there are occasionally comments that are not dead. How does this happen?
I needed two for a project and ended up paying a lot more than I wanted for used ones.
For those not familiar, consumer/hobbyist grade TPUs:
Tensorflow lost out to Pytorch because the former is grossly complex for the same tasks, with a mountain of dependencies, as is the norm for Google projects. Using it was such a ridiculous pain compared to Pytorch.
And anyone can use a mythical TPU right now on the Google Cloud. It isn't magical, and is kind of junky compared to an H100, for instance. I mean...Google's recent AI supercomputer offerings are built around nvidia hardware.
CUDA keeps winning because everyone else has done a horrendous job competing. AMD, for instance, had the rather horrible ROCm, and then they decided that they would gate their APIs to only their "business" offerings while nvidia was happy letting it work on almost anything.
Something AMD doesn't seem to understand/accept is that since they are consistently lagging nVidia on both the hardware and software front, nVidia can get away with some things AMD can't. Everyone hates nVidia for it, but unless/until AMD wises up they're going to keep losing.
Just to give you a crude metaphor - buying NVDA is like buying a $10 million house to collect $10,000 in rent a year. The price to earnings is bonkers. This valuation only makes sense if somehow Nvidia is using alien technology that couldn't possibly be reproduced in the next two decades by any other company.
Apple is non-viable for LLM workloads.
This may seem like a very low bar to clear, but AMD continues to struggle with it. I don't understand it. They act as if GPU compute was a fad not worth investing in.
https://geohot.github.io//blog/jekyll/update/2023/05/24/the-...
CUDA works, ROCm doesn't work well. Very few people want to run stable diffusion inference, fine tune LLaMA, train a large foundation model on AMD cards.
OpenAI has put in some work on Triton, Modular is working on Mojo, and tiny corp is working on their alternative.
Until some of those alternatives work as well as CUDA, people will mostly choose to buy Nvidia cards.
The monopoly is under attack from multiple angles, but they'll be able to print some good cash in the (potentially long) meantime.
Oh, and still significant supply shortages at many cloud providers. And now Nvidia's making more moves to renting GPUs directly. It'll be interesting to see how long it takes them to be able to have their supply meet demand.
Edit: Nevermind, found a huge thread from 2 days ago Lol.
“The current crop of AI chip companies failed. Many of them managed to tape out chips, some of those chips even worked. But not a single one wrote a decent framework to use those chips. They had similar performance/$ to NVIDIA, and way worse software. Of course they failed. Everyone just bought stuff from NVIDIA.”
https://geohot.github.io//blog/jekyll/update/2023/05/24/the-...
TPUs are good hardware, but TPUs are not available outside of GCP. There's not as much of an incentive for other companies to build software around TPUs like there is with CUDA. The same is likely true of chips like Cerebras' wafer scale accelerators as well.
Nvidia's won a stable lead on their competition that's probably not going to disappear for the next 2-5 years and could compound over that time.
The effort to make (PyTorch) code run on TPUs is not worth it and my lab would rather rent (discounted) Nvidia GPUs than use free TRC credits we have at the moment. Additionally, at least in Jan 2023 when I last tried this, PyTorch XLA had a significant reduction in throughput so to really take advantage you would probably need to convert to Jax/TF which are used internally at Google and better supported.
Note there are added costs when using V4 nodes such as the VM, storage and logging which can get $$$.
> where for GPU model need to fit in NVlink connected GPUs
Huh, where is this coming from? You can definitely efficiently scale transformers across multiple servers with parallelism and 1T is entirely feasible if you have the $. Nvidia demonstrated this back in 2021.
Because Nvidia created a supercomputer with the A100, with a lot of focus on networking. Cloud providers don't give that option.
Pretty sure MosaicML also does this but I haven't used their offering.
https://www.amazon.science/blog/scaling-to-trillion-paramete...
I don’t really see the Nvidia monopoly on ML training stopping anytime soon.
That said, AMD used to be in a dire financial situation, whereas now they can afford to fix their shit and actually give chase. NVIDIA has turned the thumb screws very far and they can probably turn them considerably further before researchers jump, but far enough to justify 150x? I have doubts.
It's a little irritating that Nvidia has monopolized the GPGPU market so effectively; a part of me wonders if the best that AMD could do is just make a CUDA-compatibility layer for AMD cards.
Or am I misunderstanding CUDA? I think of it as something like OpenGL/DirectX.
It reminds me of when the YOLOv3 model came out and every single upgrade just gave us more and more features and capabilities (the v8 has auto segmentation).
AMD dropped the ball on this, just like Intel when Ryzen dropped, I just don't see a way for them to bring it around.
Particularly applicable here, couldn't resist myself.
1) Heavy short option interest going into earnings
2) A large beat announced in after hours
Major market players can take advantage of large earnings surprises by manipulating a stock in after hours. It is possible to trigger very large movements with very little volume because most participants don't have access to after hours trading.
When the market opens the next day the "false" gains should typically be wiped out unless the move is large enough to force the closing of certain positions. In this case, it looks like there was a clamor to escape long puts and short calls.
The momentum behind NVDA as well as some other tech stocks right now (SMCI, META, NFLX) is frankly stunning. Nary a dip for 6 months. There is so much FOMO in the AI trade that I don't think it crashes back down to earth very soon. Still I'm way too scared to try to get in late.
And in this case the F in FOMO is real. Not just a feeling of missing out, but fear that all your other investments are going to zero as AI replaces entire industries, for example.
- Crypto
- AI
- Gaming / Entertainment
- Self driving cars
- VR / Metaverse (whatever that is)
I'm very bullish on the company.
Enterprise compute has been and will continue to be NVidia's bread and butter going forward, and they have been betting on this for the past decade. Whether enterprise compute is for AI, studio graphics, simulation, FSD, etc., those are all more lucrative and imo more interesting from a growth perspective vs their gaming segment. B2B companies have much higher ceilings than B2C.
We know they'll be motivated, but can they actually compete is the question.
AMD is shitting the bed but might find partners in this space.
Apple is a dark horse and they are very much into the on-device AI abilities.
Amazon, Microsoft, Google can all throw endless amounts of software and hardware at the problem and there have been many advances in the AI space regarding training smaller models that can compete with the big ones.
Open source is going nuts in this space, as well as a ton of academic research. This is likely the biggest dark horse. Major advances are happening weekly.
Nvidia effectively has no competition right now due to AMDs software issues. It's hard to see how that can continue with how big their cap is. Someone will be able to create a competitive product.
Don't they have specialized chips?
PS - you can literally go to https://www.nvidia.com/en-us/ and read the menu to see what they do...
When it comes to owning a share, it eventually needs to make the investor money through dividends or price appreciation. The argument for high PE ratio is price appreciation (growth), but exponential growth is very hard to sustain, so PE ratio has to come down to a certain level in the long term. Also, there is always a risk of a company declining or even folding.
At least you have to actually look in the eyes the guy you are screwing over.
You can't buy stuff just because you think other people will also buy; that would mean you are buying/selling opinions, not companies.
If you can predict the next hype cycle or when exactly this one will end, you will make a zillion dollars.
If you are talking about entire markets rather than individual companies then p/e matters because the only examples we have of markets with extreme p/e ratios all ended in disaster.
You should still buy stocks for the long term regardless of PE ratios because you're right, it's not possible to predict what they'll do. Unfortunately when you buy at a high pe you can't anticipate as much return in the future, all else equal.
And yet another reminder of how far behind OpenCL/AMD is.
AI hardware is useless without software ecosystem (AMD and Intel could tell a story about that).
Latest marketing materials of Tenstorrent tell stories about great chips and licensing deals, but not a single word about the software side of the things.
Compare that to how much NVidia talks about software on its presentations.
Going to have to buy a car whose doors open in a new way.
How many A100 or H100 cards are actually manufactured annually? A few hundred thousand, if that?
Suddenly, there's a big demand. Microsoft mentioned buying something like 25,000 of the H100 cards for GPT-4 and ongoing training. I'm certain they're not paying retail pricing, so that's a few hundred million in revenue for NVIDIA. They're probably the biggest single customer right now, except perhaps for Amazon.
NVIDIA's revenue in 2022 was $27 billion. The additional H100 cards they've sold this year are a fraction of that. Their retail prices have spiked and availability has dropped because supply is inelastic and there aren't any other suppliers with equivalent products... yet.
Fundamentally, an H100 is not that different from a desktop GPU! It's a little bigger, the math units have a different ALU balance, and it uses high-bandwidth memory (HBM), but that's it. There's nothing else really special about it. Unlike a CPU, which is extremely complex, a GPU is a relatively simple unit repeated over 10K times. In some sense, it's a copy-paste exercise.
NVIDIA has a tiny moat, because AMD simply didn't bother to go after what was -- until now -- a relatively small market.
That market is going to be huge, but that invites competition! When tens or even hundreds of billions are on the table, you can bet your bottom dollar that AMD, Intel, Google, and even Facebook won't sit idly by and watch NVIDIA walk off with it.
So what moat does NVIDIA have?
CUDA is like assembly language. PyTorch can target other back-ends. Compilers can target other GPU instruction sets. Throw a billion dollars at this, and it suddenly becomes an eminently solvable problem. Just look at Apple's CPU transitions and Amazon rolling out ARM cloud servers.
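You can already see this at the framework level: a sketch of what backend-agnostic PyTorch code looks like (standard API, nothing exotic, and the model here is just a placeholder):

    # "PyTorch can target other back-ends" in practice: the model code never
    # mentions CUDA specifically; it just asks for whatever accelerator the
    # local build supports.
    import torch

    if torch.cuda.is_available():             # covers both CUDA and ROCm builds
        device = torch.device("cuda")
    elif torch.backends.mps.is_available():   # Apple silicon
        device = torch.device("mps")
    else:
        device = torch.device("cpu")

    model = torch.nn.Linear(1024, 1024).to(device)
    x = torch.randn(32, 1024, device=device)
    print(device, model(x).shape)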
A card with HBM memory? AMD did that first! They already have the tech.
A GPU + CPU hybrid with unified memory? Both Intel and Apple have either existing or upcoming products. Intel for example just abandoned[1] a HPC CPU design that was a combo of a GPU+CPU surrounded by HBM chips acting as a cache for terabytes of DDR5 memory -- ideal for training or running very large language models!
A GPU with a huge amount of 16-bit, 8-bit, or 4-bit ops/sec? Guess what: this is easier than getting high performance with 64-bit floats! You can literally brute force optimal circuit layouts for 4-bit ALUs. No need to be clever at all. All you need is the ability to manufacture "3nm" chips. TSMC does that, not NVIDIA. Intel and Samsung are catching up, rapidly.
Fundamentally, the programming interfaces 99% of AI researchers use are high-level languages like Python or maybe C++. Compilers exist. Even within CUDA, diverse instruction sets and capabilities exist.
So.. where's the moat!?
[1] Ooo, I bet they feel real stupid right now for throwing in the towel literally months before the LLM boom started taking off.
On the one hand it's cool to see programming-language tech as the keystone; on the other hand it's frustrating and tragic that the whole software stack and dev-experience landscape in GPU/TPU land is so crap, and the bar is so low that NVIDIA wins with a hard-to-use proprietary C++-based language and presides over a fragmented landscape of divided-and-conquered competition. Makes you wish the Intel Larrabee etc. open-platform direction had won out.
The amount of software written for CUDA pales in comparison to the amount that has been written for Intel x86, yet two large companies migrated off it.
The lock-in with Intel was due to binary distribution (Windows software), and binary ABIs.
Everything to do with large language models is compiled from scratch, using high level languages.
The LLM code itself is a trivial thing, easily replicated in a matter of hours on other platforms. The hard part is gathering the training data and the compute.
The hard parts are not dependent on CUDA.
Look at it this way: Google developed Transformers and trained multiple LLMs using TPUs, not CUDA GPUs! The reason their LLMs are stupid is not because of the TPUs.
In principle it's easy to recompile the CPU-side stuff too, but there are 3rd-party component ecosystems, other dependency quirks, familiarity with the platform, familiar well-known performance properties, sharing the same questions and answers on Stack Overflow, etc. The lock-in network effect can be strong even if its individual contributing factors seem minor on their own.
I agree it's less locked in than eg Intel had it in the heyday. Ultimately we'll find out when the competition eventually shows up with reasons to consider switching.