Discussion about this post

Marcus Seldon

“Conceding that AI is doing more than just predicting the next word doesn’t actually mean you need to become an AI booster.”

I appreciate you saying this. I share your frustration with the stochastic parrot crowd, but I’ve also been seeing a lot of takes the past few weeks that act like disproving the stochastic parrot thesis means that the most extreme booster ideas are true. For example, that we’ll get superintelligence in 5-10 years (or less).

“For almost all tasks that can be done on a computer, you will make better predictions about Claude’s behavior if you predict that Claude will do what a very smart, very motivated human would do than if you do anything else.”

This strikes me as hyperbolic. Claude Code is shockingly good, but it can’t really operate GUI software, and I wouldn’t trust it to operate new software outside its training data without extensive documentation and hand-holding. Also, would you give Claude Code a credit card and let it book flights and hotels for a trip?

Even in the area of programming, Anthropic itself has over 100 engineering job openings. Clearly Claude Code still has real limitations.

ScienceGrump

This post was a weird flavor of aggressive ignorance. Harper is correct: all LLMs, at every stage of production, are next-token predictors. Fine-tuning shifts the distribution of predictions, the hidden portion of the prompt shifts it more, but almost all of the information in the LLM is embedded in the base model in any case. This is not some controversial take; it's an objective fact about what LLMs are, and your objections seem to boil down to "it doesn't *feel* like next token prediction to me when I use it, so obviously it's not!" Well, yeah. AI companies invest a lot of resources to make it feel like you're talking to a mind just a little different from yours. That's a big part of why they don't show you all the prompting text wrapped around yours: so the next-token prediction *sounds* like a character speaking to you. Until jailbreaking destroys the illusion.
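
If the mechanics sound abstract, here is the whole trick in a minimal sketch (using the Hugging Face transformers API with gpt2 as a stand-in base model; the hidden prompt text is made up, not what any vendor actually ships):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Stand-in base model; production systems are bigger, not different in kind.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

hidden_prompt = "You are a helpful assistant.\n\n"  # the wrapper you never see
user_text = "User: Are you just predicting tokens?\nAssistant:"

ids = tokenizer(hidden_prompt + user_text, return_tensors="pt").input_ids
for _ in range(40):
    with torch.no_grad():
        logits = model(ids).logits[0, -1]  # scores for every possible next token
    probs = torch.softmax(logits, dim=-1)  # fine-tuning/prompting only reshape this
    next_id = torch.multinomial(probs, 1)  # sample one token
    ids = torch.cat([ids, next_id.unsqueeze(0)], dim=1)  # append and repeat

print(tokenizer.decode(ids[0]))
```

Swap the hidden text and the same weights play a different character. That's the whole "mind."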

Your experience with the vaunted Claude is a lot more positive than mine, although I used their model through aider. I asked it to add a major feature to a project. I caught several bugs in the commit, but the more subtle ones took forever to track down, and it was more or less useless at helping. In the end it was at best break-even versus doing the whole thing by hand, and only because I'm not familiar with asyncio. I'm sure that if what you need is some trivial app or mod of a kind that is well represented in its training data, and that won't need to be maintained, it seems awesome. But if it's so great at generating software, where is the flood of software? If it's this amazing boost to productivity, where's the production?
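
To show what I mean by "subtle" (a made-up minimal example, not the code it actually generated): a check split from its action by an await reads fine in review but races the moment two tasks interleave.

```python
import asyncio

balance = 100

async def withdraw(amount: int) -> None:
    global balance
    if balance >= amount:       # check...
        await asyncio.sleep(0)  # ...yield to the event loop (any real await does this)
        balance -= amount       # ...act on a now-stale check

async def main() -> None:
    # Both tasks pass the check before either one subtracts.
    await asyncio.gather(withdraw(80), withdraw(80))
    print(balance)              # -60, not the 20 a sequential reading suggests

asyncio.run(main())
```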

Lastly, I have to laugh at this notion that writers are all pooh-poohing LLMs, except for Kelsey Piper, bravely swimming against the tide. Look, writers are *exactly* the people most primed to be awed by LLMs. My inbox every day is full of people announcing WHAT A BIG DEAL AI is, how YOU'RE ALL FOOLING YOURSELVES, same as it has been for the last three years. To be honest, the volume and repetition sometimes feel like a coordinated propaganda campaign.
