Do you like SCP foundation content? There is an SCP directly inspired by Eliezer and lesswrong. It's kind of wordy and long. And in the discussion the author waffled on owning that it was a mockery of Eliezer.
I think they also want recognition/credit for spending 5 minutes (or less) typing some words at an image generator as if that were comparable to people who develop technical skills and then create effortful meaningful work just because the outputs are (superficially) similar.
You had me going until the very last sentence. (To be fair to me, the OP broke containment and has attracted a lot of unironically delivered opinions almost as bad as your satirical spiel.)
The latest twist I'm seeing isn't blaming your prompting (although they're still eager to do that), it's blaming your choice of LLM.
"Oh, you're using shitGPT 4.1-4o-o3 mini _ro_plus for programming? You should clearly be using Gemini 3.5.07 pro-doubleplusgood, unless you need something locally run, then you should be using DeepSek_v2_r_1 on your 48 GB VRAM local server! Unless you need nice sounding prose, then you actually need Claude Limmerick 3.7.01. Clearly you just aren't trying the right models, so allow me to educate you with all my prompt fondling experience. You're trying to make some general point? Clearly you just need to try another model."
It can make funny pictures, sure. But it fails at art as an endeavor to communicate an idea, feeling, or intent of the artist, the promptfondler artists are providing a few sentences instruction and the GenAI following them without any deeper feelings or understanding of context or meaning or intent.
GPT-1 is 117 million parameters, GPT-2 is 1.5 billion parameters, GPT-3 is 175 billion, GPT-4 is undisclosed but estimated at 1.7 trillion. Token needed for training and training compute scale linearly (edit: actually I'm wrong, looking at the wikipedia page... so I was wrong, it is even worse for your case than I was saying, training compute scales quadratically with model size, it is going up 2 OOM for every 10x of parameters) with model size. They are improving ... but only getting a linear improvement in training loss for a geometric increase in model size, training time. A hypothetical GPT-5 would have 10 trillion training parameters and genuinely need to be AGI to have the remotest hope of paying off it's training. And it would need more quality tokens than they have left, they've already scrapped the internet (including many copyrighted sources and sources that requested not to be scrapped). So that's exactly why OpenAI has been screwing around with fine-tuning setups with illegible naming schemes instead of just releasing a GPT-5. But fine-tuning can only shift what you're getting within distribution, so it trades off in getting more hallucinations or overly obsequious output or whatever the latest problem they are having.
Lower model temperatures makes it pick it's best guess for next token as opposed to randomizing among probable guesses, they don't improve on what the best guess is and you can still get hallucinations even picking the "best" next token.
And lol at you trying to reverse the accusation against LLMs by accusing me of regurgitating/hallucinating.
Posts that explode like this are fun and yet also a reminder why the banhammer is needed.
Joe Rogan Experience!
...side note my most prominent irl conversation about Joe Rogan was with a relative who was trying to convince me it was a good thing that Joe Rogan platformed a celebrity who was saying 1x1=2 (Terrence Howard). Literally beyond parody.
What if [tokes joint] hallucinations are actually, like, proof the models are almost at human level man!
Eliezer Yudkowsky, Geoffrey Hinton, Paul Cristiano, Ilya Sustkever
One of those names is not like the others.
The promptfarmers can push the hallucination rates incrementally lower by spending 10x compute on training (and training on 10x the data and spending 10x on runtime cost) but they're already consuming a plurality of all VC funding so they can't 10x many more times without going bust entirely. And they aren't going to get them down to 0%, hallucinations are intrinsic to how LLMs operate, no patch with run-time inference or multiple tries or RAG will eliminate that.
And as for newer models... o3 actually had a higher hallucination rate because trying to squeeze rational logic out of the models with fine-tuning just breaks them in a different direction.
I will acknowledge in domains with analytically verifiable answers you can check the LLMs that way, but in that case, its no longer primarily an LLM, you've got an entire expert system or proof assistant or whatever that can operate independently of the LLM and the LLM is just providing creative input.
A junior developer learns from these repeated minor corrections. LLM's can't learn from them. they don't have any runtime fine-tuning (and even if they did it wouldn't be learning like a human does), at the very best past conversations get summarized and crammed into the context window hidden from the user to provide a shallow illusion of continuity and learning.
Given the libertarian fixation, probably a solid percentage of them. And even the ones that didn't vote for Trump often push or at least support various mixes of "grey-tribe", "politics is spiders", "center left", etc. kind of libertarian centrist thinking where they either avoided "political" discussion on lesswrong or the EA forums (and implicitly accepted libertarian assumptions without argument) or they encouraged "reaching across the aisle" or "avoiding polarized discourse" or otherwise normalized Trump and the alt-right.
Like looking at Scott's recent posts on ACX, he is absolutely refusing responsibility for his role in the alt-right pipeline with every excuse he can pull out of his ass.
Of course, the heretics who have gone full e/acc absolutely love these sorts of "policy" choices, so this actually makes them more in favor of Trump.
In terms of writing bots to play Pokemon specifically (which given the prompting and custom tools written I think is the most fair comparison)... not very well... according to this reddit comment a bot from 11 years ago can beat the game in 2 hours and was written with about 7.5K lines of LUA, while an open source LLM scaffold for playing Pokemon relatively similar to claude's or gemini's is 4.8k lines (and still missing many of the tools Gemini had by the end, and Gemini took weeks of constant play instead of 2 hours).
So basically it takes about the same number of lines written to do a much much worse job. Pokebot probably required relatively more skill to implement... but OTOH, Gemini's scaffold took thousands of dollars in API calls to trial and error develop and run. So you can write bots from scratch that substantially outperform LLM agent for moderately more programming effort and substantially less overall cost.
In terms of gameplay with reinforcement learning... still not very well. I've watched this video before on using RL directly on pixel output (with just a touch of memory hacking to set the rewards), it uses substantially less compute than LLMs playing pokemon and the resulting trained NN benefits from all previous training. The developer hadn't gotten it to play through the whole game... probably a few more tweaks to the reward function might manage a lot more progress? OTOH, LLMs playing pokemon benefit from being able to more directly use NPC dialog (even if their CoT "reasoning" often goes on erroneous tangents or completely batshit leaps of logic), while the RL approach is almost outright blind... a big problem the RL approach might run into is backtracking in the later stages since they use reward of exploration to drive the model forward. OTOH, the LLMs also had a lot of problems with backtracking.
My (wildly optimistic by sneerclubbing standards) expectations for "LLM agents" is that people figure out how to use them as a "creative" component in more conventional bots and AI approaches, where a more conventional bot prompts the LLM for "plans" which it uses when it gets stuck. AlphaGeometry2 is a good demonstration of this, it solved 42/50 problems with a hybrid neurosymbolic and LLM approach, but it is notable it could solve 16 problems with just the symbolic portion without the LLM portion, so the LLM is contributing some, but the actual rigorous verification is handled by the symbolic AI.
(edit: Looking at more discussion of AlphaGeometry, the addition of an LLM is even less impressive than that, it's doing something you could do without an LLM at all, on a set of 30 problems discussed, the full AlphaGeometry can do 25/30, without the LLM at all 14/30,* but* using alternative methods to an LLM it can do 18/30 or even 21/30 (depending on the exact method). So... the LLM is doing something, which is more than my most cynical sneering would suspect, but not much, and not necessarily that much better than alternative non-LLM methods.)
Despite the snake-oil flavor of Vending-Bench, GeminiPlaysPokemon, and ClaudePlaysPokemon, I've found them to be a decent antidote to agentic LLM hype. The insane transcripts of Vending-Bench and the inability of an LLM to play Pokemon at the level of a 9 year old is hard to argue with, and the snake oil flavoring makes it easier to get them to swallow.
It starts out seeming like a funny but petty and irrelevant criticism of his kitchen skill and product choices, but then beautifully transitions that into an accurate criticism of OpenAI.
This post has prompted me to give a reminder that one of the authors of AI 2027 predicted back in 2021 that "prompt programming" would be a thing by now.
On the other hand, it doesn't matter what Elon actually thinks, because he would probably go through with the lawsuit given any available pretext because he's mad he got kicked out and couldn't commercialize OpenAI exactly like Sam tried.
I knew the exact couple you were talking about before I read any additional comments. They seem to show up in the news like clockwork... do they have a publicist or PR agent looking for newspapers in need of garbage filler puff pieces? If anything, going to the white house is a step up from there normal pattern of self promotion.
Yeah they are normally all over anything with the word "market" in it, with an almost religious like belief in market's ability to solve things.
My suspicion is that the writer has picked up some anti-Ukrainian sentiment from the US right wing (which in order to rationalize and justify Trump's constant sucking up to Putin has looked for any and every angle to tear Ukraine down). And this anti-Ukrainian sentiment has somehow trumped their worship of markets... Checking back through their posting history to try to discern their exact political alignment... it's hard to say, they've got the Scott Alexander thing going on where they use disconnected historical examples crossed with a bad analogies crossed with misappropriated terms from philosophy to make points that you can't follow unless you already know their real intended context. So idk.
I found a neat essay discussing the history of Doug Lenat, Eurisko, and cyc here. The essay is pretty cool, Doug Lenat made one of the largest and most systematic efforts to make Good Old Fashioned Symbolic AI reach AGI through sheer volume and detail of expert system entries. It didn't work (obviously), but what's interesting (especially in contrast to LLMs), is that Doug made his business, Cycorp actually profitable and actually produce useful products in the form of custom built expert systems to various customers over the decades with a steady level of employees and effort spent (as opposed to LLM companies sucking up massive VC capital to generate crappy products that will probably go bust).
This sparked memories of lesswrong discussion of Eurisko... which leads to some choice sneerable classic lines.
In a sequence classic, Eliezer discusses Eurisko. Having read an essay explaining Eurisko more clearly, a lot of Eliezer's discussion seems a lot emptier now.
> To the best of my inexhaustive knowledge, EURISKO may still be the most sophisticated self-improving AI ever built - in the 1980s, by Douglas Lenat before he started wasting his life on Cyc. EURISKO was applied in domains ranging from the Traveller war game (EURISKO became champion without having ever before fought a human) to VLSI circuit design.
This line is classic Eliezer dunning-kruger arrogance. The lesson from Cyc were used in useful expert systems and effort building the expert systems was used to continue to advance Cyc, so I would call Doug really successful actually, much more successful than many AGI efforts (including Eliezer's). And it didn't depend on endless VC funding or hype cycles.
> EURISKO used "heuristics" to, for example, design potential space fleets. It also had heuristics for suggesting new heuristics, and metaheuristics could apply to any heuristic, including metaheuristics. E.g. EURISKO started with the heuristic "investigate extreme cases" but moved on to "investigate cases close to extremes". The heuristics were written in RLL, which stands for Representation Language Language. According to Lenat, it was figuring out how to represent the heuristics in such fashion that they could usefully modify themselves without always just breaking, that consumed most of the conceptual effort in creating EURISKO.
...
> EURISKO lacked what I called "insight" - that is, the type of abstract knowledge that lets humans fly through the search space. And so its recursive access to its own heuristics proved to be for nought. > Unless, y'know, you're counting becoming world champion at Traveller without ever previously playing a human, as some sort of accomplishment.
Eliezer simultaneously mocks Doug's big achievements but exaggerates this one. The detailed essay I linked at the beginning actually explains this properly. Traveller's rules inadvertently encouraged a narrow degenerate (in the mathematical sense) strategy. The second place person actually found the same broken strategy Doug (using Eurisko) did, Doug just did it slightly better because he had gamed it out more and included a few ship designs that countered the opponent doing the same broken strategy. It was a nice feat of a human leveraging a computer to mathematically explore a game, it wasn't an AI independently exploring a game.
Another lesswronger brings up Eurisko here. Eliezer is of course worried:
> This is a road that does not lead to Friendly AI, only to AGI. I doubt this has anything to do with Lenat's motives - but I'm glad the source code isn't published and I don't think you'd be doing a service to the human species by trying to reimplement it.
And yes, Eliezer actually is worried a 1970s dead end in AI might lead to FOOM and AGI doom. To a comment here: > Are you really afraid that AI is so easy that it's a very short distance between "ooh, cool" and "oh, shit"?
Eliezer responds:
> Depends how cool. I don't know the space of self-modifying programs very well. Anything cooler than anything that's been tried before, even marginally cooler, has a noticeable subjective probability of going to shit. I mean, if you kept on making it marginally cooler and cooler, it'd go to "oh, shit" one day after a sequence of "ooh, cools" and I don't know how long that sequence is.
Fearmongering back in 2008 even before he had given up and gone full doomer.
And this reminds me, Eliezer did not actually predict which paths lead to better AI. In 2008 he was pretty convinced Neural Networks were not a path to AGI.
> Not to mention that neural networks have also been "failing" (i.e., not yet succeeding) to produce real AI for 30 years now. I don't think this particular raw fact licenses any conclusions in particular. But at least don't tell me it's still the new revolutionary idea in AI.
Apparently it took all the way until AlphaGo (sometime 2015 to 2017) for Eliezer to start to realize he was wrong. (He never made a major post about changing his mind, I had to reconstruct this process and estimate this date from other lesswronger's discussing it and noticing small comments from him here and there.) Of course, even as late as 2017, MIRI was still neglecting neural networks to focus on abstract frameworks like "Highly Reliable Agent Design".
So yeah. Puts things into context, doesn't it.
Bonus: One of Doug's last papers, which lists out a lot of lessons LLMs could take from cyc and expert systems. You might recognize the co-author, Gary Marcus, from one of the LLM critical blogs: https://garymarcus.substack.com/
So, lesswrong Yudkowskian orthodoxy is that any AGI without "alignment" will bootstrap to omnipotence, destroy all mankind, blah, blah, etc. However, there has been the large splinter heresy of accelerationists that want AGI as soon as possible and aren't worried about this at all (we still make fun of them because what they want would result in some cyberpunk dystopian shit in the process of trying to reach it). However, even the accelerationist don't want Chinese AGI, because insert standard sinophobic rhetoric about how they hate freedom and democracy or have world conquering ambitions or they simply lack the creativity, technical ability, or background knowledge (i.e. lesswrong screeds on alignment) to create an aligned AGI.
This is a long running trend in lesswrong writing I've recently noticed while hate-binging and catching up on the sneering I've missed (I had paid less attention to lesswrong over the past year up until Trump started making techno-fascist moves), so I've selected some illustrative posts and quotes for your sneering.
- Good news, China actually has no chance at competing at AI (this was posted before deepseek was released). Well. they are technically right that China doesn't have the resources to compete in scaling LLMs to AGI because it isn't possible in the first place > China has neither the resources nor any interest in competing with the US in developing artificial general intelligence (AGI) primarily via scaling Large Language Models (LLMs).
- The Situational Awareness Essays make sure to get their Yellow Peril fearmongering on! Because clearly China is the threat to freedom and the authoritarian power (pay no attention to the techbro techno-fascist) > In the race to AGI, the free world’s very survival will be at stake. Can we maintain our preeminence over the authoritarian powers?
- More crap from the same author
- There are some posts pushing back on having an AGI race with China, but not because they are correcting the sinophobia or the delusions LLMs are a path to AGI, but because it will potentially lead to an unaligned or improperly aligned AGI
- And of course, AI 2027 features a race with China that either the US can win with a AGI slowdown (and an evil AGI puppeting China) or both lose to the AGI menance. Featuring "legions of CCP spies" > Given the “dangers” of the new model, OpenBrain “responsibly” elects not to release it publicly yet (in fact, they want to focus on internal AI R&D). Knowledge of Agent-2’s full capabilities is limited to an elite silo containing the immediate team, OpenBrain leadership and security, a few dozen US government officials, and the legions of CCP spies who have infiltrated OpenBrain for years.
- Someone asks the question directly Why Should I Assume CCP AGI is Worse Than USG AGI?. Judging by upvoted comments, lesswrong orthodoxy of all AGI leads to doom is the most common opinion, and a few comments even point out the hypocrisy of promoting fear of Chinese AGI while saying the US should race for AGI to achieve global dominance, but there are still plenty of Red Scare/Yellow Peril comments > Systemic opacity, state-driven censorship, and state control of the media means AGI development under direct or indirect CCP control would probably be less transparent than in the US, and the world may be less likely to learn about warning shots, wrongheaded decisions, reckless behaviour, etc. True, there was the Manhattan Project, but that was quite long ago; recent examples like the CCP's suppression of information related to the origins of COVID feel more salient and relevant.
We should aim for better elites

I am still subscribed to slatestarcodex on reddit, and this piece of garbage popped up on my feed. I didn't actually read the whole thing, but basically the author correctly realizes Trump is ruining everything in the process of getting at "DEI" and "wokism", but instead of accepting the blame that rightfully falls on Scott Alexander and the author, deflects and blames the "left" elitists. (I put left in quote marks because the author apparently thinks establishment democrats are actually leftist, I fucking wish).
An illustrative quote (of Scott's that the author agrees with)
> We wanted to be able to hold a job without reciting DEI shibboleths or filling in multiple-choice exams about how white people cause earthquakes. Instead we got a thousand scientific studies cancelled because they used the string “trans-” in a sentence on transmembrane proteins.
I don't really follow their subsequent points, they fail to clarify what they mean... In sofar as "left elites" actually refers to centrist democrats, I actually think the establishment Democrats do have a major piece of blame in that their status quo neoliberalism has been rejected by the public but the Democrat establishment refuse to consider genuinely leftist ideas, but that isn't the point this author is actually going for... the author is actually upset about Democrats "virtue signaling" and "canceling" and DEI, so they don't actually have a valid point, if anything the opposite of one.
In case my angry disjointed summary leaves you any doubt the author is a piece of shit:
> it feels like Scott has been reading a lot of Richard Hanania, whom I agree with on a lot of points
For reference the ssc discussion: https://www.reddit.com/r/slatestarcodex/comments/1jyjc9z/the_edgelords_were_right_a_response_to_scott/
tldr; author trying to blameshift on Trump fucking everything up while keeping up the exact anti-progressive rhetoric that helped propel Trump to victory.
My experience at the recently controversial conference/festival on prediction markets …

So despite the nitpicking they did of the Guardian Article, it seems blatantly clear now that Manifest 2024 was infested by racists. The post article doesn't even count Scott Alexander as "racist" (although they do at least note his HBD sympathies) and identify a count of full 8 racists. They mention a talk discussing the Holocaust as a Eugenics event (and added an edit apologizing for their simplistic framing). The post author is painfully careful and apologetic to distinguish what they personally experienced, what was "inaccurate" about the Guardian article, how they are using terminology, etc. Despite the author's caution, the comments are full of the classic SSC strategy of trying to reframe the issue (complaining the post uses the word controversial in the title, complaining about the usage of the term racist, complaining about the threat to their freeze peach and open discourse of ideas by banning racists, etc.).
The virtue of tsuyoku naritai, "I want to become stronger", is to always keep improving—to do better than your previous failures, not just humbly con…

This is a classic sequence post: (mis)appropriated Japanese phrases and cultural concepts, references to the AI box experiment, and links to other sequence posts. It is also especially ironic given Eliezer's recent switch to doomerism with his new phrases of "shut it all down" and "AI alignment is too hard" and "we're all going to die".
Indeed, with developments in NN interpretability and a use case of making LLM not racist or otherwise horrible, it seems to me like their is finally actually tractable work to be done (that is at least vaguely related to AI alignment)... which is probably why Eliezer is declaring defeat and switching to the podcast circuit.