Need to let loose a primal scream without collecting footnotes first? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid: Welcome to the Stubsack, your first port of call for learning fresh Awful you’ll near-instantly regret.
Any awful.systems sub may be subsneered in this subthread, techtakes or no.
If your sneer seems higher quality than you thought, feel free to cut’n’paste it into its own post — there’s no quota for posting and the bar really isn’t that high.
The post-Xitter web has spawned so many “esoteric” right-wing freaks, but there’s no appropriate sneer-space for them. I’m talking redscare-ish, reality-challenged “culture critics” who write about everything but understand nothing. I’m talking about reply-guys who make the same 6 tweets about the same 3 subjects. They’re inescapable at this point, yet I don’t see them mocked (as much as they should be).
Like, there was one dude a while back who insisted that women couldn’t be surgeons because they didn’t believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I can’t escape them, I would love to sneer at them.
(Semi-obligatory thanks to @dgerard for starting this.)
Reposting this for the new week's thread, since it truly is a record of how untrustworthy sammy and co are. Remember how OAI claimed that o3 had displayed superhuman levels on the mega-hard Frontier Math exam written by Fields Medalists? Funny/totally-not-fishy story, haha. Turns out OAI had exclusive access to that test for months, funded its creation, and refused to let the creators of the test publicly acknowledge this until after OAI did their big stupid magic trick.
From Subbarao Kambhampati via LinkedIn:
"𝐎𝐧 𝐭𝐡𝐞 𝐬𝐞𝐞𝐝𝐲 𝐨𝐩𝐭𝐢𝐜𝐬 𝐨𝐟 “𝑩𝒖𝒊𝒍𝒅𝒊𝒏𝒈 𝒂𝒏 𝑨𝑮𝑰 𝑴𝒐𝒂𝒕 𝒃𝒚 𝑪𝒐𝒓𝒓𝒂𝒍𝒍𝒊𝒏𝒈 𝑩𝒆𝒏𝒄𝒉𝒎𝒂𝒓𝒌 𝑪𝒓𝒆𝒂𝒕𝒐𝒓𝒔” #SundayHarangue. One of the big reasons for the increased volume of “𝐀𝐆𝐈 𝐓𝐨𝐦𝐨𝐫𝐫𝐨𝐰” hype has been o3’s performance on the “frontier math” benchmark–something that other models basically had no handle on.
We are now being told (https://lnkd.in/gUaGKuAE) that this benchmark data may have been exclusively available (https://lnkd.in/g5E3tcse) to OpenAI since before o1–and that the benchmark creators were not allowed to disclose this *until after o3*.
That o3 does well on frontier math held-out set is impressive, no doubt, but the mental picture of “𝒐1/𝒐3 𝒘𝒆𝒓𝒆 𝒋𝒖𝒔𝒕 𝒃𝒆𝒊𝒏𝒈 𝒕𝒓𝒂𝒊𝒏𝒆𝒅 𝒐𝒏 𝒔𝒊𝒎𝒑𝒍𝒆 𝒎𝒂𝒕𝒉, 𝒂𝒏𝒅 𝒕𝒉𝒆𝒚 𝒃𝒐𝒐𝒕𝒔𝒕𝒓𝒂𝒑𝒑𝒆𝒅 𝒕𝒉𝒆𝒎𝒔𝒆𝒍𝒗𝒆𝒔 𝒕𝒐 𝒇𝒓𝒐𝒏𝒕𝒊𝒆𝒓 𝒎𝒂𝒕𝒉”–that the AGI tomorrow crowd seem to have–that 𝘖𝘱𝘦𝘯𝘈𝘐 𝘸𝘩𝘪𝘭𝘦 𝘯𝘰𝘵 𝘦𝘹𝘱𝘭𝘪𝘤𝘪𝘵𝘭𝘺 𝘤𝘭𝘢𝘪𝘮𝘪𝘯𝘨, 𝘤𝘦𝘳𝘵𝘢𝘪𝘯𝘭𝘺 𝘥𝘪𝘥𝘯’𝘵 𝘥𝘪𝘳𝘦𝘤𝘵𝘭𝘺 𝘤𝘰𝘯𝘵𝘳𝘢𝘥𝘪𝘤𝘵–is shattered by this. (I have, in fact, been grumbling to my students since o3 announcement that I don’t completely believe that OpenAI didn’t have access to the Olympiad/Frontier Math data before hand… )
We all know that data contamination is an issue with LLMs and LRMs. We also know that reasoning claims need more careful vetting than “𝘸𝘦 𝘥𝘪𝘥𝘯’𝘵 𝘴𝘦𝘦 𝘵𝘩𝘢𝘵 𝘴𝘱𝘦𝘤𝘪𝘧𝘪𝘤 𝘱𝘳𝘰𝘣𝘭𝘦𝘮 𝘪𝘯𝘴𝘵𝘢𝘯𝘤𝘦 𝘥𝘶𝘳𝘪𝘯𝘨 𝘵𝘳𝘢𝘪𝘯𝘪𝘯𝘨” (see “In vs. Out of Distribution analyses are not that useful for understanding LLM reasoning capabilities” https://lnkd.in/gZ2wBM_F ).
At the very least, this episode further argues for increased vigilance/skepticism on the part of the AI research community in how they parse the benchmark claims put out by commercial entities."
Every time they go 'this wasn't in the data' it turns out it was. A while back they did the same with translating rareish languages. Turns out it was trained on it. Fucked up. But also, wtf, how are they expecting this to stay secret and there being no backlash? This world needs a better class of criminals.
The conspiracy theorist who lives in my brain wants to say it's intentional, to make us more open to blatant cheating as something that's just a "cost of doing business." (I swear I saw this phrase a half dozen times in the orange site thread about this.)
The earnest part of me tells me no, these guys are just clowns, but I dunno, they can't all be this dumb, right?
Maybe this is common knowledge, but I had no idea before. What an absolutely horrible decision from google to allow this. What are they thinking?? This is great for phishing and malware, but I don't know what else. (Yeah ok, the reason probably has something to do with "line must go up".)
I recall seeing something of this sort happening on goog for about 12-18 months now: every so often a researcher post does the rounds where someone finds Yet Another way goog is fucking it up.
the advertising dept has completely captured all mindshare and it is (demonstrably) the only part that goog-the-business cares about
Hmm, surely there is no downside to doing all of one's marketing, both personal* and professional, through the false certainty and low signal of short-form social media. The leopard has only licked Sam's face, it will never bite and begin chewing!
*You and I may find the concept of a "personal brand" to be horrifying, but these guys clearly want to become brands more fervently than Bruce Wayne wanted to become a bat