Most if not all leading models use synthetic data extensively to do exactly this. However, the synthetic data needs to be well defined and essentially programmed by the data scientists. If you don't define the data very carefully, ideally as math or programs you can verify as correct automatically, it's worse than useless. The scope is usually very narrow, no Hitchhiker's Guide to the Galaxy rewrite.
But in any case he's probably just parroting whatever his engineers pitched him to look smart and in charge.
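To make the "verify as correct automatically" part concrete, here's a minimal sketch, not anyone's actual pipeline. The `noisy_candidate` function is a made-up stand-in for a model proposing answers, and all the names are hypothetical. The point is only that each sample carries enough structure to be checked by a program, so wrong samples get dropped instead of poisoning the set.

```python
import operator
import random

# Operations whose results can be recomputed exactly by a checker.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def make_problem(rng):
    """Generate one arithmetic problem in a structured, checkable form."""
    a, b = rng.randint(0, 99), rng.randint(0, 99)
    sym = rng.choice(sorted(OPS))
    return {"prompt": f"What is {a} {sym} {b}?", "a": a, "b": b, "op": sym}


def noisy_candidate(problem, rng, error_rate=0.3):
    """Hypothetical stand-in for a model: sometimes answers off by one."""
    truth = OPS[problem["op"]](problem["a"], problem["b"])
    return truth + 1 if rng.random() < error_rate else truth


def verify(problem, candidate):
    """Automatic check: recompute the answer and compare."""
    return candidate == OPS[problem["op"]](problem["a"], problem["b"])


def build_dataset(n, seed=0):
    """Keep only the samples whose candidate answer passes verification."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n):
        problem = make_problem(rng)
        candidate = noisy_candidate(problem, rng)
        if verify(problem, candidate):
            kept.append({**problem, "answer": candidate})
    return kept
```

Something like `build_dataset(200)` returns only the verified samples. The narrow scope is the price: this works because arithmetic has a cheap exact checker, which a novel rewrite does not.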
I had some similar and obscure corruption issues that wound up being a symptom of failing ram in a main server node. After that, only issues have been conflicts. So I'd suggest checking hardware health in addition to the ideas about backups vs sync.
I've used it extensively, almost $100 in credits, and generally it could one shot everything I threw at it. However: I gave it architectural instructions and told it to use test driven development and what test suite to use. Without the tests yeah it wouldn't work, and a decent amount of the time is cleaning up mistakes the tests caught. The same can be said for humans, though.
Some details. One of the major players doing the tar pit strategy is Cloudflare. They're a giant in networking and infrastructure, and they use AI (more traditional, not LLMs) ubiquitously to detect bots. So it is an arms race, but one where both sides have massive incentives.
Making nonsense is indeed detectable, but that misunderstands the purpose: economics. Scraping bots are used because they're a cheap way to get training data. If you make a nonzero portion of training data poisonous, they'd have to spend increasingly many resources to filter it out. The better the nonsense, the harder it is to detect. Cloudflare is known to use small LLMs to generate the nonsense, hence requiring systems at least that complex to differentiate it.
So in short the tar pit with garbage data actually decreases the average value of scraped data for bots that ignore do-not-scrape instructions.
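A back-of-the-envelope way to see the economics, as a sketch only. The cost model and every number here are assumptions for illustration, not measurements: each clean page is worth `value_clean`, each poisoned page that slips through costs `harm_poison`, and running a filter costs `filter_cost` per page while catching a `filter_recall` share of the poison.

```python
def net_value_per_page(poison_fraction, value_clean=1.0, harm_poison=1.0,
                       filter_cost=0.0, filter_recall=0.0):
    """Average value of one scraped page under a toy cost model.

    All parameters are hypothetical units, not real measurements.
    """
    if not 0.0 <= poison_fraction <= 1.0:
        raise ValueError("poison_fraction must be between 0 and 1")
    if not 0.0 <= filter_recall <= 1.0:
        raise ValueError("filter_recall must be between 0 and 1")
    clean_value = (1.0 - poison_fraction) * value_clean
    # Poison that gets past the filter does damage to the training set.
    leaked_harm = poison_fraction * (1.0 - filter_recall) * harm_poison
    return clean_value - leaked_harm - filter_cost
```

Under this model the scraper loses either way: skip the filter and the leaked poison eats the value, or pay for a filter good enough to catch LLM-written nonsense and the per-page cost goes up. Neither option gets back to the value of a clean scrape.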
Out of curiosity, found 3 versions of the policy manual edits. As of Jan 19, it prohibited operations based on gender, gender identity, or sexual orientation. Sometime around February it removed those terms, adding "sex." Around March, and current as of 2025-04-06, it readded sexual orientation, presumably after this hitting the news, making the allowance of basing operations on gender identity more pointed.
Sources, page 79:
https://web.archive.org/web/20250119072246/https://www.dhs.gov/sites/default/files/2025-01/Office of Intelligence and Analysis Policy Manual.pdf
https://web.archive.org/web/20250222000624/https://www.dhs.gov/sites/default/files/2025-02/Office of Intelligence and Analysis Policy Manual-508.pdf
https://web.archive.org/web/20250323044321/https://www.dhs.gov/sites/default/files/2025-03/25_0313_ia_office-of-intelligence-and-analysis_policy-manual.pdf
Was about to post a Hugging Face link til I finished reading. For what it's worth, once you have Ollama installed it's a single command to download, install, and immediately drop into a chat with a model, either from Ollama's library or Hugging Face, or anyone else. On Arch the entire process to get it working with GPU acceleration was installing 2 packages and then starting ollama.
Orthokeratology lenses reshape your cornea overnight. Been using them for years, heartily recommend.
Important context: Sweden and especially Finland have long had a defense model based around literally everyone contributing to defending against an occupation. The real change is they don't consider that enough of a deterrent anymore, hence joining NATO, after seeing Russia bloody itself against Ukraine for several years.
Key detail: they're not dropping it because they're giving up, the judge dismissed it without prejudice, which means that in 4 years they can pick the case back up. Under a Trump DoJ the case would likely have ended with prejudice, closing it permanently.
In an interview recently he openly speculated about how long he'd be in prison if Kamala wins. It seems like he has a strong savior complex, and thinks he's the only one that can save humanity by establishing colonies on Mars. He phrases it as preserving "the light of consciousness." Can't reasonably do that from prison. With that perspective, for him, practically all means justify that end.
At a more personal level, after one of his kids transitioned he publicly stated it was like that kid had "died." In his own words, he swore to kill the "woke mind virus."
Stories like this are sometimes more complicated than they appear. The infamous examples of $500 hammers, for example, were anti sparking hammers for working around flammables or munitions, hence requiring special materials, certification, and low production runs.
For this case, we have liquid hand soap dispensed by a pump. Pumps require a sealed vessel. Unlike commercial planes, military planes are required to anticipate prolonged operation with an unpressurized cabin. At the max altitude of a C17, atmospheric pressure is only roughly 15 to 20% of sea level. Off the shelf dispensers are unlikely to be designed to withstand that pressure difference, let alone function normally. In a high demand environment like aerospace, even apparently minor failures like an exploding soap container need to be taken seriously due to the possibility of unexpected cascading failures. Why not use bar soap, then? Unfortunately this too has complications, like not being able to be securely mounted, liquid soaps having superior hygiene and cross contamination characteristics, and the necessity for military standardized soap, sometimes designed to remove heavy metals, e.g. lead, exposure to which is likely if the cargo is munitions.
This unusual set of requirements is unlikely to be seen outside the military context, so whether designed by Boeing or bought off the shelf, the unit would likely have low quantity manufacturing runs, significantly increasing per unit costs. Combine that with the necessary certifications and the per unit costs balloon even further.
While a soap dispenser having an 80x markup seems absurd, it might be more reasonable than it seems at first glance. To be clear, there absolutely is military contractor graft. I just don't expect even a $10,000 soap dispenser would be a substantial proportion of it, even within the C17.
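For anyone who wants to sanity check the pressure figure, here's a rough sketch using the International Standard Atmosphere model. Assumptions on my part: a service ceiling of 45,000 ft for the C17, standard-day conditions, and the usual two-layer ISA formulas (valid up to about 20 km). The function names are just for illustration.

```python
import math

# ISA constants (standard-day values).
P0 = 101325.0        # sea level pressure, Pa
T0 = 288.15          # sea level temperature, K
LAPSE = 0.0065       # temperature lapse rate in the troposphere, K/m
G = 9.80665          # gravitational acceleration, m/s^2
M = 0.0289644        # molar mass of dry air, kg/mol
R = 8.3144598        # universal gas constant, J/(mol*K)
TROPOPAUSE_M = 11000.0


def isa_pressure_pa(alt_m):
    """Standard-atmosphere pressure in Pa, for altitudes up to about 20 km."""
    exponent = G * M / (R * LAPSE)
    if alt_m <= TROPOPAUSE_M:
        return P0 * (1.0 - LAPSE * alt_m / T0) ** exponent
    # Above the tropopause the model treats temperature as constant.
    p_tropopause = P0 * (1.0 - LAPSE * TROPOPAUSE_M / T0) ** exponent
    t_tropopause = T0 - LAPSE * TROPOPAUSE_M
    return p_tropopause * math.exp(
        -G * M * (alt_m - TROPOPAUSE_M) / (R * t_tropopause)
    )


def fraction_of_sea_level(alt_ft):
    """Pressure at the given altitude (feet) as a fraction of sea level."""
    return isa_pressure_pa(alt_ft * 0.3048) / P0
```

Calling `fraction_of_sea_level(45000)` comes out at roughly 15%, in the same ballpark as the figure above. Either way, a vessel sealed at sea level would be holding back most of an atmosphere of pressure difference in an unpressurized cabin.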
Simply become older. The older I get the less I want to do hours long sessions. But practically, I have a hard bedtime alarm on my phone, which works cause sleep deprivation and work suck.
I haven't gone through all their work, but some of the delisted maintainers were working on driver support for Baikal, a Russia based electronics company. Their work includes semiconductors, ARM processors. Given the sanctions against Russia, especially for dual use stuff like domestic semiconductors, I would expect that Linus and other maintainers were told or concluded that by signing off and merging their code they'd be personally violating sanctions.
I recently removed in-editor AI cause I noticed I was acquiring muscle memory for my brain: not thinking through the rest past the start of a snippet that would get an LLM to auto complete. I'm still using LLMs, particularly for languages and libraries I'm not familiar with, but through the artifact editors in ChatGPT and Claude.
I really don't blame them, security and privacy minded folk are more likely to use niche configs. Feels like for Linux stuff companies may be better served making APIs and letting the community handle it. Rclone for example implements a bunch, and last I knew had an unstable Proton plugin.
The comments from that article are some of the most vitriolic I've ever seen on a technical issue. Goes to prove the maintainer's point though.
Some are good for a laugh though, like assertions that Rust in the kernel is a Microsoft sabotage op or LLVM is for grifters and thieves.
FOSS in general needs better means of financial support. While the software is free and libre, developer time is not, and ultimately they gotta eat and pay bills. I hope they get positive results and don't catch much unnecessary flak.
Given the ease of implementation of end to end encryption now, it's a reasonable assumption that anything not e2ee is being data mined. E2ee has extensive security benefits, for example even if your data is dumped the info is still useless. So, there has to be a compelling reason to not use it.
People haven't really changed. As always, power corrupts. When the rewards are great enough, it seems people are often enough willing to compromise their integrity.
My first programming experience, an online class, was in a Linux VM. Linux made programming easy and delightful, Windows always made it a huge pain. As time went on, more of what I did was easier on Linux, and now everything is.