About as open source as a binary blob without the training data
Office space meme:
"If y'all could stop calling an LLM 'open source' just because they published the weights... that would be great."
It's not just the weights though is it? You can download the training data they used, and run your own instance of the model completely separate from their servers.
You don't download the training data when running an LLM locally. You are downloading the already baked model.
Did "they" publish the training data? And the hyperparameters?
I mean, I downloaded it from the repo.
You downloaded the weights. That's something different.
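The distinction can be sketched with a toy model (pure Python, made-up numbers): training consumes the data and produces weights, and inference afterwards needs only the weights. This is why a model repo can ship a usable model without shipping the corpus it was trained on.

```python
# Toy linear model y = w * x, "trained" by least squares on made-up data.
# The point: after training, only the weight survives; the data can be discarded.

training_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # stand-in for a huge corpus

# "Training": fit w minimizing squared error (closed form for this toy case)
num = sum(x * y for x, y in training_data)
den = sum(x * x for x, _ in training_data)
w = num / den  # the "weights" -- this is what model repos publish

del training_data  # the training data is not part of the released artifact

# "Inference": running the model locally needs only w
def model(x):
    return w * x

print(model(5.0))  # 10.0
```

Downloading the repo gives you `w`, not `training_data`; reproducing or auditing the training run is impossible with the weights alone.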
I may misunderstand, but are the weights typically several hundred gigabytes large?
Yes. The training data is probably a few hundred petabytes.
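Back-of-envelope, with illustrative numbers (a hypothetical 405-billion-parameter model at 2 bytes per weight; token and byte counts are rough assumptions, and the raw unfiltered web crawl behind such a corpus is larger still, at petabyte scale):

```python
# Rough size comparison: released weights vs. training text (illustrative numbers)
params = 405e9          # parameters in a large open-weights model (assumed)
bytes_per_param = 2     # fp16/bf16 storage
weights_gb = params * bytes_per_param / 1e9
print(f"weights: ~{weights_gb:.0f} GB")        # hundreds of gigabytes

tokens = 15e12          # tokens seen during training (assumed)
bytes_per_token = 4     # rough average for raw text
corpus_tb = tokens * bytes_per_token / 1e12
print(f"training text: ~{corpus_tb:.0f} TB")   # tens of terabytes, before counting the raw crawl
```

So even a "huge" weights download is orders of magnitude smaller than the data it was distilled from.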
Oh wow that's fuckin huge
Yeah, some models are trained on pretty much the entire content of the publicly accessible Internet.