DeepSeek roundup: banned by governments, no guard rails, lied about its training costs
ebu (@ebu@awful.systems) | Posts: 0 | Comments: 140 | Joined: 2 yr. ago
i can admit it's possible i'm being overly cynical here and it is just sloppy journalism on Raffaele Huang/his editor/the WSJ's part. but i still think that it's a little suspect, on the grounds that we have no idea how many times they had to restart training due to the model borking, or what other experiments and hidden costs there were, even before things like the necessary capex (which goes unmentioned in the original paper -- though they note using a 2048-GPU cluster of H800s, which would set them back around $40m). i'm thinking in the mode of "the whitepaper exists to serve the company's bottom line"
btw announcing my new V7 model that i trained for the $0.26 i found on the street just to watch the stock markets burn