- cross-posted to:
- technology@lemmit.online
The milestone highlights how DeepSeek has left a deep impression on Silicon Valley, upending widely held views about U.S. primacy in AI and the effectiveness of Washington’s export controls targeting China’s advanced chip and AI capabilities.
To be fair, I’m pretty sure that’s what everyone is doing. If you’re not measuring against something, there’s no way to tell if you’re doing anything at all.
My point was that a Mixture of Experts (MoE) model could suffer from poor generalization. Although, having read more, I’m not sure whether it’s the newer R model that has the MoE element.