• brucethemoose@lemmy.world
    link
    fedilink
    English
    arrow-up
    13
    ·
    edit-2
    2 days ago

    I had suspicious before, but I knew they were screwed when Qwen 2.5 came out. 32Bs and 72Bs nipping at their heels… O3 was a joke in comparison.

    And they probably aren’t fudging anything. Base Deepseek isn’t like crazy or anything, and the way they finetuned it to R1 is public. Researchers are trying to replicate it now.