Can we discuss how it’s possible that the paid model (gpt4) got worse and the free one (gpt3.5) got better? Is it because the free one is being trained on a larger pool of users or what?
It’s because the research in question used a really small and unrepresentative dataset. I want to see these findings reproduced on a proper task collection.
True, checking whether a number is prime is a very narrow task for ChatGPT, but the finding is in line with other reports of progressive dumbing down.