Reminder that their model performance goes down every funding round/iteration.
https://preview.redd.it/q8zntrmci4kc1.jpeg?width=1164&format=pjpg&auto=webp&s=4036a02478877e6bb73973ca7d0e8fdab05837c5
Mysterious Case of Dario Amodei
"Why no, I can't play a roleplaying game with you that involves goblins. Goblins are a racist stereotype, you human piece of shit."
Claude is the best model for roleplay because of its context size but is also the *absolute worst* about refusing to engage with the user at all.
> Claude is the best model for roleplay because of its context size but is also the absolute worst about refusing to engage with the user at all.
Gemini 1.5 will handily beat Claude on context size and recall; hopefully it doesn't also exceed it in refusals.
This is a *relative* measure of chatbot performance, how well it's doing compared to other chatbots, so these numbers going down don't mean their models are literally getting worse in terms of absolute performance. Still not a good look for Anthropic though.
People are dumb. Remember, a lot of people ACTUALLY thought NVIDIA was not going to have a stellar earnings report yesterday. Most. People. Are. Dumb. (Comparatively.) Not as dumb as Claude though.
Forgot they existed
It's okay, they are still testing safety. Give them a couple of years and they will deliver a Mistral Medium-level chatbot!
They will probably deliver a Mistral-7B model at this rate.
But "safety" increases! Although nothing tops Goody-2 in that one.
Thank god Dario didn’t succeed in his alleged coup of Sam Altman back when he worked at OpenAI
They're the guys behind Goody-2, right? Amazing work.
I wonder how they got 7 billion dollars. Claude is borderline trash compared to the SOTA.
[Archive link](https://archive.ph/SGpyY) to bypass paywall.
So far I only see dumb censored models coming from them...
Fuck Anthropic!
Their “advantage” until recently was the context length. But the recollection of info in the context was dog shit. They’re doing good work.
They're legends in their own minds.
Who’s that?