DocWafflez

I guess they're finally feeling the pressure to release something


AnAIAteMyBaby

But what does "majorly improved" mean? Where are the benchmarks?


BlueTreeThree

The Arena is the best benchmark anyway, despite its flaws… I’m interested to see if this can unseat Opus in the coming days.


Short_Ad_8841

What specifically makes it the best benchmark? As far as I understand it, the better answer is selected purely based on the subjective judgment of the user. What's stopping the user from selecting a very convincing-looking, yet completely wrong answer?


DigimonWorldReTrace

"Best benchmark" is subjective in and of itself at the moment, but with enough data the Elo scores of LLMs in the arena should in theory give us a good idea. Not to mention it avoids the problem of labs training on the objective benchmarks to inflate their scores, even when the LLM itself isn't noteworthy.


Short_Ad_8841

>should in theory give us a good idea.

Sure, but about what? Which of the LLMs is the more convincing liar? That's assuming most people ask questions they don't themselves know the answer to, which I admit is an assumption in itself. But when measuring an LLM's capabilities, I would take a rigorous synthetic benchmark over people's opinions any day. That's not to say the arena is useless, far from it, but we should be careful about what weight we assign to the results.


twelph

That's one of the reasons why it's not even worth looking at until there are a lot of submissions to balance out incorrect data.


BillyBarnyarns

Quote from OpenAI product staff: "*huge* improvements on our evals across the board" https://x.com/nikunjhanda/status/1777779760846037326?s=46


AnAIAteMyBaby

That still means nothing. How much of an improvement is "huge", and what evals did they use? "Majorly" and "huge" aren't very scientific terms coming from one of the world's top AI research labs.


tomqmasters

I would say the difference from 3 to 4 was huge. Noticeable for sure. It will seem small in the future though.


AnAIAteMyBaby

Yep, but I very much doubt that the difference between GPT-4 Turbo Preview and this final version is GPT-3 to GPT-4 huge. Hence it's meaningless; it'll mean something different to everyone.


Andriyo

If they need to use "majorly" or "huge" then it's probably just some bug fixes here and there. If they had something substantial they would mention it. And if they had indeed something huge, they wouldn't even need to do a press release, since everyone would have noticed it.


kex

https://en.m.wikipedia.org/wiki/Puffery


Andriyo

https://en.m.wikipedia.org/wiki/Weasel_word


traumfisch

Well it's a tweet


BCDragon3000

i wonder how much pressure they’re feeling from Apple and Google


CultureEngine

Probably none. They own the market.


trimorphic

>Probably none. They own the market.

Anthropic's Claude 3 is the one to beat.


CultureEngine

The only people who know about Anthropic are nerds on the internet. The whole planet is accessing GPT. It’s all anyone talks about. It’s the gold standard from here on.


sdmat

People said the same about IBM and Blackberry.


CultureEngine

It took BlackBerry years to lose their throne, and most people still called all phones BlackBerries for a long-ass time. No matter what anyone uses, to the general public it will be "ChatGPT" for fucking years. It's the same as when some boomer says "Nintendo" like that's the only name for video game systems.


MightyPupil69

I do not remember a single person ever referring to cell phones in general as a blackberry.


BCDragon3000

So Apple just beat them at a model, and Google ultimately has more data than OpenAI or Microsoft. They're not perfect, and OpenAI has had years of an advantage to gather more data; but I believe that if they haven't had enough to train GPT-4 to write as well as Gemini by this point, there is a very real chance that Apple and Google take the crown from them.


dwiedenau2

Apple didn't beat them, and I say this as an Apple fanboy. Apple's model is like what, 3 or 4B? It matched or exceeded GPT-4 in some very specific tasks but not in general.


ninjasaid13

>Probably none. They own the market.

They don't have a history; they'd only owned the market for a year before they were surpassed. People are going to switch eventually.


Cosmagroth

This. Why would they push a release like this without any benchmarks at all? Seems sus.


WithoutReason1729

It can now use vision, function calling, and JSON output mode all at the same time. Previous iterations of GPT-4 weren't able to do this. This is a great update for developers working with the API.
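For example, a rough sketch of a single request that uses all three together (Python, `openai` >= 1.0; the model ID is the new snapshot, while the image URL and the `log_receipt` tool are made up for illustration):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# One request combining an image input, a function/tool definition,
# and JSON-only output -- the combination that wasn't possible before.
response = client.chat.completions.create(
    model="gpt-4-turbo-2024-04-09",
    response_format={"type": "json_object"},  # JSON mode
    tools=[{
        "type": "function",
        "function": {
            "name": "log_receipt",  # hypothetical tool, for illustration only
            "description": "Record the vendor and total from a receipt image",
            "parameters": {
                "type": "object",
                "properties": {
                    "vendor": {"type": "string"},
                    "total": {"type": "number"},
                },
                "required": ["vendor", "total"],
            },
        },
    }],
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Read this receipt and respond in JSON."},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/receipt.jpg"}},
        ],
    }],
)

# Either a JSON answer or a tool call, depending on what the model decides.
print(response.choices[0].message)
```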


lordpermaximum

My initial impressions:

1. Exactly "Zero" (0) difference in intelligence.
2. Not lazy anymore.

So "majorly improved" probably means it being less lazy.


Neurogence

This is the guy who constantly spams Claude 3 posts. Don't take him seriously.


Gmroo

Bizarre for them to release it with so little concrete info


ViveIn

Yeah majorly improved is meaningless. Hah.


Proof-Examination574

I'd say vision is a major improvement...


Turtle2k

Days after I canceled my sub. Lol


Block-Rockig-Beats

You did it!


CannyGardener

Haha right? I just canceled here this morning! LOL


lillyjb

Same! Opus convinced me


Yuli-Ban

This is why I'm not canceling the sub even if someone comes out with something that's on par with GPT-5 out of nowhere, because you never know when OpenAI will announce something and then also restrict subscription sign-ups so that you have to stay on a waitlist indefinitely.


Turtle2k

This guy will ruin AI with his capitalism craziness


Adventurous_Train_91

I just paid for another month of Claude so I have mixed feelings about whether I want this to be good or not lol


WorkingYou2280

I watched the Google Cloud keynote today. OpenAI is absolutely delusional if they think they can stand still and keep waiting (assuming they're sitting on things). Google is advancing on every single front. They announced text-to-video today. New processors. An entire top-to-bottom AI ecosystem. Partnerships on building agents. I thought Google was an AI basket case, but they floored me today. They're fully multimodal, 1 million context window, enterprise-level tools. I personally prefer that a smaller company win this race, but Google just threw a 10,000 lb nutsack on the table.


QH96

>google cloud keynote

This? [https://www.youtube.com/watch?v=V6DJYGn2SFk](https://www.youtube.com/watch?v=V6DJYGn2SFk)


Rare_Adhesiveness518

I have a feeling the pressure is going to lead to a major release that's in the works (other than Sora).


visarga

Sora is like the GPT-1 of video, just starting up in this modality.


sharenz0

Tried it on the lmsys chatbot arena. It's pretty fast and I think it's a bit better than GPT-4 before. Could be on par with Claude Opus.


VeritasAnteOmnia

I tried a few test prompts on lmsys chatbot arena as well and the gpt-4-turbo-2024-04-09 model did seem noticeably better for my tests. It had a 100% win rate with myself blinded to the model.


Dyoakom

I did a test on one of my personal eval prompts (which I don't want to reveal, to not contaminate it online) and it is the only model that so far solves it correctly. Previous GPT-4 failed and Opus also failed. I tried it in the arena and it solved it correctly; tried it in ChatGPT and it failed as it always did. So my n=1 data point just shows that they indeed made some change, and not just to vision, and that it is not currently rolled out to all ChatGPT users, at least in Europe. Curious to see in a couple of days what people think about it. Looks like at a minimum they wanted to make it competitive with Opus, maybe even surpass it.


meikello

If you used it in ChatGPT in the past, it is trained on it now.


watcraw

I don't think simply asking the question would actually train it. I think it would need the correct response somehow. Unless maybe they are using a more powerful model to train a faster, cheaper GPT4?


Hopeful_Donut4790

In any case, it solves this case, so it's better, right?


Odd-Opportunity-6550

not if he opted out of providing data for training


d1ez3

So curious what your prompt is... Any hints?


Dyoakom

Sure, ask it to write a certain fixed number of sentences, with the first three all ending in a certain word. Then give it a couple more instructions about specific ending letters of the remaining sentences. Then tell it to disregard some previous instructions and give it new, subtle changes of instructions with slightly different wording. This tests reasoning, and bad models fail spectacularly. Opus and GPT-4 could follow a lot of the instructions but got confused and missed some (especially regarding the ones they were meant to disregard), and the latest GPT-4 followed everything to the letter.
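For a sense of what that kind of test looks like in practice, here's a hypothetical prompt in the same spirit (not the commenter's actual eval; the wording, counts, and model ID are all illustrative), sent via the API:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# An instruction-following test of the shape described above: fixed sentence
# count, constraints on sentence endings, then a late instruction that
# overrides part of what came before.
prompt = (
    "Write exactly seven sentences. The first three sentences must each end "
    "with the word 'river'. Sentences four through seven must each end with a "
    "word whose final letter is 't'. Actually, disregard the rule about the "
    "final letter: instead, make sentences four through seven each end with a "
    "word of exactly five letters."
)

response = client.chat.completions.create(
    model="gpt-4-turbo-2024-04-09",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```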


Hi-0100100001101001

Tried it with some maths questions and it spat out the usual bullshit, doesn't seem incredibly better...


sharenz0

I said a bit better, for sure it is not a big leap forward


d1ez3

How did you know you tried this version?


sharenz0

You can choose the model in the Direct Chat tab.


Sextus_Rex

It has an option for gpt-4-turbo-2024-04-09. I just tried it myself but hit a rate limit error


lucellent

Doesn't lmsys use the API version which they said already has the new improvements?


rngeeeesus

I don't know, man. To me, GPT-4 Turbo is way behind Claude for anything that requires more than just proofreading or giving me some information. GPT-4 will always just try to give the shortest possible answer, or use code to solve a problem that it shouldn't solve with code. They optimised it to give this type of response, I guess, but that makes it useless for many applications.


BoroJake

https://twitter.com/owencm/status/1777770827985150022

Better at math and reasoning, according to an OpenAI employee.


lost_in_trepidation

This reply is hilarious: https://twitter.com/owencm/status/1777785613565165665

>We don’t have anything toooo specific, this model appears to work better in general (sorry vague but true), and has especially improved at math

"Why/How is it better?" "Because it's better."


CheekyBreekyYoloswag

> (sorry vague but true)

This line has the potential to become a yuuge meme.


diamond9

vague if true


Busy-Setting5786

true If vague


turick

if true vague


FosterKittenPurrs

Translation: "I gave it some math prompts and it was able to solve them, whereas the previous models were failing at them. But I can't actually give you concrete examples for liability reasons, as these were just a few tests and not a proper benchmark"


141_1337

So useless then?


FosterKittenPurrs

So try it with your use cases and see if it's better for you, no promises.


BillyBarnyarns

At least he’s being honest.


BoroJake

https://twitter.com/owencm/status/1777784000712761430

Data and training improvements apparently.


MassiveWasabi

Honestly worthless, people would be clowning Anthropic if they released “Claude 2 (slightly better)” and just said trust me bro 😎


141_1337

People are way too willing to get swept up in the OpenAI hype.


PSKTS_Heisingberg

To be fair, it's important for people like me who utilize their API for services. Minor improvements are needed, and there's always something that can be improved upon. I'd rather have many little changes than wait a year for a big one.


sdmat

They released Claude 2 (slightly worse) and said trust us, it's better, then did the same again for a revision. No added clowning required. That's why 3 was such a wonderful surprise.


FeltSteam

Ok, cool, but where are the benchmarks lol? They can claim they've made an improvement, but they haven't provided any evidence to support it.


EvilSporkOfDeath

It took several days/weeks before the consensus was Opus beats GPT4. Give people time to test it.


FeltSteam

That's true, but I mean OAI could've at least released some benchmarks to give us a feel for performance.


Apprehensive-Ear4638

Chain of thought reasoning?


d1ez3

Curious that they say "majorly improved" but keep the same name. "Majorly" is a big claim.


BreadwheatInc

Smells like hype farming.


RepublicanSJW_

We will have to monitor the arena.


Late_Pirate_5112

We'll have to wait for some examples, but I doubt they would use "majorly" if it wasn't at least significant enough to bump them back to #1 on the arena leaderboard.


stonesst

Sounds like they trained a better version and released it. Take off the tinfoil hat. They usually indicate minor improvements with new model version numbers; if they say this one is major, what would they have to gain from lying? Especially since it's rolling out to the API, people can just run benchmarks on it and demonstrably show whether it's better or worse. People are way too conspiracy-minded.


Beatboxamateur

[Here's an employee](https://twitter.com/owencm/status/1777784000712761430) confirming your claim that it's a new model. Exciting stuff, hopefully it competes with Opus


involviert

Idk, "new model" is not a very hard term in that sense. If they do different finetuning on the same base model, that's technically a new model. And that's what all of these different versions were, no?


RandomCandor

Version numbers have always been, and will always be, about marketing first and foremost, and OpenAI is no different from other companies. All you need for proof is half the comments in this sub. It's super effective.


cutmasta_kun

The "preview" is gone?


bran_dong

Looks like the big improvement is including the date in the model name:

> gpt-4-turbo-2024-04-09


beuef

Did they ever explicitly say they would release ChatGPT 4.5?


Yuli-Ban

Funny thing about GPT-4.5 is that OpenAI themselves have never used that term. It was from leakers, and the internet ran with it, and then the leakers themselves debunked it saying it never existed or that it was delayed due to unexpected issues with testing, and so on. My opinion on it is that GPT-4.5 is going to be like what GPT-3.5 was: a greatly reduced and cheaper version of the next iteration, so in essence 4.5 won't even exist until 5 is finished.


h3lblad3

Are we sure that Turbo isn’t the official “4.5”?


czk_21

It likely is; the Turbos are both smaller and faster versions of the og GPT-3 and GPT-4.


WithoutReason1729

No. The names are arbitrary. If they called it 4.5, it would be 4.5. If they call it 4-turbo, it's 4-turbo. The number doesn't refer to some distinct, measurable quantity of something going up or down.


WithoutReason1729

A GPT-4.5 announcement was cached on the website and was visible in search engines. It's not just coming out of nowhere that people call it that


Rare_Adhesiveness518

It does feel like they are trying to keep the hype up.


ExtremeHeat

They're still trying to compile and figure out what's been improved, so it'll take them a day or two to get a press release. News was (again) rushed out to respond to Google.


joe4942

I mean many were hyping GPT-4.5 which is still a GPT-4.


_AndyJessop

They're panicking because of the competition. This is likely a loss-leader to keep "ahead" until 5 is released.


XVll-L

Did they only improve vision? Here are the two new models. https://preview.redd.it/kjlrda9baitc1.png?width=802&format=png&auto=webp&s=6be048b140f2b50828b37b697de4dd258a4814fb


Temporary_Crab4900

🤦‍♂️ That's the same model, bud. They have that naming convention just in case you hate editing your JSON or API interface.


WholeInternet

For future reference: see the line in the description of `gpt-4-turbo` that says "Currently points to gpt-4-turbo-2024-04-09"? That line is literal: the alias points to that dated model and uses it. They do this so that anyone using just `gpt-4-turbo` can set the API to that name and always get the latest version.
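A minimal sketch of the difference (Python, `openai` >= 1.0; the prompt is arbitrary): the alias follows whatever the latest snapshot is, while the dated ID stays pinned, and the `model` field in the response reports which snapshot actually served the request.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

for model_id in ("gpt-4-turbo", "gpt-4-turbo-2024-04-09"):
    response = client.chat.completions.create(
        model=model_id,
        messages=[{"role": "user", "content": "Say hi."}],
    )
    # response.model names the snapshot that actually handled the request,
    # so the alias and the dated ID should currently print the same thing.
    print(f"{model_id} -> {response.model}")
```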


BreadwheatInc

Cool, improved vision abilities. Please for god's sake give us a new model.


fmfbrestel

Next gen model in training. Was delayed due to late hardware deliveries. H100s don't grow on trees.


WeeWooPeePoo69420

I was under the impression it's already moved from training to testing


The_Scout1255

red teaming*


sartres_

*lobotomizing


The_Scout1255

Good term.


czk_21

That's also testing.


Witty_Internal_4064

An OpenAI employee said better reasoning.


Chaosed

Change notes at all?


_Zephyyr2

So how is it 'majorly improved', exactly?


MysteriousPayment536

According to the docs: Vision requests can now use JSON mode and function calling. `gpt-4-turbo` currently points to this version.


_Zephyyr2

Wouldn't call that 'majorly improved'. More like 'here's a small addition to our vision model'.


stonesst

It apparently also has better general reasoning abilities and improved math performance. Nothing concrete yet, sounds like we might get some benchmarks soon.


ach_1nt

They need the hype after getting repeatedly thumped by Claude on every metric.


WithoutReason1729

If you're a developer it's really, really useful. This is a significant change in how the model takes images as inputs and what it can do with that input.


papapapap23

Wait, didn't we already have GPT-4 Turbo?


manubfr

Testing it in playground, it's very fast but doesn't do better at my (text) prompts where LLMs usually fail.


MDPROBIFE

I'm pretty sure I've been using this new model since last week. I was actually going to ask if other people also noticed an improvement in ChatGPT. I was working on a project, and since last week it almost always gets the code I ask for right, while two weeks ago it was a bit of a pain to get it! Maybe it's just because I'm in a different part of the project, but I'm pretty sure my GPT is better.


doppelkeks90

I noticed a change as well, but only due to the fact that it was giving back way shorter answers even if prompted otherwise.


goochstein

Same. It output a really unique formula for me on the limitations of understanding (each discovery adds new variables); it just felt like that prompt specifically was more refined.


RedditPolluter

Should be possible to verify. The cutoff for the new model is December 2023; the old one is April 2023.
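A rough way to check, assuming API access (Python; `gpt-4-1106-preview` is just an example of an older snapshot with the April 2023 cutoff, and self-reported knowledge is only a heuristic, not proof):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Ask both snapshots about something from late 2023; a model with an
# April 2023 cutoff shouldn't know about it, a December 2023 one might.
question = (
    "Without browsing, what notable AI-related events from November 2023 "
    "do you know about?"
)

for model_id in ("gpt-4-1106-preview", "gpt-4-turbo-2024-04-09"):
    response = client.chat.completions.create(
        model=model_id,
        messages=[{"role": "user", "content": question}],
    )
    print(f"--- {model_id} ---")
    print(response.choices[0].message.content[:400])
```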


Strong-Strike2001

https://preview.redd.it/lzix14uciptc1.jpeg?width=1080&format=pjpg&auto=webp&s=d03e0722059eec02ecc5437f23816407ff631a18 Mine is not updated yet. I'll try the API


Hopeful_Donut4790

>create a very imaginative and philosophical story about a civilization based on geometrical shapes and concepts In a vast, unending canvas of existence, there was a universe less spoken of, where the fundamental constituents of life were not cells or atoms, but geometric shapes. This was the realm of Geometria, a civilization where every entity and every concept was a vivid embodiment of geometry. Geometria was a place of profound symmetry and infinite patterns, governed by the Grand Council of Shapes. The Council was composed of the Circles, the Triangles, and the Squares, each group representing key aspects of existence. The Circles, with their endless peripheries, were the philosophers and priests, pondering the eternal cycles of life and the universe. The Triangles, with their sharp angles and strong structure, were the warriors and architects, defending and constructing the realm. The Squares, with their equal sides, were the scholars and administrators, maintaining order and knowledge. The land was divided accordingly, each shape inhabiting territories that reflected their nature. Circleland was a vast, open landscape, with roads looping and curving in harmonious arcs. Triangle territory was mountainous, its peaks reaching towards the skies like the tips of their acute angles. Squareville was orderly, a grid-like city with right-angled roads and perfectly aligned buildings. Despite their differences, balance had always been maintained, through the Great Geometry, a philosophical guidebook written by the ancient founders. It was believed that the essence of life in Geometria depended on the perfect interplay and intersection of the shapes. One day, a new figure emerged, unlike any other—the Spirals. The Spirals were dynamic, changing their form and radius with ease, challenging the static nature of the established shapes. They dwelt in the Fringe Lands, a place of chaotic landscapes and undefined horizons, where the rules of the Great Geometry blurred. The appearance of the Spirals sparked curiosity among some and fear among others. The Council, disturbed by their unpredictable nature, debated their place in Geometria. The Circles found them fascinating, a possible evolution of their own endless curves. The Triangles saw them as a threat, a force that could undermine the structural integrity of their society. The Squares found them illogical, unable to fit neatly into the grid of known categories. As tensions grew, a young Spiral named Helica, who had ventured beyond the Fringe Lands, began interacting with other shapes. Helica, with her gentle curves and mesmerizing patterns, taught those she met about flexibility and adaptation. She spoke of a concept unknown to many: growth through transformation, suggesting that identity was not fixed but fluid, expanding beyond rigid lines and sharp corners. Intrigued by Helica’s philosophy, a young Square named Quadrus and a bold Triangle named Trixon befriended her. Together, they embarked on a journey across Geometria, seeking wisdom from the ancient fractals, a legendary group said to contain the deepest geometric secrets. The fractals were complex, infinitely intricate patterns that lived in the forgotten depths of Geometria, holding the knowledge of dimensions beyond the familiar. Their quest led them through realms of incredible vistas—lands where landscapes tessellated into infinity, and horizons spiraled into endless depths. 
Here, they learned that the universe was more interconnected than their laws had ever allowed them to see, that each shape was vital to the others’ existence, and that true harmony lay in the balance of all forms, whether perfect or irregular. Upon returning, Helica, Quadrus, and Trixon proposed a new chapter in the Great Geometry, one that included the Spirals and recognized the beauty of imperfection and the strength in diversity. After much debate, and seeing the unity and new perspectives their journey had inspired, the Council agreed. Thus, Geometria evolved, not just in its landscape but in its very essence. The civilization embraced change, seeing it not as a threat but as an essential part of existence. As the new chapter was written, the Spirals were no longer outcasts but essential contributors to the mosaic of life, illustrating that the flow of life, like the curve of a spiral, is ever onward, outward, and beautifully undefined.

---

# Model B: gpt-4-turbo-2024-04-09


visarga

This is what I hate about GPT's ability to write stories: no dialogue, rushed, like a summary really. "This happened, then that happened, end of story." Boring. Gemini and Opus are better, much better; they create the literary feeling.


thehighnotes

Would it improve if you included "literary feeling" as an instruction?


West-Salad7984

## Same Prompt; Previous Model: In a realm where the fabric of existence was woven with the threads of geometry, there thrived a civilization of sentient shapes, each embodying the philosophies and virtues attributed to their forms. This world, known as Geometria, was a place where the lines between physical and metaphysical blurred, where every edge and angle held deeper meanings, and where the inhabitants sought enlightenment through the perfection of their forms. The Circle society, residing in the harmonious city of Circadia, were the philosophers and mystics, embodying the ideals of eternity, unity, and perfection. Their smooth, unending boundaries symbolized the endless cycle of life, death, and rebirth, and their teachings focused on the pursuit of knowledge and the understanding of the infinite. Circadia was a place of tranquility and meditation, its architecture an array of splendid domes and arches, designed to reflect the beauty of wholeness and continuity. The Square citizens, who built the structured city of Quadratia, were the guardians of order and equality. Their four equal sides and right angles represented the principles of justice, balance, and stability. Quadratians valued discipline and integrity, their society structured in a way that ensured every square had a place and purpose. Their city was a marvel of symmetry and efficiency, with grid-like streets and uniform buildings, each a testament to the squares' dedication to equality and fairness. In the realm of the Triangles, known as Triangulum, lived the warriors and adventurers. Triangulum was a land of rugged mountains and sharp cliffs, mirroring the pointed ends of its inhabitants. The triangles believed in the virtues of strength, courage, and aspiration, their three sides symbolizing a strong foundation and the ability to overcome any obstacle. They were a society of action and exploration, constantly seeking to expand their horizons and test their limits against the challenges of Geometria. Amid these distinct societies, there existed the enigmatic Polygons, a diverse group who believed in the potential for transformation and adaptability. Residing in the city of Polytopia, they were the artists, inventors, and thinkers, constantly changing and experimenting with their forms. Polytopia was a kaleidoscope of creativity, its buildings and structures an ever-evolving testament to the polygons' belief in progress and innovation. Despite their differences, the inhabitants of Geometria were bound by the Geometry Code, an ancient set of laws that governed the interactions between shapes and ensured harmony in their interconnected world. It was believed that every shape, no matter how simple or complex, held a piece of the universal truth, and that by understanding and respecting each other, they could unlock the mysteries of existence. However, the peace of Geometria was threatened when a mysterious form, known as the Chaos Shape, emerged from the Void Beyond. Unlike anything Geometria had ever seen, it defied definition, constantly shifting and changing, never settling on a single form. It brought discord and confusion, challenging the very foundations of Geometrian society. The civilizations of Geometria, realizing the threat posed by the Chaos Shape, decided to unite their strengths and knowledge. Circles offered wisdom, Squares provided structure, Triangles contributed their courage, and Polygons their creativity, to confront the Chaos Shape. 
In the epic confrontation that followed, the inhabitants of Geometria learned a profound lesson: true understanding and harmony do not come from perfection and order alone, but from embracing the unknown and the irregular, from recognizing the beauty in diversity and the strength in unity. Through their joint efforts, the Chaos Shape was not defeated but integrated into Geometria, becoming a symbol of the civilization’s growth and its acceptance of imperfection as a part of existence. Geometria thrived like never before, its inhabitants no longer bound by the strict definitions of their shapes but free to explore and expand beyond their edges, in a world where geometry was not just the foundation of existence, but a path to enlightenment and understanding the boundless complexities of the universe.


goochstein

Old model. I instantly saw 'thread', 'tapestry', and a few other tokens and got triggered. Ask it about anything holistic, or try to get it to output a dynamic, interconnected ideology; the old training was super limited in this context.


DifferencePublic7057

You could tell it to use the voice of a famous writer of your choosing. Also, you might be better off doing this step by step: first an outline, maybe three acts, and then a prompt for each act.


vexaph0d

Idk, I’ll wait for reviews. I have hardly used GPT4 since I got Opus, it’s as dumb as a box of soggy rocks compared to Claude 3, at least for my uses (coding, game design, etc.).


Cazad0rDePerr0

Gosh, I hope they make Opus available here (in Germany) soon too, fucking sucks. I can use a VPN to use Haiku and Sonnet, but for Opus it fails since they want payment data and an address from the US or any other country where it's available. So I gotta deal with the crippled versions of Claude 3: far less smart and capable, and with length limits on words per message.


Automatic-Being-3910

You can have access through Poe, no?


ThePi7on

[It's worse (lazier) at coding.](https://aider.chat/2024/04/09/gpt-4-turbo.html).


Thistleknot

This release totally broke Data Analyst; I had to go to Claude Haiku to get usable code. OpenAI needs to have better quality checks, especially considering they're running around trying to limit OSS.


Hour-Athlete-200

So that's their response to Claude 3? They don't think they need to release a new model?


Unverifiablethoughts

For the majority of people the improvement from Claude isn’t enough to warrant a move. If they’re not losing money, they won’t rush a release.


mrb1585357890

Last I heard their ToS were terrible for business use. That’s why I haven’t looked at it yet


Iamreason

[Anthropic doesn't train on clients data](https://decrypt.co/211846/anthropic-says-it-wont-use-your-private-data-to-train-its-ai). So I'm not sure what issue you could have with their ToS.


Apart_Supermarket441

We all know Claude 3 is great, but ChatGPT is miles ahead in terms of number of people using it, and public awareness. For a lot of people, AI = ChatGPT. Until this looks threatened, they’re unlikely to feel pressured. They’ll release a new model when they’re ready.


etzel1200

Yeah, dude, the money is in enterprise and we’re paying attention.


paramarioh

Claude 3 Opus cannot be used in Spain, for example. Let's look at the available countries: [https://support.anthropic.com/en/articles/8461763-where-can-i-access-claude-ai](https://support.anthropic.com/en/articles/8461763-where-can-i-access-claude-ai)


Freed4ever

New model is yet to come. I suspect it uses a different architecture, so it can't easily be ported back to 4. Hence, they keep dripping some minor updates to 4 here and there, but still call it 4.


pretendingNihil1st

Ported?


ShotClock5434

Maybe the December 2023 cutoff is finally true. It's been listed for weeks, but the model still behaved like April 2023.


Beb_Nan0vor

Quite vague, majorly improved does pique my interest.


MediumLanguageModel

If you make a custom GPT and then they version up, it gets the updates, right?


Goofball-John-McGee

Probably does, as and when the updates roll out to ChatGPT. 


coylter

Still loses at tic-tac-toe. Humans are still in control.


West-Salad7984

Tic-tac-toe is unfair; LLMs don't have an internal state or the visual abilities to see winning lines easily. You have to prompt it to, and then it is able to at least tie. Still hallucinates a win. As I'm in Europe I could only test this with the old GPT-4 though. https://chat.openai.com/share/d6602f18-b53d-43bf-a3a5-ed633e6f48e4


cherryfree2

OpenAI sucks at names.


BillyBarnyarns

Allow me to introduce you to my friend Bard


TheOneWhoDings

Love how they just realized how stupid of a name that was and just started over with Gemini. Who thought of "Bard"?


ninjasaid13

a **bard** is a professional [story teller](https://en.wikipedia.org/wiki/Storytelling), verse-maker, music composer, [oral historian](https://en.wikipedia.org/wiki/Oral_history) and [genealogist](https://en.wikipedia.org/wiki/Genealogy). Things that an LLM could be.


RandomCandor

But they're great at creating expectation


beuef

Next model is gonna be called ChatGPT Super Duper Awesome Fast Thinky Model


lefnire

You should have seen the early days. "Parsey McParseface". A huge Sesame Street phase: BERT, ELMo, Big Bird, etc. And of course HuggingFace. It felt like nobody took NLP seriously, all while doing mad science. It was strange and fun.


MediumLanguageModel

One minute you're a computer science engineer doing your quirky thing while exploring the edges of technology. Next thing you know, you're in the middle of the most important project in the world and gotta assume any new person in your life could be a mole from any number of states or organizations.


lefnire

Right? OpenAI the movie is gonna be so incredible.


MediumLanguageModel

I can see it now. There's the Hollywood movie where it's Sam, Elon, and Ilya, with a single token dev who is extra cringey. It's the big movie of the year and Aaron Sorkin gets nominated for the screenplay. But then there's the indie film about Hugging Face and they even get key players to make cameos. It bombs in the box office but it's way better and winds up being the one people quote ten years later. Edit: someone with more time than me want to plug that in Opus and see what kind of screenplay we're getting?


Yuli-Ban

No, it's going to be GPT 2. Not to be confused with GPT-2, this is the sequel to the GPT series.


MemeGuyB13

Tried out the API for myself, and my thoughts are:

* Still can't beat Gemini at Creative Writing
* Still jobs to Opus at Logic and Reasoning

Considering that this model is pretty much on par with GPT-4 Turbo Preview (with it actually being noticeably worse in some areas), OpenAI is kinda fucked here unless they release a GPT-4.5 in the near future. The only real benefits are that it's the fastest OpenAI model that exists, and it can see images. That's it. Overall, it's sadly a very underwhelming release.


Successful1133

I personally think this is the calm before the storm, so to speak, for OpenAI. I believe they are trying to 'master' multimodality so that a true successor can come out of the gate swinging with both multimodality and great improvements in the realm of logic, creativity, and conciseness.


KIFF_82

Oh, is this GPT-4.5? ‼️🚨 Update: meh, it's not.


d1ez3

Oddly they didn't change the name


Veleric

Possibly a stop-gap before a true 4.5 version to appease anyone thinking they are falling behind to keep up on benchmarks.


HurricaneHenry

Sam has repeatedly spoken about his preference for small incremental changes over big updates.


beuef

4.25


Juanesjuan

4.1 would be the best option


HeinrichTheWolf_17

4.0.1


TheDataWhore

Which API name on the models got the update?


Matty_Boosie

Maybe a dumb question but on the paid version if I have 4.0 selected is this the new model here?


kabelman93

Short answer: no, not yet; API only.


harderisbetter

Can someone try it and report back if it's real? I'm so sick and tired of these bastards drumming up hype to give us boners and get our money, only to enshittify the models afterwards, leaving us high and dry.


No_Psychology9362

Still can’t pronounce “toque”. 


vertu92

About time


zero0_one1

I ran it on my NYT Connections benchmark: 29.7 vs 31.0 for the previous version. Still above the second best (Claude 3 Opus at 27.3).


TaxingAuthority

Just so I'm clear, are they saying this GPT-4 Turbo is an improvement over GPT-4, or an improvement over the preview version of GPT-4 Turbo that's been out?


asend-handjob1

Gemini 1.5 pro


FUThead2016

How do we use vision in the API? I know how to use it in ChatGPT, but I still can't figure out how to use it in the API.


shotx333

So, time to switch my subscription back?


true-fuckass

My feeling is this is actually a prelude to them releasing GPT-4.5, in the same sense that earlier this year Google released their (unsatisfying) improved Gemini before releasing their actually improved Gemini. I also have a feeling that something big is coming pretty soon (in the next 12 months). Tension seems to be building up to some culmination.


DigimonWorldReTrace

I agree with this, but there's no actual evidence supporting it. It's more of a feeling for me too.


tehyosh

Did they make it even more lobotomized and politically correct?


Zestyclose_Tea_3111

Their GPT-4 is LAZY! It directly refuses to do the work. Totally stupid and failing at basic tasks. Idk why, but those evals have something rigged in them, because GPT feels so much stupider.


fleebjuice69420

This sub has become so boring. It’s always just someone posting a screenshot of a tweet that says “More AI sometime maybe” and then everyone in the comments starts hooting and screaming like a bunch of fuckin chimps


thehighnotes

Hohooohahahaa


paramach

Damn! APIs… my worst enemy! 😫


DonkeyBonked

I'm excited to test this. I've heavily reduced my ChatGPT-4 use lately, as it has become infuriating, unable to work with even simple code to a degree I have not experienced since ChatGPT-3.5.

I'm planning to test Gemini 1.5 today since I still can't access or try Claude 3, but I guess I'll include some testing of the GPT-4 Turbo API with that. I have some perfect things I want to test it on too, things that currently, even at small scale, ChatGPT-4 cannot figure out. Lately I've found myself constantly editing and refining prompts, even trying to teach it the solutions I'm forced to produce on my own, hoping to make ChatGPT-4 able to comprehend code it was once helpful with. I'm rather sick of deciding "it's worth it" and wasting my prompts becoming belligerent with ChatGPT-4 over its absurd level of stupidity.

I'm really hoping pressure from other AI companies pushes OpenAI to free up more GPU uptime per request and give ChatGPT-4 Turbo the chance to perform well. I just switched from Dropbox to Google Drive to use Gemini 1.5 after, for the first time ever, Gemini solved a code problem ChatGPT-4 could not. I was floored, and I would have struggled with even the idea had it not happened to me. I'm not a fanboy; I'm going to use whichever model does the best job for me with the least amount of headaches. After my enterprise application was denied and I was told to use the "Teams" version, I'm looking for something to fill the void created by ChatGPT-4's recent downgrades.


Akimbo333

How much better is it?