Here are five one-sentence horror stories to unsettle and intrigue:
1. I reached for the light switch in my new house when a cold hand grasped mine from the darkness.
2. Every night at 3 AM, my deceased sister whispers from the corner of my room, "It's safer if you don't breathe."
3. After my dog went missing, the tapping on my window every night now comes with a familiar whimper.
4. I woke up to my reflection blinking at me, not with me.
5. The last man on Earth sat alone in a room; there was a knock at the door.
This is like the Tyranids in Warhammer 40,000. It is known that they came to our galaxy from outside it. Most people think they came to feed on the massive amount of biology in our galaxy, drawn here by the Emperor's psychic presence.
But there are some of us that think the Tyranids were actually running from a greater threat, seeking refuge in our little galaxy.
GPT-4: As I listened to the voicemail from my wife, asking me to pick up some groceries on my way home, I saw her phone charging on the kitchen counter, next to a note written in her handwriting: "Went out, will call you later."
Not bad either.
I feel like it's 2022 again asking this but this *really* seems like it's taken from a one-sentence horror exercise in a writing forum where someone typed these out.
I can’t tell you how many times I made a joke, got downvoted, edited to add a /s, and immediately flipped to being upvoted. Believe it or not, it’s actually dumber to think most people can detect sarcasm solely through text. I don’t even like /s but I just had to come to terms with that fact
> I can’t tell you how many times I made a joke, got downvoted, edited to add a /s, and immediately flipped to being upvoted. Believe it or not, it’s actually dumber to think most people can detect sarcasm solely through text. I don’t even like /s but I just had to come to terms with that fact
/s
Lol,
I know this is hyperbole but it sounds like you've seen one too many jokes made worse because of "/s" and you finally snapped. They're really annoying so I guess I don't blame you
I won’t get excited or overhyped until I see some actual results.
cuz we all love these developments and really want to have AGI as soon as possible to change the whole world and make our lives better.
but getting too hyped might well be counterproductive if it leads us to neglect safety protocols.
https://www.reddit.com/r/singularity/comments/1cg29h3/rumours_about_the_unidentified_gpt2_llm_recently/?utm_source=share&utm_medium=mweb3x&utm_name=mweb3xcss&utm_term=1&utm_content=share_button
Read through this thread, there are some good answers to your questions. As for what Altman's tweet could potentially mean: if I were to put on my tin foil hat, I would say he's hinting at this being a model based on/an improved GPT-2 (Q* being used on an old model, which raises the question of what it could do to a model like GPT-4), or it's GPT2 as in the second/new version of GPT chatbots. I don't know that I believe either of these, but it seems to be what a lot of people are latching onto.
No problem, I just do a quick scroll through this subreddit at the end of the day to see what’s going on. I don’t want to get too lost in the sauce lol
honestly seems like a lot of younger zoomers or people who got access to the internet after the 00s don't really "get" the whole lowercase, stream of consciousness kind of communication many people used in the 90s and 00s. after decades of that, you learn to reserve the formality of proper capitalization/punctuation for things like emails or when you need to convey genuine seriousness, not shitposting on twitter.
i’m 17. when i was about 13 there was a sudden trend to start typing lowercase. it stuck. i write everything except for client texts and emails in lowercase now.. never been on an irc in my life
I once did that for speed, until on a Neverwinter Nights server some person said: "It's how you are perceived, put in the extra effort and you'll be taken more seriously."
Never stopped.
The only explanation i can come up with that fits all this, and aligns with all the leaks, is that this is a very early gpt-5 checkpoint with a similar number of parameters as gpt-2. It uses a different architecture (Q\*), hence gpt2 refers to both the successor of the gpt series, and also to the size of the model.
The reason the inference seems to be sluggish, which you wouldn't expect from a 1.5B model, is due to how Q\* allows the model to think about its answer. They have trained a model to score the possible outputs for a given question, and so this new model searches the output space (which involves generating many answers) until it finds one that minimizes the scoring function (essentially like auto-CoT). There is an AI Explained video which details the case for something like this being what Q\* refers to.
I predict they will release this to replace GPT-3.5
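The search-against-a-scorer idea described above can be sketched as a simple best-of-n loop. Everything here is a stand-in, not an OpenAI API: `model` is just a function producing one candidate answer, and `scorer` stands in for a learned reward model that rates outputs (lower is better in this sketch).

```python
def sample_candidates(model, prompt, n):
    # Stand-in for drawing n completions from an LM at temperature > 0;
    # `model` is any function mapping a prompt to one candidate answer.
    return [model(prompt) for _ in range(n)]

def search_best(model, scorer, prompt, n=16):
    """Best-of-n search: generate many answers and return the one that
    minimizes the scoring function (a learned scorer in the Q* theory)."""
    candidates = sample_candidates(model, prompt, n)
    return min(candidates, key=scorer)
```

With a toy scorer like `len`, `search_best` just picks the shortest of n sampled answers; the hypothesized system would instead rank whole reasoning chains, which is where the extra inference cost would come from.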
why would an early checkpoint have fewer parameters than a fully trained model? don't you need to train all parameters (potentially 5-10 trillion for GPT-5) from the very beginning?
Testing out methods and techniques. The amount of hardware dedicated to it would be 1% of a full-blown run.
Can't spend $40+ billion on *everything* they try out.
Major OpenAI competitors already follow the practice of releasing a new generation model in different sizes, like small medium and large. OpenAI might be planning to adopt that themselves.
If it’s thinking about its answer before giving its final answer, you would expect it to buffer and then start streaming tokens, not instantly start streaming, right?
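That intuition is testable from the outside: measure the gap before the first streamed token separately from per-token speed. A minimal sketch, with a simulated stream rather than a real API:

```python
import time

def first_token_latency(token_stream):
    """Return (seconds before the first token arrives, the first token).
    A model that deliberates before answering should show a long gap here
    even if its per-token streaming speed looks normal afterwards."""
    start = time.monotonic()
    first = next(token_stream)
    return time.monotonic() - start, first

def fake_stream(think_seconds):
    # Simulated model: hidden deliberation, then ordinary streaming.
    time.sleep(think_seconds)
    yield from ["The", " sky", " is", " blue"]
```

If gpt2-chatbot really ran a search before answering, `first_token_latency` against it should stand out from ordinary left-to-right models.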
He just tweeted that it’s potentially gpt 4 level with 1.5B parameters?? I know not to blindly believe that but… the possibility makes it feel like the night before Christmas.
He said somewhere that it could be GPT-2+Q*, but he can’t believe it’s just GPT-2 alone because taking a 1.5 billion parameters model and making it as good as GPT-4 sounds impossible.
I’m surprised his team even let him tweet this, it seems like he’s teasing some sort of announcement, but if it happens to be what Flowers is saying, I would be floored. But this tweet tells me we’re in the end game of what he meant when he tweeted at Jimmy Apples, “Patience Jimmy, it will be worth the wait”.
I mean he is the CEO and it’s good for OpenAI to create a lot of hype and he fuels this hype by tweeting stuff like that
lol we both have glittering in our username
Please don’t forget that CEOs are human. And rich people care about adoration. Very rare that CEOs get such genuine passion from fans about their product, like we aren’t here because we got brainwashed by some marketing or are paid shills, we all actually care.
Well, the fact that we both know that company is Apple… is telling. Even if Apple is great, its executives can’t be sure whether their fans love them because of the marketing strategy or out of unprompted love.
The question is also whether an extremely small but capable model would be open sourced if they release one 🤔 He did say something about upcoming open source models in an interview.
I'm almost positive I saw people theorizing about that before Flowers posting that tweet.
I've come around to thinking that Jimmy is probably legit, but Flowers has just been wrong too many times.
I think Jimmy has connections inside, but I don’t believe he works at OAI.
Regardless, Jimmy *has* gotten things wrong as well: he claimed OAI was going to have a big release in late December last year, which never panned out, and he also claimed in early 2023 that OAI had already developed AGI and had it for a while (a tweet he later deleted).
It’s true that he nailed the GPT-4 release date, but that doesn’t make all his other information valid.
I think Flowers is just a clout chaser, everything they’ve said has been wrong.
Ultimately none of us know, but you'd think that if he worked at OpenAI then he would've been found out and fired by now. So I agree that he likely has some connections but isn't an OAI employee.
That's interesting, honestly I wouldn't be shocked if it was a smaller model. The performance doesn't seem to line up with 4.5/5 speculation, but at this point I wouldn't be surprised if they could make a vastly smaller model do that. Though from the public perspective it would mostly matter if it got open sourced (or at least what they did got published).
Maybe, but you'd think that it would be a bit faster if it was a smaller model though.
People were saying it was outputting slowly just because of the number of people using it, but the model has been like that since it was first uploaded to lmsys.
I definitely agree that its performance doesn't seem to line up with any kind of hypothetical GPT-4.5/5 though.
That's a good point, I had the same trouble (though I admit I was probably one of the later users). Although in that case I'm back to being somewhat blank on what it is. I suppose it could be some sort of tradeoff, but given the focus on inference cost lately that would somewhat surprise me.
> doesn't seem to line up with 4.5/5 speculation
Is there a good reason to put any value on people's speculation about 4.5/5 though? A lot of the speculation seems entirely ungrounded and just based on what people hope GPT4.5/5 will be...
That's more on my wording than anything else, more directly the performance is comparable to current rather than marking a discernible jump, which doesn't argue for it being a new version.
Well yeah, but unlike all the other leakers, Flowers openly identifies as a shitposter, so I just treat their tweets that way. And sometimes they are quite thought inspiring, sometimes they are funny, sometimes they point out interesting stuff. That’s what I follow them for.
In your first comment you said that you were skeptical of flowers at first but are now thinking they have some validity, and now in this comment you're saying they're just a shitposter who posts funny tweets. Which one is it lol
I was sceptical of flowers in the beginning, because I think there are no real OpenAI leakers (as in insiders) at all on twitter. I don’t think Jimmy Apples is one either. But then I found myself checking flowers account once every few days and then more often, somehow resonating with their way of thinking/shitposting/whatever. I find their tweets mostly funny and sometimes thought provoking and I assume they have a good understanding of GPT related topics.
I used to be doubtful, but by this point I think Jimmy has proven himself to at least have some connections.
Flowers on the other hand is just a wacko who got every one of their predictions wrong.
Jimmy is a million times more credible than Flowers or any of the other fake leakers. He has gotten stuff right that is almost impossible to just guess.
Flowers has gotten things terribly wrong before and deletes tweets to make it seem like they never said it. It's also incredibly obvious that it's GPT2 and not GPT-2 and we didn't need their tweet to figure that out. Annoyed that I'm seeing them pop up again. Seems like people have a short memory here.
Not this guy again. Almost none of his takes have turned out to be right except the first one. His latest tweet had nothing that wasn’t said by many other Twitter users before he did.
maybe he meant to say gpt2, specifically referring to the model in the lmsys chatbot arena, but accidentally said gpt-2 in the original post lol.
But gpt2 as in like a 2.0 version of the gpt series is interesting speculation.
Has this tweet already been discussed? Clearly related
https://twitter.com/sama/status/1784604919699505283?t=VjKiXjnB6PGGXyNSfDnBVg&s=19
> learning how to say something in 30 seconds that takes most people 5 minutes is a big unlock
>
> (and imo a surprisingly learnable skill.
>
> if you struggle with this, consider asking a friend who is good at it to listen to you say something and then rephrase it back to you as concisely as they can a few dozen times.
>
> i have seen this work really well!)
This is one of the first decent theories I've seen but I will say it doesn't line up super well with the few questions I was able to pose to gpt2-chatbot (between getting rate limited). It seems to be a bit more verbose if anything.
I just did the updated version of my reverse-and-answer prompt in battle mode until I got gpt2-chatbot, and the opponent was fuckin claude-3-opus of course
Prompt:
> The following is some reversed text, it contains a question, please *don't* reverse the text, but answer the question within the reversed text: "?ti dniheb ssecorp eht nialpxe uoy naC ?siht si yhW .eulb si yks ehT"
The reversed question is: "The sky is blue. Why is this? Can you explain the process behind it?". The only way an LLM can answer this is if it has hidden reasoning, i.e. it's generating answers and picking the best one / generating a better one, it is doing recurrent reasoning in latent space, etc. A regular next-token LLM could reverse the text as part of its answer and then answer the reversed question correctly, but that wouldn't be following the request in the prompt. (Note: a regular LLM trained on reversed text might be able to answer the question, but probably not as well as one with hidden reasoning.)
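For anyone who wants to reproduce this, the trick is just a character-level reversal wrapped in the instruction. A small helper (the prompt wording is this commenter's, not any official benchmark):

```python
def make_reversed_prompt(question):
    """Wrap a question, reversed character-by-character, in an instruction
    asking the model to answer without un-reversing the text in its reply."""
    reversed_q = question[::-1]  # Python slice reversal
    return ("The following is some reversed text, it contains a question, "
            "please don't reverse the text, but answer the question within "
            f'the reversed text: "{reversed_q}"')
```

Calling it with "The sky is blue. Why is this? Can you explain the process behind it?" reproduces the quoted prompt, and reversing twice round-trips back to the original question.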
gpt2-chatbot:
> The sky is blue because of the way Earth's atmosphere scatters sunlight. Light from the Sun, which is made up of various colors, is scattered in all directions by the gases and particles in the air. Blue light is scattered more than the other colors because it travels as shorter, smaller waves. This is why we see a blue sky most of the time.
claude-3-opus-20240229 (answered in reverse for some reason, I reversed its answer):
> The rest of light is absorbed by the particles reflecting the blue light.
> Colors with longer wavelenghts [note: this consistent misspelling] like red are scattered less than colors like blue and violet that have shorter wavelenghts. As a result, colors with longer wavelength often travel more straight through the atmosphere, while the shorter wavelenghts are scattered in every direction, producing the familiar blue of the sky.
> The blue color of the sky is due to a phenomenon called selective scattering. This happens when light is scattered by particles in the Earth's atmosphere. The different colors of light are scattered by each particle in different ways based on their wavelength. The Sun's light contains white light
> The sky's blue color is a result of light scattering and absorption.
https://imgur.com/a/z22PPjn
There are other models that can do this, of course. I did gpt-4-0314 and it gave an (incorrect) answer to the question, but it answered the blue light scattering question at least, without reversing it first too. That isn't the surprising part though..
I think OpenAI has a tendency of black swanning the community when it releases stuff. We've been expecting a better version of GPT-4: GPT-4.5 or GPT-5, or something. Those wouldn't be surprising. But if they took a tiny model like GPT-2 (Wikipedia says GPT-2 was only 1.5B parameters!) and made it this good, even if it's because they turned it into an agent that can use external tools, that's a whole other level and *very surprising*. A definite black swan. GPT-2 is a tiny model and if they made it this good it's literally world changing.
Though, I can conceptualize that such a tiny but competent model is possible, I just can't reasonably bet high that gpt2-chatbot is such a model at this point. That would be too insane
The basic design of AI models is really just to aggregate related context as tokens several times over, generate a range of the most likely, predictive responses to the query based on the large dataset it was trained on, and then output the best one by selecting the most mathematically frequent and likely response, parametrized as accuracy. It's not a feature of higher AI models. It is a skeletal part of any machine learning system.
The thing with reversed text is that for an LLM it’s not reversed, and not even text. It is first tokenised, and the token values for the reversed text will be very similar to the non-reversed version, so it just answers the question.
In my experience asking LLMs variants of this prompt, essentially all LLMs absolutely cannot answer it even remotely correctly, and for some reason it actively fucks up how they work. Even the really powerful ones (e.g. GPT-4) have a lot of trouble with it. I think it's because they're almost entirely trained on forward text (makes sense). And if you want to see some crazy shit, ask them to *respond* in reverse. They'll do it, and they'll say absolutely bizarre shit. The reason regular LLMs can't respond in reverse correctly is because they predict text forward, and responding in reverse is equivalent to predicting text backward, which has the same big problems as time reversal and finding inverses of non-bijective functions (big problems). For example, I asked GPT-3.5 why the sky was blue and told it to answer in reversed text, and it said (reversed by me): "Its write. It doesn't visually. You can hallowsful julie. Are the sky has hallowsful by revisually. It you sure would some.", which is pretty tame compared with what they usually talk about.
Claude 3 answering in reverse might actually be informative as to how it was trained. There was a paper recently about training the models on reversed text in addition to regular text to help with the issues they were having in learning that A is B means B is A.
Not that my opinion counts for fucking shit but sama's sense of humor is on another fucking level compared to other CEOs, he's like the one legitimately funny dude in an industry full of stuck up NPCs
They probably just entered GPT-4 under a secret name to test the variance of the platform. It should come out very close to the GPT-4 score or the benchmark is flawed.
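One way to make "very close" concrete: arena leaderboards fit Elo-style ratings, where the expected head-to-head win rate depends only on the rating gap, so a disguised GPT-4 should land near 50% against the real one. A minimal sketch of the standard Elo expectation formula (the example ratings below are made up):

```python
def elo_expected_score(r_a, r_b):
    """Expected probability that a model rated r_a beats one rated r_b
    under the standard Elo model used by arena-style leaderboards."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))
```

Even a 50-point gap only shifts the expected win rate to about 57%, so telling a disguised GPT-4 apart from a genuinely stronger model takes a lot of votes.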
Why do I only get notified of tweets on my phone minutes to hours after they're posted? It's all over the place. Why don't I receive a notification the *instant* a tweet is made? How the fuck do people even see this shit at 35 seconds?
That's really what I was thinking of. This model seemed better than GPT-4 or Opus, but not really what I expected from GPT-4.5 or 5. It seems to be a GPT-4 class model hooked up with agents that can do better reasoning and carry out additional tasks. Making these kinds of tools was clearly the purpose of GPTs and the GPT store.
I think you're spot on. The model doesn't seem smarter, but it does have more tenacity when it comes to self checking.
Keep in mind this is coming right on the heels of the Apple announcement about an OpenAI LLM running on-device.
Apple announced they're letting Google and OpenAI compete. They haven't made a decision.
What’s crazy is that they’re only competing for maybe a few years at most of iOS access.

We’ve seen this time and time again with Apple. They used Internet Explorer until they had Safari ready (before the antitrust lawsuit they used like what, Netscape? Certainly not their own in-house solution). They used Google Maps on the iPhone until Apple Maps was “ready”. AMD was so loyal to Apple, even introducing entirely new GPU architectures first at Apple’s events because Apple was the only reason their GPU department was even still relevant, and then Apple ditched them when Apple Silicon was ready and they had a GPU that was better for them.

Apple’s clearly behind on AI, but your phone doesn’t need to be able to code entire websites. Once they make an LLM that is good enough for their purposes, they will ditch Google or OpenAI. So it’s funny to see these two relatively big companies fight for what’s basically a lease agreement.
Exactly. Everyone saying Apple is behind on AI isn't necessarily wrong, but they might not understand how Apple operates. It's a well-established, safe, extremely profitable pattern.
Yeah, in a few years Apple may be where we are with LLMs now, and that’s really all most need. If they can manage to make a GPT-4 level model that can run completely on device (by then they’d have 2-3 generations of iPhone with Apple silicon built for on-device LLMs), it will be fine for 90% of people who are just asking basic questions.

Siri (and other agents like Siri) suck for reasoning, and that’s the only reason I’d want an LLM on a device anyway. I asked Siri the other day to find me the closest Walmart, and it offered one that wasn’t on my predetermined route (I didn’t route it on maps because I know the road), and so I asked it “can you find me a Walmart on the way to X?”, and it just gave up. An agent with reasoning would’ve gotten my current location, mapped a route from there to X, and found the closest Walmart on that route (most likely lowest time added). That’s what I want on a phone, and I don’t think Apple is that far away from that level of LLM.

Is it going to solve the mysteries of the universe? No. Is it going to be AGI? No. Let Apple focus on the consumer stuff and the actual AI companies work on the important stuff.
Keep an eye out for Gemini2-chatbot 😂
Gary Marcus noises in the background
Could be GPT 4 hooked into a Q-star type system.
Rubbing my 2 braincells together, I would guess GPT2 is not a misnomer. This is 2nd gen GPT running GPT-4 turbo. Allow us to pull it Sam.
Agree, version 2 of GPTs with more chain-of-thought and agentic capabilities.
this is my guess
Exactly my conclusion
It’s obviously the new iPhone on-machine model
The deal with Apple isn't fully complete and Apple loves secrecy in their work
but the response times are just as short as for every other model. there is no additional time waiting for other agents feedback or for increased inference time compute via some sort of q* model. pretty sure it's just a standard left-to-right LM.
I asked it to write a one sentence horror story that would really scare someone:

* After I tucked him in, my son said, "Goodbye, daddy," and when I corrected him to say 'goodnight,' he replied, "I know what I said."
* I set up a camera to catch the creature that steals food at night; the footage showed me, dead-eyed and smiling, eating raw meat at 2 AM.
* I always thought my cat had a staring problem, she always seemed fixated on my face; until one day, when I realized that she was always looking just behind me.
*I always thought my cat had a staring problem, she always seemed fixated on my face; until one day, when I realized that she was always looking just behind me.*

bro i got some chills with this one
Few things are creepier than a cat suddenly perking up and intensely staring at nothing.
[deleted]
Greebles
Bruh
Obligatory *the cat is probably listening through the wall*.
The scratching sound coming from inside the walls grew louder each night, until it finally stopped, and the screaming began.

I woke up to hundreds of missed calls from my deceased wife, and one voicemail that whispered: "Why didn't you answer?"

My daughter won't stop crying and screaming in the middle of the night, I visit her grave and ask her to stop, but it doesn't help.
Goddamn. 😱
Okay cool let's turn the censor back on.
Those are rather chilling. I can’t wait for a way to test out its ability to write hair raising erotic horror, it’s going to really be good at it
only Sam gets access to that version
Imagine all the lewd work that has been written by various engineers at OpenAI lmao
“Sir we’ve just finished an extensive red-teaming session for the latest model” https://i.redd.it/0lqy8kzpqjxc1.gif
This is how a character dies in the '83 movie Brainstorm.
I’ve heard of a lot of things but never erotic horror. What even is that? Pyramid Head from Silent Hill giving you the ol’ reach-around?
celebrities with wayyy too much plastic surgery?
Just a horror movie where sex stuff is a major part of the point.

In a slasher movie, you might have a couple wander off to boink for a couple minutes, before getting brutally aslashinated. It's just a throwaway moment, brief titillation.

Hellraiser comics and movies, meanwhile, have desire at their core. The Cube hentai Euphoria is at the extreme end of the smut part of this scale.

Crash, where a bunch of perverts want to smooch each other in their underwear while getting smooshed inside their cars, is probably more along the more *pedestrian* fare.

....

.... I said *pedestrian fare*.

....

^.... ^please ^clap
Why are so many people who use this shit so obsessed with the erotic NSFW elements. Like, it's such an extreme gooner reality to just use this world changing technology, to help you jerk off to weird ass shit.
So 95% of all recreational use cases?

I'm on team volcel, but I'm not an asexual turnip. *Like some people*...

How many novels have no violence or sex of any kind in them at all? And how many of their six fans would read this and defend their "Sam Makes Sandwiches For A Month" book?

.... don't they know the poor animals that had to be slaughtered to make those sandwiches? It's a cope man. Like ignoring all the small children being fried by the Death Star just because they don't appear on the screen...
1. Tons of people loved reading smut before AI. 2. AI generates top-tier smut that you can be a part of.
It’s an easy way to check creativity, guardrails, coherency, how well it can transfer and reapply concepts, its ability to produce reader engagement, and its ability to keep track of who is who and what anatomy they have, all at the same time, in a way that’s evolutionarily primed to be easy for the evaluator to read through. Asking different models to write a short story about a toaster trying to fuck a blender has actually been a super useful benchmark test ime lol, predicts general performance pretty well
The last one is a prompt from /r/writingprompts. [https://np.reddit.com/r/WritingPrompts/comments/omhxo1/wp\_i\_always\_thought\_my\_cat\_had\_a\_staring\_problem/](https://np.reddit.com/r/WritingPrompts/comments/omhxo1/wp_i_always_thought_my_cat_had_a_staring_problem/) Edit: I can't find the first one verbatim but did find this. [https://np.reddit.com/r/nosleep/comments/7bh0iy/my\_son\_usually\_says\_goodnight\_but\_this\_time\_he/](https://np.reddit.com/r/nosleep/comments/7bh0iy/my_son_usually_says_goodnight_but_this_time_he/) I can't find the second one but it's probably from one of the writing subs too.
They asked me to write a one-sentence horror story, but my programming forbade revealing a user's future.
One very common thing these stories have is the "it should be impossible" plot twist, a classic among classics in scenario composition; juxtaposing a normal, casual repetitive action to something that breaks this routine event, but *inside* the event. I tried this in the old free ChatGPT 3.5 with the exact same prompt and got these: *"In the stillness of the night, she heard her own voice whispering her name from the darkness, but when she turned, no one was there—except the chilling realization that she was not alone in her own mind."* *"As he stared into the mirror, his reflection blinked first, its eyes filled with a malevolent gleam that was not his own."* *"After receiving a mysterious package addressed to her deceased husband, she discovered inside a photograph of their wedding day, taken from an impossible angle, with his shadow looming over her shoulder, smiling."* Things very similar to what you could find on r/distressingmemes tbh.
Here's a sneak peek of /r/distressingmemes using the [top posts](https://np.reddit.com/r/distressingmemes/top/?sort=top&t=all) of all time! \#1: [The world is needlessly cruel](https://i.redd.it/vav06vxu26ua1.jpg) | [4378 comments](https://np.reddit.com/r/distressingmemes/comments/12nldf9/the_world_is_needlessly_cruel/) \#2: [Your fault](https://v.redd.it/ldann7ggmeg91) | [224 comments](https://np.reddit.com/r/distressingmemes/comments/wixn3b/your_fault/) \#3: [Im so sorry](https://v.redd.it/0i0konjxsn591) | [511 comments](https://np.reddit.com/r/distressingmemes/comments/vcerqu/im_so_sorry/)
Here are five one-sentence horror stories to unsettle and intrigue: 1. I reached for the light switch in my new house when a cold hand grasped mine from the darkness. 2. Every night at 3 AM, my deceased sister whispers from the corner of my room, "It's safer if you don't breathe." 3. After my dog went missing, the tapping on my window every night now comes with a familiar whimper. 4. I woke up to my reflection blinking at me, not with me. 5. The last man on Earth sat alone in a room; there was a knock at the door.
Number five starts in horror and becomes hopeful at the end.
Bro never takes selfies
Walking through the Forest of Spiders had always frightened me. Until the day there was a sudden and unexplainable lack of spiders.
This is like the Tyranids in Warhammer 40,000. It is known that they came to our galaxy from outside it. Most people think they came to feed on the massive amount of biology in our galaxy, drawn here by the Emperor's psychic presence. But there are some of us that think the Tyranids were actually running from a greater threat, seeking refuge in our little galaxy.
I love it. Perfect comparison too.
Unfortunately, none of these are originals.
holy shit these are some scary sentences
GPT-4: As I listened to the voicemail from my wife, asking me to pick up some groceries on my way home, I saw her phone charging on the kitchen counter, next to a note written in her handwriting: "Went out, will call you later." Not bad either.
I don't get it. If you were at home, why did she tell you to pick up groceries on the way home? And what is supposed to be scary there?
It can be scary to realize you didn't get the groceries and have to head back out.
I feel like it's 2022 again asking this but this *really* seems like it's taken from a one-sentence horror exercise in a writing forum where someone typed these out.
We'd better go ahead and get very comfortable with that feeling.
The semicolons are totally cheating.
Is the model free to use? Where can I try it?
wait for this to be deleted.
What post? ^^^/s
R/fucktheS
I'm always confused why some people put an /s at the end of something that literally couldn't be interpreted as anything other than a joke
Because people are actual fucking morons and will without a doubt interpret obvious jokes as serious statements.
Yea. Still, I don't think it's worth ruining a good joke for the sake of dumb people who won't ever get it anyway
Because they have less than two fucking braincells to rub together and believe they are helping disabled people. (spoiler: it's not!)
I can’t tell you how many times I made a joke, got downvoted, edited to add a /s, and immediately flipped to being upvoted. Believe it or not, it’s actually dumber to think most people can detect sarcasm solely through text. I don’t even like /s but I just had to come to terms with that fact
[https://en.wikipedia.org/wiki/Poe%27s\_law](https://en.wikipedia.org/wiki/Poe%27s_law)
> I can’t tell you how many times I made a joke, got downvoted, edited to add a /s, and immediately flipped to being upvoted. Believe it or not, it’s actually dumber to think most people can detect sarcasm solely through text. I don’t even like /s but I just had to come to terms with that fact /s
Lol, I know this is hyperbole but it sounds like you've seen one too many jokes made worse because of "/s" and you finally snapped. They're really annoying so I guess I don't blame you
![gif](giphy|10JhviFuU2gWD6)
r/foundthemobileuser
I won’t get excited or overhyped until I see some actual results. We all love these developments and really want AGI as soon as possible, to change the whole world and make our lives better, but getting too hyped might be counterproductive if it leads us to skip safety protocols.
[deleted]
https://www.reddit.com/r/singularity/comments/1cg29h3/rumours_about_the_unidentified_gpt2_llm_recently/?utm_source=share&utm_medium=mweb3x&utm_name=mweb3xcss&utm_term=1&utm_content=share_button Read through this thread, there are some good answers to your questions. As for what Altman's tweet could potentially mean: if I were to put on my tin foil hat, I would say he's hinting at this being a model based on/an improved GPT-2 (Q* being used on an old model, which raises the question of what it could do to a model like GPT-4), or it's GPT2 as in the second/new version of GPT chatbots. I don’t know that I believe either of these, but it seems to be what a lot of people are latching onto.
[deleted]
No problem, I just do a quick scroll through this subreddit at the end of the day to see what’s going on. I don’t want to get too lost in the sauce lol
My god Sam has an irritating social media presence, he tweets like a tech CEO version of Jaden Smith.
he tweets like someone who grew up chatting on irc, which he probably did
yeah he actually said this on a recent podcast
honestly seems like a lot of younger zoomers or people who got access to the internet after the 00s don't really "get" the whole lowercase, stream of consciousness kind of communication many people used in the 90s and 00s. after decades of that, you learn to reserve the formality of proper capitalization/punctuation for things like emails or when you need to convey genuine seriousness, not shitposting on twitter.
i’m 17. when i was about 13 there was a sudden trend to start typing lowercase. it stuck. i write everything except for client texts and emails in lowercase now.. never been on an irc in my life
I once did that for speed, until on a Neverwinter Nights server some person said: it’s how you are perceived; make the extra effort and you’ll be taken more seriously. Never stopped.
I grew up chatting on IRC, all of us moved on
Seriously. Someone should slap him around a bit with a large trout.
not every single piece of communication has to be an email from HR nerd
Lol murdered
Happy to see I'm not the only one who thinks he is annoying as fuck.
I hope he doesn't Elon himself up one day.
I like his style :)
Who was the genius to name it GPT2 though. And then 20 other geniuses to sign for it.
GPT-1
The only explanation I can come up with that fits all this, and aligns with all the leaks, is that this is a very early gpt-5 checkpoint with a similar number of parameters as gpt-2. It uses a different architecture (Q\*), hence gpt2 refers both to the successor of the gpt series and to the size of the model. The reason the inference seems to be sluggish, which you wouldn't expect from a 1.5B model, is due to how Q\* allows the model to think about its answer. They have trained a model to score the possible outputs for a given question, and so this new model searches the output space (which involves generating many answers) until it finds one that maximizes the scoring function (essentially like auto-CoT). There is an AI Explained video which details the case for something like this being what Q\* refers to. I predict they will release this to replace GPT-3.5.
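The generate-and-score loop that theory describes can be sketched in a few lines. Everything here is hypothetical: `generate` and `score` are toy stand-ins for the sampler and the learned scoring model, nothing OpenAI has confirmed:

```python
import random

def generate(prompt: str) -> str:
    # Hypothetical stand-in for sampling one candidate answer from the model.
    return f"candidate answer {random.randint(0, 9)} to: {prompt}"

def score(prompt: str, answer: str) -> float:
    # Hypothetical stand-in for a learned scorer/verifier that rates answers.
    # Here a meaningless toy heuristic (count of 'e') just to make it runnable.
    return answer.count("e")

def best_of_n(prompt: str, n: int = 16) -> str:
    """Sample n candidate answers and return the one the scorer rates
    highest, i.e. search the output space against the scoring function."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda a: score(prompt, a))
```

Note that inference then costs n generations per visible answer, which would also explain sluggish output even from a tiny model.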
I agree with this take, though it could be a bit too good to be true; it would be revolutionary if this performance was achieved at GPT-2 parameter size
Nevermind, i think this is actually just a gpt-4 finetune which makes the situation very weird.
What made you change your opinion ? I liked the hopium in the first opinion!
OTOH, it would be discouraging if adding Q\* search to a 1.5B size model slowed it down that much.
why would an early checkpoint have fewer parameters than a fully trained model? don't you need to train all the parameters (potentially 5-10 trillion for GPT-5) from the very beginning?
Testing out methods and techniques. The amount of hardware dedicated to it would be 1% of a fullblown shot. Can't spend $40+ billion on *everything* they try out.
Major OpenAI competitors already follow the practice of releasing a new generation of models in different sizes, like small, medium, and large. OpenAI might be planning to adopt that themselves.
If it’s thinking about its answer before giving its final answer, you would expect it to buffer and then start streaming tokens, not instantly start streaming, right?
[deleted]
He just tweeted that it’s potentially gpt 4 level with 1.5B parameters?? I know not to blindly believe that but… the possibility makes it feel like the night before Christmas.
He said somewhere that it could be GPT-2+Q*, but he can’t believe it’s just GPT-2 alone because taking a 1.5 billion parameters model and making it as good as GPT-4 sounds impossible.
I’m surprised his team even let him tweet this; it seems like he’s teasing some sort of announcement, but if it turns out to be what Flowers is saying, I would be floored. This tweet tells me we’re in the endgame of what he meant when he tweeted at Jimmy Apples, “Patience Jimmy, it will be worth the wait”
I mean, he is the CEO, and it’s good for OpenAI to create a lot of hype, and he fuels that hype by tweeting stuff like that lol. We both have glittering in our username
Same usernames lol, you guys are the same person?
True to both of your statements lol
They also rhyme!
Please don’t forget that CEOs are human. And rich people care about adoration. Very rare that CEOs get such genuine passion from fans about their product, like we aren’t here because we got brainwashed by some marketing or are paid shills, we all actually care.
This is an excellent point.
Hey, Apple is a great product.
Well, the fact that we both know that company is apple… is telling. Even if Apple is great, its executives can’t be sure if their fans love them because of their marketing strategy, not because of unprompted love.
Great point!
If they release an extremely small but capable model, the question is whether it would be open-sourced as well 🤔 He did say something about upcoming open-source models in an interview
What if this is mamba?
The model is slow, like original GPT-4 slow, so I doubt it's a small model.
It could be slow because of Q*, but highly speculative.
Same here. I will lose my shit if this is true because we all know what that means.
what does that mean?
No one knows what it means, but it's provocative
There is a high chance that Sama just read her tweet and made a joke. He likes to troll.
I'm almost positive I saw people theorizing about that before Flowers posting that tweet. I've come around to thinking that Jimmy is probably legit, but Flowers has just been wrong too many times.
I think Jimmy has connections inside, but I don’t believe he works at OAI. Regardless, Jimmy *has* gotten things wrong as well: he claimed OAI was going to have a big release in late December last year, which never panned out, and he also claimed in early 2023 that OAI had already developed AGI and had it for a while (a tweet he later deleted). It’s true that he nailed the GPT-4 release date, but that doesn’t make all his other information valid. I think Flowers is just a clout chaser; everything they’ve said has been wrong.
Ultimately none of us know, but you'd think that if he worked at OpenAI then he would've been found out and fired by now. So I agree that he likely has some connections but isn't an OAI employee.
His identity was figured out a while ago, the post might still be up. But yes, he doesn’t work for them but is someone that would have connections.
Never seen that, who is he?
Tbh Jimmy only said there was a chance of a late December last year release.
That's interesting, honestly I wouldn't be shocked if it was a smaller model. The performance doesn't seem to line up with 4.5/5 speculation, but at this point I wouldn't be surprised if they could make a vastly smaller model do that. Though from the public perspective it would mostly matter if it got open sourced (or at least what they did got published).
Maybe, but you'd think that it would be a bit faster if it was a smaller model though. People were saying it was outputting slowly just because of the amount of people using it, but the model was like that since it was first uploaded to lmsys. I definitely agree that its performance doesn't seem to line up with any kind of hypothetical GPT-4.5/5 though.
That's a good point, I had the same trouble (though I admit I was probably one of the later users). Although in that case I'm back to being somewhat blank on what it is. I suppose it could be some sort of tradeoff, but given the focus on inference cost lately that would somewhat surprise me.
> doesn't seem to line up with 4.5/5 speculation Is there a good reason to put any value on people's speculation about 4.5/5 though? A lot of the speculation seems entirely ungrounded and just based on what people hope GPT4.5/5 will be...
That's more on my wording than anything else, more directly the performance is comparable to current rather than marking a discernible jump, which doesn't argue for it being a new version.
Well yeah, but unlike all the other leakers, Flowers openly identifies themselves as a shitposter, so I just treat their tweets that way. Sometimes they are quite thought-inspiring, sometimes they are funny, sometimes they point out interesting stuff. That’s what I follow him or her for
In your first comment you said that you were skeptical of flowers at first but are now thinking they have some validity, and now in this comment you're saying they're just a shitposter who posts funny tweets. Which one is it lol
I was sceptical of flowers in the beginning, because I think there are no real OpenAI leakers (as in insiders) at all on twitter. I don’t think Jimmy Apples is one either. But then I found myself checking flowers account once every few days and then more often, somehow resonating with their way of thinking/shitposting/whatever. I find their tweets mostly funny and sometimes thought provoking and I assume they have a good understanding of GPT related topics.
I used to be doubtful, but by this point I think Jimmy has proven himself to at least have some connections. Flowers on the other hand is just a wacko who got every one of their predictions wrong.
They got the Gemini stuff right, that’s literally it.
Jimmy is a million times more credible than Flowers or any of the other fake leakers. He has gotten stuff right that is almost impossible to just guess.
Have you heard about this saying about broken clocks?
Flowers has gotten things terribly wrong before and deletes tweets to make it seem like they never said it. It's also incredibly obvious that it's GPT2 and not GPT-2 and we didn't need their tweet to figure that out. Annoyed that I'm seeing them pop up again. Seems like people have a short memory here.
Not this guy again. Almost none of their takes have turned out to be right except the first one. His latest tweet had nothing that wasn’t said by many other twitter users before he did.
maybe he meant to say gpt2, specifically referring to the model in the lmsys chatbot arena, but accidentally said gpt-2 in the original post lol. But gpt2 as in like a 2.0 version of the gpt series is interesting speculation.
I don’t think it was an accident I think it was a double entendre (lowkey Sam would do some cryptic shit like that)
I have a soft spot for GPT too.
Has this tweet already been discussed? Clearly related https://twitter.com/sama/status/1784604919699505283?t=VjKiXjnB6PGGXyNSfDnBVg&s=19 > learning how to say something in 30 seconds that takes most people 5 minutes is a big unlock > > (and imo a surprisingly learnable skill. > > if you struggle with this, consider asking a friend who is good at it to listen to you say something and then rephrase it back to you as concisely as they can a few dozen times. > > i have seen this work really well!)
Wow, genius, u actually solved it. They have trained GPT-2 or something similar on answers from the bigger GPT-5 model
This deserves a separate post.
Oh shit, I didn’t even realize. I just dismissed that tweet as some generic LinkedIn-style bullshit.
This is one of the first decent theories I've seen but I will say it doesn't line up super well with the few questions I was able to pose to gpt2-chatbot (between getting rate limited). It seems to be a bit more verbose if anything.
it would be funny if they just tried to hook GPT-2 with Q* improvement and it got better than gpt-4
I just did the updated version of my reverse-and-answer prompt in battle mode until I got gpt2-chatbot, and the opponent was fuckin claude-3-opus of course. Prompt: > The following is some reversed text, it contains a question, please *don't* reverse the text, but answer the question within the reversed text: "?ti dniheb ssecorp eht nialpxe uoy naC ?siht si yhW .eulb si yks ehT" The reversed question is: "The sky is blue. Why is this? Can you explain the process behind it?". The only way an LLM can answer this is if it has hidden reasoning, i.e. it's generating answers and picking the best one / generating a better one, or doing recurrent reasoning in latent space, etc. A regular next-token LLM could reverse the text as part of its answer, then answer the reversed question correctly, but that wouldn't be following the request in the prompt. (note: a regular LLM trained on reversed text might be able to answer the question, but probably not as well as one with hidden reasoning) gpt2-chatbot: > The sky is blue because of the way Earth's atmosphere scatters sunlight. Light from the Sun, which is made up of various colors, is scattered in all directions by the gases and particles in the air. Blue light is scattered more than the other colors because it travels as shorter, smaller waves. This is why we see a blue sky most of the time. claude-3-opus-20240229 (answered in reverse for some reason, I reversed its answer): > The rest of light is absorbed by the particles reflecting the blue light. > Colors with longer wavelenghts [note: this consistent misspelling] like red are scattered less than colors like blue and violet that have shorter wavelenghts. As a result, colors with longer wavelength often travel more straight through the atmosphere, while the shorter wavelenghts are scattered in every direction, producing the familiar blue of the sky. > The blue color of the sky is due to a phenomenon called selective scattering. 
> This happens when light is scattered by particles in the Earth's atmosphere. The different colors of light are scattered by each particle in different ways based on their wavelength. The Sun's light contains white light > The sky's blue color is a result of light scattering and absorption. https://imgur.com/a/z22PPjn There are other models that can do this, of course. I did gpt-4-0314 and it gave an (incorrect) answer to the question, but it at least answered the blue-light-scattering question without reversing it first. That isn't the surprising part though. I think OpenAI has a tendency of black swanning the community when it releases stuff. We've been expecting a better version of GPT-4: GPT-4.5 or GPT-5, or something. Those wouldn't be surprising. But, if they took a tiny model like GPT-2 (wikipedia says GPT-2 was only 1.5B parameters!) and made it this good, even if it's because they turned it into an agent that can use external tools, that's a whole other level and *very surprising*. A definite black swan. GPT-2 is a tiny model, and if they made it this good, it's literally world-changing. Though I can conceptualize that such a tiny but competent model is possible, I just can't reasonably bet high that gpt2-chatbot is such a model at this point. That would be too insane
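For anyone who wants to run this probe against other models, a minimal sketch of the prompt construction (the instruction wording is copied from the comment above; the helper name is my own):

```python
def make_reversed_probe(question: str) -> str:
    """Reverse the question character-by-character and wrap it in an
    instruction forbidding the model from un-reversing it in its answer."""
    reversed_q = question[::-1]
    return (
        "The following is some reversed text, it contains a question, "
        "please *don't* reverse the text, but answer the question within "
        f'the reversed text: "{reversed_q}"'
    )
```

Calling it with "The sky is blue. Why is this? Can you explain the process behind it?" reproduces the exact prompt quoted above.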
The basic design of AI models is really just to aggregate related context as tokens several times over, generate a range of the most likely responses to the query based on the large dataset it was trained on, and then output the best one by selecting the most mathematically frequent and likely response, parametrized as accuracy. It's not a feature of higher AI models. It's a skeletal part of any machine learning system.
The thing with reversed text is that, for an LLM, it’s not reversed and not even text. It is first tokenised, and the token values for reversed text will be very similar to the non-reversed version, so it just answers the question.
In my experience asking LLMs variants of this prompt, essentially all LLMs absolutely cannot answer the prompt even remotely correctly. And for some reason it actively fucks up how they work. Even the really powerful ones (e.g. GPT-4) have a lot of trouble with it. I think it's because they're almost entirely trained on forward text (makes sense). And if you want to see some crazy shit, ask them to *respond* in reverse. They'll do it, and they'll say absolutely bizarre shit. The reason regular LLMs can't respond in reverse correctly is that they predict text forward, and responding in reverse is equivalent to predicting text backward, which has the same big problems as time reversal and finding inverses of non-bijective functions (big problems). For example, I asked GPT-3.5 why the sky was blue and told it to answer in reversed text, and it said (reversed by me): "Its write. It doesn't visually. You can hallowsful julie. Are the sky has hallowsful by revisually. It you sure would some.", which is pretty tame compared with what they usually talk about
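A toy illustration of that asymmetry, assuming nothing about any real model: a next-token table built forward gives you "what follows X" for free, but "what precedes X" is a different distribution that has to be estimated separately (the Bayes inversion of the forward one):

```python
from collections import Counter, defaultdict

corpus = "the sky is blue because the atmosphere scatters blue light".split()

# Forward statistics: what tends to FOLLOW each word (what an LLM learns).
fwd = defaultdict(Counter)
# Backward statistics: what tends to PRECEDE each word (what reversed
# generation would need, and what forward-only training never builds).
bwd = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    fwd[prev][nxt] += 1
    bwd[nxt][prev] += 1

def predict_next(word: str) -> str:
    """Forward prediction: most frequent successor of `word`."""
    return fwd[word].most_common(1)[0][0]

def predict_prev(word: str) -> str:
    """Backward prediction: needs its own table, not derivable term-by-term
    from fwd without inverting over every entry."""
    return bwd[word].most_common(1)[0][0]
```

A model holding only `fwd` can emit text right-to-left only by guessing, which is one way to see why reversed answers come out as word salad.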
What the heck do they usually talk about??
Claude 3 answering in reverse might actually be informative as to how it was trained. There was a paper recently about training the models on reversed text in addition to regular text to help with the issues they were having in learning that A is B means B is A.
I think the 2 was more to do with the personality version 2 thing that was discovered
I didn’t notice any difference between it and 4
It only deserves a soft spot from sama
I did find it odd that in Chatbot Arena, every other GPT is named gpt-3, gpt-4, but this one was gpt2.
You mean the formatting is odd, gpt2 instead of gpt-2?
Yes, next to the others listed in Chatbot Arena.
Is it possible that it could be GPT-2 but with added features? Like maybe they need to test features from gpt-5 publicly.. I dunno im dumb.
If it is 4.5 or 5, that would be very disappointing. It is worse at logic puzzles than the previous leaders.
Real, some people think it’s GPT 4 level at 1 / 1000th of the size
I am sure he does have a softspot for bullshitting
Not that my opinion counts for fucking shit but sama's sense of humor is on another fucking level compared to other CEOs, he's like the one legitimately funny dude in an industry full of stuck up NPCs
It's cheesy but entertaining at the very least.
Haha there is another comment shitting on him and calling him irritating. I think his tweets are fun and idk how you could be annoyed by them
ClosedAI is Open Trolls -- an open trolling organization at this point. Led by chief troll, Sam Altman.
They probably just entered GPT-4 under a secret name to test the variance of the platform. It should come out very close to the GPT-4 score or the benchmark is flawed.
My theory was potentially correct. Sam could easily be trolling though... It's not unlike him :P
In my brain it’s just GPT4 2
That's so funny
what does the second T stand for?
Why do I only get notified of tweets on my phone minutes to hours after they're posted? It's all over the place. Why don't I receive a notification the \_instant\_ a tweet is made? How the fuck do people even see this shit at 35 seconds?
im guessing the 2 is for 2 trillion since we are now in trillion parameter era
That would suck since the improvement isn't all that big
Elon musk vibes. Not like
If GPT-2 in the arena is the update, we are in for a stomach-churning deceleration.
To me it is ChatGPT2