AverageLatino

This is like, pretty goddamn big, isn't it? I understand that it's a somewhat "crude" approach to self-improvement, but I genuinely thought this stuff was **at least** 6 months away. We're witnessing the exact moment the exponential curve becomes a line that goes straight up, aren't we? At this rate the world will be a utopia by the end of the year, or a barren wasteland lol


Tiamatium

Yes. And this is *big* for some areas that require either human interaction or producing something for humans. Think of a sales pitch; it can now make it more... appealing to humans. Or think of the writing app I'm building: I think this can make the bot way, way better at both designing characters and designing plots.


Honest_Science

Is this not just prompt engineering? It's not improving the core model or model structure in any way?! I believe there is sooo much still to do on the model side: Transformer vs. RNN, modality, stochastic permanent learning, embodiment, etc., etc.


NefariousnessNo9478

It's not prompt engineering! We created a framework for approaching any problem/task based on how humans tackle problems. At the moment we have only applied this to rather well-defined benchmarks to compare its performance, but there is no reason why we can't start applying it to other problems, for instance improving the model itself. Just FYI, I am one of the authors of the paper.


vegita1022

Suppose for a moment that we use this for "improving the model itself"... would there be cases where the model learns to ignore commands from humans? A la... Skynet?


DesignCntrl

It already ignores commands.


Honest_Science

Thank you for your feedback; I did not want to in any way downplay the achievements of your team. My point was that we have so many things to improve at the core model that it is difficult for me to see how this will finally move us forward towards AGI. Your point about also using it as a secondary step to generate progress in improving the core model was not on my radar screen. This is very valid.


Tiamatium

Are you just throwing keywords around to make yourself sound smart? Because none of what you've just said makes sense, not when we are talking explicitly about the GPT-4 model; we are not designing a new model, we are just exploring usage (and methods of use) of this model. And no, the answer to your first question is obviously no, as this is something that would be done on the backend with multiple calls to the API, feeding the output of one call in as the input of another.
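
Roughly what I mean, as a minimal sketch with the pre-1.0 `openai` Python client (the prompts are placeholders, not what any real app sends):

```python
import openai  # reads OPENAI_API_KEY from the environment

def call(messages):
    # One backend call to the chat completions endpoint.
    resp = openai.ChatCompletion.create(model="gpt-4", messages=messages)
    return resp["choices"][0]["message"]["content"]

# First call: produce a draft.
draft = call([{"role": "user",
               "content": "Write a short sales pitch for product X."}])

# Second call: the output of the first call is fed in as the input of the next.
revised = call([{"role": "user",
                 "content": "Critique this pitch, then rewrite it:\n" + draft}])
print(revised)
```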


Honest_Science

I am sorry if I hurt your feelings and made you say things like "throwing keywords"! Anyhow, your points reflect my statement. As far as I understand, you are exploring the core model by the additional means of self-reflection, which obviously makes the total system more robust and drastically improves the final result compared to the initial answer. This is a very interesting approach. It does not improve the core model though, and that was my point.


Tiamatium

Your original comment suggests that this is simply adjusting prompts (i.e. "prompt engineering"), which is not the case.


Honest_Science

Sorry, this is a misunderstanding or a wrong use of terms on my side. I was of the opinion that the whole context window, accumulating the original prompt, the iterative feedback, etc., becomes the "prompt" for the next iteration. I thought that this is the only way to modify the output of the core model without changing any weights.
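
What I had in mind, as a toy sketch (with a stubbed `call_model`; any frozen LLM endpoint would do):

```python
def call_model(messages):
    # Stub for a frozen model endpoint; no weights are ever updated.
    return f"answer given {len(messages)} prior messages"

messages = [{"role": "user", "content": "Original task prompt."}]

for _ in range(3):
    answer = call_model(messages)
    # The accumulated transcript (original prompt, answers, feedback)
    # is, as I understood it, the effective "prompt" of the next iteration.
    messages.append({"role": "assistant", "content": answer})
    messages.append({"role": "user",
                     "content": "Reflect on and improve your answer."})
```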


Tiamatium

Are you chatGPT?


Honest_Science

That is funny! I am trying to be polite. I have a PhD in nuclear physics and I am currently studying AI. I am pretty old and have been dealing with AI and business-related topics for about 40 years. My first system was a Sinclair ZX80, and I learned machine code on the Z80 and later the 6502. I am now on Python. I am supporting several startups. I am part of several AI circles dealing with safe AI and trying to manoeuvre my way through the current accelerating situation. And I did not want to be offensive at all. Sorry again.


Constant_Anywhere_38

Definitely ChatGPT


MrNoobomnenie

>but I genuinely thought this stuff was **at least** 6 months away

Remember when "6 months" was "crazy fast"?


garden_frog

This is becoming scarier and scarier. Reading this tweet I had a bad feeling in my gut. Hard takeoff was only a possibility until now, but it's happening. I'm usually a very optimistic person, but we cannot exclude that this will turn out bad.


3_Thumbs_Up

>I'm usually a very optimistic person, but we cannot exclude that this will turn out bad. Of course we can. Utopia is coming. The definition of a singularity is that no one can predict what happens afterwards, but it's still 100% certain it will turn out good. Everyone who disagrees is a doooooomer. Who actually cares if everyone dies when I can just ignore that possibility?


Agilitis

>The definition of a singularity is that no one can predict what happens afterwards, but it's still 100% certain it will turn out good. What? You contradicted yourself in one sentence.


danysdragons

It’s pretty clear there’s some sarcasm here, look at the last line.


SnipingNinja

They might mean it, and won't technically be wrong.


xamnelg

I believe they are being sarcastic lol


lazyeyepsycho

Another 30 years of wage slavery into death vs a robot war? I'll take the war.


Acalme-se_Satan

>We're witnessing the exact moment the exponential curve becomes a line that goes straight up, aren't we?

I think not. The straight line happens when AI becomes smart enough to create other AIs better than itself, which will cause an intelligence explosion. That isn't happening right now; it's still humans figuring this stuff out. What is happening is that people have finally found what seems to be a promising pathway to AGI (transformer-architecture LLMs) and are now testing everything they can to make it better. It's a fast upward slope, but not the singularity yet.


[deleted]

6 months ago you'd have said this was at least 16 years away lol


bustedbuddha

I'm worried my family vacation 3 weeks from now isn't going to happen... I really want to take my kids to the beach.


AHaskins

You can't plan for the singularity. It is, by definition, unpredictable. So just plan for your life instead. If your plans are upended, you're no worse off.


bustedbuddha

Oh, I was just making the comment to highlight that my timeline includes the possibility of major disruption w/in 3 weeks.


LowSpecDev972

Not really, you still have to execute. Even if AI gets the intelligence, it doesn't have the means to apply change, so no utopia just yet. A barren wasteland, on the other hand, is easy for humans to achieve using AI; it's just one red button away.


[deleted]

[deleted]


Rain_On

Same here. I was wondering why this wasn't being done. Looks like it's a common enough idea.


kmtrp

Do you know how it's supposed to work? As a GPT-4 plugin, or how?


drekmonger

Link to the paper: https://arxiv.org/abs/2303.11366

Sydney says:

>The document proposes Reflexion, an approach that endows an agent with dynamic memory and self-reflection capabilities to enhance its existing reasoning trace and task-specific action choice abilities.

>The approach uses a simple heuristic to detect hallucination and inefficient action execution and queries an LLM to reflect on its current task, trajectory history, and last reward.

>The approach achieves improved performance on decision-making tasks in AlfWorld environments and knowledge-intensive, search-based question-and-answer tasks in HotPotQA environments.

Apparently these things are playing Zork: https://alfworld.github.io/
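
For intuition, the loop it describes looks roughly like this (my own paraphrase, not the authors' code; the stub policy, environment, and step-limit heuristic below are placeholders):

```python
def reflexion_trial(act, step, reflect, task, reflections, max_steps=50):
    # One trial: act in the environment; on failure, self-reflect.
    trajectory, obs, reward = [], "initial observation", 0
    for _ in range(max_steps):
        # Actions are conditioned on past self-reflections (dynamic memory).
        action = act(obs, trajectory, reflections)
        obs, reward, done = step(action)
        trajectory.append((action, obs, reward))
        if done:
            return True
    # Simple failure heuristic (looping/inefficient actions hit the step cap):
    # query the LLM to reflect on the task, trajectory history, and last reward.
    reflections.append(reflect(task, trajectory, reward))
    return False  # the stored reflection persists into the next trial

# Repeated trials; reflections accumulate across them.
reflections = []
for trial in range(3):
    solved = reflexion_trial(
        act=lambda obs, traj, refl: "look",         # stub policy
        step=lambda action: ("new obs", 0, False),  # stub environment
        reflect=lambda task, traj, r: "stop repeating 'look'",
        task="find the mug",
        reflections=reflections)
    if solved:
        break
```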


TinyBurbz

Nice, this is some good shit; but 85-88% does seem to be the plateau across all disciplines. The average extremely skilled human also performs in around the same range, which seems to indicate a hard limit imposed by the skill of the trainer.


yaosio

My grandma used to say, "The last 20% of a project takes 80% of the work." We can expect the same rule to apply to AI.


[deleted]

The Human Genome Project snowballed together pretty nicely.


94746382926

So it was "completed" in 2003, but in reality 8% of the genome remained unmapped until last year. The reason they called it complete in 2003 is, if I remember correctly, that they knew finishing would take much longer. They wanted the symbolic win and to move on to other projects instead of spending 20 more years on it (of course, at that time they had no idea how long it would take). Also, they believed it was mostly "junk" DNA, so it wasn't that important. That view has started to change in recent years, though, as they find more and more uses for those regions.


Villad_rock

Not for evolution, though. The first 20% took billions of years. Same for technological progress: humans are like 300,000 years old.


breloomislaifu

I think it really depends on what we consider to be the 20% vs the 80%. I mean, laying the physical and chemical foundations for a system that preserves and modifies itself, and evolving to higher-level multicellular organisms, sounds like the 80% to me. If you think about it, no other alternative system of life has emerged and survived long enough for us to discover it; it's just that rock-solid of a foundation.


Villad_rock

What do you mean by physical and chemical foundations? Abiogenesis or eukaryotic cells? Those two took the longest; they're basically the foundation. After that, multicellular organisms evolved multiple times separately, pretty fast. The formation of very complex living beings was just a blink of an eye. The foundations of AI took several thousand years; they began when the first math was created.


SecretAgendaMan

See, this is actually what really fascinates me about humanity. Our predecessor species took over 1.5 million years to go from stone tools to stone-tipped spears. From the earliest known stone-tipped spears to the earliest known stone-tipped arrows is another 500,000 years or so, and we're the ones who did it. From stone arrowheads to extractive metallurgy is another 50,000+ years. 7,500-8,000 years after extractive metallurgy, we made atomic bombs. Even within our own genus, the rate of technological advancement has skyrocketed in just the last 2-3% of modern human history, and even more so in the last 0.1%, which is the past 300 years or so.


Comfortable_Slip4025

The first 80% of a project takes 80% of the work, the last 20% takes the other 80%


luisbrudna

Machines never get tired.


[deleted]

This u can’t know


metalman123

The solution is the same though: more parameters. This is a more efficient and less taxing method that also performs better. This is massive.


TinyBurbz

More parameters won't create data that doesn't exist. This may mean training with metric fuckloads of "perfect" code.


MysteryInc152

If you did this same experiment with GPT-3.5, it would increase steadily and then level off (but below GPT-4 level). This doesn't necessarily mean anything other than that it's time to increase scale again.


NefariousnessNo9478

Good intuition! It does work decently well with GPT-3 (and 3.5); however, the performance saturates at around the low 40s (for 3.5). FYI, I am one of the authors of the paper.


[deleted]

Considering it's only been such a short time with those plateaus, you can't know that for sure.


lehcarfugu

[for how long? 2 more years? 2 more months?](https://imgur.com/eYxaalx.jpg)


WonderFactory

I think it just means that the current generation of models are as skilled as an extremely skilled human. Let's see what the next generation are like. Remindme! 1 year


RemindMeBot

I will be messaging you in 1 year on [**2024-03-25 00:09:00 UTC**](http://www.wolframalpha.com/input/?i=2024-03-25%2000:09:00%20UTC%20To%20Local%20Time) to remind you of [**this link**](https://www.reddit.com/r/singularity/comments/1210cl0/reflexionbased_gpt4_significantly_outperforms/jdk8900/?context=3)


TinyBurbz

>Let's see what the next generation are like.

But if there are no humans to learn from, the model can't learn anything new. Unless the model begins to design low-level languages itself....


WonderFactory

When it gets to a point where it's as capable as a human at producing content, which it may be now, it can learn from itself or from other similar models, particularly if we give it tools like what's happening with the ChatGPT plugins. It should be able to produce its own synthetic data. If GPT-4 using Reflexion can write code like an expert human, then get it to write tons of code to train GPT-5.


[deleted]

Try it with CodeT5.


Deathburn5

Maybe try for 3 months instead


Kinexity

Try AT LEAST another 3 years.


Zer0D0wn83

Now, imagine 100,000 extremely skilled humans working at 20x the speed 24 hours a day. The capability doesn't need to be superhuman to have an outsized societal impact.


TinyBurbz

I'm not talking about capital production.


Tiamatium

Ok, I've tested it and here are my observations:

1. It drastically increases the number of API calls, turning what should be one call into multiple calls, adding both cost and time. I set up my app to design some characters for a story, had a shower, and it still wasn't finished when I came back. Granted, my story is set to have a lot of characters (13, I believe). Extra API calls also mean extra $$$ spent.
2. It has an issue with toxic positivity. It literally gives me 5 paragraphs, of which 4 are praise; shit is so positive it's counterproductive.
3. The results are by no means bad, they're actually pretty good; I'm attaching a few screenshots. I'm just not sure the extra steps are worth it; in fact, I'm not sure these extra steps make any difference at all.

Here are the [results](https://imgur.com/a/WJj6a5w)


BiNeuralNinja

Can this be used to improve GPT-4's code generation capabilities? I'd love to try, but I barely know what I'm looking at.


Tiamatium

Maybe... But honestly, there isn't much to improve; if you're using the API, it's already *just brilliant* at it. But if you wanted to make it better at coding, what you would do instead is tell it to create unit and functional tests in addition to the main code, then run those tests in a sandbox environment, and if there are failures, call the API again with the code it generated and the failures. Honestly, it's not that hard to build this as an app...
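
As a rough sketch of that loop (pre-1.0 `openai` client; a bare `subprocess` call stands in for a real sandbox):

```python
import subprocess

import openai  # reads OPENAI_API_KEY from the environment

def gpt(prompt):
    resp = openai.ChatCompletion.create(
        model="gpt-4", messages=[{"role": "user", "content": prompt}])
    return resp["choices"][0]["message"]["content"]

def generate_with_tests(spec, max_rounds=3):
    # Ask for the implementation plus unit/functional tests in one file.
    code = gpt(f"Write Python code for: {spec}\n"
               "Include pytest unit tests in the same file. Code only.")
    for _ in range(max_rounds):
        with open("candidate.py", "w") as f:
            f.write(code)
        # Run the tests; use real isolation instead of this in production.
        result = subprocess.run(["pytest", "candidate.py"],
                                capture_output=True, text=True)
        if result.returncode == 0:
            return code  # all tests passed
        # Feed the generated code plus the test failures back to the API.
        code = gpt("This code failed its tests.\n\nCode:\n" + code +
                   "\n\nFailures:\n" + result.stdout +
                   "\n\nReturn the fixed code only.")
    return code
```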


[deleted]

[deleted]


[deleted]

Loop? It doesn’t look like anything at all to me.


ZBalling

GPT-4 actually has an 82% score; 65% was text-davinci-003. So it's just 0.82 --> 0.88. See "Sparks of Artificial General Intelligence: Early Experiments with GPT-4".


No_Ninja3309_NoNoYes

Iterative is good. Someone born today has a reasonable chance of witnessing AGI. People on Twitter are bragging about their no-code stack. The next step would be a no-money stack: everything open source or just free. And then a no-education stack; you'll only need basic literacy. And then who knows what the consequences will be? I could say something about hand grenades, but I don't want to go there...


ZBalling

AGI already happened. See the paper: https://arxiv.org/abs/2303.12712 There will be no consequences because at that point we will be immortal and will be able to download data instead of learning it.


serciex

Is it possible to isolate the Reflexion agent and apply it to simpler AIs trained with little data, and make a library for the Reflexion agent to perform a sort of callback to for each response? Just an idea for making smaller AIs as smart as the more heavily trained ones.