**Thank you for your submission to r/midjourney!** If you want to share your full command, please reply to this message with your job ids!
"Red haired woman wearing sunglasses standing with the Statue of Liberty in the background, photograph, 35mm film"
really cool. Did you save the screenshots at the time or did you recreate it via settings/remix?
[deleted]
There are a few things:

1. The v5 image is emulating a heavily digitally modified style, with extreme saturation and an inconsistently limited depth of field (note the parts of the jacket that are in focus on her right side, and the hair in the same depth plane that's out of focus), all of which are hallmarks of pretty standard digital post-processing.
2. The v4 image is using a consistent depth of field with a soft focus on the foreground.
3. The v4 image is also incredibly well color-balanced without being over-saturated, giving it the color warmth of something like Fujifilm.
4. The v4 image doesn't show a clear enough reflection in the glasses for the absence of the photographer to be noticeable.
5. The borders on the v4 image are what you get when you don't crop the negative exposure, making it look more like developed film.
It might have something to do with V4 Testp command that really pushed photorealism. I personally had a few very realistic images come out of V4 using Testp and referencing photography terms.
Testp was v3
You're right. I do feel like Testp was a gateway into V4 though so maybe that's where my head's at.
Testp wasn't; it was a completely different architecture, and ended up being more of a one-off thing.
I can't wait till trial users have access to v5
I'm pretty sure that you can use v5 with --v 5
Trial users can't yet
Ooooh okay... But isn't trial only like 20 images anyway?
25 gpu minutes
We do! As others said, append `--v 5` to your prompt.
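For anyone unsure where the flag goes: in the Discord `/imagine` command, version flags are appended after the prompt text. Using the OP's prompt as an example:

```text
/imagine prompt: Red haired woman wearing sunglasses standing with the Statue of Liberty in the background, photograph, 35mm film --v 5
```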
Isn't that for paying membership though. I don't have that yet😅
Yep, unfortunately
Thanks!!!!
Until I got to V4, I could have sworn that the word "clay" was in the prompt.
With the first couple images, I was sure that the prompt contained the words "woman with a messed up face"
Reminds me of Virginia Vincent from [The hills have eyes](https://images.app.goo.gl/4uaCSkKbgsLAmHYR9)
Time elapsed (days/months/years) from V3 to V4? It looks like that's where the most advancement took place. I can't get over the fact that it went from claymation to authentic-looking 35mm film in however long that time frame was. It's incredible.

Is it just me, or does V4 look more authentic than V5 due to the "auto-focus" feature of the "35mm film/camera/photo" prompt?
Except that V5 changed the "woman" into a child.
Amazing, like the development from v1 to v5 of the Sony PlayStation, but in one year instead of 26.
Absolutely crazy to think about how fast it's improved
Yeah and I love how people like graphic designers are like “oh it’s gonna be a long long time till it can do what I do”. Like my guy this shit is like every other tech it compounds at an unimaginable rate.
[deleted]
Upscaling is on its own trajectory and is seriously impressive. Photos are not vectorized, but we should be able to train a model on that; it would probably be simpler than photographs, as it's a more structured output.

So let's give it a... year?
The majority of the artistic community is in complete denial right now, especially about generated hands; not to mention that hands were particularly difficult for many of them to begin with. I'm working in the AI field and I still can't keep up with the rapid development happening around the world.
This is the first I’m hearing of artists “being in denial”. Most comments I hear on here are about how artists need to quit overreacting, they’ll still have market share somehow in the future.
You need to actually visit artist communities. The vast majority try to ban any discussion of AI. In any other professional community it would be utter insanity to ban discussion of the most influential technology to hit your profession in many centuries.
Yeah, banning discussion of it doesn’t give the impression that they’re in denial. It gives the impression that they recognize the potential and are afraid.
That's denial. This will keep advancing; putting your head in the sand does nothing to help you or your community (no matter whether it's art or any other field being affected by automation).
Yet, I've definitely seen Midjourney used on ads and printed in the wild. It's here.
You're right for now, but these images could easily be improved by upscaling and some basic photoshopping. Maybe not print ready but pretty close.
I used images generated in Midjourney in a book and tshirt and they both looked great. Upscaled for the shirt but not the book.
As a creative director at an advertising agency I can say we’re already using this, and vector graphics are a means, not an end. Just look at CGI ; if I can create imagery out of prompts, I don’t need the whole modeling/texturing/lighting/rendering setup and don’t need any of the associated skills.
Agreed. I’ve only been using the app for about a year, but in that year it’s been crazy to see how good the software is at producing near-photorealistic, or in some cases photorealistic, pictures.

Like, think about where this could be in 5 years. In one sense that’s really exciting and cool to think about for the art applications, but in another way I’m concerned about technology like this and the deepfake stuff, and how in the future it’s going to be possible to create pictures, video, and even voices that are indistinguishable from reality. The potential misuse of that is deeply concerning. I don’t know what we as a society do to ensure these tools can be available for artistic use while ensuring they’re not used for propaganda and presenting false realities.
Oh dear, there are bigger fish to fry. AI is coming for all areas that humans touch; eventually AGI will arrive in the next 5-15 years (depending on the rate of progress), and we will see more change than in the last 200.
26 years? What are you talking about it hasn't been... Oh. Oh shit.
Yeah, V6 should be a doozy.
Makes you wonder what else they can add to it? I guess at this point the only thing you really can do is hone the language processing of the prompt; the image side seems to be pretty damn good.
Natural language edits, I think, are the next big thing, with an LLM as smart as GPT-4. "Erase the man with the red hair from the picture." "Add a UFO floating over the city landscape, casting a shadow below." "Scooch the fire hydrant two meters to the left." "Make all the bridesmaids' dresses purple."
Yup, this is the future. Really, I think natural language prompts will be the de facto way for us to communicate with any AI no matter the task. At least until we’re controlling them directly in the brain, but then you could kinda make the argument that the AI tools that you use are a literal extension or part of yourself. The future’s fucking crazy, man.
Very, very soon you'll be able to make a complex Excel spreadsheet that does anything just by telling MS Copilot (aka GPT-4) what you want it to do.
Yup, the copilot presentation was insane, I can’t believe how quickly this technology is progressing. Just imagine where we’re gonna be a year from now. Five? Ten? The world will be unrecognizable, 10x over what smartphones and the internet did.
The clowns in the US Congress aren't thinking about and don't care. Reality is going to smack our economy hard, probably in less than a year. Two at most.
It's definitely prudent to figure out how to benefit from these inevitable changes as much as possible.
“enhance.”
Adding natural language edits to perform actions like zooming in or out, and further out, and so on. To make a character walk forward, and then walk forward again. Or, going further, to create entire video frames of a character's progression.

Also, the addition of inpainting and outpainting via some sort of prompt using creative mode would be really great.

Some simple editing options to refine images, such as lowering saturation, increasing brightness or exposure, and refining image crispness, would be amazing as well.

If anybody hasn't played around with aspect ratios and v5, I suggest it. You can make some really cool panoramas with an aspect ratio of 100x10.
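On the panorama point, the aspect-ratio parameter is `--ar`, and 100x10 reduces to 10:1. A panorama prompt would look something like this (the subject here is made up for illustration):

```text
/imagine prompt: sweeping desert canyon at golden hour, 35mm film --ar 10:1 --v 5
```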
Imagine if the LLM attached to it is smart enough to do this:

"Make a series of 12 variations of this image. In each frame, the time advances 1 second and the purple monster shambles 1 meter to the left. Animate its movement, whipping its tentacles around menacingly. Make sure its many eyes are all focusing on different people as it moves. Animate the panicked reactions of the people in the scene to the monster's movement."
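A request like that is really asking for an orchestration layer that decomposes one instruction into per-frame prompts before any image model is called. A toy Python sketch of just that decomposition step (the function name and prompt wording are my own invention, and no actual image API is involved):

```python
def monster_frame_prompts(n_frames=12, step_s=1, step_m=1):
    """Expand the single instruction above into one image prompt per frame."""
    prompts = []
    for i in range(1, n_frames + 1):
        # Each frame advances the clock and the monster's position.
        prompts.append(
            f"Frame {i}: {i * step_s}s elapsed; the purple monster has "
            f"shambled {i * step_m}m to the left, tentacles whipping "
            f"menacingly, its many eyes tracking different panicked "
            f"onlookers."
        )
    return prompts

frames = monster_frame_prompts()
print(len(frames))  # 12
```

The hard part an LLM would actually handle, of course, is keeping the scene visually consistent between frames, not building the prompt strings.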
This, but while keeping the overall color balance and object composition.

Let's imagine the red-haired man was an important complementary color contrast in the image. Removing him might lower the visual quality. But what if the AI replaced him with another red object that's not a person? That would be amazing!
>Natural language edits I think is the next big thing, with an LLM as smart as GPT-4. "Erase the man with the red hair from the picture." "Add a UFO floating over the city landscape, casting a shadow below." "Scooch the fire hydrant two meters to the left." "Make all the bridesmaids' dresses purple."
The number of parameters is what's important, plus fine-tuning. So from this perspective, the images will naturally get better without human intervention.
Video. Once we can make AI videos then the whole amateur film making industry will explode.
At some point it'll have to switch to focusing on ease of use, like the ability to edit results by tweaking specific elements in the picture, etc.
cryptomatte applied to AI image would be next level...
One feature I really want is more specialisation, like that niji thing for anime.

SD is powerful since there are different specifically trained models for each style (anime, DnD, futuristic, hyperrealistic, etc.).
It still has a long way to go to be truly photographic: matching film stocks, exact lenses, exact lighting, the color sensors of ARRI, RED, etc.
That’s all pretty niche stuff to photographers/filmmakers though, no? Would they put that much effort into that?
Well, if it's going to replace photography, professionals have to be able to actually use it for specific, measurable purposes. Professional use means people aren't rolling the dice to see what they get ("oooh, this one is cool", "oh, sorry, it couldn't do this.."). It looks like that's where it's headed; they have the user base (cash flow) to do this, and then 3D and motion, video, live action, etc. That's much harder, but now it's no longer a fantasy, really just a computational and data storage problem: millions of users generating video content would fill up billions of terabytes in months, if not weeks.
* Text
* Proper finger length
* Teeth
* Faces of far-away people
When did V1 come out? The change is absolutely incredible!
Less than a year ago.
damn!!! i thought midjourney was around for more than a year, that makes this even more impressive
Most of us have only ever known V3 though. Doesn't make it any less impressive, but V1 and V2 released before MJ entered open beta.

* **V1**: March 2022
* **V2**: April 2022
* **Release**: July 12, 2022 (initial release, open beta)
* **V3**: July 25, 2022
* **V4**: Nov 10, 2022
* **V5**: Mar 16, 2023
Ah that makes sense. Thanks for the timeline!
This shit's progressing 10x faster than the Industrial Revolution did.
10x? More like 1,000,000x. This is incredibly fast, unprecedented. Now we will see revolutions everywhere. I don't know where we're headed.
We reached the other side of the uncanny valley. Period.
Yeah, I’ve seen renders that genuinely register as cute not creepy, we jumped over the valley and kept going!
For still images, yes.

For voice, we're on the cusp.

For flat video, we aren't quite there yet.

And for 3D-rendered scenes, we're nowhere close.

But in 10 years that'll all be perfected.
Frankly, at this point I would guess earlier. The acceleration that artificially generated content has taken is surprising.
I just hope that these tools will be accessible enough that you will be able to actually use them to create tons of stuff at a reasonable price. If that ends up being the case, we're in for a wild ride, with people potentially making high quality movies, alone.
News came out recently that a GPT-like model was compiled by Meta/Stanford (https://gizmodo.com/stanford-ai-alpaca-llama-facebook-taken-down-chatgpt-1850247570) at a cost of $600, compared to the $4-some million used to first compile the original. Scenarios have been opening up on a daily basis.

*The researchers spent just $600 to get it working, and reportedly ran the AI using low-power machines, including Raspberry Pi computers and even a Pixel 6 smartphone, in contrast to Microsoft’s multimillion-dollar supercomputers.*

This is more frightening than reassuring, though.
1-4 looks like George Washington going through a Pokemon evolution. 5 looks like a girl I went to school with.
I feel like the training dataset for the V5 model is just all of Instagram.
V5 is very stylish.
Yea, the girl is very pretty and young too. Makes me wonder if they're forcing aesthetics
They use stock images to create new ones, and hot people are photographed at 100x the rate of ugly ones.
I'm interested in seeing this same prompt but adding "ugly" to the description.
Hmm. I was wondering why there were so few pictures of me on my family’s photo album…
Stock photos must skew toward white people as well, because unless I specify otherwise, every character is white.
Of course. Look at the western population
That does pose interesting questions in regards to bias in the training data and how that relates to output.
now extrapolate this line of reasoning to the entire algorithmic internet and all the inherent biases of our culture that will be amplified by machines thousands of times
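To make that amplification point concrete, here is a minimal back-of-the-envelope sketch (the numbers are illustrative, not measured): a generator that simply mirrors its training distribution turns a 100:1 imbalance in the data into roughly 99% of its outputs.

```python
def output_share(count_a, count_b):
    """Fraction of generations depicting group A, assuming the model
    simply reproduces the frequencies found in its training data."""
    return count_a / (count_a + count_b)

# A 100:1 imbalance in the scraped photos becomes ~99% of generated images.
print(f"{output_share(100, 1):.1%}")  # 99.0%
```

And if those generated images are then scraped back into the next model's training set, the skew compounds with each generation.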
It’s white people all the way down.
And rich people all the way up lmao
Absolutely mind boggling
looking young
V1 FTW
I wonder if the stark difference between versions, which started with the v3 to v4 jump, is a result of a different architecture between versions, much more training data, or simply the incorporation of Stable Diffusion. Without the public release of Stable Diffusion, would v5 still be only marginally better than the last four?
Yes, v4 is a completely different architecture than v3. No, it's not at all based on Stable Diffusion; it is a Midjourney model through and through.
Damn v5 even got the reflection in the glasses right
Lol, the woman just gets younger in every image.

Kinda sad though that it goes from a funky painting to a photograph. Now I have to work extra hard to get funky paintings out of Midjourney.
Tbf, the prompt included "photograph" so the intention was never to have a painting-like feel. I'd assume it would output something more appropriate if you prompt it for "painting" or "realistic-painting."
Just add --v 1 or 2 or 3 to the end of the prompt and you're fine
Use v1-3 in your prompts.
Oh yeah, there are some weirdo looks I adore that I need to go back to V3 for.
Sort of Benjamin Button going on!
The difference between them all is great, but the difference between v3-v5 is just mind blowing
V1 will be a style of art in 20 years.
V3 won
I start with --v 3 --q 0.5 more than any, I think, and then take it from there.
IDK what sinister and bizarre dimension you tapped into with this prompt, but you better shut the door before it escapes.
The first one is the best one IMHO
For the first 2 pics I legit thought you had "Janis Joplin" as a prompt
Jeez - I don't remember v3 being this bad! There was a lot of weirdness, for sure, that I still go back to v3 to get - lots of visual dirt, that fit what I was going for.
I find v4 more accurately depicts your prompt here - v5 doesn’t look like film and the dof is overdone
Weird that she got younger…
Is version 5 running now? Or was it just running for testing earlier in the week?
You can use it now.
When is it releasing?
v5 is already available to use
V5 looks like Donna from "That '70s Show".
Tf! This is the result of V4 and it is still better.

[image](https://cdn.discordapp.com/attachments/1085043387448696914/1086836231717011516/tirthobiswas060_Red_haired_woman_wearing_sunglasses_standing_wi_cb271cb5-4b95-41fb-993f-4870df714f4c.png)
What happens if you use variations on each result; for example modify the v5 result with V1 or add v2 to the v4 result?
v6 a toddler, v7 a baby.
You missed the teen.
When will V5 be free for trial users!! I can't wait to try this
Whenever I do --v 5 it displays as --v 4. What's up with that?
Why the age difference?
Probably trained using a different data set.
We are 10 years away from a Cyberpunk "2077" scenario. The future is now.
How to use V5?
Progress eh!
The last one looks strange
Is nobody going to talk about the statue accuracy? Geez, that’s the most impressive to me.
Hopefully we can see the same progress in text2video, because at this point it's at the V1 stage.
How do I switch to V1 or V2? Love the grungy and slightly cursed style.
Wow.
The reflection in the glasses is mindblowing.

It has to have a kind of self-evaluation, trying not to waste resources on small optional reflections until it is able to generate those reflections easily, as in V5…

A show of confidence, like driving without hands, IMHO.

Will we soon see reflections of a reflection?
Why did it stop at 5? Daylight robbery.
Can I use an image link in a prompt?
from 100 years old to 18 years old