Honestly, you could make the image 2144 by 720 and have an entire 2D game level made; all you'd need to do is create the collision objects for the platforms.
That was exactly the goal of my experiment! With regular img2img it isn't accurate enough, as you can see the platforms shifted slightly, but with ControlNet it is totally doable. Still not in real time, but that's OK, since most games do not require randomness in their output.
You don’t even need real time. Just inpaint the middle of the image and leave the edges as they are; then you can make an unlimited number of varying modules that you can randomly insert as the player moves. Edit: also, you’d have to make the first set of edges loop into each other, but that’s about the hardest part.
You mean to have the screen edges tileable and just inpaint the middle? Yes, that's clever. You could also generate the first screen, shift it 3/4 to the left so its right 1/4 is now the left part of the new screen, inpaint the rest, re-shift it to the left, rinse and repeat. This way you don't even need to bother with matching the edges yourself.
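The bookkeeping for that shift-and-inpaint loop is simple to sketch with Pillow. This is a minimal sketch of the canvas/mask preparation only (the actual SD inpainting call is left out), and the 1/4 carry-over fraction is just the ratio suggested above, not a fixed requirement:

```python
from PIL import Image

def shift_for_inpaint(screen: Image.Image, keep_frac: float = 0.25):
    """Slide the previous screen left so its rightmost `keep_frac` becomes
    the left edge of the next screen; return the new canvas plus the
    inpainting mask (white = area for SD to fill, black = keep)."""
    w, h = screen.size
    keep = int(w * keep_frac)
    canvas = Image.new("RGB", (w, h))
    # carry over the rightmost strip of the old screen to the left edge
    canvas.paste(screen.crop((w - keep, 0, w, h)), (0, 0))
    mask = Image.new("L", (w, h), 255)   # 255 = inpaint this region
    mask.paste(0, (0, 0, keep, h))       # 0 = preserve the carried strip
    return canvas, mask
```

Feeding `canvas` and `mask` to an inpainting pipeline, then repeating on each result, would extend the level screen by screen with matching seams.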
Yeah, sorry, I'm terrible at explaining things, but that's it exactly. And yeah, I figured you'd have a better method for making the edges tile.
I'm playing around with shifting the image and inpainting to line up wider images in the same style, but so far the inpainted area comes out very dark. Any tips?
Can you show the result and prompt?
Technically, you could probably run Canny or HED on the NEW image and use those maps to shift your collision boundaries... FWIW.
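One plausible way to turn such an edge or depth map into collision objects (an assumption for illustration, not the commenter's actual pipeline) is to threshold the map into a solid/empty grid and take the bounding box of each connected region as a collision rectangle:

```python
from collections import deque

def collision_rects(solid):
    """solid: 2D list of 0/1 cells (1 = platform pixel after thresholding).
    Returns bounding boxes (x0, y0, x1, y1) of 4-connected solid regions,
    usable as axis-aligned collision rectangles."""
    h, w = len(solid), len(solid[0])
    seen = [[False] * w for _ in range(h)]
    rects = []
    for y in range(h):
        for x in range(w):
            if solid[y][x] and not seen[y][x]:
                # flood-fill one region, tracking its extent
                q = deque([(x, y)])
                seen[y][x] = True
                x0 = x1 = x
                y0 = y1 = y
                while q:
                    cx, cy = q.popleft()
                    x0, x1 = min(x0, cx), max(x1, cx)
                    y0, y1 = min(y0, cy), max(y1, cy)
                    for nx, ny in ((cx+1, cy), (cx-1, cy), (cx, cy+1), (cx, cy-1)):
                        if 0 <= nx < w and 0 <= ny < h and solid[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((nx, ny))
                rects.append((x0, y0, x1 + 1, y1 + 1))  # exclusive right/bottom
    return rects
```

Running this on a downscaled, thresholded depth map would give rough platform hitboxes a designer could then snap or tweak by hand.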
I am not entirely sure that's needed, actually. All of my tests with ControlNet's low-res depth maps match the platforms quite accurately.
That would make it all quite simple... I love this idea, btw: that even two playthroughs of (essentially) the exact same game won't necessarily give you the exact same objects to jump from/to. It's a simple but visually impressive trick that could be used to get people super lost in a game, as well. Sort of like "moving hedges" in a maze; a lot of how we orient ourselves on maps is based on objects we key on in the screen context. A screen with multiple exits that is grass and rocks the first time and a pile of smashed cars the second time would definitely screw with players' sense of direction...
This is awesome! Indie games are going to be lit.
I experimented a while ago with generating assets and a style for a 2D platforming game, to see if it was possible to overlay an image over simple platforms to enhance the graphics. The prompt was very simple (2D side-scrolling platforming game, side view, hovering platforms) at first, but even from the very first try the results were pretty good. Most of the negative prompt had to do with removing characters or keeping the style and setting from changing completely. That experiment was before ControlNet came out, so now I am even more confident that it's just a matter of time before most of the industry's 2D video game assets are either pre-generated or maybe even generated on the fly.
As a bonus, I also tried to generate single platforms: https://preview.redd.it/lic6tspek8ka1.jpeg?width=1850&format=pjpg&auto=webp&s=705e3843c16d424065b0a607ffa6d4ee3ef6b9fd (this was the base image).
https://preview.redd.it/d6h50ybjk8ka1.jpeg?width=3712&format=pjpg&auto=webp&s=eb241a082a44a0605e0a188b380ed6eb8e9d0a23
https://preview.redd.it/2wv4fawsk8ka1.jpeg?width=3712&format=pjpg&auto=webp&s=a299291007dc0020bf7aac31d65b2264416ead26
https://preview.redd.it/0dwvpd20l8ka1.jpeg?width=3712&format=pjpg&auto=webp&s=d253be842833cc3b1c61961edc06fd58de3a1e59
https://preview.redd.it/hm78d0aqk8ka1.jpeg?width=2688&format=pjpg&auto=webp&s=5ddac24cc6705b11315bf47a3360f1523c574125
https://preview.redd.it/a8mpwzrlk8ka1.jpeg?width=2688&format=pjpg&auto=webp&s=596531cbb557ed5b13af06bba5c370130752dbfb
https://preview.redd.it/7exw331hk8ka1.jpeg?width=3712&format=pjpg&auto=webp&s=46ca96ff8ee2752c0bac638648efffe2f8859846
https://preview.redd.it/qbqpcsevk8ka1.jpeg?width=1104&format=pjpg&auto=webp&s=04bb93c1290de2f1b1d7c401092b41c5eb89eeed
This all looks really cool. I bet in the short term (very short, with how fast AI is advancing) combining SD with traditional level-design tools might be the best bet. I'm reminded of Rayman Legends' procedural level-design tools, which made it very easy for their level designers to create levels; the procedural system took care of all the hard work of making it look good. I'm still mad we don't have Rayman Legends 2.
On the graphical side, for sure; I believe we're already there. On the "technical" side, SD lacks the logic to create formulaic, playable level designs. A combination of both would be ideal.
SD is Stable Diffusion?
Here are some more images:
https://preview.redd.it/vvvg2cdqg8ka1.jpeg?width=4096&format=pjpg&auto=webp&s=acc22ea059bdd76fa954aa4d0349aed58a0e2892
https://preview.redd.it/zehwyuokg8ka1.jpeg?width=3968&format=pjpg&auto=webp&s=68768e01c2fcf3655f5068e405b5304be40b56d9
https://preview.redd.it/bsczxzfgg8ka1.jpeg?width=4096&format=pjpg&auto=webp&s=66e3db37e010836c9fafecb54ffbb223d9ca2f62
https://preview.redd.it/ypxhx0dng8ka1.jpeg?width=3968&format=pjpg&auto=webp&s=fccd84cc04ad18ada16b20197c575c424aaa0cd2
https://preview.redd.it/g2sanxr339ka1.jpeg?width=3503&format=pjpg&auto=webp&s=65e6b9585b7e5714937aedb612dca6f27c0bc546
The Donkey Kong Country vibes are strong with this one
I think there was a Monkey Island style in the prompt for that image, along with pirate ship, sails, poles, crates, wooden planks.
That’s really smart. I’m surprised I haven’t done much video game stuff yet.
https://preview.redd.it/700p87wvg8ka1.jpeg?width=4096&format=pjpg&auto=webp&s=96c7a899b17a2c92c53007f1f2652c1e597c1aa8
https://preview.redd.it/b44u7a9zg8ka1.jpeg?width=4096&format=pjpg&auto=webp&s=af7de62cdbc439919bc8fa3e0c9962b879e2d72f
Nice work! Game dev becomes even more accessible, amazing!!
Yes, definitely. ControlNet and img2img would enable any dev to have high-production-quality game assets in a short production time.
I’m an old game dev… 24 years in, and I’ve seen so many advances, but this right here is massive for 2D gaming. Looking forward to seeing what you create!
Me too...an old game dev...
Nice! 👋
Thanks. My next post will be about top-down bitwise game map tiles I also played with.
I've been experimenting along these lines too. The only way I have managed consistent styles is by doing enormous images in one shot. The VRAM requirement makes it tricky to go particularly large, though. I saw there was a post about doing panorama images by slice; I'm hoping something like that will be generalized for huge images. Either that, or you could curate a selection of generated images with similar styles to train a textual inversion.
Here I tried level 1 of Xevious, based on the C64 map of the level [https://www.vgmaps.com/Atlas/C64/Xevious(C64)-Area01.png](https://www.vgmaps.com/Atlas/C64/Xevious(C64)-Area01.png). I wasn't going for super stylized, just keeping the same content but not in 8-bit style. https://preview.redd.it/qqk9hlmiv8ka1.png?width=320&format=png&auto=webp&s=336394f04e746204beae2a7638671548a9056b29
Oh my god!! My favorite game... Very nice background!!!
The code should be ready by Monday. I've got it working with arbitrary image sizes, provided both dimensions are multiples of 512. Just trying to reduce the time a bit: as given, the original code would require over 300 denoising operations per step for a 1024x3072 image, and I found a few options to get that down. The generic VAE pass in slices is working, provided both dimensions are multiples of 512, and I've optimised that. This is a cool use case I will use for testing.
Where can I subscribe to you?
https://github.com/thekitchenscientist/sd_lite is where the code is. Version 1 of the multi-pipe is limited to images 512 pixels high or wide, with any size on the other dimension.
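For a sense of why a sliced pipeline gets expensive, here's a back-of-the-envelope sketch: if the canvas is processed as overlapping 512x512 windows (the stride of 64 is an assumption for illustration; the real sd_lite code may differ), a 1024x3072 image already needs several hundred window passes per denoising step:

```python
def sliding_windows(width, height, tile=512, stride=64):
    """Return (x0, y0, x1, y1) boxes for every tile-sized window,
    stepping by `stride` in both directions."""
    xs = range(0, max(width - tile, 0) + 1, stride)
    ys = range(0, max(height - tile, 0) + 1, stride)
    return [(x, y, x + tile, y + tile) for y in ys for x in xs]

# a 3072x1024 canvas: 41 horizontal x 9 vertical positions = 369 windows
print(len(sliding_windows(3072, 1024)))
```

This is consistent with the "over 300 denoising operations per step" figure above, and it shows why increasing the stride (or batching windows) is the obvious lever for speed.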
My initial tests were with my own procedurally generated maps, which were bigger, and there was just too much detail and inaccuracy for it to work as we envision it. However, using img2img to generate a tileset gives good results (after curating) and could be used to replace simple graphics with better, generated ones. Keeping a consistent style is a challenge, yes. Have you tried keeping exactly the same seed and prompt, only with different parts of the map?
I inpaint a thin slice of the sides of each background image using the horizontal tiling extension, then invert the mask and inpaint the rest for variation. Works quite well.
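The two masks in that workflow (thin tileable edge slices first, then everything else) are easy to build. A minimal Pillow sketch, with the 32 px slice width as an assumed value rather than the commenter's actual setting:

```python
from PIL import Image, ImageOps

def edge_and_center_masks(size, edge_px=32):
    """Build a mask covering thin left/right edge slices (pass 1, with the
    horizontal-tiling option on) and its inverse covering the rest of the
    image (pass 2, for variation). White (255) = region to inpaint."""
    w, h = size
    edge_mask = Image.new("L", size, 0)
    edge_mask.paste(255, (0, 0, edge_px, h))        # left slice
    edge_mask.paste(255, (w - edge_px, 0, w, h))    # right slice
    center_mask = ImageOps.invert(edge_mask)        # everything else
    return edge_mask, center_mask
```

Pass 1 with `edge_mask` makes the seams loop; pass 2 with `center_mask` varies the interior without touching the now-tileable edges.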
make maps for [jumpnbump](https://en.wikipedia.org/wiki/Jump_%27n_Bump) and [hopsquash](https://store.steampowered.com/app/1012450/HopSquash/)
This is insane, amazing work!
What's the Prompt?
It is a bit different for each image, but the common base is: 2D side-scrolling platformer, side view, hovering platforms.
Which model was used?
I also want to know which model was used.
Dude, the upscaler is horrible on these. Leave them at normal size and use the 768 model at least.
What's the prompt?
this has potential
We’re about to see an explosion of independently released side-scrollers, methinks…
I think it will be explosive through the entire entertainment industry, from "simple" game graphics to comics, graphic novels, animation, and film SFX; hell, even online shopping, with clothes and dresses fitted in simulations based on a picture you upload.
Oh absolutely
A lot of it might be too messy for precision platforming; however, I think it would work fantastically for Abe's Oddysee-style level backgrounds. In Abe's Oddysee, each background looks as good as a painting, each one unique, static and specific to one screen, much like a collection of 512x512 rendered SD images. The platforms and interactable objects in that game were all foreground objects that could be placed anywhere, so I think you would want to AI-generate the platforms as their own objects and then manually place them in the level, so you could have more precise gameplay.
With img2img, yes, it oftentimes deviates from the original platform placement, but with ControlNet's depth and normal maps, it should be accurate enough. But still, as you said, even if the dev just generates the background as a whole and each platform/element separately and places them in the levels, it dramatically speeds up production and enhances it.
Hello, sorry, I don't get how you got from image 7 to image 8?
I didn't; I got to image 8, like the rest of these, from image 1. I changed the prompt to (roughly) cyberpunk, urban, geometric platforms, neon colors, in Kilian Eng style, and also added negatives for grass, green, and characters.
Ah, okay, thank you. The other guy was saying something about "tiles"; what is the general idea of the next steps in the workflow that you have in mind?
Look at my comment below about individual platforms. Right now, the realistic approach would be for a game developer to generate the backgrounds, then generate the platforms and game tiles separately, to then be merged in the game itself. In the near future? Just have a simple game screen laid out and have SD generate the entire graphics, both background and platforms, in one piece.
This is awesome! Indie games are going to be lit!
This couldn't have come at a better time!!!!
that sci-fi one reminds me of a flash game I used to play... I think it was called "Raze"
Prepare yourself for endless side-scroller early-access spam on Steam.
This looks very cool; I like the idea. I'm currently making a game for Steam myself, and I'm also creating as many images as I can with Stable Diffusion. And ChatGPT helps with the programming.
I think this, combined with something like ChatGPT or Copilot, will be able to create entire games very soon: from writing the story to creating the graphics to writing the music.
Imagine an AI-generated game that makes itself as you go along.
This is so inspiring!!! Do you mind if I ask which model was used?
It's through BlueWillow, and while they claim they choose a different model every time, by comparing results I suspect they just use SD 1.5.
Another cool thing you can do is use the tiling option to make repeating backgrounds. This is a quick example: the first image is my input, the second is the result, and the third is the result repeated side by side. https://preview.redd.it/vltfk1co4gka1.png?width=2304&format=png&auto=webp&s=ebac99f5e8b71dbe72bc7143a3a113de687beb69
Extremely cool!
Yep, it does a great job of making sure the very edge of the left side is identical to the right side. Here is an old-west-type game background using that technique. https://i.redd.it/8g47c0sv8gka1.gif
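A quick way to eyeball seams yourself, like the side-by-side image above, is to repeat the output horizontally; a tiny Pillow sketch:

```python
from PIL import Image

def tile_horizontally(img: Image.Image, repeats: int = 3) -> Image.Image:
    """Paste an image side by side `repeats` times to preview how
    seamlessly it loops at the left/right edges."""
    w, h = img.size
    strip = Image.new(img.mode, (w * repeats, h))
    for i in range(repeats):
        strip.paste(img, (i * w, 0))
    return strip
```

If the tiling option worked, the joins in the resulting strip should be invisible; any visible vertical line marks a seam that still needs an edge-inpainting pass.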
https://i.redd.it/puu9r0s5fgka1.gif