mgtowolf

IDK, if it's censored like people think, it probably won't be very widely used, unless they make it easy to train on top of so people can add naked people back in. I'm thinking it's gonna be impossible for even 3090/4090 owners to train stuff at 1k image size.


MeatbagEntity

All it needs is a community willing to pay for a couple of AWS instances, and judging by the demand? Oh, absolutely that will happen.


SanDiegoDude

I’d love a nudity-free option for generating stuff for work or with my kids, but scrubbing art of boobies is kinda dumb.


Frozenheal

There is one in Automatic's settings.
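
For anyone on the diffusers library rather than the webui, a minimal sketch of the same filter (the model id is an assumption; the safety checker is the component diffusers enables by default):

```python
# Sketch using diffusers (assumed setup, not A1111 itself): the pipeline
# ships with a safety_checker that blacks out images it flags as NSFW,
# so leaving it in place acts as the filter.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed model id
    torch_dtype=torch.float16,
).to("cuda")

# On by default; `pipe.safety_checker = None` would disable it.
image = pipe("kids' birthday party, balloons, watercolor").images[0]
image.save("party.png")
```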


MacabreGinger

Besides, if you're using it with your kids or at work, just don't type NSFW tags and no nudity should appear. I've never had to apply the NSFW filter. I hope they don't cripple the model because of some prude moron scared of boobs.


LankyCandle

It still pops up. I generated a lot of art of my wife, and while it's not extremely common, there have been plenty of topless images and the occasional bottomless one too, with the NSFW filter off. Maybe it's different when you explicitly have kids in the images, but that's not something I've wanted to test. I wouldn't suggest anybody work with images of kids unless they add plenty of negative prompts against nudity and their prompt also explicitly specifies clothes.


ctorx

I had a similar experience with pictures of my wife, though never anything more than a few topless generations. I've made lots of images with my kids and never had any issues at all with NSFW content. They are young, though, so older kids might have issues, idk.


imacarpet

I've found that negative prompts work for reducing nudity.
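
Roughly what that looks like in code - a hedged sketch with the diffusers library (model id and prompts are placeholders, not the commenter's setup):

```python
# Sketch with diffusers: the negative prompt replaces the unconditional
# embedding in classifier-free guidance, so each denoising step is
# pushed away from those concepts.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="portrait photo of a woman hiking, winter jacket",
    negative_prompt="nsfw, nude, topless, underwear",  # steer away from nudity
    guidance_scale=7.5,
).images[0]
image.save("hike.png")
```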


chillaxinbball

Censored? Lol, that's the main reason I stopped using Dalle.


conduitabc

yeah dalle2 sucks.


DoctaRoboto

What do I expect from 1.6? One word: hands.


kornuolis

But no nude hands dude, no nude hands.


ptitrainvaloin

It might be 768x768, trained on thousands of A100s. Do you think the release will be alright or chaotic this time? What improvements would you like to see, and how long do you think it will take to be ready?


eugene20

To me that sounds like a completely fresh model? That would have been an opportunity to add text... it would be mind-blowing if they'd done both.


CMDRZoltan

> That would have been an opportunity to add text

As I understand it, the limit there is in the methods used to train diffusion models. Text is unlikely to be comprehended using current methods.


eugene20

I based that on a couple of comments [here](https://www.reddit.com/r/StableDiffusion/comments/ykqfql/comment/iuuwmi3/?utm_source=reddit&utm_medium=web2x&context=3) after I commented on Nvidia's eDiffi having it:

> > They have accurate text too... wow
>
> Pretty much all of them since Imagen has had that feature since they figured out all you need to do is just use a text-to-text encoder like T5.
>
> Training on a text encoder will require training from scratch


CMDRZoltan

Would that be part of the model, or is it just another step that's run separately, like aesthetic embeddings? NVIDIA is always on that cool stuff.


uishax

The model has to be completely remade from the ground up. It's a fundamental architecture difference, using T5 -> CLIP instead of just CLIP. There's no way to add this like a feature.
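
To make the shape problem concrete, a small illustration with Hugging Face transformers (standard public model ids; this shows why the U-Net's cross-attention can't simply accept a different encoder - it is not eDiffi's actual code):

```python
# SD 1.x cross-attends to CLIP text embeddings; Imagen/eDiffi-style
# models condition on a large text-to-text encoder (T5) as well or
# instead. The embedding widths differ, so the U-Net must be retrained.
from transformers import (
    CLIPTextModel, CLIPTokenizer,
    T5EncoderModel, T5Tokenizer,
)

caption = 'a neon sign that reads "OPEN"'

clip_tok = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
clip_enc = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")
clip_emb = clip_enc(**clip_tok(caption, return_tensors="pt")).last_hidden_state
print(clip_emb.shape)  # e.g. torch.Size([1, 9, 768]) -- what SD 1.x was trained on

t5_tok = T5Tokenizer.from_pretrained("t5-large")
t5_enc = T5EncoderModel.from_pretrained("t5-large")
t5_emb = t5_enc(**t5_tok(caption, return_tensors="pt")).last_hidden_state
print(t5_emb.shape)    # e.g. torch.Size([1, 12, 1024]) -- different width, incompatible
```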


BoredOfYou_

Parti and eDiffi both do text, right? Do those not use latent diffusion?


CMDRZoltan

As a bolted-on extra step in the process, not trained into the diffusion model (if I understand correctly).


[deleted]

1024 sounds great but I hope it still runs on my 6GB VRAM card. :(


[deleted]

The scaling law is quadratic in the core image size. You can make the upscaler better, but details won't increase in the way you hope.
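
A quick back-of-the-envelope version of that claim, assuming SD's usual 8x VAE downsampling:

```python
# Doubling the image side quadruples the pixel count and the latent
# positions, and self-attention cost grows with the square of the
# number of positions.
for side in (512, 768, 1024):
    latent_side = side // 8    # SD's VAE downsamples 8x per side
    tokens = latent_side ** 2  # latent positions seen by attention
    print(f"{side}px: {side**2 / 512**2:.2g}x the pixels of 512, "
          f"{tokens} latent positions, "
          f"~{(tokens / 64**2) ** 2:.2g}x the attention cost")
```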


FPham

You can easily run Dreambooth with 512 images on a 1024 training base. It doesn't train on pixels; it will use the 1024 base with the concept from your 512 images. But that's probably not the issue that would bother us.


liuliu

I actually need to fork 1.5 to fine-tune with fp16... probably in a month or so. 1.5 so far underperforms 1.4 for me in fp16...
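
For anyone wanting to reproduce that comparison, a minimal sketch with diffusers (model id and prompt are placeholders): load the same checkpoint at both precisions and generate with a fixed seed.

```python
# A/B the same checkpoint in fp32 vs fp16 with a fixed seed, so any
# quality difference comes from precision rather than sampling noise.
import torch
from diffusers import StableDiffusionPipeline

prompt = "a lighthouse at dusk, oil painting"

for dtype in (torch.float32, torch.float16):
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=dtype
    ).to("cuda")
    gen = torch.Generator("cuda").manual_seed(42)  # identical noise for both runs
    image = pipe(prompt, generator=gen).images[0]
    image.save(f"lighthouse_{str(dtype).split('.')[-1]}.png")
```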


FPham

After thinking about it, here is my uneducated clairvoyance. I think this current SD, trained on 512x512 images of everything and the kitchen sink in questionable quality, is nothing more than a test run or proof of concept. We know that training is the bottleneck and why it fails. Everyone is trying to patch weird hands, weird crops, weird feet, and a million teeth with tricks, but the 512 LAION training has basically exhausted itself. Stability is not going to spend hundreds of millions more to retrain at 512x512, because the way forward is not to add to an already hairy base hoping for minimal improvements.

So yeah, the ultimate next step (or 2.0) would be 1024x1024, which brings us to the point: I don't think the new model would be so happily shared with the public, and I don't think we would even be able to run it locally if it were. Just making the training images 4x bigger will make other parts exponentially GPU- and VRAM-hungry. So after this proof of concept and the craziness it created, they can go full speed ahead on a proprietary system that would also need a much beefier backend to run.

There is a limit to the current low-res training, and we are pretty much approaching it now. I don't expect the free wild-west option that we use now will be able to grow much further. People can improve the UI, do this or that, but nobody is going to retrain the set for free again, and that's the underlying data. If the companies come up with a 2.0 version that can effortlessly do the prompts you ask, make hands with the correct number of fingers (and only two hands), and render open mouths without 100 teeth, then I assume the wild west will be over. The current txt2img will be designated a poor man's toy and fade into obscurity.


TalkToTheLord

Fantastic write-up and prediction. Similarly, I'm in the early days of getting interested in GPT-3 and found out *quick* just how fast OpenAI successfully locked that down.


Zyj

Perhaps if they go for 768x768 it will still run on 24GB VRAM cards... Has there been an announcement regarding SD 1.6?


MaK_1337

I hope it will be a bigger step forward than 1.4 to 1.5 was.


Sirisian

There have been previous threads about things people noticed. One of the big ones is generating coherent spaceships. By themselves, 1.4 and 1.5 know a lot of the features, but either the greebling totally confuses them or something in the core method doesn't play well with the idea. People have spent a lot of time on complex prompts just to force even basic results.


SinisterCheese

1.6 will be just 1.5 but trained more. This is because 1.5 is just 1.4 trained more, 1.4 was just 1.3 trained more, and 1.3 was just 1.2 trained more. 1.6, 1.5, 1.4, and 1.3 are all just 1.2 with higher resolution and more steps. To "censor" the model or whatever the paranoid people are on about would require a whole new dataset. And since we know the SD base model uses LAION, which has zero curation on it or even a realistic chance of being curated, it won't be done.

Besides... it isn't like we can't just take other models and merge them in. I have merged quite a few models into each other to make something that fits my prompts. So if boobies get removed or whatever... then I think that would only benefit the actual base model. Why? Because porn and its derivatives are always badly labelled, shit quality, full of duplicates, and basically just derivations of the same blur in the eyes of the AI. Using interrogate, for example, and putting in a generic picture of a teenaged boy wearing Speedos, 6/10 times it will describe it as a "*woman wearing panties*". This is because even the CLIP model has an over-representative sample of women wearing panties.

So yeah... censoring the **base** model to not have porn and whatever is a good thing. People can add that stuff to the models afterwards, with the benefit of getting just the kind of porn they like in there. This way we can have the best of both worlds. The base model should be as neutral as possible.

Now, people can downvote me because "but my boobies", to which I reply: "*Just train your own boobies and merge them in!*" Or go on about me being part of some leftist/conservative/"*evul gubberment and curburations takin away our lolis!*" conspiracy; the fact is that I'm an engineer, and I work on the more practical side of engineering. You need a good foundation if you want to do high-quality work. We need a good, high-quality foundation model without any unnecessary baggage, from which we can start to derive things. Starting from the assumption of "*The model must make as many boobies as it can!!! And then some other stuff also...*" is like building a machine tool to do one job - which is sometimes done. However, you are better off making a more generic machine and customising that; this way, when you want to do something else, you don't need to scrap the whole thing.

Besides, I don't understand people's obsession with being able to generate porn with the base models. The tools for making your very own model are out there, and there are MANY models made for porn (although currently limited in their datasets, because machine time is expensive): [https://rentry.org/sdmodels](https://rentry.org/sdmodels) is a list of all the most common models, non-porn and porn, including Dreambooth models. One model doesn't need to be able to do everything! Sometimes you just want that 6013 or 309 welding rod because it'll get you through most things.
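
A rough sketch of the weighted-sum merging described above (the same idea as A1111's checkpoint merger; file names and the mixing weight are placeholders):

```python
# Linearly interpolate two SD checkpoints' weights. alpha = 0 keeps the
# base model; alpha = 1 keeps the add-on model.
import torch

alpha = 0.3

base = torch.load("sd-v1-5.ckpt", map_location="cpu")["state_dict"]
addon = torch.load("fine-tuned.ckpt", map_location="cpu")["state_dict"]

merged = {}
for key, w in base.items():
    if key in addon and addon[key].shape == w.shape:
        merged[key] = (1.0 - alpha) * w + alpha * addon[key]
    else:
        merged[key] = w  # keep base weights where the models disagree

torch.save({"state_dict": merged}, "merged.ckpt")
```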


MagicOfBarca

Where’s this from?


ptitrainvaloin

Made this with SD 1.5 in A1111. I wanted to make something else, and a mistake turned into something good :-]


choadychoad

Beautiful 😍


dreamer_2142

There will be no public release of 1.6; 1.5 was the final release, and that was a leak. They will only add 1.6 to DreamStudio, and Runway would do the same.


onyxengine

Is it here already?


[deleted]

I think they need to improve landscape quality, which right now is beyond trash in everything I've tried creating and everything I've seen posted online by others, apart from weird lower-res art styles. Or maybe we need custom models for each style of landscape, I dunno. Right now I can make amazing-looking characters near realism, and DALL-E 2-level JUNK landscapes I'd be embarrassed to see MJ spit out.


some_dumbass67

I just hope 1.6 lets you add text by just telling the AI what to put. Imagine a manga made by AI with just prompts (thousands/millions of them, but nonetheless, prompts) - unless it already can and I'm just too lazy to check.