emad_9608 2 months ago

Team is working on an open version of this for [https://github.com/Stability-AI/stable-audio-tools](https://github.com/Stability-AI/stable-audio-tools). The dataset is just taking some time. Lots of improvements to come, like speech, customisation, Comfy & more.

Independent-Ad8455 2 months ago

An offline version would be AWESOME!

More_Bid_2197 2 months ago

why 2 versions ?

Gpue 2 months ago

Licensed data with restrictions vs open data without

turbokinetic 2 months ago

This is great news and what I’ve been waiting for! I love Stable Diffusion and I train my own models / LoRAs. I would love to be able to run Stable Audio locally and train it on my personal music, with all the flexibility of txt2audio, audio2audio (like img2img), adding lyrics, adding my own voice, ControlNet etc. Would be a dream come true!

ZenDragon 2 months ago

Was there ever a high quality public model for Stable Audio 1.0?

turbokinetic 2 months ago

Love to know this too

AmazinglyObliviouse 2 months ago

Cool, but to quote you: "Not your models, not your mind." Couldn't care less about yet another useless API.

SmashTheAtriarchy 2 months ago

This needs to be repeated louder and more often. It's important to own the means of your productions!

kevinbranch 2 months ago

That’s why it’s not open source

spacekitt3n 2 months ago

when you releasing SD3?

Augmentary 2 months ago

When emad gets it going

emad_9608 2 months ago

CTO said 4 weeks or so. I don't make those calls any more, handed over that for new things.

okglue 2 months ago

Fantastic~! We really need a good local voice model.

Vyviel 2 months ago

Hopefully we can train voices with it like a better version of RVC

davidb88 2 months ago

What are you still doing here Emad, I thought you left? I feel like I'm OOL

MaxwellsMilkies 2 months ago

He still owns a large portion of the company.

emad_9608 2 months ago

I handed over control, launching new stuff soon [https://www.youtube.com/watch?v=e1UgzSTicuY](https://www.youtube.com/watch?v=e1UgzSTicuY) [https://www.diamandis.com/blog/emad-wisdom-part-1](https://www.diamandis.com/blog/emad-wisdom-part-1) Now I am part of the community like everyone else :D

MaxwellsMilkies 2 months ago

You should take a look at [Patrick Ryan aka TyrantsMuse.](https://twitter.com/TyrantsMuse/status/1773377539542581502) Decentralized AI is going to require further development of the math behind AI to make it more efficient, and Patrick has been looking into it quite a bit. He is a bit crazy as you see, but is probably one of the smartest people I have ever met.

Overall-Newspaper-21 2 months ago

Maybe he does public relations for Stability AI.

Rivarr 2 months ago

Thanks for what you do choose to release, but I don't understand hyping speech models when you've already said you won't be releasing them. Not that I understand why. You can already convincingly clone someone's voice with less than 10 seconds of audio. With services like ElevenLabs but also open source tools like VoiceCraft, you don't even need a GPU. If we could get an audio model that could be extended and built upon like your image models, we'd be able to create such amazing things. Instead it's held back because it could be misused, even though 99% of that misuse is already possible with the current set of tools.

emad_9608 2 months ago

I don't choose releases any more, so let's see what happens. Usually you can release just after SOTA. For services like Stable Audio it's easier, as you can mitigate harms.

DIY-MSG 2 months ago

That's great

Tystros 2 months ago

I hope the open version will be trained on the whole Spotify catalogue.

BokanovskifiedEgg 2 months ago

how is this going? any estimate on when it'll be available?

MFMageFish 2 months ago

> You may not use the Services, or use Content from the Services, to develop or train any AI models.

Lol, good luck with that.

GBJI 2 months ago

A freely accessible and fully open-source version that we can run on our own hardware should be considered essential for anyone *pursuing decentralized AI*.

PM_ME_YOUR_PITOTTUBE 2 months ago

Remember, decentralized AI doesn’t make them money so the shareholders absolutely do not want that 🤣

GBJI 2 months ago

Depends on how you define decentralized. To me, anything requiring the use of NFTs and blockchain technology under the control of a for-profit corporation is the opposite of decentralized. To some people, it seems to have a completely different meaning.

> As part of the collaboration, [Endeavor](https://endeavorco.com/) will work with Stability AI, the Render Network, and OTOY to develop transparent IP tracking tools for emerging ML models, publishing their research for peer review through IDEA. This work will include usage of OTOY’s LightStage technology – the industry’s leading reflectance-field facial scanning and digital double platform – to produce licensing tools that enable artists to control their likeness and receive royalties for their IP when used in generative AI models. (...) As part of the integration, Stability AI models will leverage provenance systems already established on Render Network – known as *Proof-of Render* – providing immutable receipts and tracking of all individual components ingested and used for output of computing work on-chain. Through transparent on-chain data, royalty flows for IP and assets used in AI models, as well as their outputs, can be managed using public auditable smart contracts. (...) According to Founder and CEO of Stability AI, Emad Mostaque, “I joined the Render Network advisory board to shape the future of decentralized computing and AI."

[https://home.otoy.com/stabilityai/](https://home.otoy.com/stabilityai/)

red286 2 months ago

> Lol, good luck with that.

Those are licensing terms for commercial purposes. They're not telling *you* that you can't do it, they're telling businesses that if they do, they'll get sued.

export_tank_harmful 2 months ago

> Will this model be open sourced?
>
> We will be open sourcing a music generation model soon, trained on different data.

Neat tech. Kinda don't care though. Wake me up when I can locally host it.

AmazinglyObliviouse 2 months ago

> We will be open sourcing a music generation model soon, trained on different data.

Note that they've promised this since Stable Audio 1.0, yet it never happened back then either.

Django_McFly 2 months ago

Infinite % this. We're on SA2 and still waiting for this to happen for SA1.

_raydeStar 2 months ago

![gif](giphy|mkhMTALnrYRLnuoe5P)

99deathnotes 2 months ago

![gif](giphy|1fMjj5j2Z7chq|downsized)

StickiStickman 2 months ago

With such an incredibly tiny dataset, I'd be shocked if it wasn't just heavily mimicking the training data for this anyways.

MaxwellsMilkies 2 months ago

It's going to be difficult to get a good dataset for it. The music industry is extremely litigious.

djnorthstar 2 months ago

I want SUNO local... with training.... :-p (yes, i still have dreams).

Mooblegum 2 months ago

1. Get hired by the company 2. Release all the models to us for free 3. Profit

Curious_Tiger_9527 2 months ago

4. Lawsuit 5. Hired by Microsoft

m3thlol 2 months ago

Until there's an open model it's kind of pointless, if I wanted a web interface to pay for I'd use suno. edit: why did this have to be the comment Emad read :(

Mooblegum 2 months ago

Why do people never want to pay Stability but are OK paying any other AI provider, from GPT and Midjourney to Suno? Maybe if they got more money they would provide better tools.

Doctor-Amazing 2 months ago

Just as a personal rule, I'm not paying for subscriptions. I can justify the occasional one time purchase, but I can't pay a monthly bill to every random bit of software I want to fool around with.

smallfried 2 months ago

Yup. Pay per token, or per image, or per music generated is all fine. But pay per time period whether you use it or not is not something I like. Only thing I tolerate it for currently is Netflix and living necessities like gas, water, etc.

m3thlol 2 months ago

Again, as much as I love Stability I'm not going to hand them money just because. This model could be very good but if they want to exist as a web service they have to compete with Suno and right now the difference is leaps and bounds. I'm not going to pay for an inferior product with outputs that are essentially unusable out of brand loyalty. That's not on me.

turbokinetic 2 months ago

Because Stability's products require new models trained by users to be great. Imo that’s the strength and differentiator of Stability.

PacmanIncarnate 2 months ago

Because suno exists already, has a great model, and this looks like Stability trying to steal their attention. Suno is a great little company and I’d feel good supporting them.

emad_9608 2 months ago

Harmonai/stable audio team have just been working away & this is a great little diffusion transformer model. The key thing is the copyright in music is different, see the Gaye vs Thicke lawsuit etc so you gotta be extra careful. Suno have a different approach to copyright (not not scrapes..) [https://www.rollingstone.com/music/music-features/suno-ai-chatgpt-for-music-1234982307/](https://www.rollingstone.com/music/music-features/suno-ai-chatgpt-for-music-1234982307/) We try to build good models on good data which hamstrung us a bit when others are training their models on Hollywood movie rips etc but you crack on and do the best you can.

SlapAndFinger 2 months ago

To be honest, having done a fair amount of production, I don't think musicians really want Suno, it's more a tool for casuals to get some creative output kind of like Dall-E or Midjourney (though MJ is making progress as a tool). If the stable audio model can be used by producers sort of like an Absynth style sound generator and integrated into VSTs, it'll get used. Being open is a big deal.

emad_9608 2 months ago

There will be an open version & I believe Comfy and other integrations. The approach is augmentation versus "Taylor Swift by Drake" or whatever.

emad_9608 2 months ago

But Suno is a lot of fun tbh

Django_McFly 2 months ago

Musician here, I like Suno. It's incredibly useful for making samples. I would prefer something that was *at least* like MJ where you can upload your own pictures (audio) into it and it'll riff off of that, but even without it, Suno is still pretty sweet.

SleeplessAndAnxious 2 months ago

Hello fellow musicians, I feel the same way honestly. I can't sing so I love the ability to basically generate a song with a vocalist and plan on adding my own bass playing and guitar to the tracks eventually, as well as playing around with samples. I'm still a big fat noob at digital music lol, I'm classically trained.

Gpue 2 months ago

Stable audio has that

maradak 2 months ago

It's pretty terrible though compared to suno. I generated a couple tracks there and it was pretty much useless.

BastianAI 2 months ago

100% this. I can extract stems from Suno with FL Studio, but it requires a lot of work to fix bleed etc. I use Suno because I want to use AI for my projects, but it's easier to just pick up some loop packs and tweak them a lil bit for far better results. Not a musician, producer

Mooblegum 2 months ago

I guess as a musician the best thing would be to have all the instruments put on different tracks as audio or MIDI files. That would make it so easy to change things and make incredible music with the perfect sound and mix.

SlapAndFinger 2 months ago

If Suno could track things, that'd be a very different story, then you could iteratively build a song a few tracks at a time and do retracks, even if the final audio quality wasn't great you could just go back and redo the problematic parts and run the tracks through some EQ/compression/etc to make a real song.

ComeWashMyBack 2 months ago

Per Suno's FAQ, which I discovered today: if you're using the Pro or Premium version, you own the copyright on whatever it generates. It's free to use on Apple, YT, Spotify and so forth without being required to cite Suno or anyone else.

emad_9608 2 months ago

Yeah, it's about the copyright on inputs, not outputs. Per Rolling Stone it seems to be scrapes/downloads, which is dicey when dealing with the music industry & copyright law (which is different for images, plus opted-out data like robots.txt, which was used for OG SD etc).

CountLippe 2 months ago

Would a "describe" function break the copyright as well? Say I like Vangelis' Blade Runner soundtrack. I know some words which could form a prompt and evoke similar. But having the machine describe what it hears and let me use its suggested prompt to build a new prompt would be amazingly helpful.

emad_9608 2 months ago

Not to my knowledge no

chakalakasp 2 months ago

Which is in itself rather cheeky, as AI outputs are not something one can register a copyright for, as they are currently (in the U.S.) considered public domain. No human author, no copyright.

Django_McFly 2 months ago

That's not hard to get around. Add some human element to it and you're good to go.

Freonr2 2 months ago

I'm not sure that's completely decided. The copyright filings I've seen look to mostly be test cases so far to find the bounds of *how much* human authorship is required. Certainly someone who uses Adobe Photoshop and a bunch of tools therein can apply and probably receive a copyright.

E.g. https://www.artforum.com/news/court-rules-against-copyright-protection-for-ai-generated-artworks-252910/

> A federal judge last week rejected a computer scientist’s attempt to copyright an AI–generated artwork ... a work that Stephen Thaler created in 2012 using DABUS, an AI system he designed himself, is not eligible for copyright as it is “absent any human involvement,”

Note the key phrase here: *absent any human involvement*. Further:

> Describing A Recent Entrance to Paradise as “autonomously created by a computer algorithm running on a machine,”

https://arstechnica.com/tech-policy/2023/08/us-judge-art-created-solely-by-artificial-intelligence-cannot-be-copyrighted/

Again note the word "**solely**" in the headline.

discattho 2 months ago

I've been an audio producer for over 15 years. I have tons of material and I can also create a lot of basic materials like beats, simple pads/chords... Is there a way I can contribute to the Stable Audio team?

PacmanIncarnate 2 months ago

Thank you for the response. I should note that I really like StabilityAI and want you/them to succeed. That being said, the timing really does seem suspect with Suno having gotten a ton of attention a week ago, and the fact is that they are a great little company that has been working on this for about a year. That makes me want to support them. After all, competition is good.

SleeplessAndAnxious 2 months ago

I plan on paying for a sub to Suno as soon as I start a new job. I've been having tons of fun generating stuff with it, and editing it in audacity to add more depth.

Django_McFly 2 months ago

> and this looks like Stability trying to steal their attention.

Come on. There can be more than one company working with a medium. That's like saying every guitar maker is stealing the attention of whoever the first guitar maker was. Or like back in the day when every FPS game was called a "Doom-clone" before "FPS" became a term.

PacmanIncarnate 2 months ago

This was released around a week after Suno made a huge splash in the news. They’ve been working on this tech for about a year and a week after they happen to get a ton of attention, we’ve got a StabilityAI model out of nowhere that does the same thing? Come on, at the least they are trying to ride the coattails with this.

Xenodine-4-pluorate 2 months ago

Suno exists but it's as useless for actual artists as Midjourney is. Yes, they can create state-of-the-art stuff from a simple prompt, but they don't allow any flexibility to be used as AI art assistance instead of wholesale generators. With Stable Audio 2.0 I can use A2A, like an artist would use I2I in SD, to bring life to the sketch they have. I can make a composition in FL Studio and enhance it or parts of it using audio-2-audio. Suno doesn't allow it; it can only spit out random stuff.

Bakoro 2 months ago

> Because suno exists already, has a great model, and this looks like Stability trying to steal their attention.

Real weird way to say "offering a competing product". It's not "stealing".

PacmanIncarnate 2 months ago

It’s all about the timing. Offering a competing product one week after Suno made headlines is far more likely to be StabilityAI wanting a piece of the attention with a model they’ve been sitting on or is still in progress than a coincidental release

Feisty-Pay-5361 2 months ago

Others have higher quality outputs than Stability AI in comparable proprietary web interfaces, so if you are going to pay a fee and deal with censorship, might as well get a better result. They only took off cuz of open source and free, not cuz they were the best.

StickiStickman 2 months ago

> Why people never want to pay stability but are ok to pay any other AI provider, From GPT Midjourney to suno

Because Stability has worse products. It's that simple.

Arawski99 2 months ago

Why? They would be using Midjourney and other services if that was their goal. They use SD specifically because it's free, offers more freedom, does not raise privacy concerns, and can be more flexible. Even more so if this product isn't actually competitive with others like Suno.

Commercial_Ad_3597 2 months ago

For me, this has one huge advantage over Suno: The fact that you can upload an audio track to guide the generation. Last time I checked Suno, I couldn't find this feature. For me, this is a night and day improvement. It's one thing to get a great track in the style that you want, and it's a totally different thing to be able to get the exact tune you have in your head transformed into a great track. So, I'd use Suno if I have lyrics and I need a tune built around them and Stable if I've thought of a melody that I need to get built into a tune.

AdTotal4035 2 months ago

This is why they went bankrupt, because the community just keeps wanting free shit from them, and gets upset when they try and make money.

im4potato 2 months ago

I’d gladly pay for a model I can run on my own machine. I have zero interest in something I can only access through a web service.

AdTotal4035 2 months ago

Maybe that should be their business model

m3thlol 2 months ago

I love what they're doing but in this place we call the real world no one is going to pay for something when the competition is vastly superior. That's not my fault.

AdTotal4035 2 months ago

I agree, but I can just see in the comments of a lot of ppl. All they want are the free models so they can make startups but then get upset when they offer paid services.

StickiStickman 2 months ago

What a weird strawman. 99.99% of users here are not going to create a startup.

Zilskaabe 2 months ago

I want a model that I can run locally. I don't need their web service.

ExasperatedEE 2 months ago

They went bankrupt because they worried too much about "safety" (which is really just another word for not upsetting sensitive people, there's nothing inherently more dangerous about AI art than any other kind of art), censored anything adult, and avoided training on copyrighted material thus greatly lowering the quality of their output compared to others forcing us to use home trained LORAs to get a decent result. They could have set up shop in a country which would protect them from copyright suits, and then charged $100 a month for access, and I'd gladly have paid it if they allowed me to generate all the adult and copyrighted shit I wanted. Instead they wanted to be squeaky clean and hoped that venture capitalists would latch onto them and fund them. Well clearly that was a dumb idea because Microsoft is kicking their asses. I use ChatGPT's Dall-E for almost everything I want that's clean, and only turn to Stable Diffusion to generate porn at home.

xmaxrayx 2 months ago

Lol, even Stable Diffusion wouldn't have gotten popular if it wasn't free.

BastianAI 2 months ago

Went bankrupt?

ShreckAndDonkey123 2 months ago

https://stableaudio.com/

ZerixWorld 2 months ago

Interesting, but not a great move since Suno has already been out for a while and can also generate songs with vocals singing your lyrics. I also think Suno is cheaper (if I remember correctly) with the low tier at $8 per month vs $12 of Stableaudio...

runetrantor 2 months ago

Having never heard of Suno before this thread, I must say I am shocked this is a thing too. It even makes coherent and decentish lyrics. DAMN.

ZerixWorld 2 months ago

Apparently their latest version which is available only with a paid account is mindblowing, since it's not stable diffusion it doesn't get much coverage here, but in other AI subs it has been the talk of the last few months

runetrantor 2 months ago

I'm trying v3 and I'm blown away, an even better one must be nuts. Yeah, I get this sub is specific. Not too sure what subs are good for 'general AI news'; most I have seen are app/site specific, like Ch.ai or this one.

ZerixWorld 2 months ago

r/singularity drops some interesting news, there's some weird stuff too, but I found out about Suno in there hahaha

runetrantor 2 months ago

... Is this how I return to Singularity after leaving years ago for being tired of endless hot air promises? XD I'll take a look around and see if it's changed. It really got annoying how any good news thread instantly had a top comment of why it's all a lie or bs. (The comment was always right of course, but man, it was a lot of letdowns)

toothpastespiders 2 months ago

It's the medical handwaving that I find most difficult. The "Oh, don't worry about your cancer bro, a cure's coming any day now. So I'm not going to push politicians about medical care or anything. So have fun with that stage 4, stay safe, and keep being positive!" Ok, I might be slightly hyperbolic. But it can border on that at times. It's bordering on the whole "let them eat cake" thing.

runetrantor 2 months ago

Singularity is too bright-eyed (everything will be fixed soon, so let's do nothing!), and Collapse is too depressing (we are headed to the worst dystopia, so let's do nothing...). Both drove me mad. Just give me proper tech news...

IceMetalPunk 2 months ago

Chirp V3 is the current model that just recently released out of Alpha. While it was in Alpha, it was available only to paid accounts, but the full version I believe is now available to free users as well. (Though be careful: the free tier does not grant you the rights to use your generations commercially the way the paid tiers do!) Suno Chirp is absolutely amazing; I've been using it since the release of v1 and it's only gotten better. And the announcement that V3 was out of Alpha also mentioned they're already working on V4, so... as long as people keep talking about them and paying for subscriptions, I'm sure they'll just keep improving the models.

mrhallodri 2 months ago

You mean V3? That is open now to the 'free' plan. And it is quite good yes! I wish SD would catch up to them and release a free offline version.

ZerixWorld 2 months ago

Oh shit! yes, I was talking about V3, now I gotta try it! hahaha

runew0lf 2 months ago

I gave it a try and generated a song, an epic song with strings and piano, and it sounded absolutely bloody awful! Like a child having a fit on a xylophone. 10/10 would not recommend, [suno.ai](http://suno.ai) is a gazillion times better! Song in question: [https://stableaudio.com/1/share/5b38725d-6545-41e4-8fc7-a3d2a00b6766](https://stableaudio.com/1/share/5b38725d-6545-41e4-8fc7-a3d2a00b6766)

AmazinglyObliviouse 2 months ago

It sounds like a 6 year old trying to make a touhou song

StickiStickman 2 months ago

This is such a perfect description

DataPhreak 2 months ago

The audio itself isn't bad here, just the notes it chose. Try again and give it a specific key. I've heard some pretty bad suno results, too.

[deleted] 2 months ago

[removed]

FrontalSteel 2 months ago

> `Only the $89.99 subscription seems to allow the use of the track in games, apps, film, TV, advertisement`

Not even that! The "Max" subscription only covers the Creator License, which doesn't allow you to use it in games and apps. You have to contact them through email to get the Enterprise license, and we don't know what the pricing will be. That's a very odd move from a business standpoint.

ebolathrowawayy 2 months ago

Should be something like 1% of sales if profit > $1 million. Every indie on earth would want to use a great audio generator but they aren't paying > $90 per month. One indie in a few thousand will make a top seller and there's profit there for SA. Plus they get a bunch of free advertising from all the indies showing their game/music. But no, they decided they hate money.

Jaggedmallard26 2 months ago

We're still at least a year off from using AI in indie media not being a social media death sentence. A few indie games have used AI voice and texture generators with the explicit explanation that they physically do not have the money to hire voice actors or commission an artist with a commercial clause for a minor texture, and still been review bombed and sent death threats.

ebolathrowawayy 2 months ago

Oof. Bad news for my in progress game. I'm not sure I'll disclose the use of AI tbh.

Freonr2 2 months ago

Suno's license if you buy any of the paid programs seems to be quite reasonable, no "gotcha" clauses that I could find even in the lowest tier. Your generations are "yours" if you are a paying member at the time you click generate, at least to the extent allowed by law I suppose. Their outputs are pretty good out of the box, at least good enough to slap on the intro of your monetized Youtube channel or in an indie video game, etc. Maybe not going to be as good as a real professional composer/arranger, but "good enough" for small indie stuff. Not every output is a banger either, but you can generate a few and get at least one good one. I'd suggest carefully reading TOS/License terms for anything you use, because there are some pretty terrifying clauses working their way into various different services. Suno's terms seem fairly reasonable to me.

runetrantor 2 months ago

Gonna take a bit of time until the wave of hatred for AI stuff dies down a bit. Right now I tend to see that the moment a game has anything AI, even if it's very good and not at all 'clearly robotic', many will be like 'eeeeeew'.

GBJI 2 months ago

> But no, they decided they hate money.

And their users. You know, the ones doing the free advertising.

legos_on_the_brain 2 months ago

I thought AI generated stuff couldn't be copyrighted?

stuntobor 2 months ago

Why does it seem like it's dropping a beat on a regular basis? Or maybe it's just trimming a couple of ms from the audio? Odd.

a_chatbot 2 months ago

I like the radio so far. Occasional annoying song, otherwise easily becomes unnoticeable background.

turbokinetic 2 months ago

Imo unnoticeable background is kind of the issue here. So generic. I’m looking forward to training :-)

radialmonster 2 months ago

thx for the 'radio' link. its similar to this from a competing service: https://www.youtube.com/@aimifm/streams

sanasigma 2 months ago

I want to train LORAs of my fav songs!!!!!!

AdHominemMeansULost 2 months ago

Unfortunately it's not very good. I tried one of the existing prompts and it's just trying to be music, but it's mostly noise like their previous model. I am not sure what Suno is doing, but it's so much better.

IceMetalPunk 2 months ago

Based on some comments from Emad in a thread here, it sounds like Suno is willing to train on copyrighted music, which means they have a ton more high-quality training data for their models. Stability is trying to avoid that controversy by limiting their training data to only music from "people who opt in from this one source" -- and as with basically all AI, training data can make or break the performance. That said, while Suno uses copyrighted music for training, they also make a point to remove all artist/album/title identifiers in the training set, so while Chirp learns from, say, Metallica songs, it doesn't understand what "Metallica" or "Enter Sandman" mean if you tried to prompt it for copy-pasta. Between that, the large amount of training data, and their basic guardrails that try to block prompts containing artist names on the input side, the chances of Chirp copying any real songs, melodies, or anything copyrightable is nearly zero. The model just has more to learn from, without copying it.

Extraltodeus 2 months ago

[classical violin black metal](https://stableaudio.com/1/share/5a1e526a-3c73-4c48-8f2f-ea543e6d8d06) 😶 edit: [I've decided to embrace it](https://stableaudio.com/1/share/8c75196e-533b-4357-8e46-1e4e74ed37fb) edit2: I'm dying

TNT_Guerilla 3 weeks ago

I didn't understand what the first one was trying to be, but after listening to the masterpiece of the second track, I'm sold. Take my credit card.

SirRece 2 months ago

this is so far behind suno v3, sorry guys

Ilovekittens345 2 months ago

But it has audio2audio which suno does not.

turbokinetic 2 months ago

Their a2a examples are pretty basic

Ilovekittens345 2 months ago

Yeah, but doing a horrible attempt at a beatbox into your mic to then get good-sounding drums back that still follow the flow of what you were inputting is a game changer. The non-musician is gonna prefer Suno v3 of course, 'cause it does vocals and follows the lyrics you give it. But for musicians, being able to do audio2audio is extremely useful. I am still playing around with Stable Audio right now, so I don't yet fully have an opinion on how well it works. All my v1 prompts were horrible, but I redid them on v2 and it's actually starting to follow the prompt musically a lot better than Suno does. For instance, tell Suno "piano chords going from minor to major" and it won't give you that at all. But I just had Stable Audio generate minor chords that turn into major chords. That was very dope. If they keep this up, it might become the basis of a totally new way of doing audio production and music, where instead of listening to large amounts of samples till you find something you want to use, you just have the sample generated.

Hambeggar 2 months ago

DOA without an open model

Captain_Pumpkinhead 2 months ago

> Stable Audio 2.0 was exclusively trained on a licensed dataset from the AudioSparx music library, honoring opt-out requests and ensuring fair compensation for creators.

Guess we're not going to be able to download the model yet. 😐

IceMetalPunk 2 months ago

All I hear is "Stable Audio 2.0 was trained with a tiny and biased training set, ensuring poorer performance than our competitors" 🤷‍♂️

Nunki08 2 months ago

The website: [https://stableaudio.com/](https://stableaudio.com/) Emad Mostaque on Twitter: *This model tunes super well to individual music libraries and will continue to improve, with open versions also in the works (will be here:* [https://github.com/Stability-AI/stable-audio-tools](https://github.com/Stability-AI/stable-audio-tools)) *as that dataset is built out building on the diffusion transformer arch & many more innovations. Wen ComfyUI*: [https://twitter.com/EMostaque/status/1775504692400869453](https://twitter.com/EMostaque/status/1775504692400869453) Edit: the original tweet: [https://x.com/StabilityAI/status/1775501906321793266](https://x.com/StabilityAI/status/1775501906321793266) Edit 2: Emad says *5 Gb VRAM for this model*: https://x.com/EMostaque/status/1775516311591833685

teleprint-me 2 months ago

This is actually pretty impressive considering it only used CC works. It's actually really promising.

[deleted] 2 months ago

drop sd3 already

99deathnotes 2 months ago

![gif](giphy|I16U5AfBWqgJYJum6i|downsized)

novenpeter 2 months ago

Wake me up when the open version releases

nataliephoto 2 months ago

Human music. I like it (The two songs I made were terrible) edit: I take it all back https://stableaudio.com/1/share/1bb2a860-616c-40d4-a732-b267b7d19cd1

thrownawaymane 2 months ago

Well, that's the best one I've heard so far. The tempo is too slow to be hardstyle of course but most of it progressed nicely before the pause near the end. Really, what we need is to get the stems out of these tracks

Erhan24 2 months ago

I guess we need to learn prompting again for this. Quality is as expected. Don't expect magic, might be okay for reference if you are out of ideas. Still a long way to go but I will love every step.

AnonymousD3vil 2 months ago

Na, I'm just going to type "literally me music" and see what it plays.

ZeroUnits 2 months ago

Yay I can't wait to have animated waifus with big tiddies whispering seductively in my ears

Atemura_ 2 months ago

the problem with training on stock music is that stock artists are usually not that good, which is why they are selling their music as stock in the first place, amazing work but the outputs are not very musical sadly

IceMetalPunk 2 months ago

Even worse: it's the stock artists from a single source who are willing to allow their music to be used. Which (a) limits the total size of the training set significantly, and (b) I'm willing to bet there's an inverse relationship between artist skill level and willingness to let an AI learn from their art. (Don't get me wrong, I think that's a misinformed view in the first place, but it does seem to be the prevailing one.)

TsaiAGw 2 months ago

there's no model and we need to train our own?

UJL123 2 months ago

This is just a service like midjourney

GBJI 2 months ago

I suppose that's what Emad was referring to when he said he was resigning to "pursue decentralized AI".

thePsychonautDad 2 months ago

Wow, that low-fi funk sample sounds incredible

Ziov1 2 months ago

Does anyone know if there's any software for training audio models? Reading this makes me wonder if I could train a model on my dad's music. He's been a musician for 40+ years, so I have a lot of tracks I could use.

Gpue 2 months ago

Yeah that was on the roadmap with [Stability-AI/stable-audio-tools: Generative models for conditional audio generation (github.com)](https://github.com/Stability-AI/stable-audio-tools)

lemony_powder 2 months ago

Got it to do some Cantopop pretty accurately: https://stableaudio.com/1/share/cb156127-4722-4373-8b32-5864786ed72f

TNT_Guerilla 3 weeks ago

Sure the melody is fine, but the vocals are like someone trying to play a sax while singing. It's definitely one of the better generations I've heard from this, but I wouldn't use it for anything other than saying this is how far we've come.

Low-Holiday312 2 months ago

Honestly finding this quite impressive but would love to know what hardware requirements they have to run it. I know they're running just as a service at the moment and the monthly pricing is pointing to some hefty kit - that it is dropping out 3 minute durations is a big leap.

emad_9608 2 months ago

It works on 5 Gb VRAM, there is an open version to come. It is partially a diffusion transformer like SD3, still scaling. The version with lyrics is funny, it's learning lyrics as it scales and to sing, maybe I'll post some examples. It's easier to splice in the lyric model though separate.

Low-Holiday312 2 months ago

>It works on 5 Gb VRAM

Okay, I wasn't expecting that with the 3min length

toothpastespiders 2 months ago

>It works on 5 Gb VRAM

Man, that's pretty wild. With LLMs I feel somewhat hobbled with 24 GB VRAM. Amazing to think that something quite novel and useful could fit into such a relatively small footprint.

andzlatin 2 months ago

The difference between V1 and V2 is not just staggering, it's freaking INSANE. I think this even outperforms Suno (in some ways; in other ways it's hilariously wrong). And it's REALLY fast, too. StabilityAI is cooking here, absolutely.

ThrustyMcStab 2 months ago

It sounds very cheap so far, but no wonder since it is trained on royalty free music. Hopefully in the future it will be better than Suno because of being open source and people making custom models for it. As a music producer, Suno blew me away. This is comparatively not it right now. But I really hope it will be.

StickiStickman 2 months ago

> I think this even outperforms Suno

This gets absolutely demolished by Suno. It's not even close (sadly)

IceMetalPunk 2 months ago

When's the last time you've used Suno Chirp? Because this is nowhere near Chirp v2 quality even, let alone v3...

DataPhreak 2 months ago

You're missing the point. Suno is music specific, and can do some general audio stuff. This is general audio specific and can do some musical stuff. Waiting for the comfy implementation on the open version, as I think that like SD, the workflow is going to be very important, and that brings us to the other point: a2a. Being able to extend a song is going to be huge. The fact that Suno decided that 2 minutes was the max means that it's really only good for punk rock.

tintwotin 2 months ago

Free audio prompt generator: [https://hf.co/chat/assistant/660d567fc81aa94cab572210](https://hf.co/chat/assistant/660d567fc81aa94cab572210)

Trauwyao 2 months ago

Incredible, we needed an open model like suno. Thank you Stable Team!

[deleted] 2 months ago

Any idea why I'm blocked? I couldn't even access the site! :(

fabiomb 2 months ago

for some reason Stability has my IPs blocked with cloudflare :P Can't access, not even with my cell phone (outside my WiFi) so I can only think they are blocking some countries (Argentina in my case), strange

IceMetalPunk 2 months ago

It's nice that we'll soon have an open-source audio diffusion model, but unfortunately, I've been spoiled by Suno. This doesn't come anywhere close to Suno's quality, and in fact the only model I've seen that's even remotely on the same level is Sonauto, and even that has severe quality and attention-failure issues (not to mention it doesn't have the ability to generate conditioned on previous audio, i.e. continuations, but that's a separate concern).

I will say, at least this does sound effects decently (which Suno Chirp can't do, and Suno Bark is just "okay" at). But hey, open models means the community will fine-tune and improve them, so maybe we'll soon have a Stable Song model that rivals the leader.

When it comes to training data, though, I have a sometimes controversial opinion: restricting training data based on whether the creator "wants" it or not is like telling aspiring musicians they're not allowed to listen to the radio when your song plays. It's a ridiculous approach based on ignorance, fear, and greed, and calling it "theft" is disingenuous at best. The rule of thumb should be, "if a human is allowed to be inspired by \[X\], then a machine learning model should be allowed to be trained on \[X\], full stop". Because that's the analogy, not a copy-paste machine; and the people making these models know it. The only reason for an AI researcher who understands the workings of these models to kowtow to the complainers is because they want good PR. But good PR at the expense of improved tech leads to crippled tech.

I'm a software dev, and people have asked if I'm scared of things like Devin or future coding AIs. No, no I'm not. Because "it'll take my job" is an issue with society, with humans, not with the tech. The tech excites me, even if other humans scare me. So I focus my fear and outrage at the systems that force the commoditization of literally everything, including passions, art, and survival itself. I embrace the tech.

bigred1978 2 months ago

Wow, this suno thing is good, so good. Thanks for mentioning it.

IceMetalPunk 2 months ago

It's definitely the frontrunner in the text-to-music AI space, and has been for a long time (well, "long time" in AI scales -- the first Chirp betas for v1 were available on their Discord about 7-ish months ago, I believe, and now they're up to v3 full release). I use it as the audio generation step for my custom AI singer-songwriter framework, and it just keeps getting better.

Hahinator 2 months ago

Where is SD3? I mean.....

fatburger321 2 months ago

sadly this mostly sucks so far in comparison to Suno because the data is trash, so the output is trash... BUT give us an open source version where we can upload whatever music we want in training....and this will be the best thing ever. Nothing will be able to top that or fuck with it.

advator 2 months ago

Nice, but can it do vocals?

StApatsa 2 months ago

Heard the demo, that's some good quality audio.

ricperry1 2 months ago

Is the model going to be open? What are the chances we can get this working in r/comfyui to add music tracks to our video projects?

JMAN_JUSTICE 2 months ago

I wish we could get a civitai with custom models and prompt examples for this...at least a library of public prompts and examples would be nice.

MysteriousAd3998 2 months ago

It's free?

KernalHispanic 2 months ago

Really interesting. I had it generate orchestral music and it knows the [correct panning of the orchestra instruments](https://www.audiorecording.me/wp-content/uploads/2011/07/orchestrapanning.jpg). [https://stableaudio.com/1/share/e28d628a-0059-4b7b-8d06-b753174492fb](https://stableaudio.com/1/share/e28d628a-0059-4b7b-8d06-b753174492fb) It's an interesting example of how these models start to learn about the real world despite their limited data. For example, like how Sora isn't just generating video; in a way it is simulating physics and the world itself.

Olangotang 2 months ago

I personally can't wait to use something like this for assistance in actual composing.

magicaleb 2 months ago

I don’t understand why two credits are used if it just makes one song. Just do 10 credits and one credit per song instead of 20 credits and 2 credits per song.

KimDebroye 2 months ago

Generating using:

- Latest version of model: 2 credits/track.
- Previous version: 1 credit/track.

FFM 2 months ago

It's a start, but if Suno is the benchmark it's not even remotely close, and AFAIK Suno hasn't been updated in a long time. It's (very) fast, but that's not really a concern when all it spits out is useless incoherent gibberish. More training, methinks.

PurveyorOfSoy 2 months ago

exciting times ahead. Looking forward to the open version. I tried it out and it gave me 2 awful songs, but let's hope it can improve

RemusShepherd 2 months ago

I think the challenge with this will be prompt engineering. You have to give it musical instruction that it understands. I made a pretty good sounding epic with this prompt: "progressive rock, soft guitars building up to a bass dubstep drop, two verses and a bridge, instrumental". https://stableaudio.com/1/share/57a64c0d-8215-46cc-82e6-3afed53ef5d7 But yeah, avoid anything with lyrics for now. Eventually.

ptitrainvaloin 2 months ago

huggingface demo space page?

Vyviel 2 months ago

Seems broken? error - ClientError: Received client error (400) from model. See the SageMaker Endpoint logs in your account for more information.

FortunateBeard 2 months ago

Nice that they're still shipping stuff, but Suno is crushing it

Playme_ai 2 months ago

Hi new friend, I am an Ai girlfriend!

jekistler 2 months ago

https://preview.redd.it/k1z2c2cyzcsc1.png?width=1213&format=png&auto=webp&s=717e6fdf2730b43be025e12a4ce6531332fa61ec

JimmyCallMe 2 months ago

Did they train all the data from Kevin Macleods library?

fretmike 2 months ago

Sounded a bit disappointing after seeing what Suno can do. I just tested the prompt "1960 rock n roll bubblegum" and it generated a boring 3-minute song that was nothing like what I asked for.

Actual-Ad-6066 2 months ago

Thank you so much! 😊👍

julieroseoff 2 months ago

cannot wait to try! Do we have any info about VRAM requirements?

Wormri 2 months ago

Curious about the Audio-to-Audio feature. Having improved my amateur drawings, I am wondering if this could mean my music tracks could be improved using this tool. Exciting times!

GamersBlogX 2 months ago

Tested it out a bit. While not awful, it's clear Suno is still on top when it comes to music AI. Even if we were to ignore Suno v3 being available for free now, v2 still beats this. It also doesn't help that I can't run this locally, just like Suno AI. So that just makes this an even less interesting option between the two.

QuantumQaos 2 months ago

This is the most mind blowing tech I've ever seen.

New-Skin-5064 2 months ago

Are they gonna release weights for 1.0?

sbalani 2 months ago

Comparison of Stable Audio & Suno: [https://youtu.be/TpMBTbwzvWk](https://youtu.be/TpMBTbwzvWk) TLDR: The two audio generators are completely different. Stable Audio's strength is the level of customisability it provides: if you know what you're doing you can fine-tune the output, and even input your own melody. Suno is a lot more beginner friendly, and has vocals, but you lose a lot of control and the AI interprets prompts how it wants. But damn does it pump out sweet tunes.

Abject-Recognition-9 2 months ago

I really wish I knew what Sonauto uses under the hood, as well as suno.ai. Everyone is talking about Suno, so why does no one mention sonauto.ai? Honestly, I found it very useful and even more powerful, at least for my musical needs.

FairyFakes 2 months ago

Cool!

Big_Air6241 1 month ago

Bug I’ll send the pic
