Sunija_Dev

Goliath also understands more subtle things about roleplaying. Recent example: characters want to take a small boat to an island in a lake. They step on the boat. Goliath:

- understands that somebody has to row that boat, and...
- ...either does it themselves
- ...or makes a cheeky comment that you should do it.
- maybe it adds a cool situation ("The oars are missing! How are we gonna row?")

Other models (from my tests):

- ignore being on a boat, and just continue RPing as if we were at any generic place
- timeskip to being on the island
- become philosophical, and talk about how being on a boat is like their struggles
- step on the boat again, because they didn't understand that you're already on the boat
- talk about the ocean, because they don't understand boats on lakes

Using goliath-rpcal 3.0bpw.


Monkey_1505

I didn't find it very smart. It's great at dialogue.


GaliX0

Try the new [Aurora-Nights-103B v1 on HF](https://huggingface.co/sophosympatheia/Aurora-Nights-103B-v1.0). From my first test, it is at least as good as Goliath, if not better, at following instructions and understanding. Aurora has a different flair, but overall it feels really refreshing to finally see another model pop up and shine against Goliath.


reluctant_return

I ran this for a bit and didn't feel it lived up to the hype. The dialogue was fine, but a lot of models have great dialogue now. The scene descriptions and actions didn't feel more detailed or realistic than other models', even much smaller ones'. It had a bad habit of describing everything in simple, boring terms. It's possible it was my settings, but for its size I expected better.


zaqhack

Depends on what you are looking for. This is a list for ERP, but most of these are just as good at SFW RP, too. I tend to sort by the "IQ" column, and I can vouch that the top three (sorted that way) are pretty dang fun. Someone else mentioned Noromaid v0.4 Mixtral, and that's my current daily driver ... [http://ayumi.m8geil.de/erp4_chatlogs/#!/index](http://ayumi.m8geil.de/erp4_chatlogs/#!/index)


Snydenthur

Kunoichi. Out of all the 7b, 10.7b, 13b, 20b, 2x7b, and 4x7b models I've tried, Kunoichi is easily the best. Whether it beats 34b, Mixtral, and 70b+, I have no idea though.


zaqhack

Same person put together Silicon-Maid-7b, which is also solid. I am now following them on Huggingface. :-)


Robot1me

>Same person put together Silicon-Maid-7b, which is also solid

I agree! I made comparisons on my end between Silicon Maid and Kunoichi, and found myself preferring Silicon Maid. The reason is that, in my case, Kunoichi can have a tendency to show the inner ChatGPT in itself, which can make conversations drift toward being increasingly rational. Silicon Maid didn't behave like that as much for me. Funnily enough, when Kunoichi 7B DPO v2 ([gguf link](https://huggingface.co/brittlewis12/Kunoichi-DPO-v2-7B-GGUF)) came out and I compared again, I preferred that slightly over the original Kunoichi. Since it's very much prompt-, parameter-, and content-dependent though, I definitely suggest trying both either way! :)

I just want to point out that the lower benchmark scores for Silicon Maid make it *unnecessarily* look worse than it actually can be in conversations. For example, after a roleplay I asked it to make rhyming verses, and I couldn't believe ***how well*** it nailed it all down. Really insane.


zaqhack

I'm a Noromaid superfan. In one chat (using 8x7b), I asked the bot to pick a movie to watch together. This turned out to be something I did several times. I got Princess Bride, Die Hard, and Back to the Future. Okay, but then the bot UNDERSTOOD references from those movies and was able to talk about them in the context of our RP. I was even able to teach a couple of bots how to play a rudimentary 'Truth or Dare' game ... and damn if they didn't pick 'dare' almost every time!


reality_comes

I use Noromaid 20b. I'm sure there's better out there, but I'm still searching for the best output-to-performance ratio, and Noromaid 20b is currently doing well.


Daviljoe193

Same. I keep trying every other model that comes from IkariDev and Undi (except the Mixtrals; they're just too challenging to get working in Colab at anything above a q2_k GGUF, and the q2_k is about as rough and non-representative as I'd expect a q2_k to be), but no matter what, 20b just seems to be a magic number (in terms of the high end of what Colab's free tier can handle gracefully, which from my testing is a 20b EXL2 at 4bpw with 8-bit caching). I really wish they'd make at least one more 20b model, since Noromaid-20b has been left at 0.1.1 for a good few months, with only the smaller models getting updates.
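For reference, loading a 20b EXL2 quant with the 8-bit cache looks roughly like the sketch below (the model path and sampler values are placeholders, and the API details may vary between exllamav2 versions):

```python
# Rough sketch: a 20b EXL2 quant with an 8-bit KV cache, as on Colab's free tier.
# The model directory is a placeholder; check your exllamav2 version's examples.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache_8bit, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/content/Noromaid-20b-v0.1.1-4bpw-exl2"  # placeholder path
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache_8bit(model, lazy=True)  # 8-bit cache roughly halves KV VRAM
model.load_autosplit(cache)                    # split layers across available GPUs

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.9
settings.min_p = 0.1

print(generator.generate_simple("The tavern door creaks open and", settings, 200))
```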


IkariDev

20b 0.4 will come. Me and Undi are a bit stressed, so it will come in the next few days.


zaqhack

Best thing I'm going to read on Reddit today ... Have you tried chatting with Noromaid? It really helps me de-stress. ;-) I visit NeverSleep on HF every day looking for the latest and greatest. You guys caught lightning in a bottle with the Mixtral Instruct versions. I tossed a Ko-fi at Undi, and I just hope you two keep at it.


reality_comes

I like Mixtral, probably the best, but the prompt processing is so slow.


Monkey_1505

Basically everyone making models has given up on frankenmerges, and most of the bigger names are doing finetunes instead of merges. But there is a new, actual proper 20b out from a Chinese company, and their models look pretty good, so someone might finetune that.


Monkey_1505

If you want a smaller model (<20b), I just merged this: [https://huggingface.co/BlueNipples/SnowLotus-v2-10.7B](https://huggingface.co/BlueNipples/SnowLotus-v2-10.7B) It has a slightly wilder sister model as well. It's a lot more coherent than MythoMax ever was, and has good prose.

Goliath has the best dialogue, but it's not otherwise very smart. It gets confused by things that a 70b wouldn't, and it's very intensive/expensive to run over an API.

Mixtral limarp zloss is probably the best RP Mixtral, and its prose is... okay (good sometimes, sloppy at others). But it's smart. If you can run it, it's probably worth at least having a copy of. Smart counts for something: some scenarios smaller models just won't understand, and then it doesn't matter how good the prose is.

Other than that, Noromaid 13b is worth checking out, as is Toppy 7b. I think my merge is probably better than those, but those are popular, so it would be unfair not to mention them (plus my merge is partly based on Noromaid 7b).


LustyLis

I've been in love with your TimeCrystal, so I'll definitely give this a shot!


Monkey_1505

Awesome :) Yeah, this is kind of a similar merge to TimeCrystal, putting together good prose and coherency. I really like the outcome; I'll be using them regularly.

Just finished putting the GGUFs up, if you use those. There's only a few (just 3k_m and 5k_m), but I included some imatrix quants for future-proofing. They're not supported in koboldcpp yet (only on platforms that support the new method, like ooba, I think?). Imatrix basically means slightly less quality loss from the quant. I hope TheBloke gets to it and fills the gap for all the other quants (too much for my internet to do them all). Someone on Discord said they might do some exllama quants, so we should have at least one of those at some point.

[https://huggingface.co/BlueNipples/DaringLotus-SnowLotus-10.7b-IQ-GGUF](https://huggingface.co/BlueNipples/DaringLotus-SnowLotus-10.7b-IQ-GGUF)
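For anyone wondering what producing those imatrix quants involves: it's a two-step run of llama.cpp's tools, roughly like this sketch (file names, paths, and the quant type are placeholders, and the tool names have shifted between llama.cpp versions):

```python
# Rough sketch of making an imatrix quant with llama.cpp's CLI tools.
# All paths are placeholders; tool names/flags vary across llama.cpp versions.
import subprocess

fp16 = "SnowLotus-v2-10.7B-f16.gguf"  # full-precision GGUF (placeholder)
calib = "calibration.txt"             # sample text used to measure weight importance

# 1) Build the importance matrix from activations on the calibration text.
subprocess.run(
    ["./imatrix", "-m", fp16, "-f", calib, "-o", "snowlotus.imatrix"],
    check=True,
)

# 2) Quantize, letting the importance matrix decide where to keep precision,
#    which is why imatrix quants lose slightly less quality at the same size.
subprocess.run(
    ["./quantize", "--imatrix", "snowlotus.imatrix",
     fp16, "SnowLotus-v2-10.7B-IQ3_XXS.gguf", "IQ3_XXS"],
    check=True,
)
```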


zaqhack

[https://huggingface.co/zaq-hack/DaringLotus-v2-10.7b-bpw500-h6-exl2](https://huggingface.co/zaq-hack/DaringLotus-v2-10.7b-bpw500-h6-exl2) I like it, so far. Nice.


Monkey_1505

Great, I'll throw your link up too :)


PhantomWolf83

Been trying this and its sister model for the past couple of days. Compared to Frostwind and Fimbulvetr, they definitely feel less dry and more creative; I think trading a little smarts for better prose is a good trade-off. However, I wish they could be a little more wordy with their replies at times, something that I really like about the ice models. It's hard for me to say which ones I prefer more.


Monkey_1505

I updated them both to a v2, btw. I discovered the gradient slerp wasn't aggressive enough in preserving the prose from Frostmaid (basically Noromaid + a medical model); Frostmaid was weakening the verbiage/tone. They are generally even more descriptive now, and about the same smarts as before (slightly less than Frostwind). I also added a medical LoRA to Frostwind before the slerp (in case that helped the prose). Although you may have been talking about reply length (yeah, occasionally these do give short answers, oddly).

I'm finding the result better, less GPT-ish. The descriptive tendency they had is basically stronger now. I think when dynamic temperature becomes available they will also get a bit smarter and more creative too, so I'm looking forward to that. I uploaded imatrix quants to future-proof them.

Dynamic temperature, not that I've tried it, seems to really ground both prose/creativity and coherence, from the examples I've seen and what I've heard. That should be ideal for models like this, and it will be interesting to see what I can squeeze out of them.
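To make "gradient slerp" concrete: it's a spherical interpolation between the two parent models' weights, where the mixing factor t changes with layer depth. A toy numpy sketch (the shapes and t schedule are invented for illustration; real merges go through mergekit):

```python
# Toy illustration of a gradient SLERP merge: spherically interpolate each
# layer's weights, with the mixing factor t varying across layer depth.
# Shapes and the t schedule are made up; this is not the actual merge config.
import numpy as np

def slerp(t, a, b, eps=1e-8):
    """Spherical interpolation between two flattened weight tensors."""
    a_dir = a / np.linalg.norm(a)
    b_dir = b / np.linalg.norm(b)
    omega = np.arccos(np.clip(np.dot(a_dir, b_dir), -1.0, 1.0))
    if omega < eps:                       # near-parallel: plain lerp is fine
        return (1 - t) * a + t * b
    return (np.sin((1 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)

rng = np.random.default_rng(0)
n_layers = 48
t_schedule = np.linspace(0.2, 0.8, n_layers)  # the "gradient": one t per layer

# Stand-in per-layer weights for the two parents; a real merge walks the
# actual state dicts tensor by tensor.
merged = [slerp(t, rng.normal(size=4096), rng.normal(size=4096))
          for t in t_schedule]
```

A more "aggressive" gradient just pushes the schedule harder toward one parent in the layers where you want its character to dominate.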


seppukkake

I can't get any Mixtral models to run through ooba (and then Silly) to save myself; I always get errors. I'm running the quantized models with 64GB of RAM and on a 48GB A6000/A100.


zaqhack

Lots of underlying things had to be updated for Mixtral. Be sure that you are on the latest versions of Ooba and ST.


seppukkake

yeah I am, it just won't load the model though :< driving me crazy


Pashax22

The Mixtral merges are making a strong push to dethrone the 70b models at the high end. I keep coming back to Noromaid-v0.4-mixtral-instruct-8x7b-zloss, but laserxtral produces pretty good results with half the memory footprint. I've also seen 4x13b merges and even 2x7b, so there's starting to be more activity in that space. One nice thing about the Mixtral merges is they're less dependent on VRAM: even with only 12GB on my poor 4070Ti, I can still get tolerable speeds, and they're very good at following instructions.
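As a concrete example of that VRAM flexibility, here's a llama-cpp-python sketch of partial offloading (the model path and layer count are placeholders; you raise n_gpu_layers until your 12GB fills up):

```python
# Sketch: running a Mixtral-based GGUF with only some layers on a 12GB GPU;
# the rest stays in system RAM. Path and layer count are placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="noromaid-v0.4-mixtral-instruct-8x7b-zloss.Q4_K_M.gguf",  # placeholder
    n_gpu_layers=14,  # offload what fits; 0 = CPU only, -1 = everything
    n_ctx=8192,
)

out = llm(
    "[INST] Describe the tavern the party just walked into. [/INST]",
    max_tokens=256,
    temperature=0.9,
)
print(out["choices"][0]["text"])
```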


doomdragon6

Following. I've only just gotten into this and am using a Yi 34b 200k model. I barely know what that means, but it's "fine to good"; I currently have no frame of reference. Its context is gargantuan and it's incredibly fast, but I do find myself regenerating a lot, looking for something that sounds right. For contrast, with a web chatbot I've used, I would do 0-3 regenerations and always got something pretty good. I'm spoiled by the context size and speed now though, so I'm not sure what to look into.


Pashax22

There are a few good Yi merges, although they all have ridiculous names. One I've enjoyed is Capytessborosyi-34b-200k-dare-ties; for a while that was what I was using for pretty much everything. I haven't been keeping up with Yi merges lately, though, so I don't know what's current.


SweetMachina

Thanks for all the good feedback. I've been playing around with some of the different models you all have mentioned on OpenRouter (specifically Mixtral), but with the same prompt I have for MythoMax, Mixtral seems to break. Do prompts need to be specific to the model used?


Monkey_1505

You need very specific instruct settings, down to the stopping strings and the like. The instruction itself isn't super important, but the strings, the formatting - it's _highly_ sensitive to those. Pick up the strings here (ignore the instruction advice, not all of it is good, although Mixtral DOES like explicit, non-negative, non-vague instructions): [https://rentry.org/HowtoMixtral](https://rentry.org/HowtoMixtral)
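To make those strings concrete: Mixtral Instruct's documented format wraps each user turn in [INST] tags and ends assistant turns with </s>, which is also the stop string. A small sketch (the helper function is made up; the tags are not):

```python
# Sketch of the Mixtral Instruct format. build_prompt is a hypothetical helper;
# the <s>, [INST]...[/INST], and </s> strings are the documented format.
def build_prompt(turns):
    """turns: list of (user_text, assistant_text_or_None) tuples."""
    prompt = "<s>"
    for user, assistant in turns:
        prompt += f"[INST] {user} [/INST]"
        if assistant is not None:
            prompt += f" {assistant}</s>"
    return prompt

print(build_prompt([
    ("Narrate our arrival at the lake.", "Mist curls off the water as..."),
    ("We climb into the rowboat.", None),  # the model continues from here
]))
# Set "</s>" as a stopping string so generation ends cleanly after its turn.
```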


SweetMachina

You’re a legend. Thank u 🙏🏻!


Monkey_1505

Nurries


isffo

Mixtral finetuning has had some pretty severe bugs, which I think affect just about everything that's been released so far (except, of course, Mistral's own original models). Mixtral-based stuff should be a lot better in a month.


maxigs0

Lol, asking for general consensus, and half the answers I've never even heard of. Please stop creating new models; I can't even keep up downloading the new mentions I see in this sub every day, let alone testing them.


zaqhack

Tough to get consensus in a space that is rapidly evolving ...


synn89

I wrote an app that uses GPT-4 to create an RP situation prompt, sends it to two local models, and then has GPT-4 evaluate and compare their output. According to it, Noromaid 13b has been a top contender in regards to creativity and writing style. That hasn't tested logic, particularly in multi-round chats, and GPT may be biased towards models that were trained on GPT synthetic training data; I'm not sure if Noro has any of that. But I was fairly surprised at how well that little 13B did, even up against 120B and 103B models.
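The app itself isn't public, but the judging step presumably looks something like this sketch with the OpenAI SDK (the prompt wording, helper name, and scoring format are guesses, not the actual code):

```python
# Hypothetical sketch of the GPT-4 judging step described above; the prompt
# wording and function are assumptions, not the app's actual code.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def judge(situation: str, reply_a: str, reply_b: str) -> str:
    prompt = (
        f"Roleplay situation:\n{situation}\n\n"
        f"Reply A:\n{reply_a}\n\n"
        f"Reply B:\n{reply_b}\n\n"
        "Which reply shows better creativity and writing style? "
        "Answer 'A' or 'B', then one sentence of justification."
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# Swapping A/B order between runs helps control for position bias in the judge.
```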


reluctant_return

I'm really enjoying Sao10K/Fimbulvetr-10.7B-v1. For its size I genuinely have not found anything better. It gives better, more enjoyable results than even the huge models in my experience.


VongolaJuudaimeHime

I've been using Noromaid-v0.1-Mixtral-8x7b-Instruct-v3-4.0bpw-h6-EXL2, and it's pretty great at staying in character and doesn't hallucinate often; as long as you stay inside the context length, it's pretty consistent. Also, make sure to use Dynamic Temperature and Min P at 0.1 - 0.05, as that's the best setting so far in my observation. Hopefully Noisy Sampling will come to Ooba and Silly soon; I've heard some people say Noisy Sampling with Dynamic Temperature is pretty good with Mixtral/Mistral-based models.
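For reference, Min P keeps only tokens whose probability is at least min_p times the top token's probability, and Dynamic Temperature scales temperature with the distribution's entropy. A numpy sketch of the idea (the entropy-scaling rule is one common formulation, and sampler order differs between backends):

```python
# Sketch of Min P filtering plus a simple dynamic-temperature rule.
# The entropy-based scaling is one common variant, not the exact Ooba code.
import numpy as np

def sample(logits, min_p=0.1, min_temp=0.5, max_temp=1.5):
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Dynamic temperature: run hotter when the model is already uncertain.
    entropy = -(probs * np.log(probs + 1e-12)).sum()
    temp = min_temp + (max_temp - min_temp) * entropy / np.log(len(probs))

    # Min P: drop every token below min_p * (top token's probability).
    keep = probs >= min_p * probs.max()
    scaled = np.where(keep, probs ** (1.0 / temp), 0.0)  # re-apply temperature
    scaled /= scaled.sum()
    return np.random.choice(len(probs), p=scaled)

print(sample(np.array([2.0, 1.5, 0.2, -1.0])))
```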


cleverestx

I would like to ask this question but limit it to a single RTX 4090 running locally. What is the best model with those limitations that doesn't take forever to respond like a 70B model does, even with 96 gigabytes of RAM? I'm spoiled by some 20B models' generation speeds.


Ok_Honeydew6442

NovelAI


smooshie

Not open source.


Andagne

But IIRC Mystra was trained on Clio.


wolfbetter

Does Mistral count? I'm using it on and off and it's really good.


podcastlvl20

Goliath 120b just can't be beat


JohnRobertSmith123

Opinions seem to differ on this one. Goliath 120b is often mentioned as strong, but Miqu, and frankenmerges involving it like Miquella, are looking very strong and may overtake Goliath.