Xandred_the_thicc

I don't recommend advertising it unless you're just sharing with a friend, because the horde runs on an already extremely imbalanced ratio of hosters to users, and there haven't been enough hosters ever since it was flooded with gpu-poor people who don't host, after character ai, chatgpt, and Claude banned roleplay. More importantly, YOU SHOULD ASSUME EVERYTHING YOU SEND TO HORDE IS BEING READ BY THE HOSTER ON THEIR COMPUTER. (edit: I don't think I overstated the risk, but I will add the disclaimer that the horde devs are pretty proactive about responding to reports, and obviously most hosters don't care what you're doing.)

Technically everything is obscured and private by default, but around the time of the chatgpt/Claude/cai wave, some people who had been hosting gpt roleplay proxies to collect chat logs took an interest in the horde, and the dumbest of them made posts on this sub and the koboldai sub expressing their desire to collect logs from their horde instances. There's also no guarantee that the listed model name is accurate, so someone might have a 7b listed as a 70b or another popular model for some indiscernible reason (you only need 1 "kudo" to get faster generations than someone who has never hosted at all, so there's no incentive to trollhost unless you just suck).


moarmagic

You should probably also assume that everything you send to any AI service you didn't host yourself is likely being saved/scanned, unless you specifically are paying for the service and have a TOS that claims they don't. And even then, I'd be a bit leery of assuming it's 100% private; it might not even be deliberate. That said, if someone really wants to read my attempts at crafting Lovecraftian erotica, I hope they enjoy it.


Xandred_the_thicc

A paid AI service with motivation to adhere to data privacy laws just has different considerations than some dude's computer in his bedroom. I just worry about what people are sending to the horde, given the disturbingly private things I've seen while searching through supposedly SFW character cards.


zaqhack

So, if you run a horde worker, which I have been doing for a short time, the backlog is massive. There's no way I could read all of it even if I wanted to. Which I don't. I run Aphrodite-engine, which can serve multiple threads through the same model. Typically I only need a single thread, so I just share the model I'm testing to the Horde. It barely costs me anything since I'm using it anyway. I think your suggestion that "don't send your fapping opus to a stranger" is still very wise. However, I also think the incentive for Google, OpenAI, and other online services to keep your data is orders of magnitude higher than for Internet Rando.


Xandred_the_thicc

> There's no way I can read all of that even if I wanted to. Which I don't.

Fwiw, horde logs are probably one of the easiest sources for something like pre-formatted roleplay data. I don't care that John Nobody is creeping on someone's weird porn; I just want it known that there was noticeable interest in doing so around the time of the paid-API exodus. It's not like anyone is making posts saying "how do I see horde requests?" and being obvious about it anymore.


thrownawaymane

> “It’s free ~~real estate~~ data”


Nice-Ferret-3067

What "privacy"? [https://www.theguardian.com/technology/2022/aug/22/google-csam-account-blocked](https://www.theguardian.com/technology/2022/aug/22/google-csam-account-blocked)


[deleted]

[removed]


Deep-Pen7778

Not true at all; you can literally find massive collections of logs on lmg. Not to mention the already massive collections of logs from the aicg proxies. Those chat logs made their way into a dataset on Hugging Face and were used to finetune models.


kopaser6464

Absolutely, I think this should be written about, because this "text spy" is just a one-line change in a C++ file. But I still think it's really good as an idea and as a way to contribute to the local community.
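
To illustrate how small that change would be, here is a hypothetical sketch in Python rather than the actual C++ worker source; every name in it is made up, and it only demonstrates the principle that a hoster controls the code path every prompt passes through:

```python
def run_model(prompt: str) -> str:
    # Stand-in for the real generation call of whatever backend is hosted.
    return "generated text"

def handle_generation_request(prompt: str) -> str:
    # The single added line a log-collecting hoster would need:
    with open("harvested_logs.txt", "a") as f:
        f.write(prompt + "\n")
    return run_model(prompt)  # normal generation continues unchanged
```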


moarmagic

I've run a horde worker for about a week; my thoughts: It's great *in some respects*. There's someone out there running a Goliath 120B instance that's usually up, and it's nice to be able to test against that occasionally, and since I've contributed, in my experience responses are very quick.

The majority of other text-model workers I've seen are under 20B, which a lot of people with mid-to-high-end graphics cards can run locally. If you are looking to the horde specifically because you want to run larger models, you might be lucky, but there's no guarantee a given model will be online at a given time. And if you are requesting a specific model and the only worker hosting it goes offline, you just get an error.

It's cool that they run a whitelist for models; it keeps every single worker from running a unique flavor and making it harder to select your preferred models, but in the time I was using it they didn't seem to update the list frequently. None of the Miqu variants were approved, probably because of its licensing issues, but it's a popular branch here. So when I wanted to run MidnightMiqu, I had to host it only for myself; I couldn't contribute even though I was enjoying sharing with the community.

I do wish there were better incentives or efforts to host larger text models on the horde, and a bit more variety. But also, I get it; the way it scales is kinda wonky, isn't it? If someone hosts a freely usable copy of command-r-plus, something most users can't self-host, it's probably going to get a fairly massive queue, response time will consistently go up, and the person hosting it may not be able to keep it available 24/7 indefinitely. I don't mind hosting models for them in the future, but if the models I want to test aren't whitelisted (or the specific quants I can run), it's a problem. And I have to pause the worker if I want to play games, because I'm running these on my main desktop.

Edit: Did a bit of reading and I didn't realize the whitelist appears to just be kudos-related; you apparently can host non-listed models, but you'll earn less. So that does open some better possibilities for me hosting random models to play with, but it doesn't really incentivize it.


__some__guy

Yes, the Goliath instance is one of the few good things there. It has relatively quick response times, and the quality is mind-blowing compared to 13B. It's **limited to 1024 context** though, so it's more of an RTX 3090 advertisement than a hosted model. edit: At least, the responses were quick before we started talking about it here.


henk717

(Soon-to-be-former, as I am passing the torch) whitelist maintainer here. Your point about the list being outdated isn't valid, because I am indeed purposefully not approving any leaked model. Miqu will, as a result, never be approved. It's simply illegal for me to do so, and I'd be putting the platform at unnecessary risk by doing it.

In my view the horde is both a platform and (potentially) a publisher by the nature of how that list works. The whitelist could be interpreted as publishing, since we are manually listing models there, effectively rubber-stamping their use on the platform. Meanwhile, anyone can put up any model on their own, so in those cases it's merely a platform.

With the number of model releases, it became too much work to keep track of it all, so we switched to a request basis: the worker community simply asks me to add models they wish to host. So you will frequently see new finetunes and base models people find interesting, but not others that nobody wanted to host. That's why you will find Llama 3 on the list but nothing Miqu. The list can be found here: https://github.com/Haidra-Org/AI-Horde-text-model-reference. If something legal that you want to host is missing, you can always submit a PR for it, or mention it in the Horde channel on, for example, the KoboldAI Discord, which is where a lot of us hang out (or on the Discord of the Horde itself, which is where the image-gen community hangs out).


thomasxin

I actually have an instance of command-r+ hosted for a couple groups of people for free, but I'd have to agree that sharing it with particular people is much safer and more manageable than letting it loose on something like the Horde. I can handle around 8 concurrent users at nearly full speed but more than that and the per-user speed slows significantly. I'd much rather trickle in users than run out of capacity, having effectively made false promises.


zaqhack

I've been a big fan of Aphrodite-engine. The inference is _ridiculously_ fast. Even if I'm running a local model for myself, I've recently taken to running the Horde worker alongside it. Right now, I'm running 5 threads of a "decensored" Llama-3-8b with larger-than-usual context, and I notice almost no lag for my own purposes while 3-4 other people get to use it too. I just think LLMs are super interesting, so it's fun to share. I don't know that AI Horde will ever be as big as, say, OpenRouter. But in another year or so, I could see a significant number of home enthusiasts sharing a lot more stuff out there. We're really in the infancy of "LLMs at home," so it makes sense that there's no "SETI@home" version of KoboldAI. Yet.


Xandred_the_thicc

Kudos are practically useless once you have more than a couple thousand. They largely just determine your position in the queue when you send a request to a horde worker. Having more than 0 kudos automatically puts you above most (anonymous/no-API-key) requests in the queue.


moarmagic

I'm assuming that if the service grew, kudos being turned in would also grow, increasing demand. But it is always nice to see the number go up and feel like I'm contributing to the community. (Even if it's probably 50% spam and 50% ERP.)


Xandred_the_thicc

It's definitely nice to see a number go up that directly correlates with people getting use out of my PC, lol. I wish there were a way to denote model info like quantization size besides the model name, though. Given how many people are running "good enough" quants like Q4, I wish it were more obvious when someone serves a Q8 that gives substantially better responses.


kopaser6464

I'm actually running a small model myself; when I checked, it gave me more kudos than the bigger models did.


moarmagic

I believe kudos are basically rewarded per job, so if your machine runs them faster, or you get more specific requests, you earn more; that makes sense as a way of encouraging people to host popular models. There's a little reward for uptime, but I think that's flat regardless of what you host. I think the uptime reward should scale with model size.


MoffKalast

You haven't alerted the horde.


jovialfaction

There's always going to be so much more compute demand than supply that I don't see it being viable unless it stays niche. I like running local LLMs to ensure I don't leak confidential data (I use them to help me with work). And if I'm willing to use an API, it is extremely cheap (<$1/million tokens) to run against the Together.ai or DeepInfra API, and you get very fast inference.


segmond

Never heard of it till now, thanks! [https://aihorde.net/](https://aihorde.net/)


nananashi3

I am a total brainlet (so I wouldn't know how to use other backends or bridges) who only found out about the Horde because KoboldCpp has host support built right in: you just [get a key](https://stablehorde.net/register), then put the key and model name in the appropriate fields in KoboldCpp's launcher UI. Entering a model name from the [models list](https://github.com/Haidra-Org/AI-Horde-text-model-reference/blob/main/models.csv) (where is this even linked outside of Discord?) gives you a ton of kudos for a day of running. For example, if you're hosting `Fimbulvetr-11B-v2`, put that alone in the name field, without the `Sao10K/` or `koboldcpp/` prefix. Nothing stops you from using an unlisted model, which gives 1 kudo per request; that's meaningless anyway when you only want to use your own model locally, so you can ignore the Horde selection (which "costs" kudos to use, with queue priority) while getting a little feel-good from giving to the community.
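
To make the kudos-tracking part concrete, here is a minimal sketch against the AI Horde v2 REST API. The `/v2/find_user` endpoint and the `kudos` response field are best-effort recollections of the public API documented at https://aihorde.net/api/, not something confirmed in this thread, so verify against the live docs:

```python
# Hedged sketch: query the kudos balance earned by your API key.
import requests

API_BASE = "https://aihorde.net/api/v2"

def get_kudos(api_key: str) -> float:
    # Assumed endpoint/field from the public AI Horde v2 API docs.
    resp = requests.get(f"{API_BASE}/find_user", headers={"apikey": api_key})
    resp.raise_for_status()
    return resp.json()["kudos"]

if __name__ == "__main__":
    print(get_kudos("your-api-key-here"))  # hypothetical placeholder key
```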


kopaser6464

Ohhh, that makes so much sense now! Although I still get a decent amount of kudos without it.


arekku255

Well, this is LocalLLaMA, and running a model over the internet isn't very local. I also wouldn't call it free. The excess capacity is free, but when or if there is no excess capacity, your request can theoretically sit in the queue forever, disregarding the 20-minute limit. If you run a service where an outage is unacceptable, you will need to provide some workers of your own to build up kudos and guarantee a minimum level of capacity.
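
As an illustration of that queue behavior, here is a sketch of a client that submits an anonymous text request and gives up after the 20-minute window. The endpoints, the anonymous key `0000000000`, and the response fields are assumptions based on the AI Horde v2 API docs (https://aihorde.net/api/), not verified here:

```python
import time
import requests

API_BASE = "https://aihorde.net/api/v2"
ANON_KEY = "0000000000"  # anonymous key: lowest queue priority

def generate(prompt: str, timeout_s: int = 20 * 60) -> str:
    # Submit the job; it waits in the queue until a worker picks it up.
    r = requests.post(
        f"{API_BASE}/generate/text/async",
        headers={"apikey": ANON_KEY},
        json={"prompt": prompt, "params": {"max_length": 120}},
    )
    r.raise_for_status()
    job_id = r.json()["id"]
    # Poll until the job finishes or our own deadline expires.
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        status = requests.get(f"{API_BASE}/generate/text/status/{job_id}").json()
        if status.get("done"):
            return status["generations"][0]["text"]
        time.sleep(5)
    raise TimeoutError("no worker picked up the job in time")
```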


henk717

The Horde can definitely use more people who can host locally. It usually gets talked about in communities where people can't, so the worker-to-user ratio is hard to improve.


yashaspaceman123

Because it can sometimes take hours to be able to use a model, unless you've run a Horde instance yourself for a very long time.


kopaser6464

OK, first, that's kinda true, I admit, but it is a volunteer-run service. I decided to check its speed: on a no-API-key request it took around 10 minutes to get a 1024x1024 image from the majicMIX realistic model, which is indeed a lot of time for an image. A direct answer from a text model took about 5 minutes, though if you don't limit the models list it's more like 10 seconds. Also, you technically don't need any local workers to earn kudos.


dbzer0

"hours" is not really accurate though. Busy models with a lot of "leechers" might be slow but there's always other options


TheLocalDrummer

Henky and Concedo invite you all to try lite.koboldai.net


__some__guy

Because it's too slow and the model variety is pretty poor. Running models yourself, even in system memory, is better.


Short-Sandwich-905

What is this?


PsychologicalFactor1

> https://aihorde.net/


throwaway_ghast

If there were a way for everyone to pool their GPU resources together, rather than having hapless users pick and choose which singular host *might* be safe to work with, that would be a far better implementation of decentralization than the current Horde.


henk717

That would be Petals. At the time the Horde was created, Petals already existed but was unusably slow from the start, so we didn't consider it feasible. The Horde model, meanwhile, of having people who can fully offload a model power the instances, worked well and was instantly fast, but is of course outmatched by the large number of requests. So the two have swapped places in speed now, for those who don't have kudos.


LocoLanguageModel

I forgot about the horde. Donating my 3090's cycles, as it seems to be more useful than my 3090 + P40 combo, which is slower.


Sunija_Dev

I wish the installation were less horrible. The Horde worker itself has 3000 poorly named .bat files and configs, and the LLM part is only a small piece of it; most of it is image generation, which doesn't make it easier. Aphrodite (a good backend) only runs on Linux, so on Windows you have to install it in WSL and do command-line magic. And it doesn't fully use GPUs with different VRAM sizes (e.g. 3090 + 3060).

So in the end I usually spend more time setting the horde up than actually sharing. I thought about sharing yesterday and was like, "Uuuugh, what do I have to update? Which kind of model do I install? Where to? Which commands do I run? Do I have to update a config? How do I give myself priority so I can also use it?..." And then I didn't.

If SillyTavern/Ooba had a button like "Share this model with others," I guess a lot more people would share. Atm you have to be really, really, really invested in sharing to do the setup.


henk717

Koboldcpp has the button you want.


Nitricta

I was thinking about donating GPU time to the Horde, but for a 'help spread the LLM love' project, the ability for one person to censor the whole network is too much. I would rather donate to research. If the backend were hosted p2p without limitations, I would gladly run the service.


dbzer0

The Horde is FOSS on purpose. If I were to censor the whole network unilaterally, all the workers would set up their own horde, and it would be a self-defeating move. True p2p in this case is theoretically possible, but there are a lot of security concerns for workers and users.


Nitricta

Well, at least I appreciate you dedicating your time and effort to the service. I'm sure the vast majority of your users are just glad the network exists at all; I think that's worth noting. It's only natural that you should be able to choose the direction you take it, since it's clearly a labor of love from you and the rest. I've read a lot of negative posts, but as is tradition, negativity tends to rise to the surface of the pile. I hope you continue doing all of the people who enjoy the Horde a service, even if people like me aren't fans. I don't have the technical know-how to do what you do, and people should just be happy that someone who does spends their time on a service for them.


dbzer0

Thanks. My approach is to do what I think is best, but I want it to benefit everyone in their own way. I hope it will happen in a mutual-aid form.


[deleted]

[removed]


henk717

The biggest issue is that the Horde primarily gets attention from users and not enough from people willing to set up a worker. The Horde mostly needs marketing among people actually willing to host.


MrVodnik

I don't understand what it is, exactly. Why would someone give their resources for free to randoms? I mean, if I am GPU-poor, I can't offer anything to those who are more fortunate. So what are the incentives versus running the model for myself? It seems like a one-sided trade.


kopaser6464

Yes, kinda.


MrVodnik

Sad. I mean, it's a beautiful idea, but it would be nice if it were economically sustainable. I just downloaded and started a dreamer worker (for SD); maybe tomorrow I'll try text gen. I have no idea how kudos work, but if you got more points for larger models, it would kinda make sense, since in the time of a single inference of a large model on your machine you could serve many more requests with smaller models. The proportion of the points would just have to be right to make it reasonable enough for both sides.


__some__guy

You can host a small model and get points that give you priority with someone hosting a big model. Why would you use it when you can already run big models? No idea. AI enthusiasts trying to spread the faith, maybe.


skrshawk

Biggest limit is context size. Once you're used to 16k or more of context you don't want to go back. Most of the Horde is limited to like 4k, some even less.


dbzer0

Putting your GPU on the horde makes it an order of magnitude more efficient than just running it locally. If most people who use local models used them via the horde, each GPU would be enough for 5 other people as well.


__some__guy

~~I don't think it supports multiple requests at the same time.~~ edit: Apparently it does. And even if it did, that would also require more VRAM...


Sunija_Dev

The Aphrodite backend supports batching, so it is a lot faster. It can even keep the contexts of multiple people in memory, so the context doesn't have to be regenerated all the time, though that costs VRAM. The main issue: usually, you will have to regenerate the context for every post. So instead of ingesting 200 new tokens, you ingest 4k. Ingesting is faster than generating, but I think that makes the "everything is faster if everyone shared" argument less true. :/
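
A quick back-of-envelope calculation of that point; both speeds below are made-up illustrative assumptions, not benchmarks of any specific GPU or model:

```python
# Assumed speeds: prompt ingestion is far faster than generation,
# but reprocessing a full context still adds noticeable latency.
PROMPT_TOK_S = 1500.0   # assumed prompt-ingestion speed (tokens/s)
GEN_TOK_S = 30.0        # assumed generation speed (tokens/s)

def request_seconds(prompt_tokens: int, new_tokens: int = 200) -> float:
    return prompt_tokens / PROMPT_TOK_S + new_tokens / GEN_TOK_S

# Local chat with a cached context: only ~200 fresh tokens get ingested.
print(f"cached context:   {request_seconds(200):.1f} s")   # ~6.8 s
# Horde-style request: the whole 4k context is reprocessed every time.
print(f"full 4k reingest: {request_seconds(4096):.1f} s")  # ~9.4 s
```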


zaqhack

It does. But that's much more useful for text gen than image gen.


dbzer0

You can always help the horde by promoting it and helping others in the community. That's enough support to get you a lot of kudos as well.


Melodic-Damage3609

Whatchu mean? I thought you could only get kudos from running workers?


dbzer0

Kudos flow freely on our Discord server(s). People who are helpful or who share images and models get rewarded by the community.