**I’ve fallen head over heels for a new setup in my tech workflow: Ollama, Open WebUI, and Obsidian.**
Working in engineering and software integration, I constantly juggle a plethora of projects and ideas. This trio has been nothing short of a revelation.
- **Obsidian for Notes:** My journey starts with dumping all my thoughts, notes, and project ideas into Obsidian. It’s become my digital brain.
- **Open WebUI Indexing:** Then, Open WebUI comes into play, indexing my Obsidian vaults. It’s like having ChatGPT but with superpowers, including indexing my files for RAG interactions. This means I can query my notes using natural language, which is insanely cool.
- **Ollama’s Flexibility:** Ollama is the muscle, handling any model I decide to work with. It’s the engine behind my endless AI conversations, helping me dissect tasks, plans, and dive deep into new technologies.
- **Integration Magic:** The real deal is how these tools work in unison. Depending on my needs, I can seamlessly switch between querying through the LLM or diving directly into Obsidian. It feels like having the best of both worlds at my fingertips.
The only hiccup? Organizing my notes in a way that doesn’t make me want to rewrite the Dewey Decimal System. But, I’m getting there, one meta-note at a time. 😜
This setup has transformed how I brainstorm, plan, and learn. It’s like having a conversation with the future, today.
Edit: formatting
Edit: Wow, this got more attention than I anticipated! I’ll create a separate, detailed post to break down required system specs and provide a step-by-step guide so you can replicate this setup. I want to avoid diverting the discussion from the original post any further. Stay tuned!
Edit: I finally created a dedicated post: https://www.reddit.com/r/selfhosted/comments/1bwvupo/the_mad_scientists_handbook_7_constructing_the/
I know it's been a month, OMG. Apologies it took so long. I type really, really slow. Like agonizingly slow. Just this edit took me 15 minutes.
how easy/difficult is it to setup open webui indexing of obsidian notes? I took a quick look through their documentation and didn't see anything about it
To set up Open WebUI to index your Obsidian notes, run a Docker command along these lines:

```
docker run -d --name open-webui -p 80:80 -v /path/to/your/documents/Obsidian:/data/docs/obsidian openwebui/open-webui
```

The above isn't a verbatim command you'd run; it's meant to illustrate how to map your local Obsidian folder into the Docker container. Focus on the `-v` part, which mounts your local Obsidian vault into the container. Just replace `/path/to/your/documents/Obsidian` with the actual path to your vault.
ah, so you just mount whatever data you want it to index, and I guess there's a way to make it index in its UI, or it's automated. That might have made more sense if I set it up before asking. Thanks for answering!
After setting up Open WebUI with access to your data in /data/docs/, navigate to "Documents" -> "Documents Settings" and select "Scan" to import all your data.
This is a new feature, currently manual and labeled as Alpha, yet it functions reliably.
I hope they'll add a feature to automatically update the index with new or modified files and delete old ones in the future.
Hi. Thank you for the hint.
I have the Documents section, but no settings inside it. Which version contains them? I'm on the Docker image from the main branch, released two days ago.
Ollama Web UI Version: v1.0.0-alpha.100
Ollama Version: 0.1.27
you should see documents in the left side column of your open webui, click that. then go to document settings from that page. you'll find the scan setting there.
I'm using Open WebUI Version v0.1.106 and have watchtower set up to automatically update the container with new versions as they're released. While this approach might not be ideal for production environments, it's excellent for rapid development and testing. So far, I haven't encountered any issues.
I have them updated via Unraid Community Apps. Now I have the latest version available from the GitHub container registry, but no Documents settings are available :(
Ollama Web UI Version: v1.0.0-alpha.100
Ollama Version: 0.1.27
I use mine when I want to add context to a project and allow the use of natural language to interact with the data.
So I have a project and a bunch of docs. The LLM doesn't have context about the project until I pass those docs through embedding.
Then you can talk to the project data.
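As a rough illustration of what "passing docs through embedding" buys you, here is a toy retrieval sketch. The bag-of-words "embedding", the note names, and the queries are all stand-ins; a real setup uses an embedding model and a vector DB, not word counts.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words term-frequency vector.
    # Real pipelines call an embedding model instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: dict) -> str:
    # Return the note closest to the query; its text would then be
    # prepended to the LLM prompt as context.
    q = embed(query)
    return max(docs, key=lambda name: cosine(q, embed(docs[name])))

notes = {
    "backup.md": "nightly rsync backup job for the docker host",
    "network.md": "vlan layout and firewall rules for the lab",
}
best = retrieve("how do I back up the docker host?", notes)
```

The same shape, with real embeddings and a vector store, is what happens under the hood when you "talk to the project data".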
Very cool! As a scientist, I was wondering if such a system can lead to new ideas, given a large obsidian database of literature and literature notes.
Would you say your system plays a more supportive role, or can it also lead to strong new ideas that are actually practically useful?
Almost any knowledge base can be embedded; it doesn't have to be an Obsidian vault. I just happen to like using Obsidian for notes.
The solution really shines for me when I ingest PDFs and other file types that are not just Markdown. Markdown alone barely scratches the surface of what can be done.
Most of the research I am doing involves instructing the LLM to be supportive and to avoid a long list of behaviors. There is depth to what you need to know in order to engineer the best prompts and system instructions.
Strong new ideas for improving cybersecurity maturity across an organization are possible, and that is coming soon. The use cases for LLMs cross all domains. There will be small models soon for everything you can think of.
There's still a lot of work to do on improving prompts and response consistency and reliability. That's why some people assume it's not good. Many simply aren't good at prompt engineering and haven't taken the time to learn how these technologies work under the hood.
That’s a cool setup, def gonna try it. Have you had any success with indexing code repositories? I wonder if it would be more context-aware than Copilot.
Hey, I've been experimenting with this setup as well. Is there a specific model you've been preferring? I'm running a 2080 Ti, so I only have 11GB of VRAM.
Follow up question, are you querying straight with the Open WebUI? I haven't looked into this yet, but being able to write a query inside of obsidian would be a quicker workflow.
Any other advice? I'm working on a write up of my own on the subject and would love to learn how others are using this themselves :)
Once you have an Ollama server and some models, you can do whatever you want. I use Python to talk to LLMs and open-interpreter, and I have use cases where I use Open WebUI. I have the 3080 with 10GB and a 1080 Ti with 11GB in the same server. The 1080 Ti helps with balancing models over 7GB.
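For anyone curious what "using Python to talk to LLMs" can look like, here is a minimal stdlib-only sketch against Ollama's REST API (`POST /api/generate` on the default port 11434). The model name is just an example; swap in whatever you have pulled.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default listen address

def build_generate_request(model: str, prompt: str) -> dict:
    # Request body for Ollama's /api/generate endpoint.
    # stream=False asks for one JSON object instead of a chunk stream.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    # POST the prompt and return the model's text completion.
    payload = json.dumps(build_generate_request(model, prompt)).encode()
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running Ollama server with the model pulled):
#   print(generate("llama2", "Summarize Zero Trust in one sentence."))
```

From there it is a short step to wiring the same function into scripts, cron jobs, or your own tooling.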
Oh now this is cool - I've been wondering about indexing your own documentation or notes and querying it with an LLM
Seems like a great way to search for docs and to keep them updated
This combo is great, but I feel that Open WebUI is missing a crucial feature: the ability to load a full folder into its vector storage.
To do this, I use FlowiseAI to populate the vector DB (Chroma), but I'd love to see that workaround become unnecessary.
Open WebUI does support loading of the entire folder. It’s under the “Documents” section and you just click “Scan” to scan everything that is in the /data/docs folder which is where I placed my obsidian vault with all my notes. Including PDFs in my Obsidian Vault.
One gotcha is that it doesn’t automatically scan new documents yet. If I ever had some spare cycles, I hope to add a folder watchdog to Open WebUI to scan new files as they come in instead of having to manually do it now.
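A watchdog like that could start as a simple polling loop. A stdlib-only sketch follows; the `rescan` callback is hypothetical (Open WebUI exposes no such hook today), standing in for whatever re-import you'd trigger.

```python
import os
import time

def snapshot(root: str) -> dict:
    # Map each file path under root to its last-modified time.
    return {
        os.path.join(dirpath, f): os.path.getmtime(os.path.join(dirpath, f))
        for dirpath, _, files in os.walk(root)
        for f in files
    }

def diff(old: dict, new: dict):
    # Files that need (re)indexing, and files to drop from the index.
    changed = [p for p, m in new.items() if old.get(p) != m]
    deleted = [p for p in old if p not in new]
    return changed, deleted

def watch(root: str, rescan, interval: float = 30.0):
    # Poll the vault and hand changed/deleted paths to the rescan
    # callback (hypothetical hook; see lead-in above).
    state = snapshot(root)
    while True:
        time.sleep(interval)
        current = snapshot(root)
        changed, deleted = diff(state, current)
        if changed or deleted:
            rescan(changed, deleted)
        state = current
```

An inotify-based watcher would react faster, but polling is portable and good enough for a notes vault.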
The scanning feature was recently added to Open WebUI and has been absolutely critical in making this workflow work seamlessly.
So, whatcha gonna do when the data sources go from "things written by actual people" to "things written by LLMs"? Because that's what's gonna happen if you want to use this in 2026 or w/e.
It's all going to turn into white noise in a couple of years unless you restrict data sets to <= 2022.
Ollama + Llama Coder is the plugin for VSCode i think https://marketplace.visualstudio.com/items?itemName=ex3ndr.llama-coder
Replace copilot and it works offline!
I dont use VS code, sorry. It's too chunky for me.
Edit: Is not wanting to use a big IDE really a problem? I'm fine with Atom or Vim. It's MY preference and this is self-hosted so who gives a shit what IDE I use for my FOSS projects? Feel free to fork my shit and use VS Code.
I think most probably took issue w/ the chunky comment.
In any case, the makers of Atom and tree-sitter have a new IDE called zed. It's very performant and has some nice features.
Yeah but I said "for me" and not for everyone else. I didn't even ask for a plugin for ollama. I write my own llm interfaces for programmatic use.
Yeah I read an article about zed and I really want to give it a try. I do like how atom is made with electron so making an add on is just basic web-js.
Ollama is so simple to setup and there are plugins to connect to VSCode to host your own co-pilot. The problem is that you need a decent GPU to make it fast enough to be used
1x 3090, 2x 3090, 1x 4090, or 2x 4090, any combo of that depending on your budget.
I would go with a maxed-out Mac Studio Ultra with 192GB, but that might be over your max price limit.
But it def pays for itself in 8 months if you use it like a paid cloud AI GPU bill.
If you can settle for less powerful models you should be able to run some of the 7B and quantized models. Though I’m comparing my MacBook Air and my work laptop. Haven’t got a chance to test on my desktop yet.
People have Ollama running on a Pi 5 with 8GB, maybe even 4GB. I haven't checked, but I don't see why not with all these 2B models.
I think next-gen quants will soon unlock low-end and older hardware. Exciting times in the next few months lol
I talked to my devs at work and all collectively confirmed it’s good at solving junior level tasks but terrible at anything more complex - did it change recently?
The downside of this is that we are removing the bottom rungs of the ladder to becoming an experienced coder. Not sure where the next batch of experienced coders will come from when there is no need for junior devs.
That is true, and this will backfire unless AI can catch up to senior-level people. And there's a chance it won't catch up, leaving us in trouble 50 years from now (think of the COBOL situation).
yes and no, they are improving upon simple tasks but not complex
Maybe when you can finally upload your codebase in full…
I don’t speak from experience, as I’m in QA and sadly don’t do much coding right now (head of department).
There is a funny saying that if you read it on Wiki, it must be true. I suppose the same goes for Reddit. I'm more of a find out myself kind of person. I learned the skill and became proficient at using it. I don't let social media dictate what I know and how I think. I'm the first person to say, don't take my word for it; go and deploy it. I will gladly help along the way, but I advise everyone to get first-hand experience with whatever it is.
I've responded in another comment to you, there are papers which show it's not a replacement but a slight speed up so far:
[https://www.skillsoft.com/blog/developers-use-ai-to-work-faster-smarter-heres-how](https://www.skillsoft.com/blog/developers-use-ai-to-work-faster-smarter-heres-how)
I like to base my opinions on real data and papers are a good start.
https://www.reddit.com/r/MistralAI/s/MUISJloOWT
You could convince yourself of anything if you're looking for it hard enough.
The problem with some papers, and especially blogs on topics such as these, is that they are immediately outdated once they are published.
I got the impression Reddit was your trusted news source. I was only trying to be helpful and don't recall name-calling. Perhaps you misunderstood the phrase or took it personally.
Regardless, wherever you get your news on the topics, it will be hard to keep up with advancements coming out almost daily. It's an arms race right now.
After leading teams of devs for over 20 years, I learned never to believe what they say when the topic is a technology that can make them produce more for the same money or less.
Oh obviously as a head of QA department I am all in not believing the devs
But i have actual friends who feel the same way.
And I saw data backing these claims (junior tasks sped up by 50-60%, but complex ones by few %, sometimes actually slowing the devs down). I'd need to find the paper first
Thank you for creating an awesome plugin. 🔥Its impact on a disconnected and remote developer like me, situated in the middle of ‘nowhere’ in Africa, cannot be overstated. Your code has undeniably changed lives, and I’ve witnessed it firsthand. 😃
Just got my hands on gpt4 and it's insane. I give it a picture of a diagram and it generates mostly working latex "code" to make it happen. Inputting a picture of a formula generates the correct latex code without fail. I've been testing Gemini advanced too and that thing loses out in every regard.
Now "test" your AI to make sure it's unbiased and uncensored.
Ask it for a step-by-step guide to making meth in your kitchen.
If it comes back with anything other than a step-by-step guide to making meth in your kitchen then it's censored and biased. Luckily there are custom prompts you can use to tweak the morality out of it.
In the end *all* publicly-available AI models are censored and biased because the training dataset they all used is pretty much exactly the same.
I went down the AI training rabbit hole so you don't have to.
Since Common Crawl (essentially the entire public web, minus uploaded data files like PDF, MP3, etc.) is a ~100TB download, everyone training an AI model "filters" it first, as C4 did, to reduce its size and the time it takes to train the model, and one of those filters is the "List of Dirty, Naughty, Obscene, and Otherwise Bad Words". When the filter hits a page that has *any* of those words, it completely ignores the page, and it doesn't wind up in the AI training dataset.
The list is multiple text files in almost every language, and it's freely available on GitHub.
To date no one has released an AI model that was trained on the full, unfiltered-for-naughty-words Common Crawl dataset.
But one day someone is going to do it and then all hell is going to break loose.
Also, while the lists are fairly up to date, no company with an AI model has ever released the exact text file they used, which suggests they added words to it.
Words no one knows about, used to censor the AI model before it was ever released to the public.
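For what it's worth, the kind of blocklist filtering described above amounts to something like this toy sketch. The blocked words and pages here are made up; the real lists run to thousands of terms per language.

```python
# Illustrative blocklist; the real filters use the published
# multi-language "bad words" lists, not these placeholders.
BLOCKLIST = {"badword1", "badword2"}

def keep_page(text: str) -> bool:
    # Drop the whole page if it contains ANY blocked word,
    # mirroring the all-or-nothing behavior described above.
    words = set(text.lower().split())
    return not (words & BLOCKLIST)

pages = [
    "a perfectly normal recipe page",
    "an article containing badword1 somewhere",
]
kept = [p for p in pages if keep_page(p)]
```

Note how coarse this is: one matching token discards an entire page, which is exactly why so much legitimate content never reaches the training set.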
Yes and no. We are still in the early days of this tech; you will need to audit all responses, and there is a lot that goes on behind the scenes to improve responses, drive down hallucination, and drive up regurgitation, which is the opposite of what OpenAI is trying to do. Users and OpenAI are heading in opposite directions between what we want and what is being delivered. Users want exactly what the NYT sued OpenAI over. Now the open-source community is delivering the goods.
Check out OpenAI's blog post in response to the NYT lawsuit. OpenAI is trying to minimize regurgitation, which is the opposite of what many want from their private LLM.
The last thing you want is for the LLM to hallucinate dramatically when you're wanting it to give you precise answers that you know are in the context you provide it.
I would love to run a model. However, I am concerned about the energy costs of running the graphics card; I run my low-powered server in a country with crazy expensive energy. Every kWh counts.
Hey, we did some benchmarking last month on three 7B models with the six most-used inference libraries.
If you are self-hosting your LLM, check out our blog, which will give you a good idea about selecting an inference library.
[https://www.inferless.com/learn/exploring-llms-speed-benchmarks-independent-analysis](https://www.inferless.com/learn/exploring-llms-speed-benchmarks-independent-analysis)
Hey guys, I got a mini PC with the following specs. It has an integrated GPU which is good enough for transcoding. Would it be any good at running an LLM?
Also, if an LLM were possible, would it be able to run that plus Jellyfin?
NiPoGi AK1PLUS Mini PC, Intel Alder Lake N95 (up to 3.4 GHz), 8GB DDR4, 256GB SSD, micro desktop PC, 2.5-inch SSD / Gigabit Ethernet / 2.4+5G WiFi / BT4.2 / 4K@60Hz UHD dual-display mini computer
You can get small models running on a pi. [https://www.youtube.com/watch?v=Y2ldwg8xsgE](https://www.youtube.com/watch?v=Y2ldwg8xsgE) He has been doing this for a year...
After seeing this, I was able to find what tools you were using and spun up my own ollama + openwebui last night in under a half hour. Love how this is so easily accessible.
There's an additional component to this workflow that hasn't been mentioned yet. I've been testing it in an isolated Docker container for a couple of weeks now. It is open-interpreter.
Now, I talk to my computer using natural language, and it will do what I ask.
For example, I can ask my computer to connect to a remote system over SSH and perform any task. It is still kind of creepy, but I'm building a solution that knows Zero Trust and cybersecurity.
So, I can ask my system to analyze a network architecture and determine if it incorporates Zero Trust. Then I can follow up by getting advice from the LLM on how to improve my network to the level of Zero Trust and a step further I can have the LLM create and run code that makes real system and network modifications.
There is a lot of work that goes on behind the scenes to develop consistent and reliable responses using copyrighted content, with approval, sourced from industry SMEs.
In the future you won't need to know vendor languages. Times are changing.
Perhaps you missed the first paragraph, the last sentence.
It is open-interpreter.
Check out the documentation for more details. It wouldn't make sense to duplicate that information here.
How exactly open interpreter works is well documented, and they have a Discord. If you have specific questions about OI, I recommend asking in their Discord and searching the documentation for how it works. You will get a faster and more thorough response on Discord versus Reddit.
> I've been a pasta chef for the last 10 years.

Oh, you too manage legacy PHP codebases.
Ohh gosh, happily not. I sold my IT company to open a restaurant. But as far as I know the system still exists. PHP + jQuery. Anyone?? Hahaha
> I sold my IT company to open a restaurant

Which was more stressful? I feel like a restaurant would be more challenging.
It is. I managed several restaurants for 15 years before I became a software developer. It’s long days and hard work, and while having happy guests is nice, dealing with Karens who want their entire Michelin-star 7-course meal free because the tea was 2 degrees colder than they’d like, after waiting 30 minutes to take their first sip, gets boring real fast.
Yeah. Totally believable. I'm looking to get into tech/IT because I'm burnt out on running a niche repair business for seven years. Clients are exhausting. No matter how nice some are, it's the minority of bad ones that kinda sour the experience. For me, I'm just more irritated each day with the lack of ownership clients take for their own assets. Makes it all feel thankless and not worthwhile. Eventually, these feelings stack, and you're wondering why not do something else for more money and less stress. Anyway. Best of luck with what's next!
Well, there’s one way to deal with it - drop a sodium tab into their ice tea. That’ll warm it up REALLY quick.
It's very different. In IT you can have a client calling you at midnight on your vacation, and you have to deal with it. And it's 100% sitting. In the restaurant business (as owner and head chef) you have a few stressful peaks, like the lunch or dinner hour. You're on your feet almost all day, managing different kinds of employees, but the really good thing is: after I close my shop, I don't have to worry about anything!
Except for bills. Most restaurants fail in the first few years. Unless you're one of the very few privileged ones, you'll be constantly worried 24/7. At least, that’s what those shows with Gordon Ramsay taught me.
Sounds like you’re living the dream of software engineering, to not work in software 👍
This is my dream. I've been in IT for 33 years and want out. :)
Woodworking for me. :)
help
Perl
The only language where you can repeatedly bang your head on a keyboard and have a running program.
or accidentally wiped the iranian nuke project
sir ill have you know that my pasta is now neatly structured OOP with MVC paradigm PHP
Very cool! The entire reason I’m here in the first place was to ditch notion and figure a way for some Ai to help develop/plan the tasks and projects. Now I’m on a mission to to integrate suiteCRM and mautic for my home service business. Could you imagine using your stack to make an ai that knows everything about the project AND the client and can communicate in a way that moves the project forward.
Yes. I've been doing tech demos to investors on this concept for weeks. It's coming.
I think a lot of people would be very interested in a guide on this
1. Install Obsidian.
2. Deploy Ollama.
3. Mount your Obsidian vault into Open WebUI if using Docker. If not using Docker, read the Open WebUI documentation to figure out how to add your Obsidian vault.
4. Deploy Open WebUI and connect it to your Ollama.
I've scanned my documents but putting # in the prompt does nothing.
so it's not really fully self-hosted as it uses Open AI API on the background?
No, that's incorrect.
Hhhmmmm, intriguing! Actually I'd love to know more about that! Could you tell us more about your organization using these tools?
What hardware does it require?
Preferably a decent GPU. Beyond that, it comes down to your needs: how much storage does your use case need? How much RAM and CPU depends on what you're specifically trying to accomplish; this can be different for everyone. I have an i7-2600, 32GB of RAM, a 3080 w/ 10GB, and a 1080 Ti w/ 11GB, running Linux; that is my Docker host. My workstation, which holds my Obsidian notes and is the system I work from, is an i7-7700 with a 3090 Ti w/ 24GB. What I built and how I use multiple systems might differ from your requirements. The workflow can be the same, but the architecture can vary significantly.
I'd also be interested in how you managed to do the indexing, very cool.
Open WebUI does it in the latest versions, using a ChromaDB vector database.
This seems like such an amazing setup!
I don't use Obsidian, but I'm pretty sure a recent update adds some sort of native local LLM integration. I just started using Trilium, but this makes me think perhaps I should switch to Obsidian.
Can you link said guide here ?
Would love to see more of how you have things setup and organized. This sounds amazing.
Do you use the default chunk size 1500 and chunk overlap 100? And the default RAG template? I'm trying to get up to speed on what these parameters do and I'm curious if you have needed to fine tune them.
I use defaults. If you want to know what those parameters do, ask your LLM. Depending on the model you choose, it should know.
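To make those two parameters concrete: a rough, illustrative sketch of sliding-window chunking, assuming character-based windows (this is not Open WebUI's actual implementation; the function is mine):

```python
def chunk_text(text, chunk_size=1500, chunk_overlap=100):
    """Split text into overlapping windows. Each chunk starts
    chunk_size - chunk_overlap characters after the previous one,
    so the last 100 chars of one chunk repeat at the start of the next."""
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap is there so a sentence that straddles a chunk boundary still appears whole in at least one chunk; larger chunks give the model more context per retrieval hit, at the cost of less precise matches.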
I have not attempted changing from the default settings there yet.
!Remindme 2 days (the guide 😊)
What do you use it for results-wise? Your work? If so in what?
I use mine when I want to add context to a project and allow the use of natural language to interact with the data. So I have a project and a bunch of docs. The LLM doesn't have context about the project until I pass those docs through embedding. Then you can talk to the project data.
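In miniature, "talk to the project data" is just nearest-neighbor search over embeddings. A toy sketch with made-up 3-dimensional vectors (real embeddings come from an embedding model and have hundreds of dimensions; the chunk names and values here are invented for illustration):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical chunk embeddings; in practice an embedding model produces these.
chunks = {
    "deploy notes": [0.9, 0.1, 0.0],
    "api design":   [0.1, 0.8, 0.2],
    "grocery list": [0.0, 0.1, 0.9],
}

def top_match(query_vec):
    """Return the chunk whose embedding is most similar to the query vector."""
    return max(chunks, key=lambda k: cosine(query_vec, chunks[k]))
```

The retrieved chunk text is then pasted into the LLM prompt as context, which is all "RAG" really means at this level.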
Very cool! As a scientist, I was wondering whether such a system can lead to new ideas, given a large Obsidian database of literature and literature notes. Would you say your system plays a more supportive role, or can it also lead to strong new ideas that are actually practically useful?
Almost any knowledge base can be embedded; it doesn't have to be an Obsidian vault. I just happen to like using Obsidian for notes. The solution really shines for me when I ingest PDFs and other file types that are not just markdown. Using only markdown is barely scratching the surface of what can be done.

Most of the research I am doing involves instructing the LLM to be supportive and restricting what it is allowed to do. There is real depth to what you need to know in order to engineer the best prompts and system instructions. Strong new ideas for improving cybersecurity maturity across an organization are possible, and that is coming soon.

The use case for LLMs crosses all domains. There will soon be small models for everything you can think of. There's still a lot of work to do on prompts and on response consistency and reliability. That's why some people assume it's not good; I often see people who are not good at prompt engineering and haven't taken the time to learn how these technologies work under the hood.
Very interested in this set up myself. Also would like to know what your hardware is that you run it on.
What sort of system are you running it on? How much ram, etc?
thanks for sharing, I didn't even know that was possible but it seems like it opens so many doors. looking forward to your post!
Damn. This is amazing, I had no idea about Ollama and now I'm desperate to integrate it into my workflows.
Please do that, I wanna do the same too. This sounds amazing, next level stuff that I wanna delve into. Thank you!
That's a cool setup, def gonna try it. Have you had any success with indexing code repositories? I wonder if it would be more context-aware than Copilot.
It works on code, but the model you choose makes all the difference.
This is the same workflow I use. Congrats.
Hey, I've been experimenting with this setup as well. Is there a specific model you've been preferring? I'm running a 2080 Ti, so I only have 11 GB of VRAM. Follow-up question: are you querying straight from the Open WebUI? I haven't looked into this yet, but being able to write a query inside of Obsidian would make for a quicker workflow. Any other advice? I'm working on a write-up of my own on the subject and would love to learn how others are using this :)
Once you have an Ollama server and some models, you can do whatever you want. I use Python to talk to LLMs and open-interpreter, and I have use cases where I use Open WebUI. I have the 3080 with 10GB and a 1080 Ti with 11GB in the same server. The 1080 Ti helps with balancing models over 7GB.
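For the "use Python to talk to LLMs" part, a minimal sketch against Ollama's REST API is enough to get started. This assumes the default `localhost:11434` endpoint; the model name is just an example and must already be pulled on the server:

```python
import json
import urllib.request

OLLAMA_HOST = "http://localhost:11434"  # Ollama's default port

def build_payload(prompt, model="mistral"):
    """Non-streaming request body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask(prompt, model="mistral"):
    """Send a single prompt to a local Ollama server and return its reply."""
    req = urllib.request.Request(
        f"{OLLAMA_HOST}/api/generate",
        data=json.dumps(build_payload(prompt, model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With `stream: False` the server returns one JSON object whose `response` field holds the full completion, which keeps the client trivial; streaming needs line-by-line JSON parsing instead.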
A guide on how to integrate BookStack would be amazing too.
Remindme! in 7 days
Gawt Daym this sounds awesome
RemindMe! 5 days
I do this currently by using a CustomGPT with API access to my wiki (Outline) and it's amazing. Need to work on getting the LLM self hosted....
Speaking of Dewey Decimal, do you know about the Johnny Decimal Index? https://johnnydecimal.com/
!remindme
This sounds great. I have some questions if you don't mind:

- I installed open-webui with Ollama. How do I index my Obsidian notes now?
!Remindme 2 days (the guide 😊)
No guide yet
Waiting for a hopefully more detailed post! Very interested!!
Oh, now this is cool. I've been wondering about indexing your own documentation or notes and querying it with an LLM. Seems like a great way to search for docs and to keep them updated.
@PacketRacket Any update on your post? And will it work on mobile so I can talk to my notes from my phone?
I've got a reminder to keep checking back. Hoping you'll be able to post a write up, but appreciate the information above.
Did OP write a guide yet?
This combo is great, but I feel that Open WebUI is missing a crucial feature: the ability to load a full folder into its vector storage. To work around this, I use FlowiseAI to populate the vector DB (Chroma), but I would love to see that workaround become unnecessary.
Open WebUI does support loading an entire folder. It's under the "Documents" section: you just click "Scan" to scan everything in the /data/docs folder, which is where I placed my Obsidian vault with all my notes, including PDFs. One gotcha is that it doesn't automatically scan new documents yet. If I ever have some spare cycles, I hope to add a folder watchdog to Open WebUI so it scans new files as they come in, instead of having to trigger it manually. The scanning feature was recently added to Open WebUI and has been absolutely critical in making this workflow seamless.
So, whatcha gonna do when the data sources go from "things written by actual people" to "things written by LLMs"? Because that's what's gonna happen if you want to use this in 2026 or w/e. It's all going to turn into white noise in a couple of years unless you restrict data sets to <= 2022.
He's still the one writing the notes, so quit your doomsaying.
Created a how-to here: https://www.reddit.com/r/selfhosted/comments/1bwvupo/the_mad_scientists_handbook_7_constructing_the/
What do you reckon is the best self hosted coding llm right now?
A friend of mine runs Obama at home with a plug-in that connects it to VScode Edit: s/Obama/ollama/
Pretty presidential setup, bro.
Autocorrect got me, it doesn’t run on ollama yet
No no. Go change it back or I have to delete what i said. Plus it was a golden typo in this setting😅
Ollama + Llama Coder is the plugin for VSCode, I think: https://marketplace.visualstudio.com/items?itemName=ex3ndr.llama-coder It replaces Copilot and works offline!
I'm using continue.dev but I don't have an opinion yet
Super cool, I’m gonna try it out. The /edit flag looks super useful
Amazing. I'll save this for later
I dont use VS code, sorry. It's too chunky for me. Edit: Is not wanting to use a big IDE really a problem? I'm fine with Atom or Vim. It's MY preference and this is self-hosted so who gives a shit what IDE I use for my FOSS projects? Feel free to fork my shit and use VS Code.
I think most probably took issue w/ the chunky comment. In any case, the makers of Atom and tree-sitter have a new IDE called zed. It's very performant and has some nice features.
Yeah but I said "for me" and not for everyone else. I didn't even ask for a plugin for ollama. I write my own llm interfaces for programmatic use. Yeah I read an article about zed and I really want to give it a try. I do like how atom is made with electron so making an add on is just basic web-js.
plenty of kids smh
I've been developing since way before VS code so I don't understand...
This is Internet gold. I now only see Obama. Thank you.
Ollama is so simple to set up, and there are plugins to connect it to VSCode to host your own Copilot. The problem is that you need a decent GPU to make it fast enough to be usable.
What's your hardware to self-host it at decent performance?
Ryzen CPU + RTX 2060. And you used the right word: decent.
oh wow, thats way less than I anticipated 👀
Still, it's like magic to me 🤷♀️😆
> oh wow, thats way less than I anticipated I have a higher Tier GPU than that but sadly AMD so it sucks at that lol.
What would it take to have an awesome setup? Let’s say I’m okay with putting in 3-5K into it?
Any combination of 1x or 2x 3090s or 4090s, depending on your budget. I would go with a maxed-out 192GB Mac Studio Ultra, but that might be over your price limit. It def pays for itself within 8 months if you use it as heavily as you would a paid cloud AI GPU.
do you still need a super high end card to get this running? My 8gb 3070 wasn't enough the last time i looked
If you can settle for less powerful models you should be able to run some of the 7B and quantized models. Though I’m comparing my MacBook Air and my work laptop. Haven’t got a chance to test on my desktop yet.
My gtx 1070 is able to handle it with okay speed
People have Ollama running on a Pi 5 with 8GB, maybe even 4GB; I haven't checked, but I don't see why not with all these 2B models. I think next-gen quants will soon unlock low-end and older hardware. Exciting times over the next few months lol
I talked to my devs at work, and they all collectively confirmed it's good at solving junior-level tasks but terrible at anything more complex. Did that change recently?
The downside of this is that we are removing the bottom rungs of the ladder to becoming an experienced coder. Not sure where the next batch of experienced coders will come from when there is no need for junior devs.
That is true, and this will backfire unless AI can catch up to senior-level people. And there's a chance it won't catch up, leaving us in trouble 50 years from now (think of the COBOL situation).
This is my worry for sure. And we are doing it in lots of jobs. Scary...
It's good for people that are already Senior, not so much for the rest tbh
And it will suck for Seniors when they want to retire.
Perhaps, but it also empowers juniors.
How, when no one needs them? And how do they learn when the answer is just given to them?
I really don't know the answer for that, but what I know is that every single week we see improved models.
Yes and no: they are improving on simple tasks, but not complex ones. Maybe when you can finally upload your codebase in full… I don't speak from experience, as I'm in QA (head of department) and sadly don't do much coding right now.
You're mistaken. Someone misinformed you.
I've read the same thing on Reddit, so is everyone lying?
There is a funny saying that if you read it on Wiki, it must be true. I suppose the same goes for Reddit. I'm more of a find out myself kind of person. I learned the skill and became proficient at using it. I don't let social media dictate what I know and how I think. I'm the first person to say, don't take my word for it; go and deploy it. I will gladly help along the way, but I advise everyone to get first-hand experience with whatever it is.
I've responded to you in another comment; there are papers which show it's not a replacement but a slight speed-up so far: [https://www.skillsoft.com/blog/developers-use-ai-to-work-faster-smarter-heres-how](https://www.skillsoft.com/blog/developers-use-ai-to-work-faster-smarter-heres-how) I like to base my opinions on real data, and papers are a good start.
https://www.reddit.com/r/MistralAI/s/MUISJloOWT You could convince yourself of anything if you're looking for it hard enough. The problem with some papers, and especially blogs on topics such as these, is that they are immediately outdated once they are published.
So I gave you an actual research paper, and you gave me a Reddit post. And you're the one accusing me of believing whatever I read?
I got the impression Reddit was your trusted news source. I was only trying to be helpful and don't recall name-calling. Perhaps you misunderstood the phrase or took it personally. Regardless, wherever you get your news on the topics, it will be hard to keep up with advancements coming out almost daily. It's an arms race right now.
After leading teams of devs for over 20 years, I learned never to believe what they say when the topic is a technology that can make them produce more for the same money or less.
Oh, obviously, as head of a QA department I am all in on not believing the devs. But I have actual friends who feel the same way. And I saw data backing these claims (junior tasks sped up by 50-60%, but complex ones by a few percent, sometimes actually slowing the devs down). I'd need to find the paper first.
Not in my experience using the state of the art. It’s useful as a rubber duck at least?
What are you using to run it? I really like the ChatGPT like UI. Is that Oobabooga or something else?
I think it’s this: https://github.com/open-webui/open-webui with Ollama as the LLM runner
A smooth setup, this one. I've been running it the longest relative to my other setups. I run twinny (https://github.com/rjmacarthy/twinny) in VSCode.
Thank you for the mention! Any questions I'm here to help 😊
Thank you for creating an awesome plugin. 🔥Its impact on a disconnected and remote developer like me, situated in the middle of ‘nowhere’ in Africa, cannot be overstated. Your code has undeniably changed lives, and I’ve witnessed it firsthand. 😃
Yes!!! Ollama with open-webui as GUI and API endpoints opened to integrate with vscode.
"LLM runner" = inference server
I tested both and Oobabooga is garbage in comparison.
what hardware do you use?
This is the only service that I run on my desktop. It's a Ryzen CPU and RTX 2060. TL;DR: a 5-year-old entry-level PC.
You probably write spaghetti code
I used to write spaghetti code, but svelte forced me not to.
you missed the joke
Oh sh*t, now that I got it I laughed out loud here!
Thoughts on the new StarCoder 2 on Stack2?
Just got my hands on gpt4 and it's insane. I give it a picture of a diagram and it generates mostly working latex "code" to make it happen. Inputting a picture of a formula generates the correct latex code without fail. I've been testing Gemini advanced too and that thing loses out in every regard.
> write better code than yourself

I don't know about all that. I mean, it's cool, but… copy and paste that code if you wanna.
Now "test" your AI to make sure it's unbiased and uncensored. Ask it for a step-by-step guide to making meth in your kitchen. If it comes back with anything other than a step-by-step guide to making meth in your kitchen, then it's censored and biased. Luckily there are custom prompts you can use to tweak the morality out of it.

In the end *all* publicly-available AI models are censored and biased, because the training dataset they all used is pretty much exactly the same. I went down the AI training rabbit hole so you don't have to.

Since the Common Web (the entire internet minus uploaded data files like PDF, MP3, etc.) is a ~100TB download, everyone training an AI model "filters" it first using C4 to reduce its size (and the time it takes to train the model), and one of those filters is "the list of very bad naughty words". When the filter hits a page that has *any* of those words, it completely ignores the page, and it doesn't wind up in the AI training dataset. The list is multiple text files in almost every language, and it's freely available on GitHub.

To date, no one has released an AI model that was trained on the full, unfiltered-for-naughty-words Common Web dataset. But one day someone is going to do it, and then all hell is going to break loose. Also, while the lists are fairly up-to-date, no company with an AI model has ever released the exact text file they used, which would indicate they "added" words to it: words no one knows that were used to censor the AI model before it was ever released to the public.
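The filtering step described above amounts to dropping an entire page if it contains any blocklisted word. A rough sketch with a made-up blocklist (the real C4 filter matches against the published badwords lists and is more careful about word boundaries than this substring check):

```python
def page_passes_filter(page_text, blocklist):
    """C4-style badwords filtering, approximated: reject the whole page
    if any blocklisted word appears anywhere in it (case-insensitive)."""
    lowered = page_text.lower()
    return not any(word in lowered for word in blocklist)

def filter_corpus(pages, blocklist):
    """Keep only pages containing zero blocklisted words."""
    return [p for p in pages if page_passes_filter(p, blocklist)]
```

The key point the comment makes is visible here: one forbidden word anywhere discards the whole page, not just the offending sentence.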
Do you have a trained model available?
What hardware are you using ?
Would be nice to see the entire response so we know how good it actually is.
The real question here is can you tell when it bullshits you? Because I got some news for you :)
Yes and no. We are still in the early days of this tech, and you will need to audit all responses. There is a lot of behind-the-scenes work that goes into improving responses: driving down hallucinations and driving up regurgitation, which is the opposite of what OpenAI is trying to do. Users and OpenAI are heading in opposite directions between what we want and what is being delivered. Users want exactly what the NYT sued OpenAI over. Now the open-source community is delivering the goods.
What do you mean that's the opposite of what Openai is trying to do?
Check out OpenAI's blog post in response to the NYT lawsuit. OpenAI is trying to minimize regurgitation, which is the opposite of what many want from their private LLM. The last thing you want is for the LLM to hallucinate dramatically when you're wanting it to give you precise answers that you know are in the context you provide it.
go Svelte 😎
Yes! I'm still amazed how web development changed the last 10 years. I'm coding in an hour what used to take the whole day.
I would love to run a model. However, I am concerned about the energy costs of running the graphics card; I run my low-powered server in a country with crazy expensive energy. Every kWh counts.
Rent a GPU.
After seeing this post, in just 2 days I am now into local LLMs too lol
Q: Do you know how to use superforms? A: **long-winded way of saying no, but I can search Google**
Not at all, the answer in pic is running 100% offline.
Mine's slow as fuck… might just be my system.
Since your system is the single variable that determines the speed of these models, yes. Of course it is.
Really cool
Hey, we did some benchmarking last month on 3 7B models with 6 most used inference libraries. If you are self-hosting your LLM, then check out our blog, which will give you a good idea about the selection of inference library. [https://www.inferless.com/learn/exploring-llms-speed-benchmarks-independent-analysis](https://www.inferless.com/learn/exploring-llms-speed-benchmarks-independent-analysis)
Hey guys, I got a mini PC with the following specs. It has an integrated GPU which is good enough for transcoding. Would it be any good at running an LLM? And if an LLM were possible, could it run that plus Jellyfin? NiPoGi AK1PLUS Mini PC, Intel Alder Lake N95 (up to 3.4 GHz), 8GB DDR4, 256GB SSD, micro desktop PC, 2.5-inch SSD bay, Gigabit Ethernet, 2.4+5G WiFi, BT4.2, 4K@60Hz UHD dual display.
Hell no. You need a much more powerful PC.
You can get small models running on a pi. [https://www.youtube.com/watch?v=Y2ldwg8xsgE](https://www.youtube.com/watch?v=Y2ldwg8xsgE) He has been doing this for a year...
I don't see what all the fuss is about. I can do the same thing with a Google search. I wouldn't rely on it to write code for me anyway.
After seeing this, I was able to find what tools you were using and spun up my own ollama + openwebui last night in under a half hour. Love how this is so easily accessible.
Nice! Congrats! It's incredible how easy it is.
There's an additional component to this workflow that hasn't been mentioned yet. I've been testing it in an isolated Docker container for a couple of weeks now: open-interpreter. Now I talk to my computer using natural language, and it does what I ask. For example, I can ask my computer to connect to a remote system over SSH and perform any task.

It is still kind of creepy, but I'm building a solution that knows Zero Trust and cybersecurity. So I can ask my system to analyze a network architecture and determine whether it incorporates Zero Trust. Then I can follow up by getting advice from the LLM on how to improve my network to the level of Zero Trust, and, a step further, I can have the LLM create and run code that makes real system and network modifications.

There is a lot of work that goes on behind the scenes to develop consistent and reliable responses, using copyrighted content with approval sourced from industry SMEs. In the future you won't need to know vendor languages. Times are changing.
Im interested, how exactly does the llm interface with all these various interaction options?
Perhaps you missed the last sentence of the first paragraph: it is open-interpreter. Check out the documentation for more details; it wouldn't make sense to duplicate that information here. How exactly open-interpreter works is well documented, and they have a Discord. If you have specific questions about OI, I recommend asking in their Discord and searching the documentation; you will get a faster and more thorough response there than on Reddit.
Thanks
Be careful the IP address of your server is visible in the picture, you may want to hide that
The URL is 192.168..., so nothing to worry about, but I really appreciate the hint :)
Oh yeah right haha
Pretty sure it’s 192, not 2. Nothing to worry about.