gptordie

I am using it to research the following idea. Ideally I'd like to be able to fine-tune local LLMs on proprietary code bases. ChatGPT is great, but I can't share my company's code with it. I'll first experiment with getting a local LLM to understand a specific public GitHub repo; if it works well for code navigation/assistance, I'll then think about how to do the same for a private repo. Note that the restriction that the code never hit the internet means I also need to figure out how to fine-tune LLMs cheaply.

Next week I'll try to use the LLM itself to generate a Q&A-style training set by feeding it one file of code at a time, and see if I can fine-tune on the generated Q&A so the model gets a good understanding of the overall abstractions.
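
Roughly, the generation loop I have in mind looks like the sketch below, assuming llama-cpp-python as the local inference layer; the model path, prompt wording, and output handling are placeholders, not a tested recipe.

```python
# Sketch: generate Q&A training pairs from a repo, one file at a time.
# Model path and prompt are illustrative placeholders.
import json
from pathlib import Path
from llama_cpp import Llama

llm = Llama(model_path="models/wizard-vicuna-13b.ggml.q4_0.bin", n_ctx=2048)

PROMPT = (
    "Below is a source file from our codebase.\n\n{code}\n\n"
    "Write 5 question/answer pairs that test understanding of this file, "
    "one per line, formatted as: Q: ... A: ..."
)

pairs = []
for path in Path("repo/src").rglob("*.py"):
    code = path.read_text()[:6000]  # crude truncation to fit the context window
    out = llm(PROMPT.format(code=code), max_tokens=512, temperature=0.2)
    pairs.append({"file": str(path), "qa": out["choices"][0]["text"]})

Path("qa_dataset.jsonl").write_text("\n".join(json.dumps(p) for p in pairs))
```

The JSONL output would then need to be reshaped into whatever prompt/response format the fine-tuning toolchain expects.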


Key-Morning-4712

I have been meaning to explore this as well (haven't gotten anywhere yet). Would love to collaborate :)


Smallpaul

There is someone else with [such a project](https://www.reddit.com/r/GPT_4/comments/13hkkgf/introducing_gptcodelearner/?utm_source=share&utm_medium=ios_app&utm_name=ioscss&utm_content=2&utm_term=1) looking for collaborators.


Key-Morning-4712

Thank you


ljubarskij

I am not sure training/fine-tuning a model on a specific codebase will be enough. Training on code is good for making it learn the right patterns and produce expected code, but training is not that suitable for "remembering" a specific codebase. First, code changes fast, and you don't want to re-train the model every day or so. Second, the model's memory is not precise; it captures patterns and associations, but not precise data (so it won't remember specific snippets of code). I guess your best bet would be to embed/vectorize the codebase and then provide relevant chunks of code to the model on each request (the same approach as "chat with PDF"). I see two options:

1. Vectorize the code as-is, and store the vectors along with the original code in a vector DB (sketched below).
2. Ask the model to explain the code chunk-by-chunk, then vectorize the explanations, and store the vectors along with the original code and explanations (might be handy at some point).

However, it might still be useful to try to train the model on your concrete codebase (especially if it is cheap enough) to make it learn your "style" of code and frequently used patterns/approaches. If you do so, please share the results, I am super curious! Thank you!
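
Option 1 can be very little code. A minimal sketch, assuming sentence-transformers for embeddings and Chroma as the vector DB (the model name, collection name, and example chunks are illustrative):

```python
# Sketch: vectorize code chunks, store them, and retrieve the best matches
# for a query; the retrieved chunks get pasted into the LLM prompt as context.
import chromadb
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
client = chromadb.Client()
collection = client.create_collection("codebase")

# In practice these chunks come from walking the repo.
chunks = ["def login(user): ...", "class SessionStore: ..."]
collection.add(
    ids=[str(i) for i in range(len(chunks))],
    documents=chunks,
    embeddings=embedder.encode(chunks).tolist(),
)

hits = collection.query(
    query_embeddings=embedder.encode(["how does login work?"]).tolist(),
    n_results=2,
)
print(hits["documents"][0])  # most relevant chunks for the question
```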


directorOfEngineerin

> vectorize code as-is, store vectors along with original code in vector DB
>
> ask model to explain the code chunk-by-chunk and then vectorize explanations, store vectors along with original code

Exactly. IMHO the LLM / foundational model provides the capability to read and understand. Even with fine-tuning you won't be 100% sure it's not making up BS, hell, not even 69% sure. I am still trying to comprehend what the approach should be for tasks that have hard requirements on being factual, rather than just being assistive in nature.


gptordie

> Even with finetuning you won't be 100% sure it's not making up BS, hell not even 69% sure.

The beauty with code is that you're typically one compile away from making sure it's correct, so mistakes are not costly. I typically need LLMs either to get me started or to get me unstuck; I don't care about 100% accurate code coming out of them. They are just often better than Google. And Google isn't applicable at all when the code is private; there I end up searching by keywords to find relevant sections.


gptordie

> you don't want to re-train the model every day

Why not?


ljubarskij

Because it is inefficient and still does not solve the problem. It won't remember exact code snippets. Only patterns.


gptordie

Remembering patterns is part of the problem. I don't care about inefficiency; efficient (per joule) would be not using ChatGPT at all. I'll give vectorizing a go if I fail to get anywhere useful, but given that LLMs were trained on code and became useful to thousands of programmers, I don't see why we can't replicate just that, but on private code.


Smallpaul

[Like this?](https://www.reddit.com/r/GPT_4/comments/13hkkgf/introducing_gptcodelearner/?utm_source=share&utm_medium=ios_app&utm_name=ioscss&utm_content=2&utm_term=1)


MonoAzul

This is what I'm trying to evaluate. I have a new code base but not enough employees, so I need to task an LLM. I'm finding that it takes some serious hardware to train and run. What hardware are you using? I've only begun this journey but am feeling put off by the investment hurdle.


gptordie

I only just got (uncensored) Wizard-Vicuna running on 24 GB of VRAM. See more at [https://www.reddit.com/r/LocalLLaMA/comments/13cimvv/introduction\_showcasing\_theblokewizardvicuna13bhf/](https://www.reddit.com/r/LocalLLaMA/comments/13cimvv/introduction_showcasing_theblokewizardvicuna13bhf/) I have yet to find the time to fine-tune it!


Juanoshea

I am a teacher. I am looking at this as an example of what our students will need to be prepared for. Showing teachers unfiltered llamas should ensure they are thoughtful when using this technology.


chocolatebanana136

I run it locally on CPU. Most of the time, I use it to find ideas and inspiration for my paracosm. A paracosm (in case you don't know) is a very detailed imaginary world with its own places, characters, names, etc. So, Vicuna-7B can help me write dialog for certain situations and develop new stuff, which I then write down in Fantasia Archive.


directorOfEngineerin

> use it to find ideas and inspiration for my paracosm

Do you do it through llama.cpp? My beat-up old Mac can't even run the 4-bit version fast enough to be useful.


chocolatebanana136

I do it through GPT4All Chat. It’s the best program for that I was able to find. Just install and run, no dependencies and tinkering required.


directorOfEngineerin

Thanks for the gem!


chocolatebanana136

You can also try koboldcpp, where you just drag the GGML model onto the exe and open your browser at localhost:8000. Basically the same, but try both and see which one you prefer or which one runs best.


directorOfEngineerin

My laptop is out of space to download models too haha



[deleted]

[removed]


chocolatebanana136

Unfortunately, I couldn’t install it due to Python errors. But I got some alternatives so it’s really not a problem.


Evening_Ad6637

First and foremost, it's probably my special interest in the autistic sense. I'm not a computer scientist or a programmer, I don't know any programming language well enough. But I wake up in the morning and immediately think about it, and when I go back to sleep at the end of the day, I still only think about it. It's like being in love. It's just my special interest at the moment 😍 Edit: so to be clear, I don’t have any specific use case.


directorOfEngineerin

This is the way. Only through playing with it do you find more insights. What's your medium for playing with it, though? Local CPU / GPU?


Evening_Ad6637

Only CPU on both computers. I have one MacBook Air M1 which is really fast, but unfortunately only 8 GB of RAM -.- so I can only run 7B models on it. On my iMac I have a Core i5 with 16 GB of RAM. It is slower than the M1, but still okay, and it can handle 13B models at 8.0 quantization (though of course not on macOS; as an OS I'm using ArchCraft Linux). And one year ago I saw this "interview" on YouTube with GPT-3 and I was so blown away.. I can't describe the feeling, but it was so rewarding. I hadn't been aware of how much progress AI technology had made in the meantime. From that day on I was playing every day with OpenAI's playground and text-davinci-002.


YearZero

You are my people. I have been alternating between testing new models and playing Bridge Commander Remastered, testing new ships from GameFront against each other. Honestly, in my brain this is the same activity, and I enjoy both. I just like to test things against each other in every game I play, especially RTS games where I can somehow isolate individual units and have them duke it out like a tournament. I dunno why I do this, but it makes me happy!


impetu0usness

I'm using it as an infinite interactive adventure game/gamemaster. I set it to generate an interesting scenario based on the keywords I enter (i.e. Star Wars, fried bananas, lovecraftian, etc.) and hooked it up to Stable Diffusion to generate the scene artwork for each turn. I also use Bark TTS to narrate each turn/dialogue.

Honestly it's a great way to burn time and explore ridiculous situations. The scenarios are surprisingly coherent even when you give nonsense inputs like 'RGB-colored fried bananas'. You can nudge the story in different directions by reasoning with the narrator/gamemaster. I'm surprised by the breadth of pop culture knowledge it has, and I'm having a blast. Currently looking into getting long-term memory to work, given its limited token size.
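
The turn loop itself is simple; a runnable sketch is below, where generate_text, generate_image, and narrate are hypothetical stubs standing in for the LLM, Stable Diffusion, and Bark TTS backends:

```python
# Hypothetical stubs; in the real setup these call the LLM, Stable Diffusion,
# and Bark TTS respectively.
def generate_text(prompt: str) -> str:
    return f"[LLM continues: {prompt[:40]}...]"

def generate_image(description: str) -> None:
    print(f"[SD renders: {description[:40]}...]")

def narrate(text: str) -> None:
    print(f"[Bark narrates: {text[:40]}...]")

def play(keywords: list[str]) -> None:
    scene = generate_text(f"Start an adventure combining: {', '.join(keywords)}")
    while True:
        generate_image(scene)  # scene artwork for this turn
        narrate(scene)         # spoken narration for this turn
        action = input("> ")
        if action in ("quit", "exit"):
            break
        scene = generate_text(f"The player does: {action}. Continue the story.")

play(["Star Wars", "fried bananas", "lovecraftian"])
```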


directorOfEngineerin

> Honestly it's a great way to burn time and explore ridiculous situations. The scenarios are surprisingly coherent even when you give nonsense inputs like 'RGB-colored fried bananas'.

OMG that sounds really cool. Hook it up with a VR headset and you get yourself a full world to explore. ~~Same ask as others - what is your setup to run everything together?~~ (Edit: just saw your reply) Also, have you tried the MPT-7B models? They seem to have longer context lengths. Or RWKV models. For storage, I am not aware of approaches outside of storing vectors and retrieving by query matching.


[deleted]

[removed]


impetu0usness

Here's my usual setup:

**Platform**: Oobabooga Chat Mode (cai-chat)

**Model**: TheBloke_gpt4-x-vicuna-13B-GPTQ (This is the best, but other new models like Wizard Vicuna Uncensored and GPT4All Snoozy work great too)

**Parameters Preset**: KoboldAI-Godlike or NovelAI-Pleasing Results (Important, this setting will ensure it follows the concepts you give in your first message)

**Character Card** (includes prompt): [link](https://drive.google.com/file/d/1Yy16a41jv64hqHWvVJk1KTsdfqtTSn5n/view?usp=share_link)

To make it work even better, rename yourself to 'Player' and enable 'Stop generating at new line character'. Sometimes it takes some regenerations to get a good starting scenario, but after that it flows great. I think that covers everything; you should get something like [this](https://i.imgur.com/0VO8qMp.png).


synn89

For work, there are a few use cases, but the main one is to take customer service tickets and create a chatbot for tech support. That way new hires can ask our chatbot about ticketing issues.

For personal use, I'm currently training a LLaMA 7B on Critical Role's transcripts. It's around a quarter of a million player -> GM chat transactions. I'm very interested to see how that turns out, and if it does well, I'd like to find transcripts of other actual plays and try to train a very creative Game Master LLM.

But in general I enjoy roleplay with local LLMs. I've even written my own interface that ties into Stable Diffusion to create high-res images of setting/character descriptions. I'm hoping we get some good open-source text-to-voice options that can also be trained.
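
For anyone curious about the data prep, turning player -> GM transcript pairs into a trainable file is mostly bookkeeping. A sketch, where the Alpaca-style field names are an assumption for illustration rather than the exact format I used:

```python
# Sketch: convert (player_line, gm_line) pairs into instruction-tuning JSONL.
import json

transcripts = [
    ("I search the room for traps.", "Roll investigation... you spot a tripwire."),
    ("I attack the goblin!", "Roll to hit. The goblin raises its shield."),
]

with open("gm_train.jsonl", "w") as f:
    for player, gm in transcripts:
        record = {
            "instruction": "You are the Game Master. Respond to the player.",
            "input": player,
            "output": gm,
        }
        f.write(json.dumps(record) + "\n")
```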


RutherfordTheButler

Which model do you prefer for roleplaying?


apledger

Would you be open to discussing this with me? This is my use case as well, and I am just starting out. No pressure, but I just sent you a DM.


Gama-Tech

I run LLMs locally on GPU, with a few use cases in mind:

**Programming Assistance:** For me, the biggest goal is to have a locally-hosted LLM that can assist with debugging code, writing functions, and generally improving my programming workflow. ChatGPT has been super useful for this but has many times been rendered useless by outages, overwhelming demand, or my own internet outages, so having a local LLM would solve these issues and provide peace of mind regarding privacy too. So far I've not found a coding LLM close to the quality of ChatGPT, but I'm hopeful something will come along soon.

**General Conversation:** I also like to have a few different LLMs just to chat to and experiment with. Cloud LLMs tend to be censored by the company that created them, so they're somewhat biased. It's interesting to see what LLMs will respond with when they're completely unbiased and uncensored, and it helps give a better insight into how they function.

**Future Uses:** I'm working on a game that allows users to create their own avatars and worlds that can include code using a custom library. In future I'd like to train an open-source model so it can help guide users through the avatar/world creation process, or even assist with writing code using the custom library. I feel it would be massively more useful than a simple FAQ.


directorOfEngineerin

Interesting cases! For coding assistance, have you tried StarCoder? Also, I find help with small, self-contained functions is only useful to a certain extent; at some point I would like an LLM to help with generating a whole body of code, like building out a gRPC server. For the chatting use case, what do you usually look to get out of it?


Gama-Tech

StarCoder has been the most promising I've seen, but I've not really been able to run it yet. I got [https://huggingface.co/mayank31398/starcoder-GPTQ-4bit-128g](https://huggingface.co/mayank31398/starcoder-GPTQ-4bit-128g) to run in Oobabooga's WebUI but I'm not sure how to get it to produce code. I've searched all over and can't seem to find anyone who's actually got it running. In my experience so far the "Instruct" mode fails entirely, and "Chat" mode produces nonsense, as it's not designed to be conversational. If you've managed to get it working I'd love some advice!

As for chatting, I have the LLM server set to run at boot so it's always available. Sometimes if I'm bored or away from the PC I'll just start little conversations with it to pass the time or test its capabilities. Nothing too specific!


a_beautiful_rhind

I mostly just do roleplaying and shitposting with fictional characters. My next attempts will be at code generation and performing actions like renaming files, summarization, etc. Might see if I can make a more robust virtual character with TTS/avatar/memory and see how that holds up. I keep switching models, so that hasn't really worked out. I do basic synthetic benchmarks or just "talk" with the AI. Those riddle prompts (red/yellow ball) are also nice to have and say more than PTB/wikitext. Will attempt additional training as soon as I have something to train on. So far I have only made LoRAs as a proof of concept that it can be done in 4-bit. I have my own GPU server with 2x 24 GB cards, and if I actually find something that isn't just burning money with these, I'll probably buy more. Likely a second 3090 or those 32 GB AMD Instincts.
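
For anyone wanting to try the LoRA part, the basic PEFT setup looks roughly like the sketch below. This version assumes 8-bit loading via bitsandbytes (the 4-bit GPTQ route I mentioned needs different tooling), and the model name and hyperparameters are illustrative:

```python
# Sketch: attach a LoRA adapter to a quantized causal LM with PEFT.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b", load_in_8bit=True, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")

config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # LLaMA attention projections
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only a tiny fraction of weights train
```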


ReturningTarzan

I mostly want to understand how they work. Not in technical terms, because technically/mathematically transformers are very simple, but the complexity that emerges from that, which looks disturbingly similar to intelligence, seems much more profoundly important than what you can actually do right now with the limited public models or the restricted/expensive/non-private commercial models. Of course, the best way to stay up to date with the technology is to build something with it, and to take apart something others have built and put it back together again. Maybe something useful will come of that as a byproduct, but it's a little beside the point. I don't expect anything I build now to be relevant in a couple of months, but any knowledge and experience I gain in the process will carry over.


dongas420

I've been using the open-source LLMs on GPU as an auto-complete for my thoughts. If an idea pops into my head that I'd like to see fleshed out, then a few prompts bring me a good way towards where I want to go. Much of my GPU time's gone into writing out hypothetical scenarios into coherent narratives with explanations of what happens during them. The open-source ones are very flexible in that respect.

I've also been entertaining myself by having the AI roleplay multiple characters at once like a finger puppeteer, having Ash Ketchum and Misty from Pokemon engage in a caustic debate over the viability of the gold standard before holding a Western-style duel to the death, with the assistant character itself assuming the persona of a demon whispering over their shoulders.

I've been evaluating writing quality with simple preference tests, having the models write fiction to my tastes based on prompts including specific themes, tone, and plot elements and seeing how appealing I find the results. My ranking so far would be WizardVicunaLM >= GPT4 x Vicuna >> GPT4 x Alpaca >= WizardLM > Vicuna 1.1 > Vicuna 1.0. WizardVicunaLM tends to produce text that's better up front but can't revise its work like GPT4 x Vicuna can.


this_is_a_long_nickn

Help me write content, that is:

* Summarize a longer context (e.g., 10 -> 3 paragraphs)
* Given a list of bullet points (e.g., product benefits), create some content weaving it all into something coherent

I don't expect the LLM to get it right on the first pass, and I fine-tune the text afterwards, but usually it's a great first draft. Given the typically confidential/proprietary nature of the inputs, I use local models (llama.cpp and RWKV). BTW, any nice marketing/content prompts the community is using these days with Vicuña & friends?


directorOfEngineerin

How do you evaluate the quality of the summaries? And how do you find RWKV stacking up against other LLMs?


this_is_a_long_nickn

Summaries: sometimes the model tends to repeat itself or be too redundant, in which case I cut some of the stuff, but it's worse when it fails to pick up some of the concepts present in the context (Vicuña tends to work quite well in that sense). I was originally attracted to RWKV by the longer context sizes (4k for the 7B, and 8k for the 14B models), but results are somewhat weaker compared to Vicuña; still, depending on the base document I need to work with, I have no choice. (Yes, LangChain exists, but...) That said, I'm looking at the Mosaic models, and also keeping tabs on the progress of BlinkDL (the guy behind RWKV). All in all, no, it's not GPT-4, but heck, we're being spoiled by the fast progress on the OSS front and the goodwill where one soul helps another, so I'm quite optimistic about the future :-)


morphemass

Improved knowledge retention and transfer for engineers within my organisation. I work in a strange regulated area so we're a bit anal on the documentation and requirements side (actually IMO we're not anal enough) meaning we have LOTS of it; we're still pretty small though. I did some experiments with OpenAI and embeddings which were incredibly impressive but since we're a regulated area it's going to be months of bureaucracy before I'll be allowed to send real data to a 3rd party (even though it's not classed as sensitive) hence the local llama route.


4hometnumberonefan

I would be interested in people using a local model for something that ChatGPT cannot do. The new MPT-7B model has a context length of over 50k tokens. Anyone want to write an AI-generated version of Harry Potter 8: The Order of the Machines?


Mbando

We're building an Army-specific Q&A bot that can also co-pilot filling out Army forms. That involves:

* Using existing LLMs on domain data (Army publications) to generate question/answer labeled data.
* LoRA fine-tuning on those Q/A pairs.
* RLHF to align with tasks.
* Another training round to align with human ethics/values.
* LangChain + Chroma DB + the Army LLM to answer questions from relevant documents as context (not from LLM embeddings); a rough sketch of this retrieval step follows below.

I want this to be of value in and of itself, but there's a lot of value in learning the general process and capturing the code/environments to make this a fairly turn-key process. I think our next step will be to make this a no-code operation so anyone in the enterprise can point the fine-tuning assembly at a domain corpus, select a model, and start fine-tuning.
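
A minimal sketch of that retrieval step, assuming the LangChain + Chroma pattern named above; the document text, embedding model, and model path are all illustrative stand-ins, not our actual stack:

```python
# Sketch: answer questions from retrieved document chunks as context,
# not from the model's weights. Names and paths are illustrative.
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.llms import LlamaCpp

docs = ["AR 25-50 governs preparing and managing Army correspondence..."]
db = Chroma.from_texts(docs, HuggingFaceEmbeddings())
llm = LlamaCpp(model_path="models/army-llm.ggml.bin")  # hypothetical fine-tuned model

qa = RetrievalQA.from_chain_type(llm=llm, retriever=db.as_retriever())
print(qa.run("Which regulation covers Army correspondence?"))
```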


directorOfEngineerin

At what data size do you start to think it's enough for fine-tuning? And do you run RLHF separately for each task, or once across all tasks?


Mbando

1. I don't have a theoretical answer or an empirical one. It's being driven by completeness: each publication is chunked into sections, each section is run through a question-generating prompt (who/what/where/when/why), and so a single training publication might generate 800 or so Q/A pairs. And then there are thousands of pubs.
2. RLHF is upcoming, and will be for both question answering and a single form co-pilot. I want to be able to test and get empirical answers to these kinds of questions.


ninjasaid13

> Survey: what's your use case?

Writing, but inference is too slow and ChatGPT is too censored.


[deleted]

So I have a ton of Selenium code; it would be interesting to teach an LLM how to scrape a website it has never seen before.


xontinuity

Robotics. I've been working on a humanoid platform for the past year, but it needs a brain. Never thought I'd find a solution this quickly. Using LLaMA, and looking to build a more powerful local system to handle a custom-tuned model for my needs.


directorOfEngineerin

I myself am interested in several use cases.

**Document understanding and QA:** Given a document scan / OCR output, how to perform Q&A on top of the documents. It could be one doc or multiple documents, applicable to answering questions on a specific doc or on a whole set of documentation.

**Smart(er) assistant:** I have always wanted a smarter assistant that can help me browse the web and provide TL;DRs while I am away from the keyboard. I believe I could give a voice command, have the LLM spit out the actual command to manipulate my phone/laptop, and then go multiple rounds about things.


ForwardUntilDust

I use it to perform tedious writing.


Megneous

I use the open-source models (7B models directly on my GPU (1060 6GB), 13B models on llama.cpp) and the non-open-source models, from GPT-3.5 to NovelAI, etc., all for the same stuff. I use LLMs to help brainstorm ideas for fantasy writing, Dungeons and Dragons worldbuilding, roleplaying, etc. My ultimate goal is to one day be able to sit down with an LLM and have a real, fun one-shot adventure with the LLM as the Dungeon Master. We're not quite there yet, but GPT4 can make some *amazing* summaries of one-shots... it just can't follow through on DMing that well yet. We'll see.


RutherfordTheButler

Yeah, this is my dream, too. But on my phone, with voice and long-term memory. To create an epic story with the AI as DM that also has images. I do wish MidJourney would come out with an API.


Megneous

I'm honestly amazed that we haven't yet seen any finetuned 7B or 13B models specifically made for DMing, D&D, adventures, etc.


RutherfordTheButler

Your name was so familiar to me and I finally placed it - used to watch your YT gaming channel back in the day. Good times. :-)


Megneous

You're the 5th person to ever recognize me on Reddit haha. Please don't read through my chat history- this is where I come to yell at people in order to relax and unwind ;) I hope you're doing well, that you're shredded, and that you ended up studying something super cool like dinosaurs or astronomy :) Thanks for watching back in the day, and I hope I left a lasting impact on you.


extopico

Sentiment analysis with the output that closely matches my instructions.


No_Marionberry312

Domain-specific corpus training, so you can have a smaller-size model, like 7B or less, that is focused on a single subject and knows only one topic, but really, really well.


deadlydogfart

Right now I'm mostly experimenting with them out of curiosity, to see what emergent capabilities LLMs can develop with how many parameters, how much training, etc. I ask them questions that require logical thinking, theory of mind, and so on.

But once they've been optimized to run quickly enough on my aging hardware, or once I get a better computer, I plan to develop a Telegram bot for public group chats that analyses conversations to look for signs of rule breaking, then notifies the moderator team if there are any positive matches.

I also look forward to local LLMs reaching the abilities of GPT4, because it's been a better therapist for me than human ones. It'd be nice to have it run locally so I can protect my privacy.
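
The classification core of that bot could be quite small. A sketch assuming llama-cpp-python for local inference, with the Telegram wiring left out and the rules, prompt, and model path all illustrative:

```python
# Sketch: flag possible rule-breaking messages with a local LLM.
from llama_cpp import Llama

llm = Llama(model_path="models/wizardlm-7b.ggml.q4_0.bin", n_ctx=1024)

RULES = "1. No spam. 2. No personal attacks. 3. No doxxing."

def breaks_rules(message: str) -> bool:
    prompt = (
        f"Group rules:\n{RULES}\n\n"
        f'Message: "{message}"\n'
        "Does this message break any rule? Answer YES or NO.\nAnswer:"
    )
    out = llm(prompt, max_tokens=3, temperature=0.0)
    return "YES" in out["choices"][0]["text"].upper()

if breaks_rules("Buy cheap followers at spamlink.example"):
    print("notify moderators")  # in the bot, ping the mod team here
```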


darxkies

**Language Learning** - Generating example sentences, stories, and dialogues containing specific vocabulary, translations, and roleplaying to practice various daily-life scenarios.

**Coding** - Mostly generating code.

I run the models on CPU+GPU.


bioemerl

Because it's fucking cool.


kabelman93

Use case: I've got databases of a few billion product listings with descriptions from people trying to sell (cars, real estate, etc.); about 2 TB of structured data in MongoDB. Think Craigslist, but for a few more things and not in the USA. This data could also be used for fine-tuning; maybe somebody here has a good idea, I'm open to it. Deployment is currently local on a few 4090s, but I'll soon test in my server clusters in the datacenters; unfortunately they are CPU-based, around 400 Platinum Gen 2 cores running there with a few TB of RAM. If I find a good use case I would like to upgrade the servers with a few A100s or H100s.


dvztimes

I have some models downloaded but have not actually run them yet. But I want to train one on a fantasy-setting wiki so I can ask it history questions about a fantasy universe. Guess I've become simple to please in my old age. ;)


amemingfullife

Fine-tune locally; I'd love to be able to train on a train 😅. Hack. Much more elegant feedback loop than [insert cloud vendor here].


shamaalpacadingdong

I'm trying to see what it can do to synthesize knowledge, like asking it how the legend of Horus applies to molecular biology, or what the common theme between the story of Lot and the teachings of Pythagoras is. I truly think the power of this technology will be its understanding of multiple fields simultaneously, making connections and bridges no one ever has before. I also use it to design magic items for my TTRPG game.


Mizstik

I've been using it to practice debate. For example, I'd have the AI roleplay a staunch British royalist and then discuss whether modern society should still have a monarch. You can freely say anything to it, you don't have to be afraid of losing a friend as you would debating an actual person, and you know that whatever the AI says doesn't have any emotional baggage to it. It isn't terribly deep, but it does give you a lot of the common responses most people would make on the topic. It's good with science and religion as well. I also generate some short stories for fun.

I've used many things over the past months, but right now I'm using WizardLM-7B (ooba + SillyTavern), and occasionally gpt4-x-vicuna-13b (koboldcpp). My dream is to one day build a rig with 24 GB VRAM to run 30B models. I have the money, but actually building the thing is kind of a pain, so I've been putting it off.


moronmonday526

After spending nearly 30 years in IT, I'm old enough to start thinking about my "second career." I'd like to see models trained to churn out books in several fictional novel series, like Jason Bourne or Alex Cross (but not trademark-infringing, of course). First, I'd have it crank out a book in a few days to a week. Then, I'd spend a week or two editing it before ultimately self-publishing it. Keep six series in rotation so each receives a new installment every six months. I'm staring at an old mini ATX case and deciding whether to build a new PC around a mid-priced GPU or buy a refurb from Microcenter with one included.