this post was submitted on 27 Jun 2024

LocalLLaMA


Community to discuss LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.


Hello y'all, I was using this guide to try and set up llama again on my machine. I was sure I was following the instructions to the letter, but when I get to the part where I need to run setup_cuda.py install, I get this error:

File "C:\Users\Mike\miniconda3\Lib\site-packages\torch\utils\cpp_extension.py", line 2419, in _join_cuda_home
    raise OSError('CUDA_HOME environment variable is not set. '
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
(base) PS C:\Users\Mike\text-generation-webui\repositories\GPTQ-for-LLaMa>

I'm not a huge coder yet, so I tried to use setx to set CUDA_HOME to a few different places, but each time, running echo %CUDA_HOME doesn't print the address, so I assume it failed, and I still can't run setup_cuda.py.

Anyone have any idea what I'm doing wrong?

[–] Pyro 4 points 5 months ago* (last edited 5 months ago) (1 children)

Since you are using Windows, you can try setting CUDA_HOME to point to your CUDA installation folder through the "Edit Environment Variables" window.
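Two things may explain why the setx attempts looked like they failed. First, setx only stores the variable for processes started afterwards, so a terminal that is already open will not see it. Second, the `(base) PS` prompt in the error suggests PowerShell, where `%CUDA_HOME%` is not expanded (that is cmd.exe syntax); `echo $env:CUDA_HOME` is the PowerShell form. Here is a rough sketch for finding a folder worth pointing CUDA_HOME at and for setting it on a single run only. The base path is an assumption (the NVIDIA installer's usual default on Windows), and both helper names are made up for illustration:

```python
import os
from pathlib import Path

# Assumption: the NVIDIA installer's usual default location on Windows.
DEFAULT_BASE = Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA")


def find_cuda_installs(base=DEFAULT_BASE):
    """Return the folders directly under `base` that contain bin/nvcc."""
    base = Path(base)
    if not base.is_dir():
        return []
    found = []
    for child in sorted(base.iterdir()):
        bin_dir = child / "bin"
        # nvcc is the CUDA compiler; it is nvcc.exe on Windows, nvcc elsewhere
        if (bin_dir / "nvcc.exe").is_file() or (bin_dir / "nvcc").is_file():
            found.append(child)
    return found


def env_with_cuda_home(cuda_home, base_env=None):
    """Copy an environment and set CUDA_HOME in the copy only."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_HOME"] = str(cuda_home)
    return env


if __name__ == "__main__":
    installs = find_cuda_installs()
    if not installs:
        print("No CUDA toolkit found under", DEFAULT_BASE)
    for install in installs:
        print("Candidate CUDA_HOME:", install)
```

If one of the candidates looks right, the copied environment can be handed to a single build run, for example `subprocess.run([sys.executable, "setup_cuda.py", "install"], env=env_with_cuda_home(install), check=True)`, which avoids touching the system-wide settings at all.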

However, this guide seems pretty convoluted. I would recommend using one of the many Llama models people have already compiled and shared on Hugging Face.

[–] [email protected] 2 points 5 months ago

I think I have the one I downloaded back when you needed to get approved by Meta to download it. However, I was just looking for a guide to actually start the thing. Since I'm so used to using a GUI, I guess I didn't realize I was actually building the damn thing lol

[–] [email protected] 3 points 5 months ago (1 children)

Agree with others, this guide is a bit more work than you probably need. I don't really run Windows much anymore, but I did have an easier time with WSL, like the other poster mentioned.

And just to check, are you planning on fine-tuning a model? If so then the whole anaconda / miniconda, pytorch, etc... path makes sense.

But if you're not fine-tuning and you just want to run a model locally, I'd suggest ollama. If you want a UI on top of it, open-webui is great.

[–] [email protected] 2 points 5 months ago

Nah, I just want to run one for now. Maybe if I get more interested down the line, but I will check those out.

[–] [email protected] 2 points 5 months ago (1 children)

I had much better success using WSL, but I haven't used it or even updated it in a long while. (I have been meaning to see how AMD GPU support has evolved over the last few months. Back in January-ish, AMD support was still bad.)

Anything that is even remotely Linux-related is much easier to get working with WSL, btw. Almost all of my personal Python stuff is running under it, and it works great with VS Code.

[–] [email protected] 1 points 5 months ago (2 children)

I mean, Linux is an option, but haven't people been saying Nvidia drivers are a huge hassle to use on Linux?

[–] [email protected] 2 points 5 months ago

Nah. There are some Nvidia issues with Wayland (that are starting to get cleared up), and Nvidia's drivers not being open-source rubs some people the wrong way, but getting Nvidia and CUDA up and running on Linux is pretty easy/reliable in my experience.

WSL is a bit different but there are steps to get that up and running too.

[–] [email protected] 2 points 5 months ago

They can be, I suppose. However, the AI libraries that I was tinkering with seemed to all be based around Ubuntu and Nvidia. With Docker, GPU passthrough is much better under Linux and Nvidia.

WSL improved things a bit after I got an older GTX 1650. For my AMD GPU, ROCm support is (was?) garbage under Windows using either Docker or WSL. I don't remember having much difficulty with Nvidia drivers, though... I think there might have been some strange dependency problems that I was able to work through.

AMD GPU passthrough on Windows to Docker containers was a no-go. I remember that fairly clearly, though.

My apologies. It has been a few months since I messed with this stuff.

[–] [email protected] 1 points 5 months ago

What's the Python traceback? Can you add import os; print(os.getenv("CUDA_HOME")) into the Python script just to verify you're setting it correctly?
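A slightly fuller version of that check might look like the sketch below. It reads CUDA_HOME from the running process, then confirms the folder exists and has an nvcc compiler under bin. The helper name `check_cuda_home` is made up for illustration, and expecting nvcc under `bin` is an assumption based on the usual toolkit layout:

```python
import os
from pathlib import Path


def check_cuda_home(env=None):
    """Return (ok, message) for the CUDA_HOME value in `env` (default: this process)."""
    env = os.environ if env is None else env
    value = env.get("CUDA_HOME")
    if not value:
        return False, "CUDA_HOME is not set in this process"
    root = Path(value)
    if not root.is_dir():
        return False, f"CUDA_HOME is set to {value!r}, but that folder does not exist"
    # Assumption: the toolkit keeps its compiler at <CUDA_HOME>/bin/nvcc(.exe)
    has_nvcc = any((root / "bin" / name).is_file() for name in ("nvcc.exe", "nvcc"))
    if not has_nvcc:
        return False, f"no nvcc compiler found under {root / 'bin'}"
    return True, f"CUDA_HOME looks usable: {root}"


if __name__ == "__main__":
    ok, message = check_cuda_home()
    print(message)
```

Running it from the same terminal used for setup_cuda.py matters, since a different or freshly opened terminal may see a different environment.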

[–] [email protected] 1 points 5 months ago

Have you looked at LocalAI? It seems pretty useful. If I were setting up again, I'd go containerized...