I’ve been exploring MariaDB 11.8’s new vector search capabilities for building AI-driven applications, particularly with local LLMs for retrieval-augmented generation (RAG) over fully private data that never leaves the machine. I’m curious how others in the community are leveraging these features in their projects.
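
For reference, here’s roughly the shape of what I’ve been trying (a minimal sketch using the MariaDB Python connector; the table name, credentials, and the 768-dimension size are placeholders I made up, not a tested setup):

```python
# Minimal sketch of MariaDB 11.8 vector search: a table with a VECTOR
# column plus a cosine-distance query. Names and the 768 dimension are
# placeholders for illustration.
import mariadb

conn = mariadb.connect(user="app", password="secret", database="rag")
cur = conn.cursor()

# VECTOR(N) columns must be NOT NULL to carry a VECTOR INDEX
cur.execute("""
    CREATE TABLE IF NOT EXISTS docs (
        id INT AUTO_INCREMENT PRIMARY KEY,
        content TEXT NOT NULL,
        embedding VECTOR(768) NOT NULL,
        VECTOR INDEX (embedding)
    )
""")

embedding = str([0.0] * 768)  # stand-in; a real embedding comes from the model

# VEC_FromText() parses a JSON-style float array into the binary vector format
cur.execute(
    "INSERT INTO docs (content, embedding) VALUES (?, VEC_FromText(?))",
    ("hello world", embedding),
)
conn.commit()

# Nearest neighbours by cosine distance to a query embedding
cur.execute(
    """SELECT content,
              VEC_DISTANCE_COSINE(embedding, VEC_FromText(?)) AS dist
       FROM docs
       ORDER BY dist
       LIMIT 5""",
    (embedding,),
)
for content, dist in cur:
    print(f"{dist:.4f}  {content}")
```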

I’m especially interested in using it with local LLMs (like Llama or Mistral) to keep data on-premise and avoid cloud-based API costs or security concerns.

Does anyone have experience to share? In particular, which models are you using to generate the embeddings you store in MariaDB?

[–] otto 1 points 3 days ago (1 children)

You mean Ollama? There are so many options; any favorites?

[–] [email protected] 1 points 2 days ago

You're right! Sorry for the typo. The older nomic-embed-text model is often used in examples, but granite-embedding is a newer, smaller option for English-only text (30M parameters). If your use case is multilingual, they also offer a bigger one (278M parameters) that handles English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese (Simplified). I would test both to see what works best for you.
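
If it helps, generating an embedding locally is basically a one-liner with the Ollama Python client. A rough sketch (assumes you've already run `ollama pull granite-embedding` and the daemon is running):

```python
# Sketch: embed text locally via Ollama; nomic-embed-text works the same way.
import ollama

resp = ollama.embeddings(
    model="granite-embedding",
    prompt="MariaDB 11.8 introduces vector search.",
)
vector = resp["embedding"]  # a plain list of floats
print(len(vector))  # dimension is model-dependent, so size your VECTOR column to match
```

The list of floats you get back is what goes into VEC_FromText() on the MariaDB side.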

Furthermore, if you're not tied to MariaDB for something else in your system, there are other vector databases worth considering. Qdrant also works quite well, and it integrates easily into something like LangChain. It really depends on how far you want to push your RAG workflow, but let me know if you have any other questions.
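
For comparison, the same store-and-search flow against Qdrant looks roughly like this (a sketch with qdrant-client; the collection name, IDs, and the 384-dimension size are placeholders):

```python
# Sketch: upsert one embedded document and run a cosine similarity search in Qdrant.
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

client = QdrantClient(":memory:")  # in-process mode; use url="http://localhost:6333" for a server

client.create_collection(
    collection_name="docs",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)

client.upsert(
    collection_name="docs",
    points=[PointStruct(id=1, vector=[0.0] * 384, payload={"content": "hello world"})],
)

# Search with a query embedding (here a placeholder zero vector)
hits = client.search(collection_name="docs", query_vector=[0.0] * 384, limit=5)
for hit in hits:
    print(hit.score, hit.payload["content"])
```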