this post was submitted on 12 Jun 2023
Experienced Devs
I do not want my content to contribute to a proprietary LLM that will make billions for a large tech company without giving back to the community. Unfortunately, I think the fediverse will have a harder time countering large-scale data harvesting than a centralized service like Reddit.
On the other hand, I don't mind an open source, privacy-respecting LLM (is that even a thing for LLMs?) using my content.
I am also wary of big tech companies using my comment history for their LLMs. However, I worry that the tech companies will scrape the data anyway, and Reddit's API pricing just locks out the open source LLMs. There are a few of those; here are a couple that I have played with:
https://github.com/nomic-ai/gpt4all
https://github.com/ggerganov/llama.cpp
Some projects even try to preserve privacy. But I think it's more about what extra training data you give it and the queries you issue.
https://github.com/imartinez/privateGPT