this post was submitted on 03 Jun 2024
1471 points (97.9% liked)

People Twitter

5380 readers
599 users here now

People tweeting stuff. We allow tweets from anyone.

RULES:

  1. Mark NSFW content.
  2. No doxxing people.
  3. Must be a tweet or similar.
  4. No bullying or international politics.
  5. Be excellent to each other.
  6. Provide an archived link to the tweet (or similar) being shown if it's a major figure or a politician.

founded 2 years ago
[–] [email protected] 13 points 6 months ago (3 children)

LLMs are not a good tool for processing data like this. They would be good for presenting that data though.

[–] [email protected] 3 points 6 months ago (1 children)

Make an LLM convert the data into a standardized format for your traditional algorithm.

[–] [email protected] 2 points 6 months ago

There's no way to ensure the data will stay in that standardized format, though. A custom model could, but those are expensive to train.
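One common guardrail for exactly this drift problem (a sketch of the general pattern, not anything described in the thread): validate every LLM reply against a fixed schema and reject anything that doesn't conform. The field names below are made up for illustration.

```python
import json

# Hypothetical schema for a scraped coupon record; the field names
# are illustrative, not from any real site or API.
REQUIRED = {"code": str, "discount_pct": (int, float), "expires": str}

def validate_coupon(raw: str):
    """Parse an LLM reply and reject anything that drifts from the schema."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return None  # chatty prose, markdown fences, etc. fail here
    if not isinstance(obj, dict):
        return None
    for field, typ in REQUIRED.items():
        if field not in obj or not isinstance(obj[field], typ):
            return None
    return obj

# A conforming reply passes; a conversational reply is dropped.
validate_coupon('{"code": "SAVE10", "discount_pct": 10, "expires": "2024-07-01"}')
validate_coupon('Sure! Here is the coupon: SAVE10')  # -> None
```

This doesn't make the LLM reliable, it just makes the unreliability detectable, so the traditional algorithm downstream only ever sees well-formed records.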

[–] [email protected] 1 points 6 months ago (1 children)

LLMs are excellent at consuming web data.

[–] [email protected] 6 points 6 months ago (1 children)

Not if you want to ensure the validity of the compiled coupons/discounts. A custom algorithm would be best, but data standardization would be the main issue regardless of how you process it.

[–] [email protected] 0 points 6 months ago* (last edited 6 months ago) (1 children)

What does validity mean in this case? A function-calling LLM can follow links and take actions. I'm not saying it's not "work" to develop your personal bot framework, but this is all doable from a home PC with a self-hosted LLM.

Edit: and of course you'll need non-LLM code to handle parts of the processing, not discounting that.

[–] [email protected] 1 points 6 months ago* (last edited 6 months ago)

The LLM doesn't do that, though; it's the software built around it that does, which is what I'm saying. It's definitely possible to do, but the bulk of the work wouldn't be the LLM's task.

Edit: forgot to address validity. By that I mean keeping a standard format and ensuring that the output is actually true given the input. It's not impossible, but it's something that requires careful data curation and a really good system prompt.
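The "good system prompt plus checking" approach is often wired up as a validate-and-retry loop. A minimal Python sketch of that shape, with a stub standing in for the real model call (the prompt, keys, and helper names here are all hypothetical):

```python
import json

def call_llm(prompt: str) -> str:
    """Stand-in for a real model call; a self-hosted LLM would go here."""
    # Stub: pretend the model answered in the requested format this run.
    return '{"code": "SAVE10", "discount_pct": 10}'

def extract_structured(page_text: str, retries: int = 3):
    """Ask for JSON, validate the reply, and re-ask on malformed output."""
    prompt = (
        "Return ONLY a JSON object with keys 'code' and 'discount_pct' "
        "for the coupon in this page:\n" + page_text
    )
    for _ in range(retries):
        reply = call_llm(prompt)
        try:
            obj = json.loads(reply)
            if isinstance(obj, dict) and {"code", "discount_pct"} <= obj.keys():
                return obj
        except json.JSONDecodeError:
            pass  # malformed reply; fall through and ask again
    return None  # gave up; caller decides what to do with the page
```

As the parent comment says, everything except `call_llm` is ordinary non-LLM code, which is where most of the engineering effort ends up.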

[–] [email protected] 0 points 6 months ago (1 children)

LLMs are great for scraping data.

[–] [email protected] 9 points 6 months ago (1 children)

LLMs don't scrape data, scrapers scrape data. LLMs predict text.