this post was submitted on 11 Sep 2024

Over the past few years, the evolution of AI-driven tools like GitHub’s Copilot and other large language models (LLMs) has promised to revolutionise programming. By leveraging deep learning, these tools can generate code, suggest solutions, and even troubleshoot issues in real time, saving developers hours of work. While these tools have obvious productivity benefits, there’s a growing concern that they may also have unintended consequences for the quality and skillset of programmers.

[–] [email protected] 8 points 2 months ago (1 children)

I'm a dev with 10+ years of (cumulative) experience. While I've never used GitHub Copilot specifically, I've been using LLMs (as well as AI image generators) on a daily basis, mostly for non-dev things, such as analyzing my human-written poetry to get insights for my own writing. I've done the same with code I wrote, asking LLMs to "Analyze and comment" on it, for the sake of insights. There were moments when I asked for code snippets, and almost every snippet they generated either worked or needed only a few fixes.

They've been getting good at this, but not good enough to really replace my own coding and analysis. Instead, they're getting much better at poetry (maybe because their training data is mostly books and poetry) and at sentiment analysis. I use many LLMs simultaneously in order to compare them:

  • The free version of Google Gemini is becoming lazy (short answers, superficial analysis, trouble keeping context, drafts that aren't as diverse as they used to be, among other problems)
  • The free version of ChatGPT is a bit better (it can keep context and give detailed answers) but not good enough (it does hallucinate sometimes: good for surrealist poetry, but bad for code and other technical matters where precision and coherence matter)
  • Claude is laughably hypersensitive and self-censors on certain words regardless of context (got code or text that remotely mentions the word "explode", as in PHP's explode function? "Sorry, can't comment on texts alluding to dangerous practices such as involving explosives". I mean, WHAT?!?!)
  • Bing Copilot has web search, but it has a context limit of 5 messages, so it's only usable for quick and short things.
  • The same goes for Perplexity
  • Mixtral is very hallucination-prone (i.e. it does not properly cohere)
  • Llama has been the best of all (via DDG's "AI Chat" feature), although it sometimes glitches (i.e. it starts to output repeated strings ad æternum)

As you can see, I've tried almost all of them. In summary, while it's good to have such tools, they should never replace human intelligence... Or, at least, they shouldn't...

Problem is, dev companies generally focus on "efficiency" over "efficacy", wanting the shortest deadlines while also wanting some perfection. Very understandable demands, but humans are humans, not robots. We need our time to deliver, and we need to walk cautiously through all the steps needed to finally deploy something (especially big things), or it'll become XGH programming (Extreme Go Horse). And machines can't do that well enough yet. For now, LLM-driven development is XGH: really fast, but far from coherent about the big picture (be it a platform, a module, a website, etc).

[–] lysdexic 2 points 2 months ago

Claude is laughably hypersensitive and self-censoring to certain words independently of contexts (...)

That's not a problem, nor Claude's main problem.

Claude's main problem is that it is frequently down, unreliable, and extremely buggy. Overall I think it might be better than ChatGPT and Copilot, but it's simply so unstable it becomes unusable.