this post was submitted on 12 Nov 2023

Programmer Humor

[email protected] 5 points 1 year ago* (last edited 1 year ago)

For the purpose of algorithm verification, finite and/or pushdown automata, or probably sometimes even Turing machines, are used, because they are easier to work with. "Real" regular expressions are only nice for writing down a grammar for regular languages in a form the computer can easily interpret, I think. The thing is that regexes in the *nix and programming language world are also used for searching, which is why there are additional special characters to indicate things like "it has to end with ...", and there are shortcuts for when you want a character or sequence to occur

  • at least once,
  • once or never or
  • a specified number of times back to back.
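To make the "shortcut" point concrete, here is a small Python sketch that checks each shortcut against a spelled-out version on a handful of strings. The names `EQUIVALENTS`, `SAMPLES` and `same_on_samples` are made up for illustration, and Python writes "or" as `|`. Agreeing on a few samples is a sanity check, not a proof.

```python
import re

# (shortcut, the same thing spelled out with only grouping, "or" and *)
EQUIVALENTS = [
    (r"a+", r"aa*"),     # at least once
    (r"a?", r"(a|)"),    # once or never
    (r"a{3}", r"aaa"),   # exactly three times back to back
]

SAMPLES = ["", "a", "aa", "aaa", "aaaa", "b", "ab"]


def same_on_samples(p, q, samples=SAMPLES):
    """True if both patterns accept exactly the same sample strings."""
    return all(
        bool(re.fullmatch(p, s)) == bool(re.fullmatch(q, s)) for s in samples
    )


# Maps each shortcut to whether it agreed with its spelled-out form.
results = {short: same_on_samples(short, basic) for short, basic in EQUIVALENTS}
```

`re.fullmatch` is used so that the whole string has to match, which is how the textbook definition works.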

In "standard" regex, you would only have

  • () for grouping,
  • * for 0 or any number of occurrences (so a* means the empty string or a or aa or ...)
  • + for combining two characters/groups with "or" (so a+b means a or b; in programming that is written a|b, and a+ is mostly the same as aa*, so this is a difference)
  • and sometimes some way to have a shortcut for (a+b+c+...+z) if you want to allow any lower case character as the next one
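A rough sketch of how the textbook notation maps onto Python's `re` syntax, assuming the expression only contains plain letters, parentheses, `*` and the textbook `+`. `formal_to_python` and `agrees_with_class` are hypothetical helpers, not part of any library.

```python
import re
import string


def formal_to_python(expr):
    """Hypothetical helper: the textbook '+' (or) becomes Python's '|'."""
    return expr.replace("+", "|")


# (a+b)*c in textbook notation: any mix of a and b, followed by one c
pattern = formal_to_python("(a+b)*c")

# The long way to say "any lower case character": (a+b+c+...+z)
any_lower_formal = "(" + "+".join(string.ascii_lowercase) + ")"
any_lower = formal_to_python(any_lower_formal)


def agrees_with_class(samples):
    """Compare the spelled-out version with the [a-z] shortcut."""
    return all(
        bool(re.fullmatch(any_lower, s)) == bool(re.fullmatch(r"[a-z]", s))
        for s in samples
    )
```

The `[a-z]` character class is the kind of shortcut the last bullet is talking about: same language, much less typing.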

So there are only 4 characters, which have the same expressive power as the extended syntax (backreferences aside, since those go beyond regular languages), with the exception of not being able to indicate that a match should occur at the end or beginning of a string/line (which could even be dropped if different functions or options had been implemented for the tools we now have instead).
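Python's `re` module happens to show what "different functions instead of anchors" could look like: `re.fullmatch`, `re.match` and `re.search` are real functions, while the wrapper names below are only for illustration. One caveat: in Python `$` also matches just before a trailing newline, so the comparison with `^...$` only holds for strings without newlines.

```python
import re


def whole_string(pattern, text):
    """Like ^pattern$, but the anchoring comes from the function."""
    return re.fullmatch(pattern, text) is not None


def at_start(pattern, text):
    """Like ^pattern: the match has to begin at the first character."""
    return re.match(pattern, text) is not None


def anywhere(pattern, text):
    """Plain search: the match may sit anywhere in the text."""
    return re.search(pattern, text) is not None
```

For example, `whole_string("ab*", "abb")` and `anywhere("^ab*$", "abb")` ask the same question, one via the function and one via the special characters.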

So one could say that *nix regex is bloated /s