this post was submitted on 01 Feb 2024
315 points (98.5% liked)

Comics

A community for sharing comics related to programming

Hover Text:

Wait, forgot to escape a space. Wheeeeee[taptaptap]eeeeee!

Transcript

[in a yellow box:]
Whenever I learn a new skill I concoct elaborate fantasy scenarios where it lets me save the day.

Megan: Oh no! The killer must have followed her on vacation!
[Megan points to computer.]
Megan: But to find them we'd have to search through 200 MB of emails looking for something formatted like an address!
Cueball: It's hopeless!

Off-panel voice: Everybody stand back.

Off-panel voice: I know regular expressions.

[A man swings in on a rope, toward the computer.]

tap tap
The word PERL! appears in a bubble.

[The man swings away, and the other characters cheer.]

[–] [email protected] 33 points 9 months ago (1 children)

I learned regex in a data entry job where I received spreadsheets of user info that the users had entered themselves. The number of times I went from a 40-minute by-hand solution to a 30-second one was astounding.

Need to clean up capitalization? Regex.

Need to fix all the phone numbers with dashes in them? Regex.

Need to make sure all the emails are valid? Regex.

So many hours of tedium saved. It really is one of the most powerful tools to have in your back pocket.
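For anyone curious what that kind of cleanup can look like, here is a minimal Python sketch. The function names and sample values are made up for illustration, and it assumes names are plain ASCII and that phone numbers only need their dashes removed; real spreadsheets are usually messier.

```python
import re


def fix_capitalization(name: str) -> str:
    """Capitalize each run of ASCII letters, e.g. 'jOHN smith' -> 'John Smith'."""
    return re.sub(r"[A-Za-z]+", lambda m: m.group(0).capitalize(), name.strip())


def strip_phone_dashes(phone: str) -> str:
    """Remove dashes that sit directly between two digits."""
    return re.sub(r"(?<=\d)-(?=\d)", "", phone)


if __name__ == "__main__":
    # Hypothetical sample rows, not real data.
    print(fix_capitalization("  jOHN smith "))
    print(strip_phone_dashes("555-867-5309"))
```

I would still eyeball the results afterwards, since a blanket capitalization rule will mangle names like "McDonald" or "van der Berg".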

[–] UndercoverUlrikHD 37 points 9 months ago (4 children)

Need to make sure all the emails are valid? Regex.

Hmm...

[–] [email protected] 19 points 9 months ago (1 children)
[–] [email protected] 16 points 9 months ago (1 children)

Who's gonna tell them? I'd do it but I'm still busy parsing HTML with regex.... it's working any minute now!

[–] [email protected] 2 points 9 months ago (1 children)

What am I missing? I typically used it as a sanity check and would vet the changes. Never as a one-click modify. Or is there something else I should know about?

[–] [email protected] 4 points 9 months ago* (last edited 9 months ago) (2 children)

This short article has some good examples at the top: https://sigparser.com/developers/email-parsing/regex-validate-email-address/

Basically, you can very easily make a regex to match 99% of email addresses, but technically an email could be something like "!@[124.35.6.72]"

[–] [email protected] 6 points 9 months ago (2 children)

Ah, yeah. It was never meant to be the be-all and end-all. Just something to clean up the complete trash before I started proofreading. Besides, these were emails the customer provided and could easily be changed afterwards. Their fault if we get bad emails in the list ¯\_(ツ)_/¯

[–] [email protected] 3 points 9 months ago

This is the way

[–] [email protected] 1 points 9 months ago* (last edited 9 months ago)

You're completely correct. In practice, it's usually good enough to just check for ".+@.+" or ".+@.+\..+". Why? It's broad enough to allow almost everything, and it rejects the most obvious typos. In the end, the final verification is to send an email containing a link that the user has to click to finalize the signup or change. Even if you had a regex that matched exactly the addresses the standard allows, you still wouldn't know whether a given address really exists.
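As a rough sketch of those two loose checks in Python (the helper name `looks_like_email` and the sample addresses are hypothetical, and this is only a sanity check, not validation against the standard):

```python
import re

# The two loose patterns quoted above, compiled once.
LOOSE = re.compile(r".+@.+")
LOOSE_WITH_DOT = re.compile(r".+@.+\..+")


def looks_like_email(text: str, require_dot: bool = False) -> bool:
    """Return True if the whole string matches the chosen loose pattern."""
    pattern = LOOSE_WITH_DOT if require_dot else LOOSE
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    # Hypothetical inputs; the odd-looking literal is the one mentioned upthread.
    for candidate in ["user@example.com", "userexample.com", "!@[124.35.6.72]"]:
        print(candidate, looks_like_email(candidate))
```

Note that the stricter pattern rejects dotless hosts such as "user@localhost", so which one to use depends on whether those matter to you. Either way, the confirmation email remains the real test.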

[–] [email protected] 4 points 9 months ago (1 children)

I wrote a regex that matches 100% of email addresses and had no problems using it. It's ".+@.+"

[–] [email protected] 1 points 9 months ago* (last edited 9 months ago)

Meme aside, that's what I'd use tbh. Or the ultimate email validation: just send the signup email, and if they typed an invalid address it won't go through

[–] [email protected] 10 points 9 months ago (1 children)

Looks like we found the intern who coded the check that rejects “[email protected]”.

Followed up by tech support successfully emailing me at that address to tell me to use a different email address.

[–] [email protected] 4 points 9 months ago

Now do the one where spaces are allowed in the local part as well

[–] [email protected] 2 points 9 months ago

Email addresses are technically a regular language, but I have seen a regex that takes up a whole page and claims to be the first standard-compliant one.