this post was submitted on 13 Aug 2024

Nursing student here!

So we get a shit load of reading assignments, and since everything's digital nowadays, I've been leaning a lot on text-to-speech software that effectively converts reading assignments to listening assignments.

The problem is textbooks have a LOT of just... noise. Every image has something like "FIGURE 13.5 SURGICAL DISASTERS!" "FIGURE 13.6 YOU GOT SUMMONED TO COURT!" etc. In-text citations are EVERYWHERE, copyright info is EVERYWHERE... reading the content, you just skip over all that crap, but pasting it into a TTS service, all that trash gets spoken aloud and adds up to a huge time sink every chapter, and distracts from the actual lesson.

Googling it, the best I've been able to come up with is doing a find and replace in MS Word for things like FIGURE **.*^13 with wildcards on and the replace field blank... but it's not very consistent: sometimes it works, sometimes not. Same with nuking parentheses and the text within them using \(*\)

All that said, I'm wondering if I'm approaching this wrong by using MS Word in the first place. It would be absolutely amazing if I could save all the commands on standby, then run them all at once. By the end of the school program, we're talking like 100 chapters from multiple books, so anything that lets me just nuke huge batches of BS as quickly as possible and dive right into the listening would be a godsend.

Thanks all!!

[–] [email protected] 2 points 2 months ago

In that case, if wildcards aren't enough, I'd use an LLM - ChatGPT or Llama can handle simple regex, and you can just try the result out and see if it worked right.

Honestly, as a programmer, I'd advise you to learn Python or JavaScript before diving into regex. If you could mess with HTML without guidance, you've already crossed the big gap that separates people who can code from those who can't: your eyes didn't glaze over when you looked at something you didn't understand. Writing a script to do custom string replacements isn't hard. It's less efficient, but it'll stick with you in a way that regex won't.
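To give you an idea of what I mean, here's a rough sketch of a no-regex cleanup script in Python. The function names and the noise markers (all-caps "FIGURE ", "Copyright", "©") are just my guesses based on your examples, so treat them as placeholders and adjust them to what your textbooks actually contain.

```python
def strip_parentheticals(line: str) -> str:
    """Drop everything between ( and ), including the parentheses."""
    out = []
    depth = 0  # how many open parentheses we are currently inside
    for ch in line:
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
    return "".join(out)


def clean_chapter(text: str) -> str:
    """Remove caption and copyright lines, then strip parenthesised text."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        # Assumed noise markers; change these to match your books.
        if stripped.startswith("FIGURE "):
            continue
        if stripped.lower().startswith("copyright") or stripped.startswith("©"):
            continue
        cleaned = strip_parentheticals(line)
        # Collapse the double spaces left behind by removed parentheticals.
        kept.append(" ".join(cleaned.split()))
    return "\n".join(kept)


if __name__ == "__main__":
    sample = (
        "FIGURE 13.5 SURGICAL DISASTERS!\n"
        "Wash your hands (Smith, 2020) often.\n"
        "Copyright 2024 Some Publisher"
    )
    print(clean_chapter(sample))
```

Fair warning: this nukes every parenthetical, not just citations, so a useful aside like "(normal range 60-100)" goes too. An unclosed "(" will also swallow the rest of that line. Try it on one chapter and skim the result before trusting it with the other 99.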

I use regex when I need speed, but it's a very powerful one-trick pony, and the problem is that it's extremely dense.

With code, you write something that follows your thoughts step by step; with regex, you write a pattern that matches your intentions.
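If you do end up going the regex route, this is roughly what "save all the commands and run them at once" could look like in Python. It's only a sketch: the three patterns are my assumptions about what your noise looks like, and the names `RULES`, `clean` and `clean_folder`, along with the `_clean.txt` suffix, are made up for this example. It also assumes each chapter is already saved as a plain .txt file.

```python
import re
from pathlib import Path

# Saved "find and replace" rules, applied in order. All patterns are guesses.
RULES = [
    # Whole lines that start with a figure label such as "FIGURE 13.5 ...".
    (re.compile(r"^[ \t]*FIGURE\s+\d+\.\d+.*$\n?", re.MULTILINE), ""),
    # Parenthesised text containing a four-digit year, e.g. "(Smith, 2020)".
    (re.compile(r"[ \t]*\([^()]*\b\d{4}\b[^()]*\)"), ""),
    # Whole lines that start with "Copyright" or the © sign.
    (re.compile(r"^[ \t]*(?:Copyright|©).*$\n?", re.MULTILINE | re.IGNORECASE), ""),
]


def clean(text: str) -> str:
    """Run every saved rule over the text, one after another."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


def clean_folder(folder: str) -> None:
    """Write a cleaned copy of every .txt file in the folder."""
    for path in sorted(Path(folder).glob("*.txt")):
        if path.stem.endswith("_clean"):
            continue  # skip output from an earlier run
        cleaned = clean(path.read_text(encoding="utf-8"))
        path.with_name(path.stem + "_clean.txt").write_text(cleaned, encoding="utf-8")


if __name__ == "__main__":
    sample = (
        "Intro text.\n"
        "FIGURE 13.5 SURGICAL DISASTERS!\n"
        "Wash your hands (Smith, 2020) often.\n"
        "© 2024 Some Publisher\n"
        "Done."
    )
    print(clean(sample))
```

The citation rule only removes parentheses that contain a year, so other asides survive. That's a design choice you can loosen or tighten. You can also see the density problem here: each pattern is one line, and each one takes a minute to read.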