Open Source

Any speech-to-text library that uses the Whisper API? (feddit.de)

submitted 7 months ago by [email protected] to c/[email protected]

So I have the API working, in that I can send audio files and get text back, but what I am looking for is a robust way to add streaming functionality. For example, if there is a short period of silence, it should stop recording and send the audio to the API.

Is there any such library in Python?
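
To make the requirement concrete, here is a minimal sketch of the silence-based chunking described above, using only the standard library. It assumes the audio arrives as fixed-size frames of 16-bit signed mono PCM. The names `rms` and `split_on_silence`, the threshold of 500, and the frame counts are illustrative assumptions, not taken from any existing library.

```python
import array
import math
from typing import Iterable, Iterator, List


def rms(frame: bytes) -> float:
    """Root-mean-square level of one frame of 16-bit signed mono PCM."""
    samples = array.array("h")
    samples.frombytes(frame)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def split_on_silence(
    frames: Iterable[bytes],
    threshold: float = 500.0,      # assumed loudness cutoff, tune per microphone
    max_silent_frames: int = 10,   # assumed pause length that ends an utterance
) -> Iterator[bytes]:
    """Yield one bytes object per utterance.

    An utterance starts at the first frame at or above `threshold` and ends
    once `max_silent_frames` consecutive quieter frames have been seen.
    Leading and trailing silence is dropped.
    """
    buffer: List[bytes] = []
    silent_run = 0
    for frame in frames:
        loud = rms(frame) >= threshold
        if not buffer:
            # Still waiting for speech: ignore quiet frames.
            if loud:
                buffer.append(frame)
            continue
        buffer.append(frame)
        silent_run = 0 if loud else silent_run + 1
        if silent_run >= max_silent_frames:
            # Pause detected: emit the utterance without its trailing silence.
            yield b"".join(buffer[: len(buffer) - silent_run])
            buffer = []
            silent_run = 0
    if buffer:
        # Input ended mid-utterance: emit what was collected.
        yield b"".join(buffer[: len(buffer) - silent_run])
```

Each yielded chunk could then be sent to the API as one request. A plain energy threshold like this is crude and will misfire on background noise, which is why a proper voice activity detector is usually a better fit for real use.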

[–] [email protected] 4 points 7 months ago

I found this so far: https://github.com/KoljaB/RealtimeSTT

Maybe I can modify it to use the Whisper API.
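
I have not checked how RealtimeSTT structures its transcription step, so the following is only a generic sketch of the glue such a change would probably need: wrap a recorded chunk of raw PCM in a WAV container in memory and hand it to whatever function talks to the API. The helper names `pcm_to_wav_bytes` and `transcribe_chunk` are hypothetical, and the `transcribe` callable is a stand-in for the real API client.

```python
import io
import wave
from typing import Callable


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 16000) -> bytes:
    """Wrap raw 16-bit mono PCM in a WAV container, entirely in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)        # mono
        wav.setsampwidth(2)        # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


def transcribe_chunk(
    pcm: bytes,
    transcribe: Callable[[bytes], str],
    sample_rate: int = 16000,
) -> str:
    """Convert one recorded chunk to WAV and pass it to `transcribe`.

    `transcribe` is a placeholder for the real API call: it receives the
    WAV file contents as bytes and returns the recognized text.
    """
    return transcribe(pcm_to_wav_bytes(pcm, sample_rate))
```

Keeping the API call behind a plain callable means the recording and chunking code can be tried with a dummy function first, for example `transcribe_chunk(pcm, lambda wav: "test")`, before any network request is involved.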

[–] [email protected] 4 points 7 months ago (1 children)

Dunno, but this guy (All About AI) builds one with 'faster-whisper', so perhaps you can get a few pointers there? I believe he chunks the audio on silence. He has a few other speech2x videos. Have fun. https://youtu.be/k6nIxWGdrS4

Also: https://github.com/SYSTRAN/faster-whisper

[–] [email protected] 3 points 7 months ago

Just stumbled upon this speedy one: https://github.com/sanchit-gandhi/whisper-jax

And this one for word-level timestamps: https://github.com/m-bain/whisperX

[–] [email protected] 3 points 7 months ago

I don't have the knowledge to answer your question, but you could check how Home Assistant does it. I think that should point you in the right direction.