OpenAI's "Whisper" is astonishingly good (TSINAH)
Date: March 30th, 2023 11:51 PM Author: hideous cerise stage
I spent a few hours building a real-time transcription service in Python. I'm running it on a g3s.xlarge instance on AWS, which is overkill, but NVIDIA stopped supporting my computer's old ass graphics cards years ago, so I needed access to the CUDA libraries.
The input was one of my fiancee's pharmacy school lectures. Using ffmpeg, I streamed the audio into a FIFO file. I used PulseAudio's module-pipe-source to read from that file, which, in effect, makes it a virtual microphone. I used a Python microphone library to read from that audio device and then used Whisper to transcribe the audio frames in near real time, as you can see here: https://i.imgur.com/i9vCzd4.mp4
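For anyone curious what the chunking side of a pipeline like this might look like, here is a rough sketch, not the actual service. It assumes the stream carries raw signed 16-bit little-endian mono PCM at 16 kHz (the rate Whisper models work with), and it takes the transcriber as a plain callable. `pcm_chunks`, `transcribe_stream`, and the 5-second window are made-up names and values for illustration; with the openai-whisper package the callable would presumably wrap `model.transcribe` on a float32 array.

```python
import struct
from typing import BinaryIO, Callable, Iterator, List

SAMPLE_RATE = 16_000   # Hz; assumed input rate (Whisper works on 16 kHz mono)
BYTES_PER_SAMPLE = 2   # assumed format: signed 16-bit little-endian PCM


def pcm_chunks(stream: BinaryIO,
               samples_per_chunk: int = SAMPLE_RATE * 5) -> Iterator[List[float]]:
    """Yield chunks of samples scaled to the range [-1.0, 1.0)."""
    chunk_bytes = samples_per_chunk * BYTES_PER_SAMPLE
    while True:
        data = stream.read(chunk_bytes)
        # Drop a trailing odd byte so we only unpack whole samples.
        data = data[: len(data) - len(data) % BYTES_PER_SAMPLE]
        if not data:
            break  # end of stream (the writer closed the pipe)
        count = len(data) // BYTES_PER_SAMPLE
        samples = struct.unpack(f"<{count}h", data)
        yield [s / 32768.0 for s in samples]


def transcribe_stream(stream: BinaryIO,
                      transcribe: Callable[[List[float]], str],
                      samples_per_chunk: int = SAMPLE_RATE * 5) -> Iterator[str]:
    """Feed each chunk to a caller-supplied transcriber and yield non-empty text."""
    for chunk in pcm_chunks(stream, samples_per_chunk):
        text = transcribe(chunk)
        if text:
            yield text
```

To try it against a pipe, you would open the FIFO in binary mode and pass the file object as `stream`. Fixed-size windows are the simplest choice but can cut words in half at chunk boundaries, so a real service would likely want overlapping windows or silence detection.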
The next logical step is to use something like this to interact with ChatGPT by voice to dictate my prompts. This is pretty cool.
(http://www.autoadmit.com/thread.php?thread_id=5316132&forum_id=2#46121579) |
Date: March 30th, 2023 11:53 PM Author: Big fluffy hissy fit
You know Android phones have done this for years, right?
Still cool though.
(http://www.autoadmit.com/thread.php?thread_id=5316132&forum_id=2#46121588) |
|
Date: April 1st, 2023 3:16 PM Author: Razzmatazz Son Of Senegal
Google's latest model is even better:
https://arxiv.org/abs/2303.01037
Probably won't be long before we see models even better than Whisper out there.
(http://www.autoadmit.com/thread.php?thread_id=5316132&forum_id=2#46128652) |
Date: March 30th, 2023 11:59 PM Author: soul-stirring forum main people
Efficient C++ implementation with SIMD acceleration
https://github.com/ggerganov/whisper.cpp
(http://www.autoadmit.com/thread.php?thread_id=5316132&forum_id=2#46121615) |
|
Date: March 31st, 2023 12:14 AM Author: soul-stirring forum main people
SIMD is a way of processing multiple numbers at the same time, instead of one at a time. It stands for Single Instruction Multiple Data.
Nearly every mainstream x86 CPU since Intel introduced MMX in the 90s has supported SIMD instructions. x86_64 has SIMD instruction sets like SSE, AVX, AVX2, and AVX-512. ARM processors have the NEON instruction set.
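To make the idea concrete, here is a toy model in Python. It does not issue any real SIMD instructions; it only counts how many "instructions" a scalar loop and a lane-wise loop would need for the same element-wise addition. `add_scalar`, `add_lanes`, and the lane count of 4 (four 32-bit floats in one 128-bit SSE or NEON register) are illustrative choices.

```python
from typing import List, Sequence, Tuple

LANES = 4  # e.g. four 32-bit floats fit in one 128-bit register


def add_scalar(a: Sequence[float], b: Sequence[float]) -> Tuple[List[float], int]:
    """One addition per step: the 'one number at a time' approach."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    out: List[float] = []
    steps = 0
    for x, y in zip(a, b):
        out.append(x + y)
        steps += 1
    return out, steps


def add_lanes(a: Sequence[float], b: Sequence[float],
              lanes: int = LANES) -> Tuple[List[float], int]:
    """Model of SIMD: each step adds up to `lanes` pairs as a single unit."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    out: List[float] = []
    steps = 0
    for i in range(0, len(a), lanes):
        # On real hardware this whole slice would be a single vector add.
        out.extend(x + y for x, y in zip(a[i:i + lanes], b[i:i + lanes]))
        steps += 1
    return out, steps
```

For eight elements the scalar version takes eight steps and the lane-wise version takes two, which is roughly where the speedup in code like whisper.cpp comes from, memory bandwidth and other overheads aside.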
To run CMake, download and install it on your system. It’s a build system generator that helps you organize disparate C and C++ files and libraries and compile them into a single executable.
(http://www.autoadmit.com/thread.php?thread_id=5316132&forum_id=2#46121674) |