OpenAI's "Whisper" is astonishingly good (TSINAH)
Date: March 30th, 2023 11:51 PM Author: boyish bawdyhouse
I spent a few hours building a real-time transcription service in Python. I'm running it on a g3s.xlarge instance on AWS, which is overkill, but NVIDIA stopped supporting my computer's old-ass graphics cards years ago, so I needed a machine with access to the CUDA libraries.
The input was one of my fiancée's pharmacy school lectures. Using ffmpeg, I streamed the lecture's audio into a FIFO file (a named pipe). I used PulseAudio's module-pipe-source to read from that file, which, in effect, makes it a virtual microphone. I used a Python microphone library to read from that audio device and then used Whisper to transcribe the audio frames in near real-time, as you can see here: https://i.imgur.com/i9vCzd4.mp4
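For anyone who wants to try something similar, here is a rough sketch of the FIFO-to-Whisper part. It is not the exact setup described above: it skips PulseAudio and the microphone library and reads raw PCM straight from the pipe. The FIFO path, the 5-second chunk length, the "base" model size, and the function names are all assumptions for illustration. It also assumes the writer side is something like `ffmpeg -re -i lecture.mp4 -f s16le -ac 1 -ar 16000 -y /tmp/audio.fifo`.

```python
from typing import BinaryIO, Iterator

SAMPLE_RATE = 16_000   # Whisper models work on 16 kHz mono audio
BYTES_PER_SAMPLE = 2   # signed 16-bit little-endian PCM (ffmpeg's s16le)


def pcm_chunks(stream: BinaryIO, seconds: float = 5.0) -> Iterator[bytes]:
    """Yield fixed-length chunks of raw PCM from a stream such as an open FIFO.

    The last chunk may be shorter if the writer closes the pipe mid-chunk.
    """
    chunk_size = int(SAMPLE_RATE * seconds) * BYTES_PER_SAMPLE
    while True:
        buf = bytearray()
        while len(buf) < chunk_size:
            data = stream.read(chunk_size - len(buf))
            if not data:              # empty read: the writer closed the pipe
                break
            buf.extend(data)
        if not buf:
            return
        yield bytes(buf)
        if len(buf) < chunk_size:     # short chunk means end of stream
            return


def transcribe_fifo(path: str, model_name: str = "base") -> None:
    """Transcribe audio arriving on a FIFO, one chunk at a time (hypothetical helper)."""
    # Third-party imports are kept local so pcm_chunks() works without them.
    import numpy as np
    import whisper  # pip install openai-whisper

    model = whisper.load_model(model_name)
    with open(path, "rb") as fifo:
        for chunk in pcm_chunks(fifo):
            # Drop a trailing odd byte so the buffer is whole 16-bit samples.
            chunk = chunk[: len(chunk) - len(chunk) % BYTES_PER_SAMPLE]
            if not chunk:
                continue
            # Convert int16 PCM to float32 in [-1, 1), which transcribe() accepts.
            audio = np.frombuffer(chunk, dtype="<i2").astype(np.float32) / 32768.0
            print(model.transcribe(audio)["text"].strip())
```

Calling `transcribe_fifo("/tmp/audio.fifo")` while ffmpeg is writing to the pipe should print text roughly every five seconds of audio. Cutting at fixed intervals can split a word across two chunks, so treat this as a starting point rather than a finished service.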
The next logical step is to use something like this to interact with ChatGPT by voice to dictate my prompts. This is pretty cool.
(http://www.autoadmit.com/thread.php?thread_id=5316132&forum_id=2#46121579) |
Date: March 30th, 2023 11:53 PM Author: glittery casino
You know Android phones have done this for years, right?
Still cool though.
(http://www.autoadmit.com/thread.php?thread_id=5316132&forum_id=2#46121588) |
Date: April 1st, 2023 3:16 PM Author: Excitant navy plaza
Google's latest model is even better:
https://arxiv.org/abs/2303.01037
Probably won't be long before we see models even better than Whisper out there.
(http://www.autoadmit.com/thread.php?thread_id=5316132&forum_id=2#46128652) |
Date: March 30th, 2023 11:59 PM Author: Soul-stirring Bat-shit-crazy Property
Efficient C++ implementation with SIMD acceleration
https://github.com/ggerganov/whisper.cpp
(http://www.autoadmit.com/thread.php?thread_id=5316132&forum_id=2#46121615) |
Date: March 31st, 2023 12:14 AM Author: Soul-stirring Bat-shit-crazy Property
SIMD is a way of processing multiple numbers at the same time, instead of one at a time. It stands for Single Instruction Multiple Data.
Pretty much every mainstream x86 CPU since the introduction of Intel’s MMX in the 90s has supported SIMD instructions. x86_64 has SIMD instruction sets like SSE, AVX, AVX2, and AVX-512. ARM processors have the NEON instruction set.
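To make the idea concrete, here is a toy model in plain Python. Python does not expose SIMD instructions, so this only imitates the shape of the work: the scalar version does one addition per step, while the SIMD-style version handles a whole group of "lanes" per step. The lane count of 4 (think four 32-bit values in a 128-bit register) and the function names are assumptions for illustration. Real speedups come from the hardware, not from this loop.

```python
LANES = 4  # assumed register width: four 32-bit values in 128 bits


def scalar_add(a, b):
    """Baseline: one addition per loop step."""
    out = []
    for x, y in zip(a, b):
        out.append(x + y)
    return out


def simd_style_add(a, b, lanes=LANES):
    """Model of SIMD: each loop step processes `lanes` elements at once.

    Returns the result and the number of 'vector instructions' issued.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    out = []
    steps = 0
    for i in range(0, len(a), lanes):
        va, vb = a[i:i + lanes], b[i:i + lanes]
        # This whole line stands in for a single hardware vector add.
        out.extend(x + y for x, y in zip(va, vb))
        steps += 1
    return out, steps
```

With 8 elements and 4 lanes, the scalar loop runs 8 additions while the SIMD-style loop issues 2 steps. That ratio is the basic reason vectorized code like whisper.cpp can be so much faster on the same CPU.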
To run CMake, download and install it on your system. It’s a build system that helps you organize disparate C and C++ files and libraries and compile them into a single executable.
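Once CMake is installed, a typical out-of-source build is two commands: configure, then build. Here is a small Python wrapper around them as a sketch. The directory names and function names are placeholders, and it assumes a CMake new enough to support the `-S`/`-B` options (3.13 or later). Check the project's own README for any project-specific options.

```python
import shutil
import subprocess


def cmake_commands(source_dir: str, build_dir: str) -> list[list[str]]:
    """Return the configure and build command lines for an out-of-source build."""
    return [
        # Configure: read CMakeLists.txt in source_dir, generate into build_dir.
        ["cmake", "-S", source_dir, "-B", build_dir, "-DCMAKE_BUILD_TYPE=Release"],
        # Build: compile whatever the configure step generated.
        ["cmake", "--build", build_dir, "--config", "Release"],
    ]


def build(source_dir: str, build_dir: str) -> None:
    """Run configure and build, stopping at the first failure (hypothetical helper)."""
    if shutil.which("cmake") is None:
        raise RuntimeError("cmake not found on PATH; install it first")
    for cmd in cmake_commands(source_dir, build_dir):
        subprocess.run(cmd, check=True)
```

For example, `build("whisper.cpp", "whisper.cpp/build")` would configure and compile a checkout sitting in a `whisper.cpp` directory, assuming that is where you cloned it.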
(http://www.autoadmit.com/thread.php?thread_id=5316132&forum_id=2#46121674) |