Explain the problem as you see it
Currently, transcribing audio clips goes through the Whisper API, which I've found great for speed and accuracy, but it doesn't distinguish between different speakers in the same audio file, an interview or a meeting for instance.
Why is this a problem for you?
For most of my uses of the audio capture and transcribe feature, it’s just me talking for personal notes or todos. But I have used it to record meetings before, and without speaker identification it can be difficult to sort out who said what.
Suggest a solution
I don’t think the Whisper API has that functionality built in (yet?), but it looks like there are some other tools that could work alongside it. I hope one of those could be integrated in the background to create another option: "transcribe audio with multiple speakers".
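As a rough illustration of how such an integration might work: a diarization tool (pyannote.audio is one example) produces speaker turns with timestamps, and Whisper produces transcript segments with timestamps, so the two can be merged by temporal overlap. This is just a sketch under assumed data shapes — the segment and turn dictionaries below are made up for illustration, not the actual output format of either API.

```python
# Hypothetical sketch: merge transcript segments (with start/end times)
# with speaker turns from a separate diarization step, by assigning each
# segment the speaker whose turn overlaps it the most. The dict shapes
# here are assumptions for illustration only.

def overlap(a_start, a_end, b_start, b_end):
    """Length of the overlap between two time intervals, in seconds."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))

def label_segments(transcript_segments, speaker_turns):
    """Assign each transcript segment the speaker with maximal overlap."""
    labeled = []
    for seg in transcript_segments:
        best = max(
            speaker_turns,
            key=lambda t: overlap(seg["start"], seg["end"], t["start"], t["end"]),
        )
        labeled.append({**seg, "speaker": best["speaker"]})
    return labeled

# Example inputs (invented timestamps and labels):
segments = [
    {"start": 0.0, "end": 4.2, "text": "Thanks for joining the call."},
    {"start": 4.5, "end": 7.0, "text": "Happy to be here."},
]
turns = [
    {"start": 0.0, "end": 4.3, "speaker": "SPEAKER_00"},
    {"start": 4.3, "end": 7.5, "speaker": "SPEAKER_01"},
]

for seg in label_segments(segments, turns):
    print(f'{seg["speaker"]}: {seg["text"]}')
```

The output would then read like a labeled transcript ("SPEAKER_00: Thanks for joining the call."), which the app could present as the multi-speaker transcription option.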
1 Comment
Nice, Alex, thanks!