MacWhisper, native macOS app for Whisper #420
Replies: 45 comments 43 replies
-
Added a bunch more features such as editing and deleting segments, as well as language selection. Love this framework, thanks again!
-
Love it! Working well. Here are a few thoughts from some initial use:
Definitely a good start! Any way to see when updates are released?
-
This is a really good app! I remember that the last time I tried something similar, the AI was a bit shitty with Japanese… I still need to review the extracted text, but it looks really good (at least from what I've seen in the first few minutes). I have two ideas that could be good to have:
About the first point For the 2nd point Also, I would like to ask why MacWhisper isn't on GitHub or GitLab so all of us can collaborate too.
-
Love it! And it works really well.
-
Nice job! Looking forward to live transcription…
-
It would be nice if I could choose the language.
-
Nice app. Please add support for *.ogg files (WhatsApp and Telegram voice messages).
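Until .ogg is supported directly, one possible workaround is to convert the voice message to WAV first. The sketch below only builds the ffmpeg command line; the flags are the ones the whisper.cpp README suggests for producing 16 kHz mono 16-bit WAV input. The helper name `ffmpeg_to_wav_cmd` is made up for illustration, and it is an assumption that ffmpeg is installed and that MacWhisper accepts the resulting WAV file.

```python
from pathlib import Path


def ffmpeg_to_wav_cmd(src: str) -> list[str]:
    """Build an ffmpeg command that converts `src` to 16 kHz mono 16-bit WAV.

    Hypothetical helper: the output is written next to the input,
    with the extension replaced by .wav.
    """
    dst = str(Path(src).with_suffix(".wav"))
    return [
        "ffmpeg",
        "-i", src,            # input file, e.g. a WhatsApp or Telegram voice note
        "-ar", "16000",       # resample to 16 kHz
        "-ac", "1",           # downmix to a single channel
        "-c:a", "pcm_s16le",  # 16-bit little-endian PCM
        dst,
    ]
```

To actually run the conversion, pass the list to `subprocess.run(ffmpeg_to_wav_cmd("voice.ogg"), check=True)`.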
-
Fantastic work. Question: I initially downloaded the non-Pro version and chose only the English dataset. After testing, I want to try the multi-language dataset, but I see no way in the app or on the website to go back and get that file. A pointer would be appreciated.
-
I purchased MacWhisper Pro and love the application. I primarily use it for podcasts and hour-long recordings. I'd like to request a way to export a transcript together with its accompanying audio or video and have it be interactive, so that clicking a paragraph three paragraphs down makes the media jump to that point. Or a program that could do that, as I'd happily pay for something like that. I would also love to see MacWhisper come to iOS. I'd gladly pay for this again just to have it on mobile as well.
-
Also, once I identify a speaker, it would be fantastic for the AI to then label all instances of that voice appropriately!!
-
Batch mode does not work in the Pro version (only reason why I bought it).
-
I used MacWhisper Pro with the medium model to transcribe an interview from a stereo file. I wish I could get MacWhisper to identify each track with a name. It's too much work to do that manually. I see I can add people, but I couldn't figure out how that works. It would be nice if there were YouTube videos showing what it can do and how to do it. Maybe there are, but I haven't found them yet.
-
I got MacWhisper for my MacBook Air M1 running Ventura and bought the Pro version. It then gave me size options, so I chose medium. Can I have Pro large? There was no option to do that with the Pro. I do podcasts with a different person each time and am in no hurry, so I want the best quality I can get.
-
Just found a little bug: it looks like it only listens to the left channel of a stereo file. I kept getting only "[BLANK_AUDIO]" for an mp3 file that clearly had voices in it. The mystery was solved when I opened it in Audacity and saw that the speaking was all in the right channel.
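A possible workaround until this is fixed is to mix both channels down to mono before transcribing, so speech on either side survives. The sketch below is not how MacWhisper handles audio; it is a standard-library-only illustration that assumes the file has already been converted to 16-bit stereo WAV (the mp3 would need converting first). The function names are made up for this example.

```python
import array
import sys
import wave


def downmix_to_mono(frames: bytes) -> bytes:
    """Average interleaved 16-bit stereo samples (L, R, L, R, ...) into mono."""
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    mono = array.array(
        "h",
        ((samples[i] + samples[i + 1]) // 2 for i in range(0, len(samples) - 1, 2)),
    )
    if sys.byteorder == "big":
        mono.byteswap()
    return mono.tobytes()


def stereo_wav_to_mono(src: str, dst: str) -> None:
    """Read a 16-bit stereo WAV file and write a mono copy to `dst`."""
    with wave.open(src, "rb") as reader:
        if reader.getnchannels() != 2 or reader.getsampwidth() != 2:
            raise ValueError("expected a 16-bit stereo WAV file")
        rate = reader.getframerate()
        frames = reader.readframes(reader.getnframes())
    with wave.open(dst, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(rate)
        writer.writeframes(downmix_to_mono(frames))
```

Averaging halves the level of speech that exists on only one channel, which is usually still loud enough to transcribe; copying just the right channel would be the alternative if the left one contains noise.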
-
You should add a settings section to tweak the number of CPUs/threads, weights, etc.
-
Hi, I have MacWhisper Pro and work on a MacBook Air (M1) with AirPods Pro.
What do I have to do to record both my voice and that of my conversation partner via AirPods Pro?
-
How to handle mixed-language audio? (Can it?) AFAIK Whisper itself can handle mixed conversations, and I'm using the Large model that includes all the languages. Is there a limit of some kind in how MacWhisper is set up, or am I perhaps just missing something in the interface?
-
Hi Ian, no, it wasn't FaceTime, it was MS Teams.
On Friday, 8 December 2023 at 09:26, Ian Dundas wrote:
Hi @cgallerhh <https://github.com/cgallerhh>, one of the devs of MacWhisper here - I'm guessing this was FaceTime? As far as I know it's not possible to record a FaceTime call due to privacy constraints in Apple's API.
-
Hi Ian, please find attached an audio file with my voice. It was recording "system audio" -> AirPods and I can only hear my own voice.
Kind regards,
Christian Galler
7. Dec at 09_59_55 Microphone.whisper
<https://drive.google.com/file/d/1Q48QEesAZA2Wg-ivldwvTgn1tO4NP3u2/view?usp=drive_web>
On Fri, 8 Dec 2023 at 09:25, Ian Dundas wrote:
Hi @detchells <https://github.com/detchells>, one of the devs of MacWhisper here, are you able to share such an audio file with me (even just an extract) so that I can do some experimentation? Thanks
-
Hi my friend, I'm not that Japanese guy - I am the AirPods guy with only one speaker in a conversation. It seems you've written to the wrong user.
On Friday, December 8, 2023 at 8:52 PM, Dave Etchells wrote:
Wow, thanks for the super-fast reply, and absolutely, I've uploaded an MP3 file with a representative extract of the conversation.
The linked MP3 has a bit over 4 minutes of conversation, saved at 170-210 Kbps, forced to mono. There are 3 speakers: myself (loud English), a Japanese interviewee (medium-volume Japanese) and the interpreter (soft-spoken, both English and Japanese).
For me, if I select auto-detect, it will grab whatever language appears first in the stream and then just show the others as [speaking Japanese] in English text or the Kanji equivalent in the other direction for Japanese text. If I manually select English or Japanese, it does the same.
Even if I have it transcribe in Japanese, so I know there's Japanese text there, selecting Translate says the data is missing.
Thanks so much for looking into this for me!
(If it could really handle mixed dialogue, this would be incredibly useful to me. I'm also interested to see how it does with broken Jinglish, although I'm not expecting much on that front :-/ - That would be Truly helpful though...)
Hmm, it won't let me attach the file here, so here's a Dropbox link that should give you access, thanks again:
https://www.dropbox.com/scl/fi/iybdicxdkb84647cyf7gn/Canon_clip_for_MacWhisperDevs.mp3?rlkey=xktmfi0z91fk0bx7tixbbjfo4&dl=0
-
Found an issue with transcription of speech that is interrupted by long pauses where music is played. I was using the Spanish language and the small model on an mp3 file. In this case the online training lasted 4h and had several breaks for coffee and lunch during which music was played. Transcription goes well for the first part, then it detects the music (and reports it as [MUSIC]), but it never detects that the speech starts again. It transcribes [MUSIC] until the end of the file. The result is 2h of correct transcription followed by 2h of [MUSIC], while it should have been 2h of transcription + 15min of [MUSIC] + 1h45 of transcription.
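This does not fix the underlying behaviour, but a small script can at least locate where a transcript degenerated into one long run of [MUSIC], so that stretch of audio can be cut out and transcribed again as a separate file. The sketch assumes the transcript has been exported as a list of (start, end, text) tuples in seconds; that layout, the function name and the 30-minute threshold are all assumptions made for illustration, not a MacWhisper format or setting.

```python
from typing import List, Tuple

# Hypothetical segment layout: (start_sec, end_sec, text)
Segment = Tuple[float, float, str]


def long_tag_runs(
    segments: List[Segment], tag: str = "[MUSIC]", max_len: float = 1800.0
) -> List[Tuple[float, float]]:
    """Return (start, end) spans where consecutive segments contain only `tag`
    and the whole span lasts longer than `max_len` seconds."""
    runs: List[Tuple[float, float]] = []
    run_start = None
    run_end = 0.0
    for start, end, text in segments:
        if text.strip() == tag:
            if run_start is None:
                run_start = start  # a new run of tag-only segments begins
            run_end = end
        else:
            if run_start is not None and run_end - run_start > max_len:
                runs.append((run_start, run_end))
            run_start = None
    # Handle a run that continues to the end of the transcript
    if run_start is not None and run_end - run_start > max_len:
        runs.append((run_start, run_end))
    return runs
```

For the case described above, the returned span would cover roughly the last two hours of the recording, which is the part worth re-transcribing on its own.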
-
MacWhisper version 6.11 no longer lets me edit the lines in the transcript to assign speakers! I'm not a comp sci guy, just a simple user. If anyone can help with this, or can suggest a new procedure, please let me know. Without the ability to assign speakers, it has essentially lost 50% of its utility. Yikes!
-
Does anyone know of something similar for Windows?
-
This app is absolutely phenomenal, I've bought it 3 times now for myself and two friends. Is there any chance there's an equivalent for Android? I'm looking for something that allows at the very least a long, uninterrupted mic recording where I can just put the phone down and let it pick everything up and transcribe it later. I realize I could just do an audio recording, move the file and transcribe it via the MacWhisper app later, but if I could do it all on-device that would be incredible. Thanks!
-
I am using the paid Pro version of the app. Is this thread the only community for the app? I see a feedback email address for the app but no community links. I really think the app would benefit from a forum or a Discord server. Even a GitHub discussion would be fine.
-
Now that I have marked speakers for the transcription, I see no output format which preserves this information. Exporting segments to PDF or HTML seems to just export the original timestamp-based segments. Is there a way I can add this speaker information to the audio segments anywhere?
-
Any suggestions for optimizing speed on an M2 Max MBP with 96 GB? It'd be sweet if there are any benchmarks, or if you have any advice on models (distilled, turbo, normal?), the audio pipeline and encoding/decoding format to use (assuming WhisperKit is recommended), and the effects of pipeline and encoding/decoding compute units on flash attention, greedy vs beam search, etc... I saw some notes on CoreML + GPU processing in discussions back in March but have not been following the repo closely enough to know whether this has been implemented (at which point I assume WhisperKit is no longer the best option? Although tbh idk what the difference between WhisperKit and .cpp models is other than Swift support).
-
Will Ollama be supported as an alternative to OpenAI and Anthropic at some point?
-
First of all, a massive thanks to @ggerganov for making all this! Most of the low level stuff is voodoo to me, but I was able to get a native macOS app up and running thanks to all your hard work!
MacWhisper lets you run Whisper locally on your Mac without having to install anything else.
Features
MacWhisper is very basic right now, so please let me know if you run into anything. You can download it for free here:
http://goodsnooze.gumroad.com/l/macwhisper