chore: fix audio output on macos #106

Closed

louis030195
Collaborator

@louis030195 louis030195 commented Aug 4, 2024

#101 #70

capture output audio

proof it works:
https://github.com/user-attachments/assets/8ea477ef-f364-4dc5-9aba-0d891ff6d348

but in theory this should crash on macOS >= 15.0

can someone test? i'm on 14.5

@m13v @htalvitie @DmacMcgreg

how to test

git fetch origin
git checkout fix-audio-output-macos
git pull
cargo run --bin screenpipe

if the bug is still there, this should crash in under 2 min

if it doesn't, then problem solved

if it does, we can iterate from there

@htalvitie

htalvitie commented Aug 4, 2024

I tested this branch on my macOS 14.5.

Audio recording (and transcribing) seemed to work. Ran recording for around 5 minutes without any crashes.

The only strange behavior was that I was only able to select my external monitors as possible output devices. When running screenpipe with --list-audio-devices, I get this output:

  1. iPhone Max Microphone (input)
  2. MateView (input)
  3. MateView (input)
  4. RØDE NT-USB Mini (input)
  5. Logi 4K Stream Edition (input)
  6. MacBook Pro Microphone (input)
  7. RØDE Connect System (input)
  8. RØDE Connect Virtual (input)
  9. RØDE Connect Stream (input)
  10. Display 4 (output)
  11. Display 3 (output)

So, for some reason even MacBook Pro speakers are not included in the list of recognized output sources.

@louis030195
Collaborator Author

louis030195 commented Aug 5, 2024

@htalvitie great! thanks a lot

actually that's expected

output audio is captured through Apple's ScreenCaptureKit API, which records audio from the screen, so output devices show up as displays

https://developer.apple.com/documentation/screencapturekit/

but maybe we could make it more intuitive in the future

would be great to have someone on >= 15.0 test this

@m13v
Contributor

m13v commented Aug 6, 2024

[2024-08-06T00:27:49Z INFO screenpipe_audio::core] Recording Display 1 (output) for 30 seconds
thread '<unnamed>' panicked at /Users/matthewdi/.cargo/registry/src/index.crates.io-6f17d22bba15001f/objc_id-0.1.1/src/id.rs:52:9:
Attempted to construct an Id from a null pointer
libc++abi: terminating due to uncaught foreign exception
zsh: abort cargo run --bin screenpipe

@louis030195
Collaborator Author

thanks

this is what we can iterate on

i'm thinking about merging this behind a version check: if macOS <= 14.5, use this hack, otherwise turn audio output off

until we fix it for >= 15.0
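
a rough sketch of what that version check could look like, not what is on the branch. It assumes the version is read from `sw_vers -productVersion`; the function names (`parse_macos_version`, `output_audio_supported`, `current_macos_version`) are made up for illustration, and the 14.5 cutoff is just the number from this thread.

```rust
use std::process::Command;

/// Parse a version string such as "14.5" or "15.0.1" into (major, minor).
/// A missing minor component is treated as 0.
fn parse_macos_version(raw: &str) -> Option<(u32, u32)> {
    let mut parts = raw.trim().split('.');
    let major: u32 = parts.next()?.parse().ok()?;
    let minor: u32 = match parts.next() {
        Some(p) => p.parse().ok()?,
        None => 0,
    };
    Some((major, minor))
}

/// Assumption taken from this thread: the hack is only trusted up to 14.5.
/// Tuples compare lexicographically, so (14, 6) and (15, 0) are both rejected.
fn output_audio_supported(version: (u32, u32)) -> bool {
    version <= (14, 5)
}

/// Ask the OS for its version. Returns None when `sw_vers` is missing
/// (for example when not running on macOS) or its output cannot be parsed.
fn current_macos_version() -> Option<(u32, u32)> {
    let out = Command::new("sw_vers")
        .arg("-productVersion")
        .output()
        .ok()?;
    if !out.status.success() {
        return None;
    }
    parse_macos_version(&String::from_utf8_lossy(&out.stdout))
}
```

at startup the caller would only enable output capture when `current_macos_version()` returns `Some(v)` and `output_audio_supported(v)` is true, and fall back to input-only recording otherwise.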

@louis030195
Collaborator Author

leaving here an experiment i did (not going to push this)

use anyhow::{anyhow, Result};
use log::{debug, info};
use screencapturekit::cm_sample_buffer::CMSampleBuffer;
use screencapturekit::sc_content_filter::{InitParams, SCContentFilter};
use screencapturekit::sc_display::SCDisplay;
use screencapturekit::sc_error_handler::StreamErrorHandler;
use screencapturekit::sc_output_handler::{SCStreamOutputType, StreamOutput};
use screencapturekit::sc_shareable_content::SCShareableContent;
use screencapturekit::sc_stream::SCStream;
use screencapturekit::sc_stream_configuration::SCStreamConfiguration;
use std::path::PathBuf;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::time::Duration;
use tokio::sync::mpsc;

struct DummyErrorHandler;
impl StreamErrorHandler for DummyErrorHandler {
    fn on_error(&self) {
        // The callback carries no error details, so just record that the stream failed.
        log::error!("Screen capture stream reported an error");
    }
}

struct AudioOutput {
    tx: mpsc::Sender<Vec<u8>>,
}

impl StreamOutput for AudioOutput {
    fn did_output_sample_buffer(&self, sample: CMSampleBuffer, of_type: SCStreamOutputType) {
        if let SCStreamOutputType::Audio = of_type {
            let buffer = sample.sys_ref.get_av_audio_buffer_list();
            let flattened_buffer: Vec<u8> = buffer.iter().flat_map(|b| b.data.clone()).collect();
            if let Err(e) = self.tx.blocking_send(flattened_buffer) {
                log::error!("Failed to send audio data: {}", e);
            }
        }
    }
}

#[tokio::main]
async fn main() -> Result<()> {
    env_logger::init();

    #[cfg(not(target_os = "macos"))]
    return Err(anyhow!("This binary only works on macOS"));

    #[cfg(target_os = "macos")]
    {
        let output_path = PathBuf::from("output.mp4");
        let duration = Duration::from_secs(10); // Record for 10 seconds
        let is_running = Arc::new(AtomicBool::new(true));

        info!("Initializing audio capture");
        let display = SCShareableContent::current()
            .displays
            .first()
            .expect("No display found")
            .clone();
        let filter = SCContentFilter::new(InitParams::Display(display));
        let config = SCStreamConfiguration::default();
        let audio_stream = SCStream::new(filter, config, DummyErrorHandler);
        let (tx, rx) = mpsc::channel(1000);

        let is_running_clone = Arc::clone(&is_running);
        tokio::spawn(async move {
            capture_audio(audio_stream, tx, is_running_clone).await;
        });

        info!("Starting FFmpeg process");
        let ffmpeg_result = run_ffmpeg(
            rx,
            48000,
            2,
            &output_path,
            Arc::clone(&is_running),
            duration,
        )
        .await;

        is_running.store(false, Ordering::Relaxed);

        ffmpeg_result?;

        info!("Audio capture completed. Output saved to {:?}", output_path);
        Ok(())
    }
}

#[cfg(target_os = "macos")]
async fn capture_audio(
    mut audio_stream: SCStream,
    tx: mpsc::Sender<Vec<u8>>,
    is_running: Arc<AtomicBool>,
) {
    let audio_output = AudioOutput { tx };
    audio_stream.add_output(audio_output, SCStreamOutputType::Audio);

    if let Err(e) = audio_stream.start_capture() {
        log::error!("Failed to start audio capture: {}", e);
        return;
    }

    while is_running.load(Ordering::Relaxed) {
        tokio::time::sleep(tokio::time::Duration::from_millis(100)).await;
    }

    if let Err(e) = audio_stream.stop_capture() {
        log::error!("Failed to stop audio capture: {}", e);
    }
}

async fn run_ffmpeg(
    mut rx: mpsc::Receiver<Vec<u8>>,
    sample_rate: u32,
    channels: u16,
    output_path: &PathBuf,
    is_running: Arc<AtomicBool>,
    duration: Duration,
) -> Result<()> {
    use std::process::Stdio;
    use tokio::io::AsyncWriteExt;
    use tokio::process::Command;

    debug!("Starting FFmpeg process");
    let mut command = Command::new("ffmpeg");
    command
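        .args(&[
            // Overwrite an existing output file instead of prompting;
            // stdin is busy carrying audio, so a prompt could not be answered.
            "-y",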
            "-f",
            "f32le",
            "-ar",
            &sample_rate.to_string(),
            "-ac",
            &channels.to_string(),
            "-i",
            "pipe:0",
            "-c:a",
            "aac",
            "-b:a",
            "128k",
            "-f",
            "mp4",
            output_path.to_str().unwrap(),
        ])
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped());

    debug!("FFmpeg command: {:?}", command);

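    // Propagate failures to the caller instead of panicking inside a function that returns Result.
    let mut ffmpeg = command
        .spawn()
        .map_err(|e| anyhow!("Failed to spawn FFmpeg process: {}", e))?;
    debug!("FFmpeg process spawned");
    let mut stdin = ffmpeg
        .stdin
        .take()
        .ok_or_else(|| anyhow!("Failed to open FFmpeg stdin"))?;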
    let start_time = std::time::Instant::now();

    while is_running.load(Ordering::Relaxed) {
        tokio::select! {
            Some(data) = rx.recv() => {
                if start_time.elapsed() >= duration {
                    debug!("Duration exceeded, breaking loop");
                    break;
                }
                if let Err(e) = stdin.write_all(&data).await {
                    log::error!("Failed to write audio data to FFmpeg: {}", e);
                    break;
                }
            }
            _ = tokio::time::sleep(Duration::from_millis(100)) => {
                if start_time.elapsed() >= duration {
                    debug!("Duration exceeded, breaking loop");
                    break;
                }
            }
        }
    }

    debug!("Dropping stdin");
    drop(stdin);
    debug!("Waiting for FFmpeg process to exit");
    let output = ffmpeg.wait_with_output().await?;
    let status = output.status;

    if !status.success() {
        log::error!("FFmpeg process failed with status: {}", status);
        log::error!("FFmpeg stderr: {}", String::from_utf8_lossy(&output.stderr));
        return Err(anyhow!("FFmpeg process failed"));
    }

    Ok(())
}
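
side note on the `-f f32le` / `-ac 2` arguments above: FFmpeg reads that raw input as interleaved little-endian 32-bit floats. The sketch below shows that byte layout only. It assumes the capture side hands back one `f32` buffer per channel, which is not confirmed in this thread, and both helper names are made up for illustration.

```rust
/// Pack f32 samples into little-endian bytes, the layout `-f f32le` expects.
fn samples_to_f32le(samples: &[f32]) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(samples.len() * 4);
    for s in samples {
        bytes.extend_from_slice(&s.to_le_bytes());
    }
    bytes
}

/// Interleave two per-channel buffers as L, R, L, R, ... for `-ac 2` raw input.
/// Extra samples in the longer buffer are dropped.
fn interleave_stereo(left: &[f32], right: &[f32]) -> Vec<f32> {
    left.iter()
        .zip(right.iter())
        .flat_map(|(l, r)| [*l, *r])
        .collect()
}
```

if the buffers already arrive interleaved, only `samples_to_f32le` would be needed before writing to FFmpeg's stdin.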
