game_engine/docs/superpowers/plans/2026-03-25-phase6-1-audio.md
2026-03-25 11:37:16 +09:00

Phase 6-1: Audio System Foundation Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: A basic audio system that loads WAV files and plays sound through WASAPI

Architecture: Create a new voltex_audio crate. The WAV parser and mixing logic are implemented as pure functions so they are testable. The WASAPI FFI lives in its own module. The main and audio threads communicate over an mpsc channel. AudioSystem provides the public API.

Tech Stack: Rust, Windows WASAPI (COM FFI), std::sync::mpsc, std::thread

Spec: docs/superpowers/specs/2026-03-25-phase6-1-audio.md


File Structure

voltex_audio (new)

  • crates/voltex_audio/Cargo.toml — crate configuration (Create)
  • crates/voltex_audio/src/lib.rs — public exports (Create)
  • crates/voltex_audio/src/audio_clip.rs — AudioClip type (Create)
  • crates/voltex_audio/src/wav.rs — WAV parser (Create)
  • crates/voltex_audio/src/mixing.rs — pure mixing functions (Create)
  • crates/voltex_audio/src/wasapi.rs — WASAPI FFI bindings (Create)
  • crates/voltex_audio/src/audio_system.rs — AudioSystem API + audio thread (Create)

Workspace (modified)

  • Cargo.toml — workspace members + dependencies (Modify)

Example (new)

  • examples/audio_demo/Cargo.toml (Create)
  • examples/audio_demo/src/main.rs (Create)

Task 1: Crate Setup + AudioClip Type

Files:

  • Create: crates/voltex_audio/Cargo.toml

  • Create: crates/voltex_audio/src/lib.rs

  • Create: crates/voltex_audio/src/audio_clip.rs

  • Modify: Cargo.toml (workspace)

- [ ] Step 1: Create Cargo.toml

# crates/voltex_audio/Cargo.toml
[package]
name = "voltex_audio"
version = "0.1.0"
edition = "2021"

[dependencies]
- [ ] Step 2: Add to the workspace

Add "crates/voltex_audio" to the workspace members in Cargo.toml, and add voltex_audio = { path = "crates/voltex_audio" } to workspace.dependencies.
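After this step the root manifest contains entries along these lines (a sketch; the existing members list depends on the current workspace):

```toml
# Cargo.toml (workspace root) — relevant additions only
[workspace]
members = [
    # ...existing crates...
    "crates/voltex_audio",
]

[workspace.dependencies]
voltex_audio = { path = "crates/voltex_audio" }
```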

- [ ] Step 3: Write audio_clip.rs
// crates/voltex_audio/src/audio_clip.rs

/// Decoded audio data with interleaved f32 samples normalized to -1.0..1.0.
#[derive(Clone)]
pub struct AudioClip {
    pub samples: Vec<f32>,
    pub sample_rate: u32,
    pub channels: u16,
}

impl AudioClip {
    pub fn new(samples: Vec<f32>, sample_rate: u32, channels: u16) -> Self {
        Self { samples, sample_rate, channels }
    }

    /// Number of sample frames (total samples / channels).
    pub fn frame_count(&self) -> usize {
        if self.channels == 0 { 0 } else { self.samples.len() / self.channels as usize }
    }

    /// Duration in seconds.
    pub fn duration(&self) -> f32 {
        if self.sample_rate == 0 { 0.0 } else { self.frame_count() as f32 / self.sample_rate as f32 }
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_mono_clip() {
        let clip = AudioClip::new(vec![0.0; 44100], 44100, 1);
        assert_eq!(clip.frame_count(), 44100);
        assert!((clip.duration() - 1.0).abs() < 1e-5);
    }

    #[test]
    fn test_stereo_clip() {
        let clip = AudioClip::new(vec![0.0; 88200], 44100, 2);
        assert_eq!(clip.frame_count(), 44100);
        assert!((clip.duration() - 1.0).abs() < 1e-5);
    }
}
- [ ] Step 4: Write lib.rs
// crates/voltex_audio/src/lib.rs
pub mod audio_clip;
pub use audio_clip::AudioClip;
- [ ] Step 5: Run the tests

Run: cargo test -p voltex_audio. Expected: 2 PASS

- [ ] Step 6: Commit
git add crates/voltex_audio/ Cargo.toml
git commit -m "feat(audio): add voltex_audio crate with AudioClip type"

Task 2: WAV Parser

Files:

  • Create: crates/voltex_audio/src/wav.rs

  • Modify: crates/voltex_audio/src/lib.rs

- [ ] Step 1: Write wav.rs

// crates/voltex_audio/src/wav.rs
use crate::audio_clip::AudioClip;

/// Parse a WAV file from raw bytes. Supports PCM 16-bit mono/stereo only.
pub fn parse_wav(data: &[u8]) -> Result<AudioClip, String> {
    if data.len() < 44 {
        return Err("WAV too short".into());
    }

    // RIFF header
    if &data[0..4] != b"RIFF" {
        return Err("Missing RIFF header".into());
    }
    if &data[8..12] != b"WAVE" {
        return Err("Missing WAVE identifier".into());
    }

    // Find fmt chunk
    let (fmt_offset, _fmt_size) = find_chunk(data, b"fmt ")
        .ok_or("Missing fmt chunk")?;

    let format_tag = read_u16_le(data, fmt_offset);
    if format_tag != 1 {
        return Err(format!("Unsupported format tag: {} (only PCM=1)", format_tag));
    }

    let channels = read_u16_le(data, fmt_offset + 2);
    if channels != 1 && channels != 2 {
        return Err(format!("Unsupported channel count: {}", channels));
    }

    let sample_rate = read_u32_le(data, fmt_offset + 4);
    let bits_per_sample = read_u16_le(data, fmt_offset + 14);
    if bits_per_sample != 16 {
        return Err(format!("Unsupported bits per sample: {} (only 16)", bits_per_sample));
    }

    // Find data chunk
    let (data_offset, data_size) = find_chunk(data, b"data")
        .ok_or("Missing data chunk")?;

    let num_samples = data_size / 2; // 16-bit = 2 bytes per sample
    let mut samples = Vec::with_capacity(num_samples);

    for i in 0..num_samples {
        let offset = data_offset + i * 2;
        if offset + 2 > data.len() {
            break;
        }
        let raw = read_i16_le(data, offset);
        samples.push(raw as f32 / 32768.0);
    }

    Ok(AudioClip::new(samples, sample_rate, channels))
}

fn find_chunk(data: &[u8], id: &[u8; 4]) -> Option<(usize, usize)> {
    let mut offset = 12; // skip RIFF header
    while offset + 8 <= data.len() {
        let chunk_id = &data[offset..offset + 4];
        let chunk_size = read_u32_le(data, offset + 4) as usize;
        if chunk_id == id {
            return Some((offset + 8, chunk_size));
        }
        offset += 8 + chunk_size;
        // Chunks are 2-byte aligned
        if chunk_size % 2 != 0 {
            offset += 1;
        }
    }
    None
}

fn read_u16_le(data: &[u8], offset: usize) -> u16 {
    u16::from_le_bytes([data[offset], data[offset + 1]])
}

fn read_u32_le(data: &[u8], offset: usize) -> u32 {
    u32::from_le_bytes([data[offset], data[offset + 1], data[offset + 2], data[offset + 3]])
}

fn read_i16_le(data: &[u8], offset: usize) -> i16 {
    i16::from_le_bytes([data[offset], data[offset + 1]])
}

/// Generate a WAV file in memory (PCM 16-bit mono). Useful for testing.
pub fn generate_wav_bytes(samples_f32: &[f32], sample_rate: u32) -> Vec<u8> {
    let num_samples = samples_f32.len();
    let data_size = num_samples * 2; // 16-bit
    let file_size = 36 + data_size;

    let mut buf = Vec::with_capacity(file_size + 8);

    // RIFF header
    buf.extend_from_slice(b"RIFF");
    buf.extend_from_slice(&(file_size as u32).to_le_bytes());
    buf.extend_from_slice(b"WAVE");

    // fmt chunk
    buf.extend_from_slice(b"fmt ");
    buf.extend_from_slice(&16u32.to_le_bytes()); // chunk size
    buf.extend_from_slice(&1u16.to_le_bytes());  // PCM
    buf.extend_from_slice(&1u16.to_le_bytes());  // mono
    buf.extend_from_slice(&sample_rate.to_le_bytes());
    buf.extend_from_slice(&(sample_rate * 2).to_le_bytes()); // byte rate
    buf.extend_from_slice(&2u16.to_le_bytes());  // block align
    buf.extend_from_slice(&16u16.to_le_bytes()); // bits per sample

    // data chunk
    buf.extend_from_slice(b"data");
    buf.extend_from_slice(&(data_size as u32).to_le_bytes());
    for &s in samples_f32 {
        let clamped = s.clamp(-1.0, 1.0);
        let i16_val = (clamped * 32767.0) as i16;
        buf.extend_from_slice(&i16_val.to_le_bytes());
    }

    buf
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_parse_valid_wav() {
        let samples = vec![0.0, 0.5, -0.5, 1.0];
        let wav_data = generate_wav_bytes(&samples, 44100);
        let clip = parse_wav(&wav_data).unwrap();
        assert_eq!(clip.sample_rate, 44100);
        assert_eq!(clip.channels, 1);
        assert_eq!(clip.samples.len(), 4);
    }

    #[test]
    fn test_sample_conversion_accuracy() {
        let samples = vec![1.0, -1.0, 0.0];
        let wav_data = generate_wav_bytes(&samples, 44100);
        let clip = parse_wav(&wav_data).unwrap();
        // 32767/32768 ≈ 0.99997
        assert!((clip.samples[0] - 32767.0 / 32768.0).abs() < 1e-3);
        assert!((clip.samples[1] - (-1.0)).abs() < 1e-3);
        assert!((clip.samples[2]).abs() < 1e-3);
    }

    #[test]
    fn test_invalid_riff() {
        let data = vec![0u8; 44];
        assert!(parse_wav(&data).is_err());
    }

    #[test]
    fn test_too_short() {
        let data = vec![0u8; 10];
        assert!(parse_wav(&data).is_err());
    }

    #[test]
    fn test_roundtrip() {
        let original = vec![0.25, -0.25, 0.5, -0.5];
        let wav_data = generate_wav_bytes(&original, 22050);
        let clip = parse_wav(&wav_data).unwrap();
        assert_eq!(clip.sample_rate, 22050);
        assert_eq!(clip.channels, 1);
        assert_eq!(clip.samples.len(), 4);
        for (a, b) in original.iter().zip(clip.samples.iter()) {
            assert!((a - b).abs() < 0.001);
        }
    }
}
- [ ] Step 2: Register the wav module in lib.rs
pub mod wav;
pub use wav::{parse_wav, generate_wav_bytes};
- [ ] Step 3: Run the tests

Run: cargo test -p voltex_audio. Expected: 7 PASS (2 clip + 5 wav)

- [ ] Step 4: Commit
git add crates/voltex_audio/src/wav.rs crates/voltex_audio/src/lib.rs
git commit -m "feat(audio): add WAV parser with PCM 16-bit support"

Task 3: Pure Mixing Functions

Files:

  • Create: crates/voltex_audio/src/mixing.rs

  • Modify: crates/voltex_audio/src/lib.rs

- [ ] Step 1: Write mixing.rs

// crates/voltex_audio/src/mixing.rs
use crate::audio_clip::AudioClip;

/// A currently playing sound instance.
pub struct PlayingSound {
    pub clip_index: usize,
    pub position: usize, // sample frame position
    pub volume: f32,
    pub looping: bool,
}

/// Mix active sounds into an interleaved f32 output buffer.
/// `output` has `frames * device_channels` elements.
/// Returns indices of sounds that finished (non-looping, reached end).
pub fn mix_sounds(
    output: &mut [f32],
    playing: &mut Vec<PlayingSound>,
    clips: &[AudioClip],
    device_sample_rate: u32,
    device_channels: u16,
    frames: usize,
) {
    // Zero output
    for s in output.iter_mut() {
        *s = 0.0;
    }

    let mut finished = Vec::new();

    for (idx, sound) in playing.iter_mut().enumerate() {
        if sound.clip_index >= clips.len() {
            finished.push(idx);
            continue;
        }
        let clip = &clips[sound.clip_index];
        let clip_frames = clip.frame_count();
        if clip_frames == 0 {
            finished.push(idx);
            continue;
        }

        let rate_ratio = clip.sample_rate as f64 / device_sample_rate as f64;

        for frame in 0..frames {
            let clip_frame_f = sound.position as f64 + frame as f64 * rate_ratio;
            let clip_frame = clip_frame_f as usize;

            if clip_frame >= clip_frames {
                if sound.looping {
                    // Simplified: restart from the beginning but leave the
                    // rest of this buffer silent; playback resumes on the
                    // next call, with `position` wrapped below.
                    sound.position = 0;
                    break;
                } else {
                    finished.push(idx);
                    break;
                }
            }

            let out_offset = frame * device_channels as usize;

            for ch in 0..device_channels as usize {
                let clip_ch = if ch < clip.channels as usize { ch } else { 0 };
                let sample_idx = clip_frame * clip.channels as usize + clip_ch;
                if sample_idx < clip.samples.len() {
                    let sample = clip.samples[sample_idx] * sound.volume;
                    if out_offset + ch < output.len() {
                        output[out_offset + ch] += sample;
                    }
                }
            }
        }

        // Advance position
        let advanced = (frames as f64 * rate_ratio) as usize;
        sound.position += advanced;

        // Handle looping wrap
        if sound.position >= clip_frames && sound.looping {
            sound.position %= clip_frames;
        }
    }

    // Remove finished sounds (reverse order to preserve indices)
    finished.sort_unstable();
    finished.dedup();
    for &idx in finished.iter().rev() {
        if idx < playing.len() {
            playing.remove(idx);
        }
    }

    // Clamp output
    for s in output.iter_mut() {
        *s = s.clamp(-1.0, 1.0);
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use crate::AudioClip;

    #[test]
    fn test_single_sound_volume() {
        let clip = AudioClip::new(vec![0.5, 0.5, 0.5, 0.5], 44100, 1);
        let mut playing = vec![PlayingSound {
            clip_index: 0, position: 0, volume: 0.5, looping: false,
        }];
        let mut output = vec![0.0; 4];
        mix_sounds(&mut output, &mut playing, &[clip], 44100, 1, 4);
        for s in &output {
            assert!((*s - 0.25).abs() < 1e-5); // 0.5 * 0.5
        }
    }

    #[test]
    fn test_two_sounds_sum() {
        let clip = AudioClip::new(vec![0.4; 4], 44100, 1);
        let mut playing = vec![
            PlayingSound { clip_index: 0, position: 0, volume: 1.0, looping: false },
            PlayingSound { clip_index: 0, position: 0, volume: 1.0, looping: false },
        ];
        let mut output = vec![0.0; 4];
        mix_sounds(&mut output, &mut playing, &[clip], 44100, 1, 4);
        for s in &output {
            assert!((*s - 0.8).abs() < 1e-5); // 0.4 + 0.4
        }
    }

    #[test]
    fn test_clipping() {
        let clip = AudioClip::new(vec![0.9; 4], 44100, 1);
        let mut playing = vec![
            PlayingSound { clip_index: 0, position: 0, volume: 1.0, looping: false },
            PlayingSound { clip_index: 0, position: 0, volume: 1.0, looping: false },
        ];
        let mut output = vec![0.0; 4];
        mix_sounds(&mut output, &mut playing, &[clip], 44100, 1, 4);
        for s in &output {
            assert!(*s <= 1.0); // clamped
        }
    }

    #[test]
    fn test_non_looping_removal() {
        let clip = AudioClip::new(vec![0.5, 0.5], 44100, 1);
        let mut playing = vec![PlayingSound {
            clip_index: 0, position: 0, volume: 1.0, looping: false,
        }];
        let mut output = vec![0.0; 10]; // more frames than clip has
        mix_sounds(&mut output, &mut playing, &[clip], 44100, 1, 10);
        assert!(playing.is_empty()); // sound was removed
    }

    #[test]
    fn test_looping_continues() {
        let clip = AudioClip::new(vec![0.5, 0.5], 44100, 1);
        let mut playing = vec![PlayingSound {
            clip_index: 0, position: 0, volume: 1.0, looping: true,
        }];
        let mut output = vec![0.0; 10];
        mix_sounds(&mut output, &mut playing, &[clip], 44100, 1, 10);
        assert_eq!(playing.len(), 1); // still playing
    }

    #[test]
    fn test_mono_to_stereo() {
        let clip = AudioClip::new(vec![0.7; 4], 44100, 1);
        let mut playing = vec![PlayingSound {
            clip_index: 0, position: 0, volume: 1.0, looping: false,
        }];
        let mut output = vec![0.0; 8]; // 4 frames * 2 channels
        mix_sounds(&mut output, &mut playing, &[clip], 44100, 2, 4);
        // Both channels should have the mono sample
        for frame in 0..4 {
            assert!((output[frame * 2] - 0.7).abs() < 1e-5);
            assert!((output[frame * 2 + 1] - 0.7).abs() < 1e-5);
        }
    }
}
- [ ] Step 2: Register the mixing module in lib.rs
pub mod mixing;
pub use mixing::{PlayingSound, mix_sounds};
- [ ] Step 3: Run the tests

Run: cargo test -p voltex_audio. Expected: 13 PASS (2 clip + 5 wav + 6 mixing)

- [ ] Step 4: Commit
git add crates/voltex_audio/src/mixing.rs crates/voltex_audio/src/lib.rs
git commit -m "feat(audio): add mixing functions with volume, looping, and channel conversion"

Task 4: WASAPI FFI Bindings

Files:

  • Create: crates/voltex_audio/src/wasapi.rs
  • Modify: crates/voltex_audio/src/lib.rs

NOTE: This task is Windows FFI code and cannot be unit tested; verify compilation only.

- [ ] Step 1: Write wasapi.rs
// crates/voltex_audio/src/wasapi.rs
//! WASAPI FFI bindings for Windows audio output.
//! This module is only compiled on Windows.
#![allow(non_snake_case, non_camel_case_types, dead_code)]

use std::ffi::c_void;
use std::ptr;

// --- COM types ---
type HRESULT = i32;
type UINT = u32;
type DWORD = u32;
type WORD = u16;
type BOOL = i32;
type HANDLE = *mut c_void;
type REFERENCE_TIME = i64;

const S_OK: HRESULT = 0;
const COINIT_MULTITHREADED: DWORD = 0x0;
const CLSCTX_ALL: DWORD = 0x17;
const AUDCLNT_SHAREMODE_SHARED: u32 = 0;
const AUDCLNT_STREAMFLAGS_EVENTCALLBACK: DWORD = 0x00040000;

#[repr(C)]
struct GUID {
    data1: u32,
    data2: u16,
    data3: u16,
    data4: [u8; 8],
}

// CLSIDs and IIDs
const CLSID_MMDEVICE_ENUMERATOR: GUID = GUID {
    data1: 0xBCDE0395, data2: 0xE52F, data3: 0x467C,
    data4: [0x8E, 0x3D, 0xC4, 0x57, 0x92, 0x91, 0x69, 0x2E],
};

const IID_IMMDEVICE_ENUMERATOR: GUID = GUID {
    data1: 0xA95664D2, data2: 0x9614, data3: 0x4F35,
    data4: [0xA7, 0x46, 0xDE, 0x8D, 0xB6, 0x36, 0x17, 0xE6],
};

const IID_IAUDIO_CLIENT: GUID = GUID {
    data1: 0x1CB9AD4C, data2: 0xDBFA, data3: 0x4c32,
    data4: [0xB1, 0x78, 0xC2, 0xF5, 0x68, 0xA7, 0x03, 0xB2],
};

const IID_IAUDIO_RENDER_CLIENT: GUID = GUID {
    data1: 0xF294ACFC, data2: 0x3146, data3: 0x4483,
    data4: [0xA7, 0xBF, 0xAD, 0xDC, 0xA7, 0xC2, 0x60, 0xE2],
};

// eRender = 0, eConsole = 0
const E_RENDER: u32 = 0;
const E_CONSOLE: u32 = 0;

#[repr(C)]
pub struct WAVEFORMATEX {
    pub wFormatTag: WORD,
    pub nChannels: WORD,
    pub nSamplesPerSec: DWORD,
    pub nAvgBytesPerSec: DWORD,
    pub nBlockAlign: WORD,
    pub wBitsPerSample: WORD,
    pub cbSize: WORD,
}

const WAVE_FORMAT_IEEE_FLOAT: WORD = 0x0003;
const WAVE_FORMAT_PCM: WORD = 0x0001;

// --- ole32.dll imports ---
// Without the #[link] attribute, binaries using this crate fail to link
// against CoInitializeEx and friends.
#[link(name = "ole32")]
extern "system" {
    fn CoInitializeEx(reserved: *mut c_void, coinit: DWORD) -> HRESULT;
    fn CoUninitialize();
    fn CoCreateInstance(
        rclsid: *const GUID, outer: *mut c_void, ctx: DWORD,
        riid: *const GUID, ppv: *mut *mut c_void,
    ) -> HRESULT;
    fn CoTaskMemFree(pv: *mut c_void);
}

// IUnknown vtable
#[repr(C)]
struct IUnknownVtbl {
    QueryInterface: unsafe extern "system" fn(*mut c_void, *const GUID, *mut *mut c_void) -> HRESULT,
    AddRef: unsafe extern "system" fn(*mut c_void) -> u32,
    Release: unsafe extern "system" fn(*mut c_void) -> u32,
}

// IMMDeviceEnumerator vtable
#[repr(C)]
struct IMMDeviceEnumeratorVtbl {
    base: IUnknownVtbl,
    EnumAudioEndpoints: *const c_void,
    GetDefaultAudioEndpoint: unsafe extern "system" fn(
        *mut c_void, u32, u32, *mut *mut c_void,
    ) -> HRESULT,
    GetDevice: *const c_void,
    RegisterEndpointNotificationCallback: *const c_void,
    UnregisterEndpointNotificationCallback: *const c_void,
}

// IMMDevice vtable
#[repr(C)]
struct IMMDeviceVtbl {
    base: IUnknownVtbl,
    Activate: unsafe extern "system" fn(
        *mut c_void, *const GUID, DWORD, *mut c_void, *mut *mut c_void,
    ) -> HRESULT,
    OpenPropertyStore: *const c_void,
    GetId: *const c_void,
    GetState: *const c_void,
}

// IAudioClient vtable
#[repr(C)]
struct IAudioClientVtbl {
    base: IUnknownVtbl,
    Initialize: unsafe extern "system" fn(
        *mut c_void, u32, DWORD, REFERENCE_TIME, REFERENCE_TIME,
        *const WAVEFORMATEX, *const c_void,
    ) -> HRESULT,
    GetBufferSize: unsafe extern "system" fn(*mut c_void, *mut u32) -> HRESULT,
    GetStreamLatency: *const c_void,
    GetCurrentPadding: unsafe extern "system" fn(*mut c_void, *mut u32) -> HRESULT,
    IsFormatSupported: *const c_void,
    GetMixFormat: unsafe extern "system" fn(*mut c_void, *mut *mut WAVEFORMATEX) -> HRESULT,
    GetDevicePeriod: *const c_void,
    Start: unsafe extern "system" fn(*mut c_void) -> HRESULT,
    Stop: unsafe extern "system" fn(*mut c_void) -> HRESULT,
    Reset: unsafe extern "system" fn(*mut c_void) -> HRESULT,
    SetEventHandle: unsafe extern "system" fn(*mut c_void, HANDLE) -> HRESULT,
    GetService: unsafe extern "system" fn(*mut c_void, *const GUID, *mut *mut c_void) -> HRESULT,
}

// IAudioRenderClient vtable
#[repr(C)]
struct IAudioRenderClientVtbl {
    base: IUnknownVtbl,
    GetBuffer: unsafe extern "system" fn(*mut c_void, u32, *mut *mut u8) -> HRESULT,
    ReleaseBuffer: unsafe extern "system" fn(*mut c_void, u32, DWORD) -> HRESULT,
}

/// Wraps WASAPI COM objects for audio output.
pub struct WasapiDevice {
    client: *mut c_void,
    render_client: *mut c_void,
    buffer_size: u32,
    pub sample_rate: u32,
    pub channels: u16,
    pub bits_per_sample: u16,
    pub is_float: bool,
}

unsafe impl Send for WasapiDevice {}

impl WasapiDevice {
    /// Initialize WASAPI in shared mode with the default output device.
    pub fn new() -> Result<Self, String> {
        unsafe {
            let hr = CoInitializeEx(ptr::null_mut(), COINIT_MULTITHREADED);
            if hr != S_OK && hr != 1 { // 1 = S_FALSE (already initialized)
                return Err(format!("CoInitializeEx failed: 0x{:08X}", hr));
            }

            // Create device enumerator
            let mut enumerator: *mut c_void = ptr::null_mut();
            let hr = CoCreateInstance(
                &CLSID_MMDEVICE_ENUMERATOR, ptr::null_mut(), CLSCTX_ALL,
                &IID_IMMDEVICE_ENUMERATOR, &mut enumerator,
            );
            if hr != S_OK {
                return Err(format!("CoCreateInstance failed: 0x{:08X}", hr));
            }

            // Get default audio endpoint
            let vtbl = *(enumerator as *mut *const IMMDeviceEnumeratorVtbl);
            let mut device: *mut c_void = ptr::null_mut();
            let hr = ((*vtbl).GetDefaultAudioEndpoint)(enumerator, E_RENDER, E_CONSOLE, &mut device);
            ((*vtbl).base.Release)(enumerator);
            if hr != S_OK {
                return Err(format!("GetDefaultAudioEndpoint failed: 0x{:08X}", hr));
            }

            // Activate IAudioClient
            let vtbl = *(device as *mut *const IMMDeviceVtbl);
            let mut client: *mut c_void = ptr::null_mut();
            let hr = ((*vtbl).Activate)(device, &IID_IAUDIO_CLIENT, CLSCTX_ALL, ptr::null_mut(), &mut client);
            ((*vtbl).base.Release)(device);
            if hr != S_OK {
                return Err(format!("Activate failed: 0x{:08X}", hr));
            }

            // Get mix format
            let client_vtbl = *(client as *mut *const IAudioClientVtbl);
            let mut format_ptr: *mut WAVEFORMATEX = ptr::null_mut();
            let hr = ((*client_vtbl).GetMixFormat)(client, &mut format_ptr);
            if hr != S_OK {
                return Err(format!("GetMixFormat failed: 0x{:08X}", hr));
            }

            let fmt = &*format_ptr;
            let sample_rate = fmt.nSamplesPerSec;
            let channels = fmt.nChannels;
            let bits_per_sample = fmt.wBitsPerSample;
            // WAVE_FORMAT_EXTENSIBLE (0xFFFE) should strictly be disambiguated
            // via its SubFormat GUID; the shared-mode mix format is float32 in
            // practice, so a 32-bit extensible format is assumed to be float.
            let is_float = fmt.wFormatTag == WAVE_FORMAT_IEEE_FLOAT
                || (fmt.wFormatTag == 0xFFFE && bits_per_sample == 32);

            // Initialize client (shared mode, 50ms buffer)
            let buffer_duration: REFERENCE_TIME = 500_000; // 50ms in 100ns units
            let hr = ((*client_vtbl).Initialize)(
                client, AUDCLNT_SHAREMODE_SHARED, 0,
                buffer_duration, 0, format_ptr, ptr::null(),
            );
            CoTaskMemFree(format_ptr as *mut c_void);
            if hr != S_OK {
                return Err(format!("Initialize failed: 0x{:08X}", hr));
            }

            // Get buffer size
            let mut buffer_size: u32 = 0;
            ((*client_vtbl).GetBufferSize)(client, &mut buffer_size);

            // Get render client
            let mut render_client: *mut c_void = ptr::null_mut();
            let hr = ((*client_vtbl).GetService)(client, &IID_IAUDIO_RENDER_CLIENT, &mut render_client);
            if hr != S_OK {
                return Err(format!("GetService failed: 0x{:08X}", hr));
            }

            // Start
            let hr = ((*client_vtbl).Start)(client);
            if hr != S_OK {
                return Err(format!("Start failed: 0x{:08X}", hr));
            }

            Ok(WasapiDevice {
                client,
                render_client,
                buffer_size,
                sample_rate,
                channels,
                bits_per_sample,
                is_float,
            })
        }
    }

    /// Write f32 samples to the WASAPI buffer. Returns number of frames written.
    pub fn write_samples(&self, samples: &[f32]) -> Result<usize, String> {
        unsafe {
            let client_vtbl = *(self.client as *mut *const IAudioClientVtbl);
            let render_vtbl = *(self.render_client as *mut *const IAudioRenderClientVtbl);

            let mut padding: u32 = 0;
            ((*client_vtbl).GetCurrentPadding)(self.client, &mut padding);
            let available = self.buffer_size - padding;

            if available == 0 {
                return Ok(0);
            }

            let frames_to_write = available.min(samples.len() as u32 / self.channels as u32);
            if frames_to_write == 0 {
                return Ok(0);
            }

            let mut buffer_ptr: *mut u8 = ptr::null_mut();
            let hr = ((*render_vtbl).GetBuffer)(self.render_client, frames_to_write, &mut buffer_ptr);
            if hr != S_OK {
                return Err(format!("GetBuffer failed: 0x{:08X}", hr));
            }

            let total_samples = frames_to_write as usize * self.channels as usize;

            if self.is_float && self.bits_per_sample == 32 {
                // Write f32 directly
                let out = std::slice::from_raw_parts_mut(buffer_ptr as *mut f32, total_samples);
                for i in 0..total_samples {
                    out[i] = if i < samples.len() { samples[i] } else { 0.0 };
                }
            } else if self.bits_per_sample == 16 {
                // Convert f32 → i16
                let out = std::slice::from_raw_parts_mut(buffer_ptr as *mut i16, total_samples);
                for i in 0..total_samples {
                    let s = if i < samples.len() { samples[i] } else { 0.0 };
                    out[i] = (s.clamp(-1.0, 1.0) * 32767.0) as i16;
                }
            } else {
                // Unknown format: fill with silence rather than releasing an
                // uninitialized WASAPI buffer.
                let byte_count = total_samples * (self.bits_per_sample as usize / 8);
                ptr::write_bytes(buffer_ptr, 0, byte_count);
            }

            ((*render_vtbl).ReleaseBuffer)(self.render_client, frames_to_write, 0);
            Ok(frames_to_write as usize)
        }
    }

    pub fn buffer_frames(&self) -> u32 {
        self.buffer_size
    }
}

impl Drop for WasapiDevice {
    fn drop(&mut self) {
        unsafe {
            if !self.client.is_null() {
                let vtbl = *(self.client as *mut *const IAudioClientVtbl);
                ((*vtbl).Stop)(self.client);
                ((*vtbl).base.Release)(self.client);
            }
            if !self.render_client.is_null() {
                let vtbl = *(self.render_client as *mut *const IAudioRenderClientVtbl);
                ((*vtbl).base.Release)(self.render_client);
            }
            CoUninitialize();
        }
    }
}
- [ ] Step 2: Register the wasapi module in lib.rs (Windows-only)
#[cfg(target_os = "windows")]
pub mod wasapi;
- [ ] Step 3: Verify the build

Run: cargo build -p voltex_audio. Expected: compiles successfully (the library build only type-checks the FFI declarations; linking is deferred until a binary is built)

- [ ] Step 4: Commit
git add crates/voltex_audio/src/wasapi.rs crates/voltex_audio/src/lib.rs
git commit -m "feat(audio): add WASAPI FFI bindings for Windows audio output"

Task 5: AudioSystem + Audio Thread

Files:

  • Create: crates/voltex_audio/src/audio_system.rs

  • Modify: crates/voltex_audio/src/lib.rs

- [ ] Step 1: Write audio_system.rs

// crates/voltex_audio/src/audio_system.rs
use std::sync::mpsc::{self, Sender, Receiver};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

use crate::audio_clip::AudioClip;
use crate::mixing::{PlayingSound, mix_sounds};

#[cfg(target_os = "windows")]
use crate::wasapi::WasapiDevice;

pub enum AudioCommand {
    Play { clip_index: usize, volume: f32, looping: bool },
    Stop { clip_index: usize },
    SetVolume { clip_index: usize, volume: f32 },
    StopAll,
    Shutdown,
}

pub struct AudioSystem {
    sender: Sender<AudioCommand>,
    _thread: JoinHandle<()>,
}

impl AudioSystem {
    /// Create a new audio system. Clips are shared with the audio thread.
    #[cfg(target_os = "windows")]
    pub fn new(clips: Vec<AudioClip>) -> Result<Self, String> {
        let (sender, receiver) = mpsc::channel();
        let clips = Arc::new(clips);

        let thread_clips = Arc::clone(&clips);
        let handle = thread::spawn(move || {
            audio_thread(receiver, thread_clips);
        });

        Ok(AudioSystem {
            sender,
            _thread: handle,
        })
    }

    pub fn play(&self, clip_index: usize, volume: f32, looping: bool) {
        let _ = self.sender.send(AudioCommand::Play { clip_index, volume, looping });
    }

    pub fn stop(&self, clip_index: usize) {
        let _ = self.sender.send(AudioCommand::Stop { clip_index });
    }

    pub fn set_volume(&self, clip_index: usize, volume: f32) {
        let _ = self.sender.send(AudioCommand::SetVolume { clip_index, volume });
    }

    pub fn stop_all(&self) {
        let _ = self.sender.send(AudioCommand::StopAll);
    }
}

impl Drop for AudioSystem {
    fn drop(&mut self) {
        let _ = self.sender.send(AudioCommand::Shutdown);
        // Thread will exit when it receives Shutdown
    }
}

#[cfg(target_os = "windows")]
fn audio_thread(receiver: Receiver<AudioCommand>, clips: Arc<Vec<AudioClip>>) {
    let device = match WasapiDevice::new() {
        Ok(d) => d,
        Err(e) => {
            eprintln!("[voltex_audio] WASAPI init failed: {}", e);
            return;
        }
    };

    let mut playing: Vec<PlayingSound> = Vec::new();
    let buffer_frames = device.buffer_frames() as usize;
    let channels = device.channels;
    let sample_rate = device.sample_rate;

    let mut mix_buffer = vec![0.0f32; buffer_frames * channels as usize];

    loop {
        // Process commands (non-blocking)
        while let Ok(cmd) = receiver.try_recv() {
            match cmd {
                AudioCommand::Play { clip_index, volume, looping } => {
                    playing.push(PlayingSound {
                        clip_index, position: 0, volume, looping,
                    });
                }
                AudioCommand::Stop { clip_index } => {
                    playing.retain(|s| s.clip_index != clip_index);
                }
                AudioCommand::SetVolume { clip_index, volume } => {
                    for s in playing.iter_mut() {
                        if s.clip_index == clip_index {
                            s.volume = volume;
                        }
                    }
                }
                AudioCommand::StopAll => {
                    playing.clear();
                }
                AudioCommand::Shutdown => {
                    return;
                }
            }
        }

        // Mix and write
        let frames = buffer_frames / 2; // write half buffer at a time
        let sample_count = frames * channels as usize;
        if mix_buffer.len() < sample_count {
            mix_buffer.resize(sample_count, 0.0);
        }

        mix_sounds(
            &mut mix_buffer[..sample_count],
            &mut playing,
            &clips,
            sample_rate,
            channels,
            frames,
        );

        match device.write_samples(&mix_buffer[..sample_count]) {
            Ok(_) => {}
            Err(e) => {
                eprintln!("[voltex_audio] Write error: {}", e);
            }
        }

        thread::sleep(Duration::from_millis(5));
    }
}
- [ ] Step 2: Register the audio_system module in lib.rs
pub mod audio_system;
pub use audio_system::AudioSystem;
- [ ] Step 3: Verify the build

Run: cargo build -p voltex_audio. Expected: compiles successfully

- [ ] Step 4: Run the full test suite

Run: cargo test --workspace. Expected: all pass (existing 165 + 13 audio = 178)

- [ ] Step 5: Commit
git add crates/voltex_audio/src/audio_system.rs crates/voltex_audio/src/lib.rs
git commit -m "feat(audio): add AudioSystem with WASAPI audio thread"

Task 6: audio_demo Example

Files:

  • Create: examples/audio_demo/Cargo.toml

  • Create: examples/audio_demo/src/main.rs

  • Modify: Cargo.toml (workspace members)

- [ ] Step 1: Create Cargo.toml

# examples/audio_demo/Cargo.toml
[package]
name = "audio_demo"
version = "0.1.0"
edition = "2021"

[dependencies]
voltex_audio.workspace = true
- [ ] Step 2: main.rs — generate and play sine waves
// examples/audio_demo/src/main.rs
use voltex_audio::{AudioClip, AudioSystem};

fn generate_sine_clip(freq: f32, duration: f32, sample_rate: u32) -> AudioClip {
    let num_samples = (sample_rate as f32 * duration) as usize;
    let mut samples = Vec::with_capacity(num_samples);
    for i in 0..num_samples {
        let t = i as f32 / sample_rate as f32;
        samples.push((t * freq * 2.0 * std::f32::consts::PI).sin() * 0.3);
    }
    AudioClip::new(samples, sample_rate, 1)
}

fn main() {
    println!("=== Voltex Audio Demo ===");
    println!("Generating 440Hz sine wave (2 seconds)...");

    let clip = generate_sine_clip(440.0, 2.0, 44100);
    let clip2 = generate_sine_clip(660.0, 1.5, 44100);

    println!("Initializing audio system...");
    let audio = match AudioSystem::new(vec![clip, clip2]) {
        Ok(a) => a,
        Err(e) => {
            eprintln!("Failed to init audio: {}", e);
            return;
        }
    };

    println!("Playing 440Hz tone...");
    audio.play(0, 0.5, false);
    std::thread::sleep(std::time::Duration::from_secs(1));

    println!("Playing 660Hz tone on top...");
    audio.play(1, 0.3, false);
    std::thread::sleep(std::time::Duration::from_secs(2));

    println!("Done!");
}
- [ ] Step 3: Add the example to the workspace

Add "examples/audio_demo" to the members list in Cargo.toml.

- [ ] Step 4: Verify the build

Run: cargo build --bin audio_demo. Expected: builds successfully

- [ ] Step 5: Commit
git add examples/audio_demo/ Cargo.toml
git commit -m "feat(audio): add audio_demo example with sine wave playback"

Task 7: Documentation Updates

Files:

  • Modify: docs/STATUS.md

  • Modify: docs/DEFERRED.md

- [ ] Step 1: Add Phase 6-1 to STATUS.md

Below the Phase 5-3 section:

### Phase 6-1: Audio System Foundation
- voltex_audio: WAV parser (PCM 16-bit, mono/stereo)
- voltex_audio: AudioClip (f32 samples), mixing (volume, looping, channel conversion)
- voltex_audio: WASAPI backend (Windows, shared mode, COM FFI)
- voltex_audio: AudioSystem (channel-based audio thread, play/stop/volume)
- examples/audio_demo (sine wave playback)

Add voltex_audio to the crate structure overview. Update the test counts. Bump the example count to 10.

- [ ] Step 2: Add the Phase 6-1 deferred items to DEFERRED.md
## Phase 6-1

- **macOS/Linux backends** — only WASAPI (Windows) is implemented; CoreAudio and ALSA are not.
- **OGG/Vorbis decoder** — only WAV PCM 16-bit is supported.
- **24-bit/32-bit WAV** — only 16-bit is parsed.
- **ECS integration** — no AudioSource component yet; AudioSystem is called directly.
- **Async loading** — synchronous loading only.
- [ ] Step 3: Commit
git add docs/STATUS.md docs/DEFERRED.md
git commit -m "docs: add Phase 6-1 audio system status and deferred items"