Robust Sound Field Reproduction against Listener’s Movement Utilizing Image Sensor

Published on March 10, 2014

Author: NAIST_IS

Source: slideshare.net

Robust Sound Field Reproduction against Listener’s Movement Utilizing Image Sensor
Toshihide Aketo, Hiroshi Saruwatari, Satoshi Nakamura (Nara Institute of Science and Technology, Japan)

Outline
Research background
Conventional method: Spectral Division Method; local sound field synthesis
Proposed method: equiangular filter; sound field reproduction system utilizing image sensor
Simulation experiment
Subjective assessment: on directional perception; on sound quality

Research background (1/3)
Objective of a sound field reproduction (SFR) system: to reproduce the primary sound field in another space over a wide range and with high accuracy. However, such a system is difficult to realize because it becomes large and its configuration becomes complex. Recent research therefore focuses on reproducing the sound field over a wide range and with high accuracy using a small and simple system.
Figure: array configurations ordered from complex to simple: surrounded (large and complex), circular or spherical (a little complex), linear or planar (simple). Example methods shown: boundary surface control (BoSC), Ambisonics, stereo or surround systems, wave field synthesis (WFS). A marker labeled "Focused" appears on the figure.

Research background (2/3)
Spectral Division Method (SDM) [J. Ahrens, S. Spors, 2008]: one of the SFR methods, which reproduces the sound field by synthesizing a number of wavefronts. It can be realized with a simple system such as a linear loudspeaker array. However, SDM has two problems.
Problem 1: a sound pressure error occurs when the listening position does not match the reference listening line.
Problem 2: the wavefront is disturbed by spatial aliasing.
Figure: SDM has a wide reproduction region but low reproduction accuracy; the target is high reproduction accuracy.
We aim to reproduce the sound field with high accuracy by solving these problems in SDM.

Research background (3/3)
To cope with these problems, we propose a novel SFR system with a linear loudspeaker array, which combines estimation of the listener's position by Kinect with SDM and local sound field synthesis.
Figure: by adding the image sensor (Kinect) and local sound field synthesis, a system with low reproduction accuracy and a wide reproduction region becomes one with high reproduction accuracy and a reproduction region localized around the listener.

Spectral Division Method (SDM) [J. Ahrens, S. Spors, 2008]
Figure: geometry of the primary source, the nth secondary source and the reference listening line.
The driving function is obtained in the wavenumber domain (reached from the spatial domain by a Fourier transform) and converted to the driving function in the spatial domain by an IDFT.
Quantities used in the equations: the angular frequency, the wavenumber in the x-direction, the speed of sound, the imaginary unit, the reference listening distance, the zero-th order modified Bessel function of the second kind, and the zero-th order Hankel function of the second kind.
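
The equations on this slide are not preserved in the text. For orientation only, the general form of SDM as it is commonly written in the literature can be sketched as below. The symbols (D for the driving function, S for the desired sound field, G for the sound field of one secondary source, y_ref for the reference listening distance, tildes for wavenumber-domain quantities) are notation assumed here, not taken from the slide, and the sign of the exponent depends on the Fourier transform convention.

```latex
\tilde{D}(k_x, \omega)
  = \frac{\tilde{S}(k_x, y_{\mathrm{ref}}, \omega)}
         {\tilde{G}(k_x, y_{\mathrm{ref}}, \omega)},
\qquad
D(x, \omega)
  = \frac{1}{2\pi} \int_{-\infty}^{\infty}
    \tilde{D}(k_x, \omega)\, e^{-j k_x x}\, \mathrm{d}k_x
```

For monopole sources, the spectra in the numerator and denominator involve the Hankel function of the second kind for |k_x| < ω/c and the modified Bessel function of the second kind for |k_x| > ω/c, which is consistent with the quantities listed on the slide. Because the division is carried out at y = y_ref, the result is tied to that one distance.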

Spectral Division Method (SDM) [J. Ahrens, S. Spors, 2008]: problems in SDM
A sound pressure error occurs when the listening position does not match the reference listening line.
The wavefront is disturbed by spatial aliasing.

Problem 1: sound pressure error
Under the 2.5-dimensional synthesis condition, the sound pressure is correctly reproduced only on the reference listening line; a sound pressure error occurs outside the reference listening line.
Figure: primary sound field and reproduced sound field.
Therefore, to correctly reproduce the sound field at the listener's position, we must set the reference listening distance equal to the listener's distance.

Problem 2: spatial aliasing (1/2)
In SDM, discretization of the secondary source causes spectral overlap in the driving function, and the filter power at high frequencies becomes larger, as in the right figure.
Figure: magnitude [dB] of the driving function spectrum before (left) and after (right) discretization of the secondary source; spectral overlap occurs in the right panel.
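
The spectral overlap described on this slide follows from standard sampling theory: sampling the secondary source at spacing dx repeats its spectrum along k_x with period 2π/dx. A minimal sketch of the resulting rule of thumb is given below. The function names, the speed of sound of 343 m/s and the spacing of 0.085 m are assumptions for illustration; the slides do not state the loudspeaker spacing.

```python
import math


def spectral_repetition_period(dx: float) -> float:
    """Period (rad/m) with which the sampled spectrum repeats along k_x."""
    return 2.0 * math.pi / dx


def aliasing_frequency(dx: float, c: float = 343.0) -> float:
    """Frequency (Hz) above which the propagating parts (|k_x| <= omega/c)
    of adjacent spectral repetitions start to overlap: omega/c = pi/dx."""
    return c / (2.0 * dx)


# Hypothetical spacing in metres; not a value taken from the slides.
dx = 0.085
print(f"repetition period: {spectral_repetition_period(dx):.1f} rad/m")
print(f"aliasing frequency: {aliasing_frequency(dx):.0f} Hz")
```

With spacings of this order the rule gives a value near 2 kHz, the same order as the aliasing frequency quoted later in the experimental conditions.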

Problem 2: spatial aliasing (2/2)
The effect of spectral overlap in the wavenumber domain appears as spatial aliasing in the spatial domain.
Figure: synthesized wavefront (amplitude) for a continuous array and for a discrete array; discretization of the secondary source disturbs the wavefront.

Local sound field synthesis (1/2) [J. Ahrens, S. Spors, 2011]
Local sound field synthesis suppresses spatial aliasing by limiting the spatial bandwidth in the wavenumber domain. Applying a rectangular window to the spectrum of the driving function in the left figure suppresses the spectral overlap, as in the right figure.
Figure: magnitude [dB] of the driving function spectrum without the window (left, spectral overlap occurs) and with the rectangular window (right, spectral overlap is suppressed).
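
The band-limiting step described on this slide can be sketched as a simple mask over the wavenumber axis. This is a minimal illustration, not the authors' implementation: the function name, the grid, the window center and width, and the all-ones placeholder spectrum are assumptions.

```python
import numpy as np


def rectangular_window(kx, kx_center, kx_width):
    """Return 1.0 where |kx - kx_center| <= kx_width / 2 and 0.0 elsewhere."""
    kx = np.asarray(kx, dtype=float)
    return (np.abs(kx - kx_center) <= kx_width / 2.0).astype(float)


# Wavenumber grid in rad/m with 1 rad/m steps (hypothetical values).
kx = np.linspace(-30.0, 30.0, 61)

# Placeholder driving-function spectrum at one frequency; a real spectrum
# would come from the SDM driving function.
spectrum = np.ones_like(kx, dtype=complex)

# Keep only a band of width 20 rad/m centered on kx = 0.
windowed = spectrum * rectangular_window(kx, kx_center=0.0, kx_width=20.0)
```

Moving kx_center steers the retained band, which is why the choice of the window determines the direction in which the sound field is reproduced.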

Local sound field synthesis (2/2) [J. Ahrens, S. Spors, 2011]
Applying a rectangular window suppresses the disturbance of the wavefront and raises the maximum frequency at which the sound field can be correctly reproduced. In exchange, the reproduction area is localized.
Figure: synthesized wavefront (amplitude), unfiltered (spatial aliasing occurs) and filtered (disturbance of the wavefront is suppressed).
Therefore, to take advantage of this method, it is necessary to design a filter that precisely controls the reproduced direction.

Equiangular filter
To design a filter that accurately controls the reproduced direction, we derive the relation between the reproduced direction, the wavenumber in the x-direction and the frequency; the speed of sound enters this relation as a constant.
When the reproduced direction is constant, the wavenumber in the x-direction is found to be proportional to the frequency. Based on this, we design a new filter, the equiangular filter, which is defined in terms of the angular frequency, the wavenumber and the angular width.
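
The filter definition itself is not preserved in the text, so the following is only one plausible reading of the idea, under stated assumptions: a plane wave travelling at angle theta from the array normal has k_x = (2πf/c) sin(theta), so a fixed angular range maps to a wedge in the (k_x, f) plane whose width grows in proportion to frequency. The function name, argument names and the pass-band rule are hypothetical.

```python
import numpy as np


def equiangular_mask(kx, f, theta0, width, c=343.0):
    """Pass (f, kx) bins whose direction lies within theta0 +/- width / 2.

    Assumes kx = (2 * pi * f / c) * sin(theta), with theta in radians
    measured from the array normal, and theta0 +/- width / 2 inside
    [-pi/2, pi/2] so that sin() is monotonic over the pass band.
    kx has shape (K,) in rad/m, f has shape (F,) in Hz.
    Returns a float array of shape (F, K) holding 1.0 (pass) or 0.0 (stop).
    """
    kx = np.asarray(kx, dtype=float)[np.newaxis, :]
    k = 2.0 * np.pi * np.asarray(f, dtype=float)[:, np.newaxis] / c
    lower = k * np.sin(theta0 - width / 2.0)
    upper = k * np.sin(theta0 + width / 2.0)
    return ((kx >= lower) & (kx <= upper)).astype(float)


# Hypothetical settings: 20 degree direction, 20 degree angular width.
f = np.array([1000.0, 2000.0])
kx = np.linspace(-40.0, 40.0, 161)  # 0.5 rad/m steps
mask = equiangular_mask(kx, f, theta0=np.deg2rad(20.0), width=np.deg2rad(20.0))
```

In contrast to a rectangular window of fixed width, the pass band here scales with frequency, so the same angular sector is kept at every frequency.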

Result of applying the equiangular filter (1/2)
An example of applying the designed filter to a spectrum, for one setting of the angular width and the reproduced direction.
Figure: magnitude [dB] of the driving function spectrum without the filter (spectral overlap occurs) and with the equiangular filter (spectral overlap is suppressed).
The equiangular filter used in this presentation is combined with a low-pass filter: components above the maximum frequency are cut, and the sound field is not reproduced at those frequencies.

Result of applying the equiangular filter (2/2)
Applying the equiangular filter suppresses the disturbance of the wavefront and reproduces the sound field toward a specific direction.
Figure: synthesized wavefront (amplitude), unfiltered (spatial aliasing occurs) and filtered (disturbance of the wavefront is suppressed).
However, the sweet spot cannot be matched to the listener’s position if the listener’s direction is not known in advance.

Summary of problems
Problems in SDM:
A sound pressure error occurs when the reference listening distance does not match the listener's distance.
Spatial aliasing is caused by discretization of the secondary sources. This second problem can be solved by applying an equiangular filter.
Problem in the equiangular filter:
The sweet spot cannot be matched to the listener’s position if the listener’s direction is not known in advance.
These problems can be solved if the listener’s position is known; introducing the image sensor therefore solves them.

Condition of simulation experiment
Figure: primary source (monopole source), 34 ch linear secondary source array (monopole sources) and reference listening line.
Parameter name: parameter value
measurement plane: W4.0 D4.0
aliasing frequency: approximately 2019 Hz
synthesis frequency: 3, 5 kHz
angular width, reproduced direction: values not preserved
Evaluation score: defined from the radiation characteristic of the primary sound field and the radiation characteristic of the secondary sound field.
The listener’s position is assumed to be obtained by the image sensor, and the reproduced direction is calculated from the sound source position and the listener's position.
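
The last step, calculating the reproduced direction from the source position and the listener's position, can be sketched as follows. The coordinate convention is an assumption, not taken from the slides: the array lies along the x-axis at y = 0, the primary source sits behind it (y < 0), the listener is in front of it (y > 0), and the angle is measured from the array normal. The function name and the positions are hypothetical.

```python
import math


def reproduced_direction(source_xy, listener_xy):
    """Angle in radians of the source-to-listener direction, measured from
    the +y axis (array normal) and positive toward +x."""
    dx = listener_xy[0] - source_xy[0]
    dy = listener_xy[1] - source_xy[1]
    return math.atan2(dx, dy)


# Hypothetical positions in metres: source 1 m behind the array, listener
# 1 m in front of it and 1 m to the right.
theta = reproduced_direction(source_xy=(0.0, -1.0), listener_xy=(1.0, 1.0))
print(f"reproduced direction: {math.degrees(theta):.1f} degrees")
```

In a running system the listener position would be updated from the image sensor, and the angle recomputed whenever the listener moves.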

Results of simulation experiment
Figure: synthesized wavefronts (amplitude) at 3 kHz and 5 kHz and the corresponding evaluated values, with the positions of the listener and the primary source marked.
The sound field is correctly reproduced in the listener’s direction regardless of the frequency.

Condition of subjective assessment on directional perception
Figure: experimental setup showing the 34 ch linear loudspeaker array, the acoustically transparent curtain, the primary source, the answer number cards, the loudspeaker distance, the reference listening line and the listener positions Pos 1, Pos 2 and Pos 3.
Parameter name: parameter value
sampling frequency: 48 kHz
quantization bit rate: 16 bit
test sound: white Gaussian noise, 3 seconds long
aliasing frequency: approximately 2019 Hz
number of evaluators: 7
angular width, sound source direction: values not preserved
type of sound source:
・sound source without bandwidth limitation (Conventional1)
・sound source band-limited to frequencies below 2 kHz (Conventional2)
・sound source to which the equiangular filter is applied (Proposed)
Evaluation score: defined from the number of evaluators, the answered direction and the true source direction.
As the evaluation procedure, we asked evaluators to answer at which card position they perceived the sound source.

Results of subjective assessment on directional perception
Figure: evaluation scores on an axis from Good to Bad for (a) Pos1, (b) Pos2 and (c) Pos3, comparing Conventional1 (without bandwidth limitation), Conventional2 (band-limited to frequencies below 2 kHz) and Proposed (with the equiangular filter applied).
Proposed is superior to Conventional1 and Conventional2 in Pos1 and Pos2. However, Proposed is almost the same as Conventional2 in Pos3. This is because, with the equiangular filter, the maximum frequency becomes lower as the angle of the reproduced direction becomes larger.
As the user moves to the right (from Pos1 to Pos3), the directional perception error of Conventional1 becomes larger owing to the effect of spatial aliasing.
These results show the superiority of the proposed method in directional perception.

Condition of subjective assessment on sound quality
Figure: experimental setup showing the 34 ch linear loudspeaker array, the acoustically transparent curtain, the primary source, the reference loudspeaker, the loudspeaker distance, the reference listening line and the listener positions Pos 1, Pos 2 and Pos 3.
Parameter name: parameter value
sampling frequency: 48 kHz
quantization bit rate: 16 bit
test sound: white Gaussian noise, 3 seconds long
aliasing frequency: approximately 2019 Hz
number of evaluators: 7
angular width, sound source direction: values not preserved
type of sound source:
・sound source without bandwidth limitation (Conventional1)
・sound source band-limited to frequencies below 2 kHz (Conventional2)
・sound source to which the equiangular filter is applied (Proposed)
As the evaluation procedure, we played two synthesized sounds after the reference sound radiated by the reference loudspeaker, and asked evaluators to answer which synthesized sound they perceived as closer to the reference sound.

Results of subjective assessment on sound quality
Figure: results on an axis from Good to Bad for (a) Pos1, (b) Pos2 and (c) Pos3, comparing Conventional1 (without bandwidth limitation), Conventional2 (band-limited to frequencies below 2 kHz) and Proposed (with the equiangular filter applied).
In all results, evaluators chose Conventional1 or Proposed, and did not choose Conventional2. At every listener position, more evaluators chose Conventional1 than Proposed.
This suggests that cutting the high-frequency region of the sound affects sound quality more than spatial aliasing does.

Conclusion
The objective of an SFR system is to reproduce the primary sound field in another space over as wide a range and with as high accuracy as possible.
Since a complex system is difficult to realize, an SFR method that uses a simple system has been desired.
SDM can be realized with a simple system such as a linear loudspeaker array. However, it is impossible to reproduce the sound field with high accuracy using this method.
We proposed an SFR system that reproduces the sound field with high accuracy at the listener's position by estimating the listener's direction.
The subjective assessment showed the superiority of the proposed method in directional perception.
However, since superiority in sound quality could not be shown, it is necessary to improve the equiangular filter so that the low-pass filter need not be applied.
Thank you for your attention!
