The Papers and day two Keynote schedules have not been finalized. As soon as they are, they will be posted on this page. Please check back soon for the full convention program.
Tuesday, October 24
 

7:00pm EDT

The Dolby Atmos Music Tech Tour Experience
Join Dolby for the Dolby Atmos music experience at Dolby’s NY 88 screening room, featuring Dolby Atmos music in cinema, on smart speakers, and in the car, plus discussion and Q&A.


Tuesday October 24, 2023 7:00pm - 8:30pm EDT
OFFSITE
 
Wednesday, October 25
 

9:00am EDT

(Lecture) Reproducing Virtual Acoustic Environments in the Recording Studio: Part 1
The second generation of the Virtual Acoustic Technology Laboratory (VATLab) at McGill University features a new real-time auralizer with a feedback canceller developed by CCRMA at Stanford University, allowing for the creation of a virtual acoustic environment with exceptionally high gain. As the first part of a two-paper study, we employ a one-microphone-to-one-speaker configuration method to capture impulse responses (IRs) from two of McGill University’s performance halls. The halls have been “recreated” acoustically with the VAT system in the studio, and an acoustic analysis is performed on the IRs captured from the halls and on the auralized environment.
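For readers interested in the analysis step, room-acoustic evaluation of captured IRs commonly starts from the Schroeder backward integral; a minimal Python sketch of RT60 estimation from an IR (illustrative only, not the authors' pipeline, and it assumes the IR decays by at least 25 dB):

```python
import numpy as np

def schroeder_decay_db(ir):
    """Backward-integrated energy decay curve (Schroeder integration), in dB."""
    energy = np.asarray(ir, dtype=float) ** 2
    edc = np.cumsum(energy[::-1])[::-1]            # integrate energy from the tail
    return 10.0 * np.log10(edc / edc[0] + 1e-12)

def rt60_from_ir(ir, fs, lo=-5.0, hi=-25.0):
    """Fit the decay slope between lo and hi dB and extrapolate to -60 dB (T20)."""
    edc_db = schroeder_decay_db(ir)
    i0, i1 = np.argmax(edc_db <= lo), np.argmax(edc_db <= hi)
    t = np.arange(len(edc_db)) / fs
    slope, _ = np.polyfit(t[i0:i1], edc_db[i0:i1], 1)   # dB per second
    return -60.0 / slope
```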

Speakers

Aybar Aydin

McGill University

Gianluca Grazioli

McGill University

Richard King

McGill University
Richard King is an educator, researcher, and Grammy Award-winning recording engineer. Richard has garnered Grammy Awards in various fields including Best Engineered Album in both the Classical and Non-Classical categories. Richard is an Associate Professor at the Schulich School...

Wieslaw Woszczyk

McGill University


Wednesday October 25, 2023 9:00am - 9:15am EDT
1E07

9:00am EDT

(Lecture) Comprehensive Objective Analysis of Digital to Analog Conversion in Consumer and Professional Applications
A six-year project measuring the performance of over 400 digital-to-analog converters brings significant insight into the range of performance and prices in this audio category. It shows that little correlation exists between price and objective performance, and that state-of-the-art performance, achieving full audible transparency, has been reached mostly by lesser-known companies. Challenges with measurement equipment are presented, as well as the need for better publishing standards for audio measurements.
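For context on what "objective performance" typically means here, metrics such as THD+N are computed from a recorded sine; a simplified sketch (illustrative, not the presenter's measurement chain):

```python
import numpy as np

def thd_n_db(x, fs, f0, notch_halfwidth=3):
    """THD+N: everything except the fundamental (and DC), relative to the fundamental."""
    spec = np.abs(np.fft.rfft(x * np.hanning(len(x)))) ** 2
    k0 = int(round(f0 * len(x) / fs))                    # fundamental bin
    lo, hi = max(k0 - notch_halfwidth, 1), k0 + notch_halfwidth + 1
    fundamental = spec[lo:hi].sum()
    residual = spec[1:].sum() - fundamental              # skip the DC bin
    return 10.0 * np.log10(residual / fundamental)
```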


Wednesday October 25, 2023 9:00am - 9:15am EDT
1E09

9:00am EDT

Scientific milestones in the evolution of modern recording control room design
As critical listening rooms have evolved from monophonic, stereo, and surround to the present fully immersive designs, many approaches have been taken in the design of recording control rooms. The neutrality and transferability of these rooms is essential, because it is well understood that if you can’t take the room out of the mix, you can’t take the mix out of the room. This presentation will review significant milestones in this evolution over the past four decades of my career in acoustics. It will include the importance of measurement technology, beginning with time delay spectrometry; novel acoustical material product development and measurement standards, including reflection phase grating diffusers and dedicated low-frequency modal absorbers; as well as the creation of the reflection-free-zone/diffuse-field-zone control room design, which is currently adopted as a de facto standard in many current projects. In addition, the role of design software will be discussed, from the original cuboid image-model room optimization programs to a current full-bandwidth hybrid program, which combines a wave-based FEM modal model and a geometrical acoustic model above the Schroeder frequency to simultaneously optimize the geometrical design of any shaped room, the location of speakers and listeners, and the acoustical treatment. The result is a full-bandwidth impulse response which can be used to auralize the room. Many examples of milestone projects will be presented, accompanied by a discussion of my scientific collaborations with the recording studio designers associated with each project.


Speakers

Peter D'Antonio

RPG Acoustical Systems
Dr. Peter D’Antonio is a pioneering sound diffusion expert. He received his B.S. from St. John’s University in 1963 and his Ph.D. from the Polytechnic Institute of Brooklyn, in 1967. During his scientific career as a diffraction physicist, Dr. D’Antonio was a member of the Laboratory...


Wednesday October 25, 2023 9:00am - 10:00am EDT
1E08

9:00am EDT

[Panel] Standardizing HRTF Datasets
Thanks to the development and wide adoption of the SOFA file format, researchers and developers can access multiple published repositories of HRTF data. However, due to a lack of standardization for the capture, post-processing and publication of datasets, publicly available databases cannot be readily aggregated into extensive corpora. Areas of discrepancy include spatial distribution, measurement distances, temporal alignment, spectral equalization, and extrapolation at low frequencies (where acoustic measurements are often not reliable). In this session, panelists will share their experience and perspectives on the collection and processing of measured HRTF data, their application to the realization of binaural rendering systems, and their use in machine learning and analytics for HRTF personalization and individualization. An open initiative is considered to facilitate interoperability and data aggregation across existing and future HRTF databases.
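As background, SOFA (AES69) files are netCDF-4 containers, which are HDF5 underneath, so they can be opened with generic tooling; a minimal sketch using the standard Data.IR and SourcePosition variables (the filename is hypothetical):

```python
import numpy as np
import h5py  # SOFA (AES69) is netCDF-4, which is HDF5-based

with h5py.File("subject_001_hrir.sofa", "r") as f:        # hypothetical file
    hrirs = np.asarray(f["Data.IR"])             # (M measurements, R ears, N samples)
    positions = np.asarray(f["SourcePosition"])  # (M, 3): azimuth, elevation, distance
    fs = float(np.asarray(f["Data.SamplingRate"]).ravel()[0])

def nearest_hrir(az_deg, el_deg):
    """Crude nearest-neighbour lookup of a measured direction (for illustration)."""
    d = (positions[:, 0] - az_deg) ** 2 + (positions[:, 1] - el_deg) ** 2
    return hrirs[np.argmin(d)]                   # left/right HRIR pair, shape (2, N)
```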

Speakers

Jean-Marc Jot

Principal, Virtuel Works
Virtuel Works provides audio technology strategy, IP creation and licensing services to help accelerate the development of audio and voice spatial computing and interoperability standards that will power our future ubiquitous copresence, remote collaboration or immersive music and...

Helene Bahu

Virtuel Works

Peter Otto

Virtuel Works


Wednesday October 25, 2023 9:00am - 10:00am EDT
1E11

9:00am EDT

Integrating Dynamics Processing into a Parametric EQ Filter
In the same manner that a graphic equalizer can be converted into a dynamic EQ, a parametric filter can serve as the basis of a dynamics processor. Just as the parametric equalizer allows an engineer to focus on the acoustic conditions affecting an audio source, integrating dynamics can yield a processor more applicable to issues encountered in audio production, as has been done in multiple specialized processors. A general dynamic parametric EQ can be applied to most, if not all, acoustic anomalies encountered in the recording studio. As most of these anomalies occur at outlying levels, threshold dynamics can control the anomalies while leaving normal levels untouched. We look at the architecture of a general dynamic parametric EQ and apply it to issues of instrument acoustics, room acoustics, and microphone-setup acoustics in recorded tracks.
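A minimal sketch of the core idea: a peaking (bell) filter whose cut is engaged only while the band around f0 exceeds a threshold. Coefficients follow the RBJ audio-EQ cookbook, and the per-block, unsmoothed gain is a simplification (this is not the DynPEQ implementation):

```python
import numpy as np
from scipy.signal import iirpeak, lfilter

def peaking_coeffs(fs, f0, q, gain_db):
    """RBJ audio-EQ-cookbook peaking (bell) filter coefficients."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    b = np.array([1 + alpha * A, -2 * np.cos(w0), 1 - alpha * A])
    a = np.array([1 + alpha / A, -2 * np.cos(w0), 1 - alpha / A])
    return b / a[0], a / a[0]

def dynamic_peq(x, fs, f0=250.0, q=4.0, threshold_db=-20.0, max_cut_db=-12.0, block=512):
    """Cut a bell at f0 only while that band exceeds the threshold (block-wise sketch)."""
    b_det, a_det = iirpeak(f0 / (fs / 2), q)            # band detector around f0
    y = np.empty(len(x))
    for i in range(0, len(x), block):
        seg = np.asarray(x[i:i + block], dtype=float)
        band = lfilter(b_det, a_det, seg)
        level_db = 20 * np.log10(np.sqrt(np.mean(band ** 2)) + 1e-9)
        over = max(level_db - threshold_db, 0.0)        # dB above threshold in the band
        b, a = peaking_coeffs(fs, f0, q, max(-over, max_cut_db))
        y[i:i + block] = lfilter(b, a, seg)             # real designs smooth per sample
    return y
```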

Speakers

Duane Wise

Founder and President, Wholegrain Digital Systems LLC
Developer of the DynPEQ algorithm for dynamic parametric equalization. The DynPEQ design allows for detailed and robust control of the audio spectrum, earning it the designation 2021 Best In Market from Mix Magazine / Pro Sound News.

George Massenburg

Associate Professor of Sound Recording, Massenburg Design Works
George Y. Massenburg is a Grammy award-winning recording engineer and inventor. Working principally in Baltimore, Los Angeles, Nashville, and Macon, Georgia, Massenburg is widely known for submitting a paper to the Audio Engineering Society in 1972 regarding the parametric equali...

Thomas Lund

Senior Technologist, Genelec Oy
Thomas Lund is senior scientist at Genelec, doing subjective research and listening tests. He has written papers on perception, spatialisation, loudness, sound exposure and true-peak level. Thomas is convenor of a working group under the European Commission, tasked with the prevention...


Wednesday October 25, 2023 9:00am - 10:00am EDT
1E10

9:00am EDT

Translating immersive microphone techniques from classical music to pop productions
We set out to explore how immersive microphone techniques from classical music could translate to modern pop productions. The 2L microphone array is designed to capture classical, jazz, and folk music in large acoustic venues. During three days at Cederberg Studios we experimented with adapting the microphone technique to a traditional studio environment and explored its implications for the workflow of engineers, producers, composers, and performers. We will play the immersive result and discuss the process.

Speakers

Morten Lindberg

Producer & Engineer, 2L (Lindberg Lyd AS)
Recording Producer and Balance Engineer with 42 American GRAMMY-nominations since 2006, 34 of these in craft categories Best Engineered Album, Best Surround Sound Album, Best Immersive Audio Album and Producer of the Year. Founder and CEO of the record label 2L. Grammy Award-winner...

Christer-André Cederberg

Cederberg Studios


Wednesday October 25, 2023 9:00am - 10:30am EDT
1E06

9:15am EDT

(Lecture) Reproducing Virtual Acoustic Environments in the Recording Studio: Part 2
Following objective analysis on the feasibility of representing a virtual acoustic environment in an existing physical space in Part 1 of this paper series, this pilot study asks what signal attributes musicians use to interact with acoustic environments, and whether these attributes are important for identifying and interacting with virtual environments as well. Five guitarists are asked to complete a performance task in a concert hall and answer questions regarding their experience with the room acoustics. This process is then repeated in a virtual acoustic representation of the same space and the musicians’ responses are evaluated. Their statements will help researchers determine how musicians interact with a space with virtual and physical acoustic characteristics.

Speakers

Aybar Aydin

McGill University

Richard King

McGill University
Richard King is an educator, researcher, and Grammy Award-winning recording engineer. Richard has garnered Grammy Awards in various fields including Best Engineered Album in both the Classical and Non-Classical categories. Richard is an Associate Professor at the Schulich School...

Wieslaw Woszczyk

McGill University


Wednesday October 25, 2023 9:15am - 9:30am EDT
1E07

9:15am EDT

(Lecture) Exploiting 55 nm Silicon Process To Improve Analog-to-Digital Converter Performance, Functionality and Power Consumption
High-performance audio analog-to-digital converter integrated circuits are foundational components of the audio signal path, converting analog signals from microphones to digital audio data for processing and recording. Use of a more-advanced silicon technology node than previous generations of analog-to-digital converters enables fundamental improvements in performance, functionality and power consumption, significantly advancing the state of the art.


Wednesday October 25, 2023 9:15am - 9:30am EDT
1E09

9:30am EDT

(Lecture) The Optimization of Microphone Techniques for Capturing Virtual Acoustic Environments
Loudspeaker-based virtual acoustics allow multiple performers to interact with simulated environments that have been recreated within existing physical spaces. When creating these simulations, steps are taken to minimize the impact of the physical room structure on the virtual room simulation. As a result, capturing these spaces in recordings using traditional techniques can lead to unsatisfactory results that do not accurately represent the experience of being present in the room. By contrast, this paper presents a preliminary approach to capturing virtual environments by working with, rather than against, the physical environment. Using a large number of spaced capture points, a methodology for realistically capturing a virtual acoustic environment is developed for playback on immersive media systems. The results of preliminary recordings have shown versatility in capturing the virtual room response in a way that is both musical and realistic.

Speakers

Vlad Baran

McGill University

Kathleen Ying-Ying Zhang

McGill University
Ying-Ying Zhang is a music technology researcher and sound engineer. She is the first woman to be admitted to the Sound Recording PhD program at McGill University, where she is projected to graduate in 2024. She has experience in both on-set and post-production film sound, including...

Aybar Aydin

McGill University

Richard King

McGill University
Richard King is an educator, researcher, and Grammy Award-winning recording engineer. Richard has garnered Grammy Awards in various fields including Best Engineered Album in both the Classical and Non-Classical categories. Richard is an Associate Professor at the Schulich School...

Wieslaw Woszczyk

McGill University


Wednesday October 25, 2023 9:30am - 9:45am EDT
1E07

9:30am EDT

(Lecture) Analysis of the Dinaburg C2S™ Alignment through CAE Simulation & Prototype
The Concentric Coplanar Stabilizer (C2S™) design theory based on Mikail Dinaburg’s patent is analyzed using COMSOL Multiphysics® modeling as well as comparison to prototypes built from designs optimized through the simulations. A brief description of the design theory will be presented. The building of the simulation model will be illustrated and described, including best practices. The results will be presented and the optimization of the design will be shown. The results for a full audio-band solution will be compared directly to a similar passive radiator design. The unique acoustic phase behavior in the interior of the design, based on the Dinaburg design theory, will be illustrated. The direct relationship of the design’s simulated performance to the measured sound quality and listening tests will be shown and described. The results show performance improvements when compared to a typical passive radiator design. Possible applications of the design theory will briefly be listed.


Wednesday October 25, 2023 9:30am - 9:45am EDT
1E09

9:45am EDT

(Lecture) Synthetic dataset generation with cloud-based hybrid wave-based/geometrical acoustics simulation for improved machine learning in audio signal processing algorithms
The growing usage of machine learning and artificial intelligence within audio signal processing underscores the significance of high-quality audio datasets for advancing audio algorithms such as speech enhancement, echo cancellation, de-reverberation, and blind room estimation. However, the conventional approaches to collecting such data present various limitations. Measurement-based approaches are costly and time-consuming, and synthetic data generation using standard acoustics simulation methodology has been shown to generalize poorly to real-world scenarios, due to limitations in capturing the intricacies of real-world room acoustics.

In this talk, we present a framework that offers a solution to the challenges associated with dataset creation, enabling the efficient production of extensive datasets that closely mimic real-world audio scenes, thereby enhancing the efficacy of machine learning models. Through the lens of a specific use-case illustration, we highlight the integration of a hybrid wave-based / geometrical acoustics simulation for dataset generation. Notably, our focus extends to accurate device modeling—a critical aspect for the development of multiple-microphone devices and subsequent refinement of machine learning algorithms. We illustrate how the dataset accuracy surpasses the standard limitations of geometrical acoustics simulations. We present analysis of the computational performance of the system and we demonstrate examples of improved machine learning performance using the data.
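By way of contrast, the kind of oversimplified synthetic data that tends to generalize poorly is often as crude as exponentially decaying noise shaped to a target RT60; a toy sketch of that baseline (not the hybrid simulation described in the talk):

```python
import numpy as np

def toy_rir(fs=16000, rt60=0.5, length_s=1.0, seed=0):
    """Exponentially decaying Gaussian noise as a stand-in RIR (toy baseline only)."""
    rng = np.random.default_rng(seed)
    n = int(fs * length_s)
    t = np.arange(n) / fs
    decay = 10.0 ** (-3.0 * t / rt60)           # reaches -60 dB at t = rt60
    ir = rng.standard_normal(n) * decay
    ir[0] = 1.0                                 # direct path
    return ir / np.max(np.abs(ir))

# reverberant = np.convolve(dry_speech, toy_rir(), mode="full")
```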


Wednesday October 25, 2023 9:45am - 10:00am EDT
1E07

9:45am EDT

(Lecture) A new transient distortion measurement algorithm for detecting audible loudspeaker manufacturing defects
Transient distortion, or ‘loose particle’ measurement, is an important loudspeaker production line quality control metric that identifies and facilitates troubleshooting of manufacturing issues.

This paper introduces a new enhanced loose particle measurement technique that discriminates more accurately and reliably than current methods. This new method introduces ‘prominence’, a new metric for audio measurements, that effectively isolates transient distortion in the presence of periodic distortion. This technique also offers the unique ability to listen to the isolated transient distortion waveform which makes it easy to set limits based on audibility.

Although measuring loose particles in loudspeaker drivers motivated this work, this technique also shows promising results for measuring impulsive distortion or Buzz, Squeak and Rattle (BSR) in automotive audio applications, and identifying rattling components such as buttons and wires, in audio devices.
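One classical way to isolate non-repeating content, assuming a periodic (e.g., stepped-sine) stimulus with an integer sample period, is period-synchronous averaging: the repeating part, including harmonic distortion, averages in, while loose-particle hits do not. A hedged sketch of that idea (the paper's 'prominence' metric itself is not reproduced here):

```python
import numpy as np

def isolate_transients(x, period_samples):
    """Split a response into its period-synchronous (periodic) part and the residual."""
    n_periods = len(x) // period_samples          # assumes an integer sample period
    frames = x[: n_periods * period_samples].reshape(n_periods, period_samples)
    periodic = frames.mean(axis=0)                # fundamental + periodic distortion
    residual = (frames - periodic).ravel()        # transient ("loose particle") content
    return residual, periodic
```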


Wednesday October 25, 2023 9:45am - 10:00am EDT
1E09

10:00am EDT

(Lecture) Excitation Stimuli For Simultaneous Deconvolution of Room Responses
This paper compares three state-of-the-art stimuli (multitone-pink, MLS, and log-sweep) for the deconvolution of several loudspeaker-room impulse responses using a single time-domain measurement after exciting all the loudspeakers simultaneously. A Bayesian hyper-parameter optimization algorithm constructs the stimulus, where the algorithm optimizes the stimuli parameters by minimizing a time-domain error between the actual impulse responses and the simultaneously deconvolved responses over a training dataset. Objective results are presented for the various stimuli on a test dataset, whereas subjective tests compare the preference to the excitation stimuli played on all the loudspeakers. Additionally, the robustness of the constructed stimuli to various noises at different signal-to-noise ratios (SNR) is compared in the context of simultaneous deconvolution.
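For reference, the single-stimulus building block underneath such comparisons: an exponential (log) sweep and regularized spectral-division deconvolution; a one-channel sketch (the paper's Bayesian construction of simultaneous multi-loudspeaker stimuli is beyond this snippet):

```python
import numpy as np

def log_sweep(f1, f2, duration, fs):
    """Exponential sine sweep from f1 to f2 Hz (Farina-style)."""
    t = np.arange(int(duration * fs)) / fs
    k = np.log(f2 / f1)
    return np.sin(2 * np.pi * f1 * duration / k * (np.exp(t * k / duration) - 1.0))

def deconvolve(recorded, stimulus, eps=1e-8):
    """IR = recorded / stimulus in the frequency domain (regularized division)."""
    n = len(recorded) + len(stimulus) - 1
    R = np.fft.rfft(recorded, n)
    S = np.fft.rfft(stimulus, n)
    H = R * np.conj(S) / (np.abs(S) ** 2 + eps)
    return np.fft.irfft(H, n)
```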

Speakers

Sunil G. Bharitkar

Samsung Research America

Pascal Brunet

Samsung Research America
Pascal Brunet obtained his Bachelor's in Sound Engineering from Ecole Louis Lumiere, Paris, in 1981, his Master's in Electrical Engineering from CNAM, Paris, in 1989 and a PhD degree in EE from Northeastern University, Boston, in 2014. His thesis was on nonlinear modeling of loudspeakers...


Wednesday October 25, 2023 10:00am - 10:15am EDT
1E07

10:00am EDT

(Lecture) Dynamic Equalizer applications to Loudspeaker Systems
When a loudspeaker system is driven close to its limit, power compression (PC) sets in. Power compression is due to heating of the voice coil and the consequent increase of voice-coil resistance: in the PC range, an increase in voltage no longer yields a proportional increase in current. The loudspeaker's sensitivity changes.
The tonal balance of a loudspeaker system, usually set for moderate signal operation, can therefore be expected to change at large signal levels, mostly due to power compression.
A Dynamic Equalizer (DE) can be a very useful tool to compensate for this tonal-balance modification: it is, in fact, a tool for equalizing an audio signal only under certain conditions of input signal amplitude, for example when the input signal exceeds a predefined threshold.
The paper describes the loudspeaker measurements and the consequent dynamic equalizer design and settings used to accomplish this compensation.
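A back-of-envelope illustration of the magnitude of the effect, assuming a copper voice coil (temperature coefficient about 0.39%/K) and constant-voltage drive; the numbers are illustrative, not from the paper:

```python
import numpy as np

R0, alpha = 6.0, 0.0039            # cold DC resistance (ohm); copper temp. coefficient
for dT in (50, 100, 150):          # voice-coil temperature rise (K)
    R_hot = R0 * (1 + alpha * dT)
    compression_db = 10 * np.log10(R_hot / R0)   # input-power drop at constant voltage
    print(f"dT={dT:3d} K: R={R_hot:5.2f} ohm, compression ~ {compression_db:.2f} dB")
# at dT = 100 K the resistance rises ~39%, roughly 1.4 dB for the DE to give back
```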


Wednesday October 25, 2023 10:00am - 10:15am EDT
1E09

10:00am EDT

AES Student Networking Center
Come network with students from audio programs around the world: share resources, discuss the convention, and exchange ideas for AES student engagement. Open to all audio students.

Wednesday October 25, 2023 10:00am - 11:15am EDT
TBA Education Stage

10:15am EDT

(Lecture) The fast measurement of loudspeaker responses for all azimuthal directions using the continuous measurement method with a turntable
This paper proposes a method for the fast measurement of loudspeaker impulse responses for all azimuthal directions using the continuous measurement method with a turntable. The loudspeaker radiates in all azimuthal directions as the turntable rotates at a constant angular velocity, and a measuring microphone records the radiated sound. In our continuous measurement method, we use a maximum length sequence (MLS) as the excitation signal, record the received signal using a measuring microphone placed in the anechoic room away from the target loudspeaker, and feed it, along with the MLS signal, into a PC so that impulse responses can be extracted for all azimuthal directions. This paper describes the concept of the method. Further, some results of the proposed method are verified using a physical realization and empirical measurements.
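The MLS trick relies on the sequence's nearly ideal circular autocorrelation: circularly cross-correlating one recorded period with the stimulus yields the impulse response directly. A single-rotation-step sketch (turntable control and angle bookkeeping omitted):

```python
import numpy as np
from scipy.signal import max_len_seq

def mls_stimulus(nbits=16):
    """Maximum length sequence mapped to +/-1; one period is 2**nbits - 1 samples."""
    seq, _ = max_len_seq(nbits)
    return 2.0 * seq - 1.0

def mls_impulse_response(recorded_period, mls):
    """Circular cross-correlation of one recorded period against the MLS gives the IR."""
    n = len(mls)
    return np.real(np.fft.ifft(np.fft.fft(recorded_period, n) *
                               np.conj(np.fft.fft(mls, n)))) / n
```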


Wednesday October 25, 2023 10:15am - 10:30am EDT
1E07

10:15am EDT

(Lecture) Towards Vibrotactile Transducer Characterization
Vibrotactile transducers (VTTs) are used in vibroacoustic applications for enhancing perception of low frequencies. These are used in conjunction with traditional audio transducers for a combined auditory and physical experience of sound and vibration perception, for applications ranging from music production to video gaming.

Audio transducers can be optimally selected taking into account audio content, environment and transducer characteristics required for the desired application. Optimal selection for VTTs additionally requires taking into account variations resulting from coupling of varying external mechanical loads at different power levels. The traditional method for measuring total harmonic distortion (THD) to capture harmonic content, while useful in the audio domain, does not factor in varying external mechanical loads and thus is not appropriate for judging performance of VTTs. Unlike traditional transducers, VTTs operate in highly nonlinear modes in most applications resulting in over 100% estimates for THDs. Thus, we introduce a new metric called transducer harmonics moment (THM), which considers the significance of harmonics with respect to their frequency and amplitude within a selected bandwidth. This results in more useful estimates compared to traditional THD in the vibrotactile domain.
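For comparison, the traditional THD the paper critiques is read off FFT bins at harmonic multiples of the drive frequency; a minimal sketch (the proposed THM metric is not reproduced here):

```python
import numpy as np

def thd_percent(x, fs, f0, n_harmonics=8):
    """Classic THD: RMS of harmonics 2..n relative to the fundamental."""
    spec = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    bin_of = lambda f: int(round(f * len(x) / fs))
    fund = spec[bin_of(f0)]
    harm = np.sqrt(sum(spec[bin_of(k * f0)] ** 2
                       for k in range(2, n_harmonics + 1)
                       if bin_of(k * f0) < len(spec)))
    return 100.0 * harm / fund   # can exceed 100% in highly nonlinear (VTT) regimes
```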

Speakers

Jackie Green

Nexonic Design
Jackie Green has enjoyed many opportunities to pursue great sound and innovative technologies. After BS and MBA degrees, Green pursued graduate courses in microprocessor design and digital signal processing in order to support creative work in digital wireless and audio. She is an...


Wednesday October 25, 2023 10:15am - 10:30am EDT
1E09

10:15am EDT

Forgotten Futures / Forbidden Planet: Rediscovering the unheard recordings of Louis and Bebe Barron
Best known for their pioneering electronic score to the science fiction film Forbidden Planet (1956), Louis and Bebe Barron produced experimental tape-based compositions and recordings spanning genres and decades. Over the past year, Wally De Backer (Gotye), David Barron, Volker Straebel, and Jessica Thompson have collaborated on the preservation of the Barron Archive, some 500 reel-to-reel tapes comprising masters, live performances, work tapes, work-for-hire recordings, and more. This group will discuss the technical challenges of working on experimental recordings dating back to the mid-1950s, the need to develop a lexicon to describe early electronic music, and Forgotten Futures' plans to study and create access to these extraordinary recordings.



Forgotten Futures revives lost and forgotten artifacts of electronic music history by preserving original instruments, presenting the stories of their inventors, and bringing to light their cultural impact, both at the time of their creation and, more importantly, in how they can make an impression on our current and future cultural moments.

Speakers

Aaron Rosenblum

Gotye, Forgotten Futures


Wednesday October 25, 2023 10:15am - 11:15am EDT
1E11

10:15am EDT

Iron Mountain Entertainment Services Presentation on Automated Media Image Capture System
Iron Mountain Entertainment Services (IMES) will present on the design, development, and roll-out of its new Automated Media Image Capture System (AMICS), a state-of-the-art pilot project designed to improve the search and access capabilities of music archives. AMICS can process 2,500+ media assets per shift, allowing unprecedented inventory search, filtering, workflow assignments, and more, transforming media inventory management. The panel will explain AMICS, showcase inventory examples, share challenges and lessons learned, and explain how this technology could be seamlessly analyzed and curated by an AI/ML engine.

Speakers

Nick Allen

Vice President of Asset & Archive Management, Universal Music Group
Nick Allen is Vice President of Asset & Archive Management at Universal Music Group. Based in LA, Nick is a seasoned professional in the music industry with over 20 years' experience in asset and archive management. He has a strong passion for technology and stays current on the...

Steven Hollencamp

Engineering Manager, Iron Mountain Entertainment Services
Steven Hollencamp is an Engineering Manager for Iron Mountain Entertainment Services in Boyers, Pennsylvania, an ultra-secure facility located 220 feet underground. Steve supports technology operations across IMES’ North American studios, including building out new satellite media...

Robert Koszela

Director of Studio Operations, Iron Mountain Entertainment Services
Bob Koszela is the Director of Studio Operations for Iron Mountain Entertainment Services North America. A musician and songwriter, Bob brings over 30 years' experience in archive media migration, major record label operations, A&R, marketing/promotion, new media, production...

Meg Travis

Director, Global Head of Marketing, Iron Mountain Entertainment Services
Meg Travis is Director, Global Head of Marketing for Iron Mountain Entertainment Services, where she leads industry and client engagement to raise awareness of the urgency and importance of preserving our collective musical heritage. Meg works closely with industry organizations such...


Wednesday October 25, 2023 10:15am - 11:15am EDT
1E08

10:15am EDT

AI for Multitrack Music Mixing
Mixing is a central task within audio post-production where expert knowledge is required to deliver professional quality content, encompassing both technical and creative considerations. In this comprehensive workshop, we will explore recent advances in deep learning approaches for multitrack music mixing that utilize large-scale datasets, surpassing traditional expert systems. Topics covered will encompass intelligent music production, the importance of context in mixing, challenges in system design, and novel deep learning techniques such as differentiable mixing consoles and mixing style transfer. This workshop will also address future directions and challenges, and emphasise the need for interpretable and interactive systems. By bringing together researchers and professionals in audio engineering and digital signal processing, this workshop intends to foster the exploration of deep learning techniques specifically tailored for multitrack music mixing and wider audio engineering problems. The participation of the AES community is invaluable as we collectively strive to find solutions and shape the future of deep learning in audio mixing.
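To make "differentiable mixing console" concrete: the simplest possible version is per-stem gain and pan parameters optimized by gradient descent against a reference mix. A toy PyTorch sketch (real systems add EQ, dynamics, reverb, and neural parameter estimators):

```python
import math
import torch

class TinyMixConsole(torch.nn.Module):
    """Learnable gain plus constant-power pan per stem; differentiable end to end."""
    def __init__(self, n_stems):
        super().__init__()
        self.gain_db = torch.nn.Parameter(torch.zeros(n_stems))
        self.pan = torch.nn.Parameter(torch.zeros(n_stems))      # maps to L..R

    def forward(self, stems):                  # stems: (n_stems, n_samples)
        g = 10.0 ** (self.gain_db / 20.0)
        theta = (torch.tanh(self.pan) + 1.0) * math.pi / 4.0     # 0..pi/2
        left = (g * torch.cos(theta)).unsqueeze(1) * stems
        right = (g * torch.sin(theta)).unsqueeze(1) * stems
        return torch.stack([left.sum(0), right.sum(0)])          # (2, n_samples)

# fitting to a reference mix:
# console = TinyMixConsole(n_stems=8)
# opt = torch.optim.Adam(console.parameters(), lr=1e-2)
# for _ in range(500):
#     loss = torch.nn.functional.mse_loss(console(stems), reference_mix)
#     opt.zero_grad(); loss.backward(); opt.step()
```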

Speakers

Gary Bromham

Researcher & Independent Music Professional, Centre for Digital Music, Queen Mary University of London

Marco Martínez

Sony Research, Tokyo, Japan

Junghyun Koo

Music and Audio Research Group, Seoul National University

Brecht De Man

PXL University of Applied Sciences and Arts

David Ronan

CEO, RoEX Audio


Wednesday October 25, 2023 10:15am - 11:45am EDT
1E10

10:30am EDT

(Lecture) Loudspeaker position identification using human speech directivity index
A regular user of a multichannel loudspeaker system in a typical living room sets the loudspeakers up in a non-uniform manner, with angles and distances that don’t necessarily follow the recommended ITU-R BS.2159-4 standard. Assuming a multichannel audio system equipped with N loudspeakers and M very-near-field (NF) microphones attached to each loudspeaker, the listener location with respect to the loudspeakers can be estimated by utilizing a supervised machine learning (ML) model. Two neural networks (NN) were trained with the human speech directivity index (DI) computed by room simulations, where the sound source was the typical directivity radiation pattern of human speech, and the receivers were the NF microphones attached to the loudspeakers. The distances between loudspeakers and the DI data were combined as input for the two NN models. One network was dedicated to estimating distances from loudspeaker to user, and the other to angle estimation. The results showed a 95% confidence interval (CI) of ±1.7 cm for distance and a CI of ±7 degrees for the incidence angle.


Wednesday October 25, 2023 10:30am - 10:45am EDT
1E07

10:30am EDT

(Lecture) Iterative metric-based waveguide optimisation
This paper explores an automated iterative optimisation process aimed at enhancing the performance of acoustic waveguides which are thin in one dimension by making them better at supporting single parameter (1P) wave propagation across a wide frequency range. The optimisation process is driven by two performance metrics that are calculated from the solution of Laplace’s equation in the waveguide. These highlight regions of error in the relative pathlength (“stretch”) and change in area (“flare”) continuously through the domain. Finite Element Analysis (FEA) is used to calculate these metrics on a test case. The error in the metrics is reduced using a fast iterative optimisation loop which equalises the relative pathlength and adjusts the area expansion by adding corrugations and deformations to regions of a thin domain, as is disclosed in and protected by GP Acoustics patent GB2588142 (Dodd & Oclee-Brown, 2021). The relationship between the waveguide metric errors and the acoustic wave coherence is investigated during each iteration for a simple test case. The metric-based optimisation approach is then used to create a wide directivity line array horn and wave-shaper. FEA simulations of the Helmholtz wave equation are used to analyse the performance of the assembly, and the results are compared to a traditional line array conical-diffraction horn and an exponential horn.

Speakers

Jack Oclee-Brown

GP Acoustics (UK) Ltd.

Mark Dodd

GP Acoustics (UK) Ltd.


Wednesday October 25, 2023 10:30am - 10:45am EDT
1E09

10:45am EDT

(Lecture) On the Impact of Neglecting Accurate Sound-Speed Models on the Cylinder Measurement Method for Directivity Balloons
The far-field acoustical transfer function of an electroacoustic device, such as a loudspeaker, is of fundamental importance in acoustic modelling software to predict the resulting sound field produced by multiple devices in 3D space. This dataset, commonly known as a directivity balloon, is usually acquired through time-consuming, sophisticated measurement techniques involving extensive hardware and dedicated post-processing algorithms. Additionally, such a dataset is usually compensated for the effect of acoustical propagation to reference the magnitude and phase values to conventional distances. This work investigates the effects of propagation compensation within the cylinder measurement method for directivity balloons, specifically considering the use of inaccurate sound-speed values relative to the ambient conditions during the measurement process. The importance of employing environmental-parameter-dependent models for the propagation speed of sound in this type of measurement is emphasized, so as to maintain high accuracy in the final phase response of the directivity data.
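The sensitivity at stake is easy to quantify: in dry air the speed of sound varies with temperature roughly as c(T) = 331.3·sqrt(1 + T/273.15) m/s, so compensating with the wrong temperature leaves a large phase error at high frequencies; a quick check with illustrative numbers:

```python
import numpy as np

c = lambda T: 331.3 * np.sqrt(1.0 + T / 273.15)   # dry-air speed of sound, T in Celsius
r, f = 4.0, 10_000.0                              # reference distance (m), frequency (Hz)
phase_deg = lambda T: 360.0 * f * r / c(T)        # propagation phase at distance r
err = phase_deg(20.0) - phase_deg(30.0)
print(f"c(20C)={c(20):.1f} m/s, c(30C)={c(30):.1f} m/s, phase error ~ {err:.0f} deg")
# a ~6 m/s speed difference is hundreds of degrees of phase at 10 kHz over 4 m
```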


Wednesday October 25, 2023 10:45am - 11:00am EDT
1E07

10:45am EDT

(Lecture) Application of Matrix Analysis and FEA for the Modeling of Horn Drivers
Matrix analysis is an efficient tool for the modeling of electroacoustical circuits and systems. In the submitted work an approach is proposed where the matrix model is based on the fully coupled electro-mechanical-acoustical FEA approach. This method makes it possible to calculate the fully coupled 3-dimensional model of a compression driver that considers all details and components, including glue joints, the complex mechanical behavior of the diaphragm, the acoustical properties of the compression chamber, and the phasing plug. In the analysis of electrical circuits, the parameters A11 and A21 are obtained from the condition of an open output (zero current), and the parameters A12 and A22 are obtained from the condition of a short circuit at the output (zero voltage). The acoustical equivalents of voltage and current are sound pressure and volume velocity. The condition of zero velocity is easily modeled by an acoustically hard boundary at the exit of the compression driver, whereas the condition of zero sound pressure cannot be modeled by FEA. To overcome this obstacle, an indirect method is used by applying plane-wave tube conditions. The results obtained by the FEA matrix analysis are compared with the data derived from the fully coupled FEA model and from the model based on the measured matrix coefficients. The developed method makes it possible to speed up the process of modeling and optimization of horns and waveguides because it does not require running the fully coupled FEA model of the compression driver each time an iteration in the geometry of the horn or waveguide is made.
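In two-port form, each stage is a 2x2 transmission (ABCD) matrix relating pressure and volume velocity at the input to those at the output, and stages cascade by matrix multiplication; a minimal sketch of the bookkeeping with a lossless-tube element (values illustrative, not from the paper's FEA):

```python
import numpy as np

def cascade(*stages):
    """Cascade two-port transmission matrices: [p1, U1]^T = A_total @ [p2, U2]^T."""
    total = np.eye(2, dtype=complex)
    for A in stages:
        total = total @ A
    return total

def tube(L, S, k, rho_c=415.0):
    """ABCD matrix of a lossless acoustic tube: length L, area S, wavenumber k."""
    Zc = rho_c / S                               # characteristic acoustic impedance
    return np.array([[np.cos(k * L), 1j * Zc * np.sin(k * L)],
                     [1j * np.sin(k * L) / Zc, np.cos(k * L)]])

# open output (U2 = 0) reads off A11 = p1/p2 and A21 = U1/p2, as in the paper
```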

Speakers

Alexander Voishvillo

Fellow, Acoustics Engineering, Harman Professional
Alexander Voishvillo was born and grew up in Saint Petersburg, Russia where he graduated from the State University of Telecommunications. He worked at Popov Institute for Radio and Acoustics. He designed loudspeakers and studio monitors and did research on loudspeakers and transducers...


Wednesday October 25, 2023 10:45am - 11:00am EDT
1E09

10:45am EDT

Camera Based Head Tracking for Multichannel Audio: A Technology for the Masses
During the past few years, Spatial Audio has become part of the listening experience for millions of people, but only a few of these experiences allow you to fully appreciate all the advantages of Spatial Audio due to the lack of head tracking. Some of the challenges around achieving full head-tracked audio include the need for a sensor that accurately tracks your head pose, and the technology's sensitivity to latency. Given the necessity for a non-invasive and low-cost method to integrate head tracking data into binaural renderers, cameras appear to be the ideal sensor as they already exist in most of the devices we use to consume multichannel audio experiences. In this presentation, we will discuss the challenges and advantages of using camera-based head-tracked audio for multichannel experiences.


Wednesday October 25, 2023 10:45am - 11:45am EDT
1E06

11:00am EDT

(Lecture) Using high-resolution directivity data of musical instruments for acoustic simulation and auralization
In a larger measurement campaign by TU Berlin and RWTH Aachen, the directivity, sound power and a number of audio features were measured for 42 musical instruments at different dynamic levels. These data sets were processed and made available in a number of public, general-purpose data formats.

This paper discusses the conversion and adaptation of the data for use in acoustic modeling programs. In particular, high-resolution AES56-based GLL data files have been created. For this purpose, the directional measurements had to be processed into FIR data sets. For correct simulation of the sound pressure levels, the absolute output level had to be calibrated for each data set. The resulting GLL data sets for use in software such as EASE or EASE Focus are now publicly available.

The paper also discusses typical application scenarios for the new data sets. In practice, modeling is important to evaluate the coverage and radiation patterns of a single instrument or groups of instruments. This is of interest, for example, for the analysis of source localization and sound reinforcement options, including spatial audio applications. Another point of interest is the simulated critical distance of musical instruments, which can be used to optimize the positions of pickup microphones and room microphones. Finally, auralizations of rooms and venues are often based on musical performances, which can now be generated using real-world data sets.
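The critical-distance application mentioned above follows from the diffuse-field relation d_c ≈ 0.057·sqrt(Q·V/RT60), in metres with volume V in cubic metres; a quick calculation with illustrative values:

```python
import numpy as np

def critical_distance(Q, V, rt60):
    """Sabine-based critical distance: where direct and diffuse levels are equal."""
    return 0.057 * np.sqrt(Q * V / rt60)

# an instrument with directivity factor Q ~ 3 in a 12,000 m^3 hall with RT60 = 2 s:
print(f"{critical_distance(3.0, 12000.0, 2.0):.1f} m")   # ~7.6 m
```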


Wednesday October 25, 2023 11:00am - 11:15am EDT
1E07

11:00am EDT

Ulrike Schwarz: Enveloping Masterclass
Ulrike Schwarz plays high resolution 7.1.4 music recordings, including recent examples from New York City during the pandemic lockdown, describing also the techniques used. Ulrike and Thomas discuss the Inception Dilemma, and how recordings come across in this particular room. Seats are limited to keep playback variation at bay, and the session is concluded with Q&A.

Immersive formats do not guarantee envelopment, or better control of a listening experience than stereo has to offer. In this masterclass series, we discuss and exemplify factors of recording, mixing, distribution and reproduction that make immersive worth the effort.

Speakers

Thomas Lund

Senior Technologist, Genelec Oy
Thomas Lund is senior scientist at Genelec, doing subjective research and listening tests. He has written papers on perception, spatialisation, loudness, sound exposure and true-peak level. Thomas is convenor of a working group under the European Commission, tasked with the prevention...

Ulrike Schwarz

Engineer/Producer, Co-Founder, Anderson Audio New York
A broadly acclaimed engineer and producer, Ulrike Schwarz is a trailblazing audio innovator with more than two decades of experience across the film, television, radio, and recording industries. Born and raised in Germany, Schwarz discovered she had perfect pitch at an early age...


Wednesday October 25, 2023 11:00am - 12:00pm EDT
3D01

11:15am EDT

(Lecture) Comparing Virtual Source Configurations for Pipe Organ Auralization
It is challenging to study the sound of a pipe organ without considering both its size and the room where it is located. The present work investigates how to record a pipe organ with minimal room reverberation and how to realistically auralize dry organ recordings in a room acoustic model. Musical excerpts were recorded with a number of microphones positioned within the buffets of a large organ in order to capture the "dry" sound of the organ. Simultaneously, the music was also recorded with a binaural head positioned in the nave of the church. The dry organ recordings were then auralized from the same listener perspective using a calibrated geometric acoustic model of the church with various virtual source configurations, ranging in complexity from a single source at the center of the instrument to a virtual source position for each recorded microphone track. A listening test was performed to evaluate the realism and plausibility of the auralizations. The results yield suggestions for simulating the sound of a pipe organ in a geometric acoustic model, having broad implications for the planning of new pipe organs and for studying historic organs located in cultural heritage sites.
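Once per-source RIRs are available from the calibrated room model, the auralization itself reduces to a sum of convolutions, one per virtual source; a minimal mono sketch (a binaural render uses a pair of RIRs per source):

```python
import numpy as np
from scipy.signal import fftconvolve

def auralize(dry_tracks, rirs):
    """Sum of fftconvolve(track_i, rir_i): one virtual source per recorded mic track.

    dry_tracks: list of 1-D arrays (close-miked organ recordings)
    rirs:       list of matching source-to-listener impulse responses
    """
    n = max(len(x) + len(h) - 1 for x, h in zip(dry_tracks, rirs))
    out = np.zeros(n)
    for x, h in zip(dry_tracks, rirs):
        y = fftconvolve(x, h)
        out[:len(y)] += y
    return out / np.max(np.abs(out))
```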


Wednesday October 25, 2023 11:15am - 11:30am EDT
1E07

11:30am EDT

Decolonizing Audio Education, Patrick Arnoud Koffi, AKA Tupaï N’Gouin
Patrick Arnoud Koffi, also known as Tupaï N’Gouin, is a beatmaker and music producer from Abidjan, Côte d'Ivoire. Tupaï will present on his audio career, including a concise introduction providing insight into himself, his work, and Victory B Studios, the studio where he produces beats, arranges music, and mixes tracks for trap and drill artists across local and international scenes spanning France, the UK, and the USA. Additionally, he will discuss his work producing music for religious and Mande artists, as well as crafting commercial jingles and music for television shows. Tupaï was first introduced to the AES community through a profile in a paper titled The Art of Remixing Abidjan (Ivory Coast), presented at the AES 153rd Convention in 2022. Participants will learn about his experience in the field of audio and sound recording, including the training and educational opportunities (or lack thereof) that were available to him. This presentation provides the opportunity to reflect upon what the next generation of audio professionals can do to help build a more inclusive and diverse community of audio engineers. It acknowledges that historical and current Western paradigms should not automatically be privileged, and it endorses and promotes alternative epistemologies of audio engineering alongside the Western approaches and practices that are foundational to a diverse and vibrant audio community.


Wednesday October 25, 2023 11:30am - 12:30pm EDT
TBA Education Stage

11:45am EDT

Awards & Opening Ceremonies
Wednesday October 25, 2023 11:45am - 11:45am EDT
1E08

12:30pm EDT

Keynote: Prince Charles Alexander w/ Hank Shocklee
Hank Shocklee, a true pioneer in the audio industry and founder of Public Enemy and the Bomb Squad is taking the stage as a keynote speaker at this year’s convention. Shocklee revolutionized the game by bringing sampling to the forefront and using groundbreaking techniques in his productions that were unheard of at the time.

Not just a hip-hop legend but also a Rock and Roll Hall of Fame inductee, Shocklee has been a force behind many cult-classic and breakthrough music and film projects, working with artists including Mary J. Blige, Anthony Hamilton, Ice Cube, LL Cool J, and Slick Rick, and on films such as Ridley Scott’s American Gangster, Spike Lee's Do The Right Thing, and Ernest Dickerson's Juice.

The conversation will be moderated by esteemed Record Producer, Audio Engineer and Berklee College of Music Professor, Prince Charles Alexander.

Speakers

Hank Shocklee

Hank Shocklee, a true pioneer in the world of music production, sound design, and composition, will share his remarkable journey during his keynote address. Bursting onto the music scene during the formative years of hip-hop, Shocklee's impact can be traced back to his groundbreaking...

Prince Charles Alexander

Prince Charles Alexander is a multi-platinum, 3x Grammy-winning music producer and audio engineer who has worked with Mary J. Blige, Puff Daddy, The Notorious B.I.G., Aretha Franklin, Sting and many others. As a Professor of Music Production and Engineering at Berklee College of...


Wednesday October 25, 2023 12:30pm - 1:30pm EDT
1E08

1:45pm EDT

(Lecture) Towards the Rational Classification of Recording Technology
This paper outlines the foundation and development of a classification system for recording technology. It seeks to identify and define the essential characteristics of recording technology by which its artefacts can be defined and rationally organised relative to one another. The paper details the necessary theoretical foundations upon which this classification system has been constructed and provides historical examples where appropriate. The classification system has been developed using two parallel methods which have each informed and refined each other. The first is constructed from the observation of differences between groups of functionally similar recording devices. It then analyses these differences to describe the defining distinctions between these categories in a rational way. The second method is a facet analysis, whereby the fundamental properties of recording devices (or facets) have been mathematically defined and are represented for each example of recording device. This method allows for the further development of precise language and tools for the description and understanding of recording technology. The paper will present and evaluate both these methods before detailing the first four broad classes of recording technology which have thus been identified and rationally described. These first four classes examine devices ranging from the earliest examples of phonographic recording, to magnetic media, through to contemporary digital technologies. It will present each of these classes in a clear and ordered manner, with choice examples and analysis of edge cases where appropriate. The paper will conclude with an overview of the current state of the classification system and a survey of the expected outcomes and opportunities for future research.
This paper represents a component of the primary author’s ongoing doctoral thesis due for submission in 2025 and is an iteration upon a presentation made by both authors to the Adelaide AES Chapter in February 2023. For further information or feedback, the authors can be reached at alexander.mader@adelaide.edu.au.


Wednesday October 25, 2023 1:45pm - 2:00pm EDT
1E07

1:45pm EDT

(Lecture) Spatial Sound Stability Enhancement By Advanced User-Tracked Loudspeaker Rendering
For consumer home entertainment, rendering spatial sound on multiple loudspeakers is important. Stimulated by the recent push for Virtual and Augmented Reality (VR/AR) applications, precise and inexpensive tracking devices have become available that can help to enhance the loudspeaker playback user experience for any type of setup from 2.0 (stereo) to 22.2 3D audio. By using the tracked listener position and digital signal processing, the sensation of a large “sweet area” and vastly increased spatial stability can be experienced by the listener, approaching what is known from wavefield synthesis (but at a small fraction of the rendering effort, and for a single listener). This paper describes an advanced user-tracked rendering system for multi-channel audio including a number of novel features (compensation of loudspeaker directivity pattern, consideration of room reverberation, and a simple parametrization for both aspects). The system’s subjective rendering performance for image stability is evaluated in a subjective test and shows marked improvement over non-tracked rendering. The described technology has been accepted for loudspeaker rendering in the upcoming MPEG-I Immersive Audio standard.
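One building block of such listener-tracked rendering is re-aligning loudspeaker delays and gains to the tracked position; a minimal sketch (the described system additionally compensates loudspeaker directivity and room reverberation):

```python
import numpy as np

def sweet_spot_compensation(listener_pos, speaker_pos, c=343.0, fs=48000):
    """Per-loudspeaker delay (samples) and gain so all arrivals match the farthest path."""
    d = np.linalg.norm(np.asarray(speaker_pos) - np.asarray(listener_pos), axis=1)
    d_max = d.max()
    delays = np.round((d_max - d) / c * fs).astype(int)   # delay the nearer speakers
    gains = d / d_max                                     # attenuate nearer speakers (1/r)
    return delays, gains
```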


Wednesday October 25, 2023 1:45pm - 2:00pm EDT
1E09

1:45pm EDT

The 4pi sound studio
MIL's vision for developing the future of immersive sound production: MIL is a testing and research facility equipped with a 43.2 (up to 62.2) playback array with multiple rendering methods, such as Dolby Atmos, Auro-3D, 360 RA, HOA, SPAT Revolution, etc. We create several versions of immersive content from the same sound sources using different types of playback arrays and rendering methods at MIL, and we will discuss the findings and our vision for the future of immersive sound production.

Speakers

Jonathan Wyner

Mastering Engineer, Berklee College of Music, Professor of Music Production and Engineering; M Works, Chief Engineer/Consultant
AES President 2021, Jonathan Wyner is a Technologist, Education Director for iZotope in Cambridge, MA, Professor at Berklee College of Music in Boston, and Chief Engineer at M Works Mastering. A musician and performer, he’s a Grammy-nominated producer who has mastered and produced...

Masataka Nakahara

Acoustic Designer / Acoustician, SONA Corp. / ONFUTURE Ltd.
Masataka Nakahara is an acoustician specializing in the acoustic design of studios and R&D work on room acoustics, and is also an educator. After studying acoustics at the Kyushu Institute of Design, he joined SONA Corporation and started his career as an acoustic designer. In 2005...

Ryuichi Kitaki

Media Integration, Inc.

Yosuke Maeda

Media Integration, Inc.


Wednesday October 25, 2023 1:45pm - 2:45pm EDT
1E06

2:00pm EDT

(Lecture) Emulating Vector Base Amplitude Panning Using Panningtable Synthesis
This paper presents Panningtable Synthesis (PTS) as an alternative approach to panning virtual sources in spatial audio that is both a generalization of and more efficient than Vector Base Amplitude Panning (VBAP). This new approach is inspired by a previous technique called Rapid Panning Modulation Synthesis (RPMS). RPMS, however, is limited in that all secondary sources need to be regularly spaced across the circle and organized in equally spaced circles across the sphere. We demonstrate that PTS is not only able to overcome these restrictions, but that it is also fully compliant with VBAP, more computationally efficient, and can be regarded as a generalization of the same. Furthermore, we demonstrate that PTS is also able to supersede RPMS in its capacity to create and shape sound spectra, independently of the number of secondary sources used in the array. Considering creative spatial sound synthesis techniques, PTS can be compared to Wavetable or Wave-Terrain Synthesis, but with added, inherent spatial characteristics. The flexibility of PTS allows any degree of trade-off between using perceptually correct panning curves and those that target specific sound spectra.
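As a baseline for readers, classic two-dimensional VBAP solves a 2x2 linear system so that the active loudspeaker pair's unit vectors sum to the source direction; a minimal sketch (PTS itself, the authors' generalization, is not reproduced):

```python
import numpy as np

def vbap_2d_gains(source_az_deg, spk_az_deg):
    """Gains for one loudspeaker pair (azimuths in degrees), power-normalized."""
    unit = lambda a: np.array([np.cos(np.radians(a)), np.sin(np.radians(a))])
    L = np.column_stack([unit(a) for a in spk_az_deg])   # 2x2 base of the pair
    g = np.linalg.solve(L, unit(source_az_deg))
    return g / np.linalg.norm(g)                         # constant-power normalization

print(vbap_2d_gains(15.0, (-30.0, 30.0)))   # source 15 deg inside a +/-30 deg pair
```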


Wednesday October 25, 2023 2:00pm - 2:15pm EDT
1E09

2:00pm EDT

(Lecture) Assessing Accessibility within the Recording Industry for Engineers and Producers with Vision Loss
As recording technology shifts primarily to digital interfaces, these highly graphics-based solutions present potential access issues for the millions of Americans who identify as blind or low vision. This paper assessed the accessibility of recording technology for engineers and producers with vision loss in the U.S., and the roles that financial and societal accessibility barriers play in the broader discussion of accessibility and career success. A mixed-methods approach was employed, including an online survey of 57 participants, with and without vision loss, as well as interviews with industry experts. Findings revealed that while users with vision loss had more difficulty navigating recording software, they navigated basic keyboard-shortcut tasks better than those without vision loss. Financial burdens, societal issues, and lack of practical opportunities were recognized as significant barriers to success for recording professionals with vision loss despite the accessibility of technology. This paper provides suggestions for improving the navigability of recording technology, addresses the broader recording-industry barriers, and proposes that future research use the extensive survey data collected here for further in-depth, scientific analysis.


Wednesday October 25, 2023 2:00pm - 2:30pm EDT
1E07

2:00pm EDT

Immersive Audio in Education
Immersive audio has been an important part of the audio industry for many years, especially in the area of visual media. But with the adoption of immersive audio by streaming platforms such as Apple Music and Tidal, along with easily accessible tools for producing immersive audio content, it is clear that immersive audio is now an essential part of an audio engineer’s tool set. Both experienced and novice engineers must be educated in the technology and techniques to create and deliver immersive audio content.

This panel of content creators and educators will explore issues surrounding immersive audio in education and the challenge of creating educational materials and curricula in such a varied and rapidly changing field.

Moderators

Konrad Strauss

Professor, Indiana University Jacobs School of Music
Konrad Strauss is a Professor of Music in the Department of Audio Engineering and Sound Production at Indiana University's Jacobs School of Music. While at the Jacobs School Mr. Strauss built a new curriculum centered around current audio production techniques and technology and designed...

Speakers

Hyunkook Lee

Professor, Applied Psychoacoustics Lab, University of Huddersfield
I am Professor of Audio and Psychoacoustic Engineering and Director of the Applied Psychoacoustics Lab (APL)/Centre for Audio and Psychoacoustic Engineering (CAPE) at the University of Huddersfield, UK. Past and current research areas include the perception of auditory height and...


Wednesday October 25, 2023 2:00pm - 3:00pm EDT
1E11

2:00pm EDT

Morten Lindberg: Enveloping Masterclass
Morten Lindberg plays high resolution 7.1.4 music recordings, describing also the microphone and recording techniques used.

Morten and Thomas discuss the Inception Dilemma, and how the recordings come across in this particular room. Seats are limited to keep playback variation at bay, and the session is concluded with Q&A.

Immersive formats do not guarantee envelopment, or better control of a listening experience than stereo has to offer. In this masterclass series, we discuss and exemplify factors of recording, mixing and reproduction that make "immersive" worth the effort. Examples are based on discrete channel, linear coding.

Speakers

Thomas Lund

Senior Technologist, Genelec Oy
Thomas Lund is senior scientist at Genelec, doing subjective research and listening tests. He has written papers on perception, spatialisation, loudness, sound exposure and true-peak level. Thomas is convenor of a working group under the European Commission, tasked with the prevention...

Morten Lindberg

Producer & Engineer, 2L (Lindberg Lyd AS)
Recording Producer and Balance Engineer with 42 American GRAMMY-nominations since 2006, 34 of these in craft categories Best Engineered Album, Best Surround Sound Album, Best Immersive Audio Album and Producer of the Year. Founder and CEO of the record label 2L. Grammy Award-winner...


Wednesday October 25, 2023 2:00pm - 3:00pm EDT
3D01

2:15pm EDT

(Lecture) Perceptually Motivated Bitrate Allocation for Object-Based Audio Using Opus Codec
With the increasing popularity of immersive audio, using legacy tools for these new formats can be challenging. This paper presents an overview of how to utilize the Opus audio codec for object-based audio. We reviewed the performance of Opus using two different bit-allocation strategies: a vanilla method that uses the same bitrate for each object, and a joint allocation method that distributes the total bitrate among objects according to their estimated perceptual importance. The proposed joint allocation significantly outperformed the vanilla method at the same total bitrate and achieved an "Excellent" score during MUSHRA testing with a significant bitrate saving.
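The allocation idea can be stated compactly: split the total bit budget across objects in proportion to estimated perceptual importance, with a floor per object. A hedged sketch of such a scheme (the paper's importance estimator is not specified here, and the weights below are made up):

```python
def allocate_bitrates(importance, total_kbps, floor_kbps=8.0):
    """Split total_kbps across objects proportionally to perceptual importance."""
    n = len(importance)
    spendable = total_kbps - n * floor_kbps       # assumes the total covers the floors
    total_importance = sum(importance)
    return [floor_kbps + spendable * w / total_importance for w in importance]

# 4 objects, 128 kbps total: a dominant lead vocal gets most of the budget
print(allocate_bitrates([0.6, 0.2, 0.1, 0.1], 128.0))
```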


Wednesday October 25, 2023 2:15pm - 2:30pm EDT
1E09

2:30pm EDT

(Lecture) Enhancing Spatial Audio Generation with Source Separation and Channel Panning Loss
Spatial audio is essential for many immersive content services; however, it is challenging to obtain or create it. Recently, multimodal-based ambisonic audio generation has emerged as a promising approach for addressing the limitation. It combines multiple modalities, such as audio and video, and provides more intuitive control of ambisonic audio generation. Moreover, it leverages the advantages of machine-learning methods to automatically learn the correlation between different features and generate high-quality ambisonic sounds. Herein, we propose a separation- and localization-based spatial audio generation model. First, the network extracts visual features and separates audio into sound sources. Then, it conducts localization by mapping the separated sound sources to the visual features. To overcome the performance limitation of the previous self-supervised source separation approach, we employ a pretrained source separator with superior performance. To improve the localization performance, we introduce a channel panning loss function between each channel of the ambisonic signal. We use three different types of datasets to train the model experimentally and evaluate the proposed method with four quantitative metrics. The results show that the proposed model achieves better spatialization performance than baseline models.


Wednesday October 25, 2023 2:30pm - 2:45pm EDT
1E09

2:30pm EDT

(Lecture) Evaluating Web Audio for Learning, Accessibility, and Distribution
Web Audio has great potential for interactive audio content: an open standard and easy integration with other web-based tools make it particularly interesting. Earlier studies identified obstacles that kept students from materializing creative ideas through programming; focus shifted from artistic ambition to solving technical issues. This study builds upon 20 years of experience from teaching sound and music computing and evaluates how Web Audio contributes to the learning experience. Data was collected from different student projects through analysis of source code, reflective texts, group discussions, and online self-evaluation forms. The results indicate that Web Audio serves well as a learning platform and that an XML abstraction of the API helped the students stay focused on the artistic output. It is also concluded that an online tool can reduce the time for getting started with Web Audio to less than one hour. Although many obstacles have been successfully removed, the authors argue that there is still great potential for new online tools targeting audio application development in which accessibility and sharing features contribute to an even better learning experience.


Wednesday October 25, 2023 2:30pm - 3:00pm EDT
1E07

2:45pm EDT

(Lecture) SVD-Domain Basis Vector Interpolation and Bidirectional Cascaded Long Term Prediction for Frame Loss Concealment in Higher Order Ambisonics Signals
This paper proposes a novel frame loss concealment technique for higher order ambisonics (HOA) audio signals. It is designed to overcome the challenge of interpolating lost frames of HOA data and recover a close approximation of the original data without significantly impacting its localization. The method combines two techniques. The first, cascaded long-term prediction, uses a cascade of long-term prediction filters to capture the periodic components of music signals and predicts the lost frame's ambisonics channels, in the SVD domain, from the periodic components of the past and future frames. Additionally, interpolation of the SVD basis vectors is used to accurately reconstruct the spatialization of the lost frame. Objective and subjective evaluations show this method to be superior in accurately reconstructing lost frames to cascaded long-term prediction applied directly to the ambisonics signal in the time domain, as well as to other techniques.
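As a rough Python sketch of the interpolation half of the method (the cascaded long-term prediction stage is omitted, and the sign alignment and re-orthonormalisation details are assumptions rather than the paper's procedure):

```python
import numpy as np

# Illustrative only: take the SVD of the HOA frames before and after a
# lost frame, blend the spatial basis vectors, and re-orthonormalise.

def interpolate_basis(frame_prev, frame_next, alpha=0.5):
    """frame_*: (channels, samples) HOA frames surrounding the loss."""
    U_prev, _, _ = np.linalg.svd(frame_prev, full_matrices=False)
    U_next, _, _ = np.linalg.svd(frame_next, full_matrices=False)
    # Resolve SVD sign ambiguity so matching columns point the same way.
    signs = np.sign(np.sum(U_prev * U_next, axis=0))
    U_mid = (1 - alpha) * U_prev + alpha * (U_next * signs)
    Q, _ = np.linalg.qr(U_mid)      # re-orthonormalise the blended basis
    return Q

rng = np.random.default_rng(3)
prev, nxt = rng.standard_normal((2, 9, 256))   # two 9-channel HOA frames
U_mid = interpolate_basis(prev, nxt)           # basis for the lost frame
```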


Wednesday October 25, 2023 2:45pm - 3:00pm EDT
1E09

3:00pm EDT

(Lecture) Advancing DEI in AES: A Pilot Study of AES Convention Participants Data Analysis
This paper presents a comprehensive data analysis of a pilot study conducted at the AES Europe 2023 154th Convention to explore diversity, equity, and inclusion (DEI) efforts within the AES community. This pilot study gathered 109 unique survey responses, offering valuable insights into demographics and opinions among AES members. While acknowledging the limitations of the sample size, the study provides a foundation for identifying trends and enhancing future surveys to advance DEI endeavors. By examining various dimensions of identity, such as gender, race, and language, the analysis reveals how different demographic groups perceive AES and its DEI initiatives. Notably, the survey demonstrated inclusivity by attracting non-male, non-white members, emphasizing the importance of expanding inclusivity to non-binary genders. Support for DEI initiatives is evident. However, two rare cases expressed underlying skepticism about the authority of the AES DEI Committee to pursue its programs and suggested that DEI work is unnecessary. In the analysis, the authors dive deeper into these varying opinions of AES and show the merit of conducting a larger-scale DEI survey in order to establish a baseline of the current demographic makeup of the AES community and track how it changes over time. This research contributes to the ongoing efforts of creating a more diverse, equitable, and inclusive AES community and lays the groundwork for future studies.

Speakers
avatar for Jiayue Cecilia Wu

Jiayue Cecilia Wu

Assistant Professor, University of Colorado Denver
Jiayue Cecilia Wu, Ph.D. is a scholar, composer, audio engineer, technologist, and vocalist. Her work focuses on how music technology can augment the healing power of music. She earned her BS in Design and Engineering in 2000. She then worked as a professional musician, publisher... Read More →
avatar for Mary Mazurek

Mary Mazurek

Audio Educator/ Recording Engineer, University of Lethbridge
Audio Educator at the University of Lethbridge. GRAMMY-nominated recording engineer based in Lethbridge, AB and Chicago, IL. Research & professional work: classical and acoustic music recording, live radio broadcast, podcasting, the aesthetic usage of noise, noise art, sound art... Read More →


Wednesday October 25, 2023 3:00pm - 3:15pm EDT
1E07

3:00pm EDT

(Lecture) Validation of a Neural Network Clustering Model for Affective Response to Immersive Music
Individual differences are a rising topic in auditory science and immersive experiences. Socio-cultural and anthropometric idiosyncrasies of listeners could lead to unintended auditory experiences, far from what media content creators intended. To better understand how this individuality may influence a listener's preferences, we investigated various individually related factors, including previous listening experiences and cognitive profiles. In addition, we proposed a data-driven clustering method and showed its efficacy for meaningful grouping of listeners. In this study, we validated the data-driven method with 13 new subjects who generated attribute rating data for 16 stimulus conditions. The method, employing neural network clustering, successfully grouped participants into two preference-based categories with a 92% accuracy rate. The results support the proposed model's reliability and its potential in applications to enhance individually optimized 3D music presentations.
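The study's model is neural; purely as a bare-bones analogue of the workflow, the sketch below groups listeners into two preference clusters from their attribute ratings using k-means (the dimensions mirror the study's 13 subjects and 16 stimuli, but the data here is random).

```python
import numpy as np
from sklearn.cluster import KMeans

# Not the authors' neural-network clustering model: a minimal stand-in
# that assigns each listener to one of two preference groups based on
# their per-stimulus attribute ratings.
ratings = np.random.default_rng(2).uniform(0, 100, size=(13, 16))
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(ratings)
print(labels)   # preference group per listener
```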


Wednesday October 25, 2023 3:00pm - 3:15pm EDT
1E09

3:00pm EDT

Converting your 5.1 home studio for immersive mixing
This tutorial will guide participants in converting their 5.1 surround home studio for immersive mixing. Hints and ideas will be discussed, including the ins and outs of interfaces, what to look for in the included software, choosing additional loudspeakers, bass management, different calibration schemes, etc.
The tutorial will also address a list of new plugins to consider, and ways to unlock some previously owned ones as well. Using excerpts from actual sessions, it will demonstrate simple ways of converting all those precious 5.1 templates to an efficient Dolby Atmos workflow.

Speakers
RG

Roger Guerin

CAS & MPSE


Wednesday October 25, 2023 3:00pm - 4:00pm EDT
1E06

3:00pm EDT

Jason Achilles - Microphones on Mars
Record producer and touring multi-instrumentalist Jason Achilles presents a personal account of the developments that led to the world hearing the first sounds captured from Mars by the NASA Perseverance rover on February 22nd, 2021. This historic journey will culminate in a show-and-tell analysis of technical hurdles, sharing recent recordings from the Martian surface, and a look at what the future holds for audio that is literally out of this world. Q&A to follow.

Moderators
Speakers

Wednesday October 25, 2023 3:00pm - 4:30pm EDT
TBA Education Stage

3:00pm EDT

The Evolution of AES - Celebrating the First 75 Years
Our organization has enjoyed amazing growth over the course of the last 75 years. In the words of one of the Charter Members and the Chair of the AES Historical Committee Donald J Plunkett in 1997, "... when a few engineers interested in radio and sound met in early 1948 in New York, they met to discuss starting an organization that would be concerned with the rapid developments in the field of audio that had resulted from World War II research and engineering."

What started as a gathering of scientists and inventors has become a major force in all aspects of audio, including artists, practitioners, educators, manufacturers, and of course, scientists and inventors.

Let's explore the history of AES over the last 75 years, and also look forward to the future. Where has AES been? How have we changed over the years? Where are we heading? This panel will feature some of our past, present, and future leaders along with some rare footage from the early days of AES.

Speakers
BO

Bruce Olson

Olson Sound Design, LLC
BF

Bill Foster

Founder Tape One Studios, former AES UK Section Chair, Governor/Regional VP & Interim Executive Director
avatar for Agnieszka Roginska

Agnieszka Roginska

Professor of Music Technology, Past President
Agnieszka Roginska is a Professor of Music Technology at New York University. She conducts research in the simulation and applications of immersive and 3D audio including the capture, analysis and synthesis of auditory environments, auditory displays and applications in augmented... Read More →
avatar for Lesley Fogle

Lesley Fogle

Lesley Ann Fogle has worked for 25 years in mainly sound-for-picture, from production to compositional design and commercial audio finish. She is a multi-disciplinary artist, vocalist (Mal VU, After-Death Plan), and writer. As a studied vocalist (U of Toledo, Ohio State U), her range... Read More →


Wednesday October 25, 2023 3:00pm - 4:30pm EDT
1E08

3:15pm EDT

(Lecture) Analysis of the audio engineering society’s research trend of the last four decades using the topic modeling of the AES publications
This paper presents a comprehensive analysis of the evolution of research topics in audio engineering over the past four decades. The study examines the evolution of research topics by analyzing abstracts from the Journal of the Audio Engineering Society (JAES). Using the g-DMR topic modeling technique, the authors identified 16 key research trends from 2,038 JAES abstracts. These findings provide an analysis of how sound engineering has evolved in relation to media technology, showing the rise and fall of research keywords such as spatial hearing, loudspeaker response and measurement, spatial reproduction, music information retrieval (MIR), etc. The authors hope that this historical analysis will help the society reflect on its development and look to the future, with plans to expand the analysis to include more AES publications and conference proceedings.
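g-DMR itself is not part of mainstream ML toolkits (implementations exist in, e.g., the tomotopy package); as a stand-in, the sketch below shows the same abstract-mining workflow with scikit-learn's plain LDA on a toy corpus.

```python
# Stand-in workflow, not the paper's g-DMR model: vectorise abstracts,
# fit a topic model, and print the top words per topic.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

abstracts = [                                  # toy placeholder corpus
    "spatial hearing binaural localization cues",
    "loudspeaker response measurement distortion",
    "music information retrieval genre classification",
    "loudspeaker driver measurement impedance",
]
vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(abstracts)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
terms = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = topic.argsort()[::-1][:4]            # four strongest words
    print(f"topic {k}:", ", ".join(terms[i] for i in top))
```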


Wednesday October 25, 2023 3:15pm - 3:30pm EDT
1E07

3:15pm EDT

(Lecture) Easy-to-Build Higher-Order Ambisonic Microphones using MEMS
This paper addresses the design and evaluation of various MEMS-based Higher Order Ambisonics (HOA) arrays. Micro-Electro-Mechanical Systems (MEMS) microphones have become popular in microphone array design due to their part-to-part consistency, scalability, and improved resilience. The author fabricated three different HOA arrays and evaluated them in hemi-anechoic conditions using a semi-automated impulse response system. This publication details the design of these arrays as well as preliminary results, and presents a subjective study conducted using one of the three models.


Wednesday October 25, 2023 3:15pm - 3:30pm EDT
1E09

3:15pm EDT

The Future of Podcasts
Long gone are the days of audio-only entertainment being relegated to just audio books. Podcasts have become an integral and ubiquitous part of daily life. According to the latest available data, there are over 450 million worldwide podcast listeners with over 4 million podcasts accessible to the public. But what sets one apart from the other? With podcasts seeming to have a strong and growing foothold in the media market, what does the future hold to keep this medium exciting and interesting? Is immersive audio a viable way to stand out? Our panel of engineers, producers and editors explore the possibilities.

Speakers
GM

Gillian Moon

sound designer/audio engineer, self employed
I am a sound designer for theater and themed entertainment. I am trying to get more post production work and my dream is to work in video games.


Wednesday October 25, 2023 3:15pm - 4:15pm EDT
1E11

3:15pm EDT

Harnessing the Power of AI in Designing Electronic Musical Instruments: Innovation, Interactivity, and Inclusivity
How is AI technology influencing the future of electronic musical instruments? We propose an informative panel discussion to explore this critical question. Inventing new musical instruments is a highly multidisciplinary endeavor bringing together skills in music, sound design, audio engineering, HCI, digital fabrication, and software design.

Our panel will bring together diverse expert profiles to explore AI's multifaceted impact on their respective disciplines and its role in diversifying electronic musical instrument design. Key topics will cover how AI influences personalized instrument creation, transforms sound synthesis, and evolves interactive musical systems.

Special emphasis will be given to the potential of AI to replace vs. enhance human skills and the possibilities for experts to work together with machines to improve their musical and creative outcomes. In this context, the discussion will delve into the democratic potential of AI in music, making electronic instruments more accessible to a broader audience, irrespective of their musical skills.

This panel discussion aims to foster a comprehensive, interdisciplinary discourse on AI's transformative influence, promoting a fusion of creativity, science, and technology in designing future electronic musical instruments.

Speakers
avatar for Akito Van Troyer

Akito Van Troyer

Associate Professor, Berklee College of Music
MF

Morwaread Farbood

New York University
HS

Harpreet Sareen

Parsons School of Design
NS

Nikhil Singh

MIT Media Lab
avatar for Rébecca Kleinberger

Rébecca Kleinberger

Researcher / Assistant Professor, Northeastern University


Wednesday October 25, 2023 3:15pm - 4:15pm EDT
1E10

4:00pm EDT

Li Dakang: Enveloping Masterclass
Professor Li Dakang plays unique high-resolution 7.1.4 recordings of Chinese folk music, traditional orchestra, Peking opera, pipe organ and Beethoven’s 9th Symphony. Examples are also given of different methods of ambient miking.

Professor Li and Thomas discuss the Inception Dilemma, and how recordings come across in this particular room. Seats are limited to keep playback variation at bay, and the session is concluded with Q&A. Feng Hanying provides translation throughout.

3D audio formats do not guarantee listener envelopment, or better control of a listening experience than stereo has to offer. In this masterclass series, with discrete-channel, linear audio examples, we discuss factors of recording, mixing and reproduction that make "immersive" worth the effort.

Speakers
avatar for Thomas Lund

Thomas Lund

Senior Technologist, Genelec Oy
Thomas Lund is senior scientist at Genelec, doing subjective research and listening tests. He has written papers on perception, spatialisation, loudness, sound exposure and true-peak level. Thomas is convenor of a working group under the European Commission, tasked with the prevention... Read More →


Wednesday October 25, 2023 4:00pm - 5:00pm EDT
3D01

4:15pm EDT

How To Create Impressive 3D Audio Productions
3D audio has enormous potential to touch audiences on a deeply emotional level: The most powerful impact occurs when the sonic and musical layers work hand in hand. If this is the case, everyone gets goosebumps. But what is the secret to maximizing this potential?

Timbres attract a lot of attention when they are colorful and lively. In the first part of his workshop, Lasse Nipkow presents a quasi-binaural spot-miking technique that makes instruments appear three-dimensional in the playback room as if the musicians were standing there. The recording room also contributes significantly to the sonic signature of the production. Lasse Nipkow shows the influence of great-sounding rooms and how they can be used as multichannel impulse responses in 3D audio productions at any time.

The musical structure is key to an extraordinary musical experience. It creates a connection between notes and sounds and thus contributes to an intense envelopment. The result can be paraphrased by the following image: The musicians sit around the audience and play a piece of music together. The listener feels like they're sitting in the middle of an orchestra or ensemble. In the second part of his workshop, Lasse Nipkow will play sound samples that follow this musical concept and were created in collaboration with Grammy-nominated sound engineer Ronald Prent.

Speakers
avatar for Lasse Nipkow

Lasse Nipkow

CEO, Silent Work LLC
Lasse Nipkow began his career in 1989 as an electronics technician at Studer Revox and subsequently studied electrical engineering at Technikum Winterthur. In 2003, he passed the examination to become a sound engineer with a Swiss federal diploma. From that point on, he focused on... Read More →


Wednesday October 25, 2023 4:15pm - 5:15pm EDT
1E06

4:30pm EDT

Diversity, Equity & Inclusion within STEM Societies
AES DEI Chairpersons Mary Mazurek and Cecilia Wu host a panel discussion with invited representatives from other STEM societies to discuss progress, challenges, implications, and implementations of diversity, equity, inclusion, and accessibility within their organizations. The goal is to learn from each other ways to address member needs while providing safe, welcoming, and inclusive environments.

Speakers
avatar for Mary Mazurek

Mary Mazurek

Audio Educator/ Recording Engineer, University of Lethbridge
Audio Educator at the University of Lethbridge. GRAMMY-nominated recording engineer based in Lethbridge, AB and Chicago, IL. Research & professional work: classical and acoustic music recording, live radio broadcast, podcasting, the aesthetic usage of noise, noise art, sound art... Read More →
avatar for Jiayue Cecilia Wu

Jiayue Cecilia Wu

Assistant Professor, University of Colorado Denver
Jiayue Cecilia Wu, Ph.D. is a scholar, composer, audio engineer, technologist, and vocalist. Her work focuses on how music technology can augment the healing power of music. She earned her BS in Design and Engineering in 2000. She then worked as a professional musician, publisher... Read More →


Wednesday October 25, 2023 4:30pm - 5:30pm EDT
1E10

4:30pm EDT

Everything You Hear is True – the Telarc Legacy
The technical achievements of the production team at Telarc have been a frequent topic at AES gatherings, but how did a small, independent label from the Midwest achieve such a significant impact in the music world? This Historical Track panel will discuss Telarc’s place in the history of classical and jazz recording, its origins and predecessors, and the impact it has had on engineers and producers working today.

Speakers
avatar for Scott Burgess

Scott Burgess

Director, Audio and Media Production, Aspen Music Festival and School
avatar for Morten Lindberg

Morten Lindberg

Producer & Engineer, 2L (Lindberg Lyd AS)
Recording Producer and Balance Engineer with 42 American GRAMMY-nominations since 2006, 34 of these in craft categories Best Engineered Album, Best Surround Sound Album, Best Immersive Audio Album and Producer of the Year. Founder and CEO of the record label 2L. Grammy Award-winner... Read More →
avatar for Susan Schmidt Horning

Susan Schmidt Horning

Professor, St. John’s University


Wednesday October 25, 2023 4:30pm - 5:30pm EDT
1E11

6:30pm EDT

NYU 370 Jay St. Audio Lab Tour
Please stop by and visit our new multi-channel audio and video lab at New York University, 370 Jay Street, Brooklyn. There will be demos from friends and NYU faculty/students as well as industry folks who have their equipment installed in the space.

Information about the lab:
https://sites.google.com/nyu.edu/370jproject/reservations/230-audio-lab

We will admit people on a first-come, first-served basis, as the lab holds up to 25 visitors at a time.

Speakers

Wednesday October 25, 2023 6:30pm - 8:30pm EDT
OFFSITE
 
Thursday, October 26
 

9:00am EDT

AES MIDI Association Panel Description and Participants
MIDI, the Musical Instrument Digital Interface, was defined 40 years ago and has gone on to be one of the most enduring influences on audio and music technology and creativity. In June of this year, the MIDI Association released MIDI 2.0, which is now found across hardware and software for audio production, music creation, show control, education, and much more.

Join Berklee’s Michael Bierylo as he moderates a panel discussion about MIDI 2.0 and what its proliferation means to audio engineering and creation with Brett g Porter from the MIDI Association’s executive board, Dr. Chandler Bridges of Indiana University, and Dr. Seth Cluett of Columbia University.

Speakers
avatar for Michael Bierylo

Michael Bierylo

Chair Emeritus, Electronic Production and Design, Berklee College of Music
Michael Bierylo is an electronic musician based in Boston, Massachusetts. He is Chair Emeritus of the Electronic Production and Design Department at Berklee College of Music where he led the development of Berklee’s Electronic Digital Instrument Program, the Electronic... Read More →


Thursday October 26, 2023 9:00am - 9:00am EDT
1E11

9:00am EDT

(Lecture) MPEG-I Immersive Audio – Reference Model For The New Virtual / Augmented Reality Audio Standard
MPEG-I Immersive Audio is a forthcoming standard that is under development within the MPEG Audio group (ISO/IEC JTC1/SC29/WG6) to provide a compressed representation and rendering of audio for Virtual and Augmented Reality (VR/AR) applications with six degrees of freedom (6DoF). MPEG-I Immersive Audio supports bitrate-efficient and high-quality storage/transmission of complex virtual scenes including sources with spatial extent and distinct radiation characteristics (like musical instruments) as well as geometry description of acoustically relevant elements (e.g. walls, doors, occluders). The rendering process includes detailed modelling of room acoustics and complex acoustic phenomena like occlusion and diffraction due to acoustic obstacles and Doppler effects as well as interactivity with the user. Based on many contributions, this paper reports on the state of the MPEG-I Immersive Audio standardization process and its first technical Reference Model architecture. MPEG-I Immersive Audio establishes the first long-term stable audio format specification in the field of VR/AR and can be used for many consumer applications like broadcasting, streaming, social VR/AR or Metaverse technology.


Thursday October 26, 2023 9:00am - 9:30am EDT
1E07

9:00am EDT

(Lecture) Style Transfer for Non-differentiable Audio Effects
Digital audio effects are widely used by audio engineers to alter the acoustic and temporal qualities of audio data. However, these effects can have a large number of parameters, which can make them difficult to learn for beginners and hamper creativity for professionals. Recently, there have been a number of efforts to employ progress in deep learning to acquire the low-level parameter configurations of audio effects by minimising an objective function between an input and reference track, commonly referred to as style transfer. However, current approaches use inflexible black-box techniques or require that the effects under consideration are implemented in an auto-differentiation framework. In this work, we propose a deep learning approach to audio production style matching which can be used with effects implemented in some of the most widely used frameworks, requiring only that the parameters under consideration have a continuous domain. Further, our method includes style matching for various classes of effects, many of which are difficult or impossible to approximate closely using differentiable functions. We show that our audio embedding approach creates logical encodings of timbral information, which can be used for a number of downstream tasks. Further, we perform a listening test which demonstrates that our approach is able to convincingly style match a multi-band compressor effect.
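A heavily simplified sketch of the underlying idea follows (not the authors' deep-learning method): when the effect is non-differentiable, its continuous parameters can still be searched directly for the setting whose output lands closest to a reference in an embedding space. Here `effect`, `embed` and the random search are all illustrative placeholders.

```python
import numpy as np

# Illustrative gradient-free style matching: try parameter settings for
# a non-differentiable effect and keep the one whose processed output is
# nearest to the reference in embedding space.

def style_match(effect, embed, x, reference, n_trials=200, seed=0):
    """effect(x, params) -> audio; embed(audio) -> feature vector."""
    rng = np.random.default_rng(seed)
    target = embed(reference)
    best_params, best_dist = None, np.inf
    for _ in range(n_trials):
        params = rng.uniform(0.0, 1.0, size=3)   # normalised effect params
        dist = np.linalg.norm(embed(effect(x, params)) - target)
        if dist < best_dist:
            best_params, best_dist = params, dist
    return best_params
```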

Speakers

Thursday October 26, 2023 9:00am - 9:30am EDT
1E09

9:00am EDT

Ask the Lawyer: IP Law & Beyond
Your intellectual property is the lifeblood of your business – find out how to protect your brand and monetize your assets. Learn the differences between copyrights, trademarks, patents, and trade secrets. From creating a brand identity to monetizing your innovation, learn how you can capitalize on every aspect of your invaluable IP. After a brief discussion of the hot legal topics in the audio space, you’ll have the opportunity to have your questions answered directly from a panel of lawyers. Do you have a legal problem or question relating to your audio products or want to discuss different ways to approach licensing and strategic partnerships? Are you a UK or EU company doing business in the U.S. or vice versa? Bring us your best legal questions!

Speakers
avatar for Heather Rafter

Heather Rafter

Principal, RafterMarsh US
Heather Dembert Rafter has been providing legal and business development services to the audio, music technology, and digital media industries for over twenty-five years. As principal counsel at RafterMarsh US, she leads the RM team in providing sophisticated corporate and IP advice... Read More →
avatar for Philipp Lengeling

Philipp Lengeling

Senior Counsel, RafterMarsh
Philipp G. Lengeling, Mag. iur., LL.M. (New York), Esq. is an attorney based in New York (U.S.A.) and Hamburg (Germany), who is heading the New York and Hamburg based offices for RafterMarsh, a transatlantic boutique law firm (California, New York, U.K., Germany) that specializes... Read More →
JF

Jessica Fajfar

RafterMarsh US
ZK

Zack Klein

RafterMarsh.


Thursday October 26, 2023 9:00am - 10:00am EDT
1E08

9:00am EDT

Form vs Function: Considerations of Modern Instrument Design
As electronic instruments become increasingly powerful, user interface and user experience design become essential elements in the development of new instruments. In this session, we'll explore some of the strategies used in modern instrument design, taking into consideration the availability and layout of physical controls, how they are mapped, as well as new approaches to performance control such as MIDI Polyphonic Expression (MPE).

Speakers
avatar for Michael Bierylo

Michael Bierylo

Chair Emeritus, Electronic Production and Design, Berklee College of Music
Michael Bierylo is an electronic musician based in Boston, Massachusetts. He is Chair Emeritus of the Electronic Production and Design Department at Berklee College of Music where he led the development of Berklee’s Electronic Digital Instrument Program, the Electronic... Read More →
PB

Peter Brown

Roland Corporation USA


Thursday October 26, 2023 9:00am - 10:00am EDT
1E11

9:00am EDT

Geffen Hall Panel
The new $550 million Geffen Hall at New York's Lincoln Center is the fourth concert hall to be built at this location. The panel will include representatives from Lincoln Center, the New York Philharmonic Orchestra, the architects, and the acousticians who designed it.

Why was this project undertaken? How does this new hall differ from what was there before?

Following the panel discussion, the proceedings will move to Lincoln Center for a tour of Geffen Hall.

Speakers
avatar for John Allen

John Allen

High Performance Stereo
Primary Affiliation: High Performance Stereo – Henderson, Nevada, USA. AES Member Type: Life Member. Prior to founding High Performance Stereo in 1979, John F. Allen was one of a small group of sound engineers allowed to mix concerts of the Boston Symphony and Boston Pops orchestras... Read More →
avatar for Christopher Blair

Christopher Blair

Akustiks
Christopher Blair, with 50 years of professional experience, brings to our clients an innovative flair that extends well outside the bounds of the prototypical acoustician, creating practical solutions to a broad range of applications. As Akustiks’ Chief Scientist and Principal Consultant... Read More →
avatar for Robert Campbell

Robert Campbell

Fisher Dachs Associates
Principal Robert Campbell has over 35 years’ experience working in all types of performing arts and in a wide variety of formats, from set and lighting design and directing to theatre production management. In his 32 years at FDA, Robert has been an invaluable project manager, programmer... Read More →
avatar for Peter Flamm

Peter Flamm

Lincoln Center Development Project & P. Flamm Arts Consulting LLC
In 2023 Peter Flamm established P. Flamm Arts Consulting LLC to apply his unique experience in planning, operations, and construction at Lincoln Center, the world’s largest and most significant arts campus, to provide consulting services for the optimal management, operation and... Read More →
avatar for Gary McCluskie

Gary McCluskie

Gary McCluskie is a Principal at Diamond Schmitt. In over 30 years of practice, he has championed the design of many of the firm's leading... Read More →
avatar for Paul Scarbrough

Paul Scarbrough

Akustiks
Paul Scarbrough is an acoustical design professional with over 30 years of experience. He has developed effective working partnerships with a broad array of architects and theater planners. His formal training in architecture allows him to appreciate a diverse range of architectural... Read More →
avatar for Bill Thomas

Bill Thomas

New York Philharmonic
Bill Thomas has held various positions at the New York Philharmonic from 1999 to 2022, ranging from Chief Financial Officer, to General Manager, to Executive Director, to Project Executive for the renovation of David Geffen Hall. As Project Executive Bill was the Philharmonic’s... Read More →


Thursday October 26, 2023 9:00am - 10:30am EDT
1E10

9:00am EDT

The Why and How Behind Immersive and Surround Recordings
Producer/engineers Ulrike Schwarz, Morten Lindberg, and Jim Anderson will discuss the interaction between composers, musicians, producers, engineers and labels in today’s development of music productions in immersive audio. Schwarz, Lindberg, and Anderson will illustrate these comments by playing examples of their recent work. The recent pandemic era also presented a wealth of challenges, and the producers will discuss how those experiences differed for productions recorded in Europe and the US.

Speakers
avatar for Jim Anderson

Jim Anderson

Producer/Engineer, Anderson Audio New York
For nearly 50 years, Jim Anderson has set the standard for acoustic audio engineering and production, capturing pristine, high definition stereo and surround sound recordings that have garnered thirteen Grammy and Latin Grammy Awards, two Peabodys, and a pair of Emmy nominations among... Read More →
avatar for Ulrike Schwarz

Ulrike Schwarz

Engineer/Producer, Co-Founder, Anderson Audio New York
A broadly acclaimed engineer and producer, Ulrike Schwarz is a trailblazing audio innovator with more than two decades of experience across the film, television, radio, and recording industries. Born and raised in Germany, Schwarz discovered she had perfect pitch at an early age... Read More →
avatar for Morten Lindberg

Morten Lindberg

Producer & Engineer, 2L (Lindberg Lyd AS)
Recording Producer and Balance Engineer with 42 American GRAMMY-nominations since 2006, 34 of these in craft categories Best Engineered Album, Best Surround Sound Album, Best Immersive Audio Album and Producer of the Year. Founder and CEO of the record label 2L. Grammy Award-winner... Read More →


Thursday October 26, 2023 9:00am - 10:30am EDT
1E06

9:30am EDT

(Lecture) Improvement of sound reproducibility using open-ear-canal microphones for immersive audio applications
Out-of-head sound image localization works by equalizing the sound stimulus at the eardrum of a headphone listener to match the stimulus at the eardrum in the free sound field. A correction filter is designed assuming that the pressure division ratio (PDR) is unity. However, it is impossible to strictly achieve a PDR of one, which can result in a timbre change of the reproduced sound. In this study, to reproduce the original sound field more faithfully, we used open-ear-canal microphones instead of the conventionally used blocked-ear-canal microphones and evaluated sound reproducibility from the viewpoint of PDR. It was found that the PDR was closer to one when recording with the ear canal open than with the ear canal blocked. In addition, the angular dependence on the presentation direction of the sound source was reduced, and the dependence on the position of the microphone placed in the ear canal was low. From the viewpoint of sound field reproducibility at the position of the eardrum, the validity of using an open-ear-canal microphone was confirmed by experiments.
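For orientation, a generic sketch of the kind of check involved (the paper's actual measurement chain differs): given two measured eardrum transfer functions, compute the per-bin magnitude ratio and its deviation from the unity PDR that the correction filter assumes.

```python
import numpy as np

# Generic illustration only: how far does a measured ratio of two
# eardrum transfer functions deviate from the assumed unity PDR?

def pdr_deviation_db(H_free, H_phone, eps=1e-12):
    """H_*: complex frequency responses of equal length."""
    pdr = np.abs(H_phone) / (np.abs(H_free) + eps)
    return 20.0 * np.log10(pdr + eps)     # 0 dB everywhere would be ideal

rng = np.random.default_rng(4)
H_free = np.fft.rfft(rng.standard_normal(1024))   # stand-in measurement
dev = pdr_deviation_db(H_free, H_free * 1.05)     # ~0.42 dB everywhere
```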


Thursday October 26, 2023 9:30am - 10:00am EDT
1E07

9:30am EDT

(Lecture) Proposing a Novel Digital Audio Network for Musical Instruments
We present the rationale behind the development of a new digital audio network to be used by professional audio equipment consumers, musicians and audio manufacturers, aimed at minimizing cost while simplifying system complexity, instrument connectivity, digital transmission, channel count, bus power capability and equipment manufacturer variants for today’s musicians’ needs. This paper presents a physical layer and digital protocol that must be carefully studied by the industry for further refinement.


Thursday October 26, 2023 9:30am - 10:00am EDT
1E09

10:00am EDT

(Lecture) Investigation of the Impact of Spectral Cues from Torso Shadowing on Front-Back-Confusion and Perceived Differences along Cones of Confusion
Front-back confusion effects have been found to occur for sound sources where localization cue differences are insufficient to distinguish between front-back symmetric positions. However, in positions below ear level, which are of increasing interest for virtual reality applications with binaural rendering, torso shadowing introduces additional localization cues. This paper investigates if and how those additional localization cues affect front-back confusion effects and the perceptibility of vertical position differences. Analysis of spectral cues in HRTF measurements shows substantially stronger localization cue differences below ear level than above. Subjective listening experiments over binaural headphones confirm significantly increased perceptibility of differences between sound source positions. In conclusion, the results show reduced front-back confusion and increased vertical localization accuracy at low elevations, which should be considered in binaural measurement and rendering systems. Furthermore, good agreement between spectral cue analysis and subjective results was found, which indicates a way to explain and predict perceptual differences.


Thursday October 26, 2023 10:00am - 10:30am EDT
1E07

10:00am EDT

(Lecture) Generative Machine Listener
We show how a neural network can be trained on individual intrusive listening test scores to predict a distribution of scores for each pair of reference and coded input stereo or binaural signals. We nickname this method the Generative Machine Listener (GML), as it is capable of generating an arbitrary amount of simulated listening test data. Compared to a baseline system using regression over mean scores, we observe lower outlier ratios (OR) for the mean score predictions, and obtain easy access to the prediction of confidence intervals (CI). The introduction of data augmentation techniques from the image domain results in a significant increase in CI prediction accuracy as well as Pearson and Spearman rank correlation of mean scores.
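A minimal sketch of the generative idea, assuming a Gaussian score model and leaving out the actual feature extraction and architecture: the network predicts a mean and variance per reference/coded pair, is trained on individual listener scores with a Gaussian negative log-likelihood, and simulated listening-test scores can then be sampled from the predicted distribution.

```python
import torch
import torch.nn as nn

# Not the GML architecture itself: a toy head that outputs a score
# distribution instead of a single regressed mean.

class ScoreDistributionHead(nn.Module):
    def __init__(self, n_features=128):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU())
        self.mean = nn.Linear(64, 1)
        self.log_var = nn.Linear(64, 1)

    def forward(self, x):
        h = self.backbone(x)
        return self.mean(h), self.log_var(h).exp()

model = ScoreDistributionHead()
loss_fn = nn.GaussianNLLLoss()
x = torch.randn(8, 128)             # stand-in features per signal pair
scores = torch.rand(8, 1) * 100     # individual listener scores (0-100)
mean, var = model(x)
loss = loss_fn(mean, scores, var)   # train on per-listener scores
sampled = torch.normal(mean, var.sqrt())   # simulated listener scores
```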


Thursday October 26, 2023 10:00am - 10:30am EDT
1E09

10:00am EDT

Exhibit Hall
Welcome to the AES Exhibit Hall where innovation and excellence in audio technology come to life! The AES Exhibit Hall is a captivating hub for audio professionals, enthusiasts, and industry leaders alike. Here, you'll discover a vibrant and immersive environment that showcases leading companies and partners, new technologies, and must-see products all in one location! Don't miss our more than 150 exhibitors, education stages, poster sessions, career fair, and MORE! Start planning your visit to the Exhibit Hall by checking out the Floor Plan.

Thursday October 26, 2023 10:00am - 5:30pm EDT

10:15am EDT

The Future of Multisensory Immersive Experiences in Gaming with AI
The gaming industry has experienced an extraordinary evolution, transforming from humble beginnings to a realm of unparalleled immersion. Technological advancements in spatial audio and artificial intelligence (AI) have played a pivotal role in this revolution. In this compelling panel, we invite you to embark on a journey into the future of gaming, where multisensory immersive experiences intertwine seamlessly with AI.

Artificial intelligence has become a game-changer, enabling the creation of remarkably realistic and intelligent non-player characters (NPCs), as well as the generation of dynamic levels and quests. With AI at their disposal, game developers can craft experiences that are not only challenging but also deeply engaging, captivating players like never before.

Spatial audio, on the other hand, transcends the boundaries of traditional sound design, offering an unparalleled level of realism and immersion. By leveraging spatial audio, game developers can transport players into intricate three-dimensional soundscapes, where sounds emanate from various directions and distances. This technology finds its prime application in games that demand players to navigate complex environments, be it heart-pounding first-person shooters or vast, immersive role-playing worlds.

This panel is a gateway to uncover the boundless possibilities that arise when spatial audio and AI converge. Together, they form the foundation for creating multisensory immersive experiences that push the boundaries of realism, interactivity, and challenge. Join us as we delve into the collaborative potential of these technologies and explore how they can revolutionize the gaming landscape.

Our distinguished panelists comprise experts in the fields of spatial audio and AI, alongside seasoned game developers who have successfully integrated these technologies into their projects. Through their valuable insights and real-world experiences, they will enlighten the audience on how spatial audio and AI can be harnessed to craft exceptional gaming experiences. By shedding light on the challenges developers encounter, such as the need for powerful hardware and software, our panelists will provide practical knowledge to propel the industry forward.

This panel promises to captivate the interest of industry professionals and enthusiasts alike, offering a glimpse into the future of gaming. Prepare to be inspired as we unravel the potential of spatial audio and AI, and discover how their synergy can reshape the gaming landscape. Join us for this enlightening panel and embrace the opportunity to witness the evolution of gaming firsthand.

Tentative Questions for the Panel:

For the AI experts:

1. How has artificial intelligence transformed the way games are developed and experienced?
2. Can you explain how AI is used to create more realistic non-player characters (NPCs) and generate game content?
3. Are there any limitations or challenges that game developers should be aware of when integrating AI into their games?
4. With the rapid advancements in AI, what future developments do you foresee in the integration of AI into gaming?
5. How can AI further enhance the immersive and interactive aspects of gaming experiences?

To the game developers on the panel:

1. Could you share your experiences with incorporating spatial audio and AI into your games?
2. What benefits did these technologies bring to the overall gaming experience?
3. Were there any particular challenges you faced during the implementation process, and how did you overcome them?
4. What are some emerging trends or areas of innovation that you believe will shape the future of multisensory immersive gaming?
5. Are there any particular advancements in spatial audio or AI that you are excited about and would like to explore in your future game projects?

To the experts in spatial audio and AI:

1. How do these technologies complement each other in creating immersive gaming experiences?
2. Can you provide examples of how spatial audio and AI have been combined effectively in games?
3. What unique opportunities do these technologies offer when used together?
4. Are there any ongoing research or developments in spatial audio that have the potential to revolutionize gaming experiences even further?
5. How can game developers and audio engineers stay updated on the latest advancements in spatial audio technology?


For all panelists:

1. As the gaming industry continues to evolve, how do you see the relationship between technology and storytelling in games?
2. How can spatial audio and AI contribute to creating more engaging narratives and memorable gaming experiences?

Speakers
KS

Kaushik Sunder

Director of Engineering, Embody


Thursday October 26, 2023 10:15am - 11:15am EDT
1E08

10:15am EDT

Digital Archiving 101 for Students
This tutorial will be an overview of digital archiving concepts and methodologies. It will focus on best practices for file naming & folder structure and DAW session creation, an overview of strategies for backup and long-term archiving, software solutions, and an overview of industry standards for digital archiving and deliverables. While this tutorial will be student-focused, anyone who wants to learn more about digital archiving is welcome to attend.

Speakers
avatar for Konrad Strauss

Konrad Strauss

Professor, Indiana University Jacobs School of Music
Konrad Strauss is a Professor of Music in the Department of Audio Engineering and Sound Production at Indiana University's Jacobs School of Music. While at the Jacobs School Mr. Strauss built a new curriculum centered around current audio production techniques and technology and designed... Read More →


Thursday October 26, 2023 10:15am - 11:15am EDT
1E11

10:30am EDT

(Lecture) Word based end-to-end real time neural audio effects for equalisation
Audio production, typically involving the use of tools such as equalisers and reverberators, can be challenging for non-expert users due to the intricate parameters inherent in these tools’ interfaces. In this paper, we present an end-to-end neural audio effects model based on the temporal convolutional network (TCN) architecture which processes equalisation based on descriptive terms sourced from a crowdsourced vocabulary of word labels for audio effects, enabling users to communicate their audio production objectives with ease. This approach enables users to express their audio production objectives in descriptive language (e.g., "bright," "muddy," "sharp") rather than relying on technical terminology that may not be intuitive to untrained users. We experimented with two word embedding methods to steer the TCN to produce the desired output. Real-time performance is achieved through the use of TCNs with sparse convolutional kernels and rapidly growing dilations. Objective metrics demonstrate the efficacy of the proposed model in applying the appropriately parameterized effects to audio tracks.
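The sketch below conveys only the general shape of such a model, with FiLM-style conditioning standing in for whichever mechanism the paper actually uses: a dilated 1-D convolution whose activations are scaled and shifted by projections of a descriptive-word embedding, so that "bright" or "muddy" steers the equalisation.

```python
import torch
import torch.nn as nn

# Illustrative word-conditioned TCN block (layer sizes, embeddings and
# the sparse kernels of the paper are not reproduced).

class ConditionedTCNBlock(nn.Module):
    def __init__(self, channels=16, dilation=1, embed_dim=32):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, kernel_size=3,
                              dilation=dilation, padding=dilation)
        self.scale = nn.Linear(embed_dim, channels)
        self.shift = nn.Linear(embed_dim, channels)

    def forward(self, x, word_embedding):
        h = torch.tanh(self.conv(x))
        g = self.scale(word_embedding).unsqueeze(-1)   # (batch, ch, 1)
        b = self.shift(word_embedding).unsqueeze(-1)
        return x + g * h + b                           # residual output

block = ConditionedTCNBlock(dilation=2)
audio = torch.randn(1, 16, 4096)    # (batch, channels, samples)
word = torch.randn(1, 32)           # embedding of e.g. "bright"
out = block(audio, word)            # same shape as the input
```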


Thursday October 26, 2023 10:30am - 10:45am EDT
1E09

10:30am EDT

(Lecture) Perceptual Comparison of Dynamic Binaural Reproduction Methods for Head-Mounted Microphone Arrays
This paper presents results of a listening experiment evaluating three-degrees-of-freedom binaural reproduction of head-mounted microphone array signals. The methods are applied to an array of five microphones whose signals were simulated for static and dynamic array orientations. Methods under test involve scene-agnostic binaural reproduction methods as well as methods that have knowledge of (a subset of) source directions. In the results, the end-to-end magnitude-least-squares reproduction method outperforms other scene-agnostic approaches. Above all, linearly constrained beamformers using known source directions in combination with the end-to-end magnitude-least-squares method outcompete the scene-agnostic methods in perceived quality, especially for a rotating microphone array under anechoic conditions.

Speakers
avatar for Benjamin Stahl

Benjamin Stahl

University of Music and Performing Arts Graz


Thursday October 26, 2023 10:30am - 11:00am EDT
1E07

10:45am EDT

Applications of the combination of LLMs & Natural Language Processing on Video Game Player Interaction
In this session, industry experts alongside experienced practitioners will propose a model for how natural language processors and large-language models (LLMs) could enhance the user experience for those who play video games. The applications for this technology are quite broad; however, we will primarily focus on player interaction with Non-Player Characters (NPCs) in video games. Imagine talking into your mic and getting a dynamic vocal response with emotion from a natural language processor based on how you talk to it. These tools already exist. There are challenges, such as latency, training the LLM, and the long-term storage of user responses in the LLM, among others. To create such an NPC, three APIs would be needed: an LLM trained on character data and dialogue, a real-time transcription API, and a natural language processing API.
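A skeletal Python version of that three-API pipeline follows; every function name here is a hypothetical placeholder, not a real service or library call.

```python
# Hypothetical plumbing only: player speech is transcribed, sent to a
# character-conditioned LLM, and returned as an expressive voice line.

def transcribe(mic_audio: bytes) -> str:
    """Stand-in for a real-time transcription API."""
    raise NotImplementedError

def character_llm(prompt: str, character_sheet: str) -> str:
    """Stand-in for an LLM trained/conditioned on character data."""
    raise NotImplementedError

def speak(text: str, emotion: str = "neutral") -> bytes:
    """Stand-in for an expressive text-to-speech API."""
    raise NotImplementedError

def npc_reply(mic_audio: bytes, character_sheet: str) -> bytes:
    player_line = transcribe(mic_audio)                   # 1. speech to text
    reply = character_llm(player_line, character_sheet)   # 2. dialogue
    return speak(reply, emotion="contextual")             # 3. text to voice
```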

Speakers

Thursday October 26, 2023 10:45am - 11:30am EDT
1E10

10:45am EDT

Mix Critiques
Speakers
avatar for Ian Corbett

Ian Corbett

Educator, Kansas City Kansas Community College
Dr. Ian Corbett is the Coordinator and Professor of Audio Engineering and Music Technology at Kansas City Kansas Community College. He also owns and operates “off-beat-open-hats LLC,” providing live sound, audio production, recording, and consulting services to clients in the... Read More →


Thursday October 26, 2023 10:45am - 11:45am EDT
1E06

11:00am EDT

(Lecture) Neural modeling and interpolation of binaural room impulse responses with head tracking
Binaural room impulse responses (BRIRs) are widely used in spatial audio applications, serving as audio filters to simulate immersive environments for headphone-based spatial sound reproduction and approximate acoustic transfer functions for loudspeaker-based sound field control. In such systems, incorporating head tracking in multiple degrees of freedom (DoFs) can either enhance the immersion of reproduced sound field, or improve the filter robustness against possible head movements. This paper focuses on the modeling and interpolation of BRIRs in multiple DoFs using a machine learning-based approach.

In general, BRIRs used in spatial audio applications can be either synthesized by combining head-related impulse responses (HRIRs) and spatial room impulse responses (RIRs) or measured in real environments. While synthesized BRIRs are suitable for auralization applications, loudspeaker-based systems require BRIRs that faithfully represent the actual acoustic system response. Thus, interpolating between pre-measured BRIRs becomes more appropriate as it accurately reproduces the salient peaks caused by room reflections. To reduce the number of required measurements and data storage, various interpolation methods have been explored for HRIRs, but they often fail to accurately estimate BRIRs, especially at high frequencies. DSP-based methods such as dynamic time warping have been proposed, but they come with additional computation for peak detection.

Our proposed interpolation method involves training a deep neural network (DNN) that takes head coordinates as inputs and predicts the corresponding BRIR. The DNN architecture is inspired by previous work on anechoic HRIR estimation and the original NeRF paper from computer vision. We modify the DNN architecture to enable better parallelization and extend its use to BRIRs with strong early reflections, considering both head rotations and translations. In addition, we propose a new frequency-domain formulation of the DNN output and a corresponding loss function, in contrast with the conventional time-domain loss. We also introduce model optimization principles based on Fourier analysis, providing meaningful guidelines for tuning the DNN to interpolate multi-DoF BRIRs.

For performance evaluation, we apply both simulated and in-house measured BRIR datasets to examine the modeling efficiency and the interpolation accuracy of the proposed method. We adopt error metrics such as signal-to-distortion ratio and spectral distortion to quantify the accuracy of modeled and interpolated BRIRs. Furthermore, the proposed model is compared with its time-domain counterpart and other conventional interpolation methods, such as nearest neighbor and dynamic time warping. As a result, the proposed frequency-domain model proves to be more efficient than the time-domain model for band-limited BRIRs, and can accurately predict the early reflection patterns that change with head movements. Overall, our proposed method offers an efficient solution for accurately modeling and interpolating realistic BRIRs, which can potentially facilitate many spatial audio applications with multi-DoF head tracking.
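A compact sketch of the coordinate-network idea described above (the layer sizes, the Fourier-feature lifting and the output layout are assumptions, not the paper's exact architecture): head coordinates are lifted with sinusoidal features, as in NeRF, and an MLP predicts the real and imaginary parts of the BRIR spectrum for both ears.

```python
import torch
import torch.nn as nn

class FourierBRIRNet(nn.Module):
    def __init__(self, n_coords=3, n_freqs=8, n_bins=512):
        super().__init__()
        self.n_bins = n_bins
        self.register_buffer("bands", 2.0 ** torch.arange(n_freqs))
        in_dim = n_coords * n_freqs * 2                 # sin and cos
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 2 * 2 * n_bins))             # 2 ears x (re, im)

    def forward(self, coords):                          # coords: (batch, 3)
        proj = coords.unsqueeze(-1) * self.bands        # (batch, 3, n_freqs)
        feats = torch.cat([proj.sin(), proj.cos()], dim=-1).flatten(1)
        out = self.mlp(feats)
        return out.view(-1, 2, 2, self.n_bins)          # ear, re/im, bin

net = FourierBRIRNet()
coords = torch.tensor([[0.1, 0.0, 1.57]])   # x, y, yaw (illustrative)
brir_spectrum = net(coords)                 # (1, 2, 2, 512)
```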


Thursday October 26, 2023 11:00am - 11:15am EDT
1E07

11:00am EDT

(Lecture) Sonifying time series via music generated with machine learning
Conventional sonifications directly assign different aspects of data to auditory features and the results are not always “musical” as they do not adhere to a recognizable structure, genre, style, etc. Our system tackles this problem by learning orthogonal features in the latent space of a given musical corpus and using those features to create derivative compositions. We propose using a Singular Autoencoder (SAE) algorithm that identifies the most important Principal Components (PCs) in the latent space. As a proof-of-concept, we created sonifications of ionizing radiation measurements obtained from the Safecast project. Although the system successfully generates new compositions by manipulating the latent space, with each principal component changing different musical aspects, these changes may not be readily noticeable by listeners, despite the PCs being mathematically decorrelated. This finding suggests that higher-level features (such as associated emotion, etc.) may be needed for better results.
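Illustrative sketch only, not the authors' SAE: find principal components of a corpus's latent codes and map a data series onto movement along the dominant component, producing one latent code per reading for a decoder to render.

```python
import numpy as np

# Toy analogue of the latent-space idea: the latent codes and the
# decoder are assumed to come from some pre-trained autoencoder.

def sonify(latents, data, scale=2.0):
    """latents: (n_examples, dim) corpus codes; data: 1-D series in [0, 1]."""
    mean = latents.mean(axis=0)
    _, _, vt = np.linalg.svd(latents - mean, full_matrices=False)
    pc1 = vt[0]                                  # dominant latent direction
    # One latent code per data point, displaced along PC1 by the value.
    return mean + scale * (data[:, None] - 0.5) * pc1

codes = np.random.default_rng(1).standard_normal((100, 16))
readings = np.linspace(0.0, 1.0, 32)             # stand-in measurements
trajectory = sonify(codes, readings)             # feed to the decoder
```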


Thursday October 26, 2023 11:00am - 11:30am EDT
1E09

11:00am EDT

Florian Camerer: Enveloping Masterclass
Florian Camerer plays high-resolution 7.1.4 atmospheric recordings, also describing the double-UFIX microphone and the recording techniques used.

Florian and Thomas discuss the Inception Dilemma, and how his recordings come across in this particular room. Seats are limited to keep playback variation at bay, and the session is concluded with Q&A.

Immersive formats do not guarantee envelopment, or better control of a listening experience than stereo has to offer. In this masterclass series, we discuss and exemplify factors of recording, mixing, distribution and reproduction that make "immersive" worth the effort.

Speakers
avatar for Thomas Lund

Thomas Lund

Senior Technologist, Genelec Oy
Thomas Lund is senior scientist at Genelec, doing subjective research and listening tests. He has written papers on perception, spatialisation, loudness, sound exposure and true-peak level. Thomas is convenor of a working group under the European Commission, tasked with the prevention... Read More →


Thursday October 26, 2023 11:00am - 12:00pm EDT
3D01

11:00am EDT

Archival Audio in Education Panel
In his 2015 International Association of Sound and Audiovisual Archives (IASA) Journal article, “Why Media Preservation Can’t Wait: The Gathering Storm,” Indiana University’s Mike Casey writes: “As obsolescence deepens, the knowledge of how to repair old players becomes scarce. Even the knowledge and experience required to successfully play a deteriorating obsolete recording on a legacy playback machine fades away.” While the curriculum of most audio programs is rooted in the born-digital realm, there are a few training programs and initiatives that also focus on the preservation of analog and early digital carriers, and their playback equipment. This panel includes presentations on Iron Mountain Entertainment Services’ first Audio Archiving Workshop that took place last summer and the developments in working to establish a Media Archival and Restoration program at the University of Saint Francis. Moderated by University of Hartford Professor Gabe Herman, this conversation will help any educator become more aware of the importance of audio archiving in a modern audio curriculum. The presentations will be followed by a discussion on goals and future plans, and a Q&A session with attendees.

Moderators
Speakers
avatar for Miles Fulwider

Miles Fulwider

Associate Division Director for Division of Creative Arts, University of Saint Francis
Miles Fulwider, Tonmeister M.M., is a Producer, Engineer, and Educator. Currently Miles is the Associate Director of the Division of Creative Arts and the Program Director for the Music Technology program at the University of Saint Francis in Fort Wayne, Indiana. Miles is Co-Vice... Read More →


Thursday October 26, 2023 11:00am - 12:30pm EDT
TBA Education Stage

11:15am EDT

(Lecture) Generation of highly realistic virtual sound field by modifying head-related transfer functions
The characteristics of a reproduced sound field in acoustic augmented reality (AR) are crucial to presenting a virtual sound source with realism for a listener. An effective approach is to add the reverberation characteristics of the reproduction field. Herein, the reverberant components from impulse responses measured at a single point in a room are extracted and added to the listener's individual head-related transfer function (HRTF). This can be regarded as synthesizing the equivalent of a binaural room transfer function (BRTF) for the field in which the listener is located. The effectiveness of the proposed method is verified via a comparison experiment between a real sound source and a virtual sound source. The proposed method has the same effect as using the listener's own BRTF. Additionally, neither the sound source position nor individualization is necessary to acquire the reverberation characteristics.


Thursday October 26, 2023 11:15am - 11:30am EDT
1E07

11:30am EDT

(Lecture) Implementation of and application scenarios for plausible immersive audio via headphones
The paper describes both recent progress on plausible binaural rendering of audio via headphones and some application scenarios and their requirements.
A number of perceptual cues help the brain become immersed in soundscapes. We discuss these cues and their relative importance. To reproduce plausible immersive sound, both information on the pose and position of the listener and information about the room are necessary.
We discuss a number of application scenarios, including music therapy for tinnitus, listening to spatial audio mixes over headphones with much-improved plausibility, and some of the applications defined in the context of the upcoming MPEG-I (Immersive Audio) standard.


Thursday October 26, 2023 11:30am - 11:45am EDT
1E07

11:30am EDT

Archiving room acoustics of a historic studio by 4pi scene-based sampling reverb, VSVerb.
Buildings will collapse one day. Even if we try to renovate them, the original acoustic texture cannot be fully restored. Therefore, archiving the room acoustics of a historic studio is important to the audio industry.
Sometimes, an impulse response (IR) is measured to archive room reverberation. However, it also contains unwanted information, i.e. frequency responses of measurement devices and background noise. In addition, the spatial characteristics of the reverberation are limited by the microphone array we use to measure the IRs. For example, if we use a 5.0ch array, we will lose the 3D properties of the reverberation.
Instead of conventional IR reverbs, we propose a new method of archiving a room reverberation, VSVerb (Virtual sound Sources’ reVerb). The VSVerb is the method to sample the spatial properties of virtual sound sources of a room and generate a 4pi scene-based reverberation without noise and without responses of measurement tools.
In the workshop, we introduce our effort to archive the room acoustics of ONKIO HAUS, a recording studio in Tokyo. The studio has two legendary recording booths, Studio 1 and Studio 2. This time, we sampled their virtual sound sources at various source and receiver positions to archive their reverberant field. A total of 14 sets of virtual sound sources were obtained from IRs measured with an A-format microphone. These virtual sound sources were translated into time responses, and 14 VSVerbs were generated as archive resources of the studio acoustics.
Since the VSVerb is a 4pi scene-based reverb, we can apply many types of post-processing. This time, two types of post-processing, 1) moving and rotating the microphone’s position and 2) changing the microphone’s directivity, were tested to see if the archived reverbs could provide virtual recording experiences at the ONKIO HAUS.
The listening check was performed by several recording/mixing engineers of ONKIO HAUS in their mixing room using 2ch decoded VSVerbs. Their listening impressions were quite positive; they felt as if they were recording in the real booths of Studio 1 and Studio 2. We will also report their impressions in detail at the workshop.

Appendix: About VSVerb (Virtual sound Sources’ reVerb)
VSVerb is a sound intensity oriented technique for generating a 4pi scene-based reverb by detecting virtual sound sources from four impulse responses measured by an A-format microphone.
An overview of the processing flow is given below.
(Some papers on VSVerb can be found in the AES E-Library by typing VSV in a search window.)
1) Impulse responses are measured in a target room using an A-format microphone, and they are filtered into low, mid, and high frequency bands.
2) The filtered impulse responses are converted to W, X, Y and Z of B-format signals, and Hilbert transforms are applied to them.
3) By multiplying the Hilbert-transformed W by X, Y, and Z, three orthogonal sound intensities, Ix, Iy, and Iz, are obtained. Then time-averaging operations are applied to the sound intensities (a minimal sketch of this step appears after this list).
4) Virtual sound sources, i.e. dominant reflections, are detected from the averaged sound intensities using the "Speed detection" method.
“Speed detection” is a method for detecting the acoustic information of real and virtual sound sources, i.e. strength, distance, arrival direction, and phase (+/-), from measured sound intensities by focusing on the moving speeds of the sound intensities.
5) The spatial information of the obtained virtual sound sources is translated into time responses, then the 4pi scene-based reverberation is generated in low, mid, and high frequency bands.
6) Since VSVerb is a 4pi scene-based reverb, the reverb can be divided into any type of playback channel format and has high flexibility for post-processing operations, e.g. moving and rotating the receiver’s position, changing the receiver’s directivity, changing the averaged absorption coefficient of a room, changing the room size, etc.
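As a rough illustration of steps 2) and 3), time-averaged sound intensities can be derived from B-format signals as sketched below. This is our own NumPy sketch, not VSVerb itself; the variable names and the moving-average window are illustrative.

```python
import numpy as np
from scipy.signal import hilbert

def bformat_intensity(w, x, y, z, avg_len=64):
    # Analytic (Hilbert-transformed) versions of the B-format signals.
    Wa, Xa, Ya, Za = (hilbert(s) for s in (w, x, y, z))
    # Instantaneous intensity along each axis: Re{conj(W) * component}.
    ix = np.real(np.conj(Wa) * Xa)
    iy = np.real(np.conj(Wa) * Ya)
    iz = np.real(np.conj(Wa) * Za)
    # Simple moving-average time smoothing (window length illustrative).
    win = np.ones(avg_len) / avg_len
    return tuple(np.convolve(i, win, mode="same") for i in (ix, iy, iz))
```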

Speakers
avatar for Jun Yamazaki

Jun Yamazaki

Tac System, Inc.
avatar for Masataka Nakahara

Masataka Nakahara

Acoustic Designer / Acoustician, SONA Corp. / ONFUTURE Ltd.
Masataka Nakahara is an acoustician who specializes in the acoustic design of studios and R&D work on room acoustics, and is also an educator. After he had learned acoustics at the Kyushu Institute of Design, he joined SONA Corporation and started his career as an acoustic designer. In 2005... Read More →
SN

Shigeharu Nakauchi

ONKIO HAUS Inc.


Thursday October 26, 2023 11:30am - 12:15pm EDT
1E11

11:30am EDT

Case Study: Stereo in the round at the stadium level: Metallica 72 Seasons Tour
Delivering the powerful, clear sound of a metal band in a stadium setting is a major challenge. Sound reinforcement “in-the-round” is also a major challenge. Providing stereo coverage for anything more than a small minority of a concert audience is yet another one. Metallica’s M72 concert tour takes on all of these as it delivers full stereo, in-the-round, in stadiums, creating a near-field experience for huge audiences. System designer Bob McCarthy will share the details of this unique and unprecedented project.

Speakers
avatar for Bob McCarthy

Bob McCarthy

Meyer Sound
Bob McCarthy’s involvement in the field of sound system optimization began in 1984 with the development of Source Independent Measurement™ (SIM™) with John Meyer at Meyer Sound Laboratories. Since that time he has been an active force in developing the tools and techniques of... Read More →
avatar for Josh Dorn-Fehrmann

Josh Dorn-Fehrmann

Senior Technical Support Specialist, Meyer Sound
Josh Dorn-Fehrmann is senior technical support specialist for Meyer Sound with a background in touring, music, and sound design for theatre. At Meyer Sound, Josh serves a multipurpose role where every day is different. Some of his responsibilities include customer support, sound system... Read More →


Thursday October 26, 2023 11:30am - 12:30pm EDT
1E08

11:45am EDT

(Lecture) In-Ear Headphones on Ear Canal Simulator vs Real Human Ear Geometries: Quantifying the Errors with Simulations
Ear simulators are used to predict the pressure at the eardrum reference point (DRP) for an average human ear, but they do not provide information on the range of pressure responses in individuals, which can be significant. It is possible to predict an approximation of the pressure response at the DRP by measuring the pressure at a near-field microphone (NFM) inside the earbud and applying an appropriate transfer function G from pressure at the NFM to pressure at the DRP. Using a transfer function obtained on an ear simulator leads to inaccurate approximations, and this work aims to quantify the potential error ranges.
This work uses the Finite Element Method (FEM) to predict the pressure response of a sealed in-ear headphone for 20 different ears. Ear geometries from Magnetic Resonance Imaging (MRI) scans of ten people are simulated with in-ear earbuds at five different locations, yielding 100 different pressure responses at their respective DRPs and NFMs. Knowledge of the pressure at the NFM and the personal transfer function G allows for a prediction of a personalized pressure response at the DRP. We predict lower bounds for pressure ranges of less than ±1 dB below 1 kHz, ±5 dB between 1 kHz and 4 kHz, and up to ±15 dB above 4 kHz, and we show significant differences between real human ears and a common ear simulator.
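In our notation (not the authors'), with P denoting complex pressure for individual ear i, the prediction scheme and the error it quantifies can be written as:

```latex
G_i(f) = \frac{P^{(i)}_{\mathrm{DRP}}(f)}{P^{(i)}_{\mathrm{NFM}}(f)}, \qquad
\hat{P}^{(i)}_{\mathrm{DRP}}(f) = G_{\mathrm{sim}}(f)\,P^{(i)}_{\mathrm{NFM}}(f), \qquad
E_i(f) = 20\log_{10}\left|\frac{G_{\mathrm{sim}}(f)}{G_i(f)}\right|~\mathrm{dB}
```

Read this way, the ±1 dB, ±5 dB, and ±15 dB figures above are lower bounds on the spread of E_i(f) across ears in the respective frequency bands.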

Speakers

Thursday October 26, 2023 11:45am - 12:00pm EDT
1E07

11:45am EDT

Building Equitable Spaces: The Loop Lab's Model of Diversity and Representation in Media and Audio Industries
In this presentation, we will delve into the transformative work of The Loop Lab in promoting equity, diversity, and representation within the media and audio industries. We will explore our innovative model that aims to break down barriers and create inclusive spaces for underrepresented voices. Through a combination of education, mentorship, and hands-on experience, we empower individuals from marginalized communities to thrive in these creative fields. Join us as we share our strategies, successes, and lessons learned, highlighting the impact of our approach in fostering a more inclusive and vibrant media landscape.

Speakers
CH

Christopher Hope

The Loop Lab
LR

Lucas Raagas

The Loop Lab
AL

Abraham Lopez

The Loop Lab


Thursday October 26, 2023 11:45am - 12:30pm EDT
1E10

12:00pm EDT

(Lecture) Comparison of measurement methods for immersive sound calibration
Immersive sound systems are increasingly used in the production of recorded music. Proper calibration of these systems is critically important to achieve a neutral reference where the best translation can be achieved. With regard to final equalization, an accurate and sufficiently high-resolution measurement is required to properly adjust the system. Various methods of measurement in small rooms, including both static-microphone and moving-microphone methods, are compared, and recommendations based on calibration requirements are made.

Speakers

Thursday October 26, 2023 12:00pm - 12:15pm EDT
1E07

12:00pm EDT

Beam Me Up, Scotty: Transporting Musicians to Virtual Environments
Virtual acoustics is a broad term describing the processing of an audio signal with the characteristics of an immersive, simulated environment that envelops the listener or performer. Over a multichannel speaker system, an active virtual acoustics system superimposes the acoustic impression of the simulated space on top of an existing physical one.

The Virtual Acoustic Technology Laboratory (VATLab) at McGill University has been researching and developing innovative electro-acoustically enhanced spaces since 2005. Most recently, an acoustic system of 15 dodecahedral loudspeakers has been installed in a studio-sized space at the Schulich School of Music’s Immersive Media Lab. The VAT system uses a feedback canceller developed by researchers at Stanford and Limerick Universities that enables the loudspeaker reproduction of room impulse responses of various acoustic spaces in a typical recording setting.

As part of the lab’s development, preliminary testing has brought a wide range of musicians into the studio to interact with the system; in the process, they have unlocked its application as a creative tool. Artists have reported feeling transported to concert halls, churches, and uniquely designed unreal spaces while inside the virtual environment. This presentation will give an overview of the technology, accompanied by immersive playback of music recorded using the VAT system, along with an explanation of the various microphone systems employed to capture the space.

Speakers
avatar for Kathleen Ying-Ying Zhang

Kathleen Ying-Ying Zhang

McGill University
Ying-Ying Zhang is a music technology researcher and sound engineer. She is the first woman to be admitted to the Sound Recording PhD program at McGill University, where she is projected to graduate in 2024. She has experience in both on-set and post-production film sound, including... Read More →
VB

Vlad Baran

McGill University
AA

Aybar Aydin

McGill University
GG

Gianluca Grazioli

McGill University
WW

Wieslaw Woszczyk

McGill University


Thursday October 26, 2023 12:00pm - 12:45pm EDT
1E06

1:00pm EDT

Saul Walker Student Design Competition Round Table
Enjoy a panel of audio-design icons and experts addressing various topics in audio design. This session, hosted by the Audio Education Committee of the AES, is aimed at everyone interested and involved in audio design, but especially students and their educators. We will examine pathways into audio design, the importance of involvement in a supportive community, and the great opportunities that the AES offers audio designers. The discussion is followed by a Q&A, allowing the audience to ask the A-list designers questions. The panel will feature Dave Derr of Empirical Labs, George Massenburg of GML, and others leading the charge in audio design. The panel will be moderated by Christoph Thompson, director of the AES Design Competition.


Thursday October 26, 2023 1:00pm - 2:30pm EDT
TBA Education Stage

1:45pm EDT

(Lecture) Measurement Techniques for Dynamic Equalizers
A Dynamic Equalizer is a tool for equalizing an audio signal only under certain conditions of the input signal amplitude: for example, when the input signal exceeds or falls below a predefined threshold. The Dynamic Equalizer is a relatively modern and widely used tool in the music production industry but, like compressors and limiters, it can find a specific use in the protection and enhancement of professional audio equipment. Such interventions can be made, for example, to compensate for power-compression tonal effects at high levels, or to “fill” the under-perceived spectrum extremes at low levels (that is, an automatic loudness filter). To accomplish this purpose, the settings of a Dynamic Equalizer should be based on precise quantitative information, ideally assessed by loudspeaker measurements, and the behavior of the Dynamic Equalizer itself must be quantitatively characterized. There appear to be no standardized methodologies for measuring and validating a Dynamic Equalizer. In this presentation, methods for validating and describing the audio processing of Dynamic Equalizers designed for pro-audio equipment enhancement will be investigated.
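To make the concept concrete, here is a minimal band-split dynamic EQ sketch of our own (not a normed method, and not from the presentation): the controlled band is attenuated with a fixed ratio when its short-term level exceeds a threshold, while the residual signal passes untouched. A real design would smooth the block-wise gain steps.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def dynamic_eq(x, fs, f_lo, f_hi, thresh_db, ratio=2.0, block=512):
    x = np.asarray(x, dtype=float)
    # Split off the band to be controlled; the remainder is untouched,
    # so at unity gain the input is reconstructed exactly.
    sos = butter(2, [f_lo, f_hi], btype="bandpass", fs=fs, output="sos")
    band = sosfilt(sos, x)
    rest = x - band
    out = np.empty_like(x)
    for i in range(0, len(x), block):
        seg = band[i:i + block]
        level_db = 20 * np.log10(np.sqrt(np.mean(seg**2)) + 1e-12)
        over = max(0.0, level_db - thresh_db)   # amount above threshold
        gain_db = -over * (1.0 - 1.0 / ratio)   # downward action only
        out[i:i + block] = rest[i:i + block] + seg * 10 ** (gain_db / 20)
    return out
```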


Thursday October 26, 2023 1:45pm - 2:00pm EDT
1E07

1:45pm EDT

(Lecture) Bass Preamplifier Emulation with Conditional Recurrent Neural Network
Preamplifiers are widely used in the music industry to amplify audio signals and improve the signal-to-noise ratio. They also incorporate circuits for non-linear distortion, creating unique tones in iconic amplifiers. As music production goes digital, preamplifiers are being analyzed, modeled, and digitized for emulation. Several studies have confirmed the effectiveness of AI recurrent neural networks in accurately emulating amplifier output. However, traditional network architectures can only fit specific time-series characteristics, requiring re-training and storing different models when preamplifier settings change. This research introduces a conditional input structure, synchronizing knob parameters with input signals into a long short-term memory network, effectively predicting the preamplifier's output. Experimental design involved implementing five rotary knob parameters and conducting experiments at five different angles for each knob setting. Results showed the proposed model had an average RMSE of less than 0.01, reducing the need to store multiple parameter sets and enhancing AI modeling efficiency for multiple preamplifier characteristics.
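A minimal sketch of the conditional input structure described above, written in PyTorch; the layer sizes and the [0, 1] knob normalization are our illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ConditionedPreamp(nn.Module):
    def __init__(self, n_knobs=5, hidden=32):
        super().__init__()
        # Each time step sees the audio sample plus the knob vector.
        self.lstm = nn.LSTM(input_size=1 + n_knobs, hidden_size=hidden,
                            batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, audio, knobs):
        # audio: (batch, time, 1); knobs: (batch, n_knobs) in [0, 1]
        cond = knobs.unsqueeze(1).expand(-1, audio.shape[1], -1)
        h, _ = self.lstm(torch.cat([audio, cond], dim=-1))
        return self.head(h)   # predicted preamp output per sample
```

Because the knob settings enter as inputs rather than being baked into the weights, one trained model covers the whole control space instead of one model per setting.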


Thursday October 26, 2023 1:45pm - 2:00pm EDT
1E09

2:00pm EDT

(Lecture) A practical approach to the use of center channel in immersive music production
The center channel has been around since the mono playback systems employed in early film sound. At the advent of stereo reproduction, the film world adopted a three-channel setup, where the center channel was used for anchoring dialog to the action on screen, while ambiences and music soundtracks took advantage of the width and spread of the left and right channels. Since the beginning of 5.1 surround sound, the center channel has received mixed reviews from music production practitioners. Immersive mixers using binaural rendering may not consider the center channel as a component of a headphone-only mix. This paper provides an overview of best practices for center channel use in immersive audio mixing. Various measurements are shown to exemplify the differences between the use of a center channel versus a phantom center.

Speakers
avatar for Richard King

Richard King

McGill University
Richard King is an Educator, Researcher, and a Grammy Award winning recording engineer. Richard has garnered Grammy Awards in various fields including Best Engineered Album in both the Classical and Non-Classical categories. Richard is an Associate Professor at the Schulich School... Read More →
avatar for George Massenburg

George Massenburg

Associate Professor of Sound Recording, Massenburg Design Works
George Y. Massenburg is a Grammy award-winning recording engineer and inventor. Working principally in Baltimore, Los Angeles, Nashville, and Macon, Georgia, Massenburg is widely known for submitting a paper to the Audio Engineering Society in 1972 regarding the parametric equali... Read More →


Thursday October 26, 2023 2:00pm - 2:15pm EDT
1E07

2:00pm EDT

(Lecture) Squared Chebyshev and Elliptic Crossovers
Squared Butterworth crossovers, also referred to as Linkwitz-Riley crossovers, have been widely employed in audio applications to split a spectrum into multiple sections. However, when it comes to low-pass and high-pass filters, alternative options such as Chebyshev (Type I and II) and elliptic filters often prove beneficial. This paper proposes squared Chebyshev and elliptic filters for crossover designs. Moreover, since perfect-magnitude-reconstruction IIR crossovers sum to an all-pass response, the proposed filters can also be used to build such crossovers.
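For reference, the "squared" idea with the Butterworth prototype, i.e. the Linkwitz-Riley case that the paper generalizes, can be sketched with SciPy: cascading two identical filters per branch yields an in-phase sum whose magnitude is flat. Constants are illustrative.

```python
import numpy as np
from scipy.signal import butter, sosfreqz

fs, fc, order = 48000, 1000, 2
lp = butter(order, fc, btype="low", fs=fs, output="sos")
hp = butter(order, fc, btype="high", fs=fs, output="sos")
lp2 = np.vstack([lp, lp])   # "squared" (cascaded) low-pass branch: LR4
hp2 = np.vstack([hp, hp])   # "squared" high-pass branch

w, h_lp = sosfreqz(lp2, worN=2048, fs=fs)
_, h_hp = sosfreqz(hp2, worN=2048, fs=fs)
# The in-phase sum of the two branches is all-pass: |H| stays at 1.
print(np.max(np.abs(np.abs(h_lp + h_hp) - 1.0)))   # ~1e-15
```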

Speakers

Thursday October 26, 2023 2:00pm - 2:15pm EDT
1E09

2:00pm EDT

Barbie Goes to AES
Barbie, the movie, has taken the country by storm, hitting the top of the box office charts and breaking records. But what would Barbie be without its stellar music and performances? In this panel, we speak with Barbie engineers and producers about the recording process and the obstacles they overcame to get it to sound the way it does. What was it like to collaborate on this soundtrack with some of the biggest names in the business? Get a fun behind-the-scenes look and step into Barbie’s world for an unforgettable and fantastic session.


Thursday October 26, 2023 2:00pm - 3:00pm EDT
1E10

2:00pm EDT

Voice Acting Soundstage: Creating the Space for Opportunities in Voiceover
A workshop designed to help independent engineers, sound designers, and producers make the transition to working in the rewarding world of voiceover for TV commercials, radio, film, and animation.

Learn about the possibilities for yourself and your artists in the advertising world, from TV voiceover to the promo and podcast recording world.

Discover your unique technical strengths that can be marketed for well-paying voiceover industry projects.


About Xavier Paul Cadeau: With 25 years of voiceover/audio production experience as the voice of several networks such as CBS, HBO, and The Weather Channel, as well as voice acting in major video games such as Grand Theft Auto, cartoons such as Teenage Mutant Ninja Turtles, and a host of national projects, Xavier has jump-started the careers of many working national voiceover artists. Xavier gives an overview to artist management on how getting into the voiceover industry can expand an artist’s brand reach, make the brand more money, and create more longevity for the artist’s career.

Xavier will cover:

What the opportunities are in the media voiceover world for diverse engineers, producers and sound designers from varied ethnic backgrounds

Where the work is in Commercials, Video Games, TV Shows and Radio

What to do to get access to these money-making opportunities in media

What to expect on various voiceover production contracts

The best negotiating tactics

When a producer is ready to expand into Voiceovers/Commercials/Video Games

Speakers
XC

Xavier Cadeau

Media Producer, SAG-AFTRA


Thursday October 26, 2023 2:00pm - 3:00pm EDT
1E11

2:00pm EDT

George Massenburg: Enveloping Masterclass
George plays high-resolution stereo, 5.1, and 3D recordings from his fabulous back catalogue, commenting on production techniques and the most appropriate listening format for each track. George and Thomas discuss the Inception Dilemma, and how recordings come across in this particular room. Seats are limited to keep playback variation at bay, and the session is concluded with Q&A.

3D audio formats do not guarantee listener envelopment, or better control of a listening experience than stereo has to offer. In this masterclass series, we discuss factors of recording, mixing and reproduction that make "immersive" worth the effort. Listening examples are all linear audio and have never been lossy data reduced.

Speakers
avatar for Thomas Lund

Thomas Lund

Senior Technologist, Genelec Oy
Thomas Lund is senior scientist at Genelec, doing subjective research and listening tests. He has written papers on perception, spatialisation, loudness, sound exposure and true-peak level. Thomas is convenor of a working group under the European Commission, tasked with the prevention... Read More →
avatar for George Massenburg

George Massenburg

Associate Professor of Sound Recording, Massenburg Design Works
George Y. Massenburg is a Grammy award-winning recording engineer and inventor. Working principally in Baltimore, Los Angeles, Nashville, and Macon, Georgia, Massenburg is widely known for submitting a paper to the Audio Engineering Society in 1972 regarding the parametric equali... Read More →


Thursday October 26, 2023 2:00pm - 3:00pm EDT
3D01

2:00pm EDT

Factors influencing consumer acceptance of immersive audio
Immersive audio playback is now possible at home, on the go, and on the road. We now have a historic chance to make a success of immersive audio! Yet consumer acceptance will determine the success of the current iterations of spatial/immersive audio formats.
There are many factors to achieving consumer acceptance. One of the key factors is creating a compelling listening experience, and this takes effort at both ends of the chain. Creative decisions made during mixing, along with how the reproduction system has been configured, will directly affect the end-user experience and therefore consumer acceptance.
Due to the nature of the delivery medium (a single immersive mix that needs to be compatible with all reproduction formats), there will inevitably need to be compromises made at either end of the chain:
During music mixing, will the center channel be utilised? (Indeed, is it “allowed” to be used? Some record labels stipulate that there is to be no isolated vocal in the center channel!) Will care be taken about off-center listening? Or will the mix simply ignore the majority of people in larger spaces, or all of the occupants of a car? Which tuning curve is being used in the mixing room? Are immersive mixes created on loudspeaker systems being checked using headphone playback? What about headphone-based mixes?
When setting up the immersive sound reproduction environment, will care be taken in speaker arrays to cater for off-center listening? Does one second-guess the mixing engineer’s decisions (or the record label’s rules) and try to accommodate multiple types of mix paradigm (e.g. phantom/hard center)? Which tuning curve is being used in the reproduction environment? What factors in the headphone rendering or head-tracking algorithms have an effect on sound quality?
One further discussion point is that there appears to be a quality-versus-quantity debate for immersive music mixes. If the market is saturated with non-compelling immersive music mixes, then consumer acceptance of the format is likely to be damaged irrevocably. In cases where budgets are extremely limited, automatic stereo-to-immersive upmixers appear to be in use, and often only stereo stems will be available for the immersive mix. What are the benefits of mixing in immersive from the start, or of having access to the multichannel stems for remixing? What are the cost implications? Are there any recommendations and best practices when remixing stereo to immersive?
These and other topics will be considered by the panel.

Moderators
avatar for Rafael Kassier

Rafael Kassier

Senior Principal Engineer, Harman
Rafael is responsible for Harman’s scientific sound quality evaluation (Benchmarking) for European automotive sound systems and is based in Germany. Previously he worked in the audio industry and as a research fellow. His research areas are subjective evaluation, listener training... Read More →

Speakers
avatar for Jacqueline Bošnjak

Jacqueline Bošnjak

CEO & Co-Founder, Mach1
avatar for Buddy Judge

Buddy Judge

Advanced Media Design, Apple
Buddy Judge likes music. He has been on both sides of the microphone during his career — as a recording artist and as a producer/engineer. He currently works at Apple Inc. where he developed the Apple Digital Masters program and recently launched Lossless and Spatial audio on Apple... Read More →
NK

Nathaniel Kunkel

Studio Without Walls
avatar for Agnieszka Roginska

Agnieszka Roginska

Professor of Music Technology, Past President
Agnieszka Roginska is a Professor of Music Technology at New York University. She conducts research in the simulation and applications of immersive and 3D audio including the capture, analysis and synthesis of auditory environments, auditory displays and applications in augmented... Read More →
avatar for Andrew Scheps

Andrew Scheps

Owner, Tonequake
Andrew is a record producer/mixer, label owner, and software developer who has been in the industry for over 30 years.  He hasn't cut his hair or beard for at least 15 of those years.
MZ

Mark Ziemba

Panasonic


Thursday October 26, 2023 2:00pm - 3:30pm EDT
1E06

2:15pm EDT

(Lecture) Reexamining Traditional Stereo Microphone Techniques with Continuously Variable Pattern Microphones: Tools and Methodologies
Classic stereo microphone recording techniques were developed decades ago. There are three categories of stereo setups: coincident (X/Y, Blumlein, M-S), near-coincident (ORTF, NOS, DIN), and spaced (AB, Decca). All of these methodologies rely on the pickup pattern of the microphones used; this is the critical factor for the technique selected. The author's hypothesis is that there are real-world variations in the actual patterns between microphone models and sometimes between individual microphones of the same model. There are also microphones that allow the pattern to be continuously adjusted, and microphones with dual outputs; the latter allow the pattern to be determined in post, after recording. This paper revisits several of the traditional stereo microphone techniques using dual-output microphones, such that the pickup pattern of each microphone can be determined in post-production, and examines the workflow around their use.
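The dual-output idea rests on simple first-order pattern algebra: with front and back cardioid responses F(theta) = (1 + cos(theta))/2 and B(theta) = (1 - cos(theta))/2, the sum is omnidirectional and the difference is a figure-8, so any first-order pattern can be formed after the fact. A hedged sketch of our own, with an illustrative weighting scheme:

```python
import numpy as np

# F + B is omnidirectional, F - B is a figure-8, so mixing the two
# recorded capsule outputs sets the pattern after the fact.
def virtual_output(front_sig, back_sig, alpha):
    # alpha = 1.0 -> front cardioid, 0.5 -> omnidirectional (scaled),
    # 0.0 -> back cardioid; alpha > 1 tends toward a figure-8.
    return alpha * np.asarray(front_sig) + (1.0 - alpha) * np.asarray(back_sig)
```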

Speakers

Thursday October 26, 2023 2:15pm - 2:30pm EDT
1E07

2:15pm EDT

(Lecture) Investigation of the influence of noise signal characteristics on the efficiency of the ANC system for causal and non-causal systems
Currently, ANC systems based on the FxLMS algorithm are widely used in cars to reduce road noise. However, the efficiency of these systems is still insufficient, even in the case of high coherence between the reference signal and the error signal. One of the main characteristics of an ANC system is causality: if the noise signal reaches the microphone (ear) earlier than the compensating signal, then, in theory, the efficiency of the system will be zero. When implementing an ANC system in a car interior to suppress road noise, causality is often zero or negative. The aim of this work is to study the possibility of obtaining stable noise suppression in the case of non-causal systems.
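As background, a minimal single-channel FxLMS loop can be sketched as follows. This is our own illustration, not the authors' system: for simplicity the true secondary path is taken equal to its FIR estimate s_hat, and the step size and tap counts are arbitrary. The causality issue raised above shows up here directly: if d arrives at the error microphone before the filtered reference can represent it, the adaptive filter has no causal cancelling solution to converge to.

```python
import numpy as np

def fxlms(x, d, s_hat, n_taps=64, mu=1e-4):
    """x: reference signal; d: primary noise at the error mic;
    s_hat: FIR estimate of the secondary path (also used here to
    simulate the physical path, a deliberate simplification)."""
    w = np.zeros(n_taps)                    # adaptive control filter
    x_buf = np.zeros(n_taps)                # recent reference samples
    y_buf = np.zeros(len(s_hat))            # recent anti-noise samples
    fx = np.convolve(x, s_hat)[:len(x)]     # filtered-x signal
    fx_buf = np.zeros(n_taps)
    e = np.zeros(len(x))
    for n in range(len(x)):
        x_buf = np.roll(x_buf, 1); x_buf[0] = x[n]
        y = w @ x_buf                       # anti-noise sent to speaker
        y_buf = np.roll(y_buf, 1); y_buf[0] = y
        e[n] = d[n] + s_hat @ y_buf         # residual at the error mic
        fx_buf = np.roll(fx_buf, 1); fx_buf[0] = fx[n]
        w -= mu * e[n] * fx_buf             # LMS update on filtered-x
    return e
```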


Thursday October 26, 2023 2:15pm - 2:30pm EDT
1E09

2:30pm EDT

(Lecture) Master Bus Coloring with Microphone Preamplifiers
Summing boxes are increasingly used in digital mixing to provide benefits like added dynamic range and coloration compared to in-the-box mixing. However, the coupling of dynamic range and coloration in summing boxes lacks flexibility. We propose using microphone preamplifiers, commonly found in studios for coloration on vocal and instrument recordings, with summing boxes in passive mode to decouple these two aspects. Our investigation focuses on the validity of this approach for mixing and mastering. We start by characterizing microphone preamps in full-frequency settings by probing their magnitude and phase responses, along with THD at different amplitudes. We then explore their use for mastering without separate stems or summing, and for mixing where engineers can adjust the signal chain as needed. We conduct interviews with engineers for qualitative feedback. This novel approach decouples dynamic range and coloration, providing more flexible bus processing for the digital age while reusing existing studio equipment.


Thursday October 26, 2023 2:30pm - 2:45pm EDT
1E07

2:30pm EDT

(Lecture) Application of ML-Based Time Series Forecasting to Audio Dynamic Range Compression
Time Series Forecasting (TSF) is used in astronomy, geology, weather forecasting, and finance, to name a few. Recent research [1] has shown that, combined with Machine Learning (ML) techniques, TSF can be applied successfully to short-term predictions of music signals. We present here an application of this approach for predicting audio level changes of music and applying appropriate Dynamic Range Compression (DRC). This ML-based look-ahead prediction of audio level allows compression to be applied just in time, avoiding the latency and attack/release time constants that are inherent to traditional DRC and challenging to tune.
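A sketch of the idea (ours, not the paper's implementation): a trained forecaster, here a hypothetical `predict` callable, estimates the next block's level, and the gain computer acts on the forecast instead of a delayed or smoothed measurement.

```python
import numpy as np

def predictive_gain(levels_db, predict, thresh_db=-20.0, ratio=4.0):
    """levels_db: measured short-term levels, one per block.
    predict: hypothetical trained forecaster mapping past levels to
    the next block's level (stands in for the TSF model)."""
    gains_db = np.zeros(len(levels_db))
    for n in range(1, len(levels_db)):
        l_hat = predict(levels_db[:n])             # forecast next level
        over = max(0.0, l_hat - thresh_db)
        gains_db[n] = -over * (1.0 - 1.0 / ratio)  # static DRC curve
    return gains_db
```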

Speakers
avatar for Pascal Brunet

Pascal Brunet

Samsung Research America
Pascal Brunet obtained his Bachelor's in Sound Engineering from Ecole Louis Lumiere, Paris, in 1981, his Master's in Electrical Engineering from CNAM, Paris, in 1989 and a PhD degree in EE from Northeastern University, Boston, in 2014. His thesis was on nonlinear modeling of loudspeakers... Read More →


Thursday October 26, 2023 2:30pm - 2:45pm EDT
1E09

2:45pm EDT

(Lecture) Fatigue: What can sports science teach us about performance in audio production?
It’s common for audio professionals to speak of fatigue: hearing fatigue, listener fatigue, workload fatigue, and especially for those working in event audio, physical fatigue. Due to the financial incentive in professional sports, sports science has a large body of research and literature on the topic of fatigue, and audio professionals can benefit from some of the core concepts and protocols.
While the field of audiology offers much in terms of research on hearing fatigue (local-level fatigue within the auditory system) and listener fatigue (systemic fatigue occurring across the auditory and cognitive systems), we have few actionable protocols for audio professionals whose professional life requires high-performing activities that produce hearing and listener fatigue.
Sports science, while not addressing the hearing issue, provides a framework for fatigue management strategies required for people whose daily high-performance activities produce high levels of fatigue, and whose performance is dependent on their ability to manage that fatigue. Concepts and practices like stimulus-recovery-adaptation, periodization, stimulus to fatigue ratio, and recovery maximization can provide a framework for protocols that can help manage auditory fatigue (hearing and listener), as well as physical fatigue common to many audio careers.

Speakers

Thursday October 26, 2023 2:45pm - 3:00pm EDT
1E07

2:45pm EDT

(Lecture) AudioVMAF: Audio Quality Prediction with VMAF
VMAF [1],[2],[3] is a popular tool in the industry for measuring coded video quality. In this study, we propose an auditory-inspired frontend to the existing VMAF for creating videos of reference and coded spectrograms, extending VMAF to the measurement of coded audio quality. We name our system AudioVMAF. We demonstrate that image replication is capable of further enhancing prediction accuracy, especially when band-limited anchors are present. The proposed method significantly outperforms all existing visual quality features repurposed for audio, and even demonstrates a significant overall improvement of 7.8% and 2.0% in Pearson and Spearman rank correlation coefficients, respectively, over a dedicated audio quality metric (ViSQOL-v3 [4]) also inspired by the image domain.


Thursday October 26, 2023 2:45pm - 3:00pm EDT
1E09

3:00pm EDT

(Lecture) High-fidelity noise reduction with differentiable signal processing
Noise reduction techniques based on deep learning have demonstrated impressive performance in enhancing the overall quality of recorded speech. While these approaches are highly performant, their application in audio engineering can be limited due to a number of factors. These include operation only on speech without support for music, lack of real-time capability, lack of interpretable control parameters, operation at lower sample rates, and a tendency to introduce artifacts in challenging scenarios. On the other hand, traditional signal processing noise reduction algorithms offer fine-grained control and operation on a broad range of content, however, they often require manual operation to achieve the best results. To address the limitations of both approaches, in this work we introduce a method that leverages a signal processing-based denoiser that when combined with a neural network controller, enables fully automatic noise reduction. By doing so, we achieve efficient and high-fidelity noise reduction across both speech and music signals. We evaluate our proposed method with both objective metrics and a perceptual listening test. Our evaluation reveals that deep learning systems designed for speech enhancement can be extended to noise reduction for music, however training the model to remove only stationary noise is critical. Furthermore, our proposed approach achieves performance on par with the deep learning models, while being significantly more efficient and introducing fewer artifacts in some cases.

Speakers
avatar for Christian J. Steinmetz

Christian J. Steinmetz

PhD Researcher, Centre for Digital Music, Queen Mary University of London
I am a PhD student working with Prof. Joshua D. Reiss within the Centre for Digital Music at Queen Mary University of London. I research applications of machine learning in audio with a focus on differentiable signal processing. Currently, my research revolves around high fidelity audio and music production, which involves enhancing audio, intelligent systems for au... Read More →


Thursday October 26, 2023 3:00pm - 3:15pm EDT
1E09

3:00pm EDT

(Lecture) Stereo Speech Enhancement Using Custom Mid-Side Signals and Monaural Processing
Speech enhancement (SE) systems typically operate on monaural input and are used for applications including voice communications and capture cleanup for user-generated content. Recent advancements and changes in the devices used for these applications are likely to lead to an increase in the amount of two-channel content for the same applications. However, SE systems are typically designed for monaural input; stereo results produced using trivial methods such as channel-independent or mid-side processing may be unsatisfactory, including substantial speech distortions. To address this, the authors propose a system that creates a novel representation of stereo signals called custom mid-side signals (CMSS). CMSS allow benefits of mid-side signals for center-panned speech to be extended to a much larger class of input signals. This, in turn, allows any existing monaural SE system to operate as an efficient stereo system by processing the custom mid signal. This paper describes how the parameters needed for CMSS can be efficiently estimated by a component of the spatio-level filtering source separation system. Subjective listening using state-of-the-art deep learning-based SE systems on stereo content with various speech mixing styles shows that CMSS processing leads to improved speech quality at approximately half the cost of channel-independent processing.
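For orientation, conventional mid-side processing, the special case that CMSS generalizes, looks as follows (our sketch; `enhance` stands for any monaural SE system). Per the paper, CMSS replaces the fixed equal weights with panning-dependent weights estimated by the spatio-level filtering component.

```python
import numpy as np

def ms_enhance(left, right, enhance):
    mid = 0.5 * (left + right)     # carries center-panned speech
    side = 0.5 * (left - right)    # carries spatial information
    mid = enhance(mid)             # any monaural SE system (placeholder)
    return mid + side, mid - side  # decode back to left/right
```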

Speakers
avatar for Pascal Brunet

Pascal Brunet

Samsung Research America
Pascal Brunet obtained his Bachelor's in Sound Engineering from Ecole Louis Lumiere, Paris, in 1981, his Master's in Electrical Engineering from CNAM, Paris, in 1989 and a PhD degree in EE from Northeastern University, Boston, in 2014. His thesis was on nonlinear modeling of loudspeakers... Read More →


Thursday October 26, 2023 3:00pm - 3:30pm EDT
1E07

3:00pm EDT

Education & Career Fair
The only education and career fair focused entirely on degree and certificate programs in audio around the world. Come meet professors, students, and college admissions representatives and discover how to advance your career as an audio professional!

Moderators
Thursday October 26, 2023 3:00pm - 5:30pm EDT
TBA Education Stage

3:15pm EDT

(Lecture) Transient Detection Methods for Audio Coding
Transient detection is an important algorithm in perceptual audio codecs. It is generally used to adapt frequency vs. time resolution in the signal representation in order to reduce pre-echo artifacts in encoded audio signals. Due to the lack of published research on existing algorithms adopted in audio coding, we present a curated selection of transient detection techniques tailored for audio coding purposes based on variations of: high frequency energy, block perceptual entropy, sub-block peak energy, and spectral flatness measure. To evaluate these methods, we conduct ground-truth comparisons and MUSHRA-based listening tests. The test results show the trade-offs between the different transient detection methods across a wide range of critical material for a simple, baseline audio codec. This paper contributes to the advancement of perceptual audio coding and paves the way for further optimization in this domain.
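As an illustration of the first of these criteria, a high-frequency-energy detector can be sketched in a few lines. This is our own simplified version; the block size and threshold are illustrative, not taken from the paper.

```python
import numpy as np

def hf_energy_transients(x, block=1024, threshold=2.5):
    flags, prev = [], None
    for i in range(0, len(x) - block, block):
        spec = np.abs(np.fft.rfft(x[i:i + block] * np.hanning(block)))
        hf = np.sum(spec[len(spec) // 2:] ** 2)  # upper half of spectrum
        # Flag a block whose HF energy jumps relative to the previous one.
        flags.append(prev is not None and hf > threshold * (prev + 1e-12))
        prev = hf
    return np.array(flags)
```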


Thursday October 26, 2023 3:15pm - 3:30pm EDT
1E09

3:15pm EDT

Artificial Intelligence in Audio: Staying Ahead of the Law
The future is coming, and artificial intelligence and machine learning are everywhere. With new technology come important considerations. Who owns the generated content? When does “inspiration” become “infringement”? How can you protect yourself as a developer of AI tools or as a user of AI-generated materials and tools? Hear about the evolving legal landscape of AI in both the creative and technical spaces, and have the opportunity to ask your AI-related questions to a group of lawyers.

Speakers
avatar for Heather Rafter

Heather Rafter

Principal, RafterMarsh US
Heather Dembert Rafter has been providing legal and business development services to the audio, music technology, and digital media industries for over twenty-five years. As principal counsel at RafterMarsh US, she leads the RM team in providing sophisticated corporate and IP advice... Read More →
JL

Jay LeBoeuf

Head of Business & Corporate Development, Descript
I lead business and corporate development for Descript. Descript builds simple and powerful collaborative tools for new media creators.
GC

Gabe Cowan

Audio Design Desk
ZK

Zack Klein

RafterMarsh


Thursday October 26, 2023 3:15pm - 4:15pm EDT
1E11

3:15pm EDT

An Inside Look at In The Heights: From Broadway to Silver Screen
A conversation with engineer Derik Lee and arranger/producer Jaime Lozano, moderated by Jeanne Montalvo. We take a look at the recording process for In The Heights, the movie, and the process of adapting the music from the Tony Award-winning Broadway musical.

Speakers

Thursday October 26, 2023 3:15pm - 4:15pm EDT
1E08

3:15pm EDT

Safe Listening in Pro Audio
Recent studies on hearing health have led to a new understanding of how temporary "ringing of the ears" and other short-term warning signs from sound exposure should not be ignored. If there is a safe sound exposure level, it is lower than previously thought.

The latest findings on safe listening will be summarised, including short-term and long-term consequences of overdosing. Medical data is compared to hazard-based requirements on sound levels in Europe, and a best practice for audio professionals is suggested.

We will also discuss seeding reasonable listening habits in our children, the first generation that could potentially benefit from a better understanding of common threats to hearing in modern life.

Speakers
avatar for Thomas Lund

Thomas Lund

Senior Technologist, Genelec Oy
Thomas Lund is senior scientist at Genelec, doing subjective research and listening tests. He has written papers on perception, spatialisation, loudness, sound exposure and true-peak level. Thomas is convenor of a working group under the European Commission, tasked with the prevention... Read More →
avatar for Deanna Meinke

Deanna Meinke

Deanna Meinke received her master’s degree in Audiology from Northern Illinois University. She holds a Ph.D. from the University of Colorado in Audiology and is currently a Winchester Distinguished Professor in the Audiology and Speech-Language Sciences program at the University... Read More →
avatar for Petteri Hyvärinen

Petteri Hyvärinen

Petteri Hyvärinen received his M.Sc. (Tech.) degree in acoustics and audio signal processing from Aalto University School of Electrical Engineering, Finland, in 2012 and his D.Sc. (Tech.) degree in biomedical engineering from Aalto University School of Science in 2017.In his doctoral... Read More →


Thursday October 26, 2023 3:15pm - 4:15pm EDT
1E10

3:30pm EDT

(Lecture) The Watkins Woofer
The Watkins Woofer is an arrangement, invented and patented by William (Bill) Watkins and subsequently used by Infinity, that uses a novel technique to increase the efficiency of an infinite baffle or closed-box loudspeaker. Watkins himself succinctly described the principle of operation of his dual-coil woofer, but no rigorous analysis was published. Furthermore, the self- and mutual inductances were ignored, causing a dip in the impedance magnitude. Using the same approach as Thiele and Small, the Watkins woofer is, for the first time, fully analyzed to outline the volume, bandwidth, and sensitivity trade-offs.

Speakers

Thursday October 26, 2023 3:30pm - 4:00pm EDT
1E07

3:45pm EDT

Student Recording Competition
Speakers
avatar for Miles Fulwider

Miles Fulwider

Associate Division Director for Division of Creative Arts, University of Saint Francis
Miles Fulwider, Tonmeister M.M., is a Producer, Engineer, and Educator. Currently Miles is the Associate Director of the Division of Creative Arts and the Program Director for the Music Technology program at the University of Saint Francis in Fort Wayne, Indiana. Miles is Co-Vice... Read More →


Thursday October 26, 2023 3:45pm - 5:45pm EDT
1E06

4:00pm EDT

(Lecture) Improved Panning on Non-Equidistant Loudspeakers with Direct Sound Level Compensation
Loudspeaker rendering techniques that create phantom sound sources often assume an equidistant loudspeaker layout. Typical home setups might not fulfill this condition as loudspeakers deviate from canonical positions, thus requiring a corresponding calibration. The standard approach is to compensate for delays and to match the loudness of each loudspeaker at the listener’s location. It was found that a shift of the phantom image occurs when this calibration procedure is applied and one of a pair of loudspeakers is significantly closer to the listener than the other. In this paper, a novel approach to panning on non-equidistant loudspeaker layouts is presented whereby the panning position is governed by the direct sound and the perceived loudness is governed by the full impulse response. Subjective listening tests are presented that validate the approach and quantify the perceived effect of the compensation. In a setup where the standard calibration leads to an average error of 10°, the proposed direct sound compensation largely returns the phantom source to its intended position.
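The standard calibration referred to above can be sketched as follows (our illustration; a simple 1/r level law is assumed). The paper's point is that this alone still shifts phantom images when distances differ, which is what motivates the proposed direct-sound compensation.

```python
import numpy as np

C = 343.0  # speed of sound in air, m/s

def standard_calibration(distances_m):
    d = np.asarray(distances_m, dtype=float)
    delays_s = (d.max() - d) / C   # time-align all speakers to farthest
    gains = d / d.max()            # attenuate closer speakers (1/r law)
    return delays_s, gains
```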


Thursday October 26, 2023 4:00pm - 4:30pm EDT
1E07

4:30pm EDT

(Lecture) Delta Technique: Advancing Recordist Agency via Dual-Output Microphones and Dynamic Polar Patterns
While immersive mediation continues to grow in popularity, traditional mono and stereophonic recording techniques remain those underpinning audio production workflows. By incorporating dual-output microphone technology into established practices, capacity exists for nuancing recordist agency in ways not documented in existing literature. “Delta Technique” is introduced as a simple process to simulate polar patterns changing shape over time, with affordances associated with proximity effect, frequency masking, and stereo width. Practice-based methodology catalogues benefits of dual-output-based agency, including the ability to capture multiple stereo techniques simultaneously, pedagogical attribute demonstration, rear-output panning, performance panning, sample packaging, and DIY microphone modelling. An overarching position on “Why employ dual-output microphones?” is interrogated alongside technical data.
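A pattern "changing shape over time" can be pictured as an automated per-sample blend of a dual-output microphone's front and back capsule signals. This is our own illustrative sketch, not the published technique itself.

```python
import numpy as np

def delta_blend(front_sig, back_sig, alpha_curve):
    # alpha_curve: per-sample weights, e.g. a slow ramp from cardioid
    # (alpha = 1) toward omni (alpha = 0.5) across a musical phrase.
    a = np.asarray(alpha_curve)
    return a * np.asarray(front_sig) + (1.0 - a) * np.asarray(back_sig)
```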


Thursday October 26, 2023 4:30pm - 5:00pm EDT
1E07

4:30pm EDT

Joe Tarsia Remembered: Sigma Sound Studios and the Philly Sound
Joseph D. Tarsia, founder and original owner of Sigma Sound Studios in Philadelphia, PA and New York City, passed away in November of 2022. As a recording and mixing engineer, Tarsia was the “sonic architect” behind the sophisticated blend of R&B, soul, and funk music that became known as the Philadelphia Sound, thus contributing to that music becoming a national and international phenomenon.

A panel made up of former Sigma Sound engineers will discuss Tarsia, his life, and the studio he founded. Other special guests may be added to the panel. The discussion will be moderated by Toby Seay, director of Drexel University’s Audio Archives, home to the Sigma Sound Studios Collection, a historic collection of approximately 7,000 audiotapes documenting the legacy of Sigma Sound Studios.

Speakers
AS

Arthur Stoppe

Former Sigma Sound Engineer and Studio Manager
DD

Dirk Devlin

Sigma Sound Philadelphia Engineer
JG

Jim Gallagher

Sigma Sound Philadelphia Engineer
JM

Jay Mark

Sigma Sound Philadelphia/New York Engineer
TS

Toby Seay

Director of Drexel University’s Audio Archives


Thursday October 26, 2023 4:30pm - 5:30pm EDT
1E08

4:30pm EDT

Unlocking the Creative Potential: Exploring Recording and Production
The art of recording and producing music has witnessed a digital revolution in recent years. This tutorial proposal aims to provide an engaging and educational session that explores the fundamentals of recording and production in a DAW. The tutorial will be presented at an introductory level, catering to beginners and those seeking to expand their knowledge and skills in music production.

Objective:
The primary objective of this tutorial is to equip participants with a solid foundation in using a DAW for recording and producing music. Through a combination of practical demonstrations, and interactive discussions, attendees will gain a comprehensive understanding of the features, tools, and workflows offered by a DAW.

Tutorial Outline:

I. Introduction to the DAW
A. Overview of DAW parameters
B. Features and capabilities

II. Recording Techniques
A. Setting up audio interfaces and MIDI controllers
B. Understanding microphone types and placement
C. Recording vocals and instruments
D. Utilizing MIDI for virtual instrument recording

III. Editing and Arranging
A. Navigating the user interface
B. Editing audio and MIDI regions
C. Quantization and groove manipulation
D. Arranging and structuring a song

IV. Mixing and Mastering
A. Balancing levels and panning
B. Applying EQ, compression, and other audio effects
C. Creating spatial depth with reverb and delay
D. Mastering techniques for finalizing a mix

V. Workflow Tips and Advanced Features
A. Time-saving shortcuts and automation techniques
B. Integrating third-party plugins and virtual instruments
C. Collaborative workflows and file sharing
D. Harnessing the power of virtual instruments and samplers

VI. Q&A and Interactive Discussion
A. Addressing participants' questions throughout the tutorial
B. Facilitating an interactive dialogue among attendees
C. Sharing personal experiences and best practices

Speakers
TW

Tyron Williams

Certified Banga Productions LLC
JH

Jason Harydial

Straight Forward Studios


Thursday October 26, 2023 4:30pm - 5:30pm EDT
1E10

4:30pm EDT

TUTORIAL - Acoustical Simulation and Calculation Techniques for Small Spaces
Wolfgang Ahnert (ADA-AMC, a WSDG Company), Stefan Feistel (AFMG), Dirk Noy (WSDG)
The presentation offers a brief overview of statistical acoustic calculation techniques and geometrical acoustic modelling. The challenges for small to medium size spaces will be discussed as well as the inner workings of modern raytracing and FEM / BEM software tools. A number of real-world projects are presented as case studies.

Speakers
HS

Howard Sherman

WSDG Walters-Storyk Design Group
avatar for Sergio Molho

Sergio Molho

Partner, Director of Business Development, WSDG Walters-Storyk Design Group
Sergio Molho has been an audio/video and recording industry professional since 1982. An accomplished keyboard artist and vocalist, in the 1980’s he led popular Argentine funk band CASH. As an engineer, composer, and producer he was responsible for international productions for Sony... Read More →


Thursday October 26, 2023 4:30pm - 6:00pm EDT
1E11

5:00pm EDT

(Lecture) Microphone Comparison for Female R&B Vocal Recording
Microphone selection is an imperative part of vocal recording, as it is one of the first variables the engineer can utilise to alter the timbre of the recordings. In this paper, the role that microphone selection has on listeners' perception of quality as it pertains to female R&B vocal recording is investigated. Seven microphones were used to capture four female R&B vocalists, and the subjective quality of the recordings was then scored by 17 trained audio engineers. Findings suggest that no single optimal microphone can be found for all female R&B voices, although several microphones were found to be statistically higher rated than others on a singer-by-singer basis, suggesting that microphone selection is highly singer-dependent. Additionally, two microphones showed phrase-dependent results, indicating other factors also contribute to subjective preference of microphones.


Thursday October 26, 2023 5:00pm - 5:30pm EDT
1E07

5:00pm EDT

Ronald Prent: Enveloping Masterclass
In this series of masterclasses, we discuss 3D recording/mixing techniques and the Inception Dilemma with excellent recording artists. High resolution listening examples are given, and each masterclass is concluded with Q&A. Seats are limited to ensure a faithful representation of the recordings.

Speakers
avatar for Thomas Lund

Thomas Lund

Senior Technologist, Genelec Oy
Thomas Lund is senior scientist at Genelec, doing subjective research and listening tests. He has written papers on perception, spatialisation, loudness, sound exposure and true-peak level. Thomas is convenor of a working group under the European Commission, tasked with the prevention... Read More →


Thursday October 26, 2023 5:00pm - 6:00pm EDT
3D01

6:00pm EDT

Richard Heyser Memorial Lecture
Sound and “Enlightenment”: A series of eye (and ear) opening events linking sound recording theory and practice.

There are many practical issues in music production that may, at times, hinder a strict adherence to the basic theories of audio and acoustics. Sound recording practitioners are constantly faced with situations in which they must navigate between “following the science” and just getting the job done, and with as little compromise as possible. The talk will cover a selection of illuminating, educative experiences drawn from over 30 years of researching, recording and mixing music.

Speakers
avatar for Richard King

Richard King

McGill University
Richard King is an Educator, Researcher, and a Grammy Award winning recording engineer. Richard has garnered Grammy Awards in various fields including Best Engineered Album in both the Classical and Non-Classical categories. Richard is an Associate Professor at the Schulich School... Read More →


Thursday October 26, 2023 6:00pm - 7:30pm EDT
1E08

6:45pm EDT

The United Palace Theatre Tech Tour
Visit the 93-year-old United Palace and learn how it has evolved to meet today's theatre needs. Attendees will need to enter at the 176th Street entrance so that they enter backstage.

Speakers

Thursday October 26, 2023 6:45pm - 7:45pm EDT
OFFSITE
 
Friday, October 27
 

9:00am EDT

(Lecture) Perceptual Study Exploring Locked and Unlocked Head Rotation Panning in Jazz Fusion Reproduction Over Headphones
In 2021, head-tracked binaural music became widely available to consumers. This research investigated the merits of mixes implementing varying degrees of head-tracked audio sources in the context of jazz fusion, an intersectional subgenre that sits between the more naturally produced world of jazz and the more heavily produced worlds of rock and funk. A perceptual study was designed to collect data on preference, immersiveness, localizability, naturalness, and engagingness in a three-degrees-of-freedom controlled environment. Three saxophone solo excerpts were prepared, each approximately 60 seconds in length and each with three different versions: no stems head-tracked, all stems head-tracked, and some stems head-tracked. A repeated-measures one-way analysis of variance was conducted on the data to test for statistical significance. The study concluded that spatial impression increased with the amount of head-tracking implemented, but had minimal bearing on listener preference. Additionally, familiarity with immersive audio was not found to have a significant link to preference. Preference was program-dependent, but a mix that implemented any amount of head-tracking was always most preferred. Future studies will explore different genres of music, different implementations of the hybrid mix approach, and a six-degrees-of-freedom uncontrolled environment with participants of different backgrounds.


Friday October 27, 2023 9:00am - 9:15am EDT
1E09

9:00am EDT

(Lecture) Capturing audience film sound preferences
Public concern about poor sound experiences on TV and streaming platforms often drives content creators and organisations to try to improve the situation. Yet the current trend is one of subtle revisions to existing standards, along with the use of legacy, outdated, and occasionally ineffective methods. These standards and recommendations do not fully consider the wide variety of mobile and stationary playback hardware available, the acoustics of the playback environments, or even background noise. In addition, the diversity of the population's needs, accessibility-related or otherwise, is often overlooked, and the standards fail to fully address audience preferences. There are both data-driven and anecdotal reasons to question the effectiveness of the current standards in their present form. An audience survey (n = 500) was undertaken to better understand film audience experiences along with their preferences in terms of reproduction hardware and relative listening levels. Results provide an insight into audience preferences and behaviour when watching films across different mediums. These results will be utilised to develop an adaptive mixing framework to deliver a personalised listening experience on stationary and mobile devices.


Friday October 27, 2023 9:00am - 9:30am EDT
1E07

9:00am EDT

A Tribute to Jay McKnight

In November last year the Audio Engineering Society lost one of its most loved and respected supporters. Please join our panel to pay tribute and celebrate his brilliant legacy. This distinguished panel will include Jay's family members and professional associates.

Jay was an influential figure in professional audio, noted for his substantial contributions to both the industry and the AES. Jay was a Fellow of the AES (1960), received the AES Award (1971), was President of the Society for the 1978/1979 year, was made an Honorary Member in 1979, received the Board of Governors Award in 1990, and the Distinguished Service Medal Award in 2008, for extraordinary service to the Society and contributions to the advancement of knowledge in magnetic recording over a period of more than 50 years. He was a member of the AES Journal Review Board for the years 1960-2007; he has been a Governor four times (1962-1964, 1971-1973, 1976-1977, and 1980-1982), Chairman of the Standards Committee (1971-1974), Chairman of the Publications Policy Committee (1977-1978), and Chairman of the Historical Committee (1999-2006).

Speakers
avatar for Brad McCoy

Brad McCoy

Audio Engineer, Retired, Library of Congress
I am a retired audio engineer after 39 years at the US Library of Congress where I worked mostly in the audio archiving and preservation field.  I am Chair of the AES Historical Committee and Co-Chair of ARSC Technical Committee.  Proud to be part of IASA and grateful for this conference... Read More →
JK

John K. Chester

Owner, JKC Labs
John has been a live sound engineer, analog circuit designer, equipment manufacturer, and an independent consultant for A/V, video conferencing, network and telecommunications. In 2002 John began repairing and upgrading tape machines and began using modern digital tools to improve... Read More →


Friday October 27, 2023 9:00am - 10:00am EDT
1E10

9:00am EDT

Immersive Listening Standards
In recreational listening, we have all but said goodbye to stereo as a predictable format, as consumers are mainly listening on pods, soundbars, and headphones. Even so, AES standards have served production well, ensuring 75 years of content that may still be reproduced as the artist intended.

3D sound, however, has taken off on a tangent, with a multitude of ad hoc listening and distribution methods creating confusion in production. 3D sound lacks the global standardization support that stereo had; as a result, experienced recording artists in the field are being adversely influenced by moving playback targets and by arbitrary content flooding consumers.

From recording, recreational and physiological perspectives, we discuss qualities offered by 3D music compared to stereo, and propose guidelines that enable a transparent listening experience, also for future generations to enjoy.

Speakers
avatar for Thomas Lund

Thomas Lund

Senior Technologist, Genelec Oy
Thomas Lund is senior scientist at Genelec, doing subjective research and listening tests. He has written papers on perception, spatialisation, loudness, sound exposure and true-peak level. Thomas is convenor of a working group under the European Commission, tasked with the prevention... Read More →
avatar for Ronald Prent

Ronald Prent

Proper Prent Sound
Ronald Prent started his career at Wisseloord Studios in the Netherlands in 1980 as an inexperienced assistant and has since established himself as one of the most accomplished and innovative recording & mix engineers in the world. In the early years of surround sound, Ronald was a... Read More →


Friday October 27, 2023 9:00am - 10:00am EDT
1E06

9:00am EDT

All About The Sphere
A discussion of the challenges and innovative solutions involved in delivering “studio quality” immersive audio to 18,000+ guests inside the world's largest spherical structure. Talking points will include the impact of ingress and egress on the acoustics, the acoustical transparency of the LED screen, bowl design and layout, beamforming and wave field synthesis, and the infrasound system and dynamic delay processing.

Speakers
EH

Erik Hockman

The Sphere Entertainment Co


Friday October 27, 2023 9:00am - 10:00am EDT
1E11

9:00am EDT

Tiny Desk arrives at AES!
NPR's Tiny Desk is a personal favorite of many across the globe. Praised for its incredible sound and simple office backdrop, the series puts musicians at ease, drawing out very intimate performances. But who's behind that, and how does it sound so good? Tiny Desk's engineer, Josh Rogosin, discusses his workflow, best recording practices, and what really makes the Tiny Desk so special.

Speakers

Friday October 27, 2023 9:00am - 10:30am EDT
1E08

9:15am EDT

(Lecture) Perceptual impression of room impulse responses simulated by CE-FDTD method
The room impulse response (RIR) is widely used to represent an arbitrary sound field by convolving it with acoustic signals. While RIRs are conventionally measured in real spaces, advances in sound field simulation have made it possible to calculate the RIR from the room geometry. The Finite Difference Time Domain (FDTD) method has been used to analyze and synthesize sound fields; in particular, the Compact Explicit (CE-) FDTD method improves on the FDTD method to calculate full-band acoustic signals. However, no study has evaluated the perceptual impression of RIRs simulated by the CE-FDTD method against a real space. We conducted subjective evaluation experiments on acoustic signals convolved with the simulated RIR and on acoustic signals captured in the real space. The results confirm that the simulated RIR gives a natural impression for narrow-bandwidth acoustic signals without transient sounds.
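For orientation, the core of any FDTD room simulation is a leapfrog update of a discretized pressure field. The sketch below is a minimal standard 2-D scheme in Python/NumPy; the grid size, source and receiver positions, boundary handling, and step count are all assumptions for illustration, not the paper's CE-FDTD variant or parameters.

    import numpy as np

    # Minimal standard 2-D leapfrog FDTD (illustrative; not the paper's CE-FDTD scheme).
    c = 343.0                        # speed of sound (m/s)
    dx = 0.05                        # grid spacing (m)
    dt = dx / (c * np.sqrt(2))       # Courant-stable time step for 2-D
    lam2 = (c * dt / dx) ** 2        # squared Courant number (= 0.5 here)

    nx = ny = 200                    # assumed grid: a 10 m x 10 m "room"
    p_prev = np.zeros((nx, ny))      # pressure field at t - dt
    p_curr = np.zeros((nx, ny))      # pressure field at t
    p_curr[nx // 2, ny // 2] = 1.0   # impulse source at the centre

    rir = []                         # simulated RIR at an assumed receiver point
    for _ in range(2000):
        lap = (np.roll(p_curr, 1, 0) + np.roll(p_curr, -1, 0)
               + np.roll(p_curr, 1, 1) + np.roll(p_curr, -1, 1) - 4.0 * p_curr)
        p_next = 2.0 * p_curr - p_prev + lam2 * lap
        # Pressure-release (p = 0) boundaries for simplicity; a real room model
        # needs impedance boundaries fitted to the measured walls.
        p_next[0, :] = p_next[-1, :] = p_next[:, 0] = p_next[:, -1] = 0.0
        rir.append(p_next[60, 60])
        p_prev, p_curr = p_curr, p_next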


Friday October 27, 2023 9:15am - 9:30am EDT
1E09

9:30am EDT

(Lecture) Design Choices in a Binaural Perceptual Model for Improved Objective Spatial Audio Quality Assessment
Spatial audio quality assessment is crucial for achieving immersive user experiences, but subjective evaluations are time-consuming and costly. To address this, automated algorithms have been developed for objective quality assessment. This study focuses on developing an improved binaural perceptual model for spatial audio quality measurement by choosing the best-performing set of design parameters among previously proposed methods. Existing binaural models, particularly extensions of the Perceptual Evaluation of Audio Quality (PEAQ) method, are investigated to enhance spatial audio quality metrics.

The performance of the popular Gammatone Filterbank (GTFB) and PEAQ's built-in filterbank is compared for use in developing spatial distortion metrics related to inter-aural time and level differences (ITD and ILD) and inter-aural cross-correlation (IACC). The evaluation includes different binaural cue types and window lengths, with subjective scores from a spatial audio quality database used for correlation analysis. Additionally, three binaural cue extraction systems are evaluated using spatial and timbre distortion metrics, employing a common peripheral model. Objective quality scores are derived using multivariate regression and validated against subjective scores from a listening test database.

Results indicate similar performance between GTFB and PEAQ's Filterbank in predicting spatial audio quality. The binaural cue extraction model proposed by Seo et al. (2013) demonstrates the best overall performance, making an additional GTFB unnecessary for spatial audio quality assessment. Future work aims to expand the binaural model to incorporate higher-level spatial distortion metrics, such as directional loudness. This research contributes to advancing audio quality evaluation and optimization, particularly in spatial audio coding, to enhance the immersive user experience through accurate source localization and perceived sound quality.
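As a rough illustration of the cue types the metrics build on, the sketch below computes broadband ITD, ILD, and IACC estimates from a binaural signal pair in Python/NumPy. It is an assumption-laden simplification: the models evaluated in the paper compute these per critical band and per frame after a peripheral (e.g., gammatone) filterbank.

    import numpy as np

    def binaural_cues(left, right, fs):
        # ILD: ratio of channel energies in dB.
        ild = 10 * np.log10((np.sum(left**2) + 1e-12) / (np.sum(right**2) + 1e-12))
        # Normalized interaural cross-correlation over +/-1 ms of lag.
        max_lag = int(1e-3 * fs)
        lags = np.arange(-max_lag, max_lag + 1)
        denom = np.sqrt(np.sum(left**2) * np.sum(right**2)) + 1e-12
        xcorr = np.array([np.sum(left * np.roll(right, k)) for k in lags]) / denom
        iacc = np.max(np.abs(xcorr))               # IACC
        itd = lags[np.argmax(np.abs(xcorr))] / fs  # ITD in seconds (circular-lag sketch)
        return itd, ild, iacc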


Friday October 27, 2023 9:30am - 9:45am EDT
1E09

9:30am EDT

(Lecture) SynthAX: A Fast Modular Synthesizer in JAX
Modern audio production relies heavily on realtime audio synthesis. However, accelerating audio synthesis far beyond realtime speeds has a significant role to play in advancing intelligent audio production techniques. Fast synthesis methods have been used to generate useful datasets, implement audio matching procedures for automatic sound design, and infer synthesis parameters for real-world sounds. In this paper, we present SynthAX, a fast virtual modular synthesizer written in JAX. At its peak, SynthAX generates audio over 60,000 times faster than realtime, and significantly faster than the state-of-the-art in accelerated sound synthesis. We present SynthAX as an open source and easily extensible API to stimulate and support applications of fast sound synthesis at scale.
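The abstract does not show SynthAX's interface, but the mechanism behind this kind of speed is standard JAX practice: JIT-compiling a pure synthesis function and vmapping it over a batch of parameter settings. The sketch below is an illustrative stand-in (a single sine voice with an exponential envelope; all names and values are invented), not SynthAX's actual API.

    import jax
    import jax.numpy as jnp

    SR = 44100  # sample rate (assumed)

    def sine_voice(params, n_samples):
        # One toy "module chain": sine oscillator into an exponential decay envelope.
        freq, decay = params
        t = jnp.arange(n_samples) / SR
        return jnp.exp(-decay * t) * jnp.sin(2 * jnp.pi * freq * t)

    # vmap renders a whole batch of parameter settings in one call, and jit
    # compiles the batch with XLA -- the ingredients behind far-faster-than-
    # realtime synthesis on accelerators.
    render_batch = jax.jit(jax.vmap(sine_voice, in_axes=(0, None)), static_argnums=1)

    params = jax.random.uniform(jax.random.PRNGKey(0), (1024, 2),
                                minval=jnp.array([55.0, 0.5]),
                                maxval=jnp.array([880.0, 8.0]))
    audio = render_batch(params, SR)  # shape (1024, 44100): 1024 one-second voices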

Speakers

Friday October 27, 2023 9:30am - 10:00am EDT
1E07

9:45am EDT

(Lecture) An Over-Ear Headphone Target Curve for Brüel & Kjær Head And Torso Simulator Type 5128 measurements
This study aimed to identify the need for a revised target curve for closed-back headphones when measured on a Brüel & Kjær (B&K) high-frequency Head And Torso Simulator (HATS) Type 5128. Since the publications by Olive et al. [1] and Olive and Welti [2], the Harman frequency response curve has been a widespread target for manufacturers of headphones. This curve is, however, based on measurements on a GRAS 43AG Ear and Cheek Simulator equipped with an IEC 711 coupler. With the introduction of the B&K HATS 5128 in 2017, an updated preference curve could be needed. This new HATS incorporates a more realistic configuration of the external ear and a skin-like material that surrounds the pinna, improving the sealing and repeatability of headphone measurements. We investigated whether a new target curve is needed when measuring closed-back headphones on a HATS 5128. Our study included preference evaluations of 32 different frequency curves in lab-based listening tests with a total of 56 consumers. Older studies investigating target frequency response curves have often assumed that consumers have one global preference across demographics, while more recent studies have indicated otherwise. We continued this work by investigating whether consumers in Denmark and Japan share the same preference.


Friday October 27, 2023 9:45am - 10:00am EDT
1E09

10:00am EDT

(Lecture) Jazz Mapping: An Advanced Framework for Solo Analysis and Discovery in Jazz Music
Jazz Mapping presents a comprehensive framework for delving into the fascinating realm of jazz improvisation. By categorizing and hierarchically segmenting solo components according to their significance, this innovative approach unveils the characteristic musical language employed by different jazz players and styles. Our user-friendly implementation exposes a RESTful API backed by a PostgreSQL database enriched with the annotated constituents of each solo, giving users access to basic statistics and letting them retrieve solos from specific artists or time periods. Our approach also enables deeper analysis of a solo by examining references and responses within the performance, offering valuable insights into the captivating narrative and storytelling elements of jazz improvisation.
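By way of illustration only, a client for such a RESTful service might look like the sketch below. The base URL, endpoint, query parameters, and response fields are all hypothetical, since the abstract does not publish the actual schema.

    import requests

    # All names below are invented placeholders, not the project's real API.
    BASE = "https://example.org/jazz-mapping/api"

    resp = requests.get(f"{BASE}/solos",
                        params={"artist": "John Coltrane",
                                "from_year": 1959, "to_year": 1965})
    resp.raise_for_status()
    for solo in resp.json():
        # e.g. inspect the hierarchically segmented constituents of each solo
        print(solo["title"], solo["year"], len(solo["segments"]))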


Friday October 27, 2023 10:00am - 10:15am EDT
1E07

10:00am EDT

(Lecture) Spatial resolution of human hearing with different azimuth, elevation and bandwidth of source signals
In order to design efficient 3D audio coding and rendering systems, it is important to understand how the spatial resolution of human hearing is affected by sound source direction and bandwidth. However, this problem has not yet been sufficiently studied. We therefore investigated the spatial resolution of human hearing for different azimuths, elevations, and bandwidths of source signals. We found that resolution decreases as the azimuth angle of the sound source moves toward the lateral side of the listener and as the elevation angle increases. Resolution is also lowered as the bandwidth of the source signal becomes narrower. In this paper, we provide the measured figures, which are useful in the design of 3D immersive audio systems.


Friday October 27, 2023 10:00am - 10:15am EDT
1E09

10:00am EDT

DEI in Audio Education Panel
This panel introduces pedagogical examples from the University of Colorado, Texas A&M, Temple University, the University of Lethbridge, and the University of Hartford that incorporate topics addressing diversity, equity, inclusion (DEI), and student well-being in the broader field of music technology. These topics include 1) how "embodied sonic mediation" pedagogy can help audio engineering students improve their sonic awareness, technical skills, and personal well-being; 2) how music theory teachers can bring low-cost technology into core music theory classes to promote diversity, equity, and inclusion, using MuseScore, the OpenScore Lieder Corpus, ed-tech software such as Artusi, Aurelia, Harmonia, MakeMusic Cloud, and Musition, as well as open educational books and resources like Open Music Theory, to make learning more equitable for students; 3) how to create a safe learning space and build community in the classroom where ideas can be shared and valued; and 4) how to embrace a variety of aesthetics and styles in electroacoustic music in a collaborative environment that includes talented students with diverse cultural, racial, gender, and economic backgrounds. Through this panel discussion, we strive to improve accessibility, welcome diverse genres, and radiate inclusiveness to all races, genders, and gender identities. Meanwhile, we aim to provide insights for educators on improving students' mental health and well-being in the post-COVID era.

Moderators
Speakers

Friday October 27, 2023 10:00am - 11:30am EDT
TBA Education Stage

10:00am EDT

Exhibit Hall
Welcome to the AES Exhibit Hall where innovation and excellence in audio technology come to life! The AES Exhibit Hall is a captivating hub for audio professionals, enthusiasts, and industry leaders alike. Here, you'll discover a vibrant and immersive environment that showcases leading companies and partners, new technologies, and must-see products all in one location! Don't miss our more than 150 exhibitors, education stages, poster sessions, career fair, and MORE! Start planning your visit to the Exhibit Hall by checking out the Floor Plan.

Friday October 27, 2023 10:00am - 4:00pm EDT

10:15am EDT

(Lecture) Integrating Live Computer Tools into the Creation, Adaptation, and Performance of Japanese Noh Theatre
This paper introduces the technical and aesthetic issues involved in creating, adapting, and performing Noh with live electronics. My method is based on a thematic analysis of the digital audio effects utilized by Jōji Yuasa in his 1961 musique concrète adaptation of Aoi no Ue, the classic Noh created by Zeami in the 14th century. Yuasa’s work is used as a template for curating and developing a suite of real-time audio processing tools for performances of new and classic Noh. Through Yuasa’s formal training in Noh and engagement with Noh's traditional aesthetic principles in his use of audio effects, I endeavor to expand his vision into live settings in collaboration with certified Noh instructor-performers from Theatre Nohgaku, the Noh ensemble led by Richard Emmert. Preliminary findings are presented to facilitate interdisciplinary collaboration with Noh ensembles in line with third-wave human-computer interaction research, alongside designs for Max/MSP packages for the five categories of Noh theatre: god, man/warrior, woman, miscellaneous, and demon.

Speakers

Friday October 27, 2023 10:15am - 10:30am EDT
1E07

10:15am EDT

(Lecture) Spatial auditory masking between source signals at different elevations on the median plane
Spatial auditory masking between sound objects located vertically on the median plane of the listener has been examined. It is known that the masking threshold is symmetric with respect to the frontal plane of the listener, meaning that a masker at front center masks signals at back center as strongly as signals at front center. In this study, however, we found that at certain masker frequencies, a masker at the front center of the subject does not mask signals at top center (zenith) to the same degree as it masks signals at the front or back center of the listener, even though their interaural time differences (ITDs) are almost identical. Conversely, stronger masking occurs when the masker is located at top center than in other directions, regardless of the maskee direction. We found that these effects are partially due to differences in the loudness of source signals at different elevations.


Friday October 27, 2023 10:15am - 10:30am EDT
1E09

10:15am EDT

5 Ways to Market Your Studio for Growth and Higher Revenue
This session will highlight five marketing strategies that you can implement in your audio production studio business to encourage growth and higher revenues from new and existing clients.

You will receive actionable takeaways from such topics as the importance of being seen in your market, building a website and social presence that are focused on lead generation, establishing community involvement and partnership opportunities that build awareness and trust in your studio brand, and more.

The presentation will also look at the value of using software tools for planning and tracking marketing activities, as well as measuring results. Find out how to use marketing skills to increase your audio production studio growth and revenue potential.

Speakers
avatar for Brady Hoggard

Brady Hoggard

CEO, Sonido Software
Brady Hoggard is an audio professional and has worked in the audio industry for the past 15+ years. With over 10 years' experience in software and technology, Brady is the CEO of Sonido Software. Sonido is a recording studio management software that makes the daily management of an... Read More →


Friday October 27, 2023 10:15am - 11:15am EDT
1E10

10:15am EDT

[Tutorial] Externalized Binaural Rendering
In daily communication and entertainment applications, audio is frequently delivered over headphones or earbuds and commonly heard near or inside the listener’s head. During this session, attendees will experience and compare audio signal processing approaches aiming to mitigate this well-known phenomenon. Strategies include the addition of artificial acoustic reflections, “augmented-reality” adaptation to listening room acoustics, or room-agnostic “colorless” externalization processing.
The audio presentation will be exclusively via wireless web-based two-channel audio transmission, accessed via each attendee’s personal electronic device connected to their own headphones or earbuds. During most of the session, the audio source will be provided by live microphone capture of the presenter’s voice, enabling the synchronous presentation of audio and visual localization cues.
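To make the first of those strategies concrete, the sketch below adds a few delayed, attenuated copies of a signal, the simplest possible form of "artificial acoustic reflections." It is a deliberately minimal mono sketch with invented delays and gains; a real externalization system would spatialize each reflection binaurally (e.g., through HRTFs) and adapt it to the listening room.

    import numpy as np

    def add_artificial_reflections(x, fs,
                                   reflections=((0.004, 0.5),
                                                (0.011, 0.35),
                                                (0.023, 0.25))):
        # Mix delayed, attenuated copies into the dry signal. Delay/gain pairs
        # here are illustrative assumptions, not values from the session.
        y = np.copy(x)
        for delay_s, gain in reflections:
            d = int(delay_s * fs)
            y[d:] += gain * x[:len(x) - d]
        return y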

Speakers
avatar for Jean-Marc Jot

Jean-Marc Jot

Principal, Virtuel Works
Virtuel Works provides audio technology strategy, IP creation and licensing services to help accelerate the development of audio and voice spatial computing and interoperability standards that will power our future ubiquitous copresence, remote collaboration or immersive music and... Read More →
avatar for Christopher Landschoot

Christopher Landschoot

Virtuel Works
Chris' work bridges the gap between music, audio, acoustics, and technology. Chris is an avid musician who writes and produces his own music, and works as an acoustics consultant designing the acoustics of the built environment. He also has a passion for audio and technology which... Read More →


Friday October 27, 2023 10:15am - 11:15am EDT
1E06

10:15am EDT

UWB – a new wireless tech in the future of pro audio
It’s not just Wi-Fi and Bluetooth anymore. UWB as an RF protocol has become a standard feature in modern mobile handsets. As applications increase, the availability of cost-effective chip implementations makes UWB an increasingly attractive and practical RF solution. It is now both natural and advantageous for device manufacturers to turn to UWB as a wireless content capture and delivery tool for high-quality audio. As a transport option, UWB offers tremendous possibilities for professional audio use cases. Early small-scale, purpose-built products have already proven the ability of this digital wireless technology to deliver the low-latency, high-dynamic-range audio quality suitable for the most demanding professional audio applications. This workshop examines how the audio industry can now take the next steps toward fully utilizing the power of UWB to reliably deliver low latency (sub 5 milliseconds) and high data throughput for Hi-Res (24/96) linear PCM audio, along with the configurability and flexibility that will be required of advanced wireless devices to truly perform in the future.
A panel of UWB experts will explain UWB technology and examples of its application in the next wave of cutting-edge wireless audio products. We will also discuss the need for a UWB Audio Interface Standard in order to ensure the interoperability and high quality of professional audio implementations.

Speakers
avatar for Jackie Green

Jackie Green

Nexonic Design
Jackie Green has enjoyed many opportunities to pursue great sound and innovative technologies. After BS and MBA degrees, Green pursued graduate courses in microprocessor design and digital signal processing in order to support creative work in digital wireless and audio. She is an... Read More →
JM

Jonny McClintock

Audio Codecs Ltd


Friday October 27, 2023 10:15am - 11:15am EDT
1E11

10:30am EDT

(Lecture) Generic ProAV Network Model
The rapid shift towards Ethernet-based technologies in ProAV systems has exposed significant challenges in the integration and convergence of diverse media transport protocols. Conflicting assumptions among de facto and official standards about underlying network features often result in severe incompatibilities. The situation is exacerbated by the absence of an overarching use case model to guide the development and application of these protocols. To address this, the authors propose a generic use case model grounded in real-world ProAV requirements, especially for live events and productions. This model focuses on system integration and interoperability, taking into account the coexistence of various signal and control traffic types, loose collaborative teams, and the contrast to traditional IT network design principles. Through this model, the authors aim to clarify requirements, shape realistic expectations for products and implementations, and stimulate discussion for the next evolution in AV networking.

Speakers
avatar for Henning Kaltheuner

Henning Kaltheuner

Independent consultant, Board of Directors, Avnu Alliance
Henning's main expertise is market research, based on his master's degree in Psychology at the University of Cologne with a specialization in media psychology and qualitative research. His work has concentrated on gaining insights into market trends, brand perceptions and customer expectations... Read More →


Friday October 27, 2023 10:30am - 10:45am EDT
1E07

10:30am EDT

(Lecture) Binaural rendering for professional audio applications
Binaural technology could be used as a working tool for acoustic consultants or audio system engineers. However, binaural technology can reach more critical areas of application only if it is recognized as authentic, meaning that it provides perceptual identity with an explicitly presented real sound scene. Several studies have tried to identify whether binaural rendering can be considered authentic; in our study we focus on plausibility rather than authenticity, referring to agreement with the listener's expectation. We ran a perceptual test in which audio system engineers could compare real music reproduction over professional audio equipment with binaural recordings of that playback, and we repeated the test using modeled system responses and binaural synthesis to verify whether acoustic modelling or binaural synthesis introduced critical discrepancies with respect to the real binaural recordings. The engineering brief reports the metrics defined by experts to rate binauralization, describes the audio setup and processing chain in both the recording and synthesis cases, and finally presents and discusses the results of the tests.


Friday October 27, 2023 10:30am - 10:45am EDT
1E09

10:45am EDT

(Lecture) Implementation of Simultaneous Deconvolution on a Real-time Smartphone App
In-room speaker system equalization was traditionally implemented by exciting one speaker at a time. With a higher number of speakers, restrictions on measurement microphone setup, the annoyance of traditional stimuli, and background noise, measuring the impulse response of a multi-channel system in real time can be cumbersome. With FFT computation restrictions on a smartphone DSP, the accuracy and resolution of the impulse responses are compromised. This paper addresses all of these concerns with a novel approach to implementing the simultaneous deconvolution of a multichannel speaker system. It uses a set of circularly shifted sine-sweep stimuli to excite the speakers and calculates the impulse responses in real time on a smartphone app over a cloud-based architecture. An independent recording and playback system, along with manual delays or system delays due to Bluetooth, Wi-Fi, or cloud-based communication, poses further challenges to the accuracy of the measurements. To surmount these complications, we discuss a time-alignment method that uses bin-wise matched filtering of spectrograms, followed by a statistical analysis of its results.
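A simplified offline version of the core idea (not the authors' smartphone/cloud implementation) can be sketched as follows: each loudspeaker plays a circularly shifted copy of one exponential sine sweep, and a single circular deconvolution of the microphone signal against the unshifted sweep yields all impulse responses, stacked at known offsets. Sweep parameters, speaker count, and IR lengths below are assumptions, and the capture is simulated.

    import numpy as np

    fs, T, n_spk = 48000, 4.0, 4           # assumed sweep length and speaker count
    n = int(fs * T)
    t = np.arange(n) / fs
    f0, f1 = 20.0, 20000.0
    # Exponential sine sweep (ESS)
    sweep = np.sin(2 * np.pi * f0 * T / np.log(f1 / f0)
                   * (np.exp(t / T * np.log(f1 / f0)) - 1))

    shift = n // n_spk                     # circular shift per loudspeaker
    stimuli = [np.roll(sweep, k * shift) for k in range(n_spk)]

    # Simulated capture: the mic hears each speaker convolved with its own IR.
    rng = np.random.default_rng(0)
    true_irs = [rng.standard_normal(256) * np.exp(-np.arange(256) / 40.0)
                for _ in range(n_spk)]
    mic = sum(np.real(np.fft.ifft(np.fft.fft(s) * np.fft.fft(h, n)))
              for s, h in zip(stimuli, true_irs))

    # One circular deconvolution against the unshifted sweep recovers all IRs,
    # stacked at offsets of k * shift samples.
    stacked = np.real(np.fft.ifft(np.fft.fft(mic) / (np.fft.fft(sweep) + 1e-12)))
    irs = [np.roll(stacked, -k * shift)[:256] for k in range(n_spk)]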


Friday October 27, 2023 10:45am - 11:00am EDT
1E07

10:45am EDT

(Lecture) Knockout tournaments for screening top-ranking samples among a large number of alternatives
In this paper, we apply theoretical findings on the mathematical properties of knockout tournaments to psychoacoustic experimentation. By simulating pairwise comparisons between samples from a Thurstonian model, we investigate statistics of the outcomes of knockout tournaments under various realistic assumptions regarding the arrangement of stimuli along a hypothetical psychometric scale. We suggest that knockout tournament designs can be beneficial especially when the aim is to narrow down to a small subset of best samples among a large pool of initial alternatives.
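The simulation approach described can be illustrated compactly: draw noisy "perceived" values from a Thurstonian model for each pairwise comparison and run single-elimination brackets. The sketch below uses an assumed stimulus arrangement and noise level; it mirrors the method in spirit, not the paper's exact configurations.

    import numpy as np

    rng = np.random.default_rng(1)

    def knockout(true_quality, sigma=1.0):
        # One single-elimination tournament: every pairwise comparison is decided
        # by noisy perceived values drawn from a Thurstonian model.
        entrants = list(rng.permutation(len(true_quality)))  # random bracket seeding
        while len(entrants) > 1:
            winners = []
            for a, b in zip(entrants[::2], entrants[1::2]):
                if (true_quality[a] + rng.normal(0, sigma)
                        > true_quality[b] + rng.normal(0, sigma)):
                    winners.append(a)
                else:
                    winners.append(b)
            entrants = winners
        return entrants[0]

    quality = np.linspace(0.0, 3.0, 16)    # 16 stimuli on a latent psychometric scale
    wins = np.bincount([knockout(quality) for _ in range(2000)], minlength=16)
    print(wins / wins.sum())               # winning rates concentrate on the top items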

Speakers
avatar for Petteri Hyvärinen

Petteri Hyvärinen

Petteri Hyvärinen received his M.Sc. (Tech.) degree in acoustics and audio signal processing from Aalto University School of Electrical Engineering, Finland, in 2012 and his D.Sc. (Tech.) degree in biomedical engineering from Aalto University School of Science in 2017. In his doctoral... Read More →


Friday October 27, 2023 10:45am - 11:00am EDT
1E09

10:45am EDT

This is the Fresh Air Archive
This presentation will outline the multi-project effort to preserve and provide access to NPR's Fresh Air with Terry Gross. Over 40 years of interviews from the influential radio show have been preserved.

Fresh Air is one of public media's most popular programs, with 5 million listeners on 665 stations and 4 million weekly podcast downloads. Terry Gross has, for more than 40 years, engaged in conversations with newsmakers, writers, film directors, musicians, actors, and artists to open windows into their hearts, minds, and work.

Fresh Air received the Peabody Institutional Award in 2022. Terry Gross was awarded a National Humanities Medal from President Obama in 2016. The radio program has been broadcast nationally since 1987, produced in Philadelphia at WHYY, Inc. and distributed by NPR.

The transcripts and audio archives travel the world, used by public libraries, government agencies, law firms and industry. Transcripts are created using Artemis software at NPR in Washington D.C. and are available at LexisNexis, Dow Jones, ProQuest, Universities, and other archives.

Archiving and preserving the program began in 1978 with organizing the reel-to-reel tapes for the WHYY-FM local Fresh Air. In 2006 the effort began at WHYY to digitize the reel-to-reel audio tapes, DATs, and CDs. Decisions were made for sampling rate, bit resolution, and the best medium to produce the archives. Working with NPR, the audio and transcripts are now accessible to the public at https://freshairarchive.org/ and https://www.npr.org/programs/fresh-air/

Speakers
avatar for Joyce Lieberman

Joyce Lieberman

Fresh Air/Radio Engineering Supervisor, Fresh Air/WHYY, Inc.


Friday October 27, 2023 10:45am - 11:30am EDT
1E08

11:00am EDT

(Lecture) Natural Ambiance Expansion Processing For An Automotive Environment
An automotive cabin is much smaller than a typical room and contains a combination of reflective and absorptive materials in close proximity. As a result, the acoustic response within the cabin is initially dense but decays quickly. This creates a dry, unnatural, and less immersive sound field compared to typical room environments. One approach to address this phenomenon has been to equalize the acoustics of the vehicle cabin and/or introduce artificial reverberation that better matches the desired acoustic properties. However, it is challenging to fully neutralize the acoustics of the cabin due to the dependency on head position and the non-minimum-phase behavior within the space. This paper describes a more practical method of morphing the acoustics of the cabin toward those of a more desirable space by integrating the reverberation decay characteristics of the vehicle and the desired room through cross-analysis, energy normalization, and frequency spectrum equalization.
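The cross-analysis step presumably begins from decay statistics of the measured cabin and target-room responses. As a minimal, assumption-laden sketch (not the authors' algorithm), Schroeder backward integration gives the energy decay curve from which a reverberation time can be estimated and compared between the two spaces:

    import numpy as np

    def decay_curve_db(ir):
        # Schroeder backward integration: energy decay curve in dB.
        edc = np.cumsum(ir[::-1] ** 2)[::-1]
        return 10 * np.log10(edc / edc[0] + 1e-12)

    def rt60_from_edc(edc_db, fs, lo=-5.0, hi=-25.0):
        # Fit the -5..-25 dB span of the decay and extrapolate to -60 dB
        # (T20-style); assumes the response actually decays past `hi`.
        i0 = int(np.argmax(edc_db <= lo))
        i1 = int(np.argmax(edc_db <= hi))
        slope = (edc_db[i1] - edc_db[i0]) * fs / (i1 - i0)   # dB per second
        return -60.0 / slope

    # Comparing rt60_from_edc(decay_curve_db(cabin_ir), fs) against the target
    # room, per frequency band, is the kind of cross-analysis that could drive
    # the energy normalization and equalization stages (not shown here).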


Friday October 27, 2023 11:00am - 11:15am EDT
1E07

11:00am EDT

Richard King: Enveloping Masterclass
Richard King plays high-resolution 7.1.4 music recordings of Yo-Yo Ma and Eric Clapton, as well as selections from the excellent Chevalier soundtrack album, while also describing some of the techniques used.

Richard and Thomas discuss the Inception Dilemma, and how recordings come across in this particular room, also compared to cinema. Seats are limited to keep playback variation at bay, and the session is concluded with Q&A.

3D audio formats do not guarantee listener envelopment, or better control of a listening experience than stereo has to offer. In this masterclass series, with discrete-channel, linear audio examples, we discuss factors of recording, mixing and reproduction that make "immersive" worth the effort.

Speakers
avatar for Thomas Lund

Thomas Lund

Senior Technologist, Genelec Oy
Thomas Lund is senior scientist at Genelec, doing subjective research and listening tests. He has written papers on perception, spatialisation, loudness, sound exposure and true-peak level. Thomas is convenor of a working group under the European Commission, tasked with the prevention... Read More →
avatar for Richard King

Richard King

McGill University
Richard King is an Educator, Researcher, and a Grammy Award winning recording engineer. Richard has garnered Grammy Awards in various fields including Best Engineered Album in both the Classical and Non-Classical categories. Richard is an Associate Professor at the Schulich School... Read More →


Friday October 27, 2023 11:00am - 12:00pm EDT
3D01

11:15am EDT

(Lecture) Close and Distant Gunshot Recordings for Audio Forensic Analysis
We describe contemporary forensic interpretation of multiple concurrent gunshot audio recordings made in acoustically complex surroundings. Criminal actions involving firearms are of ongoing concern to law enforcement and the public. The U.S. FBI lists 166,000 criminal incidents involving firearms each year. Meanwhile, over 80% of the large general-purpose law enforcement departments in the U.S. now use audio-equipped body-worn cameras (BWCs), more than 135 communities use ShotSpotter™ gunshot audio detection systems, and tens of millions of private homes and businesses now have round-the-clock surveillance camera systems, many of which also record audio. Thus, it is increasingly common for audio forensic examination of gunshot incidents to include multiple concurrent audio recordings from a range of distances. A case study example is presented.

Speakers

Friday October 27, 2023 11:15am - 11:30am EDT
1E07

11:30am EDT

Connection management in AES70-2023
This session will describe connection management in the update of the AES70 standard that is due to be released this year:
• Description of AES70-21-2023, connection management for AES70 and SMPTE ST 2110-30
• Description of AES70-22-2023, connection management for MILAN.

Speakers
JB

Jeff Berryman

Bosch Communications Systems, Inc.
EW

Ethan Wetzell

Bosch Communications Systems, Inc.


Friday October 27, 2023 11:30am - 12:15pm EDT
1E11

11:30am EDT

Mix Critiques
Speakers
avatar for Ian Corbett

Ian Corbett

Educator, Kansas City Kansas Community College
Dr. Ian Corbett is the Coordinator and Professor of Audio Engineering and Music Technology at Kansas City Kansas Community College. He also owns and operates “off-beat-open-hats LLC,” providing live sound, audio production, recording, and consulting services to clients in the... Read More →


Friday October 27, 2023 11:30am - 12:30pm EDT
1E06

11:45am EDT

Back to Mono, Again: Microphone Placement Strategies for Non-Immersive Formats Motivated by These Immersive Times
Immersive audio has inspired a new way to think about the presentation of sound, but it has also provoked a re-think about how to approach stereo and – yes – mono! Explorations in ambisonics stir a return to first principles for coincident stereo recording, particularly M/S and Blumlein. Understanding their value in stereo capture remains ever important, both for stereo production and for tracking stereo elements within an immersive production. But they also represent an interesting approach to many mono, close-microphone techniques. Fluency in the decoding/shuffling of the output of two coincident microphones has production value that goes beyond width, space, and localization. Get it right when tracking and you'll be empowered to explore and fine-tune timbre and to virtually move a microphone while mixing. In this way we fine-tune the quality of capture before reaching for EQ and other effects. From mono to stereo to immersive and back, broaden your microphone techniques to become a better recording and mix engineer.

Speakers
AC

Alex Case

fermata audio + acoustics


Friday October 27, 2023 11:45am - 12:30pm EDT
1E08

12:00pm EDT

Student Competitions Award Ceremony
Announcement of awards for the Student Recording Competition and the MATLAB Hackathon.

Moderators
Speakers
avatar for Miles Fulwider

Miles Fulwider

Associate Division Director for Division of Creative Arts, University of Saint Francis
Miles Fulwider, Tonmeister M.M. is a Producer, Engineer, and Educator.  Currently Miles is the Associate Director of the Division of Creative Arts and the Program Director for the Music Technology program at the University of Saint Francis in Fort Wayne Indiana.  Miles is Co-Vice... Read More →


Friday October 27, 2023 12:00pm - 1:30pm EDT
TBA Education Stage

1:30pm EDT

AI ”Townhall” Presented by the AES TC-MLAI
Recent developments in AI technology (e.g., the latest releases of ChatGPT and other generative AI technologies) are having a profound effect on creative industries. At the time of writing this proposal, there is a writers' strike in Hollywood. Various legal challenges have been launched against companies providing these services. Thought leaders, think tanks, professional organizations and industry associations are calling for action and posting open letters, recommendations, and position papers. Legislators around the world are meeting. As a professional organization representing creative audio professionals, how should AES react?

The AES community is somewhat unusual among professional organizations owing to its diversity: it comprises audio practitioners, scientific researchers, technology developers and students. As a technical committee (TC-MLAI) and professional organization, what is our responsibility to these different groups? What leadership can and should we, as a technical committee and an organization, provide?

The AES TC-MLAI proposes holding a “town hall” style event to foster an exchange of ideas among the whole AES community about recent developments in AI and the AES’s role in shaping the future of audio AI. This facilitated, highly interactive discussion will offer insight into the views, opinions, ambitions, and concerns held by our varied members. The event organizers will not advocate for a particular position. Ideas will be captured, and with this information, the AES TC-MLAI plans to draft in late 2023/early 2024 a set of core principles (values), code of conduct, recommendations and/or guidelines (TBD) that can provide leadership on the use and/or development of generative and other AI technologies intended for use in creative audio work. This information may also be shared with other relevant TCs, AES management, etc. who are in positions to address the community’s concerns and interests. Additionally, and more generally, the event provides an opportunity for community members to share thoughts and concerns, which is constructive for community building and building a healthy discourse.

It is important to note the proposed scope: creative audio work. AI is having an impact in many areas of audio, and each deserves due consideration. However, for this town hall, we propose focusing on the use of AI in creative production. The exact structure of the event is to be worked out depending on the time allotted, etc. It is not our intention to control the content or direction of discussions. Interaction with the audience is a priority. However, we also recognize that without structure conversations can quickly become fragmented and even incoherent. Moreover, the issue of AI can be contentious. In planning, we will work towards striking a constructive balance between structure and openness.

One possible configuration might be as follows:

- The scope of the discussion could be limited to 2-3 pre-determined questions about the design and/or application of AI.
- Informative, educational materials will be made available in advance.
- The questions, discussion scope, etc. will be announced in advance of the town hall (in the conference program perhaps).
- Questions may include, for example: "If TC-MLAI were to draft a position document, what kind of document would best serve the community? Should this be a recommended code of conduct for audio engineers using AI technology? Should it attempt to guide research and development? Is it information aimed at clients of audio engineers that can help them get the services they need?"
- Other questions might include, for example: what steps, if any, can/should the AES community take to protect jobs and/or the integrity of the work that we do? How can/should we educate novice engineers about the technical and creative potential, the ethics, and limits of AI audio technologies?

To ensure that various relevant perspectives are represented in the room, we will invite certain participants with expertise in, for example, education, standards, commercial product/system development, legal, AES governance, mastering and other stages of the production workflow, etc. We will make it our goal to facilitate a coherent discussion among diverse participants that sheds light on impressions of audio AI now, and also, suggests future considerations and next steps for the AES community.

Speakers
avatar for Nyssim Lefford

Nyssim Lefford

Associate Professor, Audio Technology, Luleå University of Technology
avatar for Jonathan Wyner

Jonathan Wyner

Mastering Engineer, Berklee College of Music, Professor of Music Production and Engineering; M Works, Chief Engineer/Consultant
AES President 2021, Jonathan Wyner is a Technologist, Education Director for iZotope in Cambridge, MA, Professor at Berklee College of Music in Boston, and Chief Engineer at M Works Mastering. A musician and performer, he's a Grammy-nominated producer who's mastered and produced... Read More →
avatar for Gordon Wichern

Gordon Wichern

Sr. Principal Research Scientist, Speech and Audio Team, MERL
Audio signal processing and machine learning researcher
avatar for J Keith McElveen

J Keith McElveen

Founder/CTO, Wave Sciences, Research Lead, Spatial Hearing
My technical focus is on solving The Cocktail Party Problem - i.e., separating speech in a reverberant environment with competing (masking) talkers and noise. Applications include audio forensics, hearing assistance, and voice user interfaces.


Friday October 27, 2023 1:30pm - 3:00pm EDT
1E11

1:30pm EDT

Hip-Hop at 50: Innovating Techniques Across Decades
Over the course of its history, Hip-Hop has pushed the audio envelope with new sounds and new ways to create them. Using turntables, samplers, sequencers and now software, Hip-Hop producers and engineers across eras have shaped the sound of modern music. This panel featuring multi-platinum and Grammy winning producers and engineers will discuss their unique methods of creation.

Speakers
PW

Paul Willie Green Womack

Record Producer/Recording Engineer


Friday October 27, 2023 1:30pm - 3:00pm EDT
1E08

2:00pm EDT

Hyunkook Lee: Enveloping Masterclass
Hyunkook Lee plays high resolution 7.1.4 recordings, including atmospheric sounds from New York City, and explains the recording techniques used. Hyunkook and Thomas discuss the "Inception Dilemma" and recent Huddersfield University studies in the field, concluded with Q&As. Seats are limited to ensure a faithful representation of the recordings.

Immersive content is no guarantee of an enveloping listening experience, comparable to what e.g. a concert hall has to offer. In this masterclass series, we hear and discuss differences between “enveloping” and just “immersive” from multiple angles.

Speakers
avatar for Thomas Lund

Thomas Lund

Senior Technologist, Genelec Oy
Thomas Lund is senior scientist at Genelec, doing subjective research and listening tests. He has written papers on perception, spatialisation, loudness, sound exposure and true-peak level. Thomas is convenor of a working group under the European Commission, tasked with the prevention... Read More →
avatar for Hyunkook Lee

Hyunkook Lee

Professor, Applied Psychoacoustics Lab, University of Huddersfield
I am Professor of Audio and Psychoacoustic Engineering and Director of the Applied Psychoacoustics Lab (APL)/Centre for Audio and Psychoacoustic Engineering (CAPE) at the University of Huddersfield, UK. Past and current research areas include the perception of auditory height and... Read More →


Friday October 27, 2023 2:00pm - 3:00pm EDT
3D01

2:00pm EDT

Navigating the Pitfalls of Immersive Audio Production
Immersive audio production has become a common practice for music mixers and mastering engineers everywhere. Artists and labels are increasingly asking for 3D content to be delivered along with a standard stereo master. Practitioners, however, are at a loss when it comes to how and what to deliver, as various streaming services implement their own binaural processes. The differences include LFE filter settings, room simulation, loudness measurement and whether or not spatialization settings are respected. A panel of experts will discuss these issues and advise on best strategies regarding rear channel content, proper use of LFE and center channel as well as side surround and height channel signals. Questions from the audience will be a welcome part of the workshop.

Speakers
avatar for Richard King

Richard King

McGill University
Richard King is an Educator, Researcher, and a Grammy Award winning recording engineer. Richard has garnered Grammy Awards in various fields including Best Engineered Album in both the Classical and Non-Classical categories. Richard is an Associate Professor at the Schulich School... Read More →
avatar for George Massenburg

George Massenburg

Associate Professor of Sound Recording, Massenburg Design Works
George Y. Massenburg is a Grammy award-winning recording engineer and inventor. Working principally in Baltimore, Los Angeles, Nashville, and Macon, Georgia, Massenburg is widely known for submitting a paper to the Audio Engineering Society in 1972 regarding the parametric equali... Read More →
avatar for Leslie Ann Jones

Leslie Ann Jones

Skywalker Sound, CA
avatar for Ronald Prent

Ronald Prent

Proper Prent Sound
Ronald Prent started his career at Wisseloord Studios in the Netherlands in 1980 as an inexperienced assistant and has since established himself as one of the most accomplished and innovative recording & mix engineers in the world. In the early years of surround sound, Ronald was a... Read More →


Friday October 27, 2023 2:00pm - 3:00pm EDT
1E10

2:30pm EDT

Audio for Games Returns to Tokyo
AES Technical Committee on Audio for Games (AfG) was established in 2003, and is celebrating its 20th anniversary this year, 2023.
AfG TC has hosted five international conferences in London, in 2009, 2011, 2013, 2015 and 2016, and many people from around the world have attended and shared valuable experiences.
After eight years, the Audio for Games Conference will return, this time in Tokyo.
From April 27th to 29th in 2024, “AES 6th International Conference on Audio for Games (AfG6)” will be held in Tokyo with the theme of “Interactive Audio Innovation, Rediscovering the Heritage.”
This workshop provides an overview of the AfG6 conference and video messages from AfG6 committee members talking about current issues and future visions of Audio for Games.
It is the right time to meet the AfG community again in person.
AfG6 looks forward to welcoming many smiling faces during Tokyo's best season.

Speakers
avatar for Masataka Nakahara

Masataka Nakahara

Acoustic Designer / Acoustician, SONA Corp. / ONFUTURE Ltd.
Masataka Nakahara is an acoustician who specializes in the acoustic design of studios and R&D work on room acoustics, and is also an educator. After learning acoustics at the Kyushu Institute of Design, he joined SONA Corporation and started his career as an acoustic designer. In 2005... Read More →
SS

Scott Selfon

Meta (Reality Labs Research)


Friday October 27, 2023 2:30pm - 3:00pm EDT
1E06

3:15pm EDT

DEI Committee Town Hall
The AES DEI Committee will present reports from its initiatives including Data Collection, Editorial Initiative, Mentorship Program, and the newly formed Accessibility Sub-committee. The floor will then be open for discussion.

Speakers
avatar for Mary Mazurek

Mary Mazurek

Audio Educator/ Recording Engineer, University of Lethbridge
Audio Educator at the University of Lethbridge. GRAMMY-nominated recording engineer based in Lethbridge, AB and Chicago, IL. Research & professional work: classical and acoustic music recording, live radio broadcast, podcasting, the aesthetic usage of noise, noise art, sound art... Read More →
avatar for Jiayue Cecilia Wu

Jiayue Cecilia Wu

Assistant Professor, University of Colorado Denver
Jiayue Cecilia Wu, Ph.D. is a scholar, composer, audio engineer, technologist, and vocalist. Her work focuses on how music technology can augment the healing power of music. She earned her BS in Design and Engineering in 2000. She then worked as a professional musician, publisher... Read More →


Friday October 27, 2023 3:15pm - 4:15pm EDT
1E11

3:15pm EDT

Educating the Next Generation of Musical Instrument Designers
Today's designers of musical instruments require expertise in a diverse assortment of disciplines: materials science, electronics, human factors engineering, and computer programming, as well as music. How can a course of study fulfill all of these needs? What challenges do we face in preparing the next generation of instrument designers, and who is going to do the training?
Participants on the panel will be educators from colleges and secondary schools who are teaching instrument design, and industry representatives looking for new talent.

Speakers
PL

Paul Lehrman

Tufts University
AN

Alex Nieva

BRAMS Laboratory
avatar for Akito Van Troyer

Akito Van Troyer

Associate Professor, Berklee College of Music
AS

Alexandria Smith

Georgia Institute of Technology


Friday October 27, 2023 3:15pm - 4:15pm EDT
1E10

3:15pm EDT

The Current State of Spatial Audio in Games
Spatial audio has been in games since 2016, but there have been some recent major updates and improvements. This session will cover the what, why, and how of integrating spatial audio into games. There will be examples of creative applications in a mix, including playback of demos over the Dolby Atmos system in the room.

Speakers

Friday October 27, 2023 3:15pm - 4:15pm EDT
1E06

3:15pm EDT

7 Non Traditional Production Methods w/ QuestionATL
Some engineers and producers take advantage of a few of the customizations and optimizations available on their machines (laptops, desktops, mobile recorders, tablets, etc.). However, unlocking the deeper settings of your production machine can lead to greater efficiency and productivity with cutting-edge applications and technologies. QuestionATL will take attendees through a variety of production optimizations, including on mobile devices when working remotely with artists who need to provide high-resolution content without access to a studio or professional-grade gear.

Speakers
QA

Question Atl

QuestionATL Music


Friday October 27, 2023 3:15pm - 4:15pm EDT
1E08

4:30pm EDT

Can AI-based methods revamp audio coding?
Deep learning/AI techniques have proven highly successful in audio signal processing, for example in denoising and audio source separation. In the coding of audio signals, AI methods have shown remarkably impressive performance in coding speech at low bit rates. Coding of general audio, such as music, has so far been a considerably harder challenge. However, recent advances in the field have resulted in AI-based systems that are competitive with traditional perceptual audio codecs.
This workshop, organized by the AES Technical Committee on Coding of Audio Signals, contains panelists from both industry and academia and will give an introduction to the topic, brief presentations of current techniques, followed by an interactive discussion between the audience and the panel about future directions and applications in this emerging field.

Speakers
MK

Minje Kim

Indiana University


Friday October 27, 2023 4:30pm - 5:30pm EDT
1E10

4:30pm EDT

Written In Their Soul: The Story & Restoration of The Stax Songwriter Demos
Multi-GRAMMY winning producer Cheryl Pawelski and multi-GRAMMY winning mastering/restoration engineer Michael Graves will share the background of this historically important and illuminating project - the first extensive set of its kind to focus on the Stax songwriters, telling their stories largely through their own previously unheard demo recordings. The recordings in this set range from demos captured on boomboxes or who-knows-what kind of early home recording gear, to outtakes from the studio with full-blown arrangements that didn’t make the final cut for an album, to everything in between. Co-produced with Stax artist, songwriter and long-time publicist, Deanie Parker, Written In Their Soul shines a light on the legendary Memphis label through the lens of its composers from a source who knew and worked with them all. The demos range from hit songs by known artists/writers like Eddie Floyd and Carla Thomas, to demos of songs never cut or released by more behind-the-scenes artists/songwriters like Homer Banks, Bettye Crutcher and Mack Rice. From sourcing the demo recordings through restoration of all the tracks, the presentation will walk listeners through the different audio anomalies and challenges that were present in both the original sources and subsequent digital transfers, as well as playing recordings of both the demos and final released masters.

Speakers
avatar for Michael Graves

Michael Graves

Mastering Engineer, Mastering Engineer, Osiris Studio
Michael Graves is a multiple GRAMMY-Award winning mastering engineer. During his career, Graves has worked with clients worldwide and has gained a reputation as one of the leading audio restoration and re-mastering specialists working today, earning 4 GRAMMY Awards and 13 nominations... Read More →
avatar for Cheryl Pawelski

Cheryl Pawelski

Co-Founder/Co-Owner, Co-Founder/Co-Owner Omnivore Recordings
Three-time Grammy® Award-winning producer, Cheryl Pawelski has, for more than 30 years, been entrusted with preserving, curating, and championing some of music’s greatest legacies. Before co-founding Omnivore Entertainment Group, she held positions at Rhino, Concord, and EMI-Capitol... Read More →


Friday October 27, 2023 4:30pm - 5:30pm EDT
1E08

4:30pm EDT

Dimensions of Immersive Audio Experience
Immersive audio is a buzzword today, and it is often regarded as a synonym for spatial audio. But what does immersive mean exactly? There is currently a lack of consensus on how the term should be defined, and there is a misconception about what makes audio content truly immersive. This session will first explicate different dimensions of immersion and related concepts such as presence and involvement. Then various technical and context-dependent factors associated with immersive audio experience will be discussed. The session will be accompanied by various 7.1.4 audio demos.

Speakers
avatar for Hyunkook Lee

Hyunkook Lee

Professor, Applied Psychoacoustics Lab, University of Huddersfield
I am Professor of Audio and Psychoacoustic Engineering and Director of the Applied Psychoacoustics Lab (APL)/Centre for Audio and Psychoacoustic Engineering (CAPE) at the University of Huddersfield, UK. Past and current research areas include the perception of auditory height and... Read More →


Friday October 27, 2023 4:30pm - 5:30pm EDT
1E06

4:30pm EDT

The New Network Reality in Live Events
The past decade has witnessed a significant rise in real-time networking protocols’ maturity in live sound applications. Each has demonstrated value in diverse applications, yet their simultaneous implementation poses challenges. For networked audio applications to truly evolve and tap into their full potential, it is crucial to integrate these protocols seamlessly.

Traditionally, real-time networked audio applications have been confined to dedicated networks to simplify switch configuration and prevent conflicts – ensuring reliability at the expense of interoperability and efficiency. In response to this challenge, AVB (Audio Video Bridging) has emerged as a key component in enabling network convergence and promoting open, interoperable systems. Furthermore, leading switch manufacturers have begun to recognize AVB's potential and are increasingly integrating AVB functionality into their switches, marking a significant step towards network convergence. This session explores how other protocols can take advantage of the value of AVB and Milan to foster more efficient and collaborative network environments.

In this session, representatives from leading audio manufacturers will present a path to network convergence, openness, and interoperability. Panelists will explore the benefits of network convergence; present specific use cases demonstrating what’s currently possible; and discuss how they plan to tackle the challenges ahead for the new reality in live events.

Speakers
avatar for Genio Kronauer

Genio Kronauer

Executive Director of Electronics & Networks Technologies, L-Acoustics
avatar for Richard Bugg

Richard Bugg

Digital Products Solutions Architect, Meyer Sound
Richard Bugg is the Digital Products Solutions Architect for Meyer Sound. He is responsible for developing solutions to meet demanding artistic requirements for Meyer Sound customers. For the past two decades Richard has been working with Digital Audio Show Control and immersive sound... Read More →
avatar for Henning Kaltheuner

Henning Kaltheuner

Independent consultant, Board of Directors, Avnu Alliance
Henning's main expertise is market research based on his master degree in Psychology at University of Cologne with a specialization on media psychology and qualitative research. His work has concentrated on gaining insights into market trends, brand perceptions and customer expectations... Read More →
MG

Meghan Glickman

Caster Communications, Inc.


Friday October 27, 2023 4:30pm - 5:30pm EDT
1E11

4:30pm EDT

David Geffen Hall Tour
This tour, led by the principal designers at Akustiks, will focus on the Wu Tsai Theater, the reimagined 2,200-seat concert hall within David Geffen Hall.

Speakers
avatar for Paul Scarbrough

Paul Scarbrough

Akustiks
Paul Scarbrough is an acoustical design professional with over 30 years of experience. He has developed effective working partnerships with a broad array of architects and theater planners. His formal training in architecture allows him to appreciate a diverse range of architectural... Read More →


Friday October 27, 2023 4:30pm - 6:00pm EDT
OFFSITE
 