Repository logo
Log In(current)
  • Inicio
  • Personal de Investigación
  • Unidad Académica
  • Publicaciones
  • Colecciones
    Datos de Investigacion Divulgacion cientifica Personal de Investigacion Protecciones Proyectos Externos Proyectos Internos Publicaciones Tesis
  1. Home
  2. Universidad de Santiago de Chile
  3. Publicaciones ANID
  4. Sound Source Localization Using Hybrid Convolutional Recurrent Neural Networks in Undesirable Conditions
Details

Sound Source Localization Using Hybrid Convolutional Recurrent Neural Networks in Undesirable Conditions

Journal
Electronics (Switzerland)
ISSN
2079-9292
Date Issued
2025
Author(s)
Adasme-Soto, P  
Abstract
Sound event localization and detection (SELD) is a fundamental task in spatial audio processing that involves identifying both the type and location of sound events in acoustic scenes. Current SELD models often struggle with low signal-to-noise ratios (SNRs) and high reverberation. This article addresses SELD by reformulating direction of arrival (DOA) estimation as a multi-class classification task, leveraging deep convolutional recurrent neural networks (CRNNs). We propose and evaluate two modified architectures: M-DOAnet, an optimized version of DOAnet for localization and tracking, and M-SELDnet, a modified version of SELDnet, which has been designed for joint SELD. Both modified models were rigorously evaluated on the STARSS23 dataset, which comprises 13-class, real-world indoor scenes totaling over 7 h of audio, using spectrograms and acoustic intensity maps from first-order Ambisonics (FOA) signals. M-DOAnet achieved exceptional localization (6.00° DOA error, 72.8% F1-score) and perfect tracking (100% MOTA with zero identity switches). It also demonstrated high computational efficiency, training in 4.5 h (164 s/epoch). In contrast, M-SELDnet delivered strong overall SELD performance (0.32 rad DOA error, 0.75 F1-score, 0.38 error rate, 0.20 SELD score), but with significantly higher resource demands, training in 45 h (1620 s/epoch). Our findings underscore a clear trade-off between model specialization and multifunctionality, providing practical insights for designing SELD systems in real-time and computationally constrained environments. © 2025 by the authors.
Get Involved!
  • Source Code
  • Documentation
  • Slack Channel
Make it your own

DSpace-CRIS can be extensively configured to meet your needs. Decide which information need to be collected and available with fine-grained security. Start updating the theme to match your Institution's web identity.

Need professional help?

The original creators of DSpace-CRIS at 4Science can take your project to the next level, get in touch!

Logo USACH

Universidad de Santiago de Chile
Avenida Libertador Bernardo O'Higgins nº 3363. Estación Central. Santiago Chile.
ciencia.abierta@usach.cl © 2023
The DSpace CRIS Project - Modificado por VRIIC USACH.

  • Accessibility settings
  • Privacy policy
  • End User Agreement
  • Send Feedback
Logo DSpace-CRIS
Repository logo COAR Notify