BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications

The result's identifiers

  • Result code in IS VaVaI

    RIV/00216305:26230/23:PU149419 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F23%3APU149419)

  • Result on the web

    https://ieeexplore.ieee.org/document/10022718

  • DOI - Digital Object Identifier

    10.1109/SLT54892.2023.10022718 (http://dx.doi.org/10.1109/SLT54892.2023.10022718)

Alternative languages

  • Result language

    English

  • Original language name

    BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications

  • Original language description

    Automatic speech recognition (ASR) allows transcribing the communications between air traffic controllers (ATCOs) and aircraft pilots. The transcriptions are used later to extract ATC named entities, e.g., aircraft callsigns. One common challenge is speech activity detection (SAD) and speaker diarization (SD). In the failure condition, two or more segments remain in the same recording, jeopardizing the overall performance. We propose a system that combines SAD and a BERT model to perform speaker change detection and speaker role detection (SRD) by chunking ASR transcripts, i.e., SD with a defined number of speakers together with SRD. The proposed model is evaluated on real-life public ATC databases. Our BERT SD model baseline reaches up to 10% and 20% token-based Jaccard error rate (JER) in public and private ATC databases. We also achieved relative improvements of 32% and 7.7% in JERs and SD error rate (DER), respectively, compared to VBx, a well-known SD system.

  • Czech name

  • Czech description

Classification

  • Type

    D - Article in proceedings

  • CEP classification

  • OECD FORD branch

    10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

  • Project

  • Continuities

    R - European Commission Framework Programme project

Others

  • Publication year

    2023

  • Confidentiality

    S - Complete and truthful data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Article name in the collection

    IEEE Spoken Language Technology Workshop, SLT 2022 - Proceedings

  • ISBN

    978-1-6654-7189-3

  • ISSN

  • e-ISSN

  • Number of pages

    8

  • Pages from-to

    633-640

  • Publisher name

    IEEE Signal Processing Society

  • Place of publication

    Doha

  • Event location

    Doha

  • Event date

    Jan 9, 2023

  • Type of event by nationality

    WRD - Worldwide event

  • UT code for WoS article

    000968851900086