Meeting the challenge: A benchmark corpus for automated Urdu meeting summarization
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F25%3ACXCZ277J" target="_blank" >RIV/00216208:11320/25:CXCZ277J - isvavai.cz</a>
Result on the web
<a href="https://www.scopus.com/inward/record.uri?eid=2-s2.0-85190131770&doi=10.1016%2fj.ipm.2024.103734&partnerID=40&md5=41bc0ab2008a8a59c01dfba52690d63b" target="_blank" >https://www.scopus.com/inward/record.uri?eid=2-s2.0-85190131770&doi=10.1016%2fj.ipm.2024.103734&partnerID=40&md5=41bc0ab2008a8a59c01dfba52690d63b</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.ipm.2024.103734" target="_blank" >10.1016/j.ipm.2024.103734</a>
Alternative languages
Result language
English
Original language name
Meeting the challenge: A benchmark corpus for automated Urdu meeting summarization
Original language description
Meeting summarization has become crucial as the world gradually shifts towards remote work. Automating meeting summary generation is needed to minimize time and effort. The surge in online meetings has made summarization an indispensable requirement, yet summarizing Urdu meetings poses a formidable challenge due to the scarcity of pertinent corpora. Abstractively summarizing Urdu meetings compounds this challenge. This research addresses these gaps by introducing the Center for Language Engineering (CLE) Meeting Corpus, a benchmark resource tailored for meeting summarization in administrative and technical domains where Urdu is the primary language. Comprising 240 recorded meetings, encompassing both scenario-based and natural discussions, the corpus spans approximately 7900 min (∼132 h) of meeting duration. Beyond corpus creation, the study analyzes the performance of various deep learning models in Urdu abstractive meeting summarization. Models, including ur_mT5-small, ur_mT5-base, ur_mBART-large, ur_RoBERTa-urduhack-small, and GPT-3.5 with prompting, undergo comprehensive evaluation using both automated metrics and manual assessments based on five specific criteria. This research not only addresses the immediate challenges of Urdu meeting summarization but also contributes to advancing the capabilities of meeting summarization systems in diverse organizational contexts where Urdu is the language of communication during meetings. © 2024 Elsevier Ltd
Czech name
—
Czech description
—
Classification
Type
J<sub>SC</sub> - Article in a specialist periodical, which is included in the SCOPUS database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2024
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Information Processing and Management
ISSN
0306-4573
e-ISSN
—
Volume of the periodical
61
Issue of the periodical within the volume
2024
Country of publishing house
US - UNITED STATES
Number of pages
21
Pages from-to
1-21
UT code for WoS article
—
EID of the result in the Scopus database
2-s2.0-85190131770