dx.doi.org/10.1109/ICASSP49660.2025.10888138
Preview meta tags from the dx.doi.org website.
Search Engine Appearance
Speech Data Selection for Efficient ASR Fine-Tuning using Domain Classifier and Pseudo-Label Filtering
In real-world speech data processing, the scarcity of annotated data and the abundance of unlabelled speech data present a significant challenge. To address this, we propose an efficient data selection pipeline for fine-tuning ASR models: pseudo-labels are generated with the WhisperX pipeline, and only the most useful ones are selected for fine-tuning. We propose a domain classifier built on computationally inexpensive TF-IDF features and a classical machine learning algorithm. We then filter the classifier output using a novel metric that assesses word ratio and perplexity distribution. The filtered pseudo-labels are used to fine-tune standard encoder-decoder Whisper models and Zipformer. Our proposed data selection pipeline reduces the dataset size to approximately 1/100th while maintaining performance comparable to the full dataset, outperforming random domain-independent selection strategies.
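The domain classification step described in the abstract can be illustrated with a minimal sketch. The abstract only says "TF-IDF and classical machine learning algorithm", so everything concrete here is an assumption: the nearest-centroid classifier, the whitespace tokenizer, the toy transcripts, and the domain labels `medical` and `sports` are hypothetical stand-ins, not the authors' setup. The word-ratio and perplexity filter is not sketched, since the abstract does not define it.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for a toy example.
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency over a list of tokenized documents."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(doc, idf):
    """L2-normalized TF-IDF vector as a sparse dict; unseen terms are dropped."""
    tf = Counter(term for term in doc if term in idf)
    vec = {term: count * idf[term] for term, count in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {term: v / norm for term, v in vec.items()} if norm else {}


def fit_centroids(texts, labels):
    """Average the TF-IDF vectors of each domain into one centroid per label."""
    docs = [tokenize(t) for t in texts]
    idf = fit_idf(docs)
    sums = defaultdict(lambda: defaultdict(float))
    counts = Counter(labels)
    for doc, label in zip(docs, labels):
        for term, value in tfidf(doc, idf).items():
            sums[label][term] += value
    centroids = {
        label: {term: value / counts[label] for term, value in vec.items()}
        for label, vec in sums.items()
    }
    return idf, centroids


def predict(text, idf, centroids):
    """Return the label whose centroid has the largest dot product with the text."""
    vec = tfidf(tokenize(text), idf)

    def score(centroid):
        return sum(value * centroid.get(term, 0.0) for term, value in vec.items())

    return max(centroids, key=lambda label: score(centroids[label]))


# Hypothetical pseudo-label transcripts with hand-assigned domains.
train_texts = [
    "the patient was given a dose of insulin",
    "the doctor reviewed the blood test results",
    "the striker scored a goal in the final minute",
    "the team won the match after extra time",
]
train_labels = ["medical", "medical", "sports", "sports"]

idf, centroids = fit_centroids(train_texts, train_labels)
in_domain = predict("the nurse checked the patient blood pressure", idf, centroids)
out_of_domain = predict("the goalkeeper saved a late goal in the match", idf, centroids)
```

In a pipeline like the one described, only transcripts predicted as the target domain would be passed on to the later filtering stage; a production system would more likely use an established library implementation of TF-IDF and a trained classifier.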
General Meta Tags (12)
- title: Speech Data Selection for Efficient ASR Fine-Tuning using Domain Classifier and Pseudo-Label Filtering | IEEE Conference Publication | IEEE Xplore
- google-site-verification: qibYCgIKpiVF_VVjPYutgStwKn-0-KBB6Gw4Fc57FZg
- Description: In real-world speech data processing, the scarcity of annotated data and the abundance of unlabelled speech data present a significant challenge. To address thi
- Content-Type: text/html; charset=utf-8
- viewport: width=device-width, initial-scale=1.0
Open Graph Meta Tags (3)
- og:image: https://ieeexplore.ieee.org/assets/img/ieee_logo_smedia_200X200.png
- og:title: Speech Data Selection for Efficient ASR Fine-Tuning using Domain Classifier and Pseudo-Label Filtering
- og:description: In real-world speech data processing, the scarcity of annotated data and the abundance of unlabelled speech data present a significant challenge. To address this, we propose an efficient data selection pipeline for fine-tuning ASR models by generating pseudo-labels using WhisperX pipeline and selecting efficient labels for fine-tuning. In our work, we propose a domain classifier system developed with a computationally inexpensive TFIDF and classical machine learning algorithm. Later, we filter data from the classifier output using a novel metric that assesses word ratio and perplexity distribution. The filtered pseudo labels are then used for fine-tuning standard encoder-decoder Whisper models and Zipformer. Our proposed data selection pipeline reduces the dataset size by approximately 1/100th while maintaining performance comparable to the full dataset, outperforming random domain-independent selection strategies.
Twitter Meta Tags (1)
- twitter:card: summary
Link Tags (9)
- canonical: https://ieeexplore.ieee.org/document/10888138/
- icon: /assets/img/favicon.ico
- stylesheet: https://ieeexplore.ieee.org/assets/css/osano-cookie-consent-xplore.css
- stylesheet: /assets/css/simplePassMeter.min.css?cv=20250308_00000
- stylesheet: /assets/dist/ng-new/styles.css?cv=20250308_00000
Links (17)
- http://www.ieee.org/about/help/security_privacy.html
- http://www.ieee.org/web/aboutus/whatis/policies/p9-26.html
- https://dx.doi.org/Xplorehelp
- https://dx.doi.org/Xplorehelp/overview-of-ieee-xplore/about-ieee-xplore
- https://dx.doi.org/Xplorehelp/overview-of-ieee-xplore/accessibility-statement