Department of Acoustic and Multimedia Electronic Systems


Lab mail

[:uk]

E-mail of the Laboratory of Acoustic Expertise and Correction

This page publishes excerpts of correspondence between the Laboratory and its users.


Below is an excerpt of correspondence between Kristen, a student at an American university in the state of Colorado, and Arkadiy Mykolayovych Prodeus, professor at the Department of Acoustics and Acoustoelectronics. The correspondence is instructive because it lets our Ukrainian students see how persistently and meticulously some American students work on their master's theses.

In short, Kristen's questions reflect her effort to master topics such as the use of noise reduction algorithms, the study of their capabilities, and the assessment of speech quality with objective measures, in particular PESQ, which is widely used in communication systems to assess the quality of communication links.

October, request from Kristen to Arkadiy Prodeus

I am a graduate student in Electrical and Computer Engineering at Colorado State University. I am interested in speech signal processing and have worked on some small speech enhancement techniques, including Wiener filtering, for my Master's thesis.

The reason I am contacting you is that, while I was trying to run your code, it mentioned that we need to download the PESQ.exe file from the ITU website, but the package available on the website is an implementation in C. No executable file is downloaded from there, so I would like to ask you to send me the PESQ.exe that was used in your implementation of the PESQ algorithm.

I would be very thankful if you could send me the entire code that takes the input and provides a measure of the quality of enhanced speech. I am interested in your research area and hope to become as expert as you are.

I look forward to hearing from you soon. Thank you for your time and consideration.

October, answer from Arkadiy Prodeus to Kristen 

On the web page http://www.mathworks.com/matlabcentral/fileexchange/47333-pesq-matlab-driver you can find the following words: “If you have problems with downloading or compilation, you can try get pesq2.exe from here: https://yadi.sk/d/NwFNZ25RZDTXg”

So, click the link and download the file.

November, request from Kristen to Arkadiy Prodeus

 

Thank you for this function. I was able to resample the file to 16 kHz, but the problem I have is that the PESQ measure gives MOS and LQ scores. I am having a hard time understanding which score is the one that gives the speech quality. I would appreciate it if you could clarify that for me.

Also, I am getting scores of “WB MOS LQO = 1.360” when I compare the clean with the enhanced speech. According to the PESQ algorithm, this seems to be poor. What do you think? Is it common in speech enhancement to get such low scores, or should I work on my noise reduction algorithm?

I sincerely appreciate all your help in this matter.

November, answer from Arkadiy Prodeus to Kristen 

In the book P. Loizou, Speech Enhancement: Theory and Practice, 2013, p. 673, you can find:

“If the detected sampling frequency is 8 kHz, then it returns 2 PESQ scores, the raw PESQ score according to ITU P.862 [9] and the MOS-mapped score according to ITU P.862.1[10]. If the detected sampling frequency is 16 kHz, then it returns the MOS-mapped score according to ITU P.862.2 [11], which covers the wideband implementation of the PESQ measure.”

For the mapping rules, see pp. 502-503 (Fig. 11.14 and Eqs. (11.35)-(11.36)) of the book.

Another useful book is N. Cote, Integral and Diagnostic Intrusive Prediction of Speech Quality, p. 75.

So you need to use MOS_LQO for both NB-PESQ and WB-PESQ, because it is the mapped score…

As for your question: “I am getting scores of “WB MOS LQO = 1.360” when I compare the clean with the enhanced speech. According to the PESQ algorithm, this seems to be poor. What do you think? Is it common in speech enhancement to get such low scores or I should work on my algorithm of noise reduction?”

I think the last assumption is more probable and you should work on your algorithm…
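
The mapping from a raw PESQ score to MOS-LQO mentioned in this answer can be sketched in Python as follows. This is a minimal illustration, assuming the logistic mapping functions published in ITU-T P.862.1 (narrowband) and ITU-T P.862.2 (wideband); the function name `pesq_to_mos_lqo` is hypothetical and is not part of the PESQ Matlab driver or the ITU reference code.

```python
import math

def pesq_to_mos_lqo(raw_pesq: float, wideband: bool = False) -> float:
    """Map a raw PESQ score to MOS-LQO.

    Narrowband: logistic mapping of ITU-T P.862.1.
    Wideband:   logistic mapping of ITU-T P.862.2.
    (Hypothetical helper for illustration only.)
    """
    if wideband:
        slope, offset = -1.3669, 3.8224
    else:
        slope, offset = -1.4945, 4.6607
    return 0.999 + 4.0 / (1.0 + math.exp(slope * raw_pesq + offset))

# A raw narrowband score of 1.854 maps to about 1.524
print(round(pesq_to_mos_lqo(1.854), 3))
```

Note that the wideband mapping is meant for the raw score produced by the wideband variant of the model, so the narrowband raw score should not simply be passed with `wideband=True`.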

November, request from Kristen to Arkadiy Prodeus

Thank you so much for clarifying my doubts on PESQ scores. After I apply speech enhancement to my wav files, I resample them to 16 kHz according to your reply on resampling. After that I run PESQ, which means I am providing a 16 kHz signal to the PESQ algorithm, but I am getting two scores for NB and one score for WB.
NB PESQ MOS = 1.854
NB MOS LQO  = 1.524

WB MOS LQO  = 1.156

So, which one should I prefer?

I am trying to work on Two-Step Noise Reduction by Cyril Plapous, Claude Marro, and Pascal Scalart, which implements the algorithm in two steps: noise reduction and then harmonic regeneration.

I have attached the article and my version of the code to this email. I would appreciate it if you could have a look and let me know where I can improve this algorithm to get better PESQ scores.

I know I am asking a bit much, but I have to show some improvement in speech intelligibility. Thank you for your time and all the support in this matter.

November, answer from Arkadiy Prodeus to Kristen

> So, which one should I prefer?

NB MOS LQO = 1.524 if the original wav file was narrowband (3.5-4 kHz), and WB MOS LQO = 1.156 if it was wideband (7.5-8 kHz). As your original file had Fs = 25 kHz, it seems to me you can consider your signal wideband, so WB MOS LQO = 1.156 will be the right one.

> …let me know where I can improve this algorithm to get better PESQ scores?

Dear Kristen, it is not as simple a task as you think 🙂
Some time ago, I studied the TSNR algorithm and found that it is not as good as its authors claimed. You can find my article here; in it I compared the TSNR algorithm with the spectral subtraction, MMSE and logMMSE algorithms and found that the TSNR algorithm is worse…

Of course, the next step should be improving the algorithm, but I have no time now to solve this task. Excuse me…

November, request from Kristen to Arkadiy Prodeus

I understand your concerns about the algorithm, and this is exactly what I was suspecting from all the results I have been getting. I would be highly interested in trying other algorithms.

Your article was very good and informative; it compared the four algorithms very well. It seems from figure 3(b) that logMMSE outperformed the others and was the best among all.

I would like to investigate this kind of comparison as well with my database. Would you be able to send me the Matlab implementation of this work? I would like to see if I can use logMMSE for speech enhancement. I would really appreciate all your help and support in this matter.

November, answer from Arkadiy Prodeus to Kristen

I used Matlab programs from the VoiceBox Toolbox in my investigations: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html

First, there was the ssubmmse.m Matlab program. It contains detailed comments, so you can use it without any effort… Of course, it requires some other functions, but they are all in the Toolbox, so you can find and download them easily.

Second, the program specsub.m performs speech enhancement using spectral subtraction; I used it as well.

Third, I used the estnoisem.m algorithm for noise spectrum estimation; it is enabled by default in the programs above.

November, request from Kristen to Arkadiy Prodeus

Thank you so much for directing me to Voicebox, it was surely helpful. I was able to locate all the speech enhancement algorithms you mentioned in the email.

Now when it comes to evaluating the speech quality for these techniques, I would use PESQ for speech quality assessment, right? I had another question relating to PESQ, from the code that you had provided to me, does it compute any correlation coefficients?

Also, I read in some of the articles that PESQ scores are computed using disturbance values and average asymmetric disturbance values using PESQ = a0 – a1·Dind – a2·Aind. Is this accurate? If so, where can I find these constants a0, a1 and a2 in your Matlab code (pesq2_mtlb.m file)?

November, answer from Arkadiy Prodeus to Kristen

> I had another question relating to PESQ, from the code that you had provided to me, does it compute any correlation coefficients?

No, the PESQ assessment program does not compute any correlation coefficients.

> Also, I read in some of the articles that PESQ scores are computed using disturbance values and average asymmetric disturbance values using PESQ = a0 – a1·Dind – a2·Aind. Is this accurate? If so, where can I find these constants a0, a1 and a2 in your Matlab code (pesq2_mtlb.m file)?

The PESQ source code was downloaded from the ITU-T website. So, I think, it is a “pure” PESQ score, without any correction coefficients. You can use it “as is” and state this in your report.
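
For reference, the linear combination asked about in this exchange can be written out as a short Python sketch. It assumes the constants given in ITU-T P.862 (a0 = 4.5, a1 = 0.1, a2 = 0.0309); the function and argument names are illustrative and are not taken from pesq2_mtlb.m or from the ITU reference code, where the disturbance values themselves come out of the full perceptual model.

```python
def raw_pesq_score(d_sym: float, d_asym: float) -> float:
    """Combine the average symmetric and asymmetric disturbances
    into a raw PESQ score using the constants stated in ITU-T P.862.

    d_sym  -- average symmetric disturbance (Dind in the question above)
    d_asym -- average asymmetric disturbance (Aind in the question above)
    """
    a0, a1, a2 = 4.5, 0.1, 0.0309
    return a0 - a1 * d_sym - a2 * d_asym

# With no disturbance at all, the raw score reaches its maximum of 4.5
print(raw_pesq_score(0.0, 0.0))
```

Under this reading, the constants would live inside the compiled PESQ program rather than in a Matlab wrapper that only calls the executable, which may be why they are not visible in the .m file.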

November, request from Kristen to Arkadiy Prodeus

Thank you for your feedback, I appreciate it. Your response has clarified my doubts about using the results from PESQ. Also, I was reviewing the setup for PESQ in the case of noise suppression algorithms (ITU-T P.835), and it seems that the clean (reference) and enhanced (degraded) signals need to be processed in order to be used by PESQ. Do we need to perform any preprocessing (see figure I.1/P.835 in ITU-T P.835) before using PESQ for speech quality?

As of now, I am just adding 0 dB, 5 dB, 10 dB noise (SSN or babble) to the clean speech, performing noise reduction to obtain enhanced speech, and then comparing it with the clean speech using your code in Matlab. Is there anything I am missing here? Please clarify.
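
The noise-mixing step described in this request can be sketched as follows. This is a minimal Python illustration, assuming that the 0 dB, 5 dB and 10 dB figures denote signal-to-noise ratios and that signals are plain lists of samples; `mix_at_snr` is a hypothetical helper, not a function from the Matlab code discussed in the correspondence.

```python
import math
import random

def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so that the clean-to-noise power ratio equals
    `snr_db` (in dB), then add it to `clean` sample by sample."""
    n = min(len(clean), len(noise))
    clean, noise = clean[:n], noise[:n]
    p_clean = sum(x * x for x in clean) / n
    p_noise = sum(x * x for x in noise) / n
    # Gain that brings the noise power to p_clean / 10^(snr_db / 10)
    gain = math.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return [c + gain * v for c, v in zip(clean, noise)]

# Example: a synthetic tone plus white Gaussian noise at 5 dB SNR
random.seed(0)
clean = [math.sin(0.05 * i) for i in range(2000)]
noise = [random.gauss(0.0, 1.0) for _ in range(2000)]
noisy = mix_at_snr(clean, noise, 5.0)
```

The noisy signal would then go through the noise reduction algorithm, and the enhanced output would be compared with the clean reference.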

November, answer from Arkadiy Prodeus to Kristen

First, as for figure I.1/P.835 in ITU-T P.835, “Reference condition: SNR constant, MNRU varies”. I think it isn’t valid for you because: “…The Modulated Noise Reference Unit (MNRU) is a reference condition described in ITU–T Rec. P.810 (1996) which simulates quantizing noise produced by logarithmic PCM technique, e.g. ITU–T Rec. G.726 (1990). In this specific case the noise is correlated to the speech signal. The MNRUs are used quite extensively in the assessment of speech codecs.” (Cote N., Integral and Diagnostic Intrusive Prediction of Speech Quality, 2011, p. 219).

In my opinion, you can forget about MNRU, because the quality of the noise reduction algorithm (NRA) is the subject of your master’s thesis.

Another question: “do we need to perform any preprocessing”.

In my view, NO, because the PESQ algorithm itself performs power normalization and time alignment of the degraded and reference signals.
But you do need preprocessing such as power normalization and time alignment of the degraded and reference signals when you use other quality measures.
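
The preprocessing mentioned for other quality measures can be sketched in Python as follows. This is only an illustration under simple assumptions: power normalization is done by scaling to a common RMS level, and time alignment by picking the lag that maximizes the cross-correlation within a small search window. All function names and the default values (`target_rms`, `max_lag`) are hypothetical, and this is not the alignment procedure used inside PESQ.

```python
import math

def rms_normalize(x, target_rms=0.1):
    """Scale a signal so that its RMS level equals target_rms."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return [v * target_rms / rms for v in x] if rms > 0 else list(x)

def estimate_delay(ref, deg, max_lag):
    """Return the lag (in samples) by which `deg` is delayed relative
    to `ref`, chosen to maximize the cross-correlation."""
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(deg):
                acc += r * deg[j]
        if acc > best_val:
            best_lag, best_val = lag, acc
    return best_lag

def align(ref, deg, max_lag=200):
    """Trim the leading samples so that both signals start together
    and have equal length."""
    lag = estimate_delay(ref, deg, max_lag)
    if lag > 0:
        deg = deg[lag:]
    elif lag < 0:
        ref = ref[-lag:]
    n = min(len(ref), len(deg))
    return ref[:n], deg[:n]
```

A typical use would be `ref_a, deg_a = align(ref, deg)` followed by `rms_normalize` on both signals before computing the chosen measure. The brute-force search is quadratic in the window size, so for long recordings an FFT-based correlation would be the more practical choice.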

 

 


 

Here is one more example of correspondence, this time with a student from Iran.

November, request from Arash to Arkadiy Prodeus

Theme: Sensitivity of Automatic Speech Recognition to Excessive Noise and Late Reverberation Reduction

Hello,
My name is Arash Farhani, and I am currently an MS student at the University of Tehran. I am working on your paper mentioned in the subject of this email. I would like to simulate your paper to understand it better.
I would be grateful if you could give me your code.

November, answer from Arkadiy Prodeus to Arash

Below you can see links to the relevant sources.
1. D. Ellis, “PLP and RASTA (and MFCC, and inversion) in Matlab,” [Online]. Available: http://www.ee.columbia.edu/~dpwe/resources/matlab/rastamat/ – some of the RASTAMAT programs were used for BSD assessment. My version of the program set for BSD calculation is attached to this letter.
2. M. Brooks, “VOICEBOX: Speech Processing Toolbox for MATLAB,” Imperial College London, Electrical Engineering Department, 2014. [Online]. Available: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html – the programs for noise reduction were taken from this resource.
3. A. Prodeus, “PESQ Matlab driver.” [Online]. Available: https://www.mathworks.com/matlabcentral/fileexchange/47333-pesq-matlab-driver
4. P. Scalart, “Wiener Noise Suppressor based on Decision-Directed method with TSNR and HRNR algorithms.” [Online]. Available: https://www.mathworks.com/matlabcentral/fileexchange/24462-wiener-filter-for-noise-reduction-and-speech-enhancement – these programs were used for the TSNR and HRNR algorithms.
5. P. Loizou, “Matlab Software. PESQ and other objective measures for evaluating quality of speech.” [Online]. Available: http://ecs.utdallas.edu/loizou/speech/software.htm – here you can find a lot of programs for speech quality assessment.
6. Modelling of ASR systems. Five lessons on HTK. [Online]. Available: http://speech.com.ua/htk_course.html – I used the 1st and 2nd lessons from this resource for ASR system modelling. Sorry, but it is in Russian.

[:]

Пошук

Архіви

  • Березень 2026
  • Жовтень 2025
  • Травень 2025
  • Квітень 2025
  • Березень 2025
  • Лютий 2025
  • Квітень 2024
  • Березень 2024
  • Лютий 2024
  • Січень 2024
  • Жовтень 2023
  • Вересень 2023
  • Липень 2023
  • Червень 2023
  • Травень 2023
  • Березень 2023
  • Лютий 2023
  • Січень 2023
  • Грудень 2022
  • Жовтень 2022
  • Вересень 2022
  • Серпень 2022
  • Липень 2022
  • Червень 2022
  • Травень 2022
  • Квітень 2022
  • Березень 2022
  • Лютий 2022
  • Січень 2022
  • Листопад 2021
  • Жовтень 2021
  • Вересень 2021
  • Серпень 2021
  • Травень 2021
  • Квітень 2021
  • Березень 2021
  • Січень 2021
  • Грудень 2020
  • Листопад 2020
  • Жовтень 2020
  • Вересень 2020
  • Серпень 2020
  • Липень 2020
  • Червень 2020
  • Травень 2020
  • Квітень 2020
  • Березень 2020
  • Лютий 2020
  • Січень 2020
  • Грудень 2019
  • Жовтень 2019
  • Вересень 2019
  • Липень 2019
  • Червень 2019
  • Травень 2019
  • Квітень 2019
  • Лютий 2019
  • Січень 2019
  • Грудень 2018
  • Липень 2018
  • Червень 2018
  • Травень 2018
  • Квітень 2018
  • Березень 2018
  • Лютий 2018
  • Січень 2018
  • Січень 2017
  • Січень 2016
  • Січень 2015
  • Січень 2014
  • Січень 2013

Мета

  • Увійти


