Projects:2018s1-103 Improving Usability and User Interaction with KALDI Open-Source Speech Recogniser
Revision as of 16:39, 10 April 2018
Project Team
Students
- Shi Yik Chin
- Yasasa Saman Tennakoon
Supervisors
- Dr. Said Al-Sarawi
- Dr. Ahmad Hashemi-Sakhtsari (DST Group)
Introduction
This project aims to refine and improve the capabilities of KALDI (an Open Source Speech Recogniser). This will require:
- Improving the current GUI's flexibility
- Introducing new elements or replacing older elements in the GUI for ease of use
- Refining current Language and Acoustic model networks in the software to reduce the Word Error Rate (WER)
- Introducing a Pronunciation model network into the software to reduce the Word Error Rate (WER)
- Creating an interconnected neural network in the software to introduce Deep Learning
This project will involve Deep Learning algorithms for Automatic Speech Recognition, software development in C++, and performance evaluation using the Word Error Rate (WER) formula. Very little hardware will be involved at any stage of the project.
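As an illustration of the evaluation metric, WER is conventionally defined as (S + D + I) / N, where S, D and I are the numbers of substituted, deleted and inserted words in the recogniser output and N is the number of words in the reference transcript. The following is a minimal C++ sketch of that calculation, assuming whitespace-separated transcripts; the function names (splitWords, wordEditDistance, wordErrorRate) are hypothetical and are not part of KALDI or of this project's code.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: split a transcript into whitespace-separated words.
std::vector<std::string> splitWords(const std::string& text) {
    std::istringstream stream(text);
    std::vector<std::string> words;
    std::string word;
    while (stream >> word) {
        words.push_back(word);
    }
    return words;
}

// Minimum number of word substitutions, deletions and insertions needed
// to turn the reference into the hypothesis (word-level Levenshtein distance).
std::size_t wordEditDistance(const std::vector<std::string>& ref,
                             const std::vector<std::string>& hyp) {
    std::vector<std::size_t> prev(hyp.size() + 1);
    std::vector<std::size_t> curr(hyp.size() + 1);

    // Empty reference: every hypothesis word is an insertion.
    for (std::size_t j = 0; j <= hyp.size(); ++j) {
        prev[j] = j;
    }

    for (std::size_t i = 1; i <= ref.size(); ++i) {
        // Empty hypothesis: every reference word is a deletion.
        curr[0] = i;
        for (std::size_t j = 1; j <= hyp.size(); ++j) {
            const std::size_t substitution =
                prev[j - 1] + (ref[i - 1] == hyp[j - 1] ? 0 : 1);
            const std::size_t deletion = prev[j] + 1;
            const std::size_t insertion = curr[j - 1] + 1;
            curr[j] = std::min({substitution, deletion, insertion});
        }
        std::swap(prev, curr);
    }
    return prev[hyp.size()];
}

// WER = (S + D + I) / N, with N the number of words in the reference.
double wordErrorRate(const std::string& reference,
                     const std::string& hypothesis) {
    const std::vector<std::string> ref = splitWords(reference);
    const std::vector<std::string> hyp = splitWords(hypothesis);
    if (ref.empty()) {
        throw std::invalid_argument("reference transcript is empty");
    }
    return static_cast<double>(wordEditDistance(ref, hyp)) /
           static_cast<double>(ref.size());
}
```

For example, calling wordErrorRate("the cat sat on the mat", "the cat sat on a mat") counts one substitution against six reference words. Note that WER can exceed 1 when the hypothesis contains many insertions, and that this sketch does no text normalisation (case, punctuation), which a real evaluation would usually need.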