Projects:2018s1-103 Improving Usability and User Interaction with KALDI Open-Source Speech Recogniser
Project Team
Students
- Shi Yik Chin
- Yasasa Saman Tennakoon
Supervisors
- Dr. Said Al-Sarawi
- Dr. Ahmad Hashemi-Sakhtsari (DST Group)
Introduction
This project aims to refine and improve the capabilities of KALDI, an open-source speech recogniser. This will require:
- Improving the current GUI's flexibility
- Introducing new elements or replacing older elements in the GUI for ease of use
- Refining current Language and Acoustic model networks in the software to reduce the Word Error Rate (WER)
- Introducing a Pronunciation model network into the software to reduce the WER
- Creating an interconnected neural network in the software to introduce Deep Learning
This project will involve Deep Learning algorithms for Automatic Speech Recognition, software development in C++, and performance evaluation using the Word Error Rate formula. Very little hardware will be involved at any stage of the project.
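For reference, the Word Error Rate is conventionally defined as WER = (S + D + I) / N, where S, D and I are the numbers of substituted, deleted and inserted words needed to turn the reference transcript into the recogniser's output, and N is the number of words in the reference. The sketch below is a minimal illustration of that formula using a word-level edit distance. It is written in Python for brevity and is not the scoring tool used by the project; the function name `word_error_rate` and the whitespace tokenisation are assumptions made for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (S + D + I) / N for two whitespace-tokenised transcripts.

    Illustrative helper only: the name and tokenisation are assumptions,
    not part of KALDI or of the project's software.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    # The final cell is the minimum total of S + D + I.
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("b" -> "x") and one deletion ("d") over 4 words.
    print(word_error_rate("a b c d", "a x c"))
```

Because insertions are counted against the length of the reference, the WER can exceed 1.0 when the hypothesis is much longer than the reference.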