Projects:2019s2-24101 Improving Usability and User Interaction with KALDI Open-Source Speech Recogniser


Latest revision as of 04:18, 7 October 2019

Introduction

This project aims to reduce the word error rate (WER) of audio transcription with KALDI. KALDI transcribes audio using various language and acoustic models. Previous work has reduced the word error rate by improving the language and acoustic models. Our project focuses instead on improving the quality of the input signal through audio processing with the software HARK. HARK is based on missing feature theory and reduces noise using various algorithms.
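For readers unfamiliar with the metric, word error rate is conventionally defined as the number of word substitutions, deletions and insertions needed to turn the recogniser output into the reference transcript, divided by the number of words in the reference. The following is a minimal sketch of that definition in Python. The function name word_error_rate and the example sentences are assumptions made for illustration; this is not the project's evaluation code and not KALDI's own scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word-level edit distance divided by the reference length.

    Illustrative helper (hypothetical name), not part of KALDI or HARK.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions are needed against an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word is missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)
```

As an example, with the reference "the cat sat on the mat" and the hypothesis "the cat sit on mat" there is one substitution and one deletion against six reference words, so the function returns 2/6, about 0.33. A lower value means a better transcription.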


Background

Speech transcription is the process by which computer hardware and software transform an audio input into a text output. It increases the accessibility and understandability of voice recordings and other audio material for many different use cases.


Project students

Riya Parth Dube

Pengyue Song


Supervisors

Dr Said Al-Sarawi

DSTG Dr Ahmad Hashemi-Sakhtsari


Method


Results


Conclusion


References

1 The HARK Documentation. [Online] Available at: https://www.hark.jp/document/3.0.0/hark-document-en/sect0002.html

2 The KALDI Decoder. [Online] Available at: https://kaldi-asr.org/

3 D. Povey, “KALDI – Home”, KALDI, n.d. [Online] Available at: http://kaldi-asr.org/ [Accessed: September 2019].