Line 1: |
Line 1: |
− | The project aims to enable users to access the functionality of KALDI (http://kaldi.sourceforge.net/about.html) without
| + | |
− | knowledge of scripting in a language like Bash or detailed knowledge of KALDI's internal algorithms. Furthermore,
| |
− | attempts will be made to transcribe live audio speech continuously.
| |
− | Project Proposals: The proposal consists of two parts. The first part focuses on improving usability and user
| |
− | interaction with KALDI through a GUI that has the following features:
| |
− | • A soft ON/OFF switch for the microphone.
| |
− | • Operation that requires minimal scripting knowledge or commands.
| |
− | • Provide users the ability to select acoustic and language models of their choice. This can be done by allowing
| |
− | the users either to select one of the pre-trained models or to perform their own acoustic and language model
| |
− | training in order to subsequently use those models.
| |
− | • Allow the user to choose between transcribing continuous live speech input and recorded audio. Recording
| |
− | audio from the speaker during live input allows the audio to be played back in order to correct errors in the
| |
− | transcript.
| |
− | • Isolating Utterance/Speaker ID and Speaker ID/Utterance pairs from decoded results for later analysis of
| |
− | the recognition performance of each user. This process also allows a plain transcript, free from labels and
| |
− | indices, to be produced for each user.
| |
− | • A facility whereby a user can improve her/his recognition performance with KALDI through user-adaptive
| |
− | training, i.e. by saving changes to her/his acoustic model after each decoding session.
| |
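The isolation of Utterance/Speaker ID pairs and the production of label-free transcripts described in the feature list could be sketched roughly as follows. This is a minimal illustration, not the project's implementation. It assumes the decoded results are available as text lines of the form `utt_id word word ...` and that an utterance-to-speaker mapping is available as lines of the form `utt_id spk_id` (the layout of KALDI's `utt2spk` file). The function names and the sample IDs are hypothetical.

```python
from collections import defaultdict


def parse_utt2spk(lines):
    """Build an utterance ID -> speaker ID map from 'utt_id spk_id' lines."""
    utt2spk = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            utt2spk[parts[0]] = parts[1]
    return utt2spk


def invert_to_spk2utt(utt2spk):
    """Build the reverse map: speaker ID -> list of utterance IDs."""
    spk2utt = defaultdict(list)
    for utt_id, spk_id in utt2spk.items():
        spk2utt[spk_id].append(utt_id)
    return dict(spk2utt)


def plain_transcripts(decoded_lines, utt2spk):
    """Group decoded text per speaker and drop the utterance IDs.

    Assumes each decoded line looks like 'utt_id word word ...'.
    Utterances with no known speaker or no text are ignored.
    """
    per_speaker = defaultdict(list)
    for line in decoded_lines:
        parts = line.split(maxsplit=1)
        if len(parts) < 2:
            continue
        utt_id, text = parts[0], parts[1].strip()
        spk_id = utt2spk.get(utt_id)
        if spk_id is not None and text:
            per_speaker[spk_id].append(text)
    return {spk: " ".join(texts) for spk, texts in per_speaker.items()}


# Hypothetical sample data, for illustration only.
utt2spk_lines = ["alice-utt1 alice", "alice-utt2 alice", "bob-utt1 bob"]
decoded_lines = [
    "alice-utt1 hello world",
    "bob-utt1 good morning",
    "alice-utt2 how are you",
]

utt2spk = parse_utt2spk(utt2spk_lines)
spk2utt = invert_to_spk2utt(utt2spk)
transcripts = plain_transcripts(decoded_lines, utt2spk)
```

Keeping both mappings lets the GUI look up the speaker of a given utterance and list all utterances of a given speaker, which is what a per-user analysis of recognition performance would need.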
− | The second part reports the project outcomes by:
| |
− | • Documenting the developed graphical user interface design and functionality for KALDI including the
| |
− | processes for selecting acoustic and language models, and incorporating online decoding features.
| |
− | • Documenting the results of evaluation studies related to the usability of the new GUI design.
| |
− | • Presenting the work to interested staff in the Intelligence Analytics Branch of DST Group.
| |