dc.contributor.author |
Punchimudiyanse, M. |
|
dc.contributor.author |
Meegama, R.G.N. |
|
dc.date.accessioned |
2017-10-23T09:06:55Z |
|
dc.date.available |
2017-10-23T09:06:55Z |
|
dc.date.issued |
2015 |
|
dc.identifier.citation |
Punchimudiyanse, M., Meegama, R.G.N. (2015). "Unicode Sinhala and Phonetic English Bi-directional Conversion for Sinhala Speech Recognizer", IEEE International Conference on Industrial and Information Systems 2015, pp. 01-06 |
en_US, si_LK |
dc.identifier.uri |
http://dr.lib.sjp.ac.lk/handle/123456789/6035 |
|
dc.description.abstract |
Attached |
en_US, si_LK |
dc.description.abstract |
An automated speech recognizer (ASR) with a large vocabulary has
yet to be developed for the Sinhala language because gathering the
training data needed to build a language model is time-consuming.
Building the dictionary and the language model requires non-English
text, in our case Sinhala Unicode, to be transcribed into phonetic
English text. Unlike text-to-speech conversion, which only requires
transcribing the non-English text into phonetic English text, an ASR
must correctly reproduce the original-language text when phonetic
English text is produced as the output of the speech recognizer.
In the present research, newspaper articles are used to gather a
large set of sentences to build a language model of thousands of
words for the Sphinx ASR. We present a decoder algorithm that
produces phonetic English text from Sinhala Unicode text and an
encoder algorithm that correctly reproduces Unicode Sinhala text
from phonetic English. For a near-phonetic tag set for the Sinhala
alphabet, results indicate 100% accuracy for the decoder algorithm,
while for numberless text the accuracy of the encoder algorithm
stands at 98.61% for distinct phonetic English words. |
|
dc.language.iso |
en_US |
en_US, si_LK |
dc.publisher |
IEEE International Conference on Industrial and Information Systems 2015 |
en_US, si_LK |
dc.subject |
Sinhala to Phonetic English |
en_US, si_LK |
dc.subject |
Phonetic English to Sinhala |
en_US, si_LK |
dc.subject |
Sinhala ASR |
en_US, si_LK |
dc.subject |
Sinhala Phonetic Tag set |
en_US, si_LK |
dc.subject |
Sphinx |
en_US, si_LK |
dc.title |
Unicode Sinhala and Phonetic English Bi-directional Conversion for Sinhala Speech Recognizer |
en_US, si_LK |
dc.type |
Article |
en_US, si_LK |