Hierarchical Tag-set for Rule-based Processing of Tamil Language

dc.contributor.author	Sarveswaran, Kengatharaiyer
dc.contributor.author	Mahesan, Sinnathamby
dc.date.accessioned	2016-10-25T07:03:08Z
dc.date.available	2016-10-25T07:03:08Z
dc.date.issued	2016-10-25T07:03:08Z
dc.identifier.citation	Sarveswaran, K., & Mahesan, S. (2014). Hierarchical Tag-set for Rule-based Processing of Tamil Language. International Journal of Multidisciplinary Studies (IJMS), 1(2), 67-74.
dc.identifier.uri	http://dr.lib.sjp.ac.lk/handle/123456789/3307
dc.description.abstract	Corpora are fundamental tools for Natural Language Processing. Part-of-speech tagging adds meaning to a corpus by annotating its words. A tag-set used to annotate a corpus should be selected so that it represents the grammatical structure of the respective language. Such tag-sets can be flat or hierarchical in structure. Several efforts have been made to identify a tag-set for the Tamil language. However, existing tag-sets have many shortcomings, including the inability to tag all words, the inability to capture required syntactic information such as divisibility, too many tags in a set, a flat tag structure, and a lack of extensibility. The scholarly works Tolkāppiyam and Naṉṉūl clearly show the grammatical classification of words. Taking the existing limitations into account and drawing on Tamil grammar, this paper proposes a new hierarchical tag-set with 10 labels for the Tamil language, with a view to developing a morphological analyser. The morphological analyser can be used to extend the proposed tag-set easily with more grammatical information.	en_US
dc.language.iso	en	en_US
dc.subject	POS tagging	en_US
dc.subject	Tag-set	en_US
dc.subject	Morphological analyser	en_US
dc.subject	Tamil grammar	en_US
dc.title	Hierarchical Tag-set for Rule-based Processing of Tamil Language	en_US
dc.type	Article	en_US
dc.date.published	2014