

School of Information Science and Engineering, Xinjiang University
Published:2009
[1] 古丽拉·阿东别克, 达吾勒·阿布都哈依尔, 木合亚提·尼亚孜别克, et al. Research on the construction of a word-level annotated corpus of modern Kazakh (现代哈萨克语词级标注语料库的构建研究) [J]. Journal of Xinjiang University (Natural Science Edition in Chinese and English), 2009, 26(4): 394-401+505.
Building a high-quality annotated corpus is foundational work for natural language information processing of the modern Kazakh language (KML). Based on the distinctive linguistic features of Kazakh, this paper studies the construction of a word-level annotated Kazakh corpus. It first reviews the state of corpus research for different languages, in China and abroad. It then addresses the main issues involved in corpus construction, including integrated morphological processing, affix segmentation tagging, and part-of-speech (POS) tagging, and carries out the design and construction of a basic word-level corpus of KML.