买日旦·吾守尔, 维尼拉·木沙江. Keyword Language Identification Techniques in a Uyghur-Kazak-Kirghiz-Chinese Multilingual Dictionary (in English) [J]. Journal of Xinjiang University (Natural Science Edition in Chinese and English), 2014, 31(1): 7-11. DOI:
Keyword Language Identification Techniques in a Uyghur-Kazak-Kirghiz-Chinese Multilingual Dictionary (in English)
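Abstract (translated from Chinese)
Against the background of a Uyghur-Kazak-Kirghiz-Chinese multilingual, multi-directional dictionary, this paper points out several language-specific technical difficulties, including how to identify the writing direction and how to distinguish Uyghur, Kazak, and Kirghiz letters. For these problems it gives corresponding solutions, for example: using XML attributes and Unicode range analysis to decide the writing direction, and calculating the frequency of special letters to select user-defined fonts. Finally, experiments verify the feasibility of the approach.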
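The writing-direction step named in the abstract (XML attributes combined with Unicode range analysis) could look roughly like the sketch below. It is an illustration only, not the authors' implementation: the `dir` attribute name, the `entry` element, and the choice of Arabic-script blocks as the right-to-left ranges are all assumptions of this sketch.

```python
import xml.etree.ElementTree as ET

# Unicode blocks treated as right-to-left here (assumption of this sketch).
RTL_RANGES = [
    (0x0600, 0x06FF),  # Arabic
    (0x0750, 0x077F),  # Arabic Supplement
    (0xFB50, 0xFDFF),  # Arabic Presentation Forms-A
    (0xFE70, 0xFEFF),  # Arabic Presentation Forms-B
]


def is_rtl_char(ch):
    """True if the character's code point lies in one of the RTL ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in RTL_RANGES)


def direction_from_text(text):
    """Guess direction by comparing counts of RTL letters and other letters."""
    rtl = sum(1 for ch in text if is_rtl_char(ch))
    other = sum(1 for ch in text if ch.isalpha() and not is_rtl_char(ch))
    return "rtl" if rtl > other else "ltr"


def entry_direction(entry_xml):
    """Prefer an explicit (hypothetical) dir attribute, else analyze the text."""
    elem = ET.fromstring(entry_xml)
    declared = elem.get("dir")
    if declared in ("rtl", "ltr"):
        return declared
    return direction_from_text(elem.text or "")
```

In this sketch an explicit attribute always wins, so a dictionary editor can override the automatic guess for mixed-script keywords; the range analysis is only the fallback.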
Abstract
Taking the design of a Chinese-Uyghur-Kazak-Kirghiz multilingual, multi-directional dictionary system as background, this paper points out language-specific technical difficulties, including how to determine the writing direction and how to distinguish the letters of Uyghur, Kazak, and Kirghiz from one another. It then proposes corresponding solutions: using XML attributes and Unicode region analysis to determine the writing direction, and calculating the usage rates of letters in specific words to select user-defined fonts. Application results indicate the feasibility and validity of these solutions.
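The letter-usage idea can be illustrated with a minimal sketch: count how often letters that are characteristic of one language occur in a keyword, then pick a font for the language with the highest rate. The marker sets and font names below are hypothetical placeholders chosen for illustration (a few Arabic-script code points commonly associated with each language), not the tables used in the paper.

```python
# Illustrative marker letters only; the paper's actual letter tables are not given here.
MARKERS = {
    "uyghur": set("\u06D0\u06C8"),                   # ARABIC LETTER E, ARABIC LETTER YU
    "kazak": set("\u0674\u0675\u0676\u0677\u0678"),  # high hamza forms (U+0674 to U+0678)
    "kirghiz": set("\u06C5\u06C9"),                  # KIRGHIZ OE, KIRGHIZ YU
}

# Hypothetical font names standing in for the user-defined fonts.
FONTS = {
    "uyghur": "ExampleUyghurFont",
    "kazak": "ExampleKazakFont",
    "kirghiz": "ExampleKirghizFont",
}
DEFAULT_FONT = "ExampleDefaultFont"


def marker_rates(word):
    """Fraction of the word's characters that are marker letters, per language."""
    if not word:
        return {lang: 0.0 for lang in MARKERS}
    return {
        lang: sum(ch in letters for ch in word) / len(word)
        for lang, letters in MARKERS.items()
    }


def guess_language(word):
    """Language with the highest marker rate, or None if no marker occurs.

    Ties are resolved by dictionary order, which is arbitrary in this sketch.
    """
    rates = marker_rates(word)
    best = max(rates, key=rates.get)
    return best if rates[best] > 0 else None


def pick_font(word):
    """Map the guessed language to a font, falling back to a default."""
    return FONTS.get(guess_language(word), DEFAULT_FONT)
```

A keyword that contains none of the marker letters stays undecided in this sketch, which is why `pick_font` falls back to a default rather than forcing a guess.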
Keywords