The Academia Sinica Balanced Corpus (Sinica Corpus) is the first proportionally sampled Chinese corpus with part-of-speech tagging. The first version, Sinica 1.0, contained two million words; it was compiled and opened to the research community through direct license in 1995 (Huang et al. 1995). After 10 years of further development, it was upgraded in 2005 to Sinica 5.0, which contains ten million words. Its online web service is available at http://asbc.iis.sinica.edu.tw. The corpus can also be a…
Cite this page
Keh-Jiann CHEN and
“Academia Sinica Balanced Corpus”, in:
Encyclopedia of Chinese Language and Linguistics, General Editor Rint Sybesma.
Consulted online on 19 October 2017 <http://dx.doi.org/10.1163/2210-7363_ecll_COM_000191>