On December 27-29, 2020, the annual meeting of the International Consortium of Chinese Mathematicians (ICCM) was held in Hefei, Anhui. Associate Professor Ke Deng of the Center for Statistical Science at Tsinghua University won the ICCM 2020 Best Paper Award (Silver Award) as first author of the paper "On the unsupervised analysis of domain-specific Chinese texts." The paper was co-authored by Associate Professor Ke Deng, Professor Peter Bol of Harvard University, Professor Jun S. Liu of Harvard University, and Associate Professor Jiayi Li of Suffolk University. It was published in the Proceedings of the National Academy of Sciences of the United States of America (PNAS).
The paper proposes TopWORDS, a new method for unsupervised Chinese text analysis built on statistical models and principles. It performs word discovery and word segmentation on Chinese texts from specific domains. The method can also be combined with other text analysis tools, such as word embedding, topic models, and association rule mining, to extract the main features and information in a text, making it an important breakthrough in the field of Chinese text mining.