In this thesis, we present an approach to building a feature-based opinion mining system for customer reviews, using Vietnamese syntax rules and the VietSentiWordNet dictionary. The system works in four phases: (1) pre-processing; (2) extracting explicit and implicit product features and opinion words, and grouping synonymous product features; (3) identifying opinion orientation; and (4) summarizing the results. The thesis makes three main contributions. First, in phase 1, we build a Vietnamese accent restoration system that combines an N-gram statistical model with a Hidden Markov Model (HMM) to convert a sentence written without accents into a correctly accented Vietnamese sentence. Second, in phase 2, we construct a mapping dictionary that identifies implicit features by mapping them to their corresponding opinion words; we propose a method for grouping synonymous features with SVM-kNN semi-supervised learning, in which hierarchical agglomerative clustering (HAC) generates the training set for SVM-kNN; and we then resolve co-reference using a set of Vietnamese-specific rules.
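The accent restoration step in phase 1 can be illustrated with a small sketch. The abstract does not say how the N-gram model and the HMM are combined, so the code below makes its own assumptions: accented words are the hidden states, unaccented words are the observations, emissions are deterministic (an accented word always yields its stripped form), and transitions are add-one-smoothed bigram probabilities decoded with Viterbi. The function names (`strip_accents`, `train`, `restore_accents`) and the toy corpus are hypothetical and are not taken from the thesis.

```python
import math
import unicodedata
from collections import Counter, defaultdict


def strip_accents(word):
    """Remove Vietnamese diacritics, e.g. 'điện' -> 'dien'."""
    # 'đ' has no canonical decomposition, so it is mapped by hand.
    word = word.replace("đ", "d").replace("Đ", "D")
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def train(corpus):
    """Collect bigram counts and the accented candidates of each stripped form."""
    contexts, bigrams = Counter(), Counter()
    candidates = defaultdict(set)
    for sentence in corpus:
        sentence = unicodedata.normalize("NFC", sentence.lower())
        words = ["<s>"] + sentence.split()
        for prev, cur in zip(words, words[1:]):
            contexts[prev] += 1          # how often prev starts a bigram
            bigrams[(prev, cur)] += 1
            candidates[strip_accents(cur)].add(cur)
    return contexts, bigrams, candidates


def restore_accents(sentence, model):
    """Viterbi search for the most probable accented sequence (assumed setup)."""
    contexts, bigrams, candidates = model
    vocab = sum(len(forms) for forms in candidates.values())

    def log_p(prev, cur):
        # Add-one smoothed bigram probability P(cur | prev).
        return math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab))

    best = {"<s>": (0.0, [])}  # state -> (log score, best path ending there)
    for token in sentence.lower().split():
        # Unknown tokens are passed through unchanged.
        options = candidates.get(token, {token})
        step = {}
        for cur in options:
            score, path = max(
                ((s + log_p(prev, cur), p) for prev, (s, p) in best.items()),
                key=lambda item: item[0],
            )
            step[cur] = (score, path + [cur])
        best = step
    return " ".join(max(best.values(), key=lambda item: item[0])[1])


# Toy training data, invented for illustration only.
TOY_CORPUS = [
    "tôi muốn mua điện thoại",
    "tôi mua điện thoại này",
    "trời mưa rất to",
]
model = train(TOY_CORPUS)
print(restore_accents("toi muon mua dien thoai", model))
print(restore_accents("troi mua rat to", model))
```

In this toy setup the stripped form `mua` has two accented candidates, `mua` (buy) and `mưa` (rain), and the bigram context decides between them. A real system would need a large corpus and a stronger smoothing scheme than add-one.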