Industrial Engineering Journal ›› 2023, Vol. 26 ›› Issue (3): 86-94. doi: 10.3969/j.issn.1007-7375.2023.03.010

• System Modeling and Optimization Methods •

Identification Method of Product Quality Problems Based on Two-view Semi-supervised Learning

YAO Chi1,2, PAN Ershun1,2

  1. School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China;
    2. Chinese Institute for Quality Research, Shanghai Jiao Tong University, Shanghai 200240, China
  • Received: 2021-10-11 Published: 2023-07-08
  • Corresponding author: PAN Ershun (born 1972), male, from Jiangsu Province, professor, Ph.D.; main research interests: reliability engineering and macro-level quality research.
  • About the author: YAO Chi (born 1997), female, from Sichuan Province, master's student; main research interests: quality management and text mining.
  • Funding:
    Major Consulting Project of the Chinese Academy of Engineering (2021-HYZD-7-3)

Abstract: E-commerce websites host large volumes of unstructured, unlabeled consumer review text. A two-view semi-supervised learning method is proposed to classify these reviews and identify those that involve product quality problems, so as to uncover hidden product quality defects and hazards. Lexical, sentiment, and domain features are considered together to construct a text-feature view and a non-text-feature view, and the Co-training algorithm is used to classify reviews according to whether they involve quality problems. Taking electric kettles as an example, reviews were crawled from an e-commerce website for empirical analysis. The results show that the proposed method achieves an F1 score of 82.18% and an AUC of 86.24%, a significant improvement over single-view supervised learning classifiers.

Key words: review classification, multi-view learning, semi-supervised learning, co-training, quality problem identification
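
The co-training idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid base learner, the function and variable names (`CentroidClassifier`, `co_train`, `view1`, `view2`), the confidence measure, and the toy feature values are all assumptions made for illustration. The sketch also assumes the seed set contains at least one labeled review of each class.

```python
from statistics import fmean


class CentroidClassifier:
    """Tiny binary nearest-centroid learner (a stand-in for any base classifier)."""

    def fit(self, X, y):
        # One centroid per class; assumes both classes appear in y.
        self.centroids = {}
        for label in (0, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [fmean(col) for col in zip(*rows)]
        return self

    def _dist(self, x, label):
        return sum((a - b) ** 2 for a, b in zip(x, self.centroids[label])) ** 0.5

    def predict_with_margin(self, x):
        # Confidence is the gap between the distances to the two centroids.
        d0, d1 = self._dist(x, 0), self._dist(x, 1)
        return (1 if d1 < d0 else 0), abs(d0 - d1)


def co_train(view1, view2, labels, rounds=5, per_round=2):
    """labels[i] is 0/1 for labeled reviews and None for unlabeled ones."""
    labels = list(labels)
    h1, h2 = CentroidClassifier(), CentroidClassifier()

    def refit():
        idx = [i for i, t in enumerate(labels) if t is not None]
        y = [labels[i] for i in idx]
        h1.fit([view1[i] for i in idx], y)
        h2.fit([view2[i] for i in idx], y)

    refit()
    for _ in range(rounds):
        if all(t is not None for t in labels):
            break
        # Each view's classifier pseudo-labels its most confident unlabeled
        # reviews; those labels then enlarge the training set of both views.
        for clf, view in ((h1, view1), (h2, view2)):
            pool = [i for i, t in enumerate(labels) if t is None]
            scored = sorted(
                ((clf.predict_with_margin(view[i]), i) for i in pool),
                key=lambda s: s[0][1],
                reverse=True,
            )
            for (pred, _margin), i in scored[:per_round]:
                labels[i] = pred
        refit()
    return h1, h2, labels


# Hypothetical toy data: view1 mimics text features (e.g. counts of defect and
# negative words), view2 mimics non-text features (e.g. star rating, scaled length).
view1 = [[3.0, 2.0], [0.0, 0.0], [2.5, 2.2], [0.2, 0.1],
         [3.2, 1.8], [0.1, 0.4], [2.8, 2.5], [0.3, 0.0]]
view2 = [[1.0, 0.9], [5.0, 0.1], [2.0, 0.8], [4.0, 0.2],
         [1.0, 0.7], [5.0, 0.3], [2.0, 0.9], [4.0, 0.1]]
seed_labels = [1, 0, None, None, None, None, None, None]  # 1 = quality problem
truth = [1, 0, 1, 0, 1, 0, 1, 0]

h1, h2, final_labels = co_train(view1, view2, seed_labels)
print(final_labels)
```

In this sketch each classifier sees only its own view, and the two views exchange information solely through the pseudo-labels they add to the shared label list. A practical version would replace the centroid learner with a stronger classifier and cap pseudo-labeling by a confidence threshold.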

CLC number: