
Could someone kindly help translate this?


A Multi-Classifier Based Guideline Sentence Classification System
A 10-fold cross validation was carried out using 340 sentences. Although the typical method is to split the data into training, testing, and validation sets, n-fold cross validation is preferred when the available data are highly limited in size. We therefore split the data into 10 subsets and, after training each model on the remaining data, tested on each held-out subset, averaging precision, recall, and F-measure across folds. The four classes contained equal numbers of sentences, i.e. 85 instances per class, and each classifier was trained on a set of 340 feature vectors.
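The evaluation protocol above can be sketched as follows. This is a minimal illustration only: a toy nearest-centroid learner stands in for the paper's actual base classifiers, the stratified dealing of indices mirrors the equal per-class ratio described above, and all names and data are hypothetical.

```python
import random
from statistics import mean

def nearest_centroid_fit(X, y):
    """Toy stand-in for a base learner: one centroid per class."""
    cents = {}
    for c in set(y):
        rows = [x for x, lab in zip(X, y) if lab == c]
        cents[c] = [mean(col) for col in zip(*rows)]
    return cents

def nearest_centroid_predict(cents, x):
    return min(cents, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, cents[c])))

def macro_prf(y_true, y_pred, classes):
    """Macro-averaged precision, recall, and F-measure."""
    ps, rs, fs = [], [], []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        ps.append(prec); rs.append(rec); fs.append(f)
    return mean(ps), mean(rs), mean(fs)

def ten_fold_cv(X, y, k=10, seed=0):
    """Stratified k-fold CV: deal each class's indices round-robin across folds,
    train on the other k-1 folds, and average precision/recall/F over folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    by_class, pos = {}, 0
    for i, lab in enumerate(y):
        by_class.setdefault(lab, []).append(i)
    for ids in by_class.values():
        rng.shuffle(ids)
        for i in ids:
            folds[pos % k].append(i)
            pos += 1
    classes = sorted(set(y))
    scores = []
    for held_out in folds:
        train = [i for i in range(len(X)) if i not in set(held_out)]
        model = nearest_centroid_fit([X[i] for i in train], [y[i] for i in train])
        preds = [nearest_centroid_predict(model, X[i]) for i in held_out]
        scores.append(macro_prf([y[i] for i in held_out], preds, classes))
    return tuple(mean(s[d] for s in scores) for d in range(3))
```

On well-separated synthetic clusters this returns macro precision, recall, and F-measure close to 1.0; the paper's reported figures come from its own feature vectors and learners, not from this sketch.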
Contrary to our initial intuition, the F-measure of every classifier, given in Table 4, was over 0.78. The performance of Naïve Bayes was somewhat lower than that of the other algorithms, probably because its independence assumption does not fit the non-linear nature inherent in textual object generation. Since this research presupposed that these base learners would yield weak classification models that would have to be combined sequentially by boosting to learn a strong classifier under a weighted voting scheme, the result above simply obviated the need for a meta-algorithm (Table 5).
Since weak classifiers are generally defined as being only slightly more accurate than a random classifier, whose accuracy is below 50%, we concluded that even boosting Naïve Bayes classifiers would not bring a distinct improvement. Table 6 shows the actual test, in which we sequentially combined Naïve Bayes classifiers using the AdaBoost.M1 meta-algorithm.
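AdaBoost.M1's sequential reweighting, mentioned above, can be sketched as follows. This is a simplified binary illustration with one-dimensional decision stumps standing in for the Naïve Bayes base learners; it is not the paper's implementation, and it follows the classic scheme of shrinking the weights of correctly classified examples by β = ε/(1−ε) and weighting each hypothesis's vote by log(1/β).

```python
import math

def fit_stump(xs, ys, w):
    """Weighted 1-D decision stump: pick the threshold and polarity that
    minimize the weighted training error."""
    best = None
    for thr in sorted(set(xs)):
        for pol in (1, -1):
            preds = [pol if x >= thr else -pol for x in xs]
            err = sum(wi for wi, p, y in zip(w, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, thr, pol)
    return best  # (weighted error, threshold, polarity)

def stump_predict(thr, pol, x):
    return pol if x >= thr else -pol

def adaboost_m1(xs, ys, rounds=10):
    """AdaBoost.M1 over decision stumps; labels must be +1/-1."""
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []  # list of (alpha, threshold, polarity)
    for _ in range(rounds):
        err, thr, pol = fit_stump(xs, ys, w)
        if err >= 0.5:          # weak-learner condition violated: stop
            break
        err = max(err, 1e-10)   # avoid division by zero on a perfect stump
        beta = err / (1.0 - err)
        ensemble.append((math.log(1.0 / beta), thr, pol))
        # shrink the weights of correctly classified examples, then renormalize
        w = [wi * (beta if stump_predict(thr, pol, x) == y else 1.0)
             for wi, x, y in zip(w, xs, ys)]
        s = sum(w)
        w = [wi / s for wi in w]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of the boosted stumps."""
    vote = sum(alpha * stump_predict(thr, pol, x) for alpha, thr, pol in ensemble)
    return 1 if vote >= 0 else -1
```

The early stop at ε ≥ 0.5 is exactly the "slightly better than random" requirement discussed above: a base learner that cannot beat chance on the reweighted data contributes nothing to the ensemble.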
As given in Table 6, the improvement of the Naïve Bayes classifiers combined by AdaBoost.M1 was not sufficiently conspicuous. This is because the feature extractors' transformation of the original feature space into a lower dimension may have reduced the complexity and non-linearity of the data, decreasing the spread of the scatter and the number of outliers. In fact, boosting a neural-network algorithm such as the multi-layer perceptron or the radial basis function network with AdaBoost.M1 brought no increase in accuracy.
The aims of this research were to apply transformation to tackle the problem of dimensionality, and to increase classification accuracy by applying a boosting algorithm to learn a strong classifier robust to outliers and to the non-linear characteristics of the data. The second aim turns out to be unnecessary once the curse of dimensionality is resolved and robust classification algorithms such as a multi-layer perceptron or a radial basis function network are adopted.
Moreover, we found that transformation has the advantage of exploiting structural and underlying features that go unseen by the BOW model. We also found that integrating a sentential classifier with a TF-IDF-based search engine enhances the search process by maximizing the probability of automatically presenting the relevant information required in the context generated in the guideline authoring environment. This, however, has the disadvantage of increasing the total time required to parse and classify the sentences of a document at runtime. Our future study will therefore focus on eliminating the slow parsing step when extracting structural features from a textual object.
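The TF-IDF retrieval component mentioned above can be sketched roughly as follows. The tokenization (lowercased whitespace splitting), the smoothed IDF formula, and the example sentences are all assumptions made for illustration, not the system's actual implementation.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (smoothed IDF) for tokenized documents."""
    n = len(docs)
    df = Counter()
    for d in docs:
        df.update(set(d))                       # document frequency per term
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in df}
    vecs = []
    for d in docs:
        tf = Counter(d)
        vecs.append({t: (c / len(d)) * idf[t] for t, c in tf.items()})
    return vecs, idf

def cosine(u, v):
    """Cosine similarity between two sparse term->weight dicts."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def search(query, sentences):
    """Rank sentence indices by TF-IDF cosine similarity to the query."""
    toks = [s.lower().split() for s in sentences]
    vecs, idf = tfidf_vectors(toks)
    q = Counter(query.lower().split())
    qlen = sum(q.values())
    qvec = {t: (c / qlen) * idf.get(t, 0.0) for t, c in q.items()}
    return sorted(range(len(sentences)),
                  key=lambda i: cosine(qvec, vecs[i]), reverse=True)
```

In the integration described above, the sentential classifier's label for each sentence would additionally filter or re-rank this similarity ordering; that classification step is the part that incurs the parsing cost noted in the paragraph.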

Best answer
  • 2012-11-11 07:17:41
The following translation is for reference only:
    A Multi-Classifier Based Guideline Sentence Classification System
    A 10-fold cross validation was carried out using 340 sentences. Although the typical method is to split the data into training, testing, and validation sets, n-fold cross validation is preferred when the available data are very limited in size. We therefore split the data into 10 subsets, trained each model on the remaining data, tested on each held-out subset, and averaged precision, recall, and F-measure.
    The four classes contained equal numbers of sentences, i.e. 85 instances per class, and each classifier was trained on a set of 340 feature vectors. Contrary to our initial intuition, the F-measure of every classifier given in Table 4 was above 0.78. Naïve Bayes performed somewhat worse than the other algorithms, probably because its independence assumption does not match the non-linear nature inherent in textual object generation.
    Since this research presupposed that these base learners would yield weak classification models that would have to be combined sequentially by boosting to learn a strong classifier under a weighted voting scheme, the result above simply removed the need for a meta-algorithm (Table 5). Because weak classifiers are generally defined as being only slightly more accurate than a random classifier, whose accuracy is below 50%, we concluded that even boosting Naïve Bayes classifiers would not bring a distinct improvement.
    Table 6 shows the actual test, in which we sequentially combined Naïve Bayes classifiers using the AdaBoost.M1 meta-algorithm. As Table 6 shows, the improvement of the Naïve Bayes classifiers combined by AdaBoost.M1 was not conspicuous. This is because the feature extractors' transformation of the original feature space into a lower dimension may have reduced the complexity and non-linearity of the data, decreasing the spread of the scatter and the number of outliers.
    In fact, boosting neural-network algorithms such as the multi-layer perceptron or the radial basis function network with AdaBoost.M1 brought no increase in accuracy. The aims of this research were to apply transformation to tackle the dimensionality problem, and to increase classification accuracy by applying a boosting algorithm to learn a strong classifier robust to outliers and to the non-linear characteristics of the data. The second aim turns out to be unnecessary once the curse of dimensionality is resolved and robust classification algorithms such as a multi-layer perceptron or a radial basis function network are adopted.
    Moreover, we found that transformation has the advantage of exploiting structural and underlying features that the BOW model cannot see. We also found that integrating a sentential classifier with a TF-IDF-based search engine enhances the search process by maximizing the probability of automatically presenting the relevant information required in the context generated in the guideline authoring environment. This, however, has the disadvantage of increasing the total time required to parse and classify the sentences of a document at runtime.
    Our future study will therefore focus on eliminating the slow parsing step when extracting structural features from a textual object.

    3***

