Develop a Part-of-Speech Tagger and a Tagger-Maker: Algorithms, Implementations, Results, and APIs - Softcover

Han, Jiayun

 
9783659376221: Develop a Part-of-Speech Tagger and a Tagger-Maker: Algorithms, Implementations, Results, and APIs

Synopsis

This project aims to build an efficient, scalable, portable, and trainable part-of-speech tagger. Using 98% of Penn Treebank-3 as training data, it first builds a raw tagger based on Bayes’ theorem, a hidden Markov model, and the Viterbi algorithm. A reinforcement machine learning algorithm and contextual transformation rules are then applied to increase the tagger’s accuracy. The tagger’s final accuracy on the testing data is 96.51%, and its speed is about 26,000 words per second on a computer with two gigabytes of random access memory and two 3.00 GHz Pentium duo processors. The tagger’s portability and trainability are demonstrated by the tagger-maker’s success in building a new tagger from a corpus annotated with a tagset different from that of Penn Treebank.
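As a rough illustration of the raw-tagger stage described in the synopsis, the sketch below trains a bigram hidden Markov model from tagged sentences and decodes with the Viterbi algorithm. It is not the book's implementation: the function names (`train_hmm`, `log_prob`, `viterbi`), the add-one smoothing, the lowercasing of words, and the toy corpus are all assumptions made for this example.

```python
import math
from collections import defaultdict

START = "<s>"  # hypothetical sentence-start marker


def train_hmm(tagged_sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    trans = defaultdict(lambda: defaultdict(int))
    emit = defaultdict(lambda: defaultdict(int))
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word.lower()] += 1
            prev = tag
    return trans, emit


def log_prob(counts, key, n_outcomes, alpha=1.0):
    """Add-alpha smoothed log probability (smoothing is an assumption here)."""
    total = sum(counts.values())
    return math.log((counts.get(key, 0) + alpha) / (total + alpha * n_outcomes))


def viterbi(words, trans, emit):
    """Return the most probable tag sequence for `words` under the bigram HMM."""
    if not words:
        return []
    tags = list(emit.keys())
    n_tags = len(tags)
    # Vocabulary size plus one slot reserved for unseen words.
    n_words = len({w for t in tags for w in emit[t]}) + 1

    # best[i][tag] = (log score of best path ending in tag at position i, previous tag)
    best = [{}]
    for t in tags:
        score = log_prob(trans[START], t, n_tags) + log_prob(emit[t], words[0].lower(), n_words)
        best[0][t] = (score, None)

    for i in range(1, len(words)):
        best.append({})
        for t in tags:
            e = log_prob(emit[t], words[i].lower(), n_words)
            best[i][t] = max(
                (best[i - 1][p][0] + log_prob(trans[p], t, n_tags) + e, p) for p in tags
            )

    # Follow the back-pointers from the best final tag.
    last = max(tags, key=lambda t: best[-1][t][0])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    return path[::-1]


# Toy corpus with Penn Treebank-style tags, for illustration only.
corpus = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
    [("a", "DT"), ("dog", "NN"), ("sleeps", "VBZ")],
]
trans, emit = train_hmm(corpus)
print(viterbi(["the", "cat", "barks"], trans, emit))
```

Log probabilities are used so that long sentences do not underflow. A tagger of the kind the synopsis describes would then apply its learned contextual transformation rules to this raw output.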

The information provided in the "Synopsis" section may refer to another edition of this title.

About the Author

Jiayun Han obtained his PhD in Linguistics and his MS in Artificial Intelligence from the University of Georgia, U.S.A. He worked for North Side Inc. as a natural language processing engineer and is currently employed by Manwin Canada as a software developer.

The information provided in the "About this book" section may refer to another edition of this title.