aboutsummaryrefslogtreecommitdiff
path: root/exampleData/ruleSets/language-processing/jspos/README.txt
diff options
context:
space:
mode:
Diffstat (limited to 'exampleData/ruleSets/language-processing/jspos/README.txt')
-rw-r--r--exampleData/ruleSets/language-processing/jspos/README.txt81
1 files changed, 81 insertions, 0 deletions
diff --git a/exampleData/ruleSets/language-processing/jspos/README.txt b/exampleData/ruleSets/language-processing/jspos/README.txt
new file mode 100644
index 0000000..d12f12d
--- /dev/null
+++ b/exampleData/ruleSets/language-processing/jspos/README.txt
@@ -0,0 +1,81 @@
+ABOUT:
+
+jspos is a Javascript port of Mark Watson's FastTag Part of Speech Tagger which
+was itself based on Eric Brill's trained rule set and English lexicon.
+jspos also includes a basic lexer that can be used to extract words and other
+tokens from text strings.
+
+LICENSE:
+
+jspos is licensed under the GNU LGPLv3
+
+FILES:
+
+lexicon.js_ - Javascript version of Eric Brill's English lexicon
+lexer.js - Lexer to break a sentence into taggable tokens (e.g. words)
+POSTagger.js - the Part of Speech tagger
+
+You'll typically need to include all 3 files.
+
+USAGE:
+
+var words = new Lexer().lex("This is some sample text. This text can contain multiple sentences.");
+var taggedWords = new POSTagger().tag(words);
+for (i in taggedWords) {
+ var taggedWord = taggedWords[i];
+ var word = taggedWord[0];
+ var tag = taggedWord[1];
+}
+
+ACKNOWLEDGEMENTS:
+
+Thanks to Mark Watson for writing FastTag, which served as the basis for jspos.
+
+Thanks to Toby Rahilly for compressing the lexicon.
+
+TAGS:
+
+CC Coord Conjuncn and,but,or
+CD Cardinal number one,two
+DT Determiner the,some
+EX Existential there there
+FW Foreign Word mon dieu
+IN Preposition of,in,by
+JJ Adjective big
+JJR Adj., comparative bigger
+JJS Adj., superlative biggest
+LS List item marker 1,One
+MD Modal can,should
+NN Noun, sing. or mass dog
+NNP Proper noun, sing. Edinburgh
+NNPS Proper noun, plural Smiths
+NNS Noun, plural dogs
+POS Possessive ending Õs
+PDT Predeterminer all, both
+PP$ Possessive pronoun my,oneÕs
+PRP Personal pronoun I,you,she
+RB Adverb quickly
+RBR Adverb, comparative faster
+RBS Adverb, superlative fastest
+RP Particle up,off
+SYM Symbol +,%,&
+TO ÒtoÓ to
+UH Interjection oh, oops
+VB verb, base form eat
+VBD verb, past tense ate
+VBG verb, gerund eating
+VBN verb, past part eaten
+VBP Verb, present eat
+VBZ Verb, present eats
+WDT Wh-determiner which,that
+WP Wh pronoun who,what
+WP$ Possessive-Wh whose
+WRB Wh-adverb how,where
+, Comma ,
+. Sent-final punct . ! ?
+: Mid-sent punct. : ; Ñ
+$ Dollar sign $
+# Pound sign #
+" quote "
+( Left paren (
+) Right paren ) \ No newline at end of file