diff --git a/README.md b/README.md
index 595af93..c0e81f5 100755
--- a/README.md
+++ b/README.md
@@ -104,6 +104,8 @@ official protocols used as input data are included.
 - N-grams will then be count by year, speaker and so on.
 - The second argument is a user specified string to identify from what kind of protocols the n-grams have been calculated.
 - The string "lm_ns_year" for example describes that the input protocols have been lemmatized ("lm") and contain no stop words ("ns"). The last part ("year") specifies that the n-grams have been calculated by year.
+ - The string "tk_ws_speaker" means that the ngrams are calculated using tokenized("tk") protocols with stop words ("ws").
+ - N-grams are counted per speaker ("speaker").
 
 # Used packages and software
 - js-beautify: https://github.com/beautify-web/js-beautify