diff --git a/README.md b/README.md
index dc5c444..2998abb 100755
--- a/README.md
+++ b/README.md
@@ -89,3 +89,12 @@ official protocols used as input data are included.
 5. If you want to calculate n-grams from tokenized protocols without stopwords per year use this command: `./bundesdata_nlp.py -cn year tk_ns_year -sp /path/to/nlp_output/nlp_beuatiful_xml/ /path/to/some/folder/for/the/output/`.
 6. If you want to calculate n-grams from tokenized protocols with stopwords per speaker use this command: `./bundesdata_nlp.py -cn speaker tk_ws_speaker -sp /path/to/nlp_output/nlp_beuatiful_xml/ /path/to/some/folder/for/the/output/`.
 7. The parameter `-cn` is always followed by two arguments (Example: `-cn year lm_ns_year`). The first specifies how the n-grams are counted. It can be set to "year", "mont_year", "speaker" or "speech". N-grams will then be counted by year, by speaker and so on. The second argument is a user-specified string that identifies what kind of protocols the n-grams were calculated from. The string "lm_ns_year", for example, indicates that the input protocols have been lemmatized (lm) and contain no stop words (ns). The last part (year) indicates that the n-grams have been calculated by year.
+
+# Used packages and software
+- js-beautify
+  - Lielmanis, E.; Newman, L.; Stockman, D. & Sanfilippo, S.
+- lxml
+  - Behnel, S.; Faassen, M.; Bicking, I.; Joukl, H.; Sapin, S.; Parent, M.-A.; Grisel, O.; Buchcik, K.; Wagner, F.; Kroymann, E.; Everitt, P.; Ng, V.; Kern, R.; Pakulat, A.; Sankel, D.; Kasperski, M.; da Silva, S. & Oberndörfer, P.
+- Babel2018
+  - Ronacher, A.
+