Update README.md

Stephan Porada 2019-06-18 11:28:54 +02:00
parent 3b368f70fe
commit 65acb5d952

@@ -6,14 +6,15 @@ Please read the description of that project to understand what kind of data this
The data can be downloaded here: https://uni-bielefeld.sciebo.de/s/9p55VIn9OLmNqa9
Size: around 70GB
**Size**: around 70GB
**Structure of the data files**:
Structure:
Note that there are currently **two** versions of all the data available.
Version 1.0 data contains the data described and used for the original master thesis. The master thesis can be viewed here.
Version _1.0\_data_ contains the data described and used for the original master thesis. The master thesis can be viewed here. Protocols for the periods 15, 16 and 17 are erroneous.
Version 1.1 data contains newly calculated ngrams based on the newly and officially released XML protocols. Some fixes have also been introduced.
Version _1.1\_data_ contains newly calculated ngrams based on the newly and officially released XML protocols. Some fixes regarding the markup have also been introduced. The erroneous protocols have been officially fixed by the Bundesregierung.
```
.