diff --git a/README.md b/README.md
index b098e90..27b8b78 100644
--- a/README.md
+++ b/README.md
@@ -6,14 +6,15 @@ Pelase read the description of that project to understand what kind of data this
 The data can be downloaded here: https://uni-bielefeld.sciebo.de/s/9p55VIn9OLmNqa9
-Size: around 70GB
+**Size**: around 70GB
+
+**Structure of the data files**:
-Structure:
 Note that there are currently **two** versions of all the data available.
-Version 1.0 data contains the data described and used for the original master thesis. The master thesis can be viewed here.
+Version _1.0\_data_ contains the data described and used for the original master thesis. The master thesis can be viewed here.
 Protocols for the periods 15, 16 and 17 are erroneous.
-Version 1.1 data contains new calculated ngrams based on new officlally released xml protocols. Also some fixes have been introduced.
+Version _1.1\_data_ contains new calculated ngrams based on new officlally released xml protocols. Also some fixes regarding the markup have been introduced.
 The erroneous protocols have been fixed officially from the Bundesregierung.
 ```
 .