Update README.md
parent ba3102f559
commit 357f47e689
13
README.md
@@ -36,12 +36,17 @@ The actual data can be found here: https://gitlab.ub.uni-bielefeld.de/sporada/bu
## Import the data into the database
1. Before importing the data, we have to set up the tables in the PostgreSQL database.
1. Do this with `docker-compose run web python manage.py makemigrations`
1. followed by `docker-compose run web python manage.py migrate`.
- Do this with `docker-compose run web python manage.py makemigrations`
- followed by `docker-compose run web python manage.py migrate`.
11. Now the data for the ngrams, speeches, and speakers has to be imported into the database of the app.
12. Shut down the app with the command `docker-compose down`.
13. Change the owner of all files in the repository. This is necessary because processes inside a Docker container run with root privileges by default, so the volumes they create are no longer accessible to the regular user. Change the ownership with `sudo chown -R $USER:$USER .` This is only needed on Linux systems.
12. Download the folders *MdB\_data* and *outputs* from the link mentioned in [this repository](https://gitlab.ub.uni-bielefeld.de/sporada/bundesdata_markup_nlp_data) and copy them into the folder *input_volume*, which is located at the root level of the web app repository. If the downloaded folders are inside an archive, extract them first. This folder is a volume that is mounted into the web app container. The container can read all data inside that volume. Note that inside the container the volume is accessed with the path */usr/src/app/input_data*, not */usr/src/app/input_volume*.
13. Change the owner of all files in the repository. (This step should only be necessary on Linux systems.)
- This is necessary because processes inside a Docker container run with root privileges by default, so the volumes they create are no longer accessible to the regular user.
- Change the ownership with `sudo chown -R $USER:$USER .`
12. Download the folders *MdB\_data* and *outputs* from the link mentioned in [this repository](https://gitlab.ub.uni-bielefeld.de/sporada/bundesdata_markup_nlp_data).
- Copy them into the folder *input_volume*, which is located at the root level of the web app repository.
- If the downloaded folders are inside an archive, extract them first.
- The folder *input_volume* is a volume that is mounted into the web app container. The container can read all data inside that volume. Note that inside the container the volume is accessed with the path */usr/src/app/input_data*, not */usr/src/app/input_volume*.
13. Restart the app with `docker-compose up`
13. First we have to import the speaker data. This is done by executing the following command in a second terminal: `docker-compose run web python manage.py import_speakers /usr/src/app/input_data/MdB_data/MdB_Stammdaten.xml`
14. After that, we can import all the protocols and thus all speeches for every person. The command to do that is `docker-compose run web python manage.py import_protocols /usr/src/app/input_data/outputs/markup/full_periods` (Importing all protocols takes up to 2 days. For testing purposes, *dev\_data/beautiful\_xml* or *test\_data/beautiful\_xml* can be used.)
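
The `manage.py` commands from the steps above can be collected in a small helper script. The following is only a minimal sketch and not part of the repository: the function names `manage`, `import_commands` and `run_all` are hypothetical, and it assumes that the data has already been copied into *input_volume*, that file ownership has been fixed, and that the app is running. By default it only prints the commands (dry run) instead of executing them.

```python
import shlex
import subprocess

# Path of the mounted volume inside the container, as stated in the steps above.
CONTAINER_INPUT = "/usr/src/app/input_data"


def manage(*args):
    """Build the argument list for `docker-compose run web python manage.py ...`."""
    return ["docker-compose", "run", "web", "python", "manage.py", *args]


def import_commands(protocols_subdir="outputs/markup/full_periods"):
    """Return the manage.py commands in the order the steps above give them.

    `protocols_subdir` is a hypothetical parameter of this sketch; it is
    relative to the mounted input volume.
    """
    return [
        manage("makemigrations"),
        manage("migrate"),
        manage("import_speakers", f"{CONTAINER_INPUT}/MdB_data/MdB_Stammdaten.xml"),
        manage("import_protocols", f"{CONTAINER_INPUT}/{protocols_subdir}"),
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            # check=True stops at the first failing command.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_all(import_commands())
```

Calling `run_all(import_commands(), dry_run=False)` would execute the commands one after another. Since the full protocol import can take up to 2 days, it is probably sensible to review the dry-run output first.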