# nopaque
nopaque bundles tools and services that give humanities scholars access to Digital Humanities (DH) methods and support their individual research processes. With nopaque, researchers can run Optical Character Recognition (OCR) on digitized sources. The resulting text files then serve as the data basis for Natural Language Processing (NLP), which automatically adds various linguistic annotations to the texts. The processed texts can be compiled into corpora in the web application and analyzed with complex search queries through an information retrieval system. The functionality of the web application will be extended step by step according to the needs of the researchers.
## Prerequisites and requirements
1. Install Docker for your system by following the official instructions. (LINK)
2. Install docker-compose by following the official instructions. (LINK)
## Configuration and startup
### **Create Docker swarm**
The computational workload generated by nopaque is handled by a [Docker](https://docs.docker.com/) swarm. A swarm is a group of machines that run Docker and are joined into a cluster. It consists of two kinds of members: manager nodes and worker nodes. The swarm setup process is best described in the [Docker documentation](https://docs.docker.com/engine/swarm/swarm-tutorial/).
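
As a rough orientation, the basic steps of a swarm setup can be sketched as follows. This is a minimal sketch built on the standard `docker swarm` and `docker node` commands, not an official nopaque procedure. The function names are made up for this example, and `192.0.2.10` in the usage note is a placeholder for the IP address of your manager node.

``` bash
# Sketch only: the function names are illustrative and not part of nopaque.

# Run on the machine that should become the first manager node.
# $1: IP address under which the other nodes can reach this manager
swarm_init() {
  docker swarm init --advertise-addr "$1"
}

# Run on a manager: prints the complete "docker swarm join" command
# (including the token) that has to be executed on every worker node.
swarm_print_worker_join_command() {
  docker swarm join-token worker
}

# Run on a manager: lists all nodes that are currently part of the swarm.
swarm_list_nodes() {
  docker node ls
}
```

For example, call `swarm_init 192.0.2.10` on the manager, run the command printed by `swarm_print_worker_join_command` on each worker, and check the result with `swarm_list_nodes`.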
### **Create network storage**
All swarm members need access to the same data, so a shared network storage is required. A [Samba](https://www.samba.org/) share is used for this purpose.
``` bash
# Example: Create a Samba share via Docker
# More details can be found at https://hub.docker.com/r/dperson/samba/
username@hostname:~$ sudo mkdir -p /srv/nopaque/storage
# The quotes keep the shell from treating ';' as a command separator
username@hostname:~$ docker run \
    --name opaque_storage \
    -v /srv/nopaque/storage:/srv/nopaque/storage \
    -p 445:445 \
    dperson/samba \
    -p \
    -s "storage.nopaque;/srv/nopaque/storage;no;no;no;nopaque" \
    -u "nopaque;nopaque"

# Mount the Samba share on all swarm nodes (managers and workers)
username@hostname:~$ sudo mkdir /mnt/nopaque
username@hostname:~$ sudo mount --types cifs --options gid=${USER},password=nopaque,uid=${USER},user=nopaque,vers=3.0 //<SAMBA-SERVER-IP>/storage.nopaque /mnt/nopaque
```
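
Before continuing, it can help to confirm on each node that the share is really mounted. The following is a minimal sketch under a few assumptions: `check_storage_mount` is a hypothetical helper written for this example, and it relies on the `mountpoint` utility (part of util-linux) being installed on the node.

``` bash
# Sketch only: check_storage_mount is illustrative and not part of nopaque.

# Check that a directory is a mount point and writable for the current user.
# $1: directory to check (defaults to /mnt/nopaque, the path used above)
check_storage_mount() {
  local mount_point="${1:-/mnt/nopaque}"
  if ! mountpoint -q "${mount_point}"; then
    echo "${mount_point} is not a mount point" >&2
    return 1
  fi
  if ! [ -w "${mount_point}" ]; then
    echo "${mount_point} is mounted but not writable for $(id -un)" >&2
    return 1
  fi
  echo "${mount_point} is mounted and writable"
}
```

Calling `check_storage_mount` without arguments checks `/mnt/nopaque`. A non-zero exit status means the node is not ready yet.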
### **Download, configure and build nopaque**
``` bash
# Clone the nopaque repository and change into it
username@hostname:~$ git clone https://gitlab.ub.uni-bielefeld.de/sfb1288inf/nopaque.git
username@hostname:~$ cd nopaque
# Create the environment file from its template and fill out the variables within it.
username@hostname:~/nopaque$ cp .env.tpl .env
username@hostname:~/nopaque$ <YOUR EDITOR> .env
# Create the override file from its template and tweak it to satisfy your needs.
username@hostname:~/nopaque$ cp docker-compose.override.yml.tpl docker-compose.override.yml
username@hostname:~/nopaque$ <YOUR EDITOR> docker-compose.override.yml
# Build the Docker images
username@hostname:~/nopaque$ docker-compose build
```
#### Configuration variables in detail

The variables prefixed with **DOCKER** should only be filled out if you want to use the Docker HTTP API. Check the [Docker Documentation](https://docs.docker.com/engine/security/https/) to see how to create certificates and how to configure and activate the Docker HTTP API.

The variables prefixed with **GITLAB** hold your login information for this GitLab instance. Either use your credentials (not recommended) or create an access token with the `read_registry` scope.

The value of the **NOPAQUE_MAIL_SENDER** variable is shown as the sender of emails generated by nopaque.

On registration, the email address stored in the **NOPAQUE_ADMIN** variable is automatically granted the administrator role.

The email address stored in **NOPAQUE_CONTACT** is used for the contact button in the footer of the website.
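
A quick way to catch forgotten values is to check the file before building. The following is a minimal sketch, not part of nopaque: `check_env_vars` is a hypothetical helper, and it assumes that `.env` uses the plain `NAME=value` line format read by docker-compose.

``` bash
# Sketch only: check_env_vars is illustrative and not part of nopaque.

# Report variables that are missing or empty in an env file.
# $1: path to the env file (defaults to .env)
# Remaining arguments: names of the variables to check
check_env_vars() {
  local env_file="${1:-.env}"
  [ "$#" -gt 0 ] && shift
  local missing=0 name
  for name in "$@"; do
    # A variable counts as set if a line "NAME=<non-empty value>" exists.
    if ! grep -Eq "^${name}=.+" "${env_file}"; then
      echo "missing or empty: ${name}" >&2
      missing=1
    fi
  done
  return "${missing}"
}
```

For example, `check_env_vars .env NOPAQUE_ADMIN NOPAQUE_CONTACT NOPAQUE_MAIL_SENDER` returns a non-zero exit status and names each variable that still needs a value.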
### Start your instance
``` bash
# For background execution, add the -d flag. To scale the app, add --scale web=<NUM-INSTANCES>.
username@hostname:~/nopaque$ docker-compose up
```
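
The two flags mentioned in the comment above can be combined, as in the following sketch. The function names and the default of one instance are assumptions made for this example. `web` is the service name taken from the `--scale` hint above.

``` bash
# Sketch only: the function names are illustrative and not part of nopaque.

# Start nopaque in the background (-d) with a given number of web instances.
# $1: number of instances of the "web" service (defaults to 1)
nopaque_up() {
  local instances="${1:-1}"
  docker-compose up -d --scale "web=${instances}"
}

# Show the state of all containers that belong to this compose project.
nopaque_status() {
  docker-compose ps
}

# Stop and remove the containers again.
nopaque_down() {
  docker-compose down
}
```

Run these from the directory that contains `docker-compose.yml`, for example `nopaque_up 2` followed by `nopaque_status`.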