nopaque

nopaque bundles tools and services that give humanities scholars access to Digital Humanities (DH) methods and can thereby support their individual research processes. Using nopaque, researchers can run Optical Character Recognition (OCR) on digitized sources. The resulting text files can then serve as the data basis for Natural Language Processing (NLP), in which the texts are automatically enriched with various linguistic annotations. The NLP-processed data can then be combined into corpora in the web application and analyzed with complex search queries through an information retrieval system. The web application's range of functions will be extended successively according to the needs of the researchers.

Prerequisites and requirements

  1. Install Docker for your system, following the official instructions. (LINK)
  2. Install docker-compose, following the official instructions. (LINK)

Configuration and startup: Run a Docker swarm and set up a Samba share

  1. Create Docker swarm:

The generated computational workload is handled by a Docker swarm. A swarm is a group of machines that run Docker and are joined into a cluster. It consists of two kinds of members: manager nodes and worker nodes. The swarm setup process is best described in the Docker documentation.
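
The swarm setup can be sketched with the standard Docker CLI commands. This is only a minimal sketch, not part of nopaque: the helper function names and the example IP address used below are assumptions made for illustration, and the Docker documentation remains the authoritative source.

```shell
# Sketch only: small wrappers around the standard Docker CLI swarm commands.
# The function names are hypothetical and not part of nopaque.

# Run on the machine that should become the manager node.
# $1: IP address of this machine that the other nodes can reach.
swarm_init_manager() {
    docker swarm init --advertise-addr "$1"
}

# Run on the manager: prints the complete "docker swarm join ..." command,
# including the token, that each worker node has to execute.
swarm_print_worker_join() {
    docker swarm join-token worker
}

# Run on the manager: lists all nodes so you can check that every member joined.
swarm_list_nodes() {
    docker node ls
}
```

For example, run `swarm_init_manager 192.0.2.10` on the manager (the address is a placeholder), then `swarm_print_worker_join`, and execute the printed join command on every worker. Afterwards `swarm_list_nodes` should show all members.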

  2. Create a network storage:

A shared network space is necessary so that all swarm members have access to all the data. A Samba share is used for this purpose.

# Example: Create a Samba share via Docker
# More details can be found at https://hub.docker.com/r/dperson/samba/
sudo mkdir -p /srv/nopaque/storage
# The -s and -u arguments contain semicolons and must be quoted,
# otherwise the shell treats each semicolon as a command separator.
docker run \
    --name opaque_storage \
    -v /srv/nopaque/storage:/srv/nopaque/storage \
    -p 445:445 \
    dperson/samba \
      -p \
      -s "storage.nopaque;/srv/nopaque/storage;no;no;no;nopaque" \
      -u "nopaque;nopaque"

# Mount the Samba share on all swarm nodes (managers and workers)
# uid and gid are taken from the current user, so the mount also works if no group is named like the user
sudo mkdir -p /mnt/nopaque
sudo mount --types cifs --options gid=$(id -g),password=nopaque,uid=$(id -u),user=nopaque,vers=3.0 //<SAMBA-SERVER-IP>/storage.nopaque /mnt/nopaque

Download, configure and build nopaque

  1. Download, configure and build nopaque
git clone https://gitlab.ub.uni-bielefeld.de/sfb1288inf/nopaque.git
cd nopaque
mkdir -p logs
cp nopaque.env.tpl nopaque.env
# Fill out the variables within this file. For the GitLab variables, either use your credentials (not recommended) or create an access token with the read_registry scope. If this repository is public, no credentials are needed.
<YOUR EDITOR> nopaque.env

Variables and their values are explained here:

The first three DOCKER variables should only be used if you want to use the Docker HTTP API. Check the daemon readme to see how to create certificates and activate the API.

FLASK_CONFIG=development|testing|production
SECRET_KEY=92b461ba136e4ca48e430003acd56977
For example, use this command to create a secret key: python -c "import uuid; print(uuid.uuid4().hex)"
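
If Python is not at hand, a comparable key can be produced with standard Unix tools. The following is a minimal sketch, assuming a system that provides /dev/urandom, head, od and tr; the function name generate_secret_key is hypothetical and not part of nopaque.

```shell
# Sketch: print a 64-character hexadecimal key built from 32 random bytes.
# generate_secret_key is a hypothetical helper name.
generate_secret_key() {
    # Read 32 bytes, dump them as hex, and strip the spaces and newlines od adds.
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
    # Terminate the output with a single newline.
    echo
}

generate_secret_key
```

Copy the printed value into SECRET_KEY in nopaque.env.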

The GitLab Registry variables are not needed if this repository is public. If needed, use your GitLab username and an access token as the password.

Flask Mail variables are needed for sending password reset mails etc. Use your own mail server configs here.
MAIL_SERVER=smtp.example.com
MAIL_PORT=587
MAIL_USE_TLS=true
MAIL_USERNAME=user@example.com
MAIL_PASSWORD=password
Name and address shown as the sender:
NOPAQUE_MAIL_SENDER=Nopaque Admin <user@example.com>

Nopaque variables are needed for the web app.
A user who registers with this mail address is automatically granted admin rights:
NOPAQUE_ADMIN=yourmail@example.com
Contact mail address shown in the footer of the web application:
NOPAQUE_CONTACT=contactmailaddress@example.com
The domain your nopaque installation is hosted on. Use https://nopaque.localhost for a locally running instance:
NOPAQUE_DOMAIN=yourdomain.com
NOPAQUE_LOG_LEVEL=WARNING|INFO|ERROR|DEBUG
NOPAQUE_STORAGE=path/to/your/samba/share

PostgreSQL Database credentials:
POSTGRES_DB_NAME=dbname
POSTGRES_USER=username
POSTGRES_PASSWORD=password
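
Before building, it can help to check that no variable was left empty. The following is a minimal sketch, assuming nopaque.env consists of plain KEY=VALUE lines; list_unset_vars is a hypothetical helper and not part of nopaque.

```shell
# Sketch: list the names of all variables in an env file that have no value.
# Assumes plain KEY=VALUE lines; comment lines and blank lines are ignored.
# list_unset_vars is a hypothetical helper name.
# $1: path to the env file, for example nopaque.env
list_unset_vars() {
    # Match lines with nothing (or only whitespace) after "=" and print the key.
    grep -E '^[A-Za-z_][A-Za-z0-9_]*=[[:space:]]*$' "$1" | cut -d= -f1
}
```

Calling `list_unset_vars nopaque.env` prints one variable name per line; no output means every variable has a value.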

cp .env.tpl .env
<YOUR EDITOR> .env # Fill out the variables within this file.
docker-compose build

Start your instance

# For background execution, add the -d flag; to scale the app, add --scale web=<NUM-INSTANCES>
docker-compose up