mirror of
https://gitlab.ub.uni-bielefeld.de/sfb1288inf/nopaque.git
synced 2025-07-01 18:30:34 +00:00
integrate nopaque repo
109
README.md
@@ -1,46 +1,22 @@
# Opaque
# nopaque

Opaque is a virtual research environment (VRE) bundling OCR, NLP and additional computational linguistic methods for research purposes in the field of Digital Humanities.

_nopaque_ bundles various tools and services that provide humanities scholars with DH methods and can thus support their individual research processes. Using _nopaque_, researchers can subject digitized sources to Optical Character Recognition (OCR). The resulting text files can then be used as a data basis for Natural Language Processing (NLP), in which the texts automatically receive various linguistic annotations. The data processed via NLP can then be combined into corpora in the web application and analyzed with complex search queries through an information retrieval system. The range of functions of the web application will be extended successively according to the needs of the researchers.

Opaque is designed as a web application that researchers can easily use to aid their research process.
## Prerequisites and requirements

In particular, researchers can use Opaque to start OCR jobs for digitized sources. The text output of these OCR jobs can then be used as input for tagging processes (POS, NER, etc.).

As a last step, texts can be loaded into an information retrieval system to query for specific words or phrases in connection with linguistic features.

1. Install Docker for your system, following the official instructions. (LINK)
2. Install docker-compose, following the official instructions. (LINK)

## Dependencies

- cifs-utils
- Docker
- Docker Compose

## Configuration and startup
## Configuration and startup: Run a Docker swarm and set up a Samba share

1. **Create Docker swarm:**

The following part is for **users** and not the development team. The development team uses a script which sets up a local development swarm.

The generated computational workload is handled by a [Docker](https://docs.docker.com/) swarm. A swarm is a group of machines that are running Docker and joined into a cluster. It consists of two different kinds of members, managers and workers. Currently it is not possible to specify a dedicated Docker host; instead, Opaque expects the executing system to be a swarm manager of a cluster with at least one dedicated worker machine. The swarm setup process is best described in the [Docker documentation](https://docs.docker.com/engine/swarm/swarm-tutorial/).

The dev team can use `dind_swarm_setup.sh`. If the workers cannot join the manager node, try opening the following ports using the Ubuntu firewall ufw:
```bash
# Swarm ports: 2377/tcp (cluster management), 7946/tcp and 7946/udp (node communication), 4789/udp (overlay network traffic)
# Additional ports: 2376/tcp (Docker API over TLS), 80/tcp (HTTP)
sudo ufw allow 2376/tcp \
&& sudo ufw allow 7946/udp \
&& sudo ufw allow 7946/tcp \
&& sudo ufw allow 80/tcp \
&& sudo ufw allow 2377/tcp \
&& sudo ufw allow 4789/udp

sudo ufw reload && sudo ufw enable
sudo systemctl restart docker
```
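
If workers still cannot join after the ports have been opened, it can help to confirm from a worker that the manager's TCP ports are actually reachable. The following is a minimal Python sketch and not part of nopaque; the names `SWARM_TCP_PORTS`, `tcp_port_open` and `check_manager` are chosen for illustration only. It covers just the TCP ports from the rules above, because the UDP ports (7946/udp, 4789/udp) cannot be probed with a plain connect.

```python
import socket

# TCP ports from the ufw rules above; UDP ports are not covered by this check.
SWARM_TCP_PORTS = (2376, 2377, 7946, 80)


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_manager(host: str) -> dict:
    """Map each swarm TCP port to its reachability on the given manager host."""
    return {port: tcp_port_open(host, port) for port in SWARM_TCP_PORTS}
```

Calling `check_manager("<MANAGER-IP>")` on a worker returns a dictionary such as `{2376: ..., 2377: ..., 7946: ..., 80: ...}`; a `False` entry for 2377 would point to the firewall or network rather than to Docker itself.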

The generated computational workload is handled by a [Docker](https://docs.docker.com/) swarm. A swarm is a group of machines that are running Docker and joined into a cluster. It consists of two different kinds of members, manager and worker nodes. The swarm setup process is best described in the [Docker documentation](https://docs.docker.com/engine/swarm/swarm-tutorial/).

2. **Create a network storage:**

The `dind_swarm_setup.sh` script handles this step for the dev team as well.

A shared network space is necessary so that all swarm members have access to all the data. To achieve this, a [Samba](https://www.samba.org/) share can be used.

A shared network space is necessary so that all swarm members have access to all the data. To achieve this, a [Samba](https://www.samba.org/) share is used.
``` bash
# Example: Create a Samba share via Docker
# More details can be found under https://hub.docker.com/r/dperson/samba/
@@ -54,42 +30,57 @@ docker run \
-s "storage.nopaque;/srv/nopaque/storage;no;no;no;nopaque" \
-u "nopaque;nopaque"

# Mount the Samba share on all swarm member nodes with the following code
# Mount the Samba share on all swarm nodes (managers and workers)
sudo mkdir /mnt/nopaque
sudo mount --types cifs --options gid=${USER},password=nopaque,uid=${USER},user=nopaque,vers=3.0 //<YOUR IP>/storage.nopaque /mnt/nopaque
sudo mount --types cifs --options gid=${USER},password=nopaque,uid=${USER},user=nopaque,vers=3.0 //<SAMBA-SERVER-IP>/storage.nopaque /mnt/nopaque
```
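
Because every swarm node needs the share, it can be worth verifying the mount on each node before starting any jobs. The sketch below is only an illustration and not part of nopaque; the function name `share_is_usable` is hypothetical, and the default path simply mirrors the `/mnt/nopaque` mount point used above.

```python
import os
import tempfile


def share_is_usable(mount_point: str = "/mnt/nopaque") -> bool:
    """Return True if mount_point is a mounted filesystem that accepts writes."""
    if not os.path.ismount(mount_point):
        return False
    try:
        # Creating (and automatically removing) a temporary file verifies write access.
        with tempfile.NamedTemporaryFile(dir=mount_point):
            pass
    except OSError:
        return False
    return True
```

Running `share_is_usable()` on a node returns `False` both when nothing is mounted at the path and when the share is mounted but not writable for the current user.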

3. **Download Opaque**
``` bash
git clone https://gitlab.ub.uni-bielefeld.de/sfb1288inf/opaque.git
cd opaque
docker-compose pull
```
4. **Configure your instance:**
For production environments it is recommended to activate and secure the Docker HTTP API. You can read more [here](https://gitlab.ub.uni-bielefeld.de/sfb1288inf/opaque_daemon).

## Download, configure and build _nopaque_

3. **Download, configure and build _nopaque_**

``` bash
git clone https://gitlab.ub.uni-bielefeld.de/sfb1288inf/nopaque.git
mkdir logs
cp nopaque.env.tpl nopaque.env
<YOUR EDITOR> nopaque.env # Fill out the empty variables within this file. For the GitLab login, either use your credentials (not recommended) or create a GitLab token.
<YOUR EDITOR> nopaque.env # Fill out the variables within this file. For the GitLab variables, either use your credentials (not recommended) or create an access token with the read_registry scope. If this repository is public, no credentials are needed.
```

### Variables and their values are explained here:
The first three DOCKER variables should only be used if you want to use the Docker HTTP API. Check the daemon README to see how to create certificates and activate the API.

FLASK_CONFIG=development|testing|production \
SECRET_KEY=92b461ba136e4ca48e430003acd56977 \
Use this, for example, to create a secret key: `python -c "import uuid; print(uuid.uuid4().hex)"`
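
As an alternative to the UUID-based one-liner, Python's standard `secrets` module is intended for generating security-sensitive random values. The following is a minimal sketch; the helper name `generate_secret_key` is made up for illustration, and the 16-byte default is chosen only to match the 32-character length of `uuid4().hex`.

```python
import secrets


def generate_secret_key(num_bytes: int = 16) -> str:
    """Return a random hex string; 16 bytes yield 32 hex characters."""
    return secrets.token_hex(num_bytes)


# Print a line that can be pasted into the env file.
print(f"SECRET_KEY={generate_secret_key()}")
```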

The **GitLab Registry** variables are not needed if this repository is public. If needed, use your GitLab username and a token as the password.

**Flask Mail** variables are needed for sending password reset mails etc. Use your own mail server configs here.\
MAIL_SERVER=smtp.example.com \
MAIL_PORT=587 \
MAIL_USE_TLS=true \
MAIL_USERNAME=user@example.com \
MAIL_PASSWORD=password \
NOPAQUE_MAIL_SENDER=Nopaque Admin <user@example.com> _Name shown as sender._

**Nopaque** variables are needed for the web app.\
NOPAQUE_ADMIN=yourmail@example.com _If a user registers with this mail address, the user is automatically granted admin rights._ \
NOPAQUE_CONTACT=contactmailadress@example.com _Contact mail address shown in the footer of the web application._\
NOPAQUE_DOMAIN=yourdomain.com _The domain your nopaque installation is hosted on. Use https://nopaque.localhost for a locally running instance._ \
NOPAQUE_LOG_LEVEL=WARNING|INFO|ERROR|DEBUG \
NOPAQUE_STORAGE=path/to/your/samba/share

**PostgreSQL Database** credentials: \
POSTGRES_DB_NAME=dbname \
POSTGRES_USER=username \
POSTGRES_PASSWORD=password
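
Since the variables in the env file have to be filled out by hand, a small check for empty values can catch omissions before the build. The sketch below assumes the file consists of plain `KEY=value` lines with optional `#` comments; that format and the function names `parse_env`, `empty_variables` and `check_env_file` are assumptions for illustration, not taken from the nopaque repository.

```python
from pathlib import Path


def parse_env(text: str) -> dict:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    variables = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        variables[key.strip()] = value.strip()
    return variables


def empty_variables(text: str) -> list:
    """Return the names of all variables whose value is empty, in file order."""
    return [key for key, value in parse_env(text).items() if not value]


def check_env_file(path: str = "nopaque.env") -> list:
    """Read an env file from disk and return its unfilled variable names."""
    return empty_variables(Path(path).read_text(encoding="utf-8"))
```

Calling `check_env_file()` in the repository directory returns the names that still need a value. Variables that are optional for a given setup, such as the DOCKER ones when the HTTP API is not used, may legitimately stay empty.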

5. Further development instructions
Use the following command to allow Docker to pull images from your GitLab registry. TODO: Check if this could also work with a token?
```bash
docker login gitlab.ub.uni-bielefeld.de:4567
cp .env.tpl .env
<YOUR EDITOR> .env # Fill out the variables within this file.
docker-compose build
```

6. **Start your instance**
``` bash
# Execute the following 3 steps only on first startup
docker-compose run web flask db upgrade
docker-compose run web flask insert-initial-database-entries
docker-compose down
```

**Start your instance**
```bash
# For background execution, add the -d flag; to scale the app, add --scale web=<NUM-INSTANCES>
docker-compose up
```

6. **Alter Database Models**
``` bash
docker-compose run web flask db migrate
docker-compose run web flask db upgrade
```