integrate nopaque repo
6
.env.tpl
Normal file
@@ -0,0 +1,6 @@
# docker GID (getent group docker | cut -d: -f3)
docker_gid=
# GID (id -g)
gid=
# UID (id -u)
uid=
5
.gitignore
vendored
@@ -1,9 +1,4 @@
# Files
.DS_Store
*.env


# Directories
logs
__pycache__
certs
@@ -1,55 +0,0 @@
image: docker:stable

services:
  - docker:stable-dind

variables:
  DOCKER_DRIVER: overlay2

stages:
  - build
  - push

before_script:
  - docker login -u gitlab-ci-token -p $CI_JOB_TOKEN $CI_REGISTRY

Build:
  script:
    - docker build --pull -t $CI_REGISTRY_IMAGE:tmp .
    - docker push $CI_REGISTRY_IMAGE:tmp
  stage: build
  tags:
    - docker

Push development:
  only:
    - development
  script:
    - docker pull $CI_REGISTRY_IMAGE:tmp
    - docker tag $CI_REGISTRY_IMAGE:tmp $CI_REGISTRY_IMAGE:development
    - docker push $CI_REGISTRY_IMAGE:development
  stage: push
  tags:
    - docker

Push latest:
  only:
    - master
  script:
    - docker pull $CI_REGISTRY_IMAGE:tmp
    - docker tag $CI_REGISTRY_IMAGE:tmp $CI_REGISTRY_IMAGE:latest
    - docker push $CI_REGISTRY_IMAGE:latest
  stage: push
  tags:
    - docker

Push tag:
  only:
    - tags
  script:
    - docker pull $CI_REGISTRY_IMAGE:tmp
    - docker tag $CI_REGISTRY_IMAGE:tmp $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_NAME
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_NAME
  stage: push
  tags:
    - docker
113
README.md
@@ -1,46 +1,22 @@
# Opaque
# nopaque

Opaque is a virtual research environment (VRE) bundling OCR, NLP and additional computational linguistics methods for research purposes in the field of Digital Humanities.
_nopaque_ bundles various tools and services that provide humanities scholars with DH methods and thus can support their various individual research processes. Using _nopaque_, researchers can subject digitized sources to Optical Character Recognition (OCR). The resulting text files can then be used as a data basis for Natural Language Processing (NLP): the texts are automatically enriched with various linguistic annotations. The data processed via NLP can then be summarized in the web application as corpora and analyzed by means of an information retrieval system through complex search queries. The range of functions of the web application will be successively extended according to the needs of the researchers.

Opaque is designed as a web application which can be easily used by researchers to aid them during their research process.
## Prerequisites and requirements

In particular, researchers can use Opaque to start OCR jobs for digitized sources. The text output of these OCR jobs can then be used as an input for tagging processes (POS, NER etc.).

As a last step, texts can be loaded into an information retrieval system to query for specific words or phrases in connection with linguistic features.
1. Install Docker for your system, following the official instructions. (LINK)
2. Install docker-compose, following the official instructions. (LINK)


## Dependencies

- cifs-utils
- Docker
- Docker Compose


## Configuration and startup
## Configuration and startup: Run a Docker swarm and set up a Samba share

1. **Create a Docker swarm:**

The following part is for **users** and not the development team. The development team uses a script which sets up a local development swarm.

The generated computational workload is handled by a [Docker](https://docs.docker.com/) swarm. A swarm is a group of machines that are running Docker and joined into a cluster. It consists of two kinds of members: managers and workers. Currently it is not possible to specify a dedicated Docker host; instead, Opaque expects the executing system to be a swarm manager of a cluster with at least one dedicated worker machine. The swarm setup process is described best in the [Docker documentation](https://docs.docker.com/engine/swarm/swarm-tutorial/).

The dev team can use dind_swarm_setup.sh. If the workers cannot join the manager node, try opening the following ports using the Ubuntu firewall ufw:
```bash
sudo ufw allow 2376/tcp \
  && sudo ufw allow 7946/udp \
  && sudo ufw allow 7946/tcp \
  && sudo ufw allow 80/tcp \
  && sudo ufw allow 2377/tcp \
  && sudo ufw allow 4789/udp

sudo ufw reload && sudo ufw enable
sudo systemctl restart docker
```
The generated computational workload is handled by a [Docker](https://docs.docker.com/) swarm. A swarm is a group of machines that are running Docker and joined into a cluster. It consists of two kinds of members: manager and worker nodes. The swarm setup process is described best in the [Docker documentation](https://docs.docker.com/engine/swarm/swarm-tutorial/).

2. **Create a network storage:**
The dind_swarm_setup.sh script handles this step for the dev team as well.

A shared network space is necessary so that all swarm members have access to all the data. To achieve this, a [Samba](https://www.samba.org/) share can be used.
A shared network space is necessary so that all swarm members have access to all the data. To achieve this, a [Samba](https://www.samba.org/) share is used.
``` bash
# Example: Create a Samba share via Docker
# More details can be found under https://hub.docker.com/r/dperson/samba/
@@ -54,42 +30,57 @@ docker run \
  -s storage.nopaque;/srv/nopaque/storage;no;no;no;nopaque \
  -u nopaque;nopaque

# Mount the Samba share on all swarm member nodes with the following code
# Mount the Samba share on all swarm nodes (managers and workers)
sudo mkdir /mnt/nopaque
sudo mount --types cifs --options gid=${USER},password=nopaque,uid=${USER},user=nopaque,vers=3.0 //<YOUR IP>/storage.nopaque /mnt/nopaque
sudo mount --types cifs --options gid=${USER},password=nopaque,uid=${USER},user=nopaque,vers=3.0 //<SAMBA-SERVER-IP>/storage.nopaque /mnt/nopaque
```
3. **Download Opaque**
``` bash
git clone https://gitlab.ub.uni-bielefeld.de/sfb1288inf/opaque.git
cd opaque
docker-compose pull
```
4. **Configure your instance:**
For production environments it is recommended to activate and secure the Docker HTTP API. You can read more [here](https://gitlab.ub.uni-bielefeld.de/sfb1288inf/opaque_daemon).
## Download, configure and build _nopaque_

3. **Download, configure and build _nopaque_**

``` bash
git clone https://gitlab.ub.uni-bielefeld.de/sfb1288inf/nopaque.git
mkdir logs
cp nopaque.env.tpl nopaque.env
<YOUR EDITOR> nopaque.env # Fill out the empty variables within this file. For the GitLab login either use your credentials (not recommended) or create a GitLab token.
<YOUR EDITOR> nopaque.env # Fill out the variables within this file. For the GitLab variables either use your credentials (not recommended) or create an access token with the read_registry scope. If this repository is public, no credentials are needed.
```
### Variables and their values are explained here:
The first three DOCKER variables should only be used if you want to use the Docker HTTP API. Check the daemon readme to see how to create certificates and activate the API.

FLASK_CONFIG=development|testing|production \
SECRET_KEY=92b461ba136e4ca48e430003acd56977 \
Use this, for example, to create a secret key: `python -c "import uuid; print(uuid.uuid4().hex)"`

The **GitLab Registry** variables are not needed if this repository is public. If needed, use your GitLab username and an access token as the password.

**Flask Mail** variables are needed for sending password reset mails etc. Use your own mail server configs here.\
MAIL_SERVER=smtp.example.com \
MAIL_PORT=587 \
MAIL_USE_TLS=true \
MAIL_USERNAME=user@example.com \
MAIL_PASSWORD=password \
NOPAQUE_MAIL_SENDER=Nopaque Admin <user@example.com> _Name shown as sender._

**Nopaque** variables are needed for the web app.\
NOPAQUE_ADMIN=yourmail@example.com _If a user is registered using this mail address, the user will automatically be granted admin rights._ \
NOPAQUE_CONTACT=contactmailaddress@example.com _Contact mail address shown in the footer of the web application._ \
NOPAQUE_DOMAIN=yourdomain.com _The domain your nopaque installation is hosted on. Use https://nopaque.localhost for a locally running instance._ \
NOPAQUE_LOG_LEVEL=WARNING|INFO|ERROR|DEBUG \
NOPAQUE_STORAGE=path/to/your/samba/share

**PostgreSQL Database** credentials: \
POSTGRES_DB_NAME=dbname \
POSTGRES_USER=username \
POSTGRES_PASSWORD=password

```bash
cp .env.tpl .env
<YOUR EDITOR> .env # Fill out the variables within this file.
docker-compose build
```

5. Further development instructions
Use the following command to allow Docker to pull images from your GitLab registry. TODO: Check if this could also work with a token?
**Start your instance**
```bash
docker login gitlab.ub.uni-bielefeld.de:4567
```

6. **Start your instance**
``` bash
# Execute the following 3 steps only on first startup
docker-compose run web flask db upgrade
docker-compose run web flask insert-initial-database-entries
docker-compose down

# For background execution add the -d flag; to scale the app, add --scale web=<NUM-INSTANCES>
docker-compose up
```

6. **Alter Database Models**
``` bash
docker-compose run web flask db migrate
docker-compose run web flask db upgrade
```
40
daemon/Dockerfile
Normal file
@@ -0,0 +1,40 @@
FROM python:3.6-slim-stretch


LABEL maintainer="inf_sfb1288@lists.uni-bielefeld.de"


ARG docker_gid=998
ARG gid=1000
ARG uid=1000
ENV LANG=C.UTF-8


RUN apt-get update \
 && apt-get install --no-install-recommends --yes \
      build-essential \
      libpq-dev \
      wait-for-it \
 && rm -rf /var/lib/apt/lists/*


RUN groupadd --gid ${docker_gid} --system docker \
 && groupadd --gid ${gid} --system nopaqued \
 && useradd --create-home --gid ${gid} --groups ${docker_gid} --no-log-init --system --uid ${uid} nopaqued
USER nopaqued
WORKDIR /home/nopaqued


COPY ["notify", "notify"]
COPY ["tasks", "tasks"]
COPY ["logger", "logger"]
COPY ["decorators.py", "nopaqued.py", "requirements.txt", "./"]
RUN python -m venv venv \
 && venv/bin/pip install --requirement requirements.txt \
 && mkdir logs


COPY docker-entrypoint.sh /usr/local/bin/


ENTRYPOINT ["docker-entrypoint.sh"]
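The docker_gid/gid/uid build args exist so the container user matches the host's docker group and user IDs and can reach the mounted Docker socket; docker-compose feeds them in from .env. For illustration only, the same build can be driven from Python with docker-py (the arg values shown are examples, not required defaults):

```python
import docker

client = docker.from_env()
# Build the daemon image, passing the same args docker-compose would.
image, build_logs = client.images.build(
    path='daemon',
    buildargs={'docker_gid': '998', 'gid': '1000', 'uid': '1000'},
    tag='nopaque/daemon')
```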
14
daemon/decorators.py
Normal file
@@ -0,0 +1,14 @@
from functools import wraps
from threading import Thread


def background(f):
    '''
    ' This decorator executes a function in a Thread.
    '''
    @wraps(f)
    def wrapped(*args, **kwargs):
        thread = Thread(target=f, args=args, kwargs=kwargs)
        thread.start()
        return thread
    return wrapped
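A minimal usage sketch (the `refresh` function and its body are hypothetical): because the wrapper returns the started `Thread`, a caller can either fire and forget, as the daemon loop below does, or `join()` when it needs to wait for completion.

```python
from time import sleep

from decorators import background


@background
def refresh(resource):
    # Runs in its own thread; the caller does not block.
    sleep(1)
    print('refreshed', resource)


thread = refresh('corpora')  # returns immediately with the started Thread
thread.join()                # optional: block until the work is done
```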
8
daemon/docker-entrypoint.sh
Normal file
@@ -0,0 +1,8 @@
#!/bin/bash

echo "Waiting for db..."
wait-for-it db:5432 --strict --timeout=0
echo "Waiting for web..."
wait-for-it web:5000 --strict --timeout=0

venv/bin/python nopaqued.py
26
daemon/logger/logger.py
Normal file
@@ -0,0 +1,26 @@
import os
import logging


def init_logger():
    '''
    Initiates and returns a logger instance.
    '''
    if not os.path.isfile('logs/nopaqued.log'):
        file_path = os.path.join(os.getcwd(), 'logs/nopaqued.log')
        log = open(file_path, 'w+')
        log.close()
    logging.basicConfig(datefmt='%Y-%m-%d %H:%M:%S',
                        filemode='w', filename='logs/nopaqued.log',
                        format='%(asctime)s - %(levelname)s - %(name)s - '
                               '%(filename)s - %(lineno)d - %(message)s')
    logger = logging.getLogger(__name__)
    if os.environ.get('FLASK_CONFIG') == 'development':
        logger.setLevel(logging.DEBUG)
    if os.environ.get('FLASK_CONFIG') == 'production':
        logger.setLevel(logging.WARNING)
    return logger


if __name__ == '__main__':
    init_logger()
455
daemon/nopaqued.bak.py
Normal file
@@ -0,0 +1,455 @@
from notify.notification import Notification
from notify.service import NotificationService
from sqlalchemy import create_engine, asc
from sqlalchemy.orm import Session, relationship
from sqlalchemy.ext.automap import automap_base
from datetime import datetime
from time import sleep
import docker
import json
import logging
import os
import shutil


''' Global constants '''
NOPAQUE_STORAGE = os.environ.get('NOPAQUE_STORAGE')

''' Global variables '''
docker_client = None
session = None


# Classes for database models
Base = automap_base()


class Corpus(Base):
    __tablename__ = 'corpora'
    files = relationship('CorpusFile', collection_class=set)


class CorpusFile(Base):
    __tablename__ = 'corpus_files'


class Job(Base):
    __tablename__ = 'jobs'
    inputs = relationship('JobInput', collection_class=set)
    results = relationship('JobResult', collection_class=set)
    notification_data = relationship('NotificationData', collection_class=list)
    notification_email_data = relationship('NotificationEmailData', collection_class=list)


class NotificationData(Base):
    __tablename__ = 'notification_data'
    job = relationship('Job', collection_class=set)


class NotificationEmailData(Base):
    __tablename__ = 'notification_email_data'
    job = relationship('Job', collection_class=set)


class JobInput(Base):
    __tablename__ = 'job_inputs'


class JobResult(Base):
    __tablename__ = 'job_results'


class User(Base):
    __tablename__ = 'users'
    jobs = relationship('Job', collection_class=set)
    corpora = relationship('Corpus', collection_class=set)


def check_corpora():
    corpora = session.query(Corpus).all()
    for corpus in filter(lambda corpus: corpus.status == 'submitted', corpora):
        __create_build_corpus_service(corpus)
    for corpus in filter(lambda corpus: (corpus.status == 'queued'
                                         or corpus.status == 'running'),
                         corpora):
        __checkout_build_corpus_service(corpus)
    for corpus in filter(lambda corpus: corpus.status == 'start analysis',
                         corpora):
        __create_cqpserver_container(corpus)
    for corpus in filter(lambda corpus: corpus.status == 'stop analysis',
                         corpora):
        __remove_cqpserver_container(corpus)


def __create_build_corpus_service(corpus):
    corpus_dir = os.path.join(NOPAQUE_STORAGE, str(corpus.user_id),
                              'corpora', str(corpus.id))
    corpus_data_dir = os.path.join(corpus_dir, 'data')
    corpus_file = os.path.join(corpus_dir, 'merged', 'corpus.vrt')
    corpus_registry_dir = os.path.join(corpus_dir, 'registry')
    if os.path.exists(corpus_data_dir):
        shutil.rmtree(corpus_data_dir)
    if os.path.exists(corpus_registry_dir):
        shutil.rmtree(corpus_registry_dir)
    os.mkdir(corpus_data_dir)
    os.mkdir(corpus_registry_dir)
    service_args = {'command': 'docker-entrypoint.sh build-corpus',
                    'constraints': ['node.role==worker'],
                    'labels': {'origin': 'nopaque',
                               'type': 'corpus.prepare',
                               'corpus_id': str(corpus.id)},
                    'mounts': [corpus_file + ':/root/files/corpus.vrt:ro',
                               corpus_data_dir + ':/corpora/data:rw',
                               corpus_registry_dir + ':/usr/local/share/cwb/registry:rw'],
                    'name': 'build-corpus_{}'.format(corpus.id),
                    'restart_policy': docker.types.RestartPolicy()}
    service_image = ('gitlab.ub.uni-bielefeld.de:4567/sfb1288inf/cqpserver:latest')
    try:
        service = docker_client.services.get(service_args['name'])
    except docker.errors.NotFound:
        pass
    except docker.errors.DockerException:
        return
    else:
        service.remove()
    try:
        docker_client.services.create(service_image, **service_args)
    except docker.errors.DockerException:
        corpus.status = 'failed'
    else:
        corpus.status = 'queued'


def __checkout_build_corpus_service(corpus):
    service_name = 'build-corpus_{}'.format(corpus.id)
    try:
        service = docker_client.services.get(service_name)
    except docker.errors.NotFound:
        logger.error('__checkout_build_corpus_service({}):'.format(corpus.id)
                     + ' The service does not exist.'
                     + ' (status: {} -> failed)'.format(corpus.status))
        corpus.status = 'failed'
        return
    except docker.errors.DockerException:
        return
    service_tasks = service.tasks()
    if not service_tasks:
        return
    task_state = service_tasks[0].get('Status').get('State')
    if corpus.status == 'queued' and task_state != 'pending':
        corpus.status = 'running'
    elif corpus.status == 'running' and task_state == 'complete':
        service.remove()
        corpus.status = 'prepared'
    elif corpus.status == 'running' and task_state == 'failed':
        service.remove()
        corpus.status = task_state


def __create_cqpserver_container(corpus):
    corpus_dir = os.path.join(NOPAQUE_STORAGE, str(corpus.user_id),
                              'corpora', str(corpus.id))
    corpus_data_dir = os.path.join(corpus_dir, 'data')
    corpus_registry_dir = os.path.join(corpus_dir, 'registry')
    container_args = {'command': 'cqpserver',
                      'detach': True,
                      'volumes': [corpus_data_dir + ':/corpora/data:rw',
                                  corpus_registry_dir + ':/usr/local/share/cwb/registry:rw'],
                      'name': 'cqpserver_{}'.format(corpus.id),
                      'network': 'opaque_default'}
    container_image = ('gitlab.ub.uni-bielefeld.de:4567/sfb1288inf/cqpserver:latest')
    try:
        container = docker_client.containers.get(container_args['name'])
    except docker.errors.NotFound:
        pass
    except docker.errors.DockerException:
        return
    else:
        container.remove(force=True)
    try:
        docker_client.containers.run(container_image, **container_args)
    except docker.errors.DockerException:
        return
    else:
        corpus.status = 'analysing'


def __remove_cqpserver_container(corpus):
    container_name = 'cqpserver_{}'.format(corpus.id)
    try:
        container = docker_client.containers.get(container_name)
    except docker.errors.NotFound:
        pass
    except docker.errors.DockerException:
        return
    else:
        container.remove(force=True)
    corpus.status = 'prepared'


def check_jobs():
    jobs = session.query(Job).all()
    for job in filter(lambda job: job.status == 'submitted', jobs):
        __create_job_service(job)
    for job in filter(lambda job: (job.status == 'queued'), jobs):
        __checkout_job_service(job)
        # __add_notification_data(job, 'queued')
    for job in filter(lambda job: (job.status == 'running'), jobs):
        __checkout_job_service(job)
        # __add_notification_data(job, 'running')
    # for job in filter(lambda job: job.status == 'complete', jobs):
    #     __add_notification_data(job, 'complete')
    # for job in filter(lambda job: job.status == 'failed', jobs):
    #     __add_notification_data(job, 'failed')
    for job in filter(lambda job: job.status == 'canceling', jobs):
        __remove_job_service(job)


def __add_notification_data(job, notified_on_status):
    # checks if user wants any notifications at all
    if (job.user.setting_job_status_mail_notifications == 'none'):
        # logger.warning('User does not want any notifications!')
        return
    # checks if user wants only notification on completed jobs
    elif (job.user.setting_job_status_mail_notifications == 'end'
          and notified_on_status != 'complete'):
        # logger.warning('User only wants notifications on job completed!')
        return
    else:
        # check if a job already has associated NotificationData
        notification_exists = len(job.notification_data)
        # create notification_data for current job if there is none
        if (notification_exists == 0):
            notification_data = NotificationData(job_id=job.id)
            session.add(notification_data)
            session.commit()  # Without a commit the job has no NotificationData
            # logger.warning('Created NotificationData for current Job.')
        else:
            pass
            # logger.warning('Job already had notification: {}'.format(notification_exists))
        if (job.notification_data[0].notified_on != notified_on_status):
            notification_email_data = NotificationEmailData(job_id=job.id)
            notification_email_data.notify_status = notified_on_status
            notification_email_data.creation_date = datetime.utcnow()
            job.notification_data[0].notified_on = notified_on_status
            session.add(notification_email_data)
            # logger.warning('Created NotificationEmailData for current Job.')
        else:
            # logger.warning('NotificationEmailData has already been created for current Job!')
            pass


def __create_job_service(job):
    job_dir = os.path.join(NOPAQUE_STORAGE, str(job.user_id), 'jobs',
                           str(job.id))
    service_args = {'command': ('{} /files /files/output'.format(job.service)
                                + ' {}'.format(job.secure_filename if job.service == 'file-setup' else '')
                                + ' --log-dir /files'
                                + ' --zip [{}]_{}'.format(job.service, job.secure_filename)
                                + ' ' + ' '.join(json.loads(job.service_args))),
                    'constraints': ['node.role==worker'],
                    'labels': {'origin': 'nopaque',
                               'type': 'service.{}'.format(job.service),
                               'job_id': str(job.id)},
                    'mounts': [job_dir + ':/files:rw'],
                    'name': 'job_{}'.format(job.id),
                    'resources': docker.types.Resources(
                        cpu_reservation=job.n_cores * (10 ** 9),
                        mem_reservation=job.mem_mb * (10 ** 6)),
                    'restart_policy': docker.types.RestartPolicy()}
    service_image = ('gitlab.ub.uni-bielefeld.de:4567/sfb1288inf/'
                     + job.service + ':' + job.service_version)
    try:
        service = docker_client.services.get(service_args['name'])
    except docker.errors.NotFound:
        pass
    except docker.errors.DockerException:
        return
    else:
        service.remove()
    try:
        docker_client.services.create(service_image, **service_args)
    except docker.errors.DockerException:
        job.status = 'failed'
    else:
        job.status = 'queued'


def __checkout_job_service(job):
    service_name = 'job_{}'.format(job.id)
    try:
        service = docker_client.services.get(service_name)
    except docker.errors.NotFound:
        logger.error('__checkout_job_service({}):'.format(job.id)
                     + ' The service does not exist.'
                     + ' (status: {} -> failed)'.format(job.status))
        job.status = 'failed'
        return
    except docker.errors.DockerException:
        return
    service_tasks = service.tasks()
    if not service_tasks:
        return
    task_state = service_tasks[0].get('Status').get('State')
    if job.status == 'queued' and task_state != 'pending':
        job.status = 'running'
    elif (job.status == 'running'
          and (task_state == 'complete' or task_state == 'failed')):
        service.remove()
        job.end_date = datetime.utcnow()
        job.status = task_state
        if task_state == 'complete':
            results_dir = os.path.join(NOPAQUE_STORAGE, str(job.user_id),
                                       'jobs', str(job.id), 'output')
            results = filter(lambda x: x.endswith('.zip'),
                             os.listdir(results_dir))
            for result in results:
                job_result = JobResult(dir=results_dir, filename=result,
                                       job_id=job.id)
                session.add(job_result)


def __remove_job_service(job):
    service_name = 'job_{}'.format(job.id)
    try:
        service = docker_client.services.get(service_name)
    except docker.errors.NotFound:
        job.status = 'canceled'
    except docker.errors.DockerException:
        return
    else:
        service.update(mounts=None)
        service.remove()


def handle_jobs():
    check_jobs()


def handle_corpora():
    check_corpora()


# Email notification functions
def create_mail_notifications(notification_service):
    notification_email_data = session.query(NotificationEmailData).order_by(asc(NotificationEmailData.creation_date)).all()
    notifications = {}
    for data in notification_email_data:
        notification = Notification()
        notification.set_addresses(notification_service.email_address,
                                   data.job.user.email)
        subject_template = '[nopaque] Status update for your Job/Corpora: {title}!'
        subject_template_values_dict = {'title': data.job.title}
        domain = os.environ.get('NOPAQUE_DOMAIN')
        url = '{domain}/{jobs}/{id}'.format(domain=domain,
                                            jobs='jobs',
                                            id=data.job.id)
        body_template_values_dict = {'username': data.job.user.username,
                                     'id': data.job.id,
                                     'title': data.job.title,
                                     'status': data.notify_status,
                                     'time': data.creation_date,
                                     'url': url}
        notification.set_notification_content(subject_template,
                                              subject_template_values_dict,
                                              'templates/notification_messages/notification.txt',
                                              'templates/notification_messages/notification.html',
                                              body_template_values_dict)
        notifications[data.job.id] = notification
        # Using a dictionary for notifications avoids sending multiple mails
        # if the status of a job changes in a few seconds. The user will not
        # get swamped with mails for queued, running and complete if those
        # happen within a few seconds. Only the last update will be sent.
        session.delete(data)
    return notifications


def send_mail_notifications(notifications, notification_service):
    for key, notification in notifications.items():
        try:
            notification_service.send(notification)
            notification_service.mail_limit_exceeded = False
        except Exception as e:
            # Adds notifications to unsent if the mail server exceeded its
            # limit for consecutive mail sending
            notification_service.not_sent[key] = notification
            notification_service.mail_limit_exceeded = True


def notify():
    # Initialize notification service
    notification_service = NotificationService()
    notification_service.get_smtp_configs()
    notification_service.set_server()
    # create notifications (content, recipient etc.)
    notifications = create_mail_notifications(notification_service)
    # only login and send mails if there are any notifications
    if (len(notifications) > 0):
        try:
            notification_service.login()
            # combine new and unsent notifications
            notifications.update(notification_service.not_sent)
            # send all notifications
            send_mail_notifications(notifications, notification_service)
            # remove unsent notifications because they have been sent now
            # but only if mail limit has not been exceeded
            if (notification_service.mail_limit_exceeded is not True):
                notification_service.not_sent = {}
            notification_service.quit()
        except Exception as e:
            notification_service.not_sent.update(notifications)


# Logger functions #
def init_logger():
    '''
    Initiates a logger instance.
    '''
    global logger

    if not os.path.isfile('logs/nopaqued.log'):
        file_path = os.path.join(os.getcwd(), 'logs/nopaqued.log')
        log = open(file_path, 'w+')
        log.close()
    logging.basicConfig(datefmt='%Y-%m-%d %H:%M:%S',
                        filemode='w', filename='logs/nopaqued.log',
                        format='%(asctime)s - %(levelname)s - %(name)s - '
                               '%(filename)s - %(lineno)d - %(message)s')
    logger = logging.getLogger(__name__)
    if os.environ.get('FLASK_CONFIG') == 'development':
        logger.setLevel(logging.DEBUG)
    if os.environ.get('FLASK_CONFIG') == 'production':
        logger.setLevel(logging.WARNING)


def nopaqued():
    global Base
    global docker_client
    global session

    engine = create_engine(
        'postgresql://{}:{}@db/{}'.format(
            os.environ.get('POSTGRES_USER'),
            os.environ.get('POSTGRES_PASSWORD'),
            os.environ.get('POSTGRES_DB_NAME')))
    Base.prepare(engine, reflect=True)
    session = Session(engine)
    session.commit()

    docker_client = docker.from_env()
    docker_client.login(password=os.environ.get('GITLAB_PASSWORD'),
                        registry="gitlab.ub.uni-bielefeld.de:4567",
                        username=os.environ.get('GITLAB_USERNAME'))

    # executing background functions
    while True:
        handle_jobs()
        handle_corpora()
        # notify()
        session.commit()
        sleep(3)


if __name__ == '__main__':
    init_logger()
    nopaqued()
18
daemon/nopaqued.py
Normal file
@@ -0,0 +1,18 @@
from tasks.check_jobs import check_jobs
from tasks.check_corpora import check_corpora
from tasks.notify import notify
from time import sleep


def nopaqued():
    # executing background functions
    while True:
        check_jobs()
        check_corpora()
        notify(True)  # If True mails are sent. If False no mails are sent.
                      # But notification status will be set nonetheless.
        sleep(3)


if __name__ == '__main__':
    nopaqued()
0
daemon/notify/__init__.py
Normal file
27
daemon/notify/notification.py
Normal file
@@ -0,0 +1,27 @@
from email.message import EmailMessage


class Notification(EmailMessage):
    """An email notification message."""

    def set_notification_content(self,
                                 subject_template,
                                 subject_template_values_dict,
                                 body_txt_template_path,
                                 body_html_template_path,
                                 body_template_values_dict):
        # Create subject with subject_template_values_dict
        self['subject'] = subject_template.format(**subject_template_values_dict)
        # Open template files and insert values from body_template_values_dict
        with open(body_txt_template_path) as nfile:
            self.body_txt = nfile.read().format(**body_template_values_dict)
        with open(body_html_template_path) as nfile:
            self.body_html = nfile.read().format(**body_template_values_dict)
        # Set txt of email
        self.set_content(self.body_txt)
        # Set html alternative
        self.add_alternative(self.body_html, subtype='html')

    def set_addresses(self, sender, recipient):
        self['From'] = sender
        self['to'] = recipient
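Because `Notification` subclasses `EmailMessage`, a filled-in instance can be handed directly to `smtplib.SMTP.send_message`. A usage sketch with illustrative addresses and template values (the template paths are the files added further below):

```python
from notify.notification import Notification

notification = Notification()
notification.set_addresses('nopaque@example.com', 'user@example.com')
notification.set_notification_content(
    '[nopaque] Status update for your Job/Corpora: {title}!',
    {'title': 'My OCR job'},
    'notify/templates/notification_messages/notification.txt',
    'notify/templates/notification_messages/notification.html',
    {'username': 'jdoe', 'id': 42, 'title': 'My OCR job',
     'status': 'complete', 'time': '2020-01-01 12:00:00',
     'url': 'https://nopaque.localhost/jobs/42'})
# The message now carries a plain-text body plus an HTML alternative.
```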
41
daemon/notify/service.py
Normal file
@@ -0,0 +1,41 @@
import os
import smtplib


class NotificationService(object):
    """This is a nopaque notification service object."""

    def __init__(self, execute_flag):
        super(NotificationService, self).__init__()
        self.execute_flag = execute_flag  # If True mails are sent normally.
        # If False mails are not sent. Used to avoid sending mails for jobs
        # that have been completed a long time ago. Use this if you implement
        # notify into an already existing nopaque instance. Change it to True
        # after the daemon has run one time with the flag set to False.
        self.not_sent = {}  # Holds email notifications that went unsent due to an error
        self.mail_limit_exceeded = False  # Shows whether the mail server
        # stopped sending mails due to exceeding its sending limit

    def get_smtp_configs(self):
        self.password = os.environ.get('MAIL_PASSWORD')
        self.port = os.environ.get('MAIL_PORT')
        self.server_str = os.environ.get('MAIL_SERVER')
        self.tls = os.environ.get('MAIL_USE_TLS')
        self.username = os.environ.get('MAIL_USERNAME').split("@")[0]
        self.email_address = os.environ.get('MAIL_USERNAME')

    def set_server(self):
        self.smtp_server = smtplib.SMTP(host=self.server_str, port=self.port)

    def login(self):
        self.smtp_server.starttls()
        self.smtp_server.login(self.username, self.password)

    def send(self, email):
        if self.execute_flag:
            self.smtp_server.send_message(email)
        else:
            return

    def quit(self):
        self.smtp_server.quit()
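The intended call sequence, which tasks/notify.py below follows, is: read the configs, open the connection, log in, send, quit. A sketch assuming the MAIL_* variables from nopaque.env are set and `notification` is a `Notification` instance as built above:

```python
from notify.service import NotificationService

notification_service = NotificationService(True)  # True: really send mails
notification_service.get_smtp_configs()  # read MAIL_* from the environment
notification_service.set_server()        # open the SMTP connection
notification_service.login()             # STARTTLS, then authenticate
notification_service.send(notification)  # no-op if execute_flag is False
notification_service.quit()
```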
@@ -0,0 +1,15 @@
<html>
  <body>
    <p>Dear <b>{username}</b>,</p>

    <p>The status of your Job/Corpus({id}) with the title <b>"{title}"</b> has changed!</p>
    <p>It is now <b>{status}</b>!</p>
    <p>Time of this status update was: <b>{time} UTC</b></p>

    <p>You can access your Job/Corpus here: <a href="{url}">{url}</a>
    </p>

    <p>Kind regards!<br>
    Your nopaque team</p>
  </body>
</html>
@@ -0,0 +1,10 @@
Dear {username},

The status of your Job/Corpus({id}) with the title "{title}" has changed!
It is now {status}!
Time of this status update was: {time} UTC

You can access your Job/Corpus here: {url}

Kind regards!
Your nopaque team
3
daemon/requirements.txt
Normal file
@@ -0,0 +1,3 @@
docker
psycopg2
SQLAlchemy
52
daemon/tasks/Models.py
Normal file
@@ -0,0 +1,52 @@
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import relationship
from tasks import engine


Base = automap_base()


# Classes for database models
class Corpus(Base):
    __tablename__ = 'corpora'
    files = relationship('CorpusFile', collection_class=set)


class CorpusFile(Base):
    __tablename__ = 'corpus_files'


class Job(Base):
    __tablename__ = 'jobs'
    inputs = relationship('JobInput', collection_class=set)
    results = relationship('JobResult', collection_class=set)
    notification_data = relationship('NotificationData', collection_class=list)
    notification_email_data = relationship('NotificationEmailData',
                                           collection_class=list)


class JobInput(Base):
    __tablename__ = 'job_inputs'


class JobResult(Base):
    __tablename__ = 'job_results'


class NotificationData(Base):
    __tablename__ = 'notification_data'
    job = relationship('Job', collection_class=set)


class NotificationEmailData(Base):
    __tablename__ = 'notification_email_data'
    job = relationship('Job', collection_class=set)


class User(Base):
    __tablename__ = 'users'
    jobs = relationship('Job', collection_class=set)
    corpora = relationship('Corpus', collection_class=set)


Base.prepare(engine, reflect=True)
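Since the classes are automapped, their column attributes (id, status, user_id, …) are reflected from the live database schema created by the web application's migrations; only the relationships are declared here. A query sketch under that assumption:

```python
from tasks import Session
from tasks.Models import Job

session = Session()
# Column attributes such as Job.status exist after Base.prepare() has
# reflected the schema, even though they are not declared in Models.py.
for job in session.query(Job).filter(Job.status == 'submitted'):
    print(job.id, job.status)
Session.remove()
```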
22
daemon/tasks/__init__.py
Normal file
@@ -0,0 +1,22 @@
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker
import os
import docker

''' Global constants '''
NOPAQUE_STORAGE = os.environ.get('NOPAQUE_STORAGE')

''' Docker client '''
docker_client = docker.from_env()
docker_client.login(password=os.environ.get('GITLAB_PASSWORD'),
                    registry="gitlab.ub.uni-bielefeld.de:4567",
                    username=os.environ.get('GITLAB_USERNAME'))

''' Scoped session '''
engine = create_engine(
    'postgresql://{}:{}@db/{}'.format(
        os.environ.get('POSTGRES_USER'),
        os.environ.get('POSTGRES_PASSWORD'),
        os.environ.get('POSTGRES_DB_NAME')))
session_factory = sessionmaker(bind=engine)
Session = scoped_session(session_factory)
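Each task decorated with @background runs in its own thread, so the task modules below check a thread-local session out of the scoped_session registry and remove it when done instead of sharing one global session. The pattern, in sketch form (`count_corpora` is a hypothetical task):

```python
from tasks import Session
from tasks.Models import Corpus


def count_corpora():
    session = Session()  # thread-local session from the registry
    try:
        return session.query(Corpus).count()
    finally:
        Session.remove()  # dispose this thread's session
```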
134
daemon/tasks/check_corpora.py
Normal file
@@ -0,0 +1,134 @@
from decorators import background
from logger.logger import init_logger
from tasks import Session, docker_client, NOPAQUE_STORAGE
from tasks.Models import Corpus
import docker
import os
import shutil


@background
def check_corpora():
    c_session = Session()
    corpora = c_session.query(Corpus).all()
    for corpus in filter(lambda corpus: corpus.status == 'submitted', corpora):
        __create_build_corpus_service(corpus)
    for corpus in filter(lambda corpus: (corpus.status == 'queued'
                                         or corpus.status == 'running'),
                         corpora):
        __checkout_build_corpus_service(corpus)
    for corpus in filter(lambda corpus: corpus.status == 'start analysis',
                         corpora):
        __create_cqpserver_container(corpus)
    for corpus in filter(lambda corpus: corpus.status == 'stop analysis',
                         corpora):
        __remove_cqpserver_container(corpus)
    c_session.commit()
    Session.remove()


def __create_build_corpus_service(corpus):
    corpus_dir = os.path.join(NOPAQUE_STORAGE, str(corpus.user_id),
                              'corpora', str(corpus.id))
    corpus_data_dir = os.path.join(corpus_dir, 'data')
    corpus_file = os.path.join(corpus_dir, 'merged', 'corpus.vrt')
    corpus_registry_dir = os.path.join(corpus_dir, 'registry')
    if os.path.exists(corpus_data_dir):
        shutil.rmtree(corpus_data_dir)
    if os.path.exists(corpus_registry_dir):
        shutil.rmtree(corpus_registry_dir)
    os.mkdir(corpus_data_dir)
    os.mkdir(corpus_registry_dir)
    service_args = {'command': 'docker-entrypoint.sh build-corpus',
                    'constraints': ['node.role==worker'],
                    'labels': {'origin': 'nopaque',
                               'type': 'corpus.prepare',
                               'corpus_id': str(corpus.id)},
                    'mounts': [corpus_file + ':/root/files/corpus.vrt:ro',
                               corpus_data_dir + ':/corpora/data:rw',
                               corpus_registry_dir + ':/usr/local/share/cwb/registry:rw'],
                    'name': 'build-corpus_{}'.format(corpus.id),
                    'restart_policy': docker.types.RestartPolicy()}
    service_image = ('gitlab.ub.uni-bielefeld.de:4567/sfb1288inf/cqpserver:latest')
    try:
        service = docker_client.services.get(service_args['name'])
    except docker.errors.NotFound:
        pass
    except docker.errors.DockerException:
        return
    else:
        service.remove()
    try:
        docker_client.services.create(service_image, **service_args)
    except docker.errors.DockerException:
        corpus.status = 'failed'
    else:
        corpus.status = 'queued'


def __checkout_build_corpus_service(corpus):
    logger = init_logger()
    service_name = 'build-corpus_{}'.format(corpus.id)
    try:
        service = docker_client.services.get(service_name)
    except docker.errors.NotFound:
        logger.error('__checkout_build_corpus_service({}):'.format(corpus.id)
                     + ' The service does not exist.'
                     + ' (status: {} -> failed)'.format(corpus.status))
        corpus.status = 'failed'
        return
    except docker.errors.DockerException:
        return
    service_tasks = service.tasks()
    if not service_tasks:
        return
    task_state = service_tasks[0].get('Status').get('State')
    if corpus.status == 'queued' and task_state != 'pending':
        corpus.status = 'running'
    elif corpus.status == 'running' and task_state == 'complete':
        service.remove()
        corpus.status = 'prepared'
    elif corpus.status == 'running' and task_state == 'failed':
        service.remove()
        corpus.status = task_state


def __create_cqpserver_container(corpus):
    corpus_dir = os.path.join(NOPAQUE_STORAGE, str(corpus.user_id),
                              'corpora', str(corpus.id))
    corpus_data_dir = os.path.join(corpus_dir, 'data')
    corpus_registry_dir = os.path.join(corpus_dir, 'registry')
    container_args = {'command': 'cqpserver',
                      'detach': True,
                      'volumes': [corpus_data_dir + ':/corpora/data:rw',
                                  corpus_registry_dir + ':/usr/local/share/cwb/registry:rw'],
                      'name': 'cqpserver_{}'.format(corpus.id),
                      'network': 'opaque_default'}
    container_image = ('gitlab.ub.uni-bielefeld.de:4567/sfb1288inf/cqpserver:latest')
    try:
        container = docker_client.containers.get(container_args['name'])
    except docker.errors.NotFound:
        pass
    except docker.errors.DockerException:
        return
    else:
        container.remove(force=True)
    try:
        docker_client.containers.run(container_image, **container_args)
    except docker.errors.DockerException:
        return
    else:
        corpus.status = 'analysing'


def __remove_cqpserver_container(corpus):
    container_name = 'cqpserver_{}'.format(corpus.id)
    try:
        container = docker_client.containers.get(container_name)
    except docker.errors.NotFound:
        pass
    except docker.errors.DockerException:
        return
    else:
        container.remove(force=True)
    corpus.status = 'prepared'
152
daemon/tasks/check_jobs.py
Normal file
@@ -0,0 +1,152 @@
from datetime import datetime
from decorators import background
from logger.logger import init_logger
from tasks import Session, docker_client, NOPAQUE_STORAGE
from tasks.Models import Job, NotificationData, NotificationEmailData, JobResult
import docker
import json
import os


@background
def check_jobs():
    # logger = init_logger()
    cj_session = Session()
    jobs = cj_session.query(Job).all()
    for job in filter(lambda job: job.status == 'submitted', jobs):
        __create_job_service(job)
    for job in filter(lambda job: (job.status == 'queued'), jobs):
        __checkout_job_service(job, cj_session)
        __add_notification_data(job, 'queued', cj_session)
    for job in filter(lambda job: (job.status == 'running'), jobs):
        __checkout_job_service(job, cj_session)
        __add_notification_data(job, 'running', cj_session)
    for job in filter(lambda job: job.status == 'complete', jobs):
        __add_notification_data(job, 'complete', cj_session)
    for job in filter(lambda job: job.status == 'failed', jobs):
        __add_notification_data(job, 'failed', cj_session)
    for job in filter(lambda job: job.status == 'canceling', jobs):
        __remove_job_service(job)
    cj_session.commit()
    Session.remove()


def __add_notification_data(job, notified_on_status, scoped_session):
    logger = init_logger()
    # checks if user wants any notifications at all
    if (job.user.setting_job_status_mail_notifications == 'none'):
        # logger.warning('User does not want any notifications!')
        return
    # checks if user wants only notification on completed jobs
    elif (job.user.setting_job_status_mail_notifications == 'end'
          and notified_on_status != 'complete'):
        # logger.warning('User only wants notifications on job completed!')
        return
    else:
        # check if a job already has associated NotificationData
        notification_exists = len(job.notification_data)
        # create notification_data for current job if there is none
        if (notification_exists == 0):
            notification_data = NotificationData(job_id=job.id)
            scoped_session.add(notification_data)
            scoped_session.commit()  # Without a commit the job has no NotificationData
            # logger.warning('Created NotificationData for current Job.')
        else:
            pass
            # logger.warning('Job already had notification: {}'.format(notification_exists))
        if (job.notification_data[0].notified_on != notified_on_status):
            notification_email_data = NotificationEmailData(job_id=job.id)
            notification_email_data.notify_status = notified_on_status
            notification_email_data.creation_date = datetime.utcnow()
            job.notification_data[0].notified_on = notified_on_status
            scoped_session.add(notification_email_data)
            logger.warning('Created NotificationEmailData for current Job.')
        else:
            # logger.warning('NotificationEmailData has already been created for current Job!')
            pass


def __create_job_service(job):
    job_dir = os.path.join(NOPAQUE_STORAGE, str(job.user_id), 'jobs',
                           str(job.id))
    service_args = {'command': ('{} /files /files/output'.format(job.service)
                                + ' {}'.format(job.secure_filename if job.service == 'file-setup' else '')
                                + ' --log-dir /files'
                                + ' --zip [{}]_{}'.format(job.service, job.secure_filename)
                                + ' ' + ' '.join(json.loads(job.service_args))),
                    'constraints': ['node.role==worker'],
                    'labels': {'origin': 'nopaque',
                               'type': 'service.{}'.format(job.service),
                               'job_id': str(job.id)},
                    'mounts': [job_dir + ':/files:rw'],
                    'name': 'job_{}'.format(job.id),
                    'resources': docker.types.Resources(
                        cpu_reservation=job.n_cores * (10 ** 9),
                        mem_reservation=job.mem_mb * (10 ** 6)),
                    'restart_policy': docker.types.RestartPolicy()}
    service_image = ('gitlab.ub.uni-bielefeld.de:4567/sfb1288inf/'
                     + job.service + ':' + job.service_version)
    try:
        service = docker_client.services.get(service_args['name'])
    except docker.errors.NotFound:
        pass
    except docker.errors.DockerException:
        return
    else:
        service.remove()
    try:
        docker_client.services.create(service_image, **service_args)
    except docker.errors.DockerException:
        job.status = 'failed'
    else:
        job.status = 'queued'


def __checkout_job_service(job, scoped_session):
    logger = init_logger()
    service_name = 'job_{}'.format(job.id)
    try:
        service = docker_client.services.get(service_name)
    except docker.errors.NotFound:
        logger.error('__checkout_job_service({}):'.format(job.id)
                     + ' The service does not exist.'
                     + ' (status: {} -> failed)'.format(job.status))
        job.status = 'failed'
        return
    except docker.errors.DockerException:
        return
    service_tasks = service.tasks()
    if not service_tasks:
        return
    task_state = service_tasks[0].get('Status').get('State')
    if job.status == 'queued' and task_state != 'pending':
        job.status = 'running'
    elif (job.status == 'running'
          and (task_state == 'complete' or task_state == 'failed')):
        service.remove()
        job.end_date = datetime.utcnow()
        job.status = task_state
        if task_state == 'complete':
            results_dir = os.path.join(NOPAQUE_STORAGE, str(job.user_id),
                                       'jobs', str(job.id), 'output')
            results = filter(lambda x: x.endswith('.zip'),
                             os.listdir(results_dir))
            for result in results:
                job_result = JobResult(dir=results_dir, filename=result,
                                       job_id=job.id)
                scoped_session.add(job_result)
    scoped_session.commit()


def __remove_job_service(job):
    service_name = 'job_{}'.format(job.id)
    try:
        service = docker_client.services.get(service_name)
    except docker.errors.NotFound:
        job.status = 'canceled'
    except docker.errors.DockerException:
        return
    else:
        service.update(mounts=None)
        service.remove()
87
daemon/tasks/notify.py
Normal file
@@ -0,0 +1,87 @@
from decorators import background
from notify.notification import Notification
from notify.service import NotificationService
from sqlalchemy import asc
from tasks import Session
from tasks.Models import NotificationEmailData
import os


# Email notification functions
def __create_mail_notifications(notification_service):
    mn_session = Session()
    notification_email_data = mn_session.query(NotificationEmailData).order_by(asc(NotificationEmailData.creation_date)).all()
    notifications = {}
    for data in notification_email_data:
        notification = Notification()
        notification.set_addresses(notification_service.email_address,
                                   data.job.user.email)
        subject_template = '[nopaque] Status update for your Job/Corpora: {title}!'
        subject_template_values_dict = {'title': data.job.title}
        domain = os.environ.get('NOPAQUE_DOMAIN')
        url = '{domain}/{jobs}/{id}'.format(domain=domain,
                                            jobs='jobs',
                                            id=data.job.id)
        body_template_values_dict = {'username': data.job.user.username,
                                     'id': data.job.id,
                                     'title': data.job.title,
                                     'status': data.notify_status,
                                     'time': data.creation_date,
                                     'url': url}
        notification.set_notification_content(subject_template,
                                              subject_template_values_dict,
                                              'notify/templates/notification_messages/notification.txt',
                                              'notify/templates/notification_messages/notification.html',
                                              body_template_values_dict)
        notifications[data.job.id] = notification
        # Using a dictionary for notifications avoids sending multiple mails
        # if the status of a job changes in a few seconds. The user will not
        # get swamped with mails for queued, running and complete if those
        # happen within a few seconds. Only the last update will be sent.
        # This depends on the sleep time interval though.
        mn_session.delete(data)
    mn_session.commit()
    Session.remove()
    return notifications


def __send_mail_notifications(notifications, notification_service):
    for key, notification in notifications.items():
        try:
            notification_service.send(notification)
            notification_service.mail_limit_exceeded = False
        except Exception as e:
            # Adds notifications to unsent if the mail server exceeded its
            # limit for consecutive mail sending
            notification_service.not_sent[key] = notification
            notification_service.mail_limit_exceeded = True


@background
def notify(execute_flag):
    # If True mails are sent normally.
    # If False mails are not sent. Used to avoid sending mails for jobs that
    # have been completed a long time ago. Use this if you implement notify
    # into an already existing nopaque instance. Change it to True after the
    # daemon has run one time with the flag set to False.
    # Initialize notification service
    notification_service = NotificationService(execute_flag)
    notification_service.get_smtp_configs()
    notification_service.set_server()
    # create notifications (content, recipient etc.)
    notifications = __create_mail_notifications(notification_service)
    # only login and send mails if there are any notifications
    if (len(notifications) > 0):
        try:
            notification_service.login()
            # combine new and unsent notifications
            notifications.update(notification_service.not_sent)
            # send all notifications
            __send_mail_notifications(notifications, notification_service)
            # remove unsent notifications because they have been sent now
            # but only if mail limit has not been exceeded
            if (notification_service.mail_limit_exceeded is not True):
                notification_service.not_sent = {}
            notification_service.quit()
        except Exception as e:
            notification_service.not_sent.update(notifications)
@@ -10,11 +10,16 @@ volumes:

services:
  web:
    build:
      args:
        gid: ${gid}
        uid: ${uid}
      context: ./web
    depends_on:
      - db
      - redis
    env_file: nopaque.env
    image: gitlab.ub.uni-bielefeld.de:4567/sfb1288inf/opaque:development
    image: nopaque/web
    labels:
      - "traefik.docker.network=reverse-proxy"
      - "traefik.enable=true"
@@ -22,13 +27,13 @@ services:
      - "traefik.http.middlewares.nopaque-header.headers.customrequestheaders.X-Forwarded-Proto=http"
      - "traefik.http.routers.nopaque.entrypoints=web"
      - "traefik.http.routers.nopaque.middlewares=nopaque-header, redirect-to-https@file"
      - "traefik.http.routers.nopaque.rule=Host(`nopaque.localhost`)" # Change this to match your nopaque domain
      - "traefik.http.routers.nopaque.rule=Host(`nopaque.localhost`)"
      ### </http> ###
      ### <https> ###
      - "traefik.http.middlewares.nopaque-secure-header.headers.customrequestheaders.X-Forwarded-Proto=https"
      - "traefik.http.routers.nopaque-secure.entrypoints=web-secure"
      - "traefik.http.routers.nopaque-secure.middlewares=hsts-header@file, nopaque-secure-header"
      - "traefik.http.routers.nopaque-secure.rule=Host(`nopaque.localhost`)" # Change this to match your nopaque domain
      - "traefik.http.routers.nopaque-secure.rule=Host(`nopaque.localhost`)"
      - "traefik.http.routers.nopaque-secure.tls.options=intermediate@file"
      ### </https> ###
      ### <basicauth help="https://docs.traefik.io/middlewares/basicauth/"> ###
@@ -41,32 +46,35 @@ services:
      - reverse-proxy
    volumes:
      - "/mnt/dind-swarm/nopaque:/mnt/dind-swarm/nopaque"
      - "./app:/home/nopaque/app"
      - "./logs:/home/nopaque/logs"
      - "./migrations:/home/nopaque/migrations"
      - "./tests:/home/nopaque/tests"
      - "./config.py:/home/nopaque/config.py"
      - "./docker-entrypoint.sh:/usr/local/bin/docker-entrypoint.sh"
      - "./nopaque.py:/home/nopaque/nopaque.py"
      - "./requirements.txt:/home/nopaque/requirements.txt"
      - "./web/app:/home/nopaque/app"
      - "./web/migrations:/home/nopaque/migrations"
      - "./web/tests:/home/nopaque/tests"
      - "./web/config.py:/home/nopaque/config.py"
      - "./web/docker-entrypoint.sh:/usr/local/bin/docker-entrypoint.sh"
      - "./web/nopaque.py:/home/nopaque/nopaque.py"
      - "./web/requirements.txt:/home/nopaque/requirements.txt"
  daemon:
    build:
      args:
        docker_gid: ${docker_gid}
        gid: ${gid}
        uid: ${uid}
      context: ./daemon
    depends_on:
      - db
      - web
    env_file: nopaque.env
    extra_hosts:
      - "host.docker.internal:172.17.0.1"
    image: gitlab.ub.uni-bielefeld.de:4567/sfb1288inf/opaque_daemon:latest
    image: nopaque/daemon
    volumes:
      - "/mnt/dind-swarm/nopaque:/mnt/dind-swarm/nopaque"
      - "/var/run/docker.sock:/var/run/docker.sock"
      - "./logs:/home/nopaqued/logs"
      - "../opaque_daemon/docker-entrypoint.sh:/usr/local/bin/docker-entrypoint.sh"
      - "../opaque_daemon/nopaqued.py:/home/nopaqued/nopaqued.py"
      - "../opaque_daemon/decorators.py:/home/nopaqued/decorators.py"
      - "../opaque_daemon/notify:/home/nopaqued/notify"
      - "../opaque_daemon/templates:/home/nopaqued/templates"
      - "../opaque_daemon/requirements.txt:/home/nopaqued/requirements.txt"
      - "$HOME/.docker:/home/nopaqued/.docker"
      - "./daemon/notify:/home/nopaqued/notify"
      - "./daemon/decorators.py:/home/nopaqued/decorators.py"
      - "./daemon/docker-entrypoint.sh:/usr/local/bin/docker-entrypoint.sh"
      - "./daemon/nopaqued.py:/home/nopaqued/nopaqued.py"
      - "./daemon/requirements.txt:/home/nopaqued/requirements.txt"
  db:
    env_file: nopaque.env
    image: postgres:11
@@ -1,21 +1,18 @@
### PostgreSQL ###
POSTGRES_DB_NAME=
POSTGRES_USER=
POSTGRES_PASSWORD=

### Docker ###
DOCKER_CERT_PATH=
DOCKER_HOST=
DOCKER_TLS_VERIFY=

### GitLab Registry ###
GITLAB_USERNAME=
GITLAB_PASSWORD=
# Fill out these variables to use the Docker HTTP socket. When doing this, you
# can remove the Docker UNIX socket mount from the docker-compose file.
# DOCKER_CERT_PATH=
# DOCKER_HOST=
# DOCKER_TLS_VERIFY=

### Flask ###
FLASK_CONFIG=
SECRET_KEY=

### GitLab Registry ###
GITLAB_USERNAME=
GITLAB_PASSWORD=

### Flask-Mail ###
MAIL_SERVER=
MAIL_PORT=
@@ -27,5 +24,11 @@ MAIL_PASSWORD=
NOPAQUE_ADMIN=
NOPAQUE_CONTACT=
NOPAQUE_DOMAIN=
NOPAQUE_LOG_LEVEL=
NOPAQUE_MAIL_SENDER=
NOPAQUE_STORAGE=

### PostgreSQL ###
POSTGRES_DB_NAME=
POSTGRES_USER=
POSTGRES_PASSWORD=
@@ -4,6 +4,8 @@ FROM python:3.6-slim-stretch
LABEL maintainer="inf_sfb1288@lists.uni-bielefeld.de"


ARG uid=1000
ARG gid=1000
ENV FLASK_APP=nopaque.py
ENV LANG=C.UTF-8

@@ -19,8 +21,8 @@ RUN apt-get update \
 && rm -rf /var/lib/apt/lists/*


RUN groupadd --gid 1000 --system nopaque \
 && useradd --create-home --gid nopaque --no-log-init --system --uid 1000 nopaque
RUN groupadd --gid "${gid}" --system nopaque \
 && useradd --create-home --gid "${gid}" --no-log-init --system --uid "${uid}" nopaque
USER nopaque
WORKDIR /home/nopaque
(17 binary image files relocated; Before and After dimensions and sizes are identical, ranging from 805 B to 6.2 MiB.)