Add April 2022 update news

Patrick Jentsch 2022-04-28 14:35:05 +02:00
parent d61e68c690
commit 7b10a6be2e
2 changed files with 78 additions and 0 deletions

@@ -78,6 +78,7 @@
<p>A job is the execution of a service provided by nopaque. You can create any number of jobs and let them be processed simultaneously.</p>
<div class="card">
<div class="card-content">
<p><b>Where is my Job data?</b> Don't worry, please read <a href="{{ url_for('main.news', _anchor='april-2022-update') }}">this news entry</a>.</p>
<div class="input-field">
<i class="material-icons prefix">search</i>
<input id="search-job" class="search" type="search"></input>

@@ -8,6 +8,83 @@
<h1 id="title">{{ title }}</h1>
</div>
<div class="col s12">
<div class="card" id="april-2022-update">
<div class="card-content">
<span class="card-title">April 2022 update</span>
<p>Dear users,</p>
<br>
<p>
With the April 2022 update, we have improved nopaque across the board.
We have significantly reworked our backend code to utilize our servers more efficiently,
integrated a new service, updated all existing ones, rewritten a lot of code, and made a few minor design improvements.
</p>
<br>
<span class="card-title">Where is my Job data?</span>
<p>
At the beginning of the year, we realized that our storage limit had been reached.
This was when some users may have noticed system instabilities.
Fortunately, we were able to solve this problem temporarily, without data loss,
by deleting some data on our system that is unrelated to nopaque (yes, we also do <a href="https://digital-history.uni-bielefeld.de">other things than nopaque</a>).
To avoid facing the same problem again, we had to commit to a long-term solution.
With this update, we delete all previous job data and, from now on, store new job data
for only three months after job creation (important note: <b>corpora are not affected</b>).
All job data from before this update has been backed up for you;
feel free to contact us at nopaque@uni-bielefeld.de if you would like to get this data back.
</p>
<br>
<span class="card-title">What's new?</span>
<p>
By partnering with <a href="https://readcoop.eu/transkribus/?sc=Transkribus">Transkribus</a>, we reached one of our long-term goals: integrating an HTR service into nopaque.
The <a href="{{ url_for('services.transkribus_htr_pipeline') }}">Transkribus HTR Pipeline</a> service is implemented as a kind of proxied service in which the work is split between Transkribus and us.
That means we do the preprocessing, storage and postprocessing, while Transkribus handles the HTR itself.
</p>
<br>
<p>
One of the background changes was to fix our performance issues. While implementing the <a href="{{ url_for('services.transkribus_htr_pipeline') }}">Transkribus HTR Pipeline</a> service, we
found optimization potential in several steps of our processing routine. These optimizations are now also
available in our <a href="{{ url_for('services.transkribus_htr_pipeline') }}">Tesseract OCR Pipeline</a> service, resulting in a speedup of about 4x.
For now, we are done with the most obvious optimizations, but we may include more in the near future, so stay tuned!
</p>
<br>
<p>
The next step was to reorganize our <a href="{{ url_for('services.corpus_analysis') }}">Corpus Analysis</a> code. Unfortunately, it was a bit messy. After a complete rewrite, we are
now able to query a corpus without long loading times and with better error handling, resulting in a much more stable user experience.
The Corpus Analysis service is now modularized and comes with two modules that recreate and extend the functionality of the old service.<br>
For now, we had to disable the Query Result viewer; its code was based on the old Corpus Analysis service and will be reintegrated as a Corpus Analysis module.
</p>
<br>
<p>
The <a href="{{ url_for('services.spacy_nlp_pipeline') }}">spaCy NLP Pipeline</a> service got some love in the form of smaller updates too.
This is important preliminary work to support more models/languages that do not provide the full set of linguistic features (lemma, ner, pos, simple_pos). It still needs some testing and tweaking but will be ready soon!
</p>
<br>
<p>
Last but not least, we made some design changes. You can now find colors in places where we previously had just black and white.
Nothing big, but the new colors will help you identify resources more efficiently!
</p>
<br>
<span class="card-title">Database cleanup</span>
<p>
We may be a bit late with our spring cleaning, but with this update we tidied up our database.
This means we deleted old corpora with no corpus files, unconfirmed user accounts, and generally unnecessary data fields.
</p>
<br>
<p>
That's it. Thank you for using nopaque! We hope you like the update, and we appreciate all your past and future feedback.
</p>
</div>
</div>
</div>
<div class="col s12">
<div class="card" id="maintenance">
<div class="card-content">