Merge branch 'development'

Inga Kirschnick 2024-01-23 14:19:06 +01:00
commit ba65cf5911
4 changed files with 152 additions and 92 deletions

Binary file not shown (size before: 34 KiB; after: 30 KiB).

View File

@@ -12,7 +12,7 @@
<li> <li>
<a class="dropdown-trigger no-autoinit" data-target="nav-more-dropdown" href="#!" id="nav-more-dropdown-trigger"> <a class="dropdown-trigger no-autoinit" data-target="nav-more-dropdown" href="#!" id="nav-more-dropdown-trigger">
{% if current_user.is_authenticated %} {% if current_user.is_authenticated %}
<img src="{{ url_for('users.user_avatar', user_id=current_user.id) }}" alt="avatar" class="circle left" style="height: 54px; padding: 10px 10px 0 0;"> <img src="{{ url_for('users.user_avatar', user_id=current_user.id) }}" alt="avatar" class="circle left" style="height: 54px; padding:8px;">
{{ current_user.username }} ({{ current_user.email }}) {{ current_user.username }} ({{ current_user.email }})
{% else %} {% else %}
<i class="material-icons left">more_vert</i> <i class="material-icons left">more_vert</i>

View File

@@ -6,119 +6,179 @@
<div class="col s12"> <div class="col s12">
<h1 id="title">{{ title }}</h1> <h1 id="title">{{ title }}</h1>
</div> </div>
<div class="col s12">
<div class="card" id="news-post-january-2024">
<div class="card-content">
<h6 style="font-weight: 300;">January 2024</h6>
<span class="card-title">Looking back on 2023 - new changes to nopaque</span>
<br>
<p>Hello nopaque users!</p>
<p>First of all, the nopaque team would like to wish everyone a good start to 2024! We hope you found the time to relax over the winter break.</p>
<p>Now that the new year has come around and we're all back in the office, we wanted to take the opportunity to tell you about the most important things we've worked on in nopaque in 2023, things we've incorporated into our <b>latest nopaque update</b> as of late <b>December 2023</b>. You may have noticed some of them as you've returned to your projects on nopaque.</p>
<br>
<h6 style="font-weight: 300;">Changes to the Query Builder</h6>
<p>
The Query Builder has undergone changes to make it more intuitive to use and is now the standard option for creating queries.
Individual elements of a query can now be easily modified and edited by clicking on them.
An input marker shows your position in the query and where new elements will be added. This and all other elements can be moved around via drag and drop.
A new toggle button enables users to easily switch between the Query Builder and Expert Mode if they prefer to work with the plain Corpus Query Language (CQL) instead. This can be done in the middle of an existing query; existing chips will be “translated” into CQL.
This also works the other way around: if you want to switch back, your query in CQL will be parsed into chips.
More details and instructions on how to use the new Query Builder can be found in the manual.
</p>
<br>
<h6 style="font-weight: 300;">Community Update</h6>
<p>
The most extensive changes to nopaque have taken place in the Social Area. We want nopaque to be a platform where researchers can connect with each other, so we've added some more features to make this possible.
Users can now update their personal profiles to be publicly visible to others on nopaque, including a short “About me” section and options to share your website, organization, location, and add an avatar that others can see.
It is also possible to share corpora with other researchers via share links, access invitations, or by setting corpus visibility to Public. Other users can only see the metadata of public corpora; further access can be granted upon request.
The extent of access to these shared corpora is managed by assigning the roles of Viewer, Contributor, and Administrator. Viewers may only download the files. Contributors can download and edit files and their metadata as well as analyze and build the corpus. Administrators can manage users, followers and visibility, in addition to all of the above.
</p>
<br>
</div>
</div>
</div>
<div class="col s12">
<div class="card" id="news-post-july-2023">
<div class="card-content">
<h6 style="font-weight: 300;">July 2023</h6>
<span class="card-title">Visualization Update (beta) - new analysis features</span>
<br>
<p>Hey users,</p>
<p>
we wanted to give you some news on updates we're making to nopaque.
Since we want to make it easier for users to grasp and work with different elements of their data,
we've been working on adding some visualization features into the Corpus Analysis service. Currently, the two main modules,
“Reader” and “Concordance”, have been expanded with an additional “Static Visualizations” module, but there's more to come!
</p>
<p>
With the Static Visualizations module, it's now possible to view information
about your corpus, such as the number of (unique) tokens, sentences, lemmata,
corresponding information on individual texts, the distribution of these elements
within your corpus, as well as searchable lists of word frequencies with stopwords
that can be preset and modified. In the future, this area will be extended with more advanced visualization options.
</p>
<p>
We'll keep you posted about further visualization updates. Until then, we hope the latest update improves
your research experience with nopaque. And as always, if you have any ideas for nopaque or need assistance,
don't hesitate to contact us!
</p>
<br>
</div>
</div>
</div>
<div class="col s12">
<div class="card" id="news-post-november-2022">
<div class="card-content">
<h6 style="font-weight: 300;">November 2022</h6>
<span class="card-title">Contribution Update</span>
<br>
<p>Dear users,</p>
<p>
users can now upload their own language models into nopaque. This is useful for working with different languages that are not available as standard in nopaque or if a user wants to work with a language model they have developed themselves. Tesseract models can be uploaded in .traineddata format; spaCy models can be uploaded in .tar.gz format. We are also working on the option to upload models in .whl format in the future.
Uploaded models can be found in the model list of the corresponding service and can be used immediately. Models can also be made public if you have the Contributor role in nopaque.
</p>
<br>
<p><b>Please note:</b> The Contributor role must be requested from the nopaque admins if you would like to make a model public for all users.</p>
<br>
</div>
</div>
</div>
<div class="col s12"> <div class="col s12">
<div class="card" id="april-2022-update"> <div class="card" id="news-post-april-2022">
<div class="card-content"> <div class="card-content">
<span class="card-title">April 2022 update</span> <h6 style="font-weight: 300;">April 2022</h6>
<p>Dear users</p> <span class="card-title">April updates - more features, faster operation</span>
<br> <br>
<p>Hello everyone,</p>
<p> <p>
with the April 2022 update we have improved nopaque in all places. in April 2022, we released an update improving many elements of nopaque. We rewrote a lot of our code,
We have significantly reworked our backend code to utilize our servers more efficiently, including a significant reworking of our backend code for more efficient use of our servers.
integrated a new service, updated all previously existing ones, rewrote a lot of code and made a few minor design improvements. We integrated a new service, updated the existing ones, and made some minor design improvements.
</p> </p>
<br> <br>
<h6 style="font-weight: 300;">Database Cleanup</h6>
<span class="card-title">Where is my Job data?</span>
<p> <p>
At the beginning of the year, we realized that our storage limit had been reached. We may be a bit late with our spring cleaning, but we've tidied up our
This was the time when some users may have noticed system instabilities. database system and deleted old, empty corpora, unconfirmed user accounts and
We were fortunately able to temporarily solve this problem without data loss unnecessary data fields.
by deleting some non-nopaque related data on our system (yes we also do <a href="https://digital-history.uni-bielefeld.de">other things then nopaque</a>).
In order to not face the same problem again, we had to dedicate ourselves to a long-term solution.
This consists of deleting all previous job data with this update and henceforth storing new job data
only for three months after job creation (important note: <b>corpora are not affected</b>).
All job data prior to this update has been backed up for you,
feel free to contact us at nopaque@uni-bielefeld.de if you would like to get this data back.
</p> </p>
<br> <h6 style="font-weight: 300;">What's new?</h6>
<span class="card-title">What's new?</span>
<p> <p>
By partnering up with <a href="https://readcoop.eu/transkribus/?sc=Transkribus">Transkribus</a> we reached one of our long term goals: integrate a HTR service into nopaque. By partnering with Transkribus, we've reached one of our long-term goals: to integrate a
The <a href="{{ url_for('services.transkribus_htr_pipeline') }}">Transkribus HTR Pipeline</a> service is implemented as a kind of proxied service where the work is split between Transkribus and us. Handwritten Text Recognition (HTR) service into nopaque. The Transkribus HTR Pipeline service is implemented as a
kind of proxied service where the work is split between us and Transkribus.
That means we do the preprocessing, storage and postprocessing, while Transkribus handles the HTR itself. That means we do the preprocessing, storage and postprocessing, while Transkribus handles the HTR itself.
</p> </p>
<br>
<p> <p>
One of the changes in the background was to fix our performance issues. While implementing the <a href="{{ url_for('services.transkribus_htr_pipeline') }}">Transkribus HTR Pipeline</a> service we One change we needed to make in the background was to fix our performance issues.
found some optimization potential within different steps of our processing routine. These optimizations are now also While implementing the Transkribus HTR Pipeline service, we saw optimization potential
available in our <a href="{{ url_for('services.transkribus_htr_pipeline') }}">Tesseract OCR Pipeline</a> service, resulting in a speed up of about 4x. in different steps of our processing routine. These optimizations are now also available
For now we are done with the most obvious optimizations but we may include more in the near future, so stay tuned! in our Tesseract OCR Pipeline service and result in speeds that are about four times faster
than before. We're now finished with the major optimizations, but there could be more soon,
so stay tuned!
</p>
<p>
Next, we reorganized our Corpus Analysis code. It was a bit messy, but after a complete rewrite,
we are now able to query a corpus without long loading times and with better error handling,
making the user experience much more stable. The Corpus Analysis service is now modularized and comes with two modules
that recreate and extend the functionality of the old service.
</p>
<p>
The Query Result viewer had to be temporarily disabled, as the code was based on the old Corpus Analysis service.
It will be reintegrated as a module to the Corpus Analysis.
</p>
<p>
The spaCy NLP Pipeline service was also taken care of with some smaller updates. This is important preliminary work
for support of more models/languages missing the full set of linguistic features (lemma, ner, pos, simple_pos).
It still needs some testing and adjustments but will be ready soon!
</p>
<p>
Last, but not least, we made some design changes. Now, you can find color in places that were previously in black and white.
Nothing big, but the new colors can aid in identifying resources more efficiently.
</p>
<h6 style="font-weight: 300;">Where is my job data?</h6>
<p>
We reached our storage limit at the beginning of the year.
At this time, some users may have noticed system instability.
Fortunately, we found a solution that avoided data loss by deleting some
non-nopaque related data in our system (yes, <a href="https://www.uni-bielefeld.de/fakultaeten/geschichtswissenschaft/abteilung/arbeitsbereiche/digital-history/">we also do things other than nopaque</a>).
To avoid facing the same problem again, we had to find a long-term solution.
In the end, this involved the deletion of all previous job data with this update and,
going forward, only keeping new job data for three months after job creation
(<b>important note:</b> corpora are not affected). All job data created prior to this
update has been backed up for you. Feel free to contact us at <a href="mailto:nopaque@uni-bielefeld.de">nopaque@uni-bielefeld.de</a>
if you would like to get this data back.
</p> </p>
<br> <br>
<p>
The next step was to reorganize our <a href="{{ url_for('services.corpus_analysis') }}">Corpus Analysis</a> code. Unfortunatly it was a bit messy, after a complete rewrite we are
now able to query a corpus without long loading times and with better error handling, resulting in way more stable user experience.
The Corpus Analysis service is now modularized and comes with 2 modules that recreate and extend the functionality of the old service.<br>
For now we had to disable the Query Result viewer, the code was based on the old Corpus Analysis service and will be reintegrated as a module to the Corpus Analysis.
</p>
<br>
<p>
The <a href="{{ url_for('services.spacy_nlp_pipeline') }}">spaCy NLP Pipeline</a> service got some love in the form of smaller updates too.
This is important preliminary work to support more models/languages that does not provide the full set of linguistic features (lemma, ner, pos, simple_pos). It still needs some testing and tweaking but will be ready soon!
</p>
<br>
<p>
Last but not least we made some design changes. Now you can find colors in places where we had just black and white before.
Nothing big but the new colors will help you identify ressources more efficient!
</p>
<br>
<span class="card-title">Database cleanup</span>
<p>
We may be a bit late with our spring cleaning but with this update we tidied up within our database system.
This means we deleted old corpora with no corpus files, unconfirmed user accounts and in general unnecessary data fields.
</p>
<br>
<p>
That's it, thank you for using nopaque! We hope you like the update and appreciate all your past and future feedback.
</p>
</div> </div>
</div> </div>
</div> </div>
<div class="col s12"> <div class="col s12">
<div class="card" id="maintenance"> <div class="card" id="news-post-september-2021">
<div class="card-content">
<span class="card-title">Maintenance</span>
<p>Dear users</p>
<br>
<p>Currently we are rewriting big parts of our project infrastructure. Due to this the following features are not available:</p>
<ul>
<li>Corpus export and import</li>
<li>Query result export, import and view</li>
</ul>
<p>We hope to add these features back in the near future, until then check out our updated corpus analysis.</p>
</div>
</div>
</div>
<div class="col s12">
<div class="card" id="nlp-removed-language-support">
<div class="card-content">
<span class="card-title">Natural Language Processing removed language support</span>
<p>Dear users</p>
<br>
<p>Not all language models support all features we utizlize in our NLP service. Thats why we had to drop them, as soon as they meet our requirements we will add them back!</p>
</div>
</div>
</div>
<div class="col s12">
<div class="card" id="beta-launch">
<div class="card-content"> <div class="card-content">
<h6 style="font-weight: 300;">September 2021</h6>
<span class="card-title">nopaque's beta launch</span> <span class="card-title">nopaque's beta launch</span>
<p>Dear users</p>
<br> <br>
<p>A few days ago we went live with nopaque. Right now nopaque is still in its Beta phase. So some bugs are to be expected. If you encounter any bugs or some feature is not working as expected please send as an email using the feedback button at the botton of the page in the footer!</p> <p>Hello to all our users!</p>
<p>We are happy to help you with any issues and will use the feedback to fix all mentioned bugs!</p> <p>The BETA version of our web platform, nopaque, is now available! Nopaque is a web application that offers different services and tools to support researchers working with image and text-based data. These services include:</p>
<ul>
<li>File Setup, which converts and merges different data (e.g., books, letters) for further processing</li>
<li>Optical Character Recognition, which converts photos and scans into text data for machine readability</li>
<li>Natural Language Processing, which extracts information from your text via computational linguistic data processing (tokenization, lemmatization, part-of-speech tagging and named-entity recognition)</li>
<li>Corpus analysis, which makes use of CQP Query Language to search through text corpora with the aid of metadata and Natural Language Processing tags.</li>
</ul>
<p>
Nopaque was created based on our experiences working with other subprojects and a prototype user study in the
first phase of funding. The platform is open source under the terms of the MIT license (<a href="https://gitlab.ub.uni-bielefeld.de/sfb1288inf/nopaque">https://gitlab.ub.uni-bielefeld.de/sfb1288inf/nopaque</a>).
Language support and functions are currently limited; extensions can be requested by sending an email to <a href="mailto:nopaque@uni-bielefeld.de">nopaque@uni-bielefeld.de</a>.
Because we are still in the beta phase, some bugs are to be expected. If you encounter any problems, please let us know!
We are thankful for all feedback we receive.
</p>
</div> </div>
</div> </div>
</div> </div>

View File

@@ -2,7 +2,7 @@ apifairy
cqi>=0.1.7 cqi>=0.1.7
dnspython==2.2.1 dnspython==2.2.1
docker docker
eventlet eventlet==0.34.2
Flask==2.1.3 Flask==2.1.3
Flask-APScheduler Flask-APScheduler
Flask-Assets==2.0 Flask-Assets==2.0