Merge branch 'query-builder' of gitlab.ub.uni-bielefeld.de:sfb1288inf/nopaque into query-builder

Inga Kirschnick 2023-12-07 12:46:48 +01:00
commit 8f5d5ffdec
7 changed files with 34 additions and 21 deletions


@@ -1,5 +1,8 @@
# nopaque # nopaque
![release badge](https://gitlab.ub.uni-bielefeld.de/sfb1288inf/nopaque/-/badges/release.svg)
![pipeline badge](https://gitlab.ub.uni-bielefeld.de/sfb1288inf/nopaque/badges/master/pipeline.svg?ignore_skipped=true)
nopaque bundles various tools and services that provide humanities scholars with DH methods and thus can support their various individual research processes. Using nopaque, researchers can subject digitized sources to Optical Character Recognition (OCR). The resulting text files can then be used as a data basis for Natural Language Processing (NLP). The texts are automatically subjected to various linguistic annotations. The data processed via NLP can then be summarized in the web application as corpora and analyzed by means of an information retrieval system through complex search queries. The range of functions of the web application will be successively extended according to the needs of the researchers. nopaque bundles various tools and services that provide humanities scholars with DH methods and thus can support their various individual research processes. Using nopaque, researchers can subject digitized sources to Optical Character Recognition (OCR). The resulting text files can then be used as a data basis for Natural Language Processing (NLP). The texts are automatically subjected to various linguistic annotations. The data processed via NLP can then be summarized in the web application as corpora and analyzed by means of an information retrieval system through complex search queries. The range of functions of the web application will be successively extended according to the needs of the researchers.
## Prerequisites and requirements ## Prerequisites and requirements


@@ -8,7 +8,7 @@
The <a href="{{ url_for('main.dashboard') }}">dashboard</a> provides a central overview of all resources assigned to the The <a href="{{ url_for('main.dashboard') }}">dashboard</a> provides a central overview of all resources assigned to the
user. These are <a href="{{ url_for('main.dashboard', _anchor='corpora') }}">corpora</a> and created <a href="{{ url_for('main.dashboard', _anchor='jobs') }}">jobs</a>. Corpora are freely composable user. These are <a href="{{ url_for('main.dashboard', _anchor='corpora') }}">corpora</a> and created <a href="{{ url_for('main.dashboard', _anchor='jobs') }}">jobs</a>. Corpora are freely composable
annotated text collections and jobs are the initiated file processing annotated text collections and jobs are the initiated file processing
procedures. Both the job and the corpus listings can be searched using procedures. One can search for jobs as well as corpus listings using
the search field displayed above them. the search field displayed above them.
</p> </p>
</div> </div>
@@ -20,10 +20,10 @@
<p> <p>
A corpus is a collection of texts that can be analyzed using the A corpus is a collection of texts that can be analyzed using the
Corpus Analysis service. All texts must be in the verticalized text Corpus Analysis service. All texts must be in the verticalized text
file format, which can be obtained via the Natrual Language file format, which can be obtained via the Natural Language
Processing service. It contains, in addition to the actual text, Processing service. It contains, in addition to the text,
further annotations that are searchable in combination with optional further annotations that are searchable in combination with optional
addable metadata during your analysis. metadata that can be added during your analysis.
</p> </p>
</div> </div>
</div> </div>


@@ -35,13 +35,13 @@
</p> </p>
<h4>Optical Character Recognition (OCR)</h4> <h4>Optical Character Recognition (OCR)</h4>
<p>Comming soon...</p> <p>Coming soon...</p>
<h4>Handwritten Text Recognition (HTR)</h4> <h4>Handwritten Text Recognition (HTR)</h4>
<p>Comming soon...</p> <p>Coming soon...</p>
<h4>Natural Language Processing (NLP)</h4> <h4>Natural Language Processing (NLP)</h4>
<p>Comming soon...</p> <p>Coming soon...</p>
<h4>Corpus Analysis</h4> <h4>Corpus Analysis</h4>
<p> <p>


@@ -7,7 +7,7 @@
<div class="col s12 m8"> <div class="col s12 m8">
<p> <p>
To <a href="{{ url_for('corpora.create_corpus') }}">create a corpus</a>, you To <a href="{{ url_for('corpora.create_corpus') }}">create a corpus</a>, you
can use the "New Corpus" button, which can be found on both, the Corpus can use the "New Corpus" button, which can be found on both the Corpus
Analysis Service page and the Dashboard below the corpus list. Fill in the input Analysis Service page and the Dashboard below the corpus list. Fill in the input
mask to Create a corpus. After you have completed the input mask, you will mask to Create a corpus. After you have completed the input mask, you will
be automatically taken to the corpus overview page (which can be called up be automatically taken to the corpus overview page (which can be called up
@@ -43,5 +43,5 @@
the way of how a token is displayed, by using the text style switch. The the way of how a token is displayed, by using the text style switch. The
concordance module offers some more options regarding the context size of concordance module offers some more options regarding the context size of
search results. If the context does not provide enough information you can search results. If the context does not provide enough information you can
hop into the reader module by using the lupe icon next to a match. hop into the reader module by using the magnifier icon next to a match.
</p> </p>


@@ -1,14 +1,22 @@
<h3 class="manual-chapter-title">Query Builder Tutorial</h3> <h3 class="manual-chapter-title">Query Builder Tutorial</h3>
<h4>Overview</h4>
<p>The query builder helps you to make a query in the form of the Corpus Query <p>The query builder can be accessed via "My Corpora" or "Corpus Analysis" in the sidebar options.
Language (CQL) to your text. You can use the CQL to filter out various types of Select the desired corpus and click on the "Analyze" and then "Concordance"
text parameters, for example, a specific word, a lemma, or you can set part-of-speech buttons to open the query builder.</p>
<p>The query builder uses the Corpus Query Language (CQL) to help you make a query for analyzing your texts.
In this way, it is possible to filter out various types of text parameters, for
example, a specific word, a lemma, or you can set part-of-speech
tags (pos) that indicate the type of word you are looking for (a noun, an tags (pos) that indicate the type of word you are looking for (a noun, an
adjective, etc.). In addition, you can also search for structural attributes, adjective, etc.). In addition, you can also search for structural attributes,
or specify your query for a token (word, lemma, pos) via entity typing. And of or specify your query for a token (word, lemma, pos) via entity typing. And of
course everything can be combined. You can find examples for different queries course, the different text parameters can be combined.</p>
under the tab "Examples".</p> <p>Tokens and structural attributes can be added by clicking on the "+" button
<p></p> (the "input marker") in the input field. Elements added are shown as chips. These can
be reorganized using drag and drop. The input marker can also be moved in this way.
Its position shows where new elements will be added. <br>
A "translation" of your query into Corpus Query Language (CQL) is shown below.</p>
<p>Advanced users can make direct use of the Corpus Query Language (CQL) by switching to "expert mode" via the toggle button.</p>
<p>The entire input field can be cleared using the red trash icon on the right.</p>
<br> <br>
<div style="border: 1px solid; padding-left: 20px; margin-right: 400px; margin-bottom: 40px;"> <div style="border: 1px solid; padding-left: 20px; margin-right: 400px; margin-bottom: 40px;">
@@ -101,7 +109,9 @@ under the tab "Examples".</p>
this case. For this you can simply string them together: <br> this case. For this you can simply string them together: <br>
[word="I"] [word="will" & simple_pos="VERB"] [word="go"].</p> [word="I"] [word="will" & simple_pos="VERB"] [word="go"].</p>
<img src="{{ url_for('static', filename='images/manual/query_builder/or_and.gif') }}" alt="OR/AND explanation" width="100%;" style="margin-bottom:20px;"> <img src="{{ url_for('static', filename='images/manual/query_builder/or_and.gif') }}" alt="OR/AND explanation" width="100%;" style="margin-bottom:20px;">
<p></p> <p>Tokens that have already been added can also be modified by clicking on the corresponding
pen icon. Click on the "ignore case" box, for example, and the query builder will
not differentiate between upper- and lower- case letters for that respective token.</p>
<br> <br>
</div> </div>
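
For reference, the chained-token query shown in this hunk can be reproduced with a few lines of JavaScript. This is only a sketch: `tokenToCql` and `sequenceToCql` are hypothetical helpers made up for illustration, not functions from the nopaque code base. The attribute names (`word`, `simple_pos`) are taken from the example query above, and the sketch assumes attribute values contain no double quotes.

```javascript
// Hypothetical helper (not part of nopaque): turns one token description,
// e.g. {word: "will", simple_pos: "VERB"}, into a bracketed CQL token.
// Several attributes of the same token are joined with "&" (AND).
function tokenToCql(attributes) {
  const conditions = Object.entries(attributes)
    .map(([name, value]) => `${name}="${value}"`);
  return `[${conditions.join(' & ')}]`;
}

// Hypothetical helper: strings several tokens together, separated by spaces,
// so that they must occur directly one after another in the text.
function sequenceToCql(tokens) {
  return tokens.map(tokenToCql).join(' ');
}

const query = sequenceToCql([
  {word: 'I'},
  {word: 'will', simple_pos: 'VERB'},
  {word: 'go'}
]);

console.log(query);
// prints [word="I"] [word="will" & simple_pos="VERB"] [word="go"]
```

The resulting string has the same form as the query the tutorial shows, which is roughly what the "translation" line below the query builder's input field displays for the chips you add.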


@@ -30,7 +30,7 @@
{% endif %} {% endif %}
{%- endfor -%} {%- endfor -%}
</ul> </ul>
<a class="btn-floating btn-large halfway-fab modal-trigger pink tooltipped waves-effect waves-light" data-tooltip="Manual" href="#manual-modal"><i class="material-icons">help</i></a> <a class="btn-floating btn-large halfway-fab modal-trigger pink tooltipped waves-effect waves-light" data-tooltip="Manual" href="#manual-modal"><i class="material-icons">school</i></a>
</div> </div>
</nav> </nav>
</div> </div>


@@ -153,16 +153,16 @@
let deleteJobRequestElement = document.querySelector('#delete-job-request'); let deleteJobRequestElement = document.querySelector('#delete-job-request');
let restartJobRequestElement = document.querySelector('#restart-job-request'); let restartJobRequestElement = document.querySelector('#restart-job-request');
deleteJobRequestElement.addEventListener('click', (event) => { deleteJobRequestElement.addEventListener('click', (event) => {
requests.jobs.entity.delete({{ job.hashid|tojson }}); nopaque.requests.jobs.entity.delete({{ job.hashid|tojson }});
}); });
restartJobRequestElement.addEventListener('click', (event) => { restartJobRequestElement.addEventListener('click', (event) => {
requests.jobs.entity.restart({{ job.hashid|tojson }}); nopaque.requests.jobs.entity.restart({{ job.hashid|tojson }});
}); });
if ({{ current_user.is_administrator()|tojson }}) { if ({{ current_user.is_administrator()|tojson }}) {
let jobLogButtonElement = document.querySelector('#job-log-button'); let jobLogButtonElement = document.querySelector('#job-log-button');
jobLogButtonElement.addEventListener('click', (event) => { jobLogButtonElement.addEventListener('click', (event) => {
requests.jobs.entity.log({{ job.hashid|tojson }}) nopaque.requests.jobs.entity.log({{ job.hashid|tojson }})
.then( .then(
(response) => { (response) => {
response.json() response.json()