diff --git a/app/templates/_base/_modals/_manual/02_getting_started.html.j2 b/app/templates/_base/_modals/_manual/02_getting_started.html.j2
index ecc60d6a..7d7fb0df 100644
--- a/app/templates/_base/_modals/_manual/02_getting_started.html.j2
+++ b/app/templates/_base/_modals/_manual/02_getting_started.html.j2
@@ -1,6 +1,8 @@

Getting Started

Getting Started

-
+

+In this section, we will take you through all the steps you need to start analyzing your data with nopaque. +

Content
@@ -21,6 +23,7 @@ Open the menu (three dots) at the top right of the screen and choose “Register.” Enter the required details listed on the registration page (username, password, email address). After verifying your account via the link sent to your email, you can log in.

+
Preparing files for analysis

A few steps need to be taken before images, scans, or other text data are ready for analysis in nopaque. The SpaCy NLP Pipeline service can only extract linguistic data
@@ -39,6 +42,7 @@ Add a title and description to your job and select the File Setup version* you w
After uploading the images and completing the File Setup job, the list of files added can be seen under “Inputs.” Further below, under “Results,” you can find and download the PDF output.

+
Converting a PDF into text data

Select an image-to-text conversion tool depending on whether your PDF is primarily composed of handwritten text or printed text. For printed text, select the Tesseract OCR
@@ -50,11 +54,13 @@ the text output for errors and coherence.
(Note: the Transkribus HTR Pipeline is deactivated; we are working on an alternative solution. You can try using Tesseract OCR, though the results will likely be poor.)

+
Extracting linguistic data from text

The SpaCy NLP Pipeline service extracts linguistic information from plain text files (in .txt format). Select the corresponding .txt file, the language model, and the version* you want to use. When the job is finished, find and download the files in .json and .vrt format under “Results.”

+
Creating a corpus

Now, using the files in .vrt format, you can create a corpus. This can be done in the Dashboard or
@@ -72,6 +78,7 @@ be prepared for analysis. This process can be initiated by clicking on the
On the corpus overview page, you can see information about the current status of the corpus in the upper right corner. After the build process, the status "built" should be shown here. Now, your corpus is ready for analysis.

+
Analyzing a corpus

Navigate to the corpus you would like to analyze and click on the Analyze button. This will take you to an analysis overview page for your corpus. Here, you can find a
diff --git a/app/templates/_base/_modals/_manual/06_services.html.j2 b/app/templates/_base/_modals/_manual/06_services.html.j2
index 19e898c4..789ce134 100644
--- a/app/templates/_base/_modals/_manual/06_services.html.j2
+++ b/app/templates/_base/_modals/_manual/06_services.html.j2
@@ -1,6 +1,9 @@

Services

Services

-
+

+In this section, we will describe the different services nopaque has to offer. +

+
Services
@@ -87,15 +90,17 @@ version you want to use. When the job is finished, find and download the files i

From the corpus analysis overview page, you can navigate to other analysis modules:
- the Query Builder (under Concordance) and the Reader. With the Reader, you can read
- your corpus texts tokenized with the associated linguistic information. The tokens
+ the Query Builder (under Concordance) and the Reader.
+

+

+ With the Reader, you can read your corpus texts tokenized with the associated linguistic information. The tokens
can be shown as lemmas, parts of speech, or words, and can be displayed in different ways: visually as plain text with the option of highlighted entities, or as chips.

The Concordance module allows for more specific, query-oriented text analyses. Here, you can filter out text parameters and structural attributes in different
- combinations. This is explained in more detail in the Query Builder section of the
+ combinations. This is explained in more detail in the Query Builder section of the
manual.

diff --git a/app/templates/_base/_modals/_manual/08_cqp_query_language.html.j2 b/app/templates/_base/_modals/_manual/08_cqp_query_language.html.j2
index 9069d495..70ffc9a8 100644
--- a/app/templates/_base/_modals/_manual/08_cqp_query_language.html.j2
+++ b/app/templates/_base/_modals/_manual/08_cqp_query_language.html.j2
@@ -1,5 +1,22 @@

CQP Query Language

-

Within the Corpus Query Language, a distinction is made between two types of annotations: positional attributes and structural attributes. Positional attributes refer to a token, e.g. the word "book" is assigned the part-of-speech tag "NN", the lemma "book" and the simplified part-of-speech tag "NOUN" within the token structure. Structural attributes refer to text structure-giving elements such as sentence and entity markup. For example, the markup of a sentence is represented in the background as follows:

+

CQP Query Language

+

In this section, we explain how the Corpus Query Language works. This includes
+the types of linguistic attributes you can work with and how to use them in your queries.

+ +
+
Content
+
+
+  1. Overview of annotation types
+  2. Positional attributes
+  3. How to search for positional attributes
+  4. Structural attributes
+  5. How to search for structural attributes
+
+
+
+ +

Overview of annotation types

+

Within the Corpus Query Language, a distinction is made between two types of annotations: positional attributes and structural attributes. Positional attributes refer to a token; for example, within the token structure the word "book" is assigned the part-of-speech tag "NN", the lemma "book", and the simplified part-of-speech tag "NOUN". Structural attributes refer to elements that give the text its structure, such as sentence and entity markup. For example, the markup of a sentence is represented in the background as follows:

   
     <s>                                     structural attribute
@@ -13,7 +30,7 @@
   
 
-

Positional attributes

+

Positional attributes

Before you can start searching for positional attributes (also called tokens), it is necessary to know what properties they contain.

  1. word: The string as it is also found in the original text
  2. @@ -33,7 +50,7 @@
-
Searching for positional attributes
+
How to search for positional attributes

Token with no condition on any property (also called wildcard token)
@@ -118,7 +135,7 @@

         ^             ^ the braces indicate the start and end of an option group
-

Structural attributes

+

Structural attributes

nopaque provides several structural attributes for querying. A distinction is made between attributes with and without a value.

  1. s: Annotates a sentence
  2. @@ -153,7 +170,7 @@
-
Searching for structural attributes
+
How to search for structural attributes
<ent> [] </ent>;                       A one token long entity of any type
<ent_type="PERSON"> [] </ent_type>;     A one token long entity of type PERSON
<ent_type="PERSON"> []* </ent_type>;    Entity of any length of type PERSON
diff --git a/app/templates/_base/_modals/_manual/09_query_builder.html.j2 b/app/templates/_base/_modals/_manual/09_query_builder.html.j2
index 31f46f52..221aa891 100644
--- a/app/templates/_base/_modals/_manual/09_query_builder.html.j2
+++ b/app/templates/_base/_modals/_manual/09_query_builder.html.j2
@@ -1,27 +1,12 @@

Query Builder Tutorial

-

Overview

-

The query builder can be accessed via "My Corpora" or "Corpus Analysis" in the sidebar options.
-Select the desired corpus and click on the "Analyze" and then "Concordance"
-buttons to open the query builder.

-

The query builder uses the Corpus Query Language (CQL) to help you make a query for analyzing your texts.
-In this way, it is possible to filter out various types of text parameters, for
-example, a specific word, a lemma, or you can set part-of-speech
-tags (pos) that indicate the type of word you are looking for (a noun, an
-adjective, etc.). In addition, you can also search for structural attributes,
-or specify your query for a token (word, lemma, pos) via entity typing. And of
-course, the different text parameters can be combined.

-

Tokens and structural attributes can be added by clicking on the "+" button
-(the "input marker") in the input field or the labeled buttons below it. Elements
-added are shown as chips. These can be reorganized using drag and drop. The input
-marker can also be moved in this way. Its position shows where new elements will be added.
-A "translation" of your query into Corpus Query Language (CQL) is shown below.

-

Advanced users can make direct use of the Corpus Query Language (CQL) by switching to "expert mode" via the toggle button.

-

The entire input field can be cleared using the red trash icon on the right.

-
+

Query Builder

+

In this section, we will provide you with more detailed instructions on how to use the Query Builder,
+nopaque's main user-friendly tool for finding and analyzing different linguistic elements of your texts.

Content
    +
  1. General Overview
  2. Add a new token to your query
  3. Options for editing your query
  4. Add structural attributes to your query
  5. @@ -29,6 +14,33 @@ A "translation" of your query into Corpus Query Language (CQL) is shown below.
+

General Overview

+

The Query Builder can be accessed via My Corpora or Corpus Analysis in the sidebar options.
+Click on the corpus you wish to analyze. This takes you to its corpus overview page.
+Here, click on Analyze to reach the analysis page.
+The analysis page features different options for analyzing your corpus, including
+visualizations and a Reader module. In this case, we want to open the Query Builder.
+To do so, click on the Concordance button at the top of the page.

+

The Query Builder uses the Corpus Query Language (CQL) to help you make a query for analyzing your texts.
+In this way, it is possible to filter out various types of text parameters, for
+example, a specific word or lemma, or part-of-speech
+tags (pos) that indicate the type of word you are looking for (a noun, an
+adjective, etc.). In addition, you can search for structural attributes,
+or narrow your query for a token (word, lemma, pos) via entity typing. And of
+course, the different text parameters can be combined.

+

Tokens and structural attributes can be added by clicking on the "+" button
+(what we call the "input marker") in the input field or the labeled buttons below it. Elements
+added are shown as chips. These can be reorganized using drag and drop. The input
+marker can also be moved in this way. Its position shows where new elements will be added.
+A "translation" of your query into Corpus Query Language (CQL) will be displayed underneath the query field.

+

For more information, see our manual section on the Corpus Query Language.
+
+Advanced users can make direct use of CQL by switching to expert mode via the toggle button.
+

+

The entire input field can be cleared using the red trash icon on the right.

+
+ + {# Add Token Tutorial #}

@@ -37,8 +49,8 @@ A "translation" of your query into Corpus Query Language (CQL) is shown below.
Add a new token to your query

If you are only looking for a specific token, you can click on the left button and select the type of token you are looking for from the drop-down menu.
- By default "Word" is selected.

-
+ "Word" is selected by default.

+
Word and Lemma

If you want to search for a specific word or lemma and the respective category is selected in the drop-down menu, you can type in the word or lemma
diff --git a/app/templates/_base/_modals/manual.html.j2 b/app/templates/_base/_modals/manual.html.j2
index 64660561..f175d224 100644
--- a/app/templates/_base/_modals/manual.html.j2
+++ b/app/templates/_base/_modals/manual.html.j2
@@ -6,9 +6,10 @@

  • Getting Started
  • Dashboard
  • Services
-  • A closer look at the Corpus Analysis
-  • CQP Query Language
+  • Query Builder
+  • CQP Query Language
+  • Tagsets
@@ -27,10 +28,10 @@
    {% include "_base/_modals/_manual/06_services.html.j2" %}
    -
    +

    {% include "_base/_modals/_manual/08_cqp_query_language.html.j2" %}