mirror of
				https://gitlab.ub.uni-bielefeld.de/sfb1288inf/nopaque.git
				synced 2025-10-30 18:22:45 +00:00 
			
		
		
		
	updates and restructuring
		| @@ -1,6 +1,8 @@ | ||||
| <h3 class="manual-chapter-title">Getting Started</h3> | ||||
| <h4>Getting Started</h4> | ||||
| <br> | ||||
| <p> | ||||
| In this section, we will take you through all the steps you need to start analyzing your data with nopaque. | ||||
| </p> | ||||
|  | ||||
| <div style="border: 1px solid; padding-left: 20px; margin-right: 400px; margin-bottom: 40px;"> | ||||
|   <h5>Content</h5> | ||||
| @@ -21,6 +23,7 @@ | ||||
| Open the menu (three dots) at the top right of the screen and choose “Register.” Enter  | ||||
| the required details listed on the registration page (username, password, email address).  | ||||
| After verifying your account via the link sent to your email, you can log in.</p> | ||||
|  | ||||
| <h5 id="preparing-files">Preparing files for analysis</h5> | ||||
| <p>A few steps need to be taken before images, scans, or other text data are ready for  | ||||
| analysis in nopaque. The SpaCy NLP Pipeline service can only extract linguistic data  | ||||
| @@ -39,6 +42,7 @@ Add a title and description to your job and select the File Setup version* you w | ||||
| After uploading the images and completing the File Setup job, the list of files added  | ||||
| can be seen under “Inputs.” Further below, under “Results,” you can find and download  | ||||
| the PDF output.</p> | ||||
|  | ||||
| <h5 id="converting-a-pdf-into-text">Converting a PDF into text data</h5> | ||||
| <p>Select an image-to-text conversion tool depending on whether your PDF is primarily  | ||||
| composed of handwritten text or printed text. For printed text, select the <b>Tesseract OCR  | ||||
| @@ -50,11 +54,13 @@ the text output for errors and coherence. (Note: the Transkribus HTR Pipeline is | ||||
| deactivated; we are working on an alternative solution. You can try using Tesseract OCR,  | ||||
| though the results will likely be poor.) | ||||
| </p> | ||||
|  | ||||
| <h5 id="extracting-linguistic-data">Extracting linguistic data from text</h5> | ||||
| <p>The <b>SpaCy NLP Pipeline</b> service extracts linguistic information from plain text files  | ||||
| (in .txt format). Select the corresponding .txt file, the language model, and the  | ||||
| version* you want to use. When the job is finished, find and download the files in  | ||||
| <b>.json</b> and <b>.vrt</b> format under “Results.”</p> | ||||
|  | ||||
| <h5 id="creating-a-corpus">Creating a corpus</h5> | ||||
| <p>Now, using the files in .vrt format, you can create a corpus. This can be done  | ||||
| in the <a href="{{ url_for('main.dashboard') }}">Dashboard</a> or  | ||||
| @@ -72,6 +78,7 @@ be prepared for analysis. This process can be initiated by clicking on the | ||||
| On the corpus overview page, you can see information about the current status of  | ||||
| the corpus in the upper right corner. After the build process, the status "built" should be shown here. | ||||
| Now, your corpus is ready for analysis.</p> | ||||
|  | ||||
| <h5 id="analyzing-a-corpus">Analyzing a corpus</h5> | ||||
| <p>Navigate to the corpus you would like to analyze and click on the Analyze button.  | ||||
| This will take you to an analysis overview page for your corpus. Here, you can find a  | ||||
|   | ||||
| @@ -1,6 +1,9 @@ | ||||
| <h3 class="manual-chapter-title">Services</h3> | ||||
| <h4>Services</h4> | ||||
| <br> | ||||
| <p> | ||||
| In this section, we will describe the different services nopaque has to offer. | ||||
| </p> | ||||
|  | ||||
| <div class="row"> | ||||
|   <div class="col s12 m4"> | ||||
|     <img alt="Services" class="materialboxed responsive-img" src="{{ url_for('static', filename='images/manual/services.png') }}"> | ||||
| @@ -87,15 +90,17 @@ version you want to use. When the job is finished, find and download the files i | ||||
|   </p> | ||||
|   <p> | ||||
|   From the corpus analysis overview page, you can navigate to other analysis modules:  | ||||
|   the Query Builder (under Concordance) and the Reader. With the Reader, you can read  | ||||
|   your corpus texts tokenized with the associated linguistic information. The tokens  | ||||
|   the Query Builder (under Concordance) and the Reader.  | ||||
|   </p> | ||||
|   <p> | ||||
|   With the <b>Reader</b>, you can read your corpus texts tokenized with the associated linguistic information. The tokens  | ||||
|   can be shown as words, lemmas, or parts of speech, and can be displayed in different  | ||||
|   ways: as plain text, optionally with highlighted entities, or as chips. | ||||
|   </p> | ||||
|   <p> | ||||
|   The Concordance module allows for more specific, query-oriented text analyses.  | ||||
|   Here, you can filter out text parameters and structural attributes in different  | ||||
|   combinations. This is explained in more detail in the Query Builder section of the  | ||||
|   combinations. This is explained in more detail in the <b>Query Builder</b> section of the  | ||||
|   manual. | ||||
|   </p> | ||||
| </p> | ||||
|   | ||||
| @@ -1,5 +1,22 @@ | ||||
| <h3 class="manual-chapter-title">CQP Query Language</h3> | ||||
| <p>Within the Corpus Query Language, a distinction is made between two types of annotations: positional attributes and structural attributes. Positional attributes refer to a token, e.g. the word "book" is assigned the part-of-speech tag "NN", the lemma "book" and the simplified part-of-speech tag "NOUN" within the token structure. Structural attributes refer to text structure-giving elements such as sentence and entity markup. For example, the markup of a sentence is represented in the background as follows:</p> | ||||
| <h4 id="cqp-query-language">CQP Query Language</h4> | ||||
| <p>This section explains how the Corpus Query Language works: which types of linguistic  | ||||
| attributes you can work with and how to use them in your queries.</p> | ||||
|  | ||||
| <div style="border: 1px solid; padding-left: 20px; margin-right: 400px; margin-bottom: 40px;"> | ||||
|   <h5>Content</h5> | ||||
|   <ol style="list-style-type:disc"> | ||||
|     <li><a href="#overview-annotations">Overview of annotation types</a></li> | ||||
|     <li><a href="#positional-attributes">Positional attributes</a></li> | ||||
|     <li><a href="#searching-positional-attributes">How to search for positional attributes</a></li> | ||||
|     <li><a href="#structural-attributes">Structural attributes</a></li> | ||||
|     <li><a href="#searching-structural-attributes">How to search for structural attributes</a></li> | ||||
|  | ||||
|   </ol> | ||||
| </div> | ||||
|  | ||||
| <h4 id="overview-annotations">Overview of annotation types</h4> | ||||
| <p>The Corpus Query Language distinguishes between two types of annotations: <b>positional attributes</b> and <b>structural attributes</b>. Positional attributes refer to a token; for example, the word "book" is assigned the part-of-speech tag "NN", the lemma "book", and the simplified part-of-speech tag "NOUN" within the token structure. Structural attributes refer to elements that structure the text, such as sentence and entity markup. For example, the markup of a sentence is represented in the background as follows:</p> | ||||
| <pre> | ||||
|   <code> | ||||
|     <span class="green-text"><s>                                     structural attribute</span> | ||||
| @@ -13,7 +30,7 @@ | ||||
|   </code> | ||||
| </pre> | ||||
|  | ||||
| <h4>Positional attributes</h4> | ||||
| <h4 id="positional-attributes">Positional attributes</h4> | ||||
| <p>Before you can start searching for positional attributes (also called tokens), it is necessary to know what properties they contain.</p> | ||||
| <ol> | ||||
|   <li><span class="blue-text"><b>word</b></span>: The string as it appears in the original text</li> | ||||
| @@ -33,7 +50,7 @@ | ||||
|   </li> | ||||
| </ol> | ||||
|  | ||||
| <h5>Searching for positional attributes</h5> | ||||
| <h5 id="searching-positional-attributes">How to search for positional attributes</h5> | ||||
| <div> | ||||
|   <p> | ||||
|     <b>Token with no condition on any property (also called <span class="blue-text">wildcard token</span>)</b><br> | ||||
| @@ -118,7 +135,7 @@ | ||||
|   <pre style="margin-top: 0;"   ><code>         ^             ^ the braces indicate the start and end of an option group</code></pre> | ||||
| </div> | ||||
|  | ||||
| <h4>Structural attributes</h4> | ||||
| <h4 id="structural-attributes">Structural attributes</h4> | ||||
| <p>nopaque provides several structural attributes for querying. A distinction is made between attributes with and without a value.</p> | ||||
| <ol> | ||||
|   <li><span class="green-text"><b>s</b></span>: Annotates a sentence</li> | ||||
| @@ -153,7 +170,7 @@ | ||||
|   </li> | ||||
| </ol> | ||||
|  | ||||
| <h5>Searching for structural attributes</h5> | ||||
| <h5 id="searching-structural-attributes">How to search for structural attributes</h5> | ||||
| <pre><code><ent> [] </ent>;                       A one token long entity of any type</code></pre> | ||||
| <pre><code><ent_type="PERSON"> [] </ent_type>;     A one token long entity of type PERSON</code></pre> | ||||
| <pre><code><ent_type="PERSON"> []* </ent_type>;    Entity of any length of type PERSON</code></pre> | ||||
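The three query forms above follow one pattern, which can be sketched as a small Python helper. This is only an illustration: the function name `entity_query` and its parameters are hypothetical and are not part of nopaque; the query strings it produces are exactly the ones listed above.

```python
def entity_query(ent_type=None, any_length=False):
    """Build a CQP query string that matches an entity span.

    Hypothetical helper for illustration only; it is not part of nopaque.
    ent_type:   entity type such as "PERSON", or None for any type.
    any_length: if True, match entities of any length ([]*),
                otherwise match entities that are exactly one token long ([]).
    """
    token = "[]*" if any_length else "[]"
    if ent_type is None:
        # Attribute without a value: <ent> ... </ent>
        return f"<ent> {token} </ent>;"
    # Attribute with a value: <ent_type="..."> ... </ent_type>
    return f'<ent_type="{ent_type}"> {token} </ent_type>;'


if __name__ == "__main__":
    print(entity_query())
    print(entity_query("PERSON"))
    print(entity_query("PERSON", any_length=True))
```

The resulting strings can be pasted into the query field in expert mode.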
|   | ||||
| @@ -1,27 +1,12 @@ | ||||
| <h3 class="manual-chapter-title">Query Builder Tutorial</h3> | ||||
| <h4>Overview</h4> | ||||
| <p>The query builder can be accessed via "My Corpora" or "Corpus Analysis" in the sidebar options.  | ||||
| Select the desired corpus and click on the "Analyze" and then "Concordance" | ||||
| buttons to open the query builder.</p> | ||||
| <p>The query builder uses the Corpus Query Language (CQL) to help you make a query for analyzing your texts.  | ||||
| In this way, it is possible to filter out various types of text parameters, for  | ||||
| example, a specific word, a lemma, or you can set part-of-speech  | ||||
| tags (pos) that indicate the type of word you are looking for (a noun, an  | ||||
| adjective, etc.). In addition, you can also search for structural attributes,  | ||||
| or specify your query for a token (word, lemma, pos) via entity typing. And of  | ||||
| course, the different text parameters can be combined.</p> | ||||
| <p>Tokens and structural attributes can be added by clicking on the "+" button | ||||
| (the "input marker") in the input field or the labeled buttons below it. Elements  | ||||
| added are shown as chips. These can be reorganized using drag and drop. The input  | ||||
| marker can also be moved in this way. Its position shows where new elements will be added. <br> | ||||
| A "translation" of your query into Corpus Query Language (CQL) is shown below.</p> | ||||
| <p>Advanced users can make direct use of the Corpus Query Language (CQL) by switching to "expert mode" via the toggle button.</p> | ||||
| <p>The entire input field can be cleared using the red trash icon on the right.</p> | ||||
| <br> | ||||
| <h4>Query Builder</h4> | ||||
| <p>This section gives more detailed instructions on how to use the Query Builder,  | ||||
| nopaque's main tool for finding and analyzing different linguistic elements of your texts.</p> | ||||
|  | ||||
| <div style="border: 1px solid; padding-left: 20px; margin-right: 400px; margin-bottom: 40px;"> | ||||
|   <h5>Content</h5> | ||||
|   <ol style="list-style-type:disc"> | ||||
|     <li><a href="#general-overview">General Overview</a></li> | ||||
|     <li><a href="#add-new-token-tutorial">Add a new token to your query</a></li> | ||||
|     <li><a href="#edit-options-tutorial">Options for editing your query</a></li> | ||||
|     <li><a href="#add-structural-attribute-tutorial">Add structural attributes to your query</a></li> | ||||
| @@ -29,6 +14,33 @@ A "translation" of your query into Corpus Query Language (CQL) is shown below.</ | ||||
|   </ol> | ||||
| </div> | ||||
|  | ||||
| <h4 id="general-overview">General Overview</h4> | ||||
| <p>The Query Builder can be accessed via <a href="{{ url_for('main.dashboard') }}">My Corpora</a> or <a href="{{ url_for('services.corpus_analysis') }}">Corpus Analysis</a> in the sidebar options.  | ||||
| Click on the corpus you wish to analyze to open its corpus overview page. | ||||
| There, click on <b>Analyze</b> to reach the analysis page. | ||||
| The analysis page offers different options for analyzing your corpus, including  | ||||
| visualizations and a <b>Reader</b> module. To open the Query Builder,  | ||||
| click on the <b>Concordance</b> button at the top of the page.</p> | ||||
| <p>The Query Builder uses the <b>Corpus Query Language (CQL)</b> to help you build a query for analyzing your texts. | ||||
| You can filter for various text parameters, for example a specific word or  | ||||
| lemma, or a part-of-speech tag (pos) that indicates the type of word you are  | ||||
| looking for (a noun, an adjective, etc.). You can also search for structural  | ||||
| attributes, or restrict a token query (word, lemma, pos) by entity type.  | ||||
| The different text parameters can be combined.</p> | ||||
| <p>Tokens and structural attributes can be added by clicking on the <b>"+"</b> button | ||||
| (what we call the "input marker") in the input field or the labeled buttons below it. Elements  | ||||
| added are shown as chips. These can be reorganized using drag and drop. The input  | ||||
| marker can also be moved in this way. Its position shows where new elements will be added. <br> | ||||
| A "translation" of your query into Corpus Query Language (CQL) will be displayed underneath the query field.</p> | ||||
| <p>For more information, see our <b>manual section for the Corpus Query Language.</b> | ||||
| <br> | ||||
| Advanced users can make direct use of CQL by switching to <b>expert mode</b> via the toggle button. | ||||
| </p> | ||||
| <p>The entire input field can be cleared using the red trash icon on the right.</p> | ||||
| <br> | ||||
|  | ||||
|  | ||||
| {# Add Token Tutorial #} | ||||
| <div> | ||||
|   <hr> | ||||
| @@ -37,8 +49,8 @@ A "translation" of your query into Corpus Query Language (CQL) is shown below.</ | ||||
|   <h4 id="add-new-token-tutorial">Add a new token to your query</h4> | ||||
|   <p>If you are only looking for a specific token, you can click on the left  | ||||
|   button and select the type of token you are looking for from the drop-down menu.  | ||||
|   By default "Word" is selected. </p> | ||||
|   <br> | ||||
|   "Word" is selected by default. </p> | ||||
|    | ||||
|   <h5>Word and Lemma</h5> | ||||
|   <p>If you want to search for a specific word or lemma and the respective  | ||||
|   category is selected in the drop-down menu, you can type in the word or lemma  | ||||
|   | ||||
| @@ -6,9 +6,10 @@ | ||||
|       <li class="tab"><a href="#manual-modal-getting-started">Getting Started</a></li> | ||||
|       <li class="tab"><a href="#manual-modal-dashboard">Dashboard</a></li> | ||||
|       <li class="tab"><a href="#manual-modal-services">Services</a></li> | ||||
|       <li class="tab"><a href="#manual-modal-a-closer-look-at-the-corpus-analysis">A closer look at the Corpus Analysis</a></li> | ||||
|       <li class="tab"><a href="#manual-modal-cqp-query-language">CQP Query Language</a></li> | ||||
|       <!-- <li class="tab"><a href="#manual-modal-a-closer-look-at-the-corpus-analysis">A closer look at the Corpus Analysis</a></li> --> | ||||
|       <li class="tab"><a href="#manual-modal-query-builder">Query Builder</a></li> | ||||
|       <li class="tab"><a href="#manual-modal-cqp-query-language">CQP Query Language</a></li> | ||||
|  | ||||
|       <li class="tab"><a href="#manual-modal-tagsets">Tagsets</a></li> | ||||
|     </ul> | ||||
|     <div id="manual-modal-introduction"> | ||||
| @@ -27,10 +28,10 @@ | ||||
|       <br> | ||||
|       {% include "_base/_modals/_manual/06_services.html.j2" %} | ||||
|     </div> | ||||
|     <div id="manual-modal-a-closer-look-at-the-corpus-analysis"> | ||||
|     <!-- <div id="manual-modal-a-closer-look-at-the-corpus-analysis"> | ||||
|       <br> | ||||
|       {% include "_base/_modals/_manual/07_a_closer_look_at_the_corpus_analysis.html.j2" %} | ||||
|     </div> | ||||
|     </div> --> | ||||
|     <div id="manual-modal-cqp-query-language"> | ||||
|       <br> | ||||
|       {% include "_base/_modals/_manual/08_cqp_query_language.html.j2" %} | ||||
|   | ||||