diff --git a/app/templates/_base/_modals/_manual/01_introduction.html.j2 b/app/templates/_base/_modals/_manual/01_introduction.html.j2 index 0b1d9ad7..8e8b7b7b 100644 --- a/app/templates/_base/_modals/_manual/01_introduction.html.j2 +++ b/app/templates/_base/_modals/_manual/01_introduction.html.j2 @@ -1,9 +1,27 @@
- nopaque is a web-based digital working environment. It implements a - workflow based on the research process in the humanities and supports its - users in processing their data in order to subsequently apply digital - analysis methods to them. All processes are implemented in a specially - provided cloud environment with established open source software. This - always ensures that no personal data of the users is disclosed. + Nopaque is a web application that offers different services and tools to support + researchers working with image and text-based data. These services are logically + connected and build upon each other. They include:
+- Before you can start using the web platform, you need to create a user - account. This requires only a few details: just a user name, an e-mail - address and a password are needed. In order to register yourself, fill out - the form on the registration page. After successful registration, the - created account must be verified. To do this, follow the instructions - given in the automatically sent e-mail. Afterwards, you can log in as - usual with your username/email address and password in the log-in form - located next to the registration button. -
-Before you can begin using nopaque, you will need to create a personal user account. +Open the menu (three dots) at the top right of the screen and choose “Register.” Enter +the required details listed on the registration page (username, password, email address). +After verifying your account via the link sent to your email, you can log in.
+A few steps need to be taken before images, scans, or other text data are ready for analysis in nopaque. The SpaCy NLP Pipeline service can only extract linguistic data from texts in plain text (.txt) format. If your text is already in this format, you can skip the next steps and go directly to “Extracting linguistic data from text.” Otherwise, the next steps assume that you are starting with image data.
++First, all data needs to be converted into PDF format. Using the File Setup service, +you can bundle images together – even of different formats – and convert them all into +one PDF file. Note that the File Setup service will sort the images based on their file +name in ascending order. It is thus recommended to name them accordingly, for example: +page-01.png, page-02.jpg, page-03.tiff. +
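The ascending file-name ordering described above behaves like a plain string sort, so unpadded page numbers can end up out of order ("page-10" sorts before "page-2"). A minimal Python sketch of why the zero-padded naming scheme matters (it is assumed here that the File Setup service sorts names lexicographically):

```python
# Illustrative only: assumes File Setup orders files by a plain
# lexicographic sort of their names.
unpadded = ["page-1.png", "page-10.jpg", "page-2.tiff"]
padded = ["page-01.png", "page-02.tiff", "page-10.jpg"]

# Without zero-padding, "page-10" sorts before "page-2":
print(sorted(unpadded))  # ['page-1.png', 'page-10.jpg', 'page-2.tiff']

# Zero-padded names sort in true page order:
print(sorted(padded))    # ['page-01.png', 'page-02.tiff', 'page-10.jpg']
```

Zero-padding to the widest expected page number (page-001, page-002, …) keeps the sort order stable for documents of any length.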
++After uploading the images and completing the File Setup job, the list of files added +can be seen under “Inputs.” Further below, under “Results,” you can find and download +the PDF output.
+Select an image-to-text conversion tool depending on whether your PDF is primarily composed of handwritten or printed text. For printed text, select the Tesseract OCR Pipeline; for handwritten text, select the Transkribus HTR Pipeline. Select the desired language model or upload your own, choose the version of the service you want to use, and click Submit to start the conversion. When the job is finished, the various output files can be viewed and downloaded further below, under “Results.” You may want to review the text output for errors and coherence.
+The SpaCy NLP Pipeline service extracts linguistic information from plain text files +(in .txt format). Select the corresponding .txt file, the language model, and the +version you want to use. When the job is finished, find and download the files in +.json and .vrt format under “Results.”
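For orientation, a .vrt (verticalized text) file stores one token per line, with annotation columns separated by tabs, inside XML-like structural tags. The fragment below is an illustrative sketch only; the exact columns and attribute names produced by the SpaCy NLP Pipeline may differ:

```xml
<text id="sample">
<s>
This	DET	this
is	AUX	be
a	DET	a
sample	NOUN	sample
.	PUNCT	.
</s>
</text>
```

Here each token line carries the word form, a part-of-speech tag, and a lemma; it is this token-level annotation that the corpus analysis modules later query.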
+Now, using the files in .vrt format, you can create a corpus. This can be done from the Dashboard or from Corpus Analysis under “My Corpora.” Click on “Create corpus” and add a title and description for your corpus. After submitting, navigate down to the “Corpus files” section and add the desired .vrt files. Then select “Build” on the corpus page under “Actions.” Your corpus is now ready for analysis.
+Navigate to the corpus you would like to analyze and click on the Analyze button. This will take you to an analysis overview page for your corpus. Here, you can find a visualization of general linguistic information about your corpus, including tokens, sentences, unique words, unique lemmas, unique parts of speech, and unique simple parts of speech. You will also find a pie chart showing the proportional textual makeup of your corpus, and you can view the linguistic information for each individual text file. A more detailed visualization of token frequencies, with a search option, is also on this page.
+From the corpus analysis overview page, you can navigate to the other analysis modules: the Query Builder (under Concordance) and the Reader. With the Reader, you can read your corpus texts in tokenized form together with their associated linguistic information. Tokens can be shown as words, lemmas, or parts of speech, and can be displayed in different ways: as plain text (optionally with highlighted entities) or as chips.
+The Concordance module allows for more specific, query-oriented text analyses. Here, you can search for token attributes and structural attributes in different combinations. This is explained in more detail in the Query Builder section of the manual.
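Concordance queries of this kind are commonly written in a CQP-style query language, as used by the IMS Open Corpus Workbench; it is assumed here that nopaque's Query Builder produces queries of this form and that the corpus exposes attributes named word and pos (with spaCy-style tags such as NOUN):

```
[word="digital"] [pos="NOUN"]
```

This illustrative query matches every occurrence of the word "digital" immediately followed by a noun; each bracketed condition describes one token position.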
diff --git a/app/templates/_base/_modals/_manual/03_dashboard.html.j2 b/app/templates/_base/_modals/_manual/03_dashboard.html.j2 index 51d772a3..c1a9f33a 100644 --- a/app/templates/_base/_modals/_manual/03_dashboard.html.j2 +++ b/app/templates/_base/_modals/_manual/03_dashboard.html.j2 @@ -1,15 +1,20 @@- The dashboard provides a central overview of all resources assigned to the - user. These are corpora and created jobs. Corpora are freely composable - annotated text collections and jobs are the initiated file processing - procedures. One can search for jobs as well as corpus listings using - the search field displayed above them. + The dashboard provides a central + overview of all user-specific resources. + These are corpora, + created jobs, and + model contributions. + A corpus is a freely composable annotated text collection. + A job is an initiated file processing procedure. One can search for jobs as + well as corpus listings using the search field displayed above them on the dashboard.