{% extends "nopaque.html.j2" %} {% set headline = 'Natural Language Processing' %} {% block page_content %}
{% include 'services/roadmap.html.j2' %}
Tokenization

Your text is split into sentences and words, so-called tokens, which can then be analyzed.

Lemmatization

All inflected forms of a word are grouped together so that they can be analyzed as a single item.

Part-of-speech Tagging

Each word is labeled with its part of speech, based on its definition and its context.

Named-Entity Recognition

Named entities are located and classified into categories such as persons or locations.

Submit a job

{{ add_job_form.hidden_tag() }}
{{ add_job_form.title(data_length='32') }} {{ add_job_form.title.label }} {% for error in add_job_form.title.errors %} {{ error }} {% endfor %}
{{ add_job_form.language() }} {{ add_job_form.language.label }} {% for error in add_job_form.language.errors %} {{ error }} {% endfor %}
{{ add_job_form.version() }} {{ add_job_form.version.label }} {% for error in add_job_form.version.errors %} {{ error }} {% endfor %}
{{ add_job_form.files.label.text }} {{ add_job_form.files(accept='text/plain') }}
{% for error in add_job_form.files.errors %} {{ error }} {% endfor %}
{{ add_job_form.description(data_length='255') }} {{ add_job_form.description.label }} {% for error in add_job_form.description.errors %} {{ error }} {% endfor %}
Preprocessing

{{ add_job_form.check_encoding.label.text }}

If your input files were not created with the nopaque OCR service, or if you do not know whether your text files are UTF-8 encoded, enable this switch. We will then try to detect the correct encoding of your texts automatically before processing them.

{{ macros.submit_button(add_job_form.submit) }}
{% endblock %}