Commit Graph

41 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Patrick Jentsch | e1b78b6ba4 | Update to Tesseract 5.0.0, Set version 0.1.0 | 2022-01-04 11:42:55 +01:00 |
| Patrick Jentsch | 6db7f70446 | Add back german language models | 2021-03-17 14:26:24 +01:00 |
| Patrick Jentsch | acbf61be05 | Cleanup and make use of globbing for input files for binarization and ocr | 2021-03-15 12:45:05 +01:00 |
| Patrick Jentsch | 104598039e | Dockerfile codestyle | 2021-02-24 15:28:04 +01:00 |
| Patrick Jentsch | da29659a9b | Add back missing author mention | 2021-02-24 15:17:42 +01:00 |
| Patrick Jentsch | 613bceb4ff | Add new models | 2021-02-23 11:11:50 +01:00 |
| Patrick Jentsch | ca7df6d0ed | First work on version 1.0.0 | 2021-02-19 13:04:03 +01:00 |
| Patrick Jentsch | 07635dcdfa | Use "buster" instead of "10" in FROM | 2020-10-08 23:17:48 +02:00 |
| Patrick Jentsch | c0069d5453 | Use new Dockerfile structure | 2020-10-08 23:09:10 +02:00 |
| Patrick Jentsch | ac4b5c2fd8 | Add possibility to use an intermediate dir | 2020-09-22 17:44:32 +02:00 |
| Patrick Jentsch | 7cc8ebd666 | compile tesseract in container | 2020-09-21 14:46:03 +02:00 |
| Patrick Jentsch | 7322a5bc7c | More GhostScript, less dependencies! | 2020-07-02 11:47:43 +02:00 |
| Patrick Jentsch | 2b63ba9e59 | Remove unused dependencies and use ghostscript for image split | 2020-07-01 11:03:34 +02:00 |
| Stephan Porada | ec5b4eb521 | Add PDF compression | 2020-06-16 09:31:34 +02:00 |
| Patrick Jentsch | a75b32ca1d | Bump versions | 2020-04-06 09:21:52 +02:00 |
| Patrick Jentsch | dfc05be7db | add zip creation of results | 2020-01-20 15:04:55 +01:00 |
| Patrick Jentsch | 088cf49b89 | set charset again! | 2019-09-12 11:30:52 +02:00 |
| Patrick Jentsch | cebc53da03 | Codestyle | 2019-09-11 15:15:00 +02:00 |
| Patrick Jentsch | fa4a798351 | Use language models from repository. Remove workaround for the legacy German Fraktur model. | 2019-07-31 11:13:55 +02:00 |
| Patrick Jentsch | a4b68bece7 | Use more specific versions. | 2019-06-02 21:45:11 +02:00 |
| Patrick Jentsch | a433aea3e6 | Fix | 2019-06-02 21:41:33 +02:00 |
| Patrick Jentsch | 95adc4d804 | Update for unprivileged usage. | 2019-06-02 21:38:30 +02:00 |
| Patrick Jentsch | f731634ba1 | Update Dockerfile | 2019-05-27 11:47:38 +02:00 |
| Patrick Jentsch | f73a191314 | Add wrapper and remove default arguments from Dockerfile | 2019-05-21 12:29:26 +02:00 |
| Patrick Jentsch | 843151e547 | Correct order for output files. | 2019-05-13 15:03:43 +02:00 |
| Stephan Porada | 937abb8c8d | Add Italian language data. | 2019-05-02 12:14:05 +02:00 |
| Patrick Jentsch | 0b20d641df | Update | 2019-04-25 11:50:32 +02:00 |
| Patrick Jentsch | cc05d09756 | test | 2019-04-25 11:46:05 +02:00 |
| Patrick Jentsch | efbf6f24e6 | Update | 2019-04-25 11:40:27 +02:00 |
| Patrick Jentsch | d25204d6a9 | Change tif split handling, sort files before merging | 2019-04-24 17:01:49 +02:00 |
| Patrick Jentsch | babb773693 | Update Dockerfile | 2019-03-11 23:54:33 +01:00 |
| Patrick Jentsch | 44c759f7c2 | Fix spelling. | 2019-03-10 21:04:14 +01:00 |
| Patrick Jentsch | 26757eda03 | Some renaming and cleanup. | 2019-03-10 20:59:30 +01:00 |
| Patrick Jentsch | dc8db6c0b1 | Added por language pack. | 2019-01-11 16:16:31 +01:00 |
| Patrick Jentsch | 4995025e45 | Create input and output directories in Dockerfile. | 2019-01-08 12:38:07 +01:00 |
| Stephan Porada | 2cfcc2b2c2 | Added spanish model. | 2018-12-13 14:01:24 +01:00 |
| Patrick Jentsch | 2e5c8b0327 | Update | 2018-10-29 10:50:56 +01:00 |
| Patrick Jentsch | c49da3b50a | Update | 2018-10-29 10:49:19 +01:00 |
| Patrick Jentsch | 132490a929 | Update | 2018-10-29 10:38:50 +01:00 |
| Patrick Jentsch | ce864e205a | Added missing dependencies for ocropus. | 2018-10-10 15:20:34 +02:00 |
| Patrick Jentsch | aa48ea6ed2 | Initial commit | 2018-10-09 14:43:23 +02:00 |