Commit Graph

  • ca1803ab8a Mark required arguments in scripts as required (master, v0.1.0) Patrick Jentsch 2022-02-03 10:40:50 +0100
  • 4518ca1c83 Codestyle enhancements Patrick Jentsch 2022-01-27 13:40:23 +0100
  • aeab9b7802 Fix enumeration in readme Patrick Jentsch 2022-01-18 13:46:52 +0100
  • 00c4b17018 Codestyle update Patrick Jentsch 2022-01-18 13:45:17 +0100
  • c057d324cf Cleanup and change some output options Patrick Jentsch 2022-01-17 15:07:46 +0100
  • f51a8c4546 Change output files file format Patrick Jentsch 2022-01-14 10:56:16 +0100
  • c640d9743f Add output_files.json (lists all output files) generation. Patrick Jentsch 2022-01-05 11:25:00 +0100
  • e3fd679b38 Mark all scripts as executable Patrick Jentsch 2022-01-04 13:21:38 +0100
  • 8a3816121c fix image tag Patrick Jentsch 2022-01-04 12:10:26 +0100
  • e1b78b6ba4 Update to Tesseract 5.0.0, Set version 0.1.0 Patrick Jentsch 2022-01-04 11:42:55 +0100
  • a0760487ae Don't process files in subdirectories (1.0.0b) Patrick Jentsch 2021-04-12 13:22:28 +0200
  • a798457c43 Add missing --log-dir argument to wrapper script Patrick Jentsch 2021-04-12 09:53:59 +0200
  • e2da0fb839 Tweak the README and pipeline help. Patrick Jentsch 2021-03-26 10:03:59 +0100
  • e78f667438 Use more descriptive argument names than i and o (now: input and output) Patrick Jentsch 2021-03-18 10:32:55 +0100
  • 41f70da8eb Update the hocrtotei script Patrick Jentsch 2021-03-17 16:58:13 +0100
  • 6db7f70446 Add back German language models Patrick Jentsch 2021-03-17 14:26:24 +0100
  • 947658a7d8 Change intermediate image name in order to fix issues with building multiple branches/tags at the same time Patrick Jentsch 2021-03-15 14:11:23 +0100
  • acbf61be05 Cleanup and make use of globbing for input files for binarization and ocr Patrick Jentsch 2021-03-15 12:45:05 +0100
  • 104598039e Dockerfile codestyle Patrick Jentsch 2021-02-24 15:28:04 +0100
  • da29659a9b Add back missing author mention Patrick Jentsch 2021-02-24 15:17:42 +0100
  • 613bceb4ff Add new models Patrick Jentsch 2021-02-23 11:11:50 +0100
  • ca7df6d0ed First work on version 1.0.0 Patrick Jentsch 2021-02-19 13:04:03 +0100
  • 07635dcdfa Use "buster" instead of "10" in FROM Patrick Jentsch 2020-10-08 23:17:48 +0200
  • c0069d5453 Use new Dockerfile structure Patrick Jentsch 2020-10-08 23:09:10 +0200
  • e941f64ee4 test new ci config Patrick Jentsch 2020-10-07 16:44:38 +0200
  • cb68d6de2d One thread per page ocr patch Stephan Porada 2020-10-07 13:46:22 +0200
  • 4b84488fe6 fix gitlab ci Patrick Jentsch 2020-09-23 16:58:07 +0200
  • 7d52ad9f68 Update Patrick Jentsch 2020-09-23 15:52:24 +0200
  • ac4b5c2fd8 Add possibility to use an intermediate dir Patrick Jentsch 2020-09-22 17:44:32 +0200
  • 6d90d43699 fix cleanup attempt Patrick Jentsch 2020-09-21 15:36:03 +0200
  • 4bd0d3bb01 Use commit_sha for intermediate image Patrick Jentsch 2020-09-21 15:02:04 +0200
  • 15061bfaaf add tag to clean stage Patrick Jentsch 2020-09-21 15:00:09 +0200
  • 7cc8ebd666 compile tesseract in container Patrick Jentsch 2020-09-21 14:46:03 +0200
  • 82285a8e6c better multithreading Patrick Jentsch 2020-07-02 11:49:35 +0200
  • 7322a5bc7c More GhostScript, less dependencies! Patrick Jentsch 2020-07-02 11:47:43 +0200
  • 2b63ba9e59 Remove unused dependencies and use ghostscript for image split Patrick Jentsch 2020-07-01 11:03:34 +0200
  • aee9628e5e fix pipeline Patrick Jentsch 2020-06-23 15:19:27 +0200
  • ec5b4eb521 Add PDF compression Stephan Porada 2020-06-16 09:31:34 +0200
  • b77ca5914f Set relative file paths in hocr Stephan Porada 2020-06-10 11:48:58 +0200
  • 018939ae55 Add PoCo zips part 1 Stephan Porada 2020-06-09 16:58:22 +0200
  • 64fe706126 Keep uncompressed output files after zip jobs. Patrick Jentsch 2020-05-13 09:11:01 +0200
  • a75b32ca1d Bump versions Patrick Jentsch 2020-04-06 09:21:52 +0200
  • 364e3d626d Fix zip creation Patrick Jentsch 2020-04-04 15:37:21 +0200
  • 36a86887b0 Update OCR Pipeline Patrick Jentsch 2020-04-03 17:35:30 +0200
  • eb5ccf4e21 Add ocr to filenames (1.0) stephan 2020-02-18 10:16:24 +0100
  • c1f5252633 Some cosmetics stephan 2020-02-17 14:59:34 +0100
  • 880f0efcf9 Add zip filename argument stephan 2020-02-17 14:26:50 +0100
  • 6c4a642cb7 Add a switch for zip functionality Patrick Jentsch 2020-02-03 15:00:27 +0100
  • dfc05be7db add zip creation of results Patrick Jentsch 2020-01-20 15:04:55 +0100
  • 3a4cc16e5b Update Patrick Jentsch 2019-11-04 15:14:59 +0100
  • 8a4d006687 Update .gitlab-ci.yml Patrick Jentsch 2019-09-16 15:39:02 +0200
  • 3e43c8eab5 Update .gitlab-ci.yml Patrick Jentsch 2019-09-16 15:33:35 +0200
  • f1d1434e1a Update .gitlab-ci.yml Patrick Jentsch 2019-09-16 15:30:11 +0200
  • 62a435e8c2 Update .gitlab-ci.yml Patrick Jentsch 2019-09-16 15:28:33 +0200
  • 088cf49b89 set charset again! Patrick Jentsch 2019-09-12 11:30:52 +0200
  • cebc53da03 Codestyle Patrick Jentsch 2019-09-11 15:15:00 +0200
  • 1fd85d1b44 Change CI script. Patrick Jentsch 2019-07-31 11:23:41 +0200
  • fa4a798351 Use language models from repository. Remove workaround for the legacy German Fraktur model. Patrick Jentsch 2019-07-31 11:13:55 +0200
  • 1a3d7175fe Remove comments Patrick Jentsch 2019-06-11 14:18:46 +0200
  • 6f6d6e809e Update .gitlab-ci.yml Patrick Jentsch 2019-06-04 12:18:31 +0200
  • 148e9e86e9 Use variable instead of hardcoded string Patrick Jentsch 2019-06-03 14:20:22 +0200
  • f280b16b1b Make arguments optional Patrick Jentsch 2019-06-03 14:18:16 +0200
  • b5ba154f86 Update for unprivileged usage 2 Patrick Jentsch 2019-06-03 13:32:42 +0200
  • a4b68bece7 Use more specific versions. Patrick Jentsch 2019-06-02 21:45:11 +0200
  • a433aea3e6 Fix Patrick Jentsch 2019-06-02 21:41:33 +0200
  • 95adc4d804 Update for unprivileged usage. Patrick Jentsch 2019-06-02 21:38:30 +0200
  • f731634ba1 Update Dockerfile Patrick Jentsch 2019-05-27 11:47:38 +0200
  • f73a191314 Add wrapper and remove default arguments from Dockerfile Patrick Jentsch 2019-05-21 12:29:26 +0200
  • 8a9ff27aaa Add usage hint. Patrick Jentsch 2019-05-20 12:06:57 +0200
  • 8ca24c3a14 Merge branch 'master' of gitlab.ub.uni-bielefeld.de:sfb1288inf/ocr Patrick Jentsch 2019-05-20 11:10:49 +0200
  • e1462152fe Codestyle Patrick Jentsch 2019-05-20 11:10:40 +0200
  • 14a70e010c Update README.md Patrick Jentsch 2019-05-17 14:17:53 +0200
  • 93de923b4e Some variable renaming Patrick Jentsch 2019-05-17 12:00:56 +0200
  • ca4f218d2a Update README Patrick Jentsch 2019-05-17 01:07:39 +0200
  • 18b659684a Change README example Patrick Jentsch 2019-05-16 20:46:35 +0200
  • 916fbe158d Update README wording Patrick Jentsch 2019-05-16 20:40:01 +0200
  • 9fb88d84fe Remove useless help messages Patrick Jentsch 2019-05-16 15:00:38 +0200
  • 46bb0efd14 Add description to hocrtotei Patrick Jentsch 2019-05-16 14:59:22 +0200
  • b81ad4cc67 Use argparse in hocrtotei Patrick Jentsch 2019-05-16 14:21:01 +0200
  • c39edec1ab Fix argument description. Patrick Jentsch 2019-05-16 13:26:19 +0200
  • 41e46e95a2 Update README Patrick Jentsch 2019-05-16 13:22:29 +0200
  • 75dd73f383 Fix Typo Patrick Jentsch 2019-05-16 13:19:20 +0200
  • 9536116cc2 Update README Patrick Jentsch 2019-05-16 13:17:15 +0200
  • 4c0ba270db Update Patrick Jentsch 2019-05-16 00:09:19 +0200
  • 03b1054560 Sort all lists before processing Patrick Jentsch 2019-05-15 14:55:36 +0200
  • b9dba80d7f update for better graph Patrick Jentsch 2019-05-15 13:54:08 +0200
  • e5c0d53a03 Add some output messages and code formatting. Patrick Jentsch 2019-05-15 11:56:24 +0200
  • 843151e547 Correct order for output files. Patrick Jentsch 2019-05-13 15:03:43 +0200
  • 937abb8c8d Add Italian language data. Stephan Porada 2019-05-02 12:14:05 +0200
  • 0b20d641df Update Patrick Jentsch 2019-04-25 11:50:32 +0200
  • cc05d09756 test Patrick Jentsch 2019-04-25 11:46:05 +0200
  • efbf6f24e6 Update Patrick Jentsch 2019-04-25 11:40:27 +0200
  • d25204d6a9 Change tif split handling, sort files before merging Patrick Jentsch 2019-04-24 17:01:49 +0200
  • 1fcb8bd318 Merge branch 'master' of gitlab.ub.uni-bielefeld.de:sfb1288inf/ocr Patrick Jentsch 2019-04-16 11:38:42 +0200
  • 10b473ae37 Implement the workaround a bit different Patrick Jentsch 2019-04-16 11:38:36 +0200
  • d31e6315cd Update README.md Stephan Porada 2019-04-16 11:16:35 +0200
  • a533ef76c6 Update Patrick Jentsch 2019-04-15 10:40:08 +0200
  • 84bcac0fc7 Update Patrick Jentsch 2019-04-15 10:34:28 +0200
  • f3fe886335 Update Patrick Jentsch 2019-04-15 10:33:20 +0200
  • 5e43e09beb Update Patrick Jentsch 2019-04-15 10:25:57 +0200