Patrick Jentsch
|
7322a5bc7c
|
More GhostScript, fewer dependencies!
|
2020-07-02 11:47:43 +02:00 |
|
Patrick Jentsch
|
2b63ba9e59
|
Remove unused dependencies and use ghostscript for image split
|
2020-07-01 11:03:34 +02:00 |
|
Patrick Jentsch
|
aee9628e5e
|
fix pipeline
|
2020-06-23 15:19:27 +02:00 |
|
Stephan Porada
|
ec5b4eb521
|
Add PDF compression
|
2020-06-16 09:31:34 +02:00 |
|
Stephan Porada
|
b77ca5914f
|
Set relative file paths in hocr
|
2020-06-10 11:48:58 +02:00 |
|
Stephan Porada
|
018939ae55
|
Add PoCo zips part 1
|
2020-06-09 16:58:22 +02:00 |
|
Patrick Jentsch
|
64fe706126
|
Keep uncompressed output files after zip jobs.
|
2020-05-13 09:11:01 +02:00 |
|
Patrick Jentsch
|
364e3d626d
|
Fix zip creation
|
2020-04-04 15:37:21 +02:00 |
|
Patrick Jentsch
|
36a86887b0
|
Update OCR Pipeline
|
2020-04-03 17:35:30 +02:00 |
|
stephan
|
eb5ccf4e21
|
Add ocr to filenames
|
2020-02-18 10:16:24 +01:00 |
|
stephan
|
c1f5252633
|
Some cosmetics
|
2020-02-17 14:59:34 +01:00 |
|
stephan
|
880f0efcf9
|
Add zip filename argument
|
2020-02-17 14:26:50 +01:00 |
|
Patrick Jentsch
|
6c4a642cb7
|
Add a switch for zip functionality
|
2020-02-03 15:00:27 +01:00 |
|
Patrick Jentsch
|
dfc05be7db
|
add zip creation of results
|
2020-01-20 15:04:55 +01:00 |
|
Patrick Jentsch
|
fa4a798351
|
Use language models from repository. Remove workaround for the legacy German Fraktur model.
|
2019-07-31 11:13:55 +02:00 |
|
Patrick Jentsch
|
1a3d7175fe
|
Remove comments
|
2019-06-11 14:18:46 +02:00 |
|
Patrick Jentsch
|
e1462152fe
|
Codestyle
|
2019-05-20 11:10:40 +02:00 |
|
Patrick Jentsch
|
93de923b4e
|
Some variable renaming
|
2019-05-17 12:00:56 +02:00 |
|
Patrick Jentsch
|
46bb0efd14
|
Add description to hocrtotei
|
2019-05-16 14:59:22 +02:00 |
|
Patrick Jentsch
|
b81ad4cc67
|
Use argparse in hocrtotei
|
2019-05-16 14:21:01 +02:00 |
|
Patrick Jentsch
|
4c0ba270db
|
Update
|
2019-05-16 00:09:19 +02:00 |
|
Patrick Jentsch
|
03b1054560
|
Sort all lists before processing
|
2019-05-15 14:55:36 +02:00 |
|
Patrick Jentsch
|
b9dba80d7f
|
update for better graph
|
2019-05-15 13:54:08 +02:00 |
|
Patrick Jentsch
|
e5c0d53a03
|
Add some output messages and code formatting.
|
2019-05-15 11:56:24 +02:00 |
|
Patrick Jentsch
|
843151e547
|
Correct order for output files.
|
2019-05-13 15:03:43 +02:00 |
|
Patrick Jentsch
|
efbf6f24e6
|
Update
|
2019-04-25 11:40:27 +02:00 |
|
Patrick Jentsch
|
d25204d6a9
|
Change tif split handling, sort files before merging
|
2019-04-24 17:01:49 +02:00 |
|
Patrick Jentsch
|
10b473ae37
|
Implement the workaround a bit different
|
2019-04-16 11:38:36 +02:00 |
|
Patrick Jentsch
|
a533ef76c6
|
Update
|
2019-04-15 10:40:08 +02:00 |
|
Patrick Jentsch
|
84bcac0fc7
|
Update
|
2019-04-15 10:34:28 +02:00 |
|
Patrick Jentsch
|
f3fe886335
|
Update
|
2019-04-15 10:33:20 +02:00 |
|
Patrick Jentsch
|
5e43e09beb
|
Update
|
2019-04-15 10:25:57 +02:00 |
|
Patrick Jentsch
|
5e11fcae01
|
Rename output directories.
|
2019-04-15 10:13:08 +02:00 |
|
Patrick Jentsch
|
8e6868194d
|
Fix bug
|
2019-04-15 10:02:02 +02:00 |
|
Patrick Jentsch
|
fdc53fd16c
|
Use a single core for deu_frak
|
2019-04-15 09:56:47 +02:00 |
|
Patrick Jentsch
|
d84db585fa
|
Sort files in output.
|
2019-04-15 09:47:30 +02:00 |
|
Patrick Jentsch
|
eb6327aed3
|
Rename job.
|
2019-04-15 09:28:05 +02:00 |
|
Patrick Jentsch
|
0d3efe167e
|
Update
|
2019-04-14 14:33:40 +02:00 |
|
Patrick Jentsch
|
9f3c71a118
|
Fixed bug
|
2019-04-12 15:36:47 +02:00 |
|
Patrick Jentsch
|
ac9b25271f
|
Add skip binarization
|
2019-04-12 15:28:24 +02:00 |
|
Patrick Jentsch
|
0a25afbd51
|
len not length... Thx python
|
2019-04-11 13:55:35 +02:00 |
|
Patrick Jentsch
|
1e740aec66
|
Update ocr
|
2019-04-11 13:51:07 +02:00 |
|
Patrick Jentsch
|
d4218fcd7c
|
Update ocr
|
2019-04-11 13:46:24 +02:00 |
|
Patrick Jentsch
|
fd7ad08e1e
|
Update ocr
|
2019-04-11 13:04:03 +02:00 |
|
Patrick Jentsch
|
3131174676
|
Fix input file
|
2019-04-11 11:55:42 +02:00 |
|
Patrick Jentsch
|
a947e36997
|
Start one ocropus-nlbin job per page instead of one per document
|
2019-04-11 11:50:09 +02:00 |
|
Patrick Jentsch
|
26757eda03
|
Some renaming and cleanup.
|
2019-03-10 20:59:30 +01:00 |
|