V2.70.0 backport to v3 #22

Open
wants to merge 139 commits into base: new-processor-api

139 commits
dfeddbd
Fix typos (most of them found by codespell)
stweil Sep 12, 2024
5663f48
processor CLI: delegate --resolve-resource, too
bertsky Sep 13, 2024
853bdb5
test_mets_server: fix arg vs kwarg
bertsky Aug 13, 2024
33c7386
mets_server: ClientSideOcrdMets needs OcrdMets-like kwargs (without d…
bertsky Aug 13, 2024
37f7cda
use up-to-date kwargs (avoiding old deprecations)
bertsky Aug 13, 2024
44946ba
hide/test expected deprecation warnings
bertsky Aug 13, 2024
d0962d6
improve output in case of assertion failures
bertsky Aug 13, 2024
061f023
allow "from ocrd_models import OcrdPage
kba Aug 15, 2024
d2f92d1
ocrd_utils: forgot to export scale_coordinates at toplvl
bertsky Aug 16, 2024
c6c5c42
fix imports
bertsky Aug 16, 2024
245778c
Processor.zip_input_files: warning instead of exception for missing i…
bertsky Aug 20, 2024
1f7b57f
Processor.zip_input_files: more verbose log msg
bertsky Aug 21, 2024
35bdb39
tests report.is_valid: improve output on failure
bertsky Aug 21, 2024
e595996
fix --log-filename (6fc606027a): apply in ocrd_cli_wrap_processor
bertsky Aug 24, 2024
f21b8d2
fix exception
bertsky Aug 24, 2024
0cbd3ea
adapt to PIL.Image moved constants
bertsky Aug 24, 2024
8f8912c
cli.workspace: pass fileGrp as well, improve description
bertsky Aug 24, 2024
6dccfb3
OcrdMets.add_agent: does not have positional args
bertsky Aug 24, 2024
2d85f14
update pylintrc
bertsky Aug 24, 2024
ea68370
pylint: try ignoring generateds (again)
bertsky Aug 25, 2024
18ac2c0
ClientSideOcrdMets: use same logger name prefix as server
bertsky Aug 28, 2024
da37967
test_mets_server: use tmpdir to avoid side effects between suites
bertsky Aug 28, 2024
ccb416b
disableLogging: re-instate root logger, to
bertsky Aug 28, 2024
7e3cdf4
test-logging: also remove ocrd.log from tempdir
bertsky Aug 28, 2024
4f45b12
bashlib: re-add --log-filename, implement as stderr redirect
bertsky Aug 28, 2024
7b70c90
ocrd_utils.config: add reset_defaults()
bertsky Aug 29, 2024
48bb3c2
add test for OcrdEnvConfig.reset_defaults()
bertsky Aug 29, 2024
ed92403
Workspace.reload_mets: fix for METS server case
bertsky Sep 1, 2024
9c3c399
OcrdMetsServer.add_file: pass on 'force' kwarg, too
bertsky Sep 2, 2024
c077e95
test_mets_server: add test for force (overwrite)
bertsky Sep 2, 2024
4492168
PcGts.Page.id / make_xml_id: replace '/' with '_'
bertsky Sep 13, 2024
83d52d8
METS Server: also export+delegate physical_pages
bertsky Sep 15, 2024
4eccefc
ocrd.cli.workspace: consistently pass on --mets-server-url and --back…
bertsky Sep 13, 2024
083df27
ocrd.cli.workspace server: add 'reload' and 'save'
bertsky Sep 13, 2024
b2c0161
ocrd.cli.validate tasks: pass on --mets-server-url, too
bertsky Sep 12, 2024
203a06a
run_processor: be robust if ocrd_tool is missing steps
bertsky Sep 12, 2024
4fbdd00
lib.bash: fix errexit
bertsky Sep 12, 2024
c865079
tests: make sure ocrd_utils.config gets reset whenever changing it gl…
bertsky Sep 13, 2024
1a13cd3
ocrd.cli.workspace: assert non-server in cmds mutating METS
bertsky Sep 16, 2024
bba597e
OcrdPage: add PageType.get_ReadingOrderGroups()
bertsky Sep 7, 2024
fa0fada
update OcrdPage from generateds
bertsky Sep 7, 2024
9641d4a
OcrdMets.get_physical_pages: cover return_divs w/o for_fileIds for_pa…
bertsky Sep 27, 2024
19ce7d9
ocrd.cli.workspace: use physical_pages if possible, fix default outpu…
bertsky Sep 27, 2024
372f725
Added space after %U in imagemagick identify format prameter.
mexthecat Sep 27, 2024
44deb80
ocrd_exif: add multi-frame TIFF example
bertsky Sep 27, 2024
606915b
disableLogging: clearer comment
bertsky Sep 30, 2024
3b908a6
:memo: changelog
kba Sep 30, 2024
343a66a
:memo: changelog: remove spurious entries
kba Sep 30, 2024
f808b72
:memo: update changelog again
bertsky Sep 30, 2024
4d25fcf
update assets
kba Sep 30, 2024
bdfb410
test_exif: add example provided by @mexthecat
kba Sep 30, 2024
c744dc8
Merge branch 'mexthecat-master'
kba Sep 30, 2024
e6d1f85
:memo: changelog
kba Sep 30, 2024
ff81c6b
:package: v2.69.0
kba Sep 30, 2024
f44e28b
introduce: OCRD_NETWORK_CLIENT_POLLING_PRINT
MehmedGIT Oct 1, 2024
7177eb1
fix: config value description
MehmedGIT Oct 1, 2024
df8e8ee
add default value param to preserver backwards compatibility
MehmedGIT Oct 1, 2024
b183cfc
make -b/--block as flags
MehmedGIT Oct 1, 2024
342ef3a
implement feedback
MehmedGIT Oct 1, 2024
0e80a7c
fix: missed params
MehmedGIT Oct 1, 2024
d7df200
fix: integration client tests
MehmedGIT Oct 1, 2024
0bfef64
post_ps_workflow_request: pagewise configurable
kba Oct 1, 2024
19aad83
Merge remote-tracking branch 'github/network_client_block_prints' int…
kba Oct 1, 2024
1f5c4bb
Dockerfile.cuda-torch: do NOT rm /build/core
bertsky Oct 1, 2024
611b6b5
deployer: Remove any pre-existing socket file before starting the ser…
kba Oct 1, 2024
a68782d
Merge pull request #1280 from OCR-D/fix-docker-cuda-torch
kba Oct 1, 2024
9a71d04
remove UDS socket files
MehmedGIT Oct 2, 2024
854403d
remove shortcuts for page-wise
MehmedGIT Oct 2, 2024
4d01e66
fix: pass page-wise argument to relevant methods
MehmedGIT Oct 2, 2024
97427e0
Update src/ocrd_network/client_utils.py
MehmedGIT Oct 2, 2024
7454845
add endpoint DELETE /workflow/kill-mets-server-zombies to kill -SIGTE…
kba Oct 2, 2024
0506e9d
move mets-zombie killer to / and return list of killed PIDs
kba Oct 2, 2024
ad81356
/kill_mets_server_zombies use underscores not slashes
kba Oct 2, 2024
7a3be1e
Merge pull request #1278 from OCR-D/page-wise-param
kba Oct 2, 2024
4862d72
use 3.8 compatible typing
kba Oct 2, 2024
2cb3e2a
Merge branch 'network_client_block_prints' into mets-server-kill-zombies
kba Oct 2, 2024
8b6a49c
Merge pull request #1282 from OCR-D/mets-server-rm-socket
kba Oct 2, 2024
0d297e7
Merge branch 'network_client_block_prints' into mets-server-kill-zombies
kba Oct 2, 2024
4f6775f
OcrdMetsServer.kill_process: try the easy way (SIGINT) then the hard …
kba Oct 2, 2024
3882e7a
fix: add default to page_wise param
MehmedGIT Oct 2, 2024
a8bfbe4
Merge branch 'network_client_block_prints' into mets-server-kill-zombies
kba Oct 2, 2024
c5fd843
Merge pull request #1283 from OCR-D/mets-server-kill-zombies
kba Oct 2, 2024
7b6552b
previous state
MehmedGIT Oct 4, 2024
e3f5949
Merge branch 'network_client_block_prints' into fix_mets_server_zombies
MehmedGIT Oct 4, 2024
637a40e
do not use pid killing
MehmedGIT Oct 4, 2024
387dc30
add logger param to stop mets server
MehmedGIT Oct 4, 2024
07953f7
add extensive logging to mets proxy
MehmedGIT Oct 4, 2024
3a9e147
return empty response type earlier
MehmedGIT Oct 4, 2024
00655b8
fix: change UDS file deletion place
MehmedGIT Oct 4, 2024
810f811
return response from mets server before dying
MehmedGIT Oct 4, 2024
4970e62
fix: remove UDS file correctly
MehmedGIT Oct 4, 2024
906766d
comment out irrelevant code
MehmedGIT Oct 4, 2024
a87a2e1
fix: no more zombies, yay!
MehmedGIT Oct 4, 2024
e0ff4eb
add: extensive logging of mets server to file
MehmedGIT Oct 4, 2024
53c8f3f
change cache debug -> info for extensive logging to file
MehmedGIT Oct 4, 2024
fe41223
set log from info to debug
MehmedGIT Oct 4, 2024
55c2f63
fix: typo
MehmedGIT Oct 4, 2024
bf6616f
improve: delete socket file more appropriately
MehmedGIT Oct 4, 2024
bc8a03b
remove: unnecessary code
MehmedGIT Oct 4, 2024
303488a
fix: .__dict__ of {}
MehmedGIT Oct 4, 2024
c8e0c73
Update src/ocrd/mets_server.py
MehmedGIT Oct 8, 2024
2cd4a64
Update src/ocrd/mets_server.py
MehmedGIT Oct 8, 2024
44a8ceb
Update src/ocrd/mets_server.py
MehmedGIT Oct 8, 2024
61c683f
Update src/ocrd_network/runtime_data/deployer.py
MehmedGIT Oct 8, 2024
5055309
remove unnecessary method
MehmedGIT Oct 8, 2024
34bfbf4
fix: make stop() and ..reload..() sync
MehmedGIT Oct 8, 2024
ab660fb
fix: stop mets server when no cached requests
MehmedGIT Oct 8, 2024
148f8d4
clean: remove pid kill flag in stop mets server
MehmedGIT Oct 8, 2024
dacd325
extend log: server cache requests
MehmedGIT Oct 8, 2024
05ded73
improve: sleep no longer needed
MehmedGIT Oct 8, 2024
5d755a8
add new env: OCRD_NETWORK_RABBITMQ_HEARTBEAT
MehmedGIT Oct 9, 2024
a295b0c
deps-torch: also install torchvision
bertsky Oct 9, 2024
c5c60fd
fix: empty -> text
MehmedGIT Oct 9, 2024
e1b9784
deployer: remove METS Server path and url from their resp. caches on …
kba Oct 9, 2024
47c9acf
Merge branch 'fix_mets_server_zombies' into deployer-mets-caching
kba Oct 10, 2024
926cb97
Merge pull request #1287 from OCR-D/deployer-mets-caching
kba Oct 10, 2024
d39c3d7
kill_mets_server_zombies: actually return List[int]
kba Oct 10, 2024
7512bd6
kill_mets_server_zombies: allow dry_run to test
kba Oct 10, 2024
252fb4d
Merge branch 'network_client_block_prints'
kba Oct 10, 2024
e40ed79
:memo: changelog
kba Oct 10, 2024
7f60559
Simplify description for OCRD_NETWORK_RABBITMQ_HEARTBEAT
kba Oct 10, 2024
3e736a7
Merge pull request #1285 from OCR-D/rabbitmq_heartbeat_env
kba Oct 10, 2024
4e6551b
Merge remote-tracking branch 'github/deps-torch-torchvision'
kba Oct 10, 2024
02c6eff
:memo: changelog
kba Oct 10, 2024
9391f49
Merge remote-tracking branch 'github/fix_mets_server_zombies'
kba Oct 10, 2024
88707ca
:memo: changelog
kba Oct 10, 2024
cb8d787
CLI decorator: only import ocrd_network when needed
bertsky Oct 10, 2024
03018f7
Merge pull request #1274 from stweil/typos
kba Oct 10, 2024
94e6d2c
:memo: changelog
kba Oct 10, 2024
d29c029
Merge branch 'master' of https://github.com/OCR-D/core
kba Oct 10, 2024
e5cdbe9
deps-cuda: retry if micromamba is unresponsive
bertsky Oct 10, 2024
539f5d7
Merge remote-tracking branch 'github/cli-decorator-import-network'
kba Oct 10, 2024
80c0c6f
:memo: changelog
kba Oct 10, 2024
7b1d172
create PyPI CD
bertsky Oct 10, 2024
7750f3f
:memo: changelog
kba Oct 10, 2024
012ccf6
:package: v2.70.0
kba Oct 10, 2024
a8e2c64
deps-cuda: retry micro.mamba.pm even more
bertsky Oct 10, 2024
85bde15
PyPI: do not upload deprecated distribution aliases anymore
bertsky Oct 10, 2024
7129ced
Merge branch 'master' into v2.70.0-backport-to-v3
kba Oct 17, 2024
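
Several commits in this list deal with Unix domain socket (UDS) files left behind by a METS Server (for example 611b6b5 and 9a71d04). As a rough illustration of that idea, the following Python sketch removes a pre-existing socket file before binding a new server socket to the same path. The helper names `remove_stale_socket` and `bind_uds` are hypothetical and are not taken from the OCR-D code base.

```python
import socket
from pathlib import Path


def remove_stale_socket(path: Path) -> bool:
    """Delete a leftover socket file; return True if something was removed.

    Hypothetical helper for illustration only (not the OCR-D deployer code).
    """
    if path.is_socket():
        path.unlink()
        return True
    return False


def bind_uds(path: Path) -> socket.socket:
    """Bind a fresh Unix domain socket, clearing a stale file first."""
    remove_stale_socket(path)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    # Without the cleanup above, bind() fails if the file already exists.
    server.bind(str(path))
    return server
```

Only socket files are removed here, so a regular file that happens to sit at the same path is left alone and surfaces as a bind error instead.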
31 changes: 31 additions & 0 deletions .github/workflows/publish-pypi.yml
@@ -0,0 +1,31 @@
# This workflow will upload a Python Package using Twine when a release is created
# For more information see: https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions#publishing-to-package-registries

name: Upload Python Package

on:
  release:
    types: [published]
  workflow_dispatch:

jobs:
  deploy:

    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v2
    - name: Set up Python
      uses: actions/setup-python@v2
      with:
        python-version: '3.8'
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install setuptools wheel build twine
        pip install -r requirements.txt
    - name: Build and publish
      env:
        TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
        TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
      run: make pypi
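
The workflow above hands the PyPI credentials to the build step through the `TWINE_USERNAME` and `TWINE_PASSWORD` environment variables, which Twine reads on upload. A small pre-flight check like the following could fail early when a secret is missing. This is a hypothetical helper sketched for illustration; it is not part of this PR or of the `make pypi` target.

```python
import os
from typing import List, Mapping

# Environment variables that Twine consults for credentials.
REQUIRED = ("TWINE_USERNAME", "TWINE_PASSWORD")


def missing_credentials(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        raise SystemExit(f"missing credentials: {', '.join(missing)}")
```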
54 changes: 54 additions & 0 deletions CHANGELOG.md
@@ -128,6 +128,58 @@ Added:
- `Processor.verify`: handle fileGrp cardinality verification, with default implementation
- `Processor.setup`: to set up processor before processing, optional

## [2.70.0] - 2024-10-10

Added:

- `ocrd network client workflow run`: Add `--print-status` flag to periodically print the job status, #1277
- Processing Server: `DELETE /mets_server_zombies` to kill any renegade METS servers, #1277
- No more zombie METS Servers: they are now shut down properly, #1284
- `OCRD_NETWORK_RABBITMQ_HEARTBEAT` to allow overriding the [heartbeat](https://pika.readthedocs.io/en/stable/examples/heartbeat_and_blocked_timeouts.html) behavior of RabbitMQ, #1285

Changed:

- significantly more detailed logging for the METS Server and Processing Server, #1284
- Only import `ocrd_network` in src/ocrd/decorators/__init__.py once needed, #1289
- Automate release via GitHub Actions, #1290

Fixed:

- `ocrd/core-cuda-torch`: Install torchvision as well, #1286
- Processing Server: remove shut down METS servers from deployer's cache, #1287
- typos, #1274
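
To illustrate the zombie cleanup endpoint listed under "Added", here is a sketch that only builds the HTTP request object and sends nothing. The path follows the changelog entry; commit ad81356 mentions `/kill_mets_server_zombies`, so the actual route should be checked against the running Processing Server. The `dry_run` query parameter is an assumption based on the commit message of 7512bd6, and the base URL is a placeholder.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_zombie_cleanup_request(base_url: str, dry_run: bool = True) -> Request:
    """Build (but do not send) a DELETE request for the cleanup endpoint.

    The route and the 'dry_run' parameter are assumptions, see the text above.
    """
    query = urlencode({"dry_run": str(dry_run).lower()})
    url = f"{base_url.rstrip('/')}/mets_server_zombies?{query}"
    return Request(url, method="DELETE")


# Placeholder address of a Processing Server
request = build_zombie_cleanup_request("http://localhost:8000/")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it. Defaulting to `dry_run=True` keeps the sketch on the safe side.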

## [2.69.0] - 2024-09-30

Fixed:
- tests: ensure `ocrd_utils.config` gets reset whenever changing it globally
- `ocrd.cli.workspace`: consistently pass on `--mets-server-url` and `--backup`
- `ocrd.cli.workspace`: make `list-page` work w/ METS Server
- `ocrd.cli.validate "tasks"`: pass on `--mets-server-url`
- `lib.bash`: fix `errexit` handling
- actually apply CLI `--log-filename`, and show in `--help`
- adapt to Pillow changes
- `ocrd workspace clone`: do pass on `--file-grp` (for download filtering)
- `OcrdMetsServer.add_file`: pass on `force` kwarg
- `Workspace.reload_mets`: handle ClientSideOcrdMets as well
- `OcrdMets.get_physical_pages`: cover `return_divs` w/o `for_fileIds` and `for_pageIds`
- `disableLogging`: also re-instate root logger to Python defaults
- `OcrdExif`: handle multi-frame TIFFs gracefully in `identify` callout, #1276

Changed:
- `run_processor`: be robust if `ocrd_tool` is missing `steps`
- `PcGtsType.PageType.id` via `make_xml_id`: replace `/` with `_`
- `ClientSideOcrdMets`: use same logger name prefix as METS Server
- `Processor.zip_input_files`: when `--page-id` yields empty list, just log instead of raise

Added:
- `OcrdPage`: new `PageType.get_ReadingOrderGroups()` to retrieve recursive RO as dict
- METS Server: export and delegate `physical_pages`
- ocrd.cli.workspace `server`: add subcommands `reload` and `save`
- processor CLI: delegate `--resolve-resource`, too
- `OcrdConfig.reset_defaults` to reset config variables to their defaults
- `ocrd_utils.scale_coordinates` for resizing images
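
The `make_xml_id` change listed under "Changed" (replace `/` with `_`) can be pictured with a small sketch. This is not the `ocrd_utils` implementation: only the `/` replacement comes from the changelog, while the remaining rules (replacing `:`, prefixing `id_`, dropping other characters) are assumptions about what a valid XML ID typically requires. The function name is hypothetical.

```python
import re


def sketch_make_xml_id(idstr: str) -> str:
    """Turn an arbitrary string into something usable as an XML ID (sketch)."""
    # From the changelog: '/' becomes '_' (assumed: ':' is treated the same way)
    ret = idstr.replace(":", "_").replace("/", "_")
    # Assumption: IDs must not start with a digit or punctuation
    ret = re.sub(r"^([^a-zA-Z_])", r"id_\1", ret)
    # Assumption: drop everything that is not a word character, '.' or '-'
    ret = re.sub(r"[^\w.-]", "", ret)
    return ret


print(sketch_make_xml_id("OCR-D-IMG/page 1"))
print(sketch_make_xml_id("0001/a"))
```

Replacing the slash instead of dropping it keeps path-like identifiers such as `GRP/FILE` distinguishable from `GRPFILE`.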

## [2.68.0] - 2024-08-23

Changed:
@@ -2294,6 +2346,8 @@ Initial Release
[3.0.0b1]: ../../compare/v3.0.0b1..v3.0.0a2
[3.0.0a2]: ../../compare/v3.0.0a2..v3.0.0a1
[3.0.0a1]: ../../compare/v3.0.0a1..v2.67.2
[2.70.0]: ../../compare/v2.70.0..v2.69.0
[2.69.0]: ../../compare/v2.69.0..v2.68.0
[2.68.0]: ../../compare/v2.68.0..v2.67.2
[2.67.2]: ../../compare/v2.67.2..v2.67.1
[2.67.1]: ../../compare/v2.67.1..v2.67.0
2 changes: 0 additions & 2 deletions Dockerfile.cuda-torch
@@ -9,7 +9,5 @@ RUN make deps-torch

WORKDIR /data

-RUN rm -fr /build
-
CMD ["/usr/local/bin/ocrd", "--help"]

6 changes: 3 additions & 3 deletions Makefile
@@ -63,7 +63,7 @@ deps-cuda: CONDA_EXE ?= /usr/local/bin/conda
deps-cuda: export CONDA_PREFIX ?= /conda
deps-cuda: PYTHON_PREFIX != $(PYTHON) -c 'import sysconfig; print(sysconfig.get_paths()["purelib"])'
deps-cuda:
-curl -Ls https://micro.mamba.pm/api/micromamba/linux-64/latest | tar -xvj bin/micromamba
+curl --retry 6 -Ls https://micro.mamba.pm/api/micromamba/linux-64/latest | tar -xvj bin/micromamba
mv bin/micromamba $(CONDA_EXE)
# Install Conda system-wide (for interactive / login shells)
echo 'export MAMBA_EXE=$(CONDA_EXE) MAMBA_ROOT_PREFIX=$(CONDA_PREFIX) CONDA_PREFIX=$(CONDA_PREFIX) PATH=$(CONDA_PREFIX)/bin:$$PATH' >> /etc/profile.d/98-conda.sh
@@ -158,7 +158,7 @@ deps-tf2:
fi

deps-torch:
-$(PIP) install -i https://download.pytorch.org/whl/cu118 torch
+$(PIP) install -i https://download.pytorch.org/whl/cu118 torch torchvision

# Dependencies for deployment in an ubuntu/debian linux
deps-ubuntu:
@@ -178,7 +178,7 @@ build:

# (Re)install the tool
install: #build
-# not stricttly necessary but a precaution against outdated python build tools, https://github.com/OCR-D/core/pull/1166
+# not strictly necessary but a precaution against outdated python build tools, https://github.com/OCR-D/core/pull/1166
$(PIP) install -U pip wheel
$(PIP_INSTALL) . $(PIP_INSTALL_CONFIG_OPTION)
@# workaround for shapely#1598
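
The `--retry 6` added to the `deps-cuda` recipe makes the download tolerate a temporarily unresponsive server. The same idea, retrying with a doubling delay, can be sketched in Python as follows. The function name, the choice of `OSError` as the retryable error, and the delay values are assumptions for illustration; curl decides on its own which failures count as transient.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(func: Callable[[], T], retries: int = 6, base_delay: float = 1.0,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func(); on OSError retry up to `retries` times with doubling delay."""
    attempt = 0
    while True:
        try:
            return func()
        except OSError:
            if attempt >= retries:
                raise  # retries exhausted: propagate the last error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
            attempt += 1
```

The `sleep` parameter exists only so that the waiting can be replaced in tests. In normal use the default `time.sleep` applies.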
2 changes: 2 additions & 0 deletions src/ocrd/cli/__init__.py
@@ -83,6 +83,8 @@ def get_help(self, ctx):
\b
{config.describe('OCRD_NETWORK_RABBITMQ_CLIENT_CONNECT_ATTEMPTS')}
\b
{config.describe('OCRD_NETWORK_RABBITMQ_HEARTBEAT')}
\b
{config.describe('OCRD_PROFILE_FILE')}
\b
{config.describe('OCRD_PROFILE', wrap_text=False)}
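
The help text now documents `OCRD_NETWORK_RABBITMQ_HEARTBEAT`. As a sketch of how such an integer setting might be read from the environment, consider the following. This is not the `ocrd_utils.config` mechanism; the helper name, the default of `0`, and the validation are assumptions for illustration.

```python
import os
from typing import Mapping

ENV_NAME = "OCRD_NETWORK_RABBITMQ_HEARTBEAT"


def heartbeat_from_env(env: Mapping[str, str] = os.environ, default: int = 0) -> int:
    """Parse the heartbeat interval in seconds (hypothetical helper).

    The default value is an assumption, not taken from the OCR-D sources.
    """
    raw = env.get(ENV_NAME)
    if raw is None or raw.strip() == "":
        return default
    value = int(raw)  # raises ValueError for non-numeric input
    if value < 0:
        raise ValueError(f"{ENV_NAME} must not be negative, got {value}")
    return value
```

A value obtained this way would typically be handed to the AMQP client when opening the connection; pika's connection parameters accept a `heartbeat` argument for this purpose.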
12 changes: 6 additions & 6 deletions src/ocrd/decorators/__init__.py
@@ -13,7 +13,6 @@
     redirect_stderr_and_stdout_to_file,
 )
 from ocrd_validators import WorkspaceValidator
-from ocrd_network import ProcessingWorker, ProcessorServer, AgentType

 from ..resolver import Resolver
 from ..processor.base import ResourceNotFoundError, run_processor
@@ -23,8 +22,6 @@
 from .ocrd_cli_options import ocrd_cli_options
 from .mets_find_options import mets_find_options

-SUBCOMMANDS = [AgentType.PROCESSING_WORKER, AgentType.PROCESSOR_SERVER]
-

 def ocrd_cli_wrap_processor(
     processorClass,
@@ -88,11 +85,9 @@ def ocrd_cli_wrap_processor(
     if list_resources:
         processor.list_resources()
         sys.exit()
-    if subcommand:
+    if subcommand or address or queue or database:
         # Used for checking/starting network agents for the WebAPI architecture
         check_and_run_network_agent(processorClass, subcommand, address, database, queue)
-    elif address or queue or database:
-        raise ValueError(f"Subcommand options --address --queue and --database are only valid for subcommands: {SUBCOMMANDS}")

     # from here: single-run processing context
     initLogging()
@@ -162,6 +157,11 @@ def goexit():
 def check_and_run_network_agent(ProcessorClass, subcommand: str, address: str, database: str, queue: str):
     """
     """
+    from ocrd_network import ProcessingWorker, ProcessorServer, AgentType
+    SUBCOMMANDS = [AgentType.PROCESSING_WORKER, AgentType.PROCESSOR_SERVER]
+
+    if not subcommand:
+        raise ValueError(f"Subcommand options --address --queue and --database are only valid for subcommands: {SUBCOMMANDS}")
     if subcommand not in SUBCOMMANDS:
         raise ValueError(f"SUBCOMMAND can only be one of {SUBCOMMANDS}")
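
The change to this file moves the `ocrd_network` import from module level into `check_and_run_network_agent`, so that plain single-run processing no longer pays for loading the network package. The pattern can be sketched in isolation as below. All names here are hypothetical, and the standard-library module `colorsys` merely stands in for a heavy dependency such as `ocrd_network`.

```python
AGENT_TYPES = ("worker", "server")  # hypothetical stand-ins for the AgentType values


def run_single(page: str) -> str:
    """Fast path: never touches the heavy dependency."""
    return f"processed {page}"


def run_network_agent(subcommand: str) -> str:
    """Slow path: the heavy import happens only when this function is called."""
    import colorsys  # stand-in for a heavy package, imported lazily

    if not subcommand:
        raise ValueError(f"network options are only valid for subcommands: {AGENT_TYPES}")
    if subcommand not in AGENT_TYPES:
        raise ValueError(f"SUBCOMMAND can only be one of {AGENT_TYPES}")
    return f"started {subcommand} (loaded {colorsys.__name__})"
```

Python caches imported modules in `sys.modules`, so repeated calls to the slow path pay the import cost only once.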
