magic-pdf, version 0.8.1: PDF parsing error #758

Open
wertyac opened this issue Oct 18, 2024 · 4 comments
Labels
bug Something isn't working

Comments


wertyac commented Oct 18, 2024

Description of the bug | 错误描述

Python 3.10.
Installed following the documentation.
CUDA version: 11.8.
Ubuntu 22.04.
Parsing fails with the following error:
2024-10-18 09:32:53.186 | ERROR | magic_pdf.tools.cli:parse_doc:96 - Coordinate 'right' is less than 'left'

How to reproduce the bug | 如何复现

(mineru) health@222server:~$ magic-pdf -p small_ocr.pdf -o /home/health/pdf/
2024-10-18 09:29:54.551 | INFO | magic_pdf.libs.pdf_check:detect_invalid_chars:57 - cid_count: 0, text_len: 8, cid_chars_radio: 0.0
2024-10-18 09:29:54.553 | WARNING | magic_pdf.filter.pdf_classify_by_type:classify:334 - pdf is not classified by area and text_len, by_image_area: False, by_text: False, by_avg_words: False, by_img_num: True, by_text_layout: False, by_img_narrow_strips: False, by_invalid_chars: True
2024-10-18 09:30:02.736 | INFO | magic_pdf.model.pdf_extract_kit:__init__:180 - DocAnalysis init, this may take some times. apply_layout: True, apply_formula: True, apply_ocr: True, apply_table: False
2024-10-18 09:30:02.736 | INFO | magic_pdf.model.pdf_extract_kit:__init__:188 - using device: cuda
2024-10-18 09:30:02.736 | INFO | magic_pdf.model.pdf_extract_kit:__init__:190 - using models_dir: /home/health/.cache/modelscope/hub/opendatalab/PDF-Extract-Kit/models
CustomVisionEncoderDecoderModel init
CustomMBartForCausalLM init
CustomMBartDecoder init
[10/18 09:30:19 detectron2]: Rank of current process: 0. World size: 1
[10/18 09:30:19 detectron2]: Environment info:


sys.platform linux
Python 3.10.15 (main, Oct 3 2024, 07:27:34) [GCC 11.2.0]
numpy 1.26.4
detectron2 0.6 @/home/health/anaconda3/envs/mineru/lib/python3.10/site-packages/detectron2
Compiler GCC 11.4
CUDA compiler not available
DETECTRON2_ENV_MODULE
PyTorch 2.3.1+cu121 @/home/health/anaconda3/envs/mineru/lib/python3.10/site-packages/torch
PyTorch debug build False
torch._C._GLIBCXX_USE_CXX11_ABI False
GPU available Yes
GPU 0 NVIDIA GeForce RTX 4070 (arch=8.9)
Driver version 535.183.01
CUDA_HOME /usr/local/cuda-11.8
Pillow 11.0.0
torchvision 0.18.1+cu121 @/home/health/anaconda3/envs/mineru/lib/python3.10/site-packages/torchvision
torchvision arch flags 5.0, 6.0, 7.0, 7.5, 8.0, 8.6, 9.0
fvcore 0.1.5.post20221221
iopath 0.1.9
cv2 4.6.0


PyTorch built with:

  • GCC 9.3
  • C++ Version: 201703
  • Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v3.3.6 (Git Hash 86e6af5974177e513fd3fee58425e1063e7f1361)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • LAPACK is enabled (usually provided by MKL)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 12.1
  • NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
  • CuDNN 8.6 (built against CUDA 11.8)
    • Built with CuDNN 8.9.2
  • Magma 2.6.1
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.9.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.3.1, USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=1, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF,

[10/18 09:30:19 detectron2]: Command line arguments: {'config_file': '/home/health/anaconda3/envs/mineru/lib/python3.10/site-packages/magic_pdf/resources/model_config/layoutlmv3/layoutlmv3_base_inference.yaml', 'resume': False, 'eval_only': False, 'num_gpus': 1, 'num_machines': 1, 'machine_rank': 0, 'dist_url': 'tcp://127.0.0.1:57823', 'opts': ['MODEL.WEIGHTS', '/home/health/.cache/modelscope/hub/opendatalab/PDF-Extract-Kit/models/Layout/model_final.pth']}
[10/18 09:30:19 detectron2]: Contents of args.config_file=/home/health/anaconda3/envs/mineru/lib/python3.10/site-packages/magic_pdf/resources/model_config/layoutlmv3/layoutlmv3_base_inference.yaml:
AUG:
DETR: true
CACHE_DIR: ~/cache/huggingface
CUDNN_BENCHMARK: false
DATALOADER:
ASPECT_RATIO_GROUPING: true
FILTER_EMPTY_ANNOTATIONS: false
NUM_WORKERS: 4
REPEAT_THRESHOLD: 0.0
SAMPLER_TRAIN: TrainingSampler
DATASETS:
PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
PROPOSAL_FILES_TEST: []
PROPOSAL_FILES_TRAIN: []
TEST:

  • scihub_train
    TRAIN:
  • scihub_train
    GLOBAL:
    HACK: 1.0
    ICDAR_DATA_DIR_TEST: ''
    ICDAR_DATA_DIR_TRAIN: ''
    INPUT:
    CROP:
    ENABLED: true
    SIZE:
    • 384
    • 600
      TYPE: absolute_range
      FORMAT: RGB
      MASK_FORMAT: polygon
      MAX_SIZE_TEST: 1333
      MAX_SIZE_TRAIN: 1333
      MIN_SIZE_TEST: 800
      MIN_SIZE_TRAIN:
  • 480
  • 512
  • 544
  • 576
  • 608
  • 640
  • 672
  • 704
  • 736
  • 768
  • 800
    MIN_SIZE_TRAIN_SAMPLING: choice
    RANDOM_FLIP: horizontal
    MODEL:
    ANCHOR_GENERATOR:
    ANGLES:
      • -90
      • 0
      • 90
        ASPECT_RATIOS:
      • 0.5
      • 1.0
      • 2.0
        NAME: DefaultAnchorGenerator
        OFFSET: 0.0
        SIZES:
      • 32
      • 64
      • 128
      • 256
      • 512
        BACKBONE:
        FREEZE_AT: 2
        NAME: build_vit_fpn_backbone
        CONFIG_PATH: ''
        DEVICE: cuda
        FPN:
        FUSE_TYPE: sum
        IN_FEATURES:
    • layer3
    • layer5
    • layer7
    • layer11
      NORM: ''
      OUT_CHANNELS: 256
      IMAGE_ONLY: true
      KEYPOINT_ON: false
      LOAD_PROPOSALS: false
      MASK_ON: true
      META_ARCHITECTURE: VLGeneralizedRCNN
      PANOPTIC_FPN:
      COMBINE:
      ENABLED: true
      INSTANCES_CONFIDENCE_THRESH: 0.5
      OVERLAP_THRESH: 0.5
      STUFF_AREA_LIMIT: 4096
      INSTANCE_LOSS_WEIGHT: 1.0
      PIXEL_MEAN:
  • 127.5
  • 127.5
  • 127.5
    PIXEL_STD:
  • 127.5
  • 127.5
  • 127.5
    PROPOSAL_GENERATOR:
    MIN_SIZE: 0
    NAME: RPN
    RESNETS:
    DEFORM_MODULATED: false
    DEFORM_NUM_GROUPS: 1
    DEFORM_ON_PER_STAGE:
    • false
    • false
    • false
    • false
      DEPTH: 50
      NORM: FrozenBN
      NUM_GROUPS: 1
      OUT_FEATURES:
    • res4
      RES2_OUT_CHANNELS: 256
      RES5_DILATION: 1
      STEM_OUT_CHANNELS: 64
      STRIDE_IN_1X1: true
      WIDTH_PER_GROUP: 64
      RETINANET:
      BBOX_REG_LOSS_TYPE: smooth_l1
      BBOX_REG_WEIGHTS:
    • 1.0
    • 1.0
    • 1.0
    • 1.0
      FOCAL_LOSS_ALPHA: 0.25
      FOCAL_LOSS_GAMMA: 2.0
      IN_FEATURES:
    • p3
    • p4
    • p5
    • p6
    • p7
      IOU_LABELS:
    • 0
    • -1
    • 1
      IOU_THRESHOLDS:
    • 0.4
    • 0.5
      NMS_THRESH_TEST: 0.5
      NORM: ''
      NUM_CLASSES: 10
      NUM_CONVS: 4
      PRIOR_PROB: 0.01
      SCORE_THRESH_TEST: 0.05
      SMOOTH_L1_LOSS_BETA: 0.1
      TOPK_CANDIDATES_TEST: 1000
      ROI_BOX_CASCADE_HEAD:
      BBOX_REG_WEIGHTS:
      • 10.0
      • 10.0
      • 5.0
      • 5.0
      • 20.0
      • 20.0
      • 10.0
      • 10.0
      • 30.0
      • 30.0
      • 15.0
      • 15.0
        IOUS:
    • 0.5
    • 0.6
    • 0.7
      ROI_BOX_HEAD:
      BBOX_REG_LOSS_TYPE: smooth_l1
      BBOX_REG_LOSS_WEIGHT: 1.0
      BBOX_REG_WEIGHTS:
    • 10.0
    • 10.0
    • 5.0
    • 5.0
      CLS_AGNOSTIC_BBOX_REG: true
      CONV_DIM: 256
      FC_DIM: 1024
      NAME: FastRCNNConvFCHead
      NORM: ''
      NUM_CONV: 0
      NUM_FC: 2
      POOLER_RESOLUTION: 7
      POOLER_SAMPLING_RATIO: 0
      POOLER_TYPE: ROIAlignV2
      SMOOTH_L1_BETA: 0.0
      TRAIN_ON_PRED_BOXES: false
      ROI_HEADS:
      BATCH_SIZE_PER_IMAGE: 512
      IN_FEATURES:
    • p2
    • p3
    • p4
    • p5
      IOU_LABELS:
    • 0
    • 1
      IOU_THRESHOLDS:
    • 0.5
      NAME: CascadeROIHeads
      NMS_THRESH_TEST: 0.5
      NUM_CLASSES: 10
      POSITIVE_FRACTION: 0.25
      PROPOSAL_APPEND_GT: true
      SCORE_THRESH_TEST: 0.05
      ROI_KEYPOINT_HEAD:
      CONV_DIMS:
    • 512
    • 512
    • 512
    • 512
    • 512
    • 512
    • 512
    • 512
      LOSS_WEIGHT: 1.0
      MIN_KEYPOINTS_PER_IMAGE: 1
      NAME: KRCNNConvDeconvUpsampleHead
      NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: true
      NUM_KEYPOINTS: 17
      POOLER_RESOLUTION: 14
      POOLER_SAMPLING_RATIO: 0
      POOLER_TYPE: ROIAlignV2
      ROI_MASK_HEAD:
      CLS_AGNOSTIC_MASK: false
      CONV_DIM: 256
      NAME: MaskRCNNConvUpsampleHead
      NORM: ''
      NUM_CONV: 4
      POOLER_RESOLUTION: 14
      POOLER_SAMPLING_RATIO: 0
      POOLER_TYPE: ROIAlignV2
      RPN:
      BATCH_SIZE_PER_IMAGE: 256
      BBOX_REG_LOSS_TYPE: smooth_l1
      BBOX_REG_LOSS_WEIGHT: 1.0
      BBOX_REG_WEIGHTS:
    • 1.0
    • 1.0
    • 1.0
    • 1.0
      BOUNDARY_THRESH: -1
      CONV_DIMS:
    • -1
      HEAD_NAME: StandardRPNHead
      IN_FEATURES:
    • p2
    • p3
    • p4
    • p5
    • p6
      IOU_LABELS:
    • 0
    • -1
    • 1
      IOU_THRESHOLDS:
    • 0.3
    • 0.7
      LOSS_WEIGHT: 1.0
      NMS_THRESH: 0.7
      POSITIVE_FRACTION: 0.5
      POST_NMS_TOPK_TEST: 1000
      POST_NMS_TOPK_TRAIN: 2000
      PRE_NMS_TOPK_TEST: 1000
      PRE_NMS_TOPK_TRAIN: 2000
      SMOOTH_L1_BETA: 0.0
      SEM_SEG_HEAD:
      COMMON_STRIDE: 4
      CONVS_DIM: 128
      IGNORE_VALUE: 255
      IN_FEATURES:
    • p2
    • p3
    • p4
    • p5
      LOSS_WEIGHT: 1.0
      NAME: SemSegFPNHead
      NORM: GN
      NUM_CLASSES: 10
      VIT:
      DROP_PATH: 0.1
      IMG_SIZE:
    • 224
    • 224
      NAME: layoutlmv3_base
      OUT_FEATURES:
    • layer3
    • layer5
    • layer7
    • layer11
      POS_TYPE: abs
      WEIGHTS:
      OUTPUT_DIR:
      SCIHUB_DATA_DIR_TRAIN: ~/publaynet/layout_scihub/train
      SEED: 42
      SOLVER:
      AMP:
      ENABLED: true
      BACKBONE_MULTIPLIER: 1.0
      BASE_LR: 0.0002
      BIAS_LR_FACTOR: 1.0
      CHECKPOINT_PERIOD: 2000
      CLIP_GRADIENTS:
      CLIP_TYPE: full_model
      CLIP_VALUE: 1.0
      ENABLED: true
      NORM_TYPE: 2.0
      GAMMA: 0.1
      GRADIENT_ACCUMULATION_STEPS: 1
      IMS_PER_BATCH: 32
      LR_SCHEDULER_NAME: WarmupCosineLR
      MAX_ITER: 20000
      MOMENTUM: 0.9
      NESTEROV: false
      OPTIMIZER: ADAMW
      REFERENCE_WORLD_SIZE: 0
      STEPS:
  • 10000
    WARMUP_FACTOR: 0.01
    WARMUP_ITERS: 333
    WARMUP_METHOD: linear
    WEIGHT_DECAY: 0.05
    WEIGHT_DECAY_BIAS: null
    WEIGHT_DECAY_NORM: 0.0
    TEST:
    AUG:
    ENABLED: false
    FLIP: true
    MAX_SIZE: 4000
    MIN_SIZES:
    • 400
    • 500
    • 600
    • 700
    • 800
    • 900
    • 1000
    • 1100
    • 1200
      DETECTIONS_PER_IMAGE: 100
      EVAL_PERIOD: 1000
      EXPECTED_RESULTS: []
      KEYPOINT_OKS_SIGMAS: []
      PRECISE_BN:
      ENABLED: false
      NUM_ITER: 200
      VERSION: 2
      VIS_PERIOD: 0

[10/18 09:30:21 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from /home/health/.cache/modelscope/hub/opendatalab/PDF-Extract-Kit/models/Layout/model_final.pth ...
[10/18 09:30:21 fvcore.common.checkpoint]: [Checkpointer] Loading from /home/health/.cache/modelscope/hub/opendatalab/PDF-Extract-Kit/models/Layout/model_final.pth ...
2024-10-18 09:30:22.596 | INFO | magic_pdf.model.pdf_extract_kit:__init__:248 - DocAnalysis init done!
2024-10-18 09:30:22.596 | INFO | magic_pdf.model.doc_analyze_by_custom_model:custom_model_init:98 - model init cost: 28.04293656349182
2024-10-18 09:30:24.960 | INFO | magic_pdf.model.pdf_extract_kit:__call__:259 - layout detection cost: 1.47

0: 1888x1312 219 embeddings, 81 isolateds, 106.1ms
Speed: 18.1ms preprocess, 106.1ms inference, 42.8ms postprocess per image at shape (1, 3, 1888, 1312)
2024-10-18 09:32:48.691 | INFO | magic_pdf.model.pdf_extract_kit:__call__:289 - formula nums: 300, mfr time: 132.87
2024-10-18 09:32:52.197 | INFO | magic_pdf.model.pdf_extract_kit:__call__:372 - ocr cost: 3.47
2024-10-18 09:32:52.901 | INFO | magic_pdf.model.pdf_extract_kit:__call__:259 - layout detection cost: 0.7

0: 1888x1312 290 embeddings, 10 isolateds, 105.5ms
Speed: 18.2ms preprocess, 105.5ms inference, 125.3ms postprocess per image at shape (1, 3, 1888, 1312)
2024-10-18 09:32:53.186 | ERROR | magic_pdf.tools.cli:parse_doc:96 - Coordinate 'right' is less than 'left'
Traceback (most recent call last):

File "/home/health/anaconda3/envs/mineru/bin/magic-pdf", line 8, in <module>
sys.exit(cli())
│ │ └
│ └
└ <module 'sys' (built-in)>
File "/home/health/anaconda3/envs/mineru/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
│ │ │ └ {}
│ │ └ ()
│ └ <function BaseCommand.main at 0x7e1eebe32950>

File "/home/health/anaconda3/envs/mineru/lib/python3.10/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
│ │ └ <click.core.Context object at 0x7e1eec00c880>
│ └ <function Command.invoke at 0x7e1eebe33400>

File "/home/health/anaconda3/envs/mineru/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
│ │ │ │ │ └ {'path': 'small_ocr.pdf', 'output_dir': '/home/health/pdf/', 'method': 'auto', 'debug_able': False, 'start_page_id': 0, 'end_...
│ │ │ │ └ <click.core.Context object at 0x7e1eec00c880>
│ │ │ └ <function cli at 0x7e1db96d53f0>
│ │ └
│ └ <function Context.invoke at 0x7e1eebe32170>
└ <click.core.Context object at 0x7e1eec00c880>
File "/home/health/anaconda3/envs/mineru/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return _callback(*args, **kwargs)
│ └ {'path': 'small_ocr.pdf', 'output_dir': '/home/health/pdf/', 'method': 'auto', 'debug_able': False, 'start_page_id': 0, 'end
...
└ ()
File "/home/health/anaconda3/envs/mineru/lib/python3.10/site-packages/magic_pdf/tools/cli.py", line 102, in cli
parse_doc(path)
│ └ 'small_ocr.pdf'
└ <function cli.<locals>.parse_doc at 0x7e1eec0337f0>

File "/home/health/anaconda3/envs/mineru/lib/python3.10/site-packages/magic_pdf/tools/cli.py", line 84, in parse_doc
do_parse(
└ <function do_parse at 0x7e1db96d4b80>
File "/home/health/anaconda3/envs/mineru/lib/python3.10/site-packages/magic_pdf/tools/common.py", line 79, in do_parse
pipe.pipe_analyze()
│ └ <function UNIPipe.pipe_analyze at 0x7e1db96d4d30>
└ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x7e1db96ba770>
File "/home/health/anaconda3/envs/mineru/lib/python3.10/site-packages/magic_pdf/pipe/UNIPipe.py", line 33, in pipe_analyze
self.model_list = doc_analyze(self.pdf_bytes, ocr=True,
│ │ │ │ └ b'%PDF-1.7\r\n%\xa1\xb3\xc5\xd7\r\n1 0 obj\r\n<</Pages 2 0 R /Type/Catalog>>\r\nendobj\r\n2 0 obj\r\n<</Count 8/Kids[ 4 0 R ...
│ │ │ └ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x7e1db96ba770>
│ │ └ <function doc_analyze at 0x7e1e5a4b95a0>
│ └ []
└ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x7e1db96ba770>
File "/home/health/anaconda3/envs/mineru/lib/python3.10/site-packages/magic_pdf/model/doc_analyze_by_custom_model.py", line 129, in doc_analyze
result = custom_model(img)
│ └ array([[[255, 255, 255],
│ [255, 255, 255],
│ [255, 255, 255],
│ ...,
│ [255, 255, 255],
│ [255...
└ <magic_pdf.model.pdf_extract_kit.CustomPEKModel object at 0x7e1db91afe50>
File "/home/health/anaconda3/envs/mineru/lib/python3.10/site-packages/magic_pdf/model/pdf_extract_kit.py", line 274, in __call__
bbox_img = get_croped_image(Image.fromarray(image), [xmin, ymin, xmax, ymax])
│ │ │ │ │ │ │ └ 0
│ │ │ │ │ │ └ 2361
│ │ │ │ │ └ 88
│ │ │ │ └ 2491
│ │ │ └ array([[[255, 255, 255],
│ │ │ [255, 255, 255],
│ │ │ [255, 255, 255],
│ │ │ ...,
│ │ │ [255, 255, 255],
│ │ │ [255...
│ │ └ <function fromarray at 0x7e1cfd945900>
│ └ <module 'PIL.Image' from '/home/health/anaconda3/envs/mineru/lib/python3.10/site-packages/PIL/Image.py'>
└ <function get_croped_image at 0x7e1cebd8f250>
File "/home/health/anaconda3/envs/mineru/lib/python3.10/site-packages/magic_pdf/model/pek_sub_modules/post_process.py", line 16, in get_croped_image
croped_img = image_pil.crop((x_min, y_min, x_max, y_max))
│ │ │ │ │ └ 0
│ │ │ │ └ 2361
│ │ │ └ 88
│ │ └ 2491
│ └ <function Image.crop at 0x7e1cfd90bac0>
└ <PIL.Image.Image image mode=RGB size=3405x5000 at 0x7E1CD1172AA0>
File "/home/health/anaconda3/envs/mineru/lib/python3.10/site-packages/PIL/Image.py", line 1305, in crop
raise ValueError(msg)
└ "Coordinate 'right' is less than 'left'"

ValueError: Coordinate 'right' is less than 'left'
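
Reading the annotated values in the traceback, the box passed to `crop` is `[xmin, ymin, xmax, ymax] = [2491, 88, 2361, 0]` on a 3405x5000 page image, so both the right edge and the bottom edge are smaller than their opposite edges. Below is a minimal sketch of a guard that could be applied before cropping. The helper name `clamp_crop_box` is hypothetical and is not part of magic-pdf; it only illustrates the check and does not address why such a box was produced in the first place.

```python
from typing import Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]


def clamp_crop_box(box: Sequence[float], image_size: Tuple[int, int]) -> Optional[Box]:
    """Return a box that is safe to crop with, or None if the box is degenerate.

    `box` is (x_min, y_min, x_max, y_max); `image_size` is (width, height).
    Hypothetical helper for illustration only, not magic-pdf's own code.
    """
    width, height = image_size
    x_min, y_min, x_max, y_max = (int(round(v)) for v in box)

    # Clamp every edge to the image bounds.
    x_min, x_max = max(0, x_min), min(width, x_max)
    y_min, y_max = max(0, y_min), min(height, y_max)

    # Reject empty or inverted boxes (right <= left or bottom <= top).
    if x_max <= x_min or y_max <= y_min:
        return None
    return (x_min, y_min, x_max, y_max)


# Values taken from the traceback above.
page_size = (3405, 5000)
bad_box = [2491, 88, 2361, 0]

if clamp_crop_box(bad_box, page_size) is None:
    print("skipping degenerate box:", bad_box)
```

Skipping such a box would only hide the symptom. An inverted box coming out of the detection step suggests the model output itself is unreliable on this machine.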

Operating system | 操作系统

Linux

Python version | Python 版本

3.10

Software version | 软件版本 (magic-pdf --version)

0.8.x

Device mode | 设备模式

cuda

wertyac added the bug label Oct 18, 2024
Collaborator

myhloli commented Oct 18, 2024

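Judging from this part of the log:
0: 1888x1312 219 embeddings, 81 isolateds, 106.1ms
Speed: 18.1ms preprocess, 106.1ms inference, 42.8ms postprocess per image at shape (1, 3, 1888, 1312)
2024-10-18 09:32:48.691 | INFO | magic_pdf.model.pdf_extract_kit:__call__:289 - formula nums: 300, mfr time: 132.87
it looks like network instability during installation left PyTorch improperly installed, so at runtime the MFD (formula detection) stage produced a large number of spurious formula bboxes out of thin air.

I recommend installing on a machine with a fast, stable connection and using a mirror source, to reduce the uncertainty in the installation process.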
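
A quick way to look for an obviously broken PyTorch install before reinstalling is to import it and run a trivial computation. The sketch below is an assumption about what a reasonable sanity check might look like; the function name `torch_install_report` is hypothetical. Passing it does not prove that the detection models produce correct output, it only rules out the most basic failures.

```python
import importlib


def torch_install_report() -> dict:
    """Collect basic facts about the local PyTorch install (illustrative only)."""
    try:
        torch = importlib.import_module("torch")
    except (ImportError, OSError) as exc:
        # A missing or partially downloaded package typically fails here.
        return {"installed": False, "error": repr(exc)}

    report = {
        "installed": True,
        "torch_version": torch.__version__,
        "bundled_cuda": torch.version.cuda,  # None on CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }

    # Tiny matrix product on the CPU: ones(2, 2) @ ones(2, 2) is all 2s.
    x = torch.ones(2, 2)
    report["cpu_matmul_ok"] = bool((x @ x == 2).all())

    # Repeat on the GPU if one is visible to PyTorch.
    if report["cuda_available"]:
        y = torch.ones(2, 2, device="cuda")
        report["gpu_matmul_ok"] = bool((y @ y == 2).all().item())

    return report


print(torch_install_report())
```

If the import fails or either matmul check is false, reinstalling from a reliable mirror, as suggested above, is the natural next step.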

Author

wertyac commented Oct 18, 2024

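> Judging from this part of the log:
> 0: 1888x1312 219 embeddings, 81 isolateds, 106.1ms
> Speed: 18.1ms preprocess, 106.1ms inference, 42.8ms postprocess per image at shape (1, 3, 1888, 1312)
> 2024-10-18 09:32:48.691 | INFO | magic_pdf.model.pdf_extract_kit:__call__:289 - formula nums: 300, mfr time: 132.87
> it looks like network instability during installation left PyTorch improperly installed, so at runtime the MFD (formula detection) stage produced a large number of spurious formula bboxes out of thin air.
>
> I recommend installing on a machine with a fast, stable connection and using a mirror source, to reduce the uncertainty in the installation process.

Thanks, I'll reinstall and try again. Could this be related to the CUDA and cuDNN versions? I'm using CUDA 11.8.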

Collaborator

myhloli commented Oct 18, 2024

PyTorch is installed by pip with its cu12.1 build as a dependency; it does not use the system's CUDA 11.8.
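
This matches the environment dump in the original report, which lists `PyTorch 2.3.1+cu121` and `CUDA Runtime 12.1` alongside `CUDA_HOME /usr/local/cuda-11.8`. As a rough illustration, the CUDA version bundled with a pip wheel can be read from the `+cuXYZ` suffix of the version string. The two helper names below are hypothetical, and the sketch assumes the usual `+cu121` / `cuda-11.8` naming patterns seen in the log.

```python
import re
from typing import Optional


def bundled_cuda_from_version(version: str) -> Optional[str]:
    """Map a wheel version such as '2.3.1+cu121' to '12.1' (None if no +cu tag)."""
    match = re.search(r"\+cu(\d+)$", version)
    if match is None:
        return None
    digits = match.group(1)
    # The last digit is the minor version: 'cu121' -> '12.1', 'cu118' -> '11.8'.
    return f"{digits[:-1]}.{digits[-1]}"


def system_cuda_from_home(cuda_home: str) -> Optional[str]:
    """Map a path such as '/usr/local/cuda-11.8' to '11.8' (None if not versioned)."""
    match = re.search(r"cuda-(\d+\.\d+)", cuda_home)
    return match.group(1) if match else None


# Strings copied from the environment info in the report.
print("bundled with the wheel:", bundled_cuda_from_version("2.3.1+cu121"))
print("system toolkit:", system_cuda_from_home("/usr/local/cuda-11.8"))
```

The two values differ here, and per the comment above that mismatch is expected: the wheel ships its own CUDA runtime, so the system toolkit version is not what PyTorch runs against.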

Author

wertyac commented Oct 18, 2024

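> PyTorch is installed by pip with its cu12.1 build as a dependency; it does not use the system's CUDA 11.8.

Strange. I switched to another server (Ubuntu 22.04, CUDA 12.1, driver 550.120, Python 3.10) and followed the same installation steps, and the error does not occur there. Very odd.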
