
ORT crashes while loading a specific INT4 model #22284

Open
peterer0625 opened this issue Oct 1, 2024 · 0 comments
Labels
ep:DML issues related to the DirectML execution provider

Comments


peterer0625 commented Oct 1, 2024

Describe the issue

ORT crashes while loading a specific INT4 model.

We can reproduce the issue on both the DML EP and the CPU EP.

[image attachment]

Here are the crash dumps - https://www.dropbox.com/scl/fi/h3wvh3vkap83gmvuugebs/onnxruntime-T5-crash-dump.7z?rlkey=kq7tu3i87eplnjo9z232zro10&st=aq0i4hvi&dl=0

The issue goes away if we set the session's graph optimization level to GraphOptimizationLevel.ORT_DISABLE_ALL.
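The workaround above can be sketched as follows. This is a minimal sketch, not the reporter's exact code; the model path and provider list are placeholders, and the SessionOptions / GraphOptimizationLevel API is the standard onnxruntime Python API.

```python
def make_unoptimized_session(model_path="t5_int4.onnx"):
    """Create an InferenceSession with all graph optimizations disabled,
    which reportedly avoids the crash. model_path is a placeholder."""
    # Imported lazily so the sketch can be read without onnxruntime installed.
    import onnxruntime as ort

    opts = ort.SessionOptions()
    opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL
    return ort.InferenceSession(
        model_path,
        sess_options=opts,
        providers=["CPUExecutionProvider"],  # or ["DmlExecutionProvider"]
    )
```

Disabling all optimizations sidesteps the crash but loses fusions and constant folding, so it is a diagnostic workaround rather than a fix.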

To reproduce

  1. Export the SD3 T5 model to ONNX.
  2. Run INT4 quantization on the MatMul nodes following the "Quantize ONNX models" guide in the onnxruntime docs.
  3. Run inference on the quantized model.
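Step 2 above can be sketched with onnxruntime's 4-bit weight-only quantizer. This is an illustrative sketch, not the reporter's script: the file paths are placeholders, and block_size/is_symmetric are example values (defaults may differ across onnxruntime versions).

```python
def quantize_matmul_int4(src_path="t5.onnx", dst_path="t5_int4.onnx"):
    """Apply 4-bit weight-only quantization to MatMul nodes.
    Paths are placeholders for the exported SD3 T5 model."""
    # Imported lazily so the sketch can be read without onnxruntime installed.
    import onnx
    from onnxruntime.quantization.matmul_4bits_quantizer import MatMul4BitsQuantizer

    model = onnx.load(src_path)
    quantizer = MatMul4BitsQuantizer(model, block_size=32, is_symmetric=True)
    quantizer.process()  # rewrites eligible MatMul weights in 4-bit form
    quantizer.model.save_model_to_file(dst_path)
```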

Urgency

No response

Platform

Windows

OS Version

26100

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

Quantization was done with the newest commit; inference can be reproduced with ORT-DML 1.19.0.

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU, DirectML

Execution Provider Library Version

No response

@github-actions github-actions bot added the ep:DML issues related to the DirectML execution provider label Oct 1, 2024
@peterer0625 peterer0625 changed the title ORT would be crashed while loading a model ORT would be crashed while loading the specific INT4 model Oct 1, 2024
Development

No branches or pull requests

1 participant