Fix import error from quantization before PyTorch 2.3
Summary: pytorch#884 introduced a quantizer that is not available before
PyTorch 2.3, causing import errors for users on earlier versions.
In torchao, the import is gated by PyTorch version; we do the same here.
andrewor14 committed May 2, 2024
1 parent f5b8eaf commit ef82676
Showing 2 changed files with 9 additions and 4 deletions.
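The fix below gates an import on the installed PyTorch version. As a rough, self-contained sketch (the helper `version_at_least` and its parsing are illustrative assumptions, not torchao's actual implementation of `TORCH_VERSION_AFTER_2_3`), such a flag could be derived from a version string like `torch.__version__`:

```python
# Illustrative sketch only: one way to derive a "version >= 2.3" flag
# from a version string such as torch.__version__. The real torchao
# helper TORCH_VERSION_AFTER_2_3 may be implemented differently.
def version_at_least(version_str: str, major: int, minor: int) -> bool:
    # Drop local build metadata, e.g. "2.3.0+cu121" -> "2.3.0".
    base = version_str.split("+")[0]
    parts = base.split(".")
    return (int(parts[0]), int(parts[1])) >= (major, minor)

# e.g. version_at_least("2.3.0+cu121", 2, 3) is True,
#      version_at_least("2.2.1", 2, 3) is False.
```

In practice a library such as `packaging` handles pre-release and dev suffixes more robustly than this hand-rolled parse.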
2 changes: 1 addition & 1 deletion recipes/quantize.py
@@ -47,7 +47,7 @@ class QuantizationRecipe:
         multiple of groupsize.
         `percdamp`: GPTQ stablization hyperparameter, recommended to be .01
-        8da4w:
+        8da4w (PyTorch 2.3+):
             torchtune.utils.quantization.Int8DynActInt4WeightQuantizer
             int8 per token dynamic activation with int4 weight only per axis group quantization
     Args:
11 changes: 8 additions & 3 deletions torchtune/utils/quantization.py
@@ -11,15 +11,14 @@
     apply_weight_only_int8_quant,
     Int4WeightOnlyGPTQQuantizer,
     Int4WeightOnlyQuantizer,
-    Int8DynActInt4WeightQuantizer,
     Quantizer,
 )
+from torchao.quantization.utils import TORCH_VERSION_AFTER_2_3

 __all__ = [
     "Int4WeightOnlyQuantizer",
     "Int4WeightOnlyGPTQQuantizer",
     "Int8WeightOnlyQuantizer",
-    "Int8DynActInt4WeightQuantizer",
     "get_quantizer_mode",
 ]

@@ -36,10 +35,16 @@ def quantize(
     Int4WeightOnlyQuantizer: "4w",
     Int8WeightOnlyQuantizer: "8w",
     Int4WeightOnlyGPTQQuantizer: "4w-gptq",
-    Int8DynActInt4WeightQuantizer: "8da4w",
 }


+if TORCH_VERSION_AFTER_2_3:
+    from torchao.quantization.quant_api import Int8DynActInt4WeightQuantizer
+
+    __all__.append("Int8DynActInt4WeightQuantizer")
+    _quantizer_to_mode[Int8DynActInt4WeightQuantizer] = "8da4w"
+
+
 def get_quantizer_mode(quantizer: Optional[Callable]) -> Optional[str]:
     """Given a quantizer object, returns a string that specifies the type of quantization e.g.
     4w, which means int4 weight only quantization.
