Quantization
FuriosaAIQuantizer
Handles the FuriosaAI quantization process for models shared on huggingface.co/models.
Computes the quantization ranges.
fit
`( dataset: Dataset, calibration_config: CalibrationConfig, batch_size: int = 1 )`
Parameters
- `dataset` (`Dataset`) — The dataset to use when performing the calibration step.
- `calibration_config` (`CalibrationConfig`) — The configuration containing the parameters related to the calibration step.
- `batch_size` (`int`, *optional*, defaults to 1) — The batch size to use when collecting the quantization range values.
Performs the calibration step and computes the quantization ranges.
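As a minimal sketch, calibration over a prepared dataset could look like the following; `quantizer`, `calibration_dataset`, and `calibration_config` are assumed to exist already (see `from_pretrained` and `get_calibration_dataset`), and the assumption that `fit` returns the computed ranges is illustrative:

```python
# Minimal sketch: assumes `quantizer` is a FuriosaAIQuantizer and that
# `calibration_dataset` / `calibration_config` were created beforehand.
ranges = quantizer.fit(
    dataset=calibration_dataset,
    calibration_config=calibration_config,
    batch_size=8,
)
# The computed ranges can then be passed to quantize() as
# `calibration_tensors_range` when applying static quantization.
```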
from_pretrained
`( model_or_path: Union[FuriosaAIModel, str, Path], file_name: Optional[str] = None )`
Parameters
- `model_or_path` (`Union[FuriosaAIModel, str, Path]`) — Can be either:
  - A path to a saved exported ONNX Intermediate Representation (IR) model, e.g., `./my_model_directory/`.
  - Or a `FuriosaAIModelForXX` class, e.g., `FuriosaAIModelForImageClassification`.
- `file_name` (`Optional[str]`, *optional*) — Overwrites the default model file name from `"model.onnx"` to `file_name`. This allows you to load different model files from the same repository or directory.
Instantiates a FuriosaAIQuantizer from a model path.
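A minimal sketch of instantiating the quantizer; the `optimum.furiosa` import path is an assumption and the directory path is illustrative:

```python
# Sketch only: the import path may differ across optimum-furiosa versions.
from optimum.furiosa import FuriosaAIQuantizer

# From a local directory containing an exported ONNX model (default file
# name "model.onnx"); pass `file_name` to pick a different file.
quantizer = FuriosaAIQuantizer.from_pretrained("./my_model_directory")
```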
get_calibration_dataset
`( dataset_name: str, num_samples: int = 100, dataset_config_name: Optional[str] = None, dataset_split: Optional[str] = None, preprocess_function: Optional[Callable] = None, preprocess_batch: bool = True, seed: int = 2016, use_auth_token: bool = False )`
Parameters
- `dataset_name` (`str`) — The dataset repository name on the Hugging Face Hub, or the path to a local directory containing data files to load for the calibration step.
- `num_samples` (`int`, *optional*, defaults to 100) — The maximum number of samples composing the calibration dataset.
- `dataset_config_name` (`Optional[str]`, *optional*) — The name of the dataset configuration.
- `dataset_split` (`Optional[str]`, *optional*) — Which split of the dataset to use to perform the calibration step.
- `preprocess_function` (`Optional[Callable]`, *optional*) — The processing function to apply to each example after loading the dataset.
- `preprocess_batch` (`bool`, *optional*, defaults to `True`) — Whether the `preprocess_function` should be batched.
- `seed` (`int`, *optional*, defaults to 2016) — The random seed to use when shuffling the calibration dataset.
- `use_auth_token` (`bool`, *optional*, defaults to `False`) — Whether to use the token generated when running `transformers-cli login` (necessary for some datasets like ImageNet).
Creates the calibration `datasets.Dataset` to use for the post-training static quantization calibration step.
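A sketch of building a calibration dataset; the `beans` dataset, the feature extractor checkpoint, and the preprocessing function are illustrative choices, and `quantizer` is assumed to be an existing `FuriosaAIQuantizer`:

```python
# Illustrative sketch: dataset, checkpoint, and preprocessing are examples only.
from transformers import AutoFeatureExtractor

feature_extractor = AutoFeatureExtractor.from_pretrained("microsoft/resnet-50")

def preprocess_fn(examples):
    # Convert raw images into the model's expected inputs (e.g. pixel_values).
    return feature_extractor(examples["image"])

calibration_dataset = quantizer.get_calibration_dataset(
    "beans",                      # dataset repository name on the Hugging Face Hub
    num_samples=100,
    dataset_split="train",
    preprocess_function=preprocess_fn,
    preprocess_batch=True,
)
```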
partial_fit
`( dataset: Dataset, calibration_config: CalibrationConfig, batch_size: int = 1 )`
Parameters
- `dataset` (`Dataset`) — The dataset to use when performing the calibration step.
- `calibration_config` (`CalibrationConfig`) — The configuration containing the parameters related to the calibration step.
- `batch_size` (`int`, *optional*, defaults to 1) — The batch size to use when collecting the quantization range values.
Performs the calibration step and collects the quantization ranges without computing them.
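A sketch of collecting ranges incrementally, which can be useful when the calibration data is processed in shards; `quantizer`, `calibration_config`, and the `dataset_shards` iterable are assumptions, and how the accumulated statistics are turned into final ranges is not shown here:

```python
# Sketch only: `dataset_shards` is a hypothetical iterable of datasets.Dataset
# objects, e.g. produced by splitting a large calibration set.
for shard in dataset_shards:
    # Collect the range statistics for this shard; the final ranges are
    # computed in a separate step.
    quantizer.partial_fit(
        dataset=shard,
        calibration_config=calibration_config,
        batch_size=8,
    )
```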
quantize
`( quantization_config: QuantizationConfig, save_dir: Union[str, Path], file_suffix: Optional[str] = "quantized", calibration_tensors_range: Optional[Dict[NodeName, Tuple[float, float]]] = None )`
Parameters
- `quantization_config` (`QuantizationConfig`) — The configuration containing the parameters related to quantization.
- `save_dir` (`Union[str, Path]`) — The directory where the quantized model should be saved.
- `file_suffix` (`Optional[str]`, *optional*, defaults to `"quantized"`) — The file suffix used to save the quantized model.
- `calibration_tensors_range` (`Optional[Dict[NodeName, Tuple[float, float]]]`, *optional*) — The dictionary mapping node names to their quantization ranges, used and required only when applying static quantization.
Quantizes a model given the quantization specifications defined in `quantization_config`.
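Putting the pieces together, a hedged end-to-end sketch; only the quantizer methods come from this reference, while the import path, the way `quantization_config` and `calibration_config` are built, and the model and dataset choices are assumptions:

```python
# End-to-end sketch: import path and configuration helpers are assumptions and
# may differ across optimum-furiosa versions.
from optimum.furiosa import FuriosaAIQuantizer

quantizer = FuriosaAIQuantizer.from_pretrained("./my_model_directory")

calibration_dataset = quantizer.get_calibration_dataset(
    "beans",                            # illustrative dataset
    num_samples=50,
    dataset_split="train",
    preprocess_function=preprocess_fn,  # hypothetical preprocessing callable
)

# `calibration_config` and `quantization_config` are assumed to be
# CalibrationConfig / QuantizationConfig instances built with the library's
# configuration helpers (not shown here).
ranges = quantizer.fit(
    dataset=calibration_dataset,
    calibration_config=calibration_config,
)

quantizer.quantize(
    quantization_config=quantization_config,
    save_dir="./quantized_model",
    calibration_tensors_range=ranges,   # required only for static quantization
)
```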