API Reference
This page provides detailed documentation for the Quantize API.
quantize Module
Quantize - A simple Python library for quantizing floating point values to int4 values.
- quantize.dequantize_from_int4(quantized_values: ndarray, scale: float, zero_point: float = 0) → ndarray
Dequantize int4 values back to floating point.
- Parameters:
quantized_values – Array of quantized int4 values (stored as int8)
scale – Scaling factor used during quantization
zero_point – Zero point offset (usually 0 for symmetric quantization)
- Returns:
Array of dequantized floating point values
- quantize.quantize_to_int4(values: List[float] | ndarray, scale_method: str = 'minmax') → Tuple[ndarray, float, float]
Quantize floating point values to int4 values (4-bit integers).
Int4 values range from -8 to 7 (16 distinct values).
- Parameters:
values – List or array of floating point values to quantize
scale_method – Method to determine scaling factor (‘minmax’ or ‘absmax’)
- Returns:
Tuple of (quantized_values, scale, zero_point):
quantized_values – numpy array of int4 values (stored as int8)
scale – scaling factor used for quantization
zero_point – zero point offset (usually 0 for symmetric quantization)
quantize.quantize Module
Implementation of quantization functions for converting between floating point and int4 values.
- quantize.quantize.dequantize_from_int4(quantized_values: ndarray, scale: float, zero_point: float = 0) → ndarray
Dequantize int4 values back to floating point.
- Parameters:
quantized_values – Array of quantized int4 values (stored as int8)
scale – Scaling factor used during quantization
zero_point – Zero point offset (usually 0 for symmetric quantization)
- Returns:
Array of dequantized floating point values
- quantize.quantize.pack_int4_to_int8(int4_values: ndarray) → ndarray
Pack two int4 values into each int8 value to save memory.
- Parameters:
int4_values – Array of int4 values (stored as int8)
- Returns:
Array of packed int8 values (half the length of input)
- quantize.quantize.quantize_to_int4(values: List[float] | ndarray, scale_method: str = 'minmax') → Tuple[ndarray, float, float]
Quantize floating point values to int4 values (4-bit integers).
Int4 values range from -8 to 7 (16 distinct values).
- Parameters:
values – List or array of floating point values to quantize
scale_method – Method to determine scaling factor (‘minmax’ or ‘absmax’)
- Returns:
Tuple of (quantized_values, scale, zero_point):
quantized_values – numpy array of int4 values (stored as int8)
scale – scaling factor used for quantization
zero_point – zero point offset (usually 0 for symmetric quantization)
Core Functions
quantize_to_int4
- quantize.quantize_to_int4(values: List[float] | ndarray, scale_method: str = 'minmax') → Tuple[ndarray, float, float]
The quantize_to_int4 function converts floating point values to int4 values (4-bit integers). Int4 values range from -8 to 7, providing 16 distinct values.
- Parameters:
values (Union[List[float], np.ndarray]): List or array of floating point values to quantize
scale_method (str, optional): Method to determine the scaling factor. Default is “minmax”. Options are:
“minmax”: Maps the min and max values to the int4 range
“absmax”: Maps the absolute max value to the int4 range
- Returns:
Tuple[np.ndarray, float, float]: A tuple containing:
quantized_values: numpy array of int4 values (stored as int8)
scale: scaling factor used for quantization
zero_point: zero point offset (usually 0 for symmetric quantization)
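The two scale_method options can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the name sketch_quantize_to_int4, the formulas for scale and zero_point, and the handling of constant input are all assumptions based on one common convention, and plain lists stand in for NumPy arrays to keep the arithmetic visible.

```python
from typing import List, Tuple

INT4_MIN, INT4_MAX = -8, 7  # the 16 representable int4 values


def sketch_quantize_to_int4(
    values: List[float], scale_method: str = "minmax"
) -> Tuple[List[int], float, float]:
    """Hypothetical stand-in for quantize_to_int4; the formulas are assumed."""
    if not values:
        raise ValueError("values must not be empty")
    lo, hi = min(values), max(values)

    if scale_method == "absmax":
        # Symmetric: the largest magnitude maps to 7 and 0.0 maps to 0.
        scale = max(abs(lo), abs(hi)) / INT4_MAX
    elif scale_method == "minmax":
        # Asymmetric: the minimum maps to -8 and the maximum maps to 7.
        scale = (hi - lo) / (INT4_MAX - INT4_MIN)
    else:
        raise ValueError(f"unknown scale_method: {scale_method!r}")

    if scale == 0.0:
        scale = 1.0  # constant (or all-zero) input: avoid dividing by zero

    # Assumed convention: q = round(x / scale + zero_point).
    zero_point = 0.0 if scale_method == "absmax" else INT4_MIN - lo / scale

    # Scale, shift, round to the nearest integer, then clamp into [-8, 7].
    quantized = [
        min(INT4_MAX, max(INT4_MIN, round(v / scale + zero_point)))
        for v in values
    ]
    return quantized, scale, zero_point


q_minmax, scale_minmax, zp_minmax = sketch_quantize_to_int4(
    [0.0, 1.0, 2.0, 7.5], "minmax"
)
q_absmax, scale_absmax, zp_absmax = sketch_quantize_to_int4(
    [-3.5, -1.0, 0.0, 1.2, 3.5], "absmax"
)
print(q_minmax, scale_minmax, zp_minmax)  # [-8, -6, -4, 7] 0.5 -8.0
print(q_absmax, scale_absmax, zp_absmax)  # [-7, -2, 0, 2, 7] 0.5 0.0
```

Under these assumptions, “absmax” never produces -8, because the input range is mapped symmetrically onto [-7, 7], while “minmax” uses all 16 levels at the cost of a non-zero zero_point.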
dequantize_from_int4
- quantize.dequantize_from_int4(quantized_values: ndarray, scale: float, zero_point: float = 0) → ndarray
The dequantize_from_int4 function converts int4 values back to floating point.
- Parameters:
quantized_values (np.ndarray): Array of quantized int4 values (stored as int8)
scale (float): Scaling factor used during quantization
zero_point (float, optional): Zero point offset. Default is 0.
- Returns:
np.ndarray: Array of dequantized floating point values
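Dequantization inverts the scaling step. The following is a minimal sketch that assumes the convention x ≈ (q - zero_point) * scale; the library may define zero_point differently, and the name sketch_dequantize_from_int4 is hypothetical. Plain lists are used in place of NumPy arrays.

```python
from typing import List


def sketch_dequantize_from_int4(
    quantized_values: List[int], scale: float, zero_point: float = 0
) -> List[float]:
    """Hypothetical stand-in: undo the shift, then undo the scaling."""
    return [(q - zero_point) * scale for q in quantized_values]


# Values that were (hypothetically) quantized with scale=0.5, zero_point=0.
original = [-3.5, -1.0, 0.0, 1.2, 3.5]
quantized = [-7, -2, 0, 2, 7]

restored = sketch_dequantize_from_int4(quantized, scale=0.5)
print(restored)  # [-3.5, -1.0, 0.0, 1.0, 3.5]

# Rounding to the nearest level bounds the round-trip error by scale / 2
# for inputs that fall inside the quantized range.
max_error = max(abs(a - b) for a, b in zip(original, restored))
assert max_error <= 0.5 / 2
```

The value 1.2 comes back as 1.0: quantization is lossy, and the step size scale determines how much precision is lost.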
Memory Optimization Functions
pack_int4_to_int8
- quantize.quantize.pack_int4_to_int8(int4_values: ndarray) → ndarray
The pack_int4_to_int8 function packs two int4 values into each int8 value to save memory.
- Parameters:
int4_values (np.ndarray): Array of int4 values (stored as int8)
- Returns:
np.ndarray: Array of packed int8 values (half the length of input)
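One way to pack two 4-bit values into a byte is sketched below. The nibble order (first value in the low four bits), the zero padding for odd-length input, and the name sketch_pack_int4_to_int8 are assumptions made for illustration; the library's actual layout may differ.

```python
from typing import List


def sketch_pack_int4_to_int8(int4_values: List[int]) -> List[int]:
    """Hypothetical stand-in: pack pairs of int4 values into signed bytes."""
    values = list(int4_values)
    if any(not -8 <= v <= 7 for v in values):
        raise ValueError("all values must lie in the int4 range [-8, 7]")
    if len(values) % 2:
        values.append(0)  # assumed padding so every byte holds two values

    packed = []
    for low, high in zip(values[0::2], values[1::2]):
        # Keep the two's-complement low 4 bits of each value.
        byte = ((high & 0x0F) << 4) | (low & 0x0F)  # 0..255
        # Reinterpret the unsigned byte as a signed int8 (-128..127).
        packed.append(byte - 256 if byte >= 128 else byte)
    return packed


print(sketch_pack_int4_to_int8([-8, 7, 1, -1]))  # [120, -15]
```

Four int4 values occupy two bytes here instead of four, which is the halving described above.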
unpack_int8_to_int4
- quantize.quantize.unpack_int8_to_int4(packed_values: ndarray) → ndarray
The unpack_int8_to_int4 function unpacks int8 values back into int4 values.
- Parameters:
packed_values (np.ndarray): Array of packed int8 values
- Returns:
np.ndarray: Array of unpacked int4 values (twice the length of input)
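Unpacking splits each byte into its two nibbles and sign-extends them back to the range -8 to 7. A minimal sketch, assuming the first value of each pair was stored in the low four bits; that layout and the name sketch_unpack_int8_to_int4 are hypothetical, not taken from the library.

```python
from typing import List


def sketch_unpack_int8_to_int4(packed_values: List[int]) -> List[int]:
    """Hypothetical stand-in: split signed bytes into two int4 values each."""
    unpacked = []
    for value in packed_values:
        byte = value & 0xFF  # view the signed int8 as an unsigned byte
        for nibble in (byte & 0x0F, byte >> 4):  # low nibble first (assumed)
            # Sign-extend: nibbles 8..15 represent -8..-1.
            unpacked.append(nibble - 16 if nibble >= 8 else nibble)
    return unpacked


print(sketch_unpack_int8_to_int4([120, -15]))  # [-8, 7, 1, -1]
```

Because the output always has an even length, a caller that packed an odd number of values would need to keep the original length and trim any padding element after unpacking.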
Example Function
example
The example function demonstrates the quantization process with sample values.
- Returns:
Tuple[np.ndarray, np.ndarray, np.ndarray]: A tuple containing:
Original floating point values
Quantized int4 values
Dequantized floating point values