GPTQ error: TypeError: descriptor 'to' for 'torch._C.TensorBase' objects doesn't apply to a 'torch.device' object #1189

Open
ambitious-octopus opened this issue Aug 23, 2024 · 6 comments
@ambitious-octopus
Contributor

ambitious-octopus commented Aug 23, 2024

Issue Type

Bug

Source

pip (mct-nightly)

MCT Version

PR #1186

OS Platform and Distribution

Linux Ubuntu 22.04

Python version

3.10

Describe the issue

I'm attempting to quantize a YOLOv8n model from the Ultralytics package using MCT GPTQ. However, I encounter this error during the calibration process:

-> 1106 gptq_quant_model, _ = mct.gptq.pytorch_gradient_post_training_quantization(
   1107     model=self.model,
   1108     representative_data_gen=representative_dataset_gen,
   1109     target_resource_utilization=resource_utilization,
   1110     gptq_config=gptq_config,
   1111     core_config=config,
   1112     target_platform_capabilities=tpc)
   1114 print('Quantized-GPTQ model is ready')
   1116 return f, None

File ~/repos/model_optimization/model_compression_toolkit/gptq/pytorch/quantization_facade.py:196, in pytorch_gradient_post_training_quantization(model, representative_data_gen, target_resource_utilization, core_config, gptq_config, gptq_representative_data_gen, target_platform_capabilities)
    191 float_graph = copy.deepcopy(graph)
    193 # ---------------------- #
    194 # GPTQ Runner
    195 # ---------------------- #
--> 196 graph_gptq = gptq_runner(graph,
    197                          core_config,
    198                          gptq_config,
    199                          representative_data_gen,
    200                          gptq_representative_data_gen if gptq_representative_data_gen else representative_data_gen,
    201                          DEFAULT_PYTORCH_INFO,
    202                          fw_impl,
    203                          tb_w,
    204                          hessian_info_service=hessian_info_service)
    206 if core_config.debug_config.analyze_similarity:
    207     analyzer_model_quantization(representative_data_gen,
    208                                 tb_w,
    209                                 float_graph,
    210                                 graph_gptq,
    211                                 fw_impl,
    212                                 DEFAULT_PYTORCH_INFO)

File ~/repos/model_optimization/model_compression_toolkit/gptq/runner.py:115, in gptq_runner(tg, core_config, gptq_config, representative_data_gen, gptq_representative_data_gen, fw_info, fw_impl, tb_w, hessian_info_service)
    111 #############################################
    112 # Gradient Based Post Training Quantization
    113 #############################################
    114 Logger.info("Running GPTQ optimization.")
--> 115 tg_gptq = _apply_gptq(gptq_config,
    116                       gptq_representative_data_gen,
    117                       tb_w,
    118                       tg,
    119                       tg_bias,
    120                       fw_info,
    121                       fw_impl,
    122                       hessian_info_service=hessian_info_service)
    124 return tg_gptq

File ~/repos/model_optimization/model_compression_toolkit/gptq/runner.py:62, in _apply_gptq(gptq_config, representative_data_gen, tb_w, tg, tg_bias, fw_info, fw_impl, hessian_info_service)
     43 """
     44 Apply GPTQ to improve accuracy of quantized model.
     45 Build two models from a graph: A teacher network (float model) and a student network (quantized model).
   (...)
     59
     60 """
     61 if gptq_config is not None and gptq_config.n_epochs > 0:
---> 62     tg_bias = gptq_training(tg,
     63                             tg_bias,
     64                             gptq_config,
     65                             representative_data_gen,
     66                             fw_impl,
     67                             fw_info,
     68                             hessian_info_service=hessian_info_service)
     70     if tb_w is not None:
     71         tb_w.add_graph(tg_bias, 'after_gptq')

File ~/repos/model_optimization/model_compression_toolkit/gptq/common/gptq_training.py:287, in gptq_training(graph_float, graph_quant, gptq_config, representative_data_gen, fw_impl, fw_info, hessian_info_service)
    278 gptq_trainer = gptq_trainer_obj(graph_float,
    279                                 graph_quant,
    280                                 gptq_config,
   (...)
    283                                 representative_data_gen,
    284                                 hessian_info_service=hessian_info_service)
    286 # Training process
--> 287 gptq_trainer.train(representative_data_gen)
    289 # Update graph
    290 graph_quant = gptq_trainer.update_graph()

File ~/repos/model_optimization/model_compression_toolkit/gptq/pytorch/gptq_training.py:193, in PytorchGPTQTrainer.train(self, representative_data_gen)
    190     optimizer.add_param_group({'params': params})
    192 # Set models mode
--> 193 set_model(self.float_model, False)
    194 set_model(self.fxp_model, True)
    195 self._set_requires_grad()

File ~/repos/model_optimization/model_compression_toolkit/core/pytorch/utils.py:41, in set_model(model, train_mode)
     38     model.eval()
     40 device = get_working_device()
---> 41 model.to(device)

TypeError: descriptor 'to' for 'torch._C.TensorBase' objects doesn't apply to a 'torch.device' object

cc: @Idan-BenAmi

Expected behaviour

No response

Code to reproduce the issue

Dependencies:

  1. Ultralytics package:
pip install git+https://github.com/ultralytics/ultralytics.git@quan
  2. MCT:
pip install git+https://github.com/ambitious-octopus/model_optimization.git@get-output-fix

Code:

import os
import model_compression_toolkit as mct
from tutorials.mct_model_garden.evaluation_metrics.coco_evaluation import coco_dataset_generator
from tutorials.mct_model_garden.models_pytorch.yolov8.yolov8_preprocess import yolov8_preprocess_chw_transpose
from typing import Iterator, Tuple, List
import wget 
import zipfile
import logging


DATASET_ROOT = "./coco"

if not os.path.isdir(DATASET_ROOT):
    logging.info('Downloading COCO dataset')
    os.mkdir(DATASET_ROOT)
    wget.download('http://images.cocodataset.org/annotations/annotations_trainval2017.zip')
    with zipfile.ZipFile("annotations_trainval2017.zip", 'r') as zip_ref:
        zip_ref.extractall(DATASET_ROOT)
    os.remove('annotations_trainval2017.zip')
    
    wget.download('http://images.cocodataset.org/zips/val2017.zip')
    with zipfile.ZipFile("val2017.zip", 'r') as zip_ref:
        zip_ref.extractall(DATASET_ROOT)
    os.remove('val2017.zip')
    

from ultralytics import YOLO
from ultralytics.nn.modules import C2f, Detect
model = YOLO("yolov8n.pt").model

for m in model.modules():
    if isinstance(m, C2f):
        m.forward = m.forward_fx
    if isinstance(m, Detect):
        m.export = True
        m.format = "mct"

REPRESENTATIVE_DATASET_FOLDER = f'{DATASET_ROOT}/val2017/'
REPRESENTATIVE_DATASET_ANNOTATION_FILE = f'{DATASET_ROOT}/annotations/instances_val2017.json'
BATCH_SIZE = 4
n_iters = 20

# Load representative dataset
logging.info('Loading representative dataset')
representative_dataset = coco_dataset_generator(dataset_folder=REPRESENTATIVE_DATASET_FOLDER,
                                                annotation_file=REPRESENTATIVE_DATASET_ANNOTATION_FILE,
                                                preprocess=yolov8_preprocess_chw_transpose,
                                                batch_size=BATCH_SIZE)

# Define representative dataset generator
def get_representative_dataset(n_iter: int, dataset_loader: Iterator[Tuple]):
    """
    This function creates a representative dataset generator. The generator yields numpy
        arrays of batches of shape: [Batch, H, W, C].
    Args:
        n_iter: number of iterations for MCT to calibrate on
        dataset_loader: iterable of dataset batches; the first element of each batch is yielded
    Returns:
        A representative dataset generator
    """
    def representative_dataset() -> Iterator[List]:
        ds_iter = iter(dataset_loader)
        for _ in range(n_iter):
            yield [next(ds_iter)[0]]

    return representative_dataset

logging.info('Creating representative dataset generator')
# Get representative dataset generator
representative_dataset_gen = get_representative_dataset(n_iter=n_iters,
                                                        dataset_loader=representative_dataset)

# Set IMX500-v1 TPC
logging.info('Setting target platform capabilities')
tpc = mct.get_target_platform_capabilities(fw_name="pytorch",
                                           target_platform_name='imx500',
                                           target_platform_version='v1')

# Specify the necessary configuration for mixed precision quantization. To keep the tutorial brief,
# we use a small set of images and omit the Hessian metric for mixed precision calculations.
# Note that this choice may affect the resulting accuracy.
mp_config = mct.core.MixedPrecisionQuantizationConfig(num_of_images=5,
                                                      use_hessian_based_scores=False)
config = mct.core.CoreConfig(mixed_precision_config=mp_config,
                             quantization_config=mct.core.QuantizationConfig(shift_negative_activation_correction=True))


# Define the target Resource Utilization for mixed precision weights quantization (75% of 'standard' 8-bit quantization)
resource_utilization_data = mct.core.pytorch_resource_utilization_data(in_model=model,
                                                                       representative_data_gen=
                                                                       representative_dataset_gen,
                                                                       core_config=config,
                                                                       target_platform_capabilities=tpc)


resource_utilization = mct.core.ResourceUtilization(weights_memory=resource_utilization_data.weights_memory * 0.75)

# Specify the necessary configuration for Gradient-Based PTQ.
n_gptq_epochs = 1000
gptq_config = mct.gptq.get_pytorch_gptq_config(n_epochs=n_gptq_epochs, use_hessian_based_weights=False)

# Perform Gradient-Based Post Training Quantization
gptq_quant_model, _ = mct.gptq.pytorch_gradient_post_training_quantization(
    model=model,
    representative_data_gen=representative_dataset_gen,
    target_resource_utilization=resource_utilization,
    gptq_config=gptq_config,
    core_config=config,
    target_platform_capabilities=tpc)

print('Quantized-GPTQ model is ready')

Log output

No response

@Idan-BenAmi
Collaborator

Hi @ambitious-octopus,
once I'm able to reproduce the issue in #1186, I'll start to debug this issue.

@ofirgo linked a pull request on Sep 2, 2024 that will close this issue
@Idan-BenAmi
Collaborator

Hi @ambitious-octopus ,
We have found the root cause of this error. Your model performs operations on constants, such as "to" and "mul", which cause failures in MCT (specifically, the model.to(device) error).
More specifically, I think these operations are performed in the anchor preparation in your model.
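To illustrate how an operation named "to" could end up breaking model.to(device), here is a minimal, framework-free sketch. It assumes (this is not confirmed in the thread) that the rebuilt model stores a traced operation on the instance under the node's name, so the stored object hides the class's own `to` method. `Module` and `Device` are hypothetical stand-ins for torch.nn.Module and torch.device, and `str.upper` plays the role of a tensor method such as torch.Tensor.to.

```python
class Device:
    """Hypothetical stand-in for torch.device."""

    def __init__(self, name):
        self.name = name


class Module:
    """Hypothetical stand-in for torch.nn.Module."""

    def __init__(self):
        self.device = None

    def to(self, device):
        self.device = device
        return self


def call_to(model, device):
    """Mimic the failing call: move the model and report what happened."""
    try:
        model.to(device)
        return "ok"
    except TypeError as exc:
        return str(exc)


healthy = Module()
shadowed = Module()
# Assumption: a traced op is stored on the instance under the node name "to".
# An instance attribute wins over the class method, so Module.to is hidden.
shadowed.to = str.upper

healthy_result = call_to(healthy, Device("cpu"))
shadowed_result = call_to(shadowed, Device("cpu"))
print(healthy_result)
print(shadowed_result)
```

The second call fails with the same kind of "descriptor ... doesn't apply to a ... object" message as in the traceback, because the device is passed to an unbound method of an unrelated type.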

This issue runs deeper, as manipulating constants during model inference can lead to accuracy degradation. Performing these manipulations in advance and using final constant values instead would enhance accuracy and reduce unnecessary calculations. Therefore, we recommend removing constant manipulations from the model and using the finalized constant values instead. This approach should also resolve issue 1189.
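As a rough sketch of what "using the finalized constant values" could look like, the example below compares a head that rescales its anchors on every call with one that does the scaling once up front. It is framework-free and purely illustrative: the class names, `make_anchor_points`, and the grid/stride values are hypothetical and are not taken from the YOLOv8 code. In PyTorch, the precomputed values would typically be stored on the module, for example with `register_buffer`.

```python
def make_anchor_points(grid_size):
    """Cell-centre coordinates for a square grid (hypothetical helper)."""
    return [(x + 0.5, y + 0.5) for y in range(grid_size) for x in range(grid_size)]


class HeadWithRuntimeConstants:
    """Scales the anchors on every call, so the multiply is part of inference."""

    def __init__(self, grid_size, stride):
        self.anchors = make_anchor_points(grid_size)
        self.stride = stride

    def forward(self, offsets):
        scaled = [(ax * self.stride, ay * self.stride) for ax, ay in self.anchors]
        return [(sx + dx, sy + dy) for (sx, sy), (dx, dy) in zip(scaled, offsets)]


class HeadWithFoldedConstants:
    """Scales the anchors once at construction; inference only sees final values."""

    def __init__(self, grid_size, stride):
        self.scaled_anchors = [
            (ax * stride, ay * stride) for ax, ay in make_anchor_points(grid_size)
        ]

    def forward(self, offsets):
        return [
            (sx + dx, sy + dy)
            for (sx, sy), (dx, dy) in zip(self.scaled_anchors, offsets)
        ]


offsets = [(1.0, -1.0)] * 4
runtime_out = HeadWithRuntimeConstants(grid_size=2, stride=8).forward(offsets)
folded_out = HeadWithFoldedConstants(grid_size=2, stride=8).forward(offsets)
print(folded_out)
```

Both heads return the same values, but the folded version leaves no constant-only "mul" in the forward pass for a tracer to pick up.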

Idan

@Idan-BenAmi
Collaborator

Idan-BenAmi commented Oct 7, 2024

While avoiding operators like "to" seems to be the right fix for this model, we still need to decide how MCT should handle such cases in general. During torch FX tracing, node names should be checked to ensure they do not collide with reserved names. A suggestion for handling such cases can be found in #1204.
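One possible shape for such a check, as a sketch only: collect the attribute names already defined on the module base class and rename any node whose name collides. This is not the approach taken in #1204 (that proposal is not reproduced here). `reserved_names`, `safe_node_name`, and the stand-in `Module` class are all hypothetical names for illustration.

```python
def reserved_names(base_class):
    """Every attribute name already defined on the base class."""
    return set(dir(base_class))


def safe_node_name(name, reserved, taken):
    """Return `name`, or a suffixed variant that clashes with neither set."""
    candidate = name
    counter = 0
    while candidate in reserved or candidate in taken:
        counter += 1
        candidate = f"{name}_{counter}"
    taken.add(candidate)
    return candidate


class Module:
    """Hypothetical stand-in for torch.nn.Module."""

    def to(self, device):
        return self

    def eval(self):
        return self


reserved = reserved_names(Module)
taken = set()
renamed = [safe_node_name(n, reserved, taken) for n in ["conv1", "to", "mul", "to"]]
print(renamed)  # ['conv1', 'to_1', 'mul', 'to_2']
```

With a real torch.nn.Module, `dir()` would return a much larger set (for example `to`, `eval`, `train`, `cuda`), so more node names would be rewritten.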


github-actions bot commented Dec 6, 2024

Stale issue message

@CYL0089

CYL0089 commented Jan 8, 2025

Hello, has anybody solved this problem, or does anyone have a temporary workaround to avoid this error?

@CYL0089

CYL0089 commented Jan 9, 2025

Hello, has anybody solved this problem, or does anyone have a temporary workaround to avoid this error?

I found that changing set_model() in model_compression_toolkit/core/pytorch/utils.py as follows avoids this error:
try:
    model.to(device)
except TypeError:
    # Fall back when the model's own `to` attribute is not usable
    model = model.cuda()
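A variation on this workaround that keeps the requested device (rather than always falling back to CUDA) would be to call the method through the class, which skips any instance attribute with the same name. The sketch below uses a hypothetical stand-in `Module` class and is not tested against MCT. In PyTorch the analogous call would presumably be `torch.nn.Module.to(model, device)`, but whether that is enough for this model is an assumption.

```python
class Module:
    """Hypothetical stand-in for torch.nn.Module."""

    def __init__(self):
        self.device = None

    def to(self, device):
        self.device = device
        return self


def move_model(model, device):
    """Call the class-level method so an instance attribute named `to` is bypassed."""
    return Module.to(model, device)


model = Module()
# Simulate the problem: an unrelated object stored on the instance as `to`
model.to = str.upper

moved = move_model(model, "cuda:0")
print(moved.device)  # cuda:0
```

This only sidesteps the failing call. The colliding attribute is still on the model, so the underlying naming problem remains.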


3 participants