
On fine-tuning InternVL-1.5 (AttributeError: 'NoneType' object has no attribute 'shape') #925

Closed
MVP-D77 opened this issue May 13, 2024 · 5 comments

MVP-D77 commented May 13, 2024

@hjh0119 After fine-tuning version 1.5, I followed the inference command from the tutorial and loaded the local weights. The infer command I used is:

```shell
CUDA_VISIBLE_DEVICES=0,1 swift infer \
    --ckpt_dir output/internvl-chat-v1_5/v0-20240512-191616/checkpoint-25 \
    --load_dataset_config true \
    --dtype bf16 \
    --model_id_or_path xxxxx/InternVL/pretrained/InternVL-Chat-V1-5
```

But after loading, the following error occurred. I wonder whether this came up in your earlier testing:

```
Traceback (most recent call last):
  File "/data2/renyw/PythonWorkspace/FM-LLM/swift/swift/cli/infer.py", line 5, in <module>
    infer_main()
  File "/data2/renyw/PythonWorkspace/FM-LLM/swift/swift/utils/run_utils.py", line 27, in x_main
    result = llm_x(args, **kwargs)
  File "/data2/renyw/PythonWorkspace/FM-LLM/swift/swift/llm/infer.py", line 376, in llm_infer
    if args.show_dataset_sample >= 0 and val_dataset.shape[0] > args.show_dataset_sample:
AttributeError: 'NoneType' object has no attribute 'shape'
```
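The traceback shows `val_dataset` arriving as `None` before `.shape` is read. A hypothetical sketch of the kind of guard that would avoid this crash (this is not the actual patch in swift; `limit_val_dataset` and its parameters are illustrative names mirroring the traceback):

```python
# Hypothetical guard: treat a missing validation dataset as "nothing to
# show" instead of calling .shape on None.
import numpy as np

def limit_val_dataset(val_dataset, show_dataset_sample):
    if val_dataset is None:
        return None  # no val split was reconstructed from the checkpoint
    if show_dataset_sample >= 0 and val_dataset.shape[0] > show_dataset_sample:
        return val_dataset[:show_dataset_sample]  # truncate for display
    return val_dataset

print(limit_val_dataset(None, 10))  # -> None, no AttributeError
print(limit_val_dataset(np.zeros((20, 2)), 5).shape[0])  # -> 5
```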
MVP-D77 commented May 13, 2024

@hjh0119 Besides that, I have a question. One of the important changes in InternVL 1.5 relative to 1.2 is resolution handling: a high-resolution image can be split into multiple smaller tiles that are fed into the batch together. During fine-tuning, is the image simply resized to a fixed size, or is it dynamically split into multiple tiles? If the latter, can the number of tiles be controlled? Thank you for your answer.

hjh0119 commented May 14, 2024

Does the first issue still occur with the latest code?
For the second question, see https://github.com/modelscope/swift/blob/main/swift/llm/utils/vision_utils.py#L74C1-L90C24
We use the official image-processing logic, which computes the number of ViT patches from the image size.

MVP-D77 commented May 15, 2024

@hjh0119 With the latest code, running inference directly on the fine-tuned model still raises the same error. Have you tested inference after fine-tuning? Is there a known working case? I'd appreciate any pointers, thanks.
As for the second question: the number of ViT patches is computed from the image size, capped by `max_num`. Can this cap be controlled during fine-tuning? Is there a reserved parameter for it? I want to do a visual grounding task and may not need the image split into multiple patches. Is the current default during fine-tuning to always split large images into smaller tiles when computing the ViT patch count?

```python
# `build_transform` and `dynamic_preprocess` are defined alongside this
# function in swift/llm/utils/vision_utils.py
import requests
import torch
from io import BytesIO
from PIL import Image

def load_image(img_path, input_size=448, max_num=6):
    # Accept either a path/URL string or an already-opened PIL image.
    if isinstance(img_path, str):
        img_path = img_path.strip()
        if img_path.startswith('http'):
            content = requests.get(img_path).content
            image = Image.open(BytesIO(content))
        else:
            image = Image.open(img_path)
    else:
        image = img_path
    if image.mode != 'RGB':
        image = image.convert('RGB')
    transform = build_transform(input_size=input_size)
    # Split the image into up to `max_num` tiles based on its size.
    images = dynamic_preprocess(image, image_size=input_size, use_thumbnail=True, max_num=max_num)
    pixel_values = [transform(image) for image in images]
    pixel_values = torch.stack(pixel_values)
    return pixel_values
```
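For reference, the tile-count selection inside `dynamic_preprocess` can be sketched roughly as follows. This is a simplified, hypothetical reimplementation (`num_tiles` is not a real swift/InternVL function): it enumerates candidate grids whose tile count stays within `[min_num, max_num]` and picks the one whose aspect ratio is closest to the image's, so with `max_num=1` no splitting can happen:

```python
# Simplified sketch of InternVL-style dynamic tiling: choose a (cols, rows)
# grid matching the image's aspect ratio, bounded by max_num tiles.
def num_tiles(width, height, min_num=1, max_num=6):
    aspect = width / height
    # Candidate grids (cols, rows) whose tile count fits the allowed range.
    grids = sorted({(c, r)
                    for n in range(min_num, max_num + 1)
                    for c in range(1, n + 1)
                    for r in range(1, n + 1)
                    if min_num <= c * r <= max_num})
    # Closest aspect ratio wins; ties prefer fewer tiles.
    best = min(grids, key=lambda g: (abs(aspect - g[0] / g[1]), g[0] * g[1]))
    return best[0] * best[1]

print(num_tiles(1792, 448, max_num=6))  # wide image -> 4 tiles
print(num_tiles(800, 600, max_num=1))   # max_num=1 -> never split, 1 tile
```

Note that the real `dynamic_preprocess` additionally appends a thumbnail tile when `use_thumbnail=True` and more than one tile is produced, so the model always receives a global view of the image as well.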

hjh0119 commented May 15, 2024

Found the bug behind the first issue; a fix is in progress.
For the second question, exposing a separate parameter might be a bit bloated; for now we stay aligned with the official implementation.

hjh0119 commented May 15, 2024

fixed #937

@hjh0119 hjh0119 closed this as completed May 15, 2024