[BUG] Valid split will trigger AttributeError: 'NoneType' object has no attribute 'map' #633

Open
2 tasks done
jiminHuang opened this issue May 9, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@jiminHuang

Prerequisites

  • I have read the documentation.
  • I have checked other issues for similar problems.

Backend

Local

Interface Used

CLI

CLI Command

{'model': 'meta-llama/Meta-Llama-3-8B-Instruct', 'project_name': 'Meta-Ll-UMLS-Co-2-0-0001', 'data_path': 'XXXX/UMLS_Concept_train', 'train_split': 'train', 'valid_split': 'valid', 'add_eos_token': False, 'block_size': 4096, 'model_max_length': 8192, 'padding': None, 'trainer': 'default', 'use_flash_attention_2': False, 'log': 'none', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'evaluation_strategy': 'epoch', 'save_total_limit': 2, 'auto_find_batch_size': False, 'mixed_precision': 'fp16', 'lr': 0.0001, 'epochs': 2, 'batch_size': 4, 'warmup_ratio': 0.1, 'gradient_accumulation': 4, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.01, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'tokenizer', 'quantization': None, 'target_modules': 'all-linear', 'merge_adapter': True, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': 'prompt', 'text_column': 'conversations', 'rejected_text_column': 'rejected', 'push_to_hub': True, 'username': 'XXXX', 'token': '*****'}

UI Screenshots & Parameters

No response

Error Logs

ERROR | 2024-05-08 22:21:35 | autotrain.trainers.common:wrapper:120 - train has failed due to an exception: Traceback (most recent call last):
  File "/blue/yonghui.wu/qx68/autotrain-conda/lib/python3.10/site-packages/autotrain/trainers/common.py", line 117, in wrapper
    return func(*args, **kwargs)
  File "/blue/yonghui.wu/qx68/autotrain-conda/lib/python3.10/site-packages/autotrain/trainers/clm/main.py", line 23, in train
    train_default(config)
  File "/blue/yonghui.wu/qx68/autotrain-conda/lib/python3.10/site-packages/autotrain/trainers/clm/train_clm_default.py", line 40, in train
    train_data, valid_data = utils.process_data_with_chat_template(config, tokenizer, train_data, valid_data)
  File "/blue/yonghui.wu/qx68/autotrain-conda/lib/python3.10/site-packages/autotrain/trainers/clm/utils.py", line 414, in process_data_with_chat_template
    valid_data = valid_data.map(
AttributeError: 'NoneType' object has no attribute 'map'

Additional Information

Why is there this assignment:

valid_data = None

in the CLM utils?

It looks like this will always trigger the AttributeError shown in the logs whenever valid_split is specified.
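
To make the suspected failure mode concrete, here is a minimal sketch that does not use AutoTrain or the datasets library at all. `TinyDataset`, `process_unguarded`, and `process_guarded` are hypothetical names invented for illustration; they only mimic the pattern the traceback points at, where `.map` is called on validation data that was left as None even though valid_split is set. The real code in clm/utils.py may be structured differently.

```python
from typing import Callable, List, Optional, Tuple


class TinyDataset:
    """Hypothetical stand-in for a dataset object that exposes .map()."""

    def __init__(self, rows: List[str]) -> None:
        self.rows = rows

    def map(self, fn: Callable[[str], str]) -> "TinyDataset":
        # Apply fn to every row and return a new dataset.
        return TinyDataset([fn(row) for row in self.rows])


def process_unguarded(
    train_data: TinyDataset,
    valid_data: Optional[TinyDataset],
    valid_split: Optional[str],
) -> Tuple[TinyDataset, Optional[TinyDataset]]:
    train_data = train_data.map(str.upper)
    if valid_split is not None:
        # Checks the split name, not the data itself: if valid_data was
        # never loaded (still None), this line raises AttributeError.
        valid_data = valid_data.map(str.upper)
    return train_data, valid_data


def process_guarded(
    train_data: TinyDataset,
    valid_data: Optional[TinyDataset],
) -> Tuple[TinyDataset, Optional[TinyDataset]]:
    train_data = train_data.map(str.upper)
    if valid_data is not None:
        # Checks the object that .map is actually called on.
        valid_data = valid_data.map(str.upper)
    return train_data, valid_data


train = TinyDataset(["a", "b"])

# Reproduce the error: the split name is set, but the data is None.
try:
    process_unguarded(train, None, valid_split="valid")
    error_message = ""
except AttributeError as exc:
    error_message = str(exc)

# The guarded variant skips validation processing instead of crashing.
guarded_train, guarded_valid = process_guarded(train, None)
```

Guarding on the data object rather than on the split name avoids the crash, but it would also silently skip validation, so it is a stopgap rather than real support for valid_split.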

@jiminHuang jiminHuang added the bug Something isn't working label May 9, 2024
@abhishekkrthakur
Member

We still need to allow valid_split for LLM tasks. Currently, validation is disabled for LLM fine-tuning because it's not representative.
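
Until validation is supported, one option would be to warn about an unsupported valid_split up front instead of failing later inside data processing. The sketch below is only an illustration of that idea: `drop_unsupported_valid_split` is a hypothetical helper, not an AutoTrain function, and it operates on a plain parameter dict like the one in the report.

```python
import warnings
from typing import Any, Dict


def drop_unsupported_valid_split(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of params with valid_split cleared, warning if it was set.

    Hypothetical helper: assumes the task cannot use a validation split yet.
    """
    cleaned = dict(params)  # shallow copy so the caller's dict is untouched
    if cleaned.get("valid_split") is not None:
        warnings.warn(
            f"valid_split={cleaned['valid_split']!r} is ignored: "
            "validation is not supported for this task yet.",
            UserWarning,
            stacklevel=2,
        )
        cleaned["valid_split"] = None
    return cleaned


params = {"train_split": "train", "valid_split": "valid"}

# Capture the warning so the example can inspect it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cleaned = drop_unsupported_valid_split(params)
```

A warning keeps the run going with training data only; raising a ValueError with the same message would be the stricter alternative if silently dropping the split is considered too surprising.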
