add tests for merging lora and validating the dtype #1512

winglian · 2024-04-10T17:12:04Z

No description provided.

NanoCode012 · 2024-04-12T08:43:38Z

tests/e2e/test_lora_llama.py

+        cfg.lora_model_dir = cfg.output_dir
+        cfg.load_in_4bit = False
+        cfg.load_in_8bit = False
+        cfg.flash_attention = False
+        cfg.deepspeed = None
+        cfg.fsdp = None


These can be dropped, since modify_cfg_for_merge should already have set them?

Suggested change

cfg.lora_model_dir = cfg.output_dir
cfg.load_in_4bit = False
cfg.load_in_8bit = False
cfg.flash_attention = False
cfg.deepspeed = None
cfg.fsdp = None

NanoCode012 · 2024-04-12T08:44:03Z

tests/e2e/test_lora_llama.py

+        cfg.fsdp = None
+
+        cfg = modify_cfg_for_merge(cfg)
+        cfg.merge_lora = True


Let's move this setting inside the modify_cfg function as well.
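A minimal sketch of what the two comments above are asking for, assuming a helper shaped like modify_cfg_for_merge. The function name apply_merge_overrides, the MERGE_OVERRIDES table, and the use of SimpleNamespace in place of DictDefault are all hypothetical stand-ins; the real helper's behavior is not shown in this thread.

```python
from types import SimpleNamespace

# Hypothetical override table: the settings the test currently sets by hand.
MERGE_OVERRIDES = {
    "load_in_4bit": False,
    "load_in_8bit": False,
    "flash_attention": False,
    "deepspeed": None,
    "fsdp": None,
}


def apply_merge_overrides(cfg):
    """Force merge-safe settings on cfg in one place and return it."""
    # Assumption: fall back to output_dir when no adapter directory is given.
    if not getattr(cfg, "lora_model_dir", None):
        cfg.lora_model_dir = cfg.output_dir
    for key, value in MERGE_OVERRIDES.items():
        setattr(cfg, key, value)
    # Set inside the helper so callers do not have to remember it.
    cfg.merge_lora = True
    return cfg


cfg = apply_merge_overrides(SimpleNamespace(output_dir="out", load_in_4bit=True))
```

With something like this, the test would only call the helper and then assert on the resulting values instead of setting them itself.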

NanoCode012 · 2024-04-12T08:46:13Z

src/axolotl/cli/merge_lora.py

@@ -27,21 +28,26 @@ def do_cli(config: Path = Path("examples/"), **kwargs):
        flash_attention=False,


If the section above already sets these properties, is it necessary to set them again below?

NanoCode012 · 2024-04-12T08:47:41Z

tests/e2e/test_lora_llama.py

+        # pylint: disable=duplicate-code
+        cfg = DictDefault(
+            {
+                "base_model": "JackFram/llama-68m",


Also, this issue can sometimes occur only for particular model types. For example, a previous llama merge was fine, but mistral was not. Do we need to test this for other architectures?
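One way to cover more than one architecture is to generate the test config per base model. The sketch below is illustrative only: merge_test_cases and BASE_MODELS_BY_ARCH are hypothetical names, the config keys are kept to a bare minimum, and the mistral model id is an assumption that should be replaced with whatever tiny checkpoint the suite already uses.

```python
# Base models to exercise, one entry per architecture.
# "JackFram/llama-68m" comes from the test in this PR; the mistral id is an
# assumption, not something confirmed in this thread.
BASE_MODELS_BY_ARCH = {
    "llama": "JackFram/llama-68m",
    "mistral": "openaccess-ai-collective/tiny-mistral",
}


def merge_test_cases(output_root):
    """Yield (arch, cfg_dict) pairs, one per architecture."""
    for arch, base_model in BASE_MODELS_BY_ARCH.items():
        yield arch, {
            "base_model": base_model,
            "adapter": "lora",
            # Separate output directory so runs do not overwrite each other.
            "output_dir": f"{output_root}/{arch}",
        }


for arch, case in merge_test_cases("/tmp/merge-tests"):
    print(arch, case["base_model"])
```

Each pair could then feed a parametrized test or a subTest loop that runs the same merge and dtype check.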

NanoCode012 · 2024-04-12T08:47:50Z

tests/e2e/test_lora_llama.py

+        cli_args = TrainerCliArgs()
+        dataset_meta = load_datasets(cfg=cfg, cli_args=cli_args)
+
+        train(cfg=cfg, cli_args=cli_args, dataset_meta=dataset_meta)


I don't think you need to train a model. Maybe a tiny adapter could be uploaded to HF and used for the merge?
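A rough sketch of the no-training route, assuming a small LoRA adapter has already been published. It loads the adapter with PEFT, merges it, and collects parameter dtypes for the check. The function names and any adapter repo id passed in are hypothetical; this is not the test from the PR.

```python
def unexpected_dtypes(named_dtypes, expected):
    """Return the sorted names whose dtype differs from the expected one."""
    return sorted(name for name, dtype in named_dtypes.items() if dtype != expected)


def merged_param_dtypes(base_model_id, adapter_id, torch_dtype):
    """Merge a pre-trained adapter into its base model and map names to dtypes.

    Imports are local so the pure helper above stays usable without
    transformers or peft installed.
    """
    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    base = AutoModelForCausalLM.from_pretrained(base_model_id, torch_dtype=torch_dtype)
    # merge_and_unload folds the LoRA weights into the base model.
    merged = PeftModel.from_pretrained(base, adapter_id).merge_and_unload()
    return {name: param.dtype for name, param in merged.named_parameters()}
```

A test could then assert that unexpected_dtypes(merged_param_dtypes(base, adapter, torch.float16), torch.float16) is empty, where adapter is whatever repo id the tiny adapter ends up under.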

winglian added 2 commits April 10, 2024 13:00

add tests for merging lora and validating the dtype

5767eea

fix the torch dtype check

4c92b51

winglian requested a review from NanoCode012 April 12, 2024 06:02

NanoCode012 reviewed Apr 12, 2024
