Hi GaLore Team, congratulations on the interesting work!
I am trying to fine-tune the Llama-3 8B model using GaLore but am getting this error: torch._C._LinAlgError: linalg.svd: The algorithm failed to converge because the input matrix is ill-conditioned or has too many repeated singular values.
Interestingly, the loss on the first batch is non-zero, but every subsequent loss is zero before training terminates on its own.
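A zero loss combined with `grad_norm` 0.0 from the very first logged step suggests the gradients are vanishing or turning non-finite, rather than the model suddenly fitting the data. A generic PyTorch check along these lines (`check_finite_grads` is a hypothetical helper, not part of GaLore or axolotl) can confirm this after `loss.backward()`:

```python
import torch

def check_finite_grads(model):
    # Hypothetical debugging helper: flag parameters whose gradients
    # are non-finite or exactly zero after loss.backward().
    for name, p in model.named_parameters():
        if p.grad is None:
            continue
        if not torch.isfinite(p.grad).all():
            print(f"non-finite grad in {name}")
        elif p.grad.abs().max() == 0:
            print(f"all-zero grad in {name}")
```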
Full Error Log
Activated GaLoRE fine-tuning, depending on your model size and hardware, the training might take a while before starting. Please be patient !
model.layers.0.self_attn has been matched but ignored as GaLore only supports linear layers. Please double check your `optim_target_modules`!
model.layers.0.self_attn.rotary_emb has been matched but ignored as GaLore only supports linear layers. Please double check your `optim_target_modules`!
model.layers.0.mlp has been matched but ignored as GaLore only supports linear layers. Please double check your `optim_target_modules`!
model.layers.0.mlp.act_fn has been matched but ignored as GaLore only supports linear layers. Please double check your `optim_target_modules`!
[... the same four warnings repeat for model.layers.1 through model.layers.31 ...]
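For context, these warnings appear to be expected noise rather than the bug itself: regex-style `optim_target_modules` patterns also match container modules (`self_attn`, `mlp`) and parameter-free modules (`rotary_emb`, `act_fn`), which GaLore skips because it only projects `nn.Linear` weights. Below is a minimal sketch of the equivalent setup through transformers' `TrainingArguments` (axolotl drives the same Trainer underneath; the regexes and values here are illustrative, not copied from my config):

```python
from transformers import TrainingArguments

# Illustrative GaLore setup via transformers. The regexes intentionally
# match whole attention/MLP blocks, so the "matched but ignored"
# warnings above are emitted for every non-linear submodule they touch.
args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,
    learning_rate=1e-3,
    optim="galore_adamw_8bit",
    optim_target_modules=[r".*attn.*", r".*mlp.*"],
)
```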
0%| | 0/6719320 [00:00<?, ?it/s]You're using a PreTrainedTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
0%| | 1/6719320 [06:15<701609:55:03, 375.90s/it][2024-07-23 07:19:54,094] [INFO] [axolotl.callbacks.on_step_end:128] [PID:148509] [RANK:0] GPU memory usage while training: 17.607GB (+15.215GB cache, +1.482GB misc)
{'loss': 1.694, 'grad_norm': 0.0, 'learning_rate': 0.001, 'epoch': 0.0}
{'loss': 0.0, 'grad_norm': 0.0, 'learning_rate': 0.001, 'epoch': 0.0}
[... identical zero-loss entries repeat at every logging step up to step 200 ...]
0%| | 200/6719320 [08:51<1455:09:36, 1.28it/s]/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/galore_torch/galore_projector.py:83: UserWarning: torch.linalg.svd: During SVD computation with the selected cusolver driver, batches 0 failed to converge. A more accurate method will be used to compute the SVD as a fallback. Check doc at https://pytorch.org/docs/stable/generated/torch.linalg.svd.html (Triggered internally at ../aten/src/ATen/native/cuda/linalg/BatchLinearAlgebraLib.cpp:697.)
U, s, Vh = torch.linalg.svd(matrix, full_matrices = False)
Traceback (most recent call last):
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/minimalist/work/projects/sota/axolotl/src/axolotl/cli/train.py", line 72, in <module>
fire.Fire(do_cli)
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/fire/core.py", line 143, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/fire/core.py", line 477, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/home/minimalist/work/projects/sota/axolotl/src/axolotl/cli/train.py", line 39, in do_cli
return do_train(parsed_cfg, parsed_cli_args)
File "/home/minimalist/work/projects/sota/axolotl/src/axolotl/cli/train.py", line 67, in do_train
return train(cfg=cfg, cli_args=cli_args, dataset_meta=dataset_meta)
File "/home/minimalist/work/projects/sota/axolotl/src/axolotl/train.py", line 191, in train
trainer.train(resume_from_checkpoint=resume_from_checkpoint)
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/transformers/trainer.py", line 1932, in train
return inner_training_loop(
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/transformers/trainer.py", line 2268, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/transformers/trainer.py", line 3324, in training_step
self.accelerator.backward(loss, **kwargs)
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/accelerate/accelerator.py", line 2151, in backward
loss.backward(**kwargs)
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/torch/_tensor.py", line 525, in backward
torch.autograd.backward(
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/torch/autograd/__init__.py", line 267, in backward
_engine_run_backward(
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/torch/autograd/graph.py", line 744, in _engine_run_backward
return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/transformers/trainer.py", line 1398, in optimizer_hook
optimizer_dict[param].step()
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/torch/optim/lr_scheduler.py", line 75, in wrapper
return wrapped(*args, **kwargs)
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/torch/optim/optimizer.py", line 391, in wrapper
out = func(*args, **kwargs)
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/galore_torch/adamw8bit.py", line 58, in step
grad = state["projector"].project(p.grad, state["step"])
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/galore_torch/galore_projector.py", line 21, in project
self.ortho_matrix = self.get_orthogonal_matrix(full_rank_grad, self.rank, type='left')
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/galore_torch/galore_projector.py", line 83, in get_orthogonal_matrix
U, s, Vh = torch.linalg.svd(matrix, full_matrices = False)
torch._C._LinAlgError: linalg.svd: The algorithm failed to converge because the input matrix is ill-conditioned or has too many repeated singular values (error code: 1023).
0%| | 200/6719320 [13:19<7465:36:33, 4.00s/it]
Traceback (most recent call last):
File "/home/minimalist/miniconda3/envs/comps/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 48, in main
args.func(args)
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1097, in launch_command
simple_launcher(args)
File "/home/minimalist/miniconda3/envs/comps/lib/python3.10/site-packages/accelerate/commands/launch.py", line 703, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/minimalist/miniconda3/envs/comps/bin/python', '-m', 'axolotl.cli.train', 'examples/llama-3/qlora.yml']' returned non-zero exit status 1.
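The crash itself comes from the SVD that GaLore recomputes when it refreshes the projection matrix (`galore_projector.py:83` above). One workaround I have seen suggested, sketched below under the assumption that low-precision or degenerate gradients trigger the cuSOLVER failure (`safe_svd` is a hypothetical wrapper, not an official GaLore API), is to upcast to float32 and retry on CPU:

```python
import torch

def safe_svd(matrix):
    # Hypothetical fallback around the torch.linalg.svd call in
    # galore_projector.py. Upcast to float32 first; bf16 inputs are a
    # plausible trigger for the cuSOLVER convergence failure seen above.
    m = matrix.float()
    try:
        return torch.linalg.svd(m, full_matrices=False)
    except RuntimeError:  # torch._C._LinAlgError subclasses RuntimeError
        # Retry on CPU, which uses a different (slower but more robust)
        # LAPACK driver, then move the factors back to the source device.
        U, s, Vh = torch.linalg.svd(m.cpu(), full_matrices=False)
        return U.to(matrix.device), s.to(matrix.device), Vh.to(matrix.device)
```

That said, the losses collapse to zero at step 2, long before the SVD failure at step 200, so the ill-conditioned gradient matrix may be a symptom of earlier divergence rather than the root cause; the learning rate of 0.001 with the 8-bit AdamW variant might be worth revisiting.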
Hyperparams