Great series.
Btw that’s a huge Lora parameter compared to the LoRA paper (4) but I suppose it’s good if deepspeed recommends...?