Implement rest of the test cases (LoRA tests) #2824
sayakpaul merged 8 commits into huggingface:main
Conversation
The documentation is not available anymore as the PR was closed or merged.
@patrickvonplaten Haven't pushed the updated code, but atm I'm thinking that we add another conditional statement to:

if name.startswith("mid_block"):
    hidden_size = model.config.block_out_channels[-1]
elif name.startswith("up_blocks"):
    block_id = int(name[len("up_blocks.")])
    hidden_size = list(reversed(model.config.block_out_channels))[block_id]
elif name.startswith("down_blocks"):
    block_id = int(name[len("down_blocks.")])
    hidden_size = model.config.block_out_channels[block_id]
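A minimal sketch of what that extra conditional could look like when folded into a hypothetical `lora_hidden_size()` helper; the `transformer_in` branch and the `8 *` factor are taken from the snippet Patrick posts further down in this thread (they mirror the projection width hard-coded in `unet_3d_condition.py`), everything else is illustration only:

```python
def lora_hidden_size(model, name: str) -> int:
    """Map an attention-processor name to the hidden size its LoRA layer needs.

    Hypothetical helper for illustration; the branch that actually landed is
    shown in the create_lora_layers() discussion below.
    """
    if name.startswith("mid_block"):
        return model.config.block_out_channels[-1]
    if name.startswith("up_blocks"):
        block_id = int(name[len("up_blocks.")])
        return list(reversed(model.config.block_out_channels))[block_id]
    if name.startswith("down_blocks"):
        block_id = int(name[len("down_blocks.")])
        return model.config.block_out_channels[block_id]
    if name.startswith("transformer_in"):
        # The 3D UNet's input transformer projects to 8 * attention_head_dim
        # (hard-coded in unet_3d_condition.py), so the LoRA layer must match that width.
        return 8 * model.config.attention_head_dim
    raise ValueError(f"Unexpected attention processor name: {name}")
```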
patrickvonplaten left a comment
Looks very cool! Think your added tests are great! Went into the PR to help a bit. test_lora_processors now passes and I think you can use its logic to make the other tests as well - see: https://github.com/huggingface/diffusers/pull/2824/files#r1157208349
Let me know if you need any more help :-)
cc @sayakpaul could you also take a look here?
Thank you for the help! I really appreciate it. Would you be able to recommend some resources that I could look into to learn more about this structure and what the transformer_in is doing specifically? I'm amazed by how fast you were able to figure out the PR.
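For context, a minimal sketch of the kind of check `test_lora_processors` performs, assuming `model`, `inputs_dict`, and a pre-built `lora_attn_procs` dict (with mocked, non-zero up-weights as in the snippets below) come from the surrounding test code; the exact assertions, tolerances, and variable names here are assumptions, not the actual test:

```python
import torch

# Baseline forward pass without any LoRA processors attached.
with torch.no_grad():
    sample1 = model(**inputs_dict).sample

# Attach LoRA attention processors whose up-weights were bumped away from zero,
# so they visibly change the output (see the weight-mocking lines further down).
model.set_attn_processor(lora_attn_procs)

with torch.no_grad():
    sample2 = model(**inputs_dict).sample
    sample3 = model(**inputs_dict, cross_attention_kwargs={"scale": 0.5}).sample

# Mocked LoRA weights should alter the output, and a smaller LoRA scale
# should alter it by a different amount.
assert (sample1 - sample2).abs().max() > 1e-4
assert (sample2 - sample3).abs().max() > 1e-4
```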
lora_attn_procs = {}
for name in model.attn_processors.keys():
    cross_attention_dim = None if name.endswith("attn1.processor") else model.config.cross_attention_dim
    has_cross_attention = name.endswith("attn2.processor") and not (

with torch.no_grad():
    sample1 = model(**inputs_dict).sample

lora_attn_procs = {}
We can leverage the create_lora_layers() here and elsewhere, no?
Agree - @pie31415 could we maybe as a final todo factor out this code:
for name in model.attn_processors.keys():
    has_cross_attention = name.endswith("attn2.processor") and not (
        name.startswith("transformer_in") or "temp_attentions" in name.split(".")
    )
    cross_attention_dim = model.config.cross_attention_dim if has_cross_attention else None
    if name.startswith("mid_block"):
        hidden_size = model.config.block_out_channels[-1]
    elif name.startswith("up_blocks"):
        block_id = int(name[len("up_blocks.")])
        hidden_size = list(reversed(model.config.block_out_channels))[block_id]
    elif name.startswith("down_blocks"):
        block_id = int(name[len("down_blocks.")])
        hidden_size = model.config.block_out_channels[block_id]
    elif name.startswith("transformer_in"):
        # Note that the `8 * ...` comes from: https://github.com/huggingface/diffusers/blob/7139f0e874f10b2463caa8cbd585762a309d12d6/src/diffusers/models/unet_3d_condition.py#L148
        hidden_size = 8 * model.config.attention_head_dim
    lora_attn_procs[name] = LoRAAttnProcessor(hidden_size=hidden_size, cross_attention_dim=cross_attention_dim)
    with torch.no_grad():
        lora_attn_procs[name].to_q_lora.up.weight += 1
        lora_attn_procs[name].to_k_lora.up.weight += 1
        lora_attn_procs[name].to_v_lora.up.weight += 1
        lora_attn_procs[name].to_out_lora.up.weight += 1

into a create_lora_layers() as it's used three times further below? :-)
Think after this we're good for merge ❤️
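A rough sketch of what the factored-out helper could look like, assuming it lives next to the model tests and takes a `mock_weights` flag (inferred from the `create_lora_layers(model, mock_weights=False)` call quoted further down); the `LoRAAttnProcessor` import path is what diffusers exposed around this release, but treat the exact signature as an assumption:

```python
import torch
from diffusers.models.attention_processor import LoRAAttnProcessor


def create_lora_layers(model, mock_weights: bool = True):
    """Build a LoRAAttnProcessor for every attention processor in `model`."""
    lora_attn_procs = {}
    for name in model.attn_processors.keys():
        has_cross_attention = name.endswith("attn2.processor") and not (
            name.startswith("transformer_in") or "temp_attentions" in name.split(".")
        )
        cross_attention_dim = model.config.cross_attention_dim if has_cross_attention else None
        if name.startswith("mid_block"):
            hidden_size = model.config.block_out_channels[-1]
        elif name.startswith("up_blocks"):
            block_id = int(name[len("up_blocks.")])
            hidden_size = list(reversed(model.config.block_out_channels))[block_id]
        elif name.startswith("down_blocks"):
            block_id = int(name[len("down_blocks.")])
            hidden_size = model.config.block_out_channels[block_id]
        elif name.startswith("transformer_in"):
            # The `8 * ...` comes from the projection size in unet_3d_condition.py.
            hidden_size = 8 * model.config.attention_head_dim

        lora_attn_procs[name] = LoRAAttnProcessor(
            hidden_size=hidden_size, cross_attention_dim=cross_attention_dim
        ).to(model.device)

        if mock_weights:
            # Bump the up-projection weights away from their zero init so the
            # LoRA layers visibly change the model output in tests.
            with torch.no_grad():
                lora_attn_procs[name].to_q_lora.up.weight += 1
                lora_attn_procs[name].to_k_lora.up.weight += 1
                lora_attn_procs[name].to_v_lora.up.weight += 1
                lora_attn_procs[name].to_out_lora.up.weight += 1

    return lora_attn_procs
```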
sayakpaul left a comment
Thanks for taking on this one! Great work 🔥
To be honest, to understand what's going on I don't have too many tips besides reading the code:

and trying to understand how they are connected. Note that we use the

Very well done in the PR though - we're almost there!
@patrickvonplaten @sayakpaul PR should be good to merge!
lora_attn_procs[name] = LoRAAttnProcessor(hidden_size=hidden_size, cross_attention_dim=cross_attention_dim)
lora_attn_procs[name] = lora_attn_procs[name].to(model.device)

lora_attn_procs = create_lora_layers(model, mock_weights=False)
Why are we not mocking weights here?
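For reference, a small usage sketch contrasting the two modes of the hypothetical helper sketched above: with `mock_weights=True` the up-weights are non-zero so LoRA changes the output, while with `mock_weights=False` they keep their zero init, so the output should stay numerically unchanged. The `model` and `inputs_dict` names are assumed from the surrounding test code:

```python
import torch

# Assumes `model`, `inputs_dict`, and `create_lora_layers` as sketched earlier.
lora_attn_procs = create_lora_layers(model, mock_weights=False)
model.set_attn_processor(lora_attn_procs)

with torch.no_grad():
    # Zero-initialized LoRA up-weights are a no-op, so this should be close to
    # the output of the model without any LoRA processors attached.
    new_sample = model(**inputs_dict).sample
```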
sayakpaul left a comment
I truly appreciate your hard work! I think this was super important.
I just have a single doubt after which we should be good to merge 🚀
Great job @pie31415 !
* inital commit for lora test cases
* help a bit with lora for 3d
* fixed lora tests
* replaced redundant code

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
PR for issue #2789