[Core] refactor transformer_2d forward logic into meaningful conditions. #7489
@DN6 WDYT? (Refer to the OP.)
I'm cool with this PR once you and @DN6 are happy with this!
For this, yes, I agree it is a highly used block, and I think we should give more thought to avoiding breaking changes. Would love to hear your thoughts, @DN6.
Cool. Will work on the deprecation in a future PR then as we discuss the best design.
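For reference while the design is discussed, here is a minimal sketch of what a deprecation path could look like. It uses only the standard-library `warnings` module and toy classes; `ToyTransformer2DModel`, `ToyPatchedTransformer2DModel`, and the `patch_size` trigger are hypothetical stand-ins, not the actual diffusers classes or the policy that was agreed on.

```python
import warnings


class ToyPatchedTransformer2DModel:
    """Toy stand-in for a hypothetical dedicated patched-input class."""

    def __init__(self, patch_size):
        self.patch_size = patch_size


class ToyTransformer2DModel:
    """Toy stand-in for the general class; not the diffusers implementation."""

    def __init__(self, patch_size=None):
        if patch_size is not None:
            # Assumed policy: keep the old code path working, but warn that
            # patched inputs may move to a dedicated class later.
            warnings.warn(
                "Passing `patch_size` to ToyTransformer2DModel is deprecated; "
                "use ToyPatchedTransformer2DModel instead.",
                FutureWarning,
                stacklevel=2,
            )
        self.patch_size = patch_size


# Old-style usage still works, but a FutureWarning is recorded.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = ToyTransformer2DModel(patch_size=2)

print(caught[0].category.__name__)
```

The point of this shape is that existing checkpoints keep loading through the old class, and users get a warning window before anything is removed.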
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
This reverts commit 12178b1.
@DN6 I had to revert your suggestion because Gonna merge the PR after the CI is green.
…ions. (#7489)

* refactor transformer_2d forward logic into meaningful conditions.
* Empty-Commit
* fix: _operate_on_patched_inputs
* fix: _operate_on_patched_inputs
* check
* fix: patch output computation block.
* fix: _operate_on_patched_inputs.
* remove print.
* move operations to blocks.
* more readability neats.
* empty commit
* Apply suggestions from code review

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Revert "Apply suggestions from code review"

This reverts commit 12178b1.

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
What does this PR do?

Refactors the `forward()` method of `Transformer2DModel` for easier readability. More specifically, the PR restructures `forward()` so that its logic follows one unified structure.

About the possibility of spinning out separate transformer classes for patched and vectorized inputs, what's the consensus? Currently, we have DiT and PixArt-Alpha that use patched inputs. So, for those checkpoints (DiT and PixArt-Alpha), are we thinking of just updating the configs to use `PatchedTransformer2DModel`? (The same could be done for `VectorizedTransformer2DModel`.)

If that's the case, I think it could be quite a breaking change. Many folks use the DiT and PixArt models, especially in light of the many Open SoRA initiatives. Introducing a change like this could be problematic. Should we consider throwing a deprecation warning instead?

@yiyixuxu @DN6 please let me know your thoughts.
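To make the unified `forward()` structure mentioned above easier to picture, here is a minimal, framework-free sketch of the control flow: dispatch on the input type, run the shared blocks, then dispatch again for the output. Only `_operate_on_patched_inputs` is named in this PR's commits; the class `ToyTransformer2D`, the `is_input_*` flags, and the other helper names are assumptions made for illustration, and the bodies are placeholders rather than the real tensor operations.

```python
class ToyTransformer2D:
    """Toy model that records which stage ran; not the diffusers class."""

    def __init__(self, input_kind):
        if input_kind not in ("continuous", "vectorized", "patched"):
            raise ValueError(f"Unknown input kind: {input_kind!r}")
        # Assumed flags: exactly one of them is True.
        self.is_input_continuous = input_kind == "continuous"
        self.is_input_vectorized = input_kind == "vectorized"
        self.is_input_patches = input_kind == "patched"
        self.trace = []

    # Input helpers (placeholders: they only log and pass data through).
    def _operate_on_continuous_inputs(self, x):
        self.trace.append("input:continuous")
        return x

    def _operate_on_vectorized_inputs(self, x):
        self.trace.append("input:vectorized")
        return x

    def _operate_on_patched_inputs(self, x):
        self.trace.append("input:patched")
        return x

    def _run_blocks(self, hidden):
        self.trace.append("blocks")
        return hidden

    def _get_output(self, kind, hidden):
        self.trace.append(f"output:{kind}")
        return hidden

    def forward(self, x):
        # 1. Input: one branch per input type.
        if self.is_input_continuous:
            hidden = self._operate_on_continuous_inputs(x)
        elif self.is_input_vectorized:
            hidden = self._operate_on_vectorized_inputs(x)
        else:
            hidden = self._operate_on_patched_inputs(x)

        # 2. Blocks: shared by every input type.
        hidden = self._run_blocks(hidden)

        # 3. Output: one branch per input type again.
        if self.is_input_continuous:
            return self._get_output("continuous", hidden)
        if self.is_input_vectorized:
            return self._get_output("vectorized", hidden)
        return self._get_output("patched", hidden)


model = ToyTransformer2D("patched")
model.forward([1, 2, 3])
print(model.trace)
```

Keeping each branch behind a small helper is what would make a later split into dedicated classes (or a deprecation path) a mostly mechanical change.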