[Core] refactor transformers 2d into multiple init variants.#7491
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
DN6
left a comment
I like the init methods approach. Left a couple of comments.
@DN6 @yiyixuxu the failing tests (8) are a little shaky in nature. These failures happen ONLY when the input type is continuous. They're failing because the parameters of the … All the tests pass after this change. You might ask why this does not happen for the other input types (patched and vectorized)? Because we only configure … I see two potential solutions:
I think we shouldn't do pt. 2 and should settle with pt. 1. The tests are mild in nature and don't seem to be testing anything critical. LMK.
I left a comment here: #7491 (comment)
This hits the point home. So, I will follow that. Thanks!
@sayakpaul @DN6 |
Move out the classes to their own modules? I think it might be better done after #7489. |
oh sure, |
Yeah I'm cool with spinning it out into dedicated classes. If possible, could they have model-specific names, e.g. PixartTransformer2D? We can handle it in a separate PR if you prefer.

@sayakpaul Would we need to merge this PR if it's just going to be changed soonish? Can we just take the work done here and in #7489 and open a new PR?

@yiyixuxu is right that we are essentially creating three different inits and forwards within the same object, so the signal here is that we should break it up into three different models.
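For illustration, the dedicated-class split proposed above could look roughly like this. This is a minimal, torch-free sketch with hypothetical names (the real diffusers classes would subclass `nn.Module` and `ModelMixin`; plain strings stand in for modules here):

```python
# Hypothetical sketch of splitting Transformer2DModel into per-variant classes.
# Class names are illustrative, not the final diffusers API.

class Transformer2DBase:
    def __init__(self, in_channels, out_channels):
        # attributes shared across all variants
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.transformer_blocks = []  # common transformer blocks


class ContinuousTransformer2DModel(Transformer2DBase):
    def __init__(self, in_channels, out_channels):
        super().__init__(in_channels, out_channels)
        self.proj_in = "conv_in"  # continuous-specific input block (stand-in)


class PatchedTransformer2DModel(Transformer2DBase):
    def __init__(self, in_channels, out_channels):
        super().__init__(in_channels, out_channels)
        self.pos_embed = "patch_embed"  # patch-specific embedding (stand-in)
```

Each variant then owns a single, linear `__init__` and `forward`, instead of branching on an `input_type` flag.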
I would still prefer to handle the spinning out portion in #7489 and merge this PR.
I think we can gradually tackle these phasing-out cases, more like a coarse-to-fine kind of approach. Regarding #7491 (comment), I think it's a matter of preference, so I would like to elaborate a bit on why I am leaning toward the duplicating route for this particular case. The current PR allows the following:

```python
class Transformer2DModel:
    def __init__(self, ...):
        # init common attributes that are shared across the board
        self.in_channels = ...
        self.out_channels = ...

        # handle initialization of specific attributes, modules, etc.
        if input_type == "continuous":
            self._init_continuous_inputs()
        elif input_type == "patched":
            self._init_patched_inputs()
        elif input_type == "vectorized":
            self._init_vectorized_inputs()

    def _init_continuous_inputs(self):
        # initialize attributes specific to continuous inputs.
        ...
        # initialize blocks specific to continuous inputs.
        # input block.
        ...
        # transformer blocks.
        self.transformer_blocks = ...  # <- duplicate code for the other init methods too.
        # output blocks.
        ...

    # rest of the init methods follow the above structure too.
```

Now, if we move out the transformer block initialization, it would look like:

```python
class Transformer2DModel:
    def __init__(self, ...):
        # init common attributes that are shared across the board
        self.in_channels = ...
        self.out_channels = ...

        # handle initialization of specific attributes, modules, etc.
        if input_type == "continuous":
            self._init_continuous_inputs()
        elif input_type == "patched":
            self._init_patched_inputs()
        elif input_type == "vectorized":
            self._init_vectorized_inputs()

        # transformer blocks.
        self.transformer_blocks = ...

    def _init_continuous_inputs(self):
        # initialize attributes specific to continuous inputs.
        ...
        # initialize blocks specific to continuous inputs.
        # input block.
        ...
        # output blocks.
        ...

    # rest of the init methods follow the above structure too.
```

I think it breaks the linearity of the reading flow, plus it breaks the tests (which are minor, I agree). Hope this provides more reasoning.
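To make the trade-off concrete, here is a runnable, torch-free toy version of the dispatch-on-`input_type` pattern sketched above. Attribute values are plain strings standing in for real modules, and the block names are illustrative, not the actual diffusers internals:

```python
# Toy sketch of the init-dispatch pattern; strings stand in for nn.Module blocks.

class Transformer2DModel:
    def __init__(self, input_type, in_channels, out_channels):
        # common attributes shared across the board
        self.in_channels = in_channels
        self.out_channels = out_channels
        # dispatch variant-specific initialization
        if input_type == "continuous":
            self._init_continuous_inputs()
        elif input_type == "patched":
            self._init_patched_inputs()
        elif input_type == "vectorized":
            self._init_vectorized_inputs()
        else:
            raise ValueError(f"unknown input_type: {input_type!r}")

    def _init_continuous_inputs(self):
        self.proj_in = "conv_in"              # continuous-specific input block
        self.transformer_blocks = ["block"]   # duplicated across the init methods
        self.proj_out = "conv_out"

    def _init_patched_inputs(self):
        self.pos_embed = "patch_embed"        # patch-specific embedding
        self.transformer_blocks = ["block"]
        self.proj_out = "linear_out"

    def _init_vectorized_inputs(self):
        self.latent_image_embedding = "vq_embed"
        self.transformer_blocks = ["block"]
        self.out = "logits_head"


model = Transformer2DModel("patched", in_channels=4, out_channels=8)
```

Each `_init_*` method stays self-contained and linear to read, at the cost of repeating the `transformer_blocks` setup in all three branches.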
…face#7491)

* refactor transformers 2d into multiple legacy variants.
* fix: init.
* fix recursive init.
* add inits.
* make transformer block creation more modular.
* complete refactor.
* remove forward
* debug
* remove legacy blocks and refactor within the module itself.
* remove print
* guard caption projection
* remove fetcher.
* reduce the number of args.
* fix: norm_type
* group variables that are shared.
* remove _get_transformer_blocks
* harmonize the init function signatures.
* transformer_blocks to common
* repeat .
What does this PR do?
For a new Transformer variant, we should do "Transformer2DModelForXXX" going forward.
I would love an initial review of the design.
Adjacent to #7489. I believe that, together, that PR and the current one will make the class read more modular.