Run torch.compile tests in separate subprocesses #3503

Merged: pcuenca merged 12 commits into main from run-compile-subprocess on May 23, 2023
@pcuenca (Member) commented May 22, 2023

`torch.compile()` spawns several subprocesses, and the GPU memory they used was not reclaimed after the test ran.

This solution was taken from transformers: https://github.com/huggingface/transformers/blob/3658488ff77ff8d45101293e749263acf437f4d5/src/transformers/testing_utils.py#L1787

  • Run all compile tests in subprocesses.
  • Reduce memory consumption of ControlNet tests (~18 GB in my tests).
  • Fix tolerance/determinism issues (on a V100, ideally).
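
A minimal sketch of the general idea, for readers who have not seen the pattern. It is not the `transformers` helper linked above: it uses the standard library's `subprocess` module to launch a fresh interpreter, and the names `run_in_subprocess`, `TEST_BODY`, and `check_isolated` are hypothetical. The point it illustrates is that whatever the child allocates is released by the operating system when the child exits, so nothing lingers in the parent test process.

```python
import subprocess
import sys
import textwrap


def run_in_subprocess(code: str, timeout: float = 600.0) -> subprocess.CompletedProcess:
    """Run `code` in a fresh Python interpreter and wait for it to exit.

    If the child does not finish within `timeout` seconds, subprocess.run
    kills it and raises subprocess.TimeoutExpired.
    """
    return subprocess.run(
        [sys.executable, "-c", textwrap.dedent(code)],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


# Hypothetical stand-in for a test body. A real compile test would build a
# pipeline, call torch.compile() on it, and compare outputs here.
TEST_BODY = """
    import os
    result = sum(range(10))
    assert result == 45, result
    print(os.getpid())
"""


def check_isolated() -> int:
    """Run the stand-in test in a child process and return the child's PID."""
    proc = run_in_subprocess(TEST_BODY, timeout=60)
    if proc.returncode != 0:
        # Surface the child's traceback in the parent's failure message.
        raise AssertionError(proc.stderr)
    return int(proc.stdout.strip())
```

A failing assertion in the child shows up as a non-zero return code with the traceback in `stderr`, which the parent can turn into an ordinary test failure.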

@pcuenca marked this pull request as draft May 22, 2023 09:50
@HuggingFaceDocBuilderDev commented May 22, 2023

The documentation is not available anymore as the PR was closed or merged.

@pcuenca changed the title from "Run ControlNet compile test in a separate subprocess" to "Run torch.compile tests in separate subprocesses" May 22, 2023
@pcuenca marked this pull request as ready for review May 22, 2023 14:49
@patrickvonplaten (Contributor) left a comment

Cool, let's give this a try!

@pcuenca merged commit bde2cb5 into main May 23, 2023
@pcuenca deleted the run-compile-subprocess branch May 23, 2023 17:24
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
* Run ControlNet compile test in a separate subprocess

`torch.compile()` spawns several subprocesses and the GPU memory used
was not reclaimed after the test ran. This approach was taken from
`transformers`.

* Style

* Prepare a couple more compile tests to run in subprocess.

* Use require_torch_2 decorator.

* Test inpaint_compile in subprocess.

* Run img2img compile test in subprocess.

* Run stable diffusion compile test in subprocess.

* style

* Temporarily trigger on pr to test.

* Revert "Temporarily trigger on pr to test."

This reverts commit 82d7686.
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024