Skip to content

feat(sources): generate .git_archival.txt for setuptools-scm builds#962

Draft
LalatenduMohanty wants to merge 1 commit intopython-wheel-build:mainfrom
LalatenduMohanty:961_git-archival-for-setuptools-scm
Draft

feat(sources): generate .git_archival.txt for setuptools-scm builds#962
LalatenduMohanty wants to merge 1 commit intopython-wheel-build:mainfrom
LalatenduMohanty:961_git-archival-for-setuptools-scm

Conversation

@LalatenduMohanty
Copy link
Copy Markdown
Member

@LalatenduMohanty LalatenduMohanty commented Mar 13, 2026

Packages using setuptools-scm fail when built from source archives without .git metadata. Add ensure_git_archival() to synthesize a .git_archival.txt with the resolved version, which setuptools-scm reads before PKG-INFO in its fallback chain.

Closes: #961

@LalatenduMohanty LalatenduMohanty requested a review from a team as a code owner March 13, 2026 11:06
@mergify mergify bot added the ci label Mar 13, 2026
@LalatenduMohanty LalatenduMohanty force-pushed the 961_git-archival-for-setuptools-scm branch 2 times, most recently from 779259f to 87d777b Compare March 13, 2026 12:37
@LalatenduMohanty LalatenduMohanty force-pushed the 961_git-archival-for-setuptools-scm branch 4 times, most recently from 545b0a5 to 8287e3c Compare March 19, 2026 12:11
@LalatenduMohanty LalatenduMohanty force-pushed the 961_git-archival-for-setuptools-scm branch 2 times, most recently from d7d9adf to 811b9e9 Compare March 24, 2026 02:22
@LalatenduMohanty LalatenduMohanty force-pushed the 961_git-archival-for-setuptools-scm branch from 811b9e9 to 23827f7 Compare March 25, 2026 12:30
@LalatenduMohanty
Copy link
Copy Markdown
Member Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 27, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 27, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 06a6ddd6-e475-4928-824e-8794c3a62bb5

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds ensure_git_archival(version, target_dir) to src/fromager/sources.py and invokes it from default_build_sdist() when creating an sdist and no .git metadata is present (checked in build_dir and, when different, sdist_root_dir). The helper inspects target_dir/.git_archival.txt: if the file is absent it returns None and does not create one; if present but contains unprocessed $Format: markers or is missing/has empty required fields, it overwrites the file with a synthesized archival payload using a zeroed 40-hex node and describe-name derived from the provided version and returns False; if the file is valid it leaves it unchanged and returns True. Tests in tests/test_sources.py cover absence, replacement of templates, preservation of valid files, and repair of truncated/empty-field files.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: adding git_archival.txt generation for setuptools-scm builds.
Description check ✅ Passed The description is directly related to the changeset, explaining the problem and solution with specific context about setuptools-scm and closing issue #961.
Linked Issues check ✅ Passed The implementation fully addresses issue #961: ensure_git_archival() generates .git_archival.txt with describe-name field, preserves valid files, replaces unprocessed templates, and integrates with sdist creation logic.
Out of Scope Changes check ✅ Passed All changes are scoped to implementing ensure_git_archival() and its tests; no unrelated modifications to other parts of the codebase.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/fromager/sources.py (1)

693-697: ⚠️ Potential issue | 🟠 Major

Check the source root for git metadata before synthesizing archival data.

This guard only looks at build_dir/.git. Packages with a nested build_dir keep git metadata at sdist_root_dir, so this path will still write a synthetic .git_archival.txt for a real checkout.

Proposed fix
-    if not build_dir.joinpath(".git").exists():
+    has_git_metadata = (
+        build_dir.joinpath(".git").exists()
+        or sdist_root_dir.joinpath(".git").exists()
+    )
+    if not has_git_metadata:
         ensure_git_archival(
             version=version,
             target_dir=build_dir,
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/fromager/sources.py` around lines 693 - 697, The current guard only
checks build_dir/.git before calling ensure_git_archival, which can synthesize
.git_archival.txt for projects where the real git metadata lives at
sdist_root_dir; update the condition to check for git metadata in the source
root as well (e.g., verify sdist_root_dir.joinpath(".git").exists() or similar)
and only call ensure_git_archival(version=version, target_dir=build_dir) when
neither build_dir nor sdist_root_dir contains a .git directory; reference the
build_dir and sdist_root_dir variables and the ensure_git_archival function to
locate the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/fromager/sources.py`:
- Around line 818-823: The code returns True for any existing .git_archival.txt
that lacks the unprocessed marker, but doesn't validate the archival payload, so
empty/truncated files can be treated as valid; update the check in the block
that reads archival_file to validate the content contains the required
setuptools-scm fields (e.g., "node" and/or "describe-name" entries or a
non-empty node value) before returning True: if those keys/values are missing,
log a warning and proceed as if the file is unprocessed (so replacement
happens). Refer to archival_file, _UNPROCESSED_ARCHIVAL_MARKER and target_dir
when locating and updating the logic.

---

Duplicate comments:
In `@src/fromager/sources.py`:
- Around line 693-697: The current guard only checks build_dir/.git before
calling ensure_git_archival, which can synthesize .git_archival.txt for projects
where the real git metadata lives at sdist_root_dir; update the condition to
check for git metadata in the source root as well (e.g., verify
sdist_root_dir.joinpath(".git").exists() or similar) and only call
ensure_git_archival(version=version, target_dir=build_dir) when neither
build_dir nor sdist_root_dir contains a .git directory; reference the build_dir
and sdist_root_dir variables and the ensure_git_archival function to locate the
change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4b2969c1-cb4c-4d05-a87a-8882f2a5bd6e

📥 Commits

Reviewing files that changed from the base of the PR and between 0a941fd and 23827f7.

📒 Files selected for processing (2)
  • src/fromager/sources.py
  • tests/test_sources.py

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/fromager/sources.py`:
- Around line 823-831: The current check builds a set "fields" from keys only,
so entries like "node:" or "describe-name:" with empty values still pass; change
the parsing to build a dict (e.g., parse "content" into something like parsed =
{key.strip(): value.strip() for line in content.splitlines() if ":" in line for
key, _, value in [line.partition(':')]}) and then replace the membership check
with a validation that _REQUIRED_ARCHIVAL_FIELDS is a subset of parsed.keys()
and that all parsed[field] are non-empty (truthy) for each required field; keep
the existing check for _UNPROCESSED_ARCHIVAL_MARKER in content and only treat
the file as archival when the marker is absent and all required fields have
non-empty values.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c784ffc9-48c4-48a0-af58-90fe2889bb07

📥 Commits

Reviewing files that changed from the base of the PR and between 23827f7 and efe230a.

📒 Files selected for processing (2)
  • src/fromager/sources.py
  • tests/test_sources.py
✅ Files skipped from review due to trivial changes (1)
  • tests/test_sources.py

@LalatenduMohanty LalatenduMohanty force-pushed the 961_git-archival-for-setuptools-scm branch 2 times, most recently from 96ffb9a to a300456 Compare March 30, 2026 17:41
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/test_sources.py (1)

320-331: Add explicit assertions for node-date in generated/repaired archival files.

Current tests won’t catch regressions where node-date is missing from synthesized output. Add assertions in creation/replacement paths to verify the full archival contract.

Suggested test additions
         content = archival.read_text()
         assert "describe-name: 1.2.3\n" in content
         assert "node: " in content
+        assert "node-date: " in content
         assert "$Format:" not in content
@@
         content = archival.read_text()
         assert "describe-name: 4.5.6\n" in content
+        assert "node-date: " in content
         assert "$Format:" not in content
@@
         content = archival.read_text()
         assert "describe-name: 3.0.0\n" in content
+        assert "node-date: " in content
@@
         content = archival.read_text()
         assert "describe-name: 5.0.0\n" in content
+        assert "node-date: " in content

As per coding guidelines, "tests/**: Verify test actually tests the intended behavior. Check for missing edge cases."

Also applies to: 344-347, 371-374, 382-384

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_sources.py` around lines 320 - 331, Update the tests that validate
generated/rewritten .git_archival.txt (e.g., test_creates_file_when_missing and
the similar tests around lines 344-347, 371-374, 382-384) to assert that the
"node-date: " line is present and correctly populated in the file content;
locate the archival file read (variable archival or calls to
sources.ensure_git_archival) and add an assertion like `assert "node-date: " in
content` (and where applicable assert the date format or non-empty value) so
missing node-date regressions are caught.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/fromager/sources.py`:
- Around line 793-799: Add "node-date" to the required archival fields set and
include a corresponding node-date line in the synthesized archival template;
update the symbols _REQUIRED_ARCHIVAL_FIELDS and _GIT_ARCHIVAL_CONTENT (and the
duplicate occurrence later in the file) so the archival content emits
"node-date: {node_date}" and the required-fields set contains "node-date" to
match the new .git_archival.txt contract.

---

Nitpick comments:
In `@tests/test_sources.py`:
- Around line 320-331: Update the tests that validate generated/rewritten
.git_archival.txt (e.g., test_creates_file_when_missing and the similar tests
around lines 344-347, 371-374, 382-384) to assert that the "node-date: " line is
present and correctly populated in the file content; locate the archival file
read (variable archival or calls to sources.ensure_git_archival) and add an
assertion like `assert "node-date: " in content` (and where applicable assert
the date format or non-empty value) so missing node-date regressions are caught.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0eff024c-b129-4cc8-9b79-1a7ab4c18088

📥 Commits

Reviewing files that changed from the base of the PR and between 96ffb9a and a300456.

📒 Files selected for processing (2)
  • src/fromager/sources.py
  • tests/test_sources.py

@LalatenduMohanty LalatenduMohanty force-pushed the 961_git-archival-for-setuptools-scm branch from a300456 to 173746e Compare March 30, 2026 20:06
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/fromager/sources.py`:
- Around line 798-828: ensure_git_archival currently returns None when
.git_archival.txt is missing; instead, create a synthetic archival file using
the provided version and return False to indicate a new file was written. In
ensure_git_archival(), when archival_file.is_file() is False, write a minimal
setuptools-scm archival payload to archival_file (for example include a
"describe-name: {version}" line using the Version value passed in) with
appropriate file permissions, flush/close it, and return False; keep the
existing behavior for cases where the file exists and only replace if
placeholders are present (preserve the existing logic in ensure_git_archival
that handles existing files).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e1a1de89-1ee1-471e-8169-4f4299268a93

📥 Commits

Reviewing files that changed from the base of the PR and between a300456 and 173746e.

📒 Files selected for processing (2)
  • src/fromager/sources.py
  • tests/test_sources.py
✅ Files skipped from review due to trivial changes (1)
  • tests/test_sources.py

@LalatenduMohanty LalatenduMohanty force-pushed the 961_git-archival-for-setuptools-scm branch from 173746e to 16a06e4 Compare March 30, 2026 21:05
@LalatenduMohanty
Copy link
Copy Markdown
Member Author

LalatenduMohanty commented Mar 30, 2026

@tiran Looks like my approach had a drawback and it was failing for the packages who needs a custom tag_regex, so we need to rethink about how we want to achieve this. I added a comment #961 (comment) for direction we should take. PTAL

@LalatenduMohanty LalatenduMohanty force-pushed the 961_git-archival-for-setuptools-scm branch from 16a06e4 to 2bdd4c1 Compare March 30, 2026 21:25
Ensure sdist builds without .git metadata have a valid .git_archival.txt
for setuptools-scm version resolution. Replaces existing invalid files
unconditionally. Creates new files only when generate_git_archival is
enabled in package settings, since bare versions break custom tag_regex.

Co-Authored-By: Claude <claude@anthropic.com>
Closes: python-wheel-build#961
Signed-off-by: Lalatendu Mohanty <lmohanty@redhat.com>
@LalatenduMohanty LalatenduMohanty force-pushed the 961_git-archival-for-setuptools-scm branch from 2bdd4c1 to 02e3d25 Compare March 30, 2026 21:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Generate .git_archival.txt for setuptools-scm packages built from source archives

3 participants