Add how-to guide for storage data migration (#2228) #2299

Open
carstenjacobsen wants to merge 5 commits into main from migrate-contract-storage

@carstenjacobsen
Contributor

PR for issue #2228

Copilot AI review requested due to automatic review settings March 6, 2026 12:54
Contributor

Copilot AI left a comment

Pull request overview

Adds a new how-to guide explaining how to safely migrate Soroban contract storage data when upgrading stored data structures, focusing on why naïve decoding traps and how to use explicit versioning patterns.

Changes:

  • Introduces a “Version Marker Pattern” guide for per-record versioning and safe reads/writes across upgrades.
  • Explains lazy vs eager migration tradeoffs for Soroban.
  • Provides example test approaches for validating upgrade/migration behavior, plus an alternative “Versioned Enum Pattern”.

Contributor

Copilot AI commented Mar 6, 2026

@carstenjacobsen I've opened a new pull request, #2300, to work on those changes. Once the pull request is ready, I'll request review from you.

@stellar-jenkins

Member

@leighmcculloch leighmcculloch left a comment

Content is great, but I think we should reorder the document so it leads with the preferred approach, then lists the alternative approaches, and finally lists the approaches that look like they will work but won't, which are currently the first thing a reader sees.

Comment on lines +9 to +51
## Why intuitive approaches fail

Suppose a contract stores `DataV1` entries and is upgraded to use `DataV2`, which adds an optional field `c`:

```rust
#[contracttype]
pub struct DataV1 { a: i64, b: i64 }

#[contracttype]
pub struct DataV2 { a: i64, b: i64, c: Option<i64> }
```

### Approach 1: Read old entries directly with the new type

The most natural approach is to read the stored bytes directly as `DataV2` and expect `c` to default to `None`:

```rust
// Reading a DataV1 entry with the DataV2 type.
// A developer might expect c = None for old entries — but this traps.
let data: DataV2 = env.storage().persistent().get(&key).unwrap();
// Error(Object, UnexpectedSize)
```

This traps with `Error(Object, UnexpectedSize)`. The Soroban host validates the field count of the XDR-encoded value against the type definition before returning anything to the contract. Because `DataV1` has two fields and `DataV2` has three, the host rejects the entry before the SDK can handle it.

### Approach 2: Use `try_from_val` as a fallback

Another approach is to use `try_from_val` expecting to catch a deserialization error and recover:

```rust
let raw: Val = env.storage().persistent().get(&key).unwrap();
if let Ok(v2) = DataV2::try_from_val(&env, &raw) {
    v2
} else {
    // This branch is never reached: the host traps before returning Err.
    let v1 = DataV1::try_from_val(&env, &raw).unwrap();
    DataV2 { a: v1.a, b: v1.b, c: None }
}
```
```

This also traps at the host level. The field count validation happens in the host environment during deserialization — it does not produce a Rust `Err` that the SDK can intercept. There is no way to catch or recover from the mismatch at the contract level.

The root issue is that a contract cannot determine which type an existing storage entry was written as just by reading it. That information must be stored explicitly.
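
One way to picture "stored explicitly" is a wrapper enum whose variant tag records which layout was written. The sketch below is a plain-Rust model of that idea, not Soroban SDK code: `StoredData` and `upgrade` are hypothetical names, and in a real contract the types would presumably carry `#[contracttype]` and be read from contract storage.

```rust
// Plain-Rust model only; no Soroban SDK types are used here.

#[derive(Clone, Debug, PartialEq)]
pub struct DataV1 {
    pub a: i64,
    pub b: i64,
}

#[derive(Clone, Debug, PartialEq)]
pub struct DataV2 {
    pub a: i64,
    pub b: i64,
    pub c: Option<i64>,
}

/// Hypothetical wrapper: the variant tag says which layout the entry uses,
/// so the reader never has to guess.
#[derive(Clone, Debug, PartialEq)]
pub enum StoredData {
    V1(DataV1),
    V2(DataV2),
}

/// Convert whatever layout was stored into the latest one.
/// Old entries gain `c = None`; new entries pass through unchanged.
pub fn upgrade(stored: StoredData) -> DataV2 {
    match stored {
        StoredData::V1(v1) => DataV2 { a: v1.a, b: v1.b, c: None },
        StoredData::V2(v2) => v2,
    }
}
```

Because the match is exhaustive, adding a `V3` variant later forces every reader to handle it at compile time, which is the main appeal of tagging the version in the stored value itself.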
Member

On a first read, I missed that the first two approaches are what not to do and I got confused. So I think it might be helpful if this how-to guide leads with what someone should do, and then we could have some notes in those little boxes saying here's an approach that someone might think to use but don't do it, it won't work. I would put them further down in the document rather than leading with them.

- A `DataV2` entry written by the new contract
- A `DataV1` entry that is read and then written back (the lazy migration round-trip)
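
The two cases listed above could be exercised roughly as follows. This is a minimal sketch that uses a `HashMap` as a stand-in for persistent storage; `Store`, `Stored`, and `read_lazy` are hypothetical names for illustration and are not part of the guide or the Soroban SDK.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
pub struct DataV1 {
    pub a: i64,
    pub b: i64,
}

#[derive(Clone, Debug, PartialEq)]
pub struct DataV2 {
    pub a: i64,
    pub b: i64,
    pub c: Option<i64>,
}

/// Stored value tagged with the layout it was written in.
#[derive(Clone, Debug, PartialEq)]
pub enum Stored {
    V1(DataV1),
    V2(DataV2),
}

/// In-memory stand-in for contract storage (assumption for this sketch).
pub type Store = HashMap<u32, Stored>;

/// Lazy migration: read an entry, convert it to the latest layout,
/// and write it back as V2 so later reads skip the conversion.
pub fn read_lazy(store: &mut Store, key: u32) -> Option<DataV2> {
    let latest = match store.get(&key)?.clone() {
        Stored::V1(v1) => DataV2 { a: v1.a, b: v1.b, c: None },
        Stored::V2(v2) => v2,
    };
    store.insert(key, Stored::V2(latest.clone()));
    Some(latest)
}
```

A test would seed the store with one `V1` and one `V2` entry, call `read_lazy` on each, and check both the returned value and that the `V1` entry is now stored as `V2`.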

## Versioned Enum Pattern
Member

I'd lead with this approach, and then list the other approaches as alternatives, and then finally have a section of approaches you think will work but won't.

@ElliotFriend
Contributor

@leighmcculloch i've tried to incorporate the feedback you left. how does everything look now?

6 participants