Add how-to guide for storage data migration (#2228) #2299
carstenjacobsen wants to merge 5 commits into main
Pull request overview
Adds a new how-to guide explaining how to safely migrate Soroban contract storage data when upgrading stored data structures, focusing on why naïve decoding traps and how to use explicit versioning patterns.
Changes:
- Introduces a “Version Marker Pattern” guide for per-record versioning and safe reads/writes across upgrades.
- Explains lazy vs eager migration tradeoffs for Soroban.
- Provides example test approaches for validating upgrade/migration behavior, plus an alternative “Versioned Enum Pattern”.
@carstenjacobsen I've opened a new pull request, #2300, to work on those changes. Once the pull request is ready, I'll request review from you.
Preview is available here:
leighmcculloch left a comment
Content is great, but I think we should reorder the document so it leads with the preferred approach, then lists alternative approaches, and finally lists the approaches that look like they will work but won't, which are currently the first thing a reader sees.
## Why intuitive approaches fail

Suppose a contract stores `DataV1` entries and is upgraded to use `DataV2`, which adds an optional field `c`:

```rust
#[contracttype]
pub struct DataV1 { a: i64, b: i64 }

#[contracttype]
pub struct DataV2 { a: i64, b: i64, c: Option<i64> }
```

### Approach 1: Read old entries directly with the new type

The most natural approach is to read the stored bytes directly as `DataV2` and expect `c` to default to `None`:

```rust
// Reading a DataV1 entry with the DataV2 type.
// A developer might expect c = None for old entries, but this traps.
let data: DataV2 = env.storage().persistent().get(&key).unwrap();
// Error(Object, UnexpectedSize)
```

This traps with `Error(Object, UnexpectedSize)`. The Soroban host validates the field count of the XDR-encoded value against the type definition before returning anything to the contract. Because `DataV1` has two fields and `DataV2` has three, the host rejects the entry before the SDK can handle it.

### Approach 2: Use `try_from_val` as a fallback

Another approach is to use `try_from_val`, expecting to catch a deserialization error and recover:

```rust
let raw: Val = env.storage().persistent().get(&key).unwrap();
let data = if let Ok(v2) = DataV2::try_from_val(&env, &raw) {
    v2
} else {
    // This branch is never reached: the host traps before returning Err.
    let v1 = DataV1::try_from_val(&env, &raw).unwrap();
    DataV2 { a: v1.a, b: v1.b, c: None }
};
```

This also traps at the host level. The field count validation happens in the host environment during deserialization; it does not produce a Rust `Err` that the SDK can intercept. There is no way to catch or recover from the mismatch at the contract level.

The root issue is that a contract cannot determine which type an existing storage entry was written as just by reading it. That information must be stored explicitly.
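
As a rough illustration of storing that information explicitly, here is a plain-Rust model of a per-record version marker. It does not use `soroban_sdk`: the `Store` type, its two `HashMap`s, the `decode_*` helpers, the flattened field encoding, and the rule that a missing marker means version 1 are all assumptions made for this sketch, not the guide's or the SDK's API.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
pub struct DataV1 {
    pub a: i64,
    pub b: i64,
}

#[derive(Clone, Debug, PartialEq)]
pub struct DataV2 {
    pub a: i64,
    pub b: i64,
    pub c: Option<i64>,
}

/// Hypothetical stand-in for persistent storage: one map holds the encoded
/// records, the other holds a version marker per record.
#[derive(Default)]
pub struct Store {
    records: HashMap<u32, Vec<i64>>,
    versions: HashMap<u32, u32>,
}

// The decoders panic on a field-count mismatch, loosely mimicking a trap.
fn decode_v1(raw: &[i64]) -> DataV1 {
    assert_eq!(raw.len(), 2, "unexpected size");
    DataV1 { a: raw[0], b: raw[1] }
}

fn decode_v2(raw: &[i64]) -> DataV2 {
    assert_eq!(raw.len(), 4, "unexpected size");
    let c = if raw[2] != 0 { Some(raw[3]) } else { None };
    DataV2 { a: raw[0], b: raw[1], c }
}

impl Store {
    /// Models the old contract: writes the two-field layout and no marker.
    pub fn write_v1(&mut self, key: u32, d: &DataV1) {
        self.records.insert(key, vec![d.a, d.b]);
    }

    /// Models the new contract: writes the new layout and marks it as 2.
    /// `c` is flattened to a presence flag followed by a value.
    pub fn write_v2(&mut self, key: u32, d: &DataV2) {
        let raw = vec![d.a, d.b, d.c.is_some() as i64, d.c.unwrap_or(0)];
        self.records.insert(key, raw);
        self.versions.insert(key, 2);
    }

    /// Assumption: a record without a marker was written by the old contract.
    pub fn version(&self, key: u32) -> u32 {
        self.versions.get(&key).copied().unwrap_or(1)
    }

    /// Picks the decoder from the marker, never from the record itself.
    pub fn read(&self, key: u32) -> Option<DataV2> {
        let raw = self.records.get(&key)?;
        match self.version(key) {
            1 => {
                let v1 = decode_v1(raw);
                Some(DataV2 { a: v1.a, b: v1.b, c: None })
            }
            2 => Some(decode_v2(raw)),
            v => panic!("unknown version {}", v),
        }
    }

    /// Lazy migration: upgrade a record the first time it is touched.
    pub fn read_and_migrate(&mut self, key: u32) -> Option<DataV2> {
        let data = self.read(key)?;
        self.write_v2(key, &data);
        Some(data)
    }
}
```

In this model, reading a record written by `write_v1` yields a `DataV2` with `c: None`, and `read_and_migrate` rewrites it so later reads take the version 2 path. The point of the design is that the decoder with the wrong field count is never called, so the mismatch never arises.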
On a first read, I missed that the first two approaches are what not to do and I got confused. So I think it might be helpful if this how-to guide leads with what someone should do, and then we could have some notes in those little boxes saying here's an approach that someone might think to use but don't do it, it won't work. I would put them further down in the document rather than leading with them.
- A `DataV2` entry written by the new contract
- A `DataV1` entry that is read and then written back (the lazy migration round-trip)

## Versioned Enum Pattern
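
A plain-Rust sketch of the idea this heading names: every stored value is wrapped in an enum whose variant identifies the layout, so a reader matches on the variant instead of guessing. The `StoredData` name and the `into_latest` helper are hypothetical, and the sketch leaves out `soroban_sdk` and `#[contracttype]` entirely. It also assumes records were written through the enum from the first version onward.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct DataV1 {
    pub a: i64,
    pub b: i64,
}

#[derive(Clone, Debug, PartialEq)]
pub struct DataV2 {
    pub a: i64,
    pub b: i64,
    pub c: Option<i64>,
}

/// Hypothetical wrapper: the variant travels with the data, so the version
/// is known as soon as the value is decoded.
#[derive(Clone, Debug, PartialEq)]
pub enum StoredData {
    V1(DataV1),
    V2(DataV2),
}

impl StoredData {
    /// Converts any stored variant into the newest layout.
    /// Fields that did not exist in an older layout get an explicit default.
    pub fn into_latest(self) -> DataV2 {
        match self {
            StoredData::V1(v1) => DataV2 { a: v1.a, b: v1.b, c: None },
            StoredData::V2(v2) => v2,
        }
    }
}
```

A later layout would add a `V3` variant and one more match arm, and the compiler flags every `match` that has not been updated.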
I'd lead with this approach, and then list the other approaches as alternatives, and then finally have a section of approaches you think will work but won't.
@leighmcculloch i've tried to incorporate the feedback you left. how does everything look now?
PR for issue #2228