Skip to content

Handle monitor update failures in two more places + new fuzz test#290

Merged
TheBlueMatt merged 5 commits intolightningdevkit:masterfrom
TheBlueMatt:2019-01-monitor-update-handle-fuzz
Jan 25, 2019
Merged

Handle monitor update failures in two more places + new fuzz test#290
TheBlueMatt merged 5 commits intolightningdevkit:masterfrom
TheBlueMatt:2019-01-monitor-update-handle-fuzz

Conversation

@TheBlueMatt
Copy link
Copy Markdown
Collaborator

Based on #285, #286, and #288, this adds handling of monitor update failures in two more places, with some trivial tests to check sanity thereof. It also finally adds the fuzz test which found the issues in #286 and #288, though it really wants rust-bitcoin/rust-secp256k1#89 to fully test things. This leaves only three non-fee-handling unimplemented!()s in ChannelManager (and they're all during channel setup, so are much easier to handling).

@TheBlueMatt TheBlueMatt changed the title 2019 01 monitor update handle fuzz Handle monitor update failures in two more places + new fuzz test Jan 13, 2019
@TheBlueMatt TheBlueMatt force-pushed the 2019-01-monitor-update-handle-fuzz branch from 691bf82 to 72dd991 Compare January 15, 2019 21:43
true
})
}
pending_msg_events.retain(|msg| {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm wandering, if client send stale message before channel_reestablish ones, isn't this mean that a code path is wrong somewhere ? Or it's not something we can be sure of because of the library architecture ?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not strictly. In theory a user could get a disconnect from a peer, reconnect to that peer, and then process events, seeing the message send events for a given node_id (and not connection). I'm not sure if there aren't other related races, but at least this improves things.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. Yes sure better to be careful and it cost nothing there. Moreover if user doesn't take our peer_handler, it may break some of our assumptions.

Copy link
Copy Markdown

@ariard ariard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More learnt than reviewed, monitor failures interaction with channel state machine are really interesting.
As you said, there is only left pre-ChannelFunded unimplemented! cases (and fee). Had a look to be sure, monitor update failure doesn't interfere with closing phase, at least after there is no HLTCs left on commiment_transaction. We can add a little test with a failure after shutdown have been exchanged on both side, but should be alright.

commitment_signed: commitment_msg,
},
});
} else {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build without unreachable branch (1.22.0)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, it will build without it, but having it in there means if there is a refactor in the future that breaks things we'll see panics instead of losing funds :p.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah astute! Maybe add a comment there with your answer

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually gonna leave it. We use this pattern in a few places, and hopefully "unreachable!()" is rather self-documenting. In general, across the library, we'd prefer to panic than have a potentially money-losing bug, so this isn't particularly out of the ordinary.

//TODO: Do something with e?
return
},
} else { unreachable!(); }
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build without unreachable branch (1.22.0)

@TheBlueMatt TheBlueMatt added this to the 0.0.8 milestone Jan 21, 2019
@TheBlueMatt TheBlueMatt force-pushed the 2019-01-monitor-update-handle-fuzz branch 9 times, most recently from 4a2c20d to bfe9f1d Compare January 24, 2019 16:51
This shouldn't be required, but it may help prevent some downstream
race conditions due to clients not sending message events quickly
enough and trying to send stale messages before new
channel_reestablish messages.
This is an oversight as the MessageSendEvent is otherwise entirely
useless.
Sadly this requires reducing the honggfuzz iterations to fit within
Travis' runtime limits.
@TheBlueMatt TheBlueMatt force-pushed the 2019-01-monitor-update-handle-fuzz branch from bfe9f1d to 49d6330 Compare January 24, 2019 18:19
@TheBlueMatt TheBlueMatt merged commit 4ccb1e4 into lightningdevkit:master Jan 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants