Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement the tokio version of sans-io #2

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

joshka
Copy link

@joshka joshka commented Dec 31, 2024

The purpose of this PR is to avoid reimplementing the state machine that is inherently provided by async rust.

Follows from the discussion at

The purpose of this is to avoid reimplementing the state
machine that is inherently async rust.
This follows from the discussion at
<https://news.ycombinator.com/item?id=40872020#40879547> and then:
<https://www.reddit.com/r/rust/comments/1hpqgbt/sansio_the_secret_to_effective_rust_for_network/>
The modification to support time after the receiving a binding is 4 LoC
Copy link
Member

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure where you'd like to take this PR so I left a few comments that I'd like to have addressed for merging.

.find(|addr| addr.is_ipv4())
.context("Failed to resolve hostname")?;

let (sink, stream) = UdpFramed::new(socket, StunCodec).split();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see a sketch on how multiplexing of multiple StunBindings to different servers over the same socket works with this design.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that the other example doesn't directly address this. However, I know from experience that it does support multiplexing well so if we want to have an equal comparison, there needs to be an idea on how to support this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I was careful to do in this was to try to keep the implementation fairly close to the existing code (with a tiny bit of leeway). I'd be happy to play with extending it a bit, but the resulting code designed for a more real world approach will look quite different, which may introduce some impedence mismatch into things.

To clarify my understanding, you might choose to make two STUN binding requests simultaneously to two servers, and receive responses back. When you do get responses your code will see both addresses and be able to record them. These requests / responses are multiplexed on the same socket.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I was careful to do in this was to try to keep the implementation fairly close to the existing code (with a tiny bit of leeway). I'd be happy to play with extending it a bit, but the resulting code designed for a more real world approach will look quite different, which may introduce some impedence mismatch into things.

Yes, this makes sense! I think those (API) differences are interesting to dig into. The existing event-loop code can handle this without major design changes: You simply iterate over a list of StunBindings and pass the packet to it until one of them accepts it.

To clarify my understanding, you might choose to make two STUN binding requests simultaneously to two servers, and receive responses back. When you do get responses your code will see both addresses and be able to record them. These requests / responses are multiplexed on the same socket.

Yes. That is how you figure out, what kind of NAT the application is dealing with. If you receive the same observed address from both servers, the NAT is destination-independent and thus "hole-punch friendly". If you receive different responses (i.e. different port numbers) it is a symmetric NAT and hole-punching will fail.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another reason is that you want to test both IPv4 and IPv6 connectivity. This might be to the "same" server but from the perspective of STUN, that doesn't matter.

println!("Our public IP is: {address}");
}
}
tokio::time::sleep(tokio::time::Duration::from_secs(3)).await;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the other example, I'd like to see a sketch here how we would support other time-related requirements. For example, timeouts and re-transmits of messages where we didn't receive a response.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The re-transmits need to be bounded, i.e. we don't want to retry forever.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup - this was just an attempt to go like for like with the same logic from the article. The protocol logic is a better source of what should be done here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference is, adding more "timers" to the other implementation doesn't change its API. If we want to find an implementation that is equivalent, it also needs to support the same / similar ways of being extended.

if let Some(result) = self.stream.next().await {
let (message, _) = result?;
if let Some(address) = parse_binding_response(message) {
println!("Our public IP is: {address}");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the caller learn the address? Printing should happen in main to showcase that one could do whatever with the response.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't in this particular approach - this was a like for like implementation. I'll come up with something to demonstrate something a bit more real world useful.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't in this particular approach - this was a like for like implementation.

The sans-IO code prints inside the event-loop though, not within StunBinding:

if let Some(address) = binding.public_address() {
println!("Our public IP is: {address}");
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we extend the README to say that these are retro-actively community-contributed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup

@joshka
Copy link
Author

joshka commented Jan 2, 2025

I wasn't sure where you'd like to take this PR so I left a few comments that I'd like to have addressed for merging.

It's mostly about closing the loop. I don't expect this to really need to be merged / resolved in any way. The discussion is the most valuable part of this.

}
}

async fn public_address(&mut self) -> anyhow::Result<()> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing I want to mention here, in case you are interested in continuing to iterate on this:

In our production app, we don't just perform a STUN binding. Instead, we implement an entire TURN client. The TURN client has background timers that send particular messages on an interval (and wait for the responses). In addition - whilst those are running - other parts of the code need occasional access to the state of the client to pro-actively send messages. In particular, they need to be able to make new channel bindings (i.e. just a request-response) handshake. Those newly bound channels then trigger a new timer that rebinds the channel on a 5min interval.

The exact domain doesn't matter. What is important is that what we have here - an async fn that captures &mut self won't work for that unless we start using Arc and Mutex in a some way. To simulate the same thing, we could add an additional method that proactively performs another STUN binding, separately to the 5s timer.

Copy link
Author

@joshka joshka Jan 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. I think based purely on the RFC, my understanding is that multiple callers should have the ability to do STUN/ICE/TURN over the one UDP socket (or if really going all in on the spec, then across any of the supported transports).

Incidentally, as a supporting part of this part of things, I was under the impression that tokio / futures based streams already had a stream fork / partition approach, but it doesn't seem like they do. The filter method discards the non-filtered items. I'm looking at implementations for that in tokio-rs/tokio#7065.

My general idea on the design of how this might work is that:

  • on the STUN side of things, a Transaction can be created that known only about messages that are for that transaction. Implementing this seems simplest as something like:
let (stun_messages, non_stun_messages) =
    all_messages.partition(MultiplexedMessage::is_stun);
// then later in the code for implementing a single Transaction:
let (transaction_messages, non_transaction_messages) =
    stun_messages.partition(|message| message.transaction_id() == self.transaction_id)

At a higher level, the stun code sees a stream / sink of STUN messages, the TURN related code sees a stream / sink of TURN message, etc. There's no shared access to some &mut sender / Arc as the multiplex is handled by splitting the stream / sinks. I'm not sure how feasible this is as yet.

As an aside, my interest in this is purely about the technical challenge of understanding the technology better. I want to make sure that I communicate that your sans-io approach is pragmatic given the software you're building and it's very likely the best solution for your problem. All of this discussion is about whether this approach generalizes, or whether using async is a more ideal generalization of the sans-io problem. I think the shallow answer is that async could eventually be the better approach, but it's probably not quite there yet.

Also, I wanted to thank you for continuing the discussion on this. This is obviously your code for your $JOB which has settled on the sans-io approach alread, so this alternative discussion doesn't really change that. So I appreciate your time in giving feedback regardless, and I won't be offended if you have to duck out of the conversation at any point.

Copy link
Member

@thomaseizinger thomaseizinger Jan 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. I think based purely on the RFC, my understanding is that multiple callers should have the ability to do STUN/ICE/TURN over the one UDP socket (or if really going all in on the spec, then across any of the supported transports).

For completeness: It would be good if we can also multiplex other things in addition (like QUIC or in Firezone's case WireGuard). However, I don't think that should be an issue if we figure out a pattern on how to generally de-multiplex messages from one socket.

As an aside, my interest in this is purely about the technical challenge of understanding the technology better. I want to make sure that I communicate that your sans-io approach is pragmatic given the software you're building and it's very likely the best solution for your problem. All of this discussion is about whether this approach generalizes, or whether using async is a more ideal generalization of the sans-io problem. I think the shallow answer is that async could eventually be the better approach, but it's probably not quite there yet.

Also, I wanted to thank you for continuing the discussion on this. This is obviously your code for your $JOB which has settled on the sans-io approach alread, so this alternative discussion doesn't really change that. So I appreciate your time in giving feedback regardless, and I won't be offended if you have to duck out of the conversation at any point.

Absolutely! I think it is a very interesting topic and definitely worth exploring. Thank you for spending time on it! I think identifying async patterns (and shortcomings) here is very valuable and might also serve as a good resource for async Rust WGs in what is still needed to make things better. I'll do my best to stay responsive :)

An interesting thought I had this morning is: The direction you are pushing in with your design is still sans-IO (modulo the timers perhaps) because it doesn't depend on IO sources directly but uses the Stream + Sink abstractions. So eventually, we should probably also spend some time thinking about more descriptive names? :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An interesting thought I had this morning is: The direction you are pushing in with your design is still sans-IO (modulo the timers perhaps) because it doesn't depend on IO sources directly but uses the Stream + Sink abstractions. So eventually, we should probably also spend some time thinking about more descriptive names? :)

Yep. that's kind of similar to where I started in my hacker news comment:

This article / idea really refactors two things out of some IO code

  • the event loop
  • the state machine of data states that occur

But async rust is already a state machine, so the stun binding could be expressed as a 3 line async function that is fairly close to sans-io (if you don't consider relying on abstractions like Stream and Sink to be IO).

async fn stun(
    server: SocketAddr,
    mut socket: impl Sink<(BindingRequest, SocketAddr), Error = color_eyre::Report>
        + Stream<Item = Result<(BindingResponse, SocketAddr), color_eyre::Report>>
        + Unpin
        + Send
        + 'static,
) -> Result<SocketAddr> {
    socket.send((BindingRequest, server)).await?;
    let (message, _server) = socket.next().await.ok_or_eyre("No response")??;
    Ok(message.address)
}

Obviously there's a lot more to it than just that... :), but the underlying idea that async code and sans-io are on opposing sides of things is definitely not the case if you accept that sans-io doesn't require push style interfaces that are externally driven, and are ok with them being async state machines which are driven by the async framework instead of your own code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants