tokio-native-tls-flush-patch: Flush after write on MacOS in tokio-native-tls
This MR is a follow-up for the following comment on "[PoC] macos-native-tls-issue: Solving native-tls issue on MacOS": !3159 (comment 3256382) where we discussed the bug that users consistently got the error Error: connection closed via error when using Arti with native-tls on MacOS.
I already did a lot of investigation on this issue while it actually not being in my expertise range. A part of my conclusions can be found in !3159 (closed) but I recently also asked for support on other platforms:
- https://stackoverflow.com/questions/79762468/difference-in-flushing-behavior-in-openssl-and-apple-secure-transport
- https://users.rust-lang.org/t/need-help-understanding-flush-behavior-in-tokio-native-tls-openssl-vs-secure-transport/133844
- https://github.com/openssl/openssl/discussions/28527
Due to the lack of replies I suspect this topic is rather niche, but on Stackoverflow I was able to receive a reply. I quote a few phrases:
The whole API provided within
Secure Transportis very limited (...)
And there are no such API like
is_handshake_message_completeandis_waiting_for_peer_responseinSecure Transport, which means Secure Transport's callback model doesn't provide the hooks needed for smart protocol-aware flushing (...)
so the safest approach is to flush after every write on macOS.
The code changes in this MR implements exactly what the latest quote suggests. I patched tokio-native-tls to flush after every write when executed on MacOS.
To be able to patch tokio-native-tls I had to clone it locally and register the patch in Cargo.toml. We need to decide if this is a procedure we want to use. I also created a PR with tokio-native-tls (https://github.com/tokio-rs/tls/pull/159) but I'm afraid this wont be reviewed due to the repo being inactive - the latest approved PR in seemed to be from 2023...
@opara Regarding this comment:
Rather than making arti less performant in order to work around an issue in another library, I think we should ideally figure out exactly which library has the issue and open an issue with it. My guess would be tokio-native-tls or native-tls.
I agree with you. The proposed code change won't make Arti in general less performant, it only provides a workaround for the specific case that native-tls is used on MacOS. We do not alter the behavior or performance for any other platform or TLS Provider.
Test / reproduce: On MacOS run hyper-http-client-example on this branch and it will run smoothly. Run on a branch without this change like main and you'll get the expected error.