Objective 2: Make the Conjure pluggable transport more reliable
The Project shall address reliability challenges, including issues caused by overloaded stations and unstable connections, by developing and implementing solutions to improve connection stability. This objective includes conducting comprehensive performance testing and devising mechanisms for connection recovery and failure diagnostics to enhance the overall reliability of the transport.
2.1: Research reliability issues — (2 months)
This activity targets two types of reliability issues: (1) overloaded Conjure stations and (2) faulty connections between stations and bridges. During this phase we will research these problems through dogfooding and performance testing to understand how reliably the Conjure MVP works. We’ll also assess whether implementing TurboTunnel would help with some of these issues, and whether there is a way to distinguish an overloaded station from a blocking event. The intended outcome of this activity is a set of implemented solutions that allow stalled connections to recover and that produce clearer log output when a connection fails because of an overloaded station.
The work in this activity may require coordination with the upstream maintainers and Conjure station admins. As noted in the risk analysis section, we have a working relationship with the Conjure code maintainers, and they have responded to us quickly in the past. We will continue to update them on our plans and will reach out early and often if upstream changes are needed.