Snowflake stops working after large size tests (v2.1.0)
After some of the large message size tests, Snowflake suddenly stopped being able to send or receive. The race app was still running, and stopping and restarting the deployment seemed to fix the issue.
Fairly sure this is the culprit: https://github.com/RACECAR-GU/plugin/blob/snowflake-rc-2.1.0/source/TA2Plugin.go#L180
func (connection *ta2ConnUnicast) Write(msg []byte) error {
	conn, dialErr := connection.ClientFactory.Dial()
	if dialErr != nil {
		logError("Error Connecting to Send Socket: ", dialErr.Error())
		return dialErr
	}
	// We can't just close this connection right away before it gets a chance to send
	// FIXME: This is a kludge
	//go func() {
	//	<-time.After(5 * time.Minute)
	//	conn.Close()
	//}()
	// Send Message to Socket
	_, writeErr := conn.Write(msg)
	if writeErr != nil {
		logError("Error Writing Message to Send Socket: ", writeErr.Error())
	}
	return writeErr
}
connection.ClientFactory.Dial() is basically the same as defined here.
Is there an easy way to properly close this conn that you can think of? @shelikhoo @meskio
Here's an excerpt from my previous discussion with @cohosh:
<cohosh> the reason we can't close the connection when this function terminates is that conn.Write() is not a blocking call here
<cohosh> so that will return before the message is actually written
<cohosh> and in some cases, it takes a few minutes to write the message to the connection
<cohosh> so if we close the connection before the message has been received by the other side, the message will never be sent
The best thing to do is to implement some higher-level connection management or connection caching logic, so that we reuse connections when multiple messages are sent to the same destination(s) instead of dialing a new one on every Write.
Edited by xqiu