
TPA-RFC-72: move donate-01 VM to gnt-dal cluster

in tpo/web/donate-neo#134 (closed), we identified severe latency issues with the donation site. @lavamind suspected the tunnel crossing the Atlantic was the cause, and analysis (tpo/web/donate-neo#134 (comment 3083019)) confirms there is a 200-400ms latency over that link, which is causing severe disruptions.

at first, @lavamind thought we should move the crm-int-01 machine next to the donate-01 machine, but it's actually the other way around: the crm-* machines are already in the gnt-dal cluster, where they were moved over a year ago in #41109 (closed).

so what we need to do is move the donate-01 VM to gnt-dal instead.

next steps:

  • make a migration plan (for now, below is an inter-cluster migration plan inspired by #41109 (comment 2900087), but should we just rebuild a donate-02?)
  • review the migration plan (@lavamind)

inter-cluster migration plan

Before:

  • schedule an outage with stakeholders
  • look for any hard-coded IPs in the donate and Puppet code
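The hard-coded IP check above could be sketched as a small script. This is only an illustration, not an existing TPA tool: the function name `find_hardcoded_ips` and the list of file suffixes are assumptions, it only looks for IPv4 literals (IPv6 would need a separate pass), and it will report false positives such as four-part version numbers.

```python
import ipaddress
import re
from pathlib import Path

# Candidate dotted quads that are not part of a longer dotted number;
# the ipaddress module validates the octets afterwards.
IPV4_CANDIDATE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?![\d.])")

# Assumed set of file types worth scanning (Puppet manifests and
# templates, YAML data, Python, generic config); adjust as needed.
DEFAULT_SUFFIXES = (".pp", ".erb", ".epp", ".yaml", ".yml", ".py", ".conf")


def find_hardcoded_ips(root, suffixes=DEFAULT_SUFFIXES):
    """Yield (path, line_number, address) for each IPv4 literal under root."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in IPV4_CANDIDATE.finditer(line):
                try:
                    # Rejects out-of-range octets such as 999.1.1.1.
                    ipaddress.IPv4Address(match.group())
                except ValueError:
                    continue
                yield path, lineno, match.group()


# Example use against a checkout (the path is a placeholder):
# for path, lineno, address in find_hardcoded_ips("path/to/checkout"):
#     print(f"{path}:{lineno}: {address}")
```

Running it once against the donate code and once against the Puppet code gives a list of places to review before the VM's address changes; each hit still needs a human look to decide whether it actually refers to donate-01.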

During:

After:

Edited by Jérôme Charaoui