Measure Snowflake proxy churn of each pool separately
We have two different pools of Snowflake proxies:
- An "unrestricted" pool that contains proxies with permissive NAT and firewall types. Most clients should be able to connect to these proxies.
- A "restricted" pool that contains proxies with unknown or non-permissive NAT types. Some clients will not be able to connect to these proxies.
Our "unrestricted" proxy pool is considerably smaller than our "restricted" pool. Looking at our most recent metrics:
snowflake-ips-nat-restricted 101462
snowflake-ips-nat-unrestricted 16810
snowflake-ips-nat-unknown 48742
That's around 17K proxies in the "unrestricted" pool and 150K proxies in the "restricted" pool (the restricted and unknown counts combined). Our general assumption is that many (but not all) proxies in the "unrestricted" pool are running on servers with static IP addresses. This makes them more susceptible to enumeration and blocking.
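For reference, the pool sizes quoted above can be reproduced from the three metrics lines. This is only a small arithmetic sketch; the mapping of `snowflake-ips-nat-unknown` into the restricted pool follows the pool description above, and the variable names are illustrative.

```python
# Counts copied from the metrics lines quoted above.
counts = {
    "snowflake-ips-nat-restricted": 101462,
    "snowflake-ips-nat-unrestricted": 16810,
    "snowflake-ips-nat-unknown": 48742,
}

# Proxies with an unknown NAT type are placed in the restricted pool.
unrestricted = counts["snowflake-ips-nat-unrestricted"]
restricted = (
    counts["snowflake-ips-nat-restricted"] + counts["snowflake-ips-nat-unknown"]
)
ratio = restricted / unrestricted

print(f"unrestricted pool: {unrestricted}")
print(f"restricted pool:   {restricted}")
print(f"restricted / unrestricted: {ratio:.1f}x")
```

By these numbers the restricted pool is roughly nine times the size of the unrestricted pool.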
I would like to directly measure the proxy churn for each pool separately. The motivation is to test whether this assumption holds and to better understand how enumeration attacks would disproportionately affect users who rely on the unrestricted proxy pool. If churn in this pool is low, that could motivate looking into some recent proposals like SpotProxy to make this pool more dynamic.
My plan is to adapt the HyperLogLogPlus patch from !95 (merged) to log IP counts for each pool separately. Adding the counts from the two pools together should then let us compare the total churn metrics with our previous measurements.
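To make the idea concrete, here is a minimal sketch of per-pool counting. It is not the patch from !95: exact Python sets stand in for HyperLogLog++ sketches, the `PoolChurn` class and its method names are hypothetical, and "churn" is assumed here to mean the fraction of IPs in the current period that were not seen in the previous one, which may differ from the definition used in the earlier measurements.

```python
# Illustrative only: exact sets replace HyperLogLog++ sketches, and all
# names below are hypothetical rather than taken from the Snowflake code.
POOLS = ("unrestricted", "restricted")


def pool_for(nat_type):
    """Map a reported NAT type to a pool; unknown types count as restricted."""
    return "unrestricted" if nat_type == "unrestricted" else "restricted"


class PoolChurn:
    def __init__(self):
        self.previous = {pool: set() for pool in POOLS}
        self.current = {pool: set() for pool in POOLS}

    def record(self, ip, nat_type):
        """Note that a proxy with this IP polled during the current period."""
        self.current[pool_for(nat_type)].add(ip)

    def rotate(self):
        """Close the current period and return per-pool statistics."""
        stats = {}
        for pool in POOLS:
            cur, prev = self.current[pool], self.previous[pool]
            new = cur - prev  # IPs not seen in the previous period
            stats[pool] = {
                "unique": len(cur),
                "new": len(new),
                # Assumed churn definition: share of this period's IPs that are new.
                "churn": len(new) / len(cur) if cur else 0.0,
            }
        self.previous = self.current
        self.current = {pool: set() for pool in POOLS}
        return stats


tracker = PoolChurn()

# Period 1 (documentation-range IP addresses).
tracker.record("192.0.2.1", "unrestricted")
tracker.record("198.51.100.7", "restricted")
tracker.record("203.0.113.9", "unknown")
first = tracker.rotate()

# Period 2: the unrestricted proxy returns, one restricted proxy is replaced.
tracker.record("192.0.2.1", "unrestricted")
tracker.record("198.51.100.8", "restricted")
tracker.record("203.0.113.9", "unknown")
second = tracker.rotate()

# Summing the per-pool unique counts gives a total for the period.
total_unique = sum(second[pool]["unique"] for pool in POOLS)
print(second, total_unique)
```

In a real deployment the sets would be replaced by one sketch per pool and per period, so the broker keeps only approximate counts rather than the IP addresses themselves.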