We have a nice system of “rolling restarts” that allows us to update portal servers without introducing any “real” downtime for users. We introduce a pre-upgraded server to the pool, and that server takes on all new sessions. The two servers already in the pool don’t get any new users and “drain” as people log out. When they are both “empty” they are upgraded, and then the process is reversed to get that third server out of the pool and ready for the next time.
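For the curious, here is a rough sketch of that drain-and-swap logic in Python. It is only an illustration, not our actual tooling: the `Server` and `Pool` classes, the function names, and the "send new sessions to the least-loaded accepting server" routing rule are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    version: str
    accepting: bool = True          # may this server take NEW sessions?
    sessions: set = field(default_factory=set)


class Pool:
    """Hypothetical stand-in for the load balancer's view of the pool."""

    def __init__(self, servers):
        self.servers = list(servers)

    def route_new_session(self, session_id):
        # Assumed policy: least-loaded server among those still accepting.
        candidates = [s for s in self.servers if s.accepting]
        if not candidates:
            raise RuntimeError("no server is accepting new sessions")
        target = min(candidates, key=lambda s: len(s.sessions))
        target.sessions.add(session_id)
        return target

    def logout(self, session_id):
        for s in self.servers:
            s.sessions.discard(session_id)


def begin_drain(pool, spare):
    """Add the pre-upgraded spare and stop sending new users to the old servers."""
    old_servers = list(pool.servers)
    for s in old_servers:
        s.accepting = False
    spare.accepting = True
    pool.servers.append(spare)
    return old_servers


def upgrade_if_drained(old_servers, spare, new_version):
    """Once every old server is empty, upgrade them and start draining the spare."""
    if any(s.sessions for s in old_servers):
        return False                # someone is still logged in; keep waiting
    for s in old_servers:
        s.version = new_version
        s.accepting = True
    spare.accepting = False         # the reverse step: the spare now drains
    return True


def retire_spare_if_drained(pool, spare):
    """Pull the spare out of the pool once its last session has ended."""
    if spare.sessions:
        return False
    pool.servers.remove(spare)
    return True
```

The point to notice is that nothing ever kicks a user off: a server is only touched once its session set is empty, which is why the whole thing depends on sessions ending naturally.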
We needed to update the portal component that talks to our Learning Management System so that it could display the new fall courses. Well, right in the middle of the draining, our network went all haywire. See if you can spot the trouble in this chart:

Near the end of the day you can see the full extent of the crazy.

The blue “hump” in the graph is a function of application session length divided by network latency, carry the phase of the moon squared. Over the course of a day the number of sessions coming and going remains stable. When new servers are introduced they ramp up with new sessions quite quickly, but you can see that it normalized within 30 minutes or so. The fact that network traffic from the outside world was shut off also helped flood us with new sessions when the network recovered.