Quote: Originally Posted by Wolfie
More competent employees probably, and (again without fully knowing, but guessing) spreading the whole team globally. I get the distinct feeling that whoever is doing work in the Toronto datacenter is only getting instructions from the dudes in Japan, and doesn't actually know what's wrong or how to fix it.
That's my thought as well, Wolfie. I wouldn't be surprised if it's something really stupid inside a router or switch somewhere, where traffic isn't hitting the primary or fallback IP address, which causes the bottleneck, which causes the crashes. I wouldn't be shocked if they have only 2 people working on this, checking each router or switch config file line by line to make sure it all routes right. Then they would have the software guys checking their stuff. The problem is they have not been entirely descriptive. It could also be something as simple as "oops, we miscounted, bring up 5 more instance servers" and boom, problem fixed.