  1. #9
    Player
    Join Date
    Aug 2013
    Posts
    2,457
    Character
    Raist Soulforge
    World
    Midgardsormr
    Main Class
    Thaumaturge Lv 60
    Something may be happening before you get to Tokyo... or at least at that first hop we can see before Tokyo.

    A couple of questions.... Why do you keep blacking out the first half of the route? And why aren't you running tracert? We aren't seeing the peak values with that report, and those can be an important factor to investigate.

    For example, look at what we can make out for hop 8:

    Unknown IP, so we don't know who manages it, nor where it is located--which may give some direction on known issues.
    Minimum response time is 5ms, last is 6ms... but the average is 46ms. We can only make wild guesses as to what your peak times are, much less what type of jitter you have on your line at that point in time. Is the peak a spike that pushes the boundary of a timeout just once in a blue moon and is enough to make the average 46, or is it that the bulk of the times are 50-ish and the single digit time is the anomaly?

    Apply that as well to hop 12. Min 99, last was 101, but the average was 155. Looking at the plots off to the right, it appears that it may in fact be a scenario where you have response times that are flirting with a timeout...getting close, but not quite. Is that because that one hop is flaking out, or is it a string of issues that have been stacking up from earlier in the route?
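
    To put some numbers on why min/last/average alone can't answer that question, here is a minimal Python sketch. The two sample lists are made up for illustration (they are not from anyone's report), and `summarize` is just a hypothetical helper name. Both lists produce the same min of 5, last of 6, and average of 46, yet they describe very different lines:

```python
from statistics import mean

def summarize(samples_ms):
    """Return the stats a min/last/avg style report shows, plus the peak."""
    return {
        "min": min(samples_ms),
        "last": samples_ms[-1],
        "avg": mean(samples_ms),
        "max": max(samples_ms),  # the value the report is NOT showing us
    }

# Hypothetical samples: one huge spike, everything else single digits.
one_spike = [5, 6, 6, 5, 6, 5, 6, 5, 410, 6]

# Hypothetical samples: almost everything is 50-ish, two fast replies.
mostly_slow = [5, 56, 57, 55, 58, 56, 57, 55, 55, 6]

for name, samples in (("one_spike", one_spike), ("mostly_slow", mostly_slow)):
    print(name, summarize(samples))
```

    Same min, same last, same average... only the peak (and the individual samples) tells the two stories apart.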

    Keep in mind that whatever is going on at any given hop can potentially compound into bigger problems along the rest of the route. More details are needed about your route than what is being presented. Also, remember that ICMP ECHO (the query used) is one of THE lowest priority items in traffic shaping...so it is one of the first things to get delayed (or dropped) during high loads. The routers do this in an effort to stave off pending congestive failure. When they start doing it is determined by the ISP managing the router... it could be at 70% utilization or 90%. Their hardware, their rules. That's what makes it such a useful "test for smoke" kind of tool. If a hop is intermittently delaying a response (or dropping it), it is a sign that utilization is scaling too high. More than a 10-15% variance in back-to-back responses can be an indicator that things are getting a little tight. Not so much an issue when your times are <30ms or so, but beyond that... some additional testing may be in order.
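
    As a rough sketch of that rule of thumb, something like the following Python could flag back-to-back swings. The function name, the 15% threshold, the 30ms floor, and the sample values are all assumptions for illustration, not a standard tool or a hard limit:

```python
def back_to_back_swings(samples_ms, threshold=0.15, floor_ms=30):
    """Flag consecutive replies that differ by more than `threshold`.

    Pairs where both replies are under `floor_ms` are ignored, since a
    few ms of wobble on a fast hop is not worth chasing.
    """
    flagged = []
    for prev, cur in zip(samples_ms, samples_ms[1:]):
        if max(prev, cur) < floor_ms or prev <= 0:
            continue
        change = abs(cur - prev) / prev
        if change > threshold:
            flagged.append((prev, cur, round(change * 100, 1)))
    return flagged

# Hypothetical reply times in ms, not taken from a real report.
print(back_to_back_swings([100, 104, 150, 98]))  # two big swings
print(back_to_back_swings([12, 20, 14]))         # fast hop, ignored
```

    Anything it flags is only a hint that more testing may be in order, not proof of a fault.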

    It is important to see all the information you can about what is going on along your route. You need to work outward from the source and examine what happens first, as it impacts everything that happens afterwards. In the case of this game, we send our input to the server, the server processes it and sends a response back, and our client processes that and the results are displayed on our screen. So, you need to work from the client outwards first to see if there are any potential issues going forward to the server. Following that path with the data you've provided, we clearly see issues cropping up at hops well before SE's end of the equation (be that on the forward or reverse path, it could be starting at a point between you and SE). Unfortunately, we aren't seeing enough actionable information about what may be happening or how bad it may be to convince a technician to take a harder look at your route.

    Just for a quick demonstration of what I am talking about, here is some data that most technicians will like to start with:
    Code:
    Tracing route to 124.150.157.28 over a maximum of 30 hops
    
      1     1 ms    <1 ms     1 ms  LPTSRV [10.10.100.1]
      2    26 ms    28 ms    28 ms  cpe-075-176-160-001.sc.res.rr.com [75.176.160.1]
      3    19 ms    22 ms    16 ms  cpe-024-031-198-005.sc.res.rr.com [24.31.198.5]
      4    15 ms    14 ms    55 ms  clmasoutheastmyr-rtr2.sc.rr.com [24.31.196.210]
      5   165 ms   195 ms    75 ms  be33.drhmncev01r.southeast.rr.com [24.93.64.180]
      6    31 ms    29 ms    34 ms  bu-ether35.asbnva1611w-bcr00.tbone.rr.com [107.14.19.42]
      7    90 ms    88 ms    90 ms  bu-ether22.vinnva0510w-bcr00.tbone.rr.com [107.14.17.179]
      8    90 ms    91 ms    89 ms  bu-ether13.tustca4200w-bcr00.tbone.rr.com [66.109.6.2]
      9    86 ms    86 ms    88 ms  0.ae3.pr0.lax10.tbone.rr.com [66.109.9.26]
     10    88 ms    88 ms    89 ms  66.109.10.194
     11   196 ms   194 ms   197 ms  ip-202-147-0-52.asianetcom.net [202.147.0.52]
     12   195 ms   196 ms   195 ms  gi1-0-0.gw1.nrt5.asianetcom.net [202.147.0.178]
     13   193 ms   194 ms   197 ms  squareco.asianetcom.net [203.192.149.210]
     14   196 ms   196 ms   197 ms  61.195.56.129
     15   186 ms   189 ms   187 ms  219.117.144.66
     16   294 ms   313 ms   186 ms  219.117.144.53
     17   206 ms   214 ms   201 ms  219.117.144.41
     18   194 ms   195 ms   195 ms  219.117.147.194
     19   192 ms   193 ms   230 ms  124.150.157.28
    
    Trace complete.
    Yes... I included my entire route. My public IP is not exposed in this report, so it's not the big OMFG security issue everyone panics about. Now let's look at what signs of potential problems are presented here.

    There are lag spikes at SE's end, yes? Does it automatically follow that it is a problem on SE's end? Not necessarily. Let's work through the route. Hop 5 in the Raleigh-Durham area has a bad lag spike. Hmm... perhaps that is contributing to the one at SE's end, because hop 5 flaked out again during the 19th round of pings to SE's server. Note, that's basically what is important about tracert...how it works. It basically pings repeatedly: it sends an ICMP ECHO request that is only allowed to live for one hop and waits for hop 1 to answer, 3x, then does the same for hop 2 3x (you pass through hop 1 to get to hop 2 before you get that answer), then for hop 3 3x (again, passing through hops 1 and 2 before it gets each response). Strictly speaking, the in-between hops answer with an ICMP "time exceeded" message when the probe's TTL runs out, and only the final target sends back an actual ECHO reply, but the effect is the same. So along the way, if hop 3 has issues intermittently, they can re-appear in the response to any hop that follows it.
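
    Here is a toy Python model of that walk, just to show how a hiccup at an early hop can show up in the row for a later one. It does no real networking; `simulate_tracert`, the per-hop delays, and the spike timings are all invented for illustration:

```python
def simulate_tracert(hop_delays_ms, spikes, probes_per_hop=3):
    """Toy model of a trace: probe hop 1 three times, then hop 2, etc.

    hop_delays_ms[i] is the round-trip time that hop i+1 adds.
    spikes maps a global probe number (0-based, in the order sent) to
    (hop_number, extra_ms): that hop is slow while that probe is in flight.
    """
    rows = []
    probe_no = 0
    for target in range(1, len(hop_delays_ms) + 1):
        row = []
        for _ in range(probes_per_hop):
            # Every probe crosses all hops up to and including its target.
            rtt = sum(hop_delays_ms[:target])
            bad_hop, extra = spikes.get(probe_no, (None, 0))
            if bad_hop is not None and bad_hop <= target:
                rtt += extra  # the slow hop is on this probe's path
            row.append(rtt)
            probe_no += 1
        rows.append(row)
    return rows

# Hop 3 hiccups during probe 7 (aimed at hop 3) and again during
# probe 10 (aimed at hop 4). Hop 4 itself never misbehaves.
rows = simulate_tracert([1, 10, 5, 20], {7: (3, 100), 10: (3, 80)})
for hop, row in enumerate(rows, start=1):
    print(hop, row)
```

    In this model the hop 4 row shows a spike even though hop 4 is fine, which is why a spike at the last hop does not automatically mean the last hop is the culprit.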

    Keeping that in mind... let's look further up the route. Look specifically at hop 16 during Asianet's routing. Notice something special about that hop? Not only the lag spike... but the pattern for it? Compare it to what we also saw at hop 5. Let's put them on top of each other to draw the eyes in a bit better:

    Code:
      5   165 ms   195 ms    75 ms  be33.drhmncev01r.southeast.rr.com [24.93.64.180]
     16   294 ms   313 ms   186 ms  219.117.144.53
    Notice how they follow a similar pattern? Could they be connected? Maybe, maybe not... but one thing is for certain--there should NOT be a swing of 75 to 195 between just 3 pings at the same hop in Raleigh...I also should not be getting latency that high that close to home (it's like a 4 hour drive). That is a MAJOR indicator of high jitter... a telltale sign of congestion or something else being out of sorts either in Raleigh, or on the way there from the previous hop.
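
    If you'd rather not eyeball every row, a short Python sketch like this can pull the three times out of each tracert line and flag the hops with a wide swing. The regex only handles rows where all three probes answered, `<1 ms` is counted as 1, and the 50ms cutoff is an arbitrary assumption for this example, so treat it as a starting point:

```python
import re

# Matches rows like: "  5   165 ms   195 ms    75 ms  host [ip]"
ROW = re.compile(r"^\s*(\d+)\s+(<?\d+) ms\s+(<?\d+) ms\s+(<?\d+) ms\s+(.*)$")

def parse_row(line):
    """Return (hop, [t1, t2, t3], host), or None for non-matching lines."""
    m = ROW.match(line)
    if m is None:
        return None  # header, blank line, or a row containing '*' timeouts
    times = [int(t.lstrip("<")) for t in m.group(2, 3, 4)]
    return int(m.group(1)), times, m.group(5).strip()

def wide_swings(lines, cutoff_ms=50):
    """Hops whose slowest and fastest reply differ by at least cutoff_ms."""
    flagged = []
    for line in lines:
        row = parse_row(line)
        if row and max(row[1]) - min(row[1]) >= cutoff_ms:
            flagged.append(row)
    return flagged

# A few rows copied from the trace above.
trace = """\
  4    15 ms    14 ms    55 ms  clmasoutheastmyr-rtr2.sc.rr.com [24.31.196.210]
  5   165 ms   195 ms    75 ms  be33.drhmncev01r.southeast.rr.com [24.93.64.180]
  6    31 ms    29 ms    34 ms  bu-ether35.asbnva1611w-bcr00.tbone.rr.com [107.14.19.42]
 16   294 ms   313 ms   186 ms  219.117.144.53
 17   206 ms   214 ms   201 ms  219.117.144.41
 19   192 ms   193 ms   230 ms  124.150.157.28
""".splitlines()

for hop, times, host in wide_swings(trace):
    print(hop, times, max(times) - min(times), host)
```

    With that cutoff it picks out the same two hops, 5 and 16.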

    From this simple report, we can identify where a potential root of at least one problem lies, and can move on to more in-depth diagnosis of that or any other potential hot spots. We have the IP, and fortunately a registered DNS name as well. From either of those two pieces of information, we can look up the registry data to find out who is responsible for maintaining those hops, and if we find something eventful enough going on there, someone (preferably our ISP's Tier3 techs) can contact them to investigate further--and hopefully correct the problem.

    This can be key information to provide to your ISP's Tier3 support to conduct a proper investigation into what may be going on with your route. They have the resources to take it further, but they need a roadmap to get started. Such a report is simple to do, provides a lot of information, and is pretty universal. You can run a nearly identical report in all the *nix type of environments used in routing hardware, so it is easy to line things up and compare the data.
    Last edited by Raist; 02-07-2015 at 12:19 PM.