1

A few days ago I ran into a situation where my work VPN was unavailable. While at work, I tried to ping my home server and it didn't respond; fine, there are many possible reasons for that. But then I tried to connect to it from my phone (over the cellular network)... and it responded. I started investigating and found the following.

  1. Prerequisites: the home equipment is an Asus RT-AC68U router with Merlin firmware and no known faults. The home IP is static, 89.179.244.35 - vnikityuk.static.corbina.ru (no DNS failures ever; that's the only IP address for this record). The provider connection is IPoE (Ethernet, 100 Mbps). Most of the Internet is reachable from home (including Google and Serverfault.com). All the usual rituals, like rebooting the router, updating its firmware, and connecting the PC directly instead of through the router, have been tried and had no effect.

  2. When I trace my home address from the office network, it gets stuck at the 3rd hop.

  3. When I trace my "neighbour's" address 89.179.244.40 from the office network, it traces successfully to the end in 10 hops.

  4. When I trace my office address from home, it gets stuck at the 5th hop, somewhere at the border between my provider and one of the intermediate providers.

  5. I'm in contact with the office network administrator, and he swears he has no IP filtering or anything like that. I believe him because of No. 6:

  6. The same situation appears with 50/50 odds when tracing my home address from different providers. I mean, the trace does not get any further than the 2nd (or 3rd) hop. To find this out, I asked about 10 friends to trace both my address and the "neighbour" address.

From all of these networks, the "neighbour" IS TRACED.

It blows my mind. I am not CCNA certified, but I tended to believe I understood how the Internet works: subnets, autonomous systems, BGP, etc. But now it seems that different IP addresses in one and the same subnet (89.179.244.0/24, AS8402) get routed differently! How can this be at all?

P.S. I have a "quick fix" for that, which is changing the ISP (there are plenty of them in my district). But first I want to understand this situation in general.

  • 1
    I posit your ISP is doing some kind of dynamic routing at the individual IP level and something is borked. – davidgo Feb 22 '22 at 06:04
  • ISP tech support said they can do nothing: backbone routing is out of their scope and they have no way to escalate it. So it seems I have to change the ISP. – Victor Nikityuk Feb 22 '22 at 07:08

2 Answers

1

When I trace my home address from the office network, it gets stuck at the 3rd hop.

This may be the fault of your workplace's upstream provider. I'm guessing purely from the hop count (this is an assumption, but if the company is large enough to have several offices and a VPN, it will likely have two if not three internal hops before reaching the ISP), and also from trying to trace the specified addresses from various other locations.

When I trace my office address from home, it gets stuck at the 5th hop, somewhere at the border between my provider and one of the intermediate providers.

Without seeing traces from both directions, this may be misleading. Often the same router has IP addresses belonging both to its real owner and to various intermediate providers that it's peered with, and both the AS number and the 'reverse DNS' may correspond to that provider (or, for example, an IX) even though it's technically a customer's router.

now it seems that different IP addresses in one and the same subnet (89.179.244.0/24, AS8402) get routed differently!

There can be several explanations:

  1. BGP routes (and routes in general) do not necessarily correspond to subnets. They correspond to networks, but a network may consist of one subnet, or several, or half a subnet. It is perfectly possible for your ISP to have two /25 subnets, or four /26's, and advertise them as an aggregated /24 route via BGP (and likewise for the ISP's internal routing).

    And vice versa, it is possible that the actual subnet might be a /20 or /17 or something, but the ISP deaggregates the BGP advertisements to a bunch of /24's for (often dubious) "BGP optimization" purposes.

  2. Some routes might not necessarily point to "subnets" at all; instead each individual address from the /24 might be routed point-to-point through a different device.

    For example, it is possible that there's actually no 89.179.244.0/24 subnet as such, but internally the customer routers use completely different addresses, and their "public" addresses are internally routed as /32's (e.g. 89.179.244.35/32 via 172.16.42.35 kind of thing). I'm aware of one nearby ISP which does precisely that.

    I would not be surprised to see this in some networks that transition fully to IPv6 (using the "IPv4 as a service" model). As a more common example, most VPN types work in point-to-point mode (no ARP) rather than emulating a broadcast subnet.

  3. Similar to the above, even if an ISP has a general BGP route for your entire network, they may internally have manually added a more specific route for your address specifically, e.g. if they needed to block traffic to it for some reason. It's completely possible to route e.g. a certain /32 through somewhere else than the rest (the only question is why).

  4. The point where routes diverge may be using ECMP (equal-cost multi-path routing). In enterprise routers (and indeed even Linux), a route may have multiple gateways specified, and a packet may use any one of them. This may be used in combination with BGP, e.g. if there are several paths of equal preference, instead of using a tie-breaker to select just one "best" path, the router just combines all of them into an ECMP route.

    $ ip route show

    # Normal route:
    10.147.112.0/24 via 10.147.240.4 dev gre-ember proto bird metric 32

    # ECMP route:
    10.147.122.0/24 proto bird metric 32
        nexthop via 10.147.240.3 dev gre-wind weight 1
        nexthop via 10.147.240.4 dev gre-ember weight 1
    

    Usually in ECMP, the nexthop for each packet is chosen by hashing its source and destination addresses, so that all packets between the same pair of hosts take the same path, but a different destination may take a different one.

    For example, if there are two nexthops, then hash(src.dst) mod 2 determines which nexthop (0th or 1st) to pick, so as a result, you may see that 'odd' destination IPs will use nexthop A while 'even' destinations use nexthop B, or the other way around.

    (This is similar to how LACP-based link aggregation works, only at L3 instead of L2.)
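
Points 3 and 4 can be put together in a few lines of Python. This is only a toy model, not how any real router is implemented: the routing table is hypothetical, it reuses the example addresses from above, the source address is a documentation address, and CRC32 merely stands in for whatever hash a given router actually uses.

```python
import ipaddress
import zlib

# Hypothetical routing table (all values are illustrative):
# the /24 is an ECMP route with two nexthops, and one /32 inside it
# is routed somewhere else entirely.
ROUTES = {
    ipaddress.ip_network("89.179.244.0/24"): ["10.147.240.3", "10.147.240.4"],
    ipaddress.ip_network("89.179.244.35/32"): ["172.16.42.35"],
}


def lookup(src: str, dst: str) -> str:
    """Return the nexthop for a packet: longest-prefix match, then ECMP hash."""
    dst_ip = ipaddress.ip_address(dst)
    matches = [net for net in ROUTES if dst_ip in net]
    if not matches:
        raise LookupError(f"no route to {dst}")
    # The most specific (longest) prefix wins, so a /32 beats the /24.
    best = max(matches, key=lambda net: net.prefixlen)
    nexthops = ROUTES[best]
    # Same (src, dst) pair -> same hash -> same nexthop every time.
    index = zlib.crc32(f"{src}>{dst}".encode()) % len(nexthops)
    return nexthops[index]


if __name__ == "__main__":
    for dst in ("89.179.244.35", "89.179.244.40", "89.179.244.41"):
        print(dst, "->", lookup("198.51.100.7", dst))
```

In this model 89.179.244.35 always follows the /32 route, while every other address in the /24 is spread across the two ECMP nexthops, consistently per (source, destination) pair. If one of those paths were broken, only some destinations would be affected, and only from some sources.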

u1686_grawity
  • Thank you for the detailed comment! Still, it makes me wonder why only my specific IP seems to have this "incorrect routing" - I've randomly checked other "neighbour" addresses and they all are announced correctly. – Victor Nikityuk Feb 22 '22 at 07:05
  • For the record, I don't see any problems when tracing _to_ the two addresses you mention. Though there definitely is a hop that uses ECMP (e.g. right past "gldn-gw-140G.corbina.net" the path alternates between two routers 85.21.224.82 and 195.14.62.247, but both belong to the same ISP, so it immediately converges afterwards – and from certain other locations which take a different path, the same happens after "himki-bb-be1.corbina.net" instead, or after "ams-bb.corbina.net", etc). But that doesn't seem to be a factor, both addresses are equally reachable from all locations. – u1686_grawity Feb 22 '22 at 07:20
0

Traceroute is not 100% reliable. Unless it can get a response from each hop, it won't work well. Years ago it worked really well, because no server had reason to avoid it. But now there are plenty of reasons to drop pings.

You can try nmap on the last hop you get, and see exactly what is going on with that server. Your home router may be configured to drop certain pings, in which case it would be invisible to traceroute.

You can also mess around with the traceroute program, and see if the timing can be adjusted to get more responses. I've done traces where nothing showed after the first hop! So, you just don't know exactly what's out there unless you port scan the seemingly dead hosts.
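
As a rough sketch of what "messing around" can look like: with the common Linux traceroute, the probe type (-I for ICMP, -T for TCP SYN), the per-probe wait in seconds (-w), the number of probes per hop (-q) and the maximum hop count (-m) can all be changed. The helper below is hypothetical and only builds the command line; option names differ in other implementations (Windows tracert, for example), so treat them as an assumption about your system.

```python
def traceroute_argv(host, method="udp", wait_s=5, queries=3, max_hops=30):
    """Build a Linux traceroute command line (hypothetical helper)."""
    # Probe type: the default is UDP; -I sends ICMP echo, -T sends TCP SYN.
    method_flags = {"udp": [], "icmp": ["-I"], "tcp": ["-T"]}
    if method not in method_flags:
        raise ValueError(f"unknown method: {method}")
    return [
        "traceroute",
        "-n",                   # skip reverse DNS lookups
        *method_flags[method],
        "-w", str(wait_s),      # seconds to wait for each reply
        "-q", str(queries),     # probes sent per hop
        "-m", str(max_hops),    # give up after this many hops
        host,
    ]


if __name__ == "__main__":
    # Print one command line per probe type so the results can be compared.
    for method in ("udp", "icmp", "tcp"):
        print(" ".join(traceroute_argv("89.179.244.35", method=method)))
```

The resulting list can be passed to subprocess.run. The ICMP and TCP modes typically need root privileges. Comparing the three probe types from the same machine shows whether a hop is really dead or is just ignoring one kind of probe.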

Do you have ports open at home to accept pings? If every port is closed, i.e. no servers connected to the Internet, the router might drop pings too, because there's no point in revealing you're even there at all. Also, is your static IP a converted one from a DHCP modem connection? There are services where you submit your dynamic IP, and you get a static one. If your dynamic IP changes, you have to tell the service the new IP.

Those types of static IPs are not universally accepted. It's not like dynamic DNS, where a hostname is mapped to your dynamic IP. Converting one IP to another IP is more hazardous from a networking standpoint. And not every Internet router recognizes such things.

Try it using the IP from your ISP, or did you get the static IP from your ISP? In any event, things are not always as they appear to simple networking tools. Sometimes you need to dig deeper.

Brian
  • Thank you. Sure the necessary ports are open, as I can normally ping and trace my router from some of the networks. The problem is that there are some from which I can't. – Victor Nikityuk Feb 22 '22 at 07:11
  • It certainly is an interesting phenomenon. I have to try to find some routes that defy reason. – Brian Feb 23 '22 at 20:20
  • Thank you. It seems to depend heavily on the country. I've tried the Hurricane Electric "Looking Glass" and it seems that all of the routes from outside Russia reach this network without any problem; the bad link is between my ISP ("Beeline") and another Russian ISP, Comcor ("Akado"). I've changed the static IP address to 95.31.42.186 and got this problem again! For the first 2 days everything worked perfectly; now I have exactly the same situation: 95.31.42.187 is traced and pinged, 95.31.42.186 is not. Tomorrow I'm going to change the ISP. – Victor Nikityuk Feb 25 '22 at 06:01