LISP – Redundancy and Load Balancing

LISP Review

In the last blog post about LISP, we configured LISP to demonstrate how it works as an overlay technology. In that topology the CE devices (CE1, CE2, and CE3) were dual-homed, but we configured LISP on only one interface. The database-mapping command indicates which local prefixes we want to register with the MSMR (in our case the map server and the map resolver are the same device). Only one locator is specified, with a priority of 10 and a weight of 10. The configuration is as follows:

CE1#sh run | section router lisp
router lisp
 database-mapping 1.1.1.1/32 192.168.11.100 priority 10 weight 10
 ipv4 itr map-resolver 100.100.100.100
 ipv4 itr
 ipv4 etr map-server 100.100.100.100 key CISCO
 ipv4 etr
 exit

CE1 has no route to the other CEs’ loopbacks:

CE1#show ip route 2.2.2.2 
% Network not in table
CE1#show ip route 3.3.3.3
% Network not in table
CE1#

The core network does not have this information either, but LISP is able to retrieve it (see the previous post on LISP):

CE1#ping 2.2.2.2 so lo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2.2.2.2, timeout is 2 seconds:
Packet sent with a source address of 1.1.1.1 
..!!!
Success rate is 60 percent (3/5), round-trip min/avg/max = 2/6/9 ms
CE1#ping 3.3.3.3 so lo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 3.3.3.3, timeout is 2 seconds:
Packet sent with a source address of 1.1.1.1 
..!!!
Success rate is 60 percent (3/5), round-trip min/avg/max = 8/11/15 ms

This information is stored in the LISP map-cache, so CE1 does not need to query the map resolver again for every packet:

CE1#show ip lisp map-cache 
LISP IPv4 Mapping Cache for EID-table default (IID 0), 3 entries
0.0.0.0/0, uptime: 00:07:17, expires: never, via static send map-request
 Negative cache entry, action: send-map-request
2.2.2.2/32, uptime: 00:01:21, expires: 23:58:38, via map-reply, complete
 Locator Uptime State Pri/Wgt
 192.168.21.100 00:01:21 up 10/10 
3.3.3.3/32, uptime: 00:01:11, expires: 23:58:48, via map-reply, complete
 Locator Uptime State Pri/Wgt
 192.168.33.100 00:01:11 up 10/10
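
To make the cache behaviour concrete, here is a minimal Python sketch of a longest-prefix lookup against the three entries shown above. The dictionary layout and the lookup function are hypothetical illustrations, not how IOS implements its map-cache.

```python
import ipaddress

# Entries copied from the map-cache output above.
# Storing them in a plain dict is an assumption made for this sketch.
MAP_CACHE = {
    ipaddress.ip_network("0.0.0.0/0"): "send-map-request",
    ipaddress.ip_network("2.2.2.2/32"): "192.168.21.100",
    ipaddress.ip_network("3.3.3.3/32"): "192.168.33.100",
}


def lookup(cache, destination):
    """Return the value of the most specific cache entry covering destination.

    Assumes the cache always contains a 0.0.0.0/0 entry, as in the output above.
    """
    address = ipaddress.ip_address(destination)
    matches = [network for network in cache if address in network]
    best = max(matches, key=lambda network: network.prefixlen)
    return cache[best]


print(lookup(MAP_CACHE, "2.2.2.2"))  # cached locator for CE2
print(lookup(MAP_CACHE, "4.4.4.4"))  # falls back to the 0.0.0.0/0 entry
```

A destination without a specific entry falls back to the 0.0.0.0/0 entry, whose action is to send a map-request. That is consistent with the first pings being lost while the mapping is resolved.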

As you can see, only one locator is cached for each prefix. There is no redundancy, so we are at risk here. Let’s see how LISP can solve this issue.

LISP – Redundancy

On each device we tell the MSMR which local prefixes are available and through which interfaces they can be reached. If another interface can be used to reach these prefixes, we simply add a second database-mapping entry. For example, on CE1 we also want interface G0/2 (192.168.12.100) to be used:

CE1#sh run | sec router lisp
router lisp
 database-mapping 1.1.1.1/32 192.168.11.100 priority 10 weight 10
 database-mapping 1.1.1.1/32 192.168.12.100 priority 10 weight 10

The MSMR should now have this information available:

MSMR#sh lisp site name CE1 
Site name: CE1
Allowed configured locators: any
Allowed EID-prefixes:
  EID-prefix: 1.1.1.1/32 
    First registered:     00:52:08
    Routing table tag:    0
    Origin:               Configuration
    Merge active:         No
    Proxy reply:          No
    TTL:                  1d00h
    State:                complete
    Registration errors:  
      Authentication failures:   0
      Allowed locators mismatch: 0
    ETR 192.168.11.100, last registered 00:00:52, no proxy-reply, map-notify
                        TTL 1d00h, no merge, hash-function sha1, nonce 0x01847332-0xFFD41BBB
                        state complete, no security-capability
                        xTR-ID 0xE31A0AA7-0xB472F34C-0x58BDB63A-0xA2981FB1
                        site-ID unspecified
      Locator         Local  State      Pri/Wgt  Scope
      192.168.11.100  yes    up          10/10   IPv4 none
      192.168.12.100  yes    up          10/10   IPv4 none

If we don’t have access to the MSMR, we can check the status directly on the xTR:

CE1#lig self ipv4 
Mapping information for EID 1.1.1.1 from 192.168.12.100 with RTT 17 msecs
1.1.1.1/32, uptime: 00:15:52, expires: 23:59:59, via map-reply, self, complete
  Locator         Uptime    State      Pri/Wgt
  192.168.11.100  00:15:52  up, self    10/10 
  192.168.12.100  00:15:52  up, self    10/10

This output shows how the MSMR sees us. We now need a mechanism that informs the other xTRs when an interface goes down. Without one, they keep using the cached locator until the map-cache entry expires, which can take up to 24 hours, as the expires value in the previous output shows (23:59:59). The method we can use is RLOC probing (loc-reach-algorithm rloc-probing). Let’s configure it on both CE1 and CE2:

CE1#show run | sec router lisp
router lisp
 loc-reach-algorithm rloc-probing
 database-mapping 1.1.1.1/32 192.168.11.100 priority 10 weight 10
 database-mapping 1.1.1.1/32 192.168.12.100 priority 10 weight 10
 ipv4 itr map-resolver 100.100.100.100
 ipv4 itr
 ipv4 etr map-server 100.100.100.100 key CISCO
 ipv4 etr
 exit
CE2#sh run | sec router lisp
router lisp
 loc-reach-algorithm rloc-probing
 database-mapping 2.2.2.2/32 192.168.21.100 priority 10 weight 10
 ipv4 itr map-resolver 100.100.100.100
 ipv4 itr
 ipv4 etr map-server 100.100.100.100 key CISCO
 ipv4 etr
 exit

Now let’s ping, identify which interface is used, and then shut it down:

CE1#sh int | i is up|packets input
     0 packets input, 0 bytes, 0 no buffer
GigabitEthernet0/1 is up, line protocol is up 
     1146 packets input, 171494 bytes, 0 no buffer
GigabitEthernet0/2 is up, line protocol is up 
     61 packets input, 8976 bytes, 0 no buffer
LISP0 is up, line protocol is up 
     1180 packets input, 118000 bytes, 0 no buffer
Loopback0 is up, line protocol is up 
     0 packets input, 0 bytes, 0 no buffer

G0/1 on CE1 receives the traffic, so that is the interface we shut down. Here is the result on CE2:

CE2#ping 1.1.1.1 so lo0 rep 10000
Type escape sequence to abort.
Sending 10000, 100-byte ICMP Echos to 1.1.1.1, timeout is 2 seconds:
Packet sent with a source address of 2.2.2.2 
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
<snip>
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!.............................
....................!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!.
Success rate is 98 percent (3310/3360), round-trip min/avg/max = 1/8/49 ms
CE2#

The redundancy mechanism is not very fast: we lost 50 pings before convergence, which at a 2-second timeout per ping is roughly 100 seconds. We can speed up convergence by playing with the priority value. The priority tells LISP which interface is the primary one, and a lower value is preferred. Let’s give G0/2 (192.168.12.100) a better priority than G0/1 (192.168.11.100) by raising the priority value of 192.168.11.100 to 20:

CE1#lig self ipv4
Mapping information for EID 1.1.1.1 from 192.168.11.100 with RTT 73 msecs
1.1.1.1/32, uptime: 00:00:00, expires: 23:59:59, via map-reply, self, complete
  Locator         Uptime    State      Pri/Wgt
  192.168.11.100  00:00:00  up, self    20/10 
  192.168.12.100  00:00:00  up, self    10/10

CE2 should now be aware of which interface CE1 prefers:

CE2#sh ip lisp map-cache 
LISP IPv4 Mapping Cache for EID-table default (IID 0), 2 entries
0.0.0.0/0, uptime: 00:13:51, expires: never, via static send map-request
  Negative cache entry, action: send-map-request
1.1.1.1/32, uptime: 00:00:09, expires: 23:59:50, via map-reply, complete
  Locator         Uptime    State      Pri/Wgt
  192.168.11.100  00:00:09  up          20/10 
  192.168.12.100  00:00:09  up          10/10
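
The selection rule behind this map-cache entry can be sketched in a few lines of Python. This is an illustrative model only: the Locator class and the usable_locators function are hypothetical names, and the sketch ignores details such as the special priority value 255.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locator:
    rloc: str
    priority: int
    weight: int
    up: bool = True


def usable_locators(locators):
    """Return the reachable locators that share the best (lowest) priority."""
    reachable = [loc for loc in locators if loc.up]
    if not reachable:
        return []
    best = min(loc.priority for loc in reachable)
    return [loc for loc in reachable if loc.priority == best]


# Values taken from the CE2 map-cache output above.
ce1 = [
    Locator("192.168.11.100", priority=20, weight=10),
    Locator("192.168.12.100", priority=10, weight=10),
]
print([loc.rloc for loc in usable_locators(ce1)])

# If the preferred locator is reported down, the backup takes over.
ce1_after_failure = [
    Locator("192.168.11.100", priority=20, weight=10),
    Locator("192.168.12.100", priority=10, weight=10, up=False),
]
print([loc.rloc for loc in usable_locators(ce1_after_failure)])
```

With the values above, only 192.168.12.100 is selected while it is up, and 192.168.11.100 is used once the preferred locator is marked down.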

From a forwarding perspective, G0/2 is now preferred. This shows in the interface counters after a series of pings:

CE1#sh int | i is up|packets input
     0 packets input, 0 bytes, 0 no buffer
GigabitEthernet0/1 is up, line protocol is up 
     6032 packets input, 900929 bytes, 0 no buffer
GigabitEthernet0/2 is up, line protocol is up 
     17599 packets input, 2467156 bytes, 0 no buffer
LISP0 is up, line protocol is up 
     18132 packets input, 1813200 bytes, 0 no buffer
Loopback0 is up, line protocol is up 
     0 packets input, 0 bytes, 0 no buffer

It can also be seen on the remote router in the LISP forwarding table:

CE2#sh ip lisp forwarding eid remote 1.1.1.1
Prefix                 Fwd action  Locator status bits
1.1.1.1/32             encap       0x00000003
  packets/bytes       0/0
  path list 0EB0DD4C, 3 locks, per-destination, flags 0x49 [shble, rif, hwcn]
    ifnums:
      LISP0(9): 192.168.12.100
    1 path
      path 0EB0C9AC, share 10/10, type attached nexthop, for IPv4
        nexthop 192.168.12.100 LISP0, IP midchain out of LISP0, addr 192.168.12.100 0EB0E188
    1 output chain
      chain[0]: IP midchain out of LISP0, addr 192.168.12.100 0EB0E188
                IP adj out of GigabitEthernet0/1, addr 192.168.22.2 0DBCF178

LISP – Load Balancing

LISP can also do load balancing. The configured priority must be the same on both locators for this to work; the weight then determines the share of traffic each interface should receive. The earlier configuration did not show any load balancing even though both interfaces had the same priority. This is because LISP load-balances per destination by default, and our test uses a single source and destination pair. This behavior can be changed to per-packet load balancing. On CE1 we first go back to the same priority and weight on both interfaces. Once that is done, we change the load-sharing scheme on CE2:

CE2#sh run int lisp
Building configuration...
Current configuration : 51 bytes
!
interface LISP0
 ip load-sharing per-packet
end

Next we clear the counters on CE1 for a fresh start and ping again from CE2’s loopback to CE1’s loopback:

CE2#ping 1.1.1.1 so lo0 rep 1000
Type escape sequence to abort.
Sending 1000, 100-byte ICMP Echos to 1.1.1.1, timeout is 2 seconds:
Packet sent with a source address of 2.2.2.2 
.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
<snip>
Success rate is 99 percent (999/1000), round-trip min/avg/max = 1/7/41 ms

See the result on CE1: perfect load balancing 🙂

CE1#sh inter | in is up|packets input
     0 packets input, 0 bytes, 0 no buffer
GigabitEthernet0/1 is up, line protocol is up 
     504 packets input, 75597 bytes, 0 no buffer
GigabitEthernet0/2 is up, line protocol is up 
     504 packets input, 75597 bytes, 0 no buffer
LISP0 is up, line protocol is up 
     1000 packets input, 100000 bytes, 0 no buffer
Loopback0 is up, line protocol is up 
     0 packets input, 0 bytes, 0 no buffer
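
The difference between the two load-sharing modes can be sketched as follows. This is a simplified model, not the IOS implementation: the CRC32 hash stands in for whatever flow hash the router really uses, and the function names are made up for the example.

```python
import itertools
import zlib
from collections import Counter

LOCATORS = ["192.168.11.100", "192.168.12.100"]


def per_destination(packets, locators):
    """Pick a locator from a hash of the flow, so one flow always uses one path.

    CRC32 is only a stand-in for the real (unknown here) hash function.
    """
    counts = Counter()
    for src, dst in packets:
        index = zlib.crc32(f"{src}->{dst}".encode()) % len(locators)
        counts[locators[index]] += 1
    return counts


def per_packet(packets, locators):
    """Alternate between locators packet by packet, regardless of the flow."""
    counts = Counter()
    round_robin = itertools.cycle(locators)
    for _ in packets:
        counts[next(round_robin)] += 1
    return counts


# 1000 pings from CE2's loopback to CE1's loopback: a single flow.
pings = [("2.2.2.2", "1.1.1.1")] * 1000
print(per_destination(pings, LOCATORS))
print(per_packet(pings, LOCATORS))
```

In this model a single flow puts all 1000 packets on one locator in per-destination mode, while per-packet mode splits them 500/500, which matches the even counters seen on G0/1 and G0/2.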

Now, just as a test, let’s modify the weights to get a 4:1 ratio on the incoming traffic:

CE1#sh run | s router lisp
router lisp
 loc-reach-algorithm rloc-probing
 database-mapping 1.1.1.1/32 192.168.11.100 priority 10 weight 100
 database-mapping 1.1.1.1/32 192.168.12.100 priority 10 weight 25
 ipv4 itr map-resolver 100.100.100.100
 ipv4 itr
 ipv4 etr map-server 100.100.100.100 key CISCO
 ipv4 etr
 exit

After a ping, let’s check the packet input counters:

CE1#sh inter | in is up|packets input
     0 packets input, 0 bytes, 0 no buffer
GigabitEthernet0/1 is up, line protocol is up 
     804 packets input, 120488 bytes, 0 no buffer
GigabitEthernet0/2 is up, line protocol is up 
     203 packets input, 30406 bytes, 0 no buffer
LISP0 is up, line protocol is up 
     1000 packets input, 100000 bytes, 0 no buffer
Loopback0 is up, line protocol is up 
     0 packets input, 0 bytes, 0 no buffer
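
As a quick sanity check, the split we should expect from weights 100 and 25 can be computed directly. The function below is a hypothetical helper written for this post; it only does the proportional arithmetic.

```python
def expected_split(weights, total_packets):
    """Distribute total_packets across locators in proportion to their weights."""
    total_weight = sum(weights.values())
    return {
        rloc: total_packets * weight / total_weight
        for rloc, weight in weights.items()
    }


# Weights from the CE1 configuration above, applied to a 1000-packet ping.
split = expected_split({"192.168.11.100": 100, "192.168.12.100": 25}, 1000)
for rloc, packets in split.items():
    print(rloc, packets)
```

The weights give 80% and 20% of the traffic, so about 800 and 200 packets out of 1000. The counters show 804 and 203, which is close to that ratio; the few extra packets are presumably other traffic received on those interfaces.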

Done!
