A routing loop occurs when a packet keeps being forwarded between two or more routers because of incorrect entries in their routing tables. In the case of distance vector protocols, the fact that these protocols route by rumor and have a slow convergence time can cause routing loops.
To understand how routing loops can occur with distance vector protocols, consider the network shown in Figure 4-4.
Figure 4-4 Routing Loops
When converged, all the routers in the network shown above will know about the 192.168.5.0/24 network. If RouterD loses connectivity to 192.168.5.0/24, it will remove the route to that network from its routing table. When RouterC receives the next periodic update from RouterD, it will learn that the route to 192.168.5.0/24 is lost and will remove it from its routing table. At this stage, RouterA and RouterB still think that 192.168.5.0/24 is reachable via RouterC.
If RouterB sends its own update while RouterC is still waiting to send out its periodic update, that update will contain 192.168.5.0/24 as a destination network. Since RouterC no longer has that network in its routing table, it will assume that this is a new destination that RouterB knows about and will install a route to that network pointing towards RouterB. After this, the periodic update from RouterC will contain the 192.168.5.0/24 network, and RouterB will assume that RouterC can reach every network contained in that update, including 192.168.5.0/24.
Now when RouterB receives a packet destined to 192.168.5.0/24, it will forward it to RouterC. When RouterC receives that packet, it will see that 192.168.5.0/24 is towards RouterB and will send it back. This loop will continue until the IP TTL value in the packet header reaches zero and one of the routers drops it.
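The forwarding loop described above can be sketched in a few lines of Python. This is only an illustration: the next_hop table and the forward function are hypothetical names, and the starting TTL of 4 is chosen just to keep the trace short.

```python
# Hypothetical next-hop entries for 192.168.5.0/24 after the bad update.
next_hop = {
    "RouterB": "RouterC",  # RouterB still points at RouterC
    "RouterC": "RouterB",  # RouterC installed the stale route via RouterB
}

def forward(start, ttl):
    """Follow next hops until the TTL expires.

    Returns the list of routers visited and the router that drops the packet.
    """
    path = [start]
    current = start
    while True:
        ttl -= 1                  # each router decrements the TTL before forwarding
        if ttl <= 0:
            return path, current  # TTL expired: this router drops the packet
        current = next_hop[current]
        path.append(current)

path, dropped_by = forward("RouterB", ttl=4)
print(" -> ".join(path), "| dropped by", dropped_by)
```

With a realistic starting TTL such as 64, the packet would bounce between the two routers dozens of times before being dropped.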
To prevent such routing loops, distance vector protocols have some checks in place. These checks are discussed in the following sections.
Maximum Hop Count
Without checks in place, the wrong routing information can spread throughout the network. To prevent this, protocols such as RIP have a maximum hop count. For RIP this value is set to 15. Any route with more than the maximum hop count is deemed unreachable and will not be used. In the above scenario, the original hop count of 192.168.5.0/24 on RouterB was 2. After RouterD lost the connectivity and RouterC learned the wrong information from RouterB, RouterC would see 192.168.5.0/24 at a hop count of 3. When RouterB gets this update back from RouterC, it will add 1 to the hop count and make it 4. Without a maximum hop count in place, this cycle would go on indefinitely. This phenomenon is called counting to infinity. With a maximum hop count in place, the increasing hop count eventually exceeds the maximum, so the route is deemed unreachable and is removed from the routing table, which resolves the loop.
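The counting behavior can be illustrated with a small Python sketch. The names RIP_INFINITY and count_to_infinity are hypothetical, and the model is simplified: it only tracks the hop count as the stale route is re-advertised back and forth, and stops once the count reaches 16, the value RIP treats as unreachable.

```python
RIP_INFINITY = 16  # one more than RIP's maximum usable hop count of 15

def count_to_infinity(start_hops, infinity=RIP_INFINITY):
    """Return the hop counts seen as a stale route bounces between two routers."""
    hops = start_hops
    seen = [hops]
    while hops < infinity:
        hops += 1          # each re-advertisement adds one hop
        seen.append(hops)
    return seen            # the last value marks the route as unreachable

history = count_to_infinity(2)
print(history)
```

Starting from the hop count of 2 on RouterB, the count climbs one step per exchanged update until it hits 16, at which point the route is discarded instead of growing forever.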
Split Horizon
The split horizon rule states that routing information learned on an interface cannot be advertised back out of that same interface. With this rule in place in the above scenario, RouterB would never have advertised the 192.168.5.0/24 network back to RouterC, since that is where RouterB learned the route. Hence a routing loop would never occur. By default split horizon is enabled for RIP and EIGRP.
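As a rough sketch, split horizon amounts to a filter applied when an update is built for a given interface. The build_update function, the interface labels, and the 192.168.1.0/24 entry below are assumptions made up for illustration; they are not taken from the figure.

```python
def build_update(routing_table, out_interface):
    """Return the routes that may be advertised out of out_interface.

    routing_table maps prefix -> (hop count, interface the route was learned on).
    """
    return {
        prefix: hops
        for prefix, (hops, learned_on) in routing_table.items()
        if learned_on != out_interface  # split horizon: never send a route back
    }

# Hypothetical view of RouterB's table.
router_b = {
    "192.168.5.0/24": (2, "to_RouterC"),  # learned from RouterC
    "192.168.1.0/24": (1, "to_RouterA"),  # assumed network learned from RouterA
}

update_for_c = build_update(router_b, "to_RouterC")
update_for_a = build_update(router_b, "to_RouterA")
print(update_for_c)
print(update_for_a)
```

The update sent towards RouterC omits 192.168.5.0/24, so RouterC never hears its own lost route echoed back.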
Route Poisoning
Route poisoning uses the maximum hop count to stop routing loops. When a router loses a route, it advertises that route with a hop count greater than the maximum hop count. The receiving router now finds the destination network unreachable and advertises it onward as such. It also sends the update back towards the source router to ensure that the route is now poisoned in the entire network. This process is called poison reverse.
In the above network, when RouterD loses 192.168.5.0/24, it would advertise the route to RouterC with a hop count greater than the maximum hop count. RouterC in turn will update RouterB. This is the route poisoning process. RouterC also sends the poisoned route back to RouterD to ensure that the whole network is in sync. This is the poison reverse process.
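The exchange between RouterD and RouterC can be sketched as follows. This is a simplified model under stated assumptions: the receive function and the table layout are hypothetical, and INFINITY is set to 16 to match RIP's unreachable metric.

```python
INFINITY = 16  # a hop count above RIP's maximum of 15 means "unreachable"

def receive(table, prefix, hops, neighbor):
    """Apply one advertised route to table (prefix -> (hops, next hop)).

    Returns the metric to echo back to the neighbor (poison reverse), or None.
    """
    current = table.get(prefix)
    if hops >= INFINITY:
        if current is not None and current[1] == neighbor:
            table[prefix] = (INFINITY, neighbor)  # mark the route as poisoned
            return INFINITY                       # poison reverse to the sender
        return None
    new_hops = hops + 1                           # receiver adds one hop
    if current is None or new_hops < current[0] or current[1] == neighbor:
        table[prefix] = (new_hops, neighbor)
    return None

# RouterC currently reaches the network through RouterD at one hop.
router_c = {"192.168.5.0/24": (1, "RouterD")}

# RouterD loses the network and advertises it with an unreachable metric.
reply = receive(router_c, "192.168.5.0/24", INFINITY, "RouterD")
print(router_c, reply)
```

RouterC would then advertise the same unreachable metric to RouterB, which processes it in the same way.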
Hold Downs
Routing protocols implement timers to allow lost routes to recover or to switch to the next best route to the same destination. These timers are called hold down timers. They are typically useful in the case of links going down and coming back up rapidly (this is called flapping). One such route going in and out of the routing table can cause loops and stop the network from converging. Hold down timers also prevent changes that affect a route that was recently lost.
In the above example, a hold down timer would have prevented the update from RouterB from affecting RouterC immediately after the route to 192.168.5.0/24 was lost. In the meantime, RouterC would have updated RouterB about the lost route.
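A hold down timer can be modeled roughly as shown below. The HoldDownTable class is a hypothetical sketch, the 180-second value is an assumed default, and time is passed in explicitly as a number of seconds so the example stays deterministic. Real implementations are more nuanced; this version simply ignores every update for a prefix while its timer is running.

```python
HOLD_DOWN_SECONDS = 180  # assumed value for illustration

class HoldDownTable:
    """Routing table that ignores updates for recently lost prefixes."""

    def __init__(self, hold_down=HOLD_DOWN_SECONDS):
        self.hold_down = hold_down
        self.routes = {}      # prefix -> (hops, next hop)
        self.held_until = {}  # prefix -> time at which the hold down ends

    def lose_route(self, prefix, now):
        """Remove a route and start its hold down timer."""
        self.routes.pop(prefix, None)
        self.held_until[prefix] = now + self.hold_down

    def receive(self, prefix, hops, neighbor, now):
        """Install an advertised route unless the prefix is held down."""
        if now < self.held_until.get(prefix, 0):
            return False                  # still in hold down: ignore the update
        self.held_until.pop(prefix, None)
        self.routes[prefix] = (hops + 1, neighbor)
        return True

router_c = HoldDownTable()
router_c.lose_route("192.168.5.0/24", now=0)

# RouterB's stale update arrives 10 seconds later and is ignored.
accepted_early = router_c.receive("192.168.5.0/24", 2, "RouterB", now=10)

# An update arriving after the timer has expired is accepted.
accepted_late = router_c.receive("192.168.5.0/24", 2, "RouterB", now=200)
print(accepted_early, accepted_late)
```

In the scenario above, the ignored update gives RouterC time to tell RouterB that the route is gone before any stale information can be installed.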