Main BGP post

BGP uses attributes to determine which next hop to use for the prefixes advertised to the router. The attributes should not be considered a metric, but a checklist that the BGP router goes through. I'm saying it's not a metric because BGP doesn't calculate anything with the attributes. The attributes are (mostly) static and are either set on the local router or received from a neighbor. The BGP router goes through the list from top to bottom and compares each attribute between valid paths. Like an ACL, BGP stops at the first attribute that can decide which path to prefer, regardless of what's configured "further down". This isn't a continuous process, meaning BGP doesn't look up these attributes every single time it has to send traffic. The attributes are only looked at when BGP chooses the valid/best paths in the BGP table. So BGP does have an algorithm that runs through and compares these attributes, but it's by definition different from a metric calculation.

Route/attribute selection process

This is a 3-phase process in BGP. Bear in mind, there are actually 2 different processes, one for eBGP and one for iBGP. Overall it's the same thing, but the two may start a bit differently.

Phase 1 is when BGP has received an update containing new routes. It is now calculating the preference for each route through the attributes. It doesn't decide which route is the best here.

Phase 2. There are 3 criteria for accepting the route(s). First, the route has better attributes than what is currently installed. Second, the route doesn't exist yet. Third, if the route has multiple paths with equal attributes, a tie breaker is "held" and further attributes are checked.

Phase 3. It gets a little special here and I'm not going to get into the RFC details, because it's simply too nitpicky and kinda arbitrary. Phase 3 starts after phase 2 has completed, so it can't run while phase 2 is active. However, there are criteria that can cause phase 3 to run without phase 2 "activation", just not while phase 2 is running! It may start to make sense why BGP has a long convergence time. Phase 3 completes the BGP process and sends advertisements to neighbors.

What are the attributes

I grace thee with another fantastic diagram. A lot of explanation down below.

There are actually more attributes, like AS4_PATH is a different attribute than AS_PATH, but this is what I'll work with.

As mentioned earlier, the importance of the attributes is simply top to bottom. Before BGP starts looking at the attributes, it first checks if the route is actually valid, meaning no loop, a reachable next hop and so on.
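To make the "checklist, not metric" idea a bit more concrete, here is a minimal Python sketch. It only includes the attributes covered in this post, in the order they are covered, and it compares MED across all paths for simplicity. The Path class, the field names, LOCAL_AS and best_path are all made up for illustration; a real BGP implementation has more steps and more detail than this.

```python
from dataclasses import dataclass
from typing import Tuple

LOCAL_AS = 65001  # hypothetical local AS number


@dataclass
class Path:
    next_hop: str
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    as_path: Tuple[int, ...] = ()
    med: int = 0
    ebgp: bool = True
    next_hop_reachable: bool = True


def is_valid(path, local_as=LOCAL_AS):
    # Validity comes first: reachable next hop and no loop (own AS in AS_PATH).
    return path.next_hop_reachable and local_as not in path.as_path


# Ordered checklist. Every key is written so that "larger is better".
STEPS = [
    ("weight", lambda p: p.weight),                  # highest wins
    ("local_pref", lambda p: p.local_pref),          # highest wins
    ("locally_originated", lambda p: p.locally_originated),
    ("as_path_length", lambda p: -len(p.as_path)),   # shortest wins
    ("med", lambda p: -p.med),                       # lowest wins (simplified)
    ("ebgp_over_ibgp", lambda p: p.ebgp),
]


def best_path(paths):
    """Return (winner, name of the step that decided), or (None, None)."""
    candidates = [p for p in paths if is_valid(p)]
    if not candidates:
        return None, None
    if len(candidates) == 1:
        return candidates[0], "only valid path"
    for name, key in STEPS:
        best = max(key(p) for p in candidates)
        candidates = [p for p in candidates if key(p) == best]
        if len(candidates) == 1:
            # First attribute that can decide wins; nothing further down is read.
            return candidates[0], name
    return candidates[0], "tie"


a = Path("10.0.0.1", local_pref=200, as_path=(65002, 65003, 65004))
b = Path("10.0.0.2", local_pref=100, as_path=(65002,))
winner, step = best_path([a, b])
print(winner.next_hop, step)  # 10.0.0.1 local_pref
```

Path b has the shorter AS_PATH, but that never gets looked at because local preference already decided.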

Weight: Cisco proprietary (at least it was last I checked). It is locally significant, meaning the router doesn't advertise this attribute. Higher weight is preferred, ranging from 0 to 65535. Locally originated routes have a weight of 32768, while all other routes have a weight of 0 (default). It can be configured on either the neighbor statement or a route-map. If configured on the neighbor, every route received from that neighbor gets the weight. The route-map can be more selective and target specific routes.

Local Preference: This attribute is advertised, but only within the same AS. While weight only matters if a route hits that exact router, local preference is advertised to all iBGP neighbors and tells them that they should use this path.

Local preference can be changed either through a route-map for more specific entries, or the default value for the whole router can be changed. The default value is changed with the bgp default local-preference 200 command. Local preference uses the highest value as the best and the default value is 100. The value is 32-bit, so it can be configured quite high. It is an RFC standard, so it should be supported by all vendors.

Locally generated: This is pretty straightforward. If a route originates on the BGP router itself, it automatically gets the weight value of 32768. Maybe this shouldn't be listed as an attribute that is checked, because the path check is actually done in the weight attribute. This one simply assigns the weight to the routes. Another thing about injection of routes is that it matters how they are configured into BGP. Network command > redistribute > aggregate. Routes from the network command and aggregate are advertised with origin IGP, while redistributed routes are advertised as incomplete.

AS_PATH: The AS hop count. Every time a route advertisement leaves an AS towards an eBGP peer, that AS number is prepended to this attribute field. This also provides the simple loop prevention mechanism: a router checks the AS_PATH, and if it sees its own AS, the update is discarded. The path with the fewest AS hops is preferred. It doesn't check how many iBGP hops there are in each AS, so the shortest AS path may not be the shortest path in terms of router count.
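A small Python sketch of the prepend-and-check behaviour, treating the AS_PATH as a plain tuple of AS numbers. The function names and AS numbers are made up for illustration; this is only a model of the idea, not how any router implements it.

```python
from typing import Optional, Tuple

AsPath = Tuple[int, ...]


def send_to_ebgp_peer(local_as: int, as_path: AsPath) -> AsPath:
    """The sending AS puts its own number at the front of the path."""
    return (local_as,) + as_path


def receive_from_ebgp_peer(local_as: int, as_path: AsPath) -> Optional[AsPath]:
    """Return the path if accepted, or None if our own AS is already in it."""
    return None if local_as in as_path else as_path


# A route originates in AS 65001 and is passed 65001 -> 65002 -> 65003.
path = send_to_ebgp_peer(65001, ())
path = send_to_ebgp_peer(65002, path)
path = send_to_ebgp_peer(65003, path)
print(path)  # (65003, 65002, 65001)

# If AS 65003 now advertises it back to AS 65001, the loop check rejects it.
print(receive_from_ebgp_peer(65001, path))  # None
```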

The AS_PATH attribute can actually be ignored in the path selection process. There is a hidden command under router bgp: bgp bestpath as-path ignore. Does this mean loop protection is gone? I can't find a clear yes or no answer to this, but it makes sense if the loop protection is still there. The AS_PATH attribute is mandatory, so it has to be in the UPDATE message, which means the information to prevent loops is available. I did a quick lab:

I simply used the as-path ignore command on R2, and on R1 I advertise the 192 network while adding a route-map on R1 that prepends AS2 to the advertisement sent towards R2. R2 should drop this advertisement, because seeing its own AS in the path would mean a loop.

So, when I apply the route-map and wait a while, the route is sure enough taken out of the routing table.

It is possible to make a route less or more preferred. AS numbers can be added or removed, where prepending extra AS numbers is probably the much more likely scenario. Prepending the AS is done on eBGP peers outbound. It is even possible to prepend the same AS several times and it won't trigger any loop mechanism, because BGP routers do not care about any other AS than their own in this regard. It makes sense to use the router's own AS in the prepend, since adding another random AS might cause connectivity issues, as that AS can no longer be traversed due to loop prevention.
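Here is a small Python sketch of what prepending does to the path, again with the AS_PATH as a tuple. The prepend and would_accept helpers and the AS numbers are hypothetical and only there to illustrate the two points above: repeating your own AS just makes the path longer, while prepending someone else's AS makes that AS reject the route.

```python
def prepend(as_path, asn, times=1):
    """Put `asn` in front of the path `times` times."""
    return (asn,) * times + tuple(as_path)


def would_accept(receiver_as, as_path):
    """Loop check: a receiver only cares whether its OWN AS is in the path."""
    return receiver_as not in as_path


LOCAL_AS = 65001
normal = (LOCAL_AS,)                          # how the route normally leaves the AS
padded = prepend(normal, LOCAL_AS, times=3)   # own AS repeated three extra times

print(len(normal), len(padded))               # 1 4
print(would_accept(65002, padded))            # True: repeats of 65001 don't bother AS 65002

# Prepending a foreign AS instead: AS 65003 will now see itself and drop the route.
foreign = prepend(normal, 65003)
print(would_accept(65003, foreign))           # False
```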

A miscellaneous thing worth mentioning here is the ip as-path access-list [number] permit ^$ command. It is often referred to when wanting to block transit traffic, but it is a bit more than that. The "^$" doesn't mean "block transit", it matches an empty AS_PATH, which means routes originating in the local AS. So this ACL permits any route that originated in the AS itself. It can be used to stop the AS from advertising other routes (and thereby carrying transit traffic), but it can also be used to apply attributes to the locally originated routes.
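The "^$" part is just a regular expression that matches an empty string. A quick Python sketch of the idea, assuming the AS_PATH is written as a space-separated string of AS numbers (the helper name is made up, and router regex engines have their own extras that Python's re module doesn't share, but "^$" behaves the same):

```python
import re

# "^" = start of string, "$" = end of string, nothing in between.
LOCAL_ONLY = re.compile(r"^$")


def originated_locally(as_path: str) -> bool:
    """True when the AS_PATH is empty, i.e. the route never left our AS."""
    return LOCAL_ONLY.search(as_path) is not None


print(originated_locally(""))             # True  (locally originated route)
print(originated_locally("65002"))        # False (learned from AS 65002)
print(originated_locally("65002 65003"))  # False
```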

MED: Multi Exit Discriminator, referred to as metric in the CLI. MED works only between connected eBGP peers and iBGP peers. On the topology below, if MED was applied on R2 towards R3, nothing would happen.

I've made a small lab, which helped determine the behavior of MED propagation (illustration below).

On R1 I've created a loopback interface and applied a route-map out towards R2, setting MED to 100. R2 will receive the MED value for the prefix and apply it in the BGP table, but it won't propagate it to R3.

The MED value is only propagated to eBGP if it's applied on a connected neighbor, which'd be R2.

The eBGP neighbor will automatically propagate it within its own AS, which is R3 sending the MED value to R4. MED is a non-transitive attribute, which means it's only advertised to the eBGP peer and its AS.

MED prefers the lowest value and can be applied with a neighbor statement and route-map. The default MED value is 0, which is also the "MED not set" value for vendors like Cisco. However, some vendors use 4,294,967,295 (all 32 bits set) as the "MED not set" value. In BGP a command can be used to change this: bgp bestpath med missing-as-worst, which treats prefixes without an assigned MED as if all 32 bits were set. This value is also considered infinite and means MED won't be considered in the path selection. The detailed design implications and considerations of this are for ISPs and beyond my studies of BGP.
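The difference between the two "MED not set" conventions is easy to show in a few lines of Python. The function name and the missing_as_worst parameter are made up for this sketch; they just mirror the behaviour described above.

```python
from typing import Optional

MED_MAX = 2**32 - 1  # all 32 bits set = 4294967295


def effective_med(med: Optional[int], missing_as_worst: bool = False) -> int:
    """Return the MED value used in the comparison (lower is better)."""
    if med is not None:
        return med
    # No MED on the route: best possible (0) or worst possible (all bits set).
    return MED_MAX if missing_as_worst else 0


print(effective_med(None))                         # 0
print(effective_med(None, missing_as_worst=True))  # 4294967295
print(effective_med(50, missing_as_worst=True))    # 50
```

With the default, a route without MED beats a route with MED 50. With missing-as-worst it is the other way around.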

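MED sits below AS_PATH in the list, so it's only used when the AS_PATH length is tied, and by default it's only compared between paths received from the same neighboring AS. bgp always-compare-med does as it states: it compares the MED value for all paths, always, including paths from different ASes.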
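A rough Python sketch of when MED gets compared, assuming the default is to compare it only between paths whose neighboring AS (the first AS in the AS_PATH) is the same. MedPath, neighbor_as and prefer_by_med are hypothetical names for this illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class MedPath:
    name: str
    as_path: Tuple[int, ...]
    med: int = 0


def neighbor_as(path: MedPath) -> Optional[int]:
    """The neighboring AS is the first (leftmost) AS in the AS_PATH."""
    return path.as_path[0] if path.as_path else None


def prefer_by_med(a: MedPath, b: MedPath,
                  always_compare_med: bool = False) -> Optional[MedPath]:
    """Return the path MED prefers, or None if MED doesn't decide."""
    if not always_compare_med and neighbor_as(a) != neighbor_as(b):
        return None  # different neighboring AS: MED is skipped by default
    if a.med == b.med:
        return None  # tie: the next attribute has to decide
    return a if a.med < b.med else b  # lowest MED wins


link1 = MedPath("AS65002 link 1", (65002,), med=50)
link2 = MedPath("AS65002 link 2", (65002,), med=100)
other = MedPath("AS65003 link", (65003,), med=10)

print(prefer_by_med(link1, link2).name)                           # AS65002 link 1
print(prefer_by_med(link1, other))                                # None
print(prefer_by_med(link1, other, always_compare_med=True).name)  # AS65003 link
```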

MED is a decent place to introduce potato routing. There are hot and cold potatoes that we have to deal with. It's terminology that refers to the responsibility of routing traffic. Hot potato means the decision of where to send traffic is made once it has been received. It's a hot potato because you want to get rid of the traffic as fast as possible, to make sure you continue to have bandwidth capacity available. Cold potato is when the provider wants to keep the traffic in their own network, which is exactly what could be done when using MED. In the end it's about the economics of the ISPs, which I don't have any significant knowledge of. Packetpusher has a longer writeup on the subject.

There are some more MED features, but they are more advanced and tie better into some ISP network.

Prefer external (eBGP): What's written as eBGP > iBGP. The idea with BGP is to route traffic outside your local network. BGP will prefer a route it has received from an eBGP neighbor over one received from an iBGP neighbor. A route learned from both iBGP and eBGP means the route originates externally, and the most logical thing is to get rid of the traffic as fast as possible.

This was originally supposed to be one complete post, but I started it in July, got interrupted by my exam and now I want to write something about multicast. To be continued.