BGP in a nutshell: operational model, attributes and route map technologies explained

BGP's operational model is quite different from an IGP's: IGPs operate within an autonomous system, while EGPs operate between autonomous systems. BGP is used to exchange routing information for the Internet and is the protocol used mainly between Internet service providers (ISPs). BGP is also commonly used by enterprises that require redundancy and load balancing for the networks they advertise to the outside world. For enterprise customers, the primary reason to employ BGP is to avoid single points of failure: should one ISP fail, the networks hosted by that enterprise will still be reachable through the secondary ISP.

There are four connection redundancy types commonly referenced in BGP: single-homed, dual-homed, multi-homed and dual-multi-homed.
 
There is no point in using BGP if a customer has only one exit point out of their network; a single connection to the Internet or another autonomous system achieves neither redundancy nor load balancing. One of the most common configurations among enterprises and small ISPs is multihoming. Keeping in mind what we discussed above, BGP can be configured in three different styles: default routes, partial routes and full routes.

So what are these default, partial and full routes? If an organization has determined that it will perform multihoming with BGP, there are three ways to do this. Default routes are used by customers who do not require BGP for path manipulation or do not wish to become a transit AS.
Second in line is partial updates plus default routes, meaning each ISP passes only a default route and provider-owned specific routes to the customer's AS.
Last in line is the full routing table: the entire Internet routing table is exchanged by the ISP with the customer's AS. This option is beneficial for enterprises and ISPs that require more granular control over path manipulation and bandwidth usage.
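As a minimal sketch of the first style, a customer that only wants a default route from its provider could filter everything else inbound. AS numbers, addresses and list names here are hypothetical:

```
! Hypothetical customer router in AS 65001 peering with ISP-A at 10.1.1.1 in AS 65100.
router bgp 65001
 neighbor 10.1.1.1 remote-as 65100
 ! Accept nothing but 0.0.0.0/0 from this peer.
 neighbor 10.1.1.1 prefix-list DEFAULT-ONLY in
!
ip prefix-list DEFAULT-ONLY seq 5 permit 0.0.0.0/0
```

The same prefix-list technique extends naturally to the partial-routes style: permit the provider-owned specifics in addition to the default route.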

So far we have discussed the need for and use of BGP along with common deployment scenarios; now we will discuss BGP neighbor relationship formation and the algorithm BGP uses to select the best path.

BGP neighbor formation starts with two BGP speakers forming a TCP connection (using port 179). Each side sends an OPEN message, which contains the parameters needed for the BGP connection.
A KEEPALIVE message is then sent to confirm the connection. UPDATE messages are then exchanged between the BGP speakers to share routing information; UPDATE messages contain the path attributes used to make routing decisions.
It is important to understand that BGP running between routers in different autonomous systems is called EBGP, while BGP running between routers in the same AS is called IBGP.
It is also important to note that IBGP implementation and configuration requirements differ depending on each customer's needs. Customers running a transit AS must make sure that all routers in the transit AS have complete knowledge of external routes; this can be done by redistributing BGP routes into the IGP (not recommended) or by configuring a full-mesh IBGP network.
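A full-mesh IBGP sketch for a hypothetical three-router transit AS 65001 (router names, loopback addresses and AS number are illustrative); the same pattern is repeated on each router so that every IBGP speaker peers with every other:

```
! On R1 (Loopback0 = 1.1.1.1); peers are R2 (2.2.2.2) and R3 (3.3.3.3).
router bgp 65001
 neighbor 2.2.2.2 remote-as 65001
 neighbor 2.2.2.2 update-source Loopback0
 neighbor 3.3.3.3 remote-as 65001
 neighbor 3.3.3.3 update-source Loopback0
```

Peering between loopbacks (rather than physical interfaces) is the usual practice, so the IBGP session survives the failure of any single physical link as long as the IGP can still reach the loopback.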

It's time we discuss one important rule: BGP synchronization. The rule states: do not use or advertise a route learned via IBGP until the same route has also been learned from the IGP.

What this means is that even if you have a full-mesh IBGP topology, in which case you do not need an IGP to carry the routes in the first place, routes from IBGP peers will not be used or advertised while synchronization is turned on.
So, in order to make a fully meshed IBGP topology work, you will need to disable BGP synchronization.
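On Cisco IOS this is a one-liner under the BGP process (the AS number is illustrative; note that recent IOS releases already disable synchronization by default):

```
router bgp 65001
 ! Do not wait for the IGP to carry a route before using or advertising it.
 no synchronization
```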

Synchronization was originally intended for cases where the number of BGP routes was small enough that they could be redistributed into an IGP running in the autonomous system; IBGP would then not be needed on every router in the transit path. Synchronization was needed to make sure that packets did not get dropped inside the transit AS by a router that had no knowledge of the external routes.

So how do path attributes and route maps help in the route-selection decision process?
As we discussed earlier, UPDATE messages contain the path attributes used to make routing decisions, and multiple paths might exist to reach a given network. The BGP selection process eliminates paths until a single best path is left. However, using default settings for path selection, BGP might cause uneven use of bandwidth; this is where route maps can be useful. In BGP, route maps are used to control which routes are allowed to flow into and out of the BGP process, which is done by assigning a route map to a specific BGP session. In addition, route maps can be used to manipulate path attributes as well as to filter routes.
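As an illustrative sketch, a route map attached to a BGP session can permit only selected provider routes inbound (neighbor address, list names and the prefix are hypothetical):

```
ip prefix-list ISP-A-ROUTES seq 5 permit 198.51.100.0/24
!
route-map FROM-ISP-A permit 10
 match ip address prefix-list ISP-A-ROUTES
! The implicit deny at the end of the route map filters everything else.
!
router bgp 65001
 neighbor 10.1.1.1 route-map FROM-ISP-A in
```

Adding `set` clauses to the same route map is how path attributes (local preference, MED, AS-path and so on) get manipulated on that session.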

Path attributes fall into four categories:

Well-known mandatory attributes
·         AS-path
·         Next hop
·         Origin
Well-known discretionary attributes
·         Local preference
·         Atomic aggregate
Optional transitive attributes
·         Aggregator
·         Community
Optional non-transitive attribute
·         Multi-exit discriminator (MED)


From the above, we will discuss some key attributes that can be used to control traffic flow.

Weight:
Cisco proprietary. Weight is local to the router on which it is configured and is not advertised to any peers; the path with the highest weight is preferred.

The Next Hop Attribute
BGP calculates its paths with a hop-count-like method, similar to RIP, the difference being that instead of routers it counts autonomous systems. It uses the next-hop attribute (which carries the next-hop IP address) to reach a destination. One important thing to note here is that when routes are passed between IBGP peers, next-hop processing is NOT done: the next hop stays the external address, which an internal router may be unable to reach, so these routes will not be considered "best routes" in the BGP table and will not populate the routing table. This is another place where route maps are useful: you can use the "set ip next-hop" clause in a route map to prevent this issue, or use the common neighbor next-hop-self option.
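The usual fix, sketched here with illustrative AS numbers and addresses, is to have the border router rewrite the next hop to itself when passing routes to its IBGP peers:

```
router bgp 65001
 neighbor 2.2.2.2 remote-as 65001
 ! Advertise EBGP-learned routes to this IBGP peer with our own address as next hop.
 neighbor 2.2.2.2 next-hop-self
```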

Local Pref
Local preference is exchanged between BGP speakers within their own AS. It is used to influence how traffic will flow from one AS to another when multiple paths exist: routes with the higher local preference are used by BGP speakers, and if multiple routes have the same preference, the route that was originated by the local router is used. A route map can be used to change the local preference of paths for better load balancing should the need arise.
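A sketch of raising local preference for routes learned from one provider (names, the neighbor address and the value 200 are illustrative; the IOS default is 100):

```
route-map PREFER-ISP-A permit 10
 set local-preference 200
!
router bgp 65001
 ! Routes from this peer now win over same-prefix routes with the default preference.
 neighbor 10.1.1.1 route-map PREFER-ISP-A in
```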

AS Path
The AS-path identifies all autonomous systems that a route has traversed to reach a destination. When a BGP speaker forwards routing information to a peer in a separate AS, it attaches its own AS number to the beginning of the AS_PATH. In the tie-breaking process of best-route selection, the path with the shortest AS-path is preferred.
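Because shorter AS-paths win, a common way to make one of your links less attractive to the outside world is AS-path prepending; a minimal sketch (AS numbers and addresses illustrative):

```
route-map PREPEND permit 10
 ! Artificially lengthen the AS-path seen by this neighbor.
 set as-path prepend 65001 65001
!
router bgp 65001
 neighbor 10.1.1.1 route-map PREPEND out
```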

Origin
It simply defines the origin of the path. There are three types: IGP, EGP and incomplete; incomplete means the route origin is unknown. In the tie-breaking process, if the length of the AS-path is the same, the path with the lowest origin code is used (IGP < EGP < incomplete).

MED
When there are multiple exit/entry points into the same neighboring AS, MED is used to tell that AS which entry point it should prefer. Unlike local preference, the MED is exchanged between autonomous systems, and a lower MED is given preference over a higher one. In the tie-breaking process, if all origin codes are the same, BGP prefers the path with the lowest MED.
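A sketch of setting the MED (called "metric" in IOS) on advertisements towards one provider link (the value and address are illustrative):

```
route-map SET-MED permit 10
 ! Lower MED = preferred entry point from the neighboring AS's perspective.
 set metric 50
!
router bgp 65001
 neighbor 10.1.1.1 route-map SET-MED out
```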

These were some of the attributes used by BGP to make routing decisions. All in all, BGP provides a high degree of control and flexibility for inter-domain routing while enforcing policy and performance constraints and avoiding routing loops.


Thanks
Huzeifa Bhai

OSPF -- four data structures used to store information in an OSPF based infrastructure.

OSPF stores its operational data, configured parameters, and statistics in four main data structures:

Interface table: This table lists all interfaces that have been enabled for OSPF. The directly connected networks on these interfaces are advertised in the router's type 1 LSA in the link-state database. When an interface is configured as a passive interface, it is still listed in the OSPF interface table, but no neighbor relationships are established on this interface.



Neighbor table: This table is used to keep track of all active OSPF neighbors. Neighbors are added to this table based on the reception of Hello packets, and they are removed when the OSPF dead time for a neighbor expires or when the associated interface goes down. OSPF goes through a number of states while establishing a neighbor relationship (also known as adjacency), and the neighbor table lists the current state for each individual neighbor.

Link-state database: This is the main data structure that OSPF uses to store all its network topology information. This database contains full topology information for the areas that a router connects to, and information about the paths available to reach networks and subnets in other areas or other autonomous systems. Because this database contains a wealth of network topology information, it is one of the most important data structures to gather information from when troubleshooting OSPF problems.

Routing Information Base: After executing the SPF algorithm, the results of this calculation are stored in the RIB. This information includes the best routes to each individual prefix in the OSPF network with their associated path costs. When the information in the link-state database changes, only a partial recalculation might be necessary (depending on the nature of the change), and routes might be added to or deleted from the RIB without the need for a full SPF recalculation. From the RIB, OSPF offers its routes to the IP routing table.




OSPF Information Flow Within an Area
OSPF discovers neighbors through the transmission of periodic Hello packets. Two routers will become neighbors only if the following parameters match in the Hello packets:
·         Hello and dead timers: Two routers will only become neighbors if they use the same Hello and dead time. The default values for broadcast and point-to-point type networks are 10-second Hello and 40-second dead time. If these timers are changed on an interface of a router, the timers should be configured to match on all neighboring routers on that interface.
·         OSPF area number: Two routers will become neighbors on a link only if they both consider that link to be in the same area.
·         OSPF area type: Two routers will become neighbors only if they both consider the area to be the same type of area (normal, stub, or not-so-stubby area [NSSA]).
·         IP subnet and subnet mask: Two routers will not become neighbors if they are not on the same subnet. The exception to this rule is on a point-to-point link, where the subnet mask is not verified.
·         Authentication type and authentication data: Two routers will become neighbors only if they both use the same authentication type (null, clear text, or message digest 5 [MD5]). If they use authentication, the authentication data (password or hash value) also needs to match.
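The Hello and dead timer values above can be set per interface; a quick sketch (the interface name is illustrative), remembering that the values must match on all neighbors on the segment:

```
interface GigabitEthernet0/1
 ip ospf hello-interval 10
 ip ospf dead-interval 40
```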


show ip ospf interface: This command is used to display the interfaces that have been activated for OSPF. This list contains all the interfaces that have an IP address that is covered by one of the network statements under the OSPF configuration. This command displays a lot of detailed information for each interface. For a brief overview, issue the command show ip ospf interface brief.

·         show ip ospf neighbor: This command lists all neighbors that have been discovered by this router on its active OSPF interfaces and shows their current state.
·         show ip ospf database: This command displays the content of the OSPF link-state database. When the command is issued without any additional options, it will display a summary of the database, listing only the LSA headers. Using additional command options, specific LSAs can be selected, and the actual LSA content can be inspected.
·         show ip ospf statistics: This command can be used to view how often and when the SPF algorithm was last executed. This command can be helpful when diagnosing routing instability.
The following debug commands enable you to observe the transmission and reception of packets and the exchange of routing information:
·         debug ip routing: This command is not specific to the OSPF protocol, but displays any changes that are made to the routing table, such as installation or removal of routes. This can prove useful when troubleshooting routing protocol instabilities.
·         debug ip ospf packet: This command displays the transmission and reception of OSPF packets. Only the packet headers are displayed, not the content of the packets. This command is useful to verify whether Hellos are sent and received as expected.
·         debug ip ospf events: This command displays OSPF events. This includes reception and transmission of Hellos, but also the establishment of neighbor relationships and the reception or transmission of LSAs. This command can also provide clues (mismatched parameters such as timers, area number, and so on) as to why neighbor Hellos might be ignored.
·         debug ip ospf adj: This command displays events related to the adjacency building process and enables you to see a neighbor relationship transition from one state to the next. During troubleshooting, you can observe the transitions from one state to another, and possibly the state at which the relation gets stuck.
·         debug ip ospf monitor: This command monitors when the SPF algorithm is scheduled to run and displays the triggering LSA and a summary of the results after the SPF algorithm has completed. During troubleshooting, this command enables you to discover which LSA was received and triggered an SPF computation. For example, you can easily discover a flapping link.

OSPF uses LSAs to exchange information between neighbours:

LSA = link-state advertisement

There are multiple types of LSAs:

Type 1 = router LSA, describes the source router and its links within an area.

Type 2 = network LSA, generated by the DR on a broadcast network.

Type 3 = summary LSA, generated by the ABR to advertise routes of adjacent areas.

Type 4 = ASBR summary LSA (ASBR = autonomous system boundary router), generated by ABRs to advertise the location of an ASBR to other areas.

Type 5 = external LSA, generated by the ASBR; type 5 LSAs advertise external routes to the entire network.

Type 6 = multicast LSA, not supported by Cisco.

Type 7 = NSSA external LSA (NSSA = not-so-stubby area).


OSPF needs to establish neighbours before it exchanges routing information.

Parameters needed to form a neighbour relationship:
Authentication has to match
Must be in the same subnet
Must be in the same OSPF area
Must be the same area type (standard, stub or NSSA)
Must not have duplicate router IDs
Hello and dead timers need to match
MTU size should match


Common process found in routing protocols

Reception of routing information from neighbours: / exchange routing updates and form neighbours
Through the use of routing updates, each routing protocol learns paths to subnets which are not directly connected.
Updates are exchanged with layer 2 adjacent devices called neighbours. Some IGPs, such as EIGRP, OSPF and IS-IS, form a neighbour relationship before exchanging routing updates, whereas other routing protocols, such as RIPv2, exchange routing information without forming neighbours with adjacent devices.

Routing protocol data structures: / store routing updates in their data structures
All modern routing protocols have data structures where they store the information received from their neighbours in the form of routing updates, along with directly connected or static routes as well as external routes which are redistributed. Some older routing protocols, such as RIPv1, do not have separate data structures and thus install and store routes directly in the routing table.

Route injection or redistribution:
All IGPs inject routes into their data structures. Injected routes are those which are either directly connected or static routes; by directly connected I am referring to interfaces where the routing protocol has been enabled.
External routes which are learned from other sources can also be redistributed into the data structures and then advertised to neighbours.

Route selection and installation:
Each routing protocol selects a best path for a prefix from its data structure. Different routing protocols use different algorithms to select the best path, and best-path selection is based on the routing protocol's metrics.
Each routing protocol offers its best path for a prefix to the IP routing table; if multiple routing protocols offer a best path for the same prefix, the path with the lowest administrative distance gets installed in the routing table.

Transmission of routing information to neighbours:

Routing information which is stored in the data structures (injected or static routes as well as external routes which are redistributed) is then advertised to neighbours. As mentioned under reception of routing information from neighbours, most IGPs require a neighbour relationship before they start exchanging routing information, whereas other routing protocols, such as RIPv2, do not require a neighbour relationship for exchanging routing updates.

Similarities and Differences between layer 3 switches and routers

Similarities between a router and a layer 3 switch.

In essence, they both perform packet switching.
They both use routing protocols and static routes to reach destinations which are not directly connected.
They both follow the same process, meaning:
They receive a frame and strip off the layer 2 header.
They perform a layer 3 lookup to find the outbound interface and next-hop information.
They encapsulate the packet into a new layer 2 frame and transmit the packet.


Differences between a router and a layer 3 switch.

A router connects heterogeneous environments, whereas a layer 3 switch is usually used in homogeneous environments.
A router uses general-purpose hardware to switch packets, whereas a layer 3 switch uses specialised hardware (ASICs).
Because a router uses general-purpose hardware, it is usually slower than a layer 3 switch.
Because a layer 3 switch uses specialised hardware, it is common to see the switch catering only for Ethernet environments, whereas a router, because it primarily supports heterogeneous environments, is commonly equipped with different line cards (e.g. Frame Relay, ISDN, ATM) to support those environments.
Adding new features to a router can often be done by just upgrading the software, which is not the case in a layer 3 switching environment.

Differences between SVI and a routed port.

One of the primary differences is that a routed port does not have any layer 2 protocols enabled, such as STP and DTP.
A routed port has a direct correlation between its link status and its interface status, meaning if the link is down, the routed port will be down also.
Whereas an SVI (a VLAN interface) is dependent on the ports associated with its VLAN: if there are 3 ports in a VLAN, the SVI will still be up even if 2 out of the 3 ports are down.

A layer 3 switch can do 3 additional things compared to a router:
It can switch within a VLAN.
It can switch between VLANs (inter-VLAN routing).
It can switch between a VLAN and a routed port (no switchport).



Kerberos for dummies! Kerberos authentication process

Steps involved in Kerberos authentication
A simple description of the Kerberos authentication process using the example of a user trying to access a database server.

  1. Emily types in her username and password. The Kerberos software on the user's machine sends the username to the authentication service (AS) of the KDC, and the AS verifies that the username exists in the KDC database.
  2. If it does, the AS creates a TGT (ticket granting ticket), encrypted with the secret key (derived from the password) of the user.
  3. If the password which Emily typed in earlier (temporarily stored on the user's machine) decrypts the TGT successfully, Emily is allowed access to the machine.
  4. Now say Emily needs access to the database server. In order to get this, Emily sends the TGT which was granted by the authentication service to the TGS (ticket granting service).
  5. The TGS in return creates a second ticket which carries two copies of the same session key: the first copy is encrypted with Emily's shared secret key (user-kdc-key), and the second copy is encrypted with the database server's shared secret key (db-kdc-key). In addition, the KDC adds the requesting user's authenticator information to the part encrypted for the database server.
  6. Once Emily receives this new ticket, she decrypts the first part (encrypted with the user-kdc-key) to extract the session key. She then adds an authenticator (sequence number, IP address, etc.) to the ticket, encrypts it with the extracted session key, and sends the updated ticket to the database server.
  7. The DB server receives the updated ticket and decrypts the part which was encrypted with the db-kdc-key; once decrypted, the DB server has the session key and an authenticator. The DB server then uses the extracted session key to decrypt the part of the ticket which Emily encrypted with that same key.
  8. Once decrypted, the DB server can match the authenticator information supplied by Emily against the authenticator information placed in the ticket by the KDC.

This process proves to the DB server that Emily is an authenticated user who is trusted by the KDC, because she was able to encrypt her part of the ticket with the session key which was originally created by the KDC.

Components of the KDC

The KDC is primarily used to provide authentication services. It is useful in an environment where users and services do not trust each other: a mechanism is needed that will vouch to one party that the party requesting access is indeed a trusted, authenticated user. This is the service which the KDC provides: authenticating users as well as vouching for authenticated users.

KDC components:
  1. The KDC is the core service within the Kerberos architecture. It holds the secret keys of all user principals and service principals and uses these secret keys to encrypt messages.
  2. Authentication service (AS): when a user requests access, the request goes to the authentication service, which verifies that the username exists in the database; if it does, it creates a TGT encrypted with the secret key (password) of that user.
  3. TGT: the ticket granting ticket issued by the authentication service. This is the enabler for the single sign-on capability of Kerberos: once a user has a TGT, the user is not required to enter a password for every service they need access to.
  4. TGS: the ticket granting service is responsible for creating tickets which carry two copies of the same session key; these session keys allow principals to communicate with each other.
  5. User principal: the user (or user device) requesting access. The user principal interacts with the AS, the TGS and the target device.
  6. Service principal: the target service which the user principal needs access to (it could also be vice versa). The service principal verifies that the user principal has been authenticated by the KDC.
  7. Session ticket: contains two copies of the session key; one copy is encrypted with the user-kdc secret key, the other with the service-kdc secret key.

Timestamps and sequence numbers are primarily used to stop replay attacks and false impersonation. The service principal compares the ticket timestamp against its own internal clock to verify the ticket has not been sniffed by an attacker wanting to impersonate a user principal at a later time; likewise, the service principal verifies that it has not received the same sequence number before.


Difference between the secret key and the session key.
A secret key is shared between a principal and the KDC; it is static in nature.
Whereas a session key is generated dynamically and shared between 2 principals; it is created on a need basis and destroyed when no longer required.

Weakness in Kerberos:

  1. Single point of failure: if the KDC, which holds all the secret keys for all user and service principals, goes down, no authentication can take place.
  2. Dependent on clock synchronization between client and server.
  3. Open to password guessing: Kerberos does not know if a brute-force password attack is taking place in the background, so a separate mechanism must be in place to protect against this (Windows limits password login attempts, which prevents this).

SESAME
Secure European system for application in a multi vendor environment

SESAME is based on both symmetric and asymmetric cryptography. Unlike the KDC, which holds all the secret keys of all principals, SESAME uses a privilege attribute server (PAS) to digitally sign PACs (privilege attribute certificates) using asymmetric cryptography. These PACs can be verified with the public key of the PAS, which makes SESAME far more scalable.

Spanning Tree election process summarized

SPANNING TREE election process

Step 1) Root bridge election
  1. All switches initially identify themselves as the root bridge; their bridge ID and root ID are the same.
  2. Each switch sends BPDUs to the other switches. A BPDU contains the bridge ID and the root ID. The receiving switch verifies whether the root ID of the foreign BPDU is lower than its own root ID; if it is lower, it adopts the foreign BPDU's root ID as its own root ID.
  3. The tie breaker is the lowest bridge ID.

Step 2) Root port election.
  1. Each non-root bridge needs to elect a root port.
  2. Root port election is based on the least cost to the root bridge.
  3. Ties are broken on the lowest upstream bridge ID.
  4. Ties are broken on the lowest port ID.

Step 3) Designated port/device election.
  1. Elect a designated device/port for each segment.
  2. Ports with the least cost to the root are elected as designated ports.
  3. Ties are broken based on the lowest bridge ID.
  4. Ties are broken on the lowest port ID.


Step 4) Ports which are neither designated ports nor root ports go into the blocking state; root and designated ports go through the listening and learning states and eventually into the forwarding state.
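To influence step 1 deterministically, rather than letting the lowest default bridge ID win, you can lower the bridge priority on the switch you want as root. A minimal sketch (the VLAN number and value are illustrative; the priority must be a multiple of 4096):

```
! Make this switch the most likely root for VLAN 10 (default priority is 32768).
spanning-tree vlan 10 priority 4096
```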

Cisco SPAN & RSPAN explained

The Switched Port Analyzer (SPAN) feature of Cisco Catalyst switches allows copying the traffic from one or more switch interfaces or VLANs to another interface on the same switch. You connect the system with the protocol analyzer capability to an interface on the switch; this will be the destination interface of SPAN. Next, you configure the switch to send a copy of the traffic from one or more interfaces or VLANs to the SPAN destination interface, where the protocol analyzer can capture and analyze the traffic. The traffic that is copied and sent to the SPAN destination interface can be the incoming traffic, outgoing traffic, or both, from the source interfaces. The source and destination interfaces (or VLANs) all reside on the same switch.

Using the Remote Switched Port Analyzer (RSPAN) feature, however, you can copy traffic from ports or VLANs on one switch (let’s call it the source switch) to a port on a different switch (destination switch). A VLAN must be designated as the RSPAN VLAN and not be used for any other purposes. The RSPAN VLAN receives traffic from the ports or VLANs on the source switch. The RSPAN VLAN then transports the traffic through one or more switches all the way to the destination switch. On the destination switch, the traffic is then copied from the RSPAN VLAN to the destination port. Be aware that each switching platform has certain capabilities and imposes certain restrictions on the usage of RSPAN/SPAN. You can discover these limitations and capabilities of such in the corresponding device documentation

SPAN
Commands to remember:
monitor session session# source
monitor session session# destination

RSPAN
monitor session session# destination remote vlan vlan#   (on the source switch)
monitor session session# source remote vlan vlan#   (on the destination switch)


E.g.
On the source switch:
monitor session 2  source interface fa0/7
monitor session 2  destination remote vlan 100

On the destination switch:
monitor session 2  source remote vlan 100
monitor session 2  destination interface fa0/7
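One detail the command list above does not show: the RSPAN VLAN itself must be created and flagged as a remote-span VLAN on every switch in the path (VLAN 100 is illustrative):

```
vlan 100
 remote-span
```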

Commands to troubleshoot VLANS

Please note, the commands below are primarily for a Cisco environment; however, with a little bit of common sense, you should be able to apply the same logical sequence when troubleshooting VLANs in a non-Cisco environment.

show mac-address-table: This is the main command to verify Layer 2 forwarding. It shows you the MAC addresses learned by the switch and their corresponding port and VLAN associations. This command gives you an indication if frames sourced by a particular host have succeeded in reaching this switch. It will also help you verify whether these frames were received on the correct inbound interface. Note that if the MAC address table becomes full, no more learning can happen. During troubleshooting, always check to see whether the table is full.

show vlan: This command enables you to verify VLAN existence and port-to-VLAN associations. This command lists all VLANS that were created on the switch (either manually or through VTP). It will also list the ports that are associated to each of the VLANs. Note that trunks are not listed because they do not belong to any particular VLAN.

show interfaces trunk: This command displays all interfaces that are configured as trunks. It will also display on each trunk which VLANs are allowed and what the native VLAN is.

show interfaces switchport: This command combines some of the information found in show vlan and show interfaces trunk commands. It is most useful if you are not looking for a switch-wide overview of trunk or VLAN related information, but if you would rather have a quick summary of all VLAN-related information for a single interface.

show platform forward interface: You can use many parameters with this command and find out how the hardware would forward a frame that matches the specified parameters, on the specified interface.

traceroute mac: You specify a source and destination MAC address with this command to see the list of switch hops that a frame from that source MAC address to that destination MAC address passes through. Use this command to discover the Layer 2 path frames take from the specified source MAC address to the specified destination MAC address. This command requires that Cisco Discovery Protocol (CDP) be enabled on all the switches in the network (or at least within the path).

Based on the information they provide, the commands listed can be categorized. To display the MAC address table, use the show mac-address-table command. To display VLAN database and port-to-VLAN mapping, use the show vlan command. To see the trunk port settings and port-to-VLAN associations, use the show interfaces switchport and show interfaces trunk commands. To directly verify frame forwarding, use the show platform forward and the traceroute mac commands.


Frames are not received on the correct VLAN: This could point to VLAN or trunk misconfiguration as the cause of the problem.

Frames are received on a different port than you expected: This could point to a physical problem, spanning-tree issues, or duplicate MAC addresses.


The MAC address is not registered in the MAC address table: This tells you that the problem is most likely upstream from this switch. You should retrace your steps and investigate between the last point where you know that frames were received and this switch.

How to detect a Spanning tree loop !

I have come across this situation many times in my career, and below are the common symptoms I have managed to jot down; I hope they help.

  1. Load on all links will increase, not just on the links that form the loop but on all links in the switched domain, because some of the frames are flooded on every link. Naturally, when a spanning-tree failure is limited to one VLAN, only links in that VLAN are affected; the rest of the VLANs stay unaffected.

  2. If the spanning-tree failure has caused more than one bridging loop, traffic increases exponentially. This is because frames not only cycle in an endless loop but, with multiple loops present, also start getting duplicated.

  3. When control plane traffic such as HSRP, OSPF, and EIGRP starts entering the loop, the devices running these protocols soon become overloaded; their CPU utilization climbs, in some cases up to 100 percent, processing the control plane load. In many cases the earliest indication of a broadcast storm in progress is routers and Layer 3 switches reporting control plane failures, e.g. continual HSRP state changes or routers continually running at 100 percent CPU.

  4. Switches will experience frequent MAC address table changes. Frames loop in both directions, which causes a switch to see a frame with a given source address coming in on one port and, shortly after, the same frame coming in on a different port.

  5. The combination of high load on all links and high CPU at the same time drives the switches and routers into a state where they are unreachable, making it nearly impossible to troubleshoot while the broadcast storm is in progress.

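On Cisco IOS switches, the MAC-table churn symptom usually shows up in the log as %SW_MATM-4-MACFLAP_NOTIF messages. A rough sketch for spotting the worst offender from a saved log; the log lines below are an assumed sample and the MAC addresses and ports are hypothetical:

```shell
# Assumed sample syslog excerpt; count flap notifications per source MAC to
# find which host's frames are circulating in the loop.
log='%SW_MATM-4-MACFLAP_NOTIF: Host 0011.2233.4455 in vlan 10 is flapping between port Gi1/0/1 and port Gi1/0/2
%SW_MATM-4-MACFLAP_NOTIF: Host 0011.2233.4455 in vlan 10 is flapping between port Gi1/0/2 and port Gi1/0/1
%SW_MATM-4-MACFLAP_NOTIF: Host 0011.2233.4466 in vlan 10 is flapping between port Gi1/0/5 and port Gi1/0/2'
printf '%s\n' "$log" | awk '/MACFLAP/ { print $3 }' | sort | uniq -c | sort -rn
```

The MAC at the top of the count, and the pair of ports it flaps between, point at the segment carrying the loop.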

Disk Latency


a) Exchange:
http://technet.microsoft.com/en-us/library/aa995945(v=exchg.80).aspx

The average Read latency to the transaction log drives should be below 20 ms. Spikes in read latency should be under 50ms. 

b) Core:
http://blogs.technet.com/b/askperf/archive/2010/11/05/performance-tuning-windows-server-2008-r2-pt-2.aspx

Since modern disks are typically rated at well under 10 ms random access time, any number much higher than that is problematic. 

c) Core:
http://blogs.technet.com/b/askcore/archive/2012/02/07/measuring-disk-latency-with-windows-performance-monitor-perfmon.aspx

You have probably heard general statements about what are acceptable disk latency measurements: “Less than 10 milliseconds is good and more than 20 milliseconds is bad”. Although these rules of thumb are used to simplify analysis, they do not apply in all cases and may lead to incorrect conclusions.

Important consideration:
In Perfmon, the time an I/O spends in the queue is included in the reported disk latency.
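The rules of thumb quoted above can be turned into a quick triage check. A minimal sketch, assuming a sample value; perfmon reports Avg. Disk sec/Read in seconds, so it is converted to milliseconds first:

```shell
# Classify an Avg. Disk sec/Read sample against the <10 ms good / >20 ms bad
# rules of thumb quoted above; 0.018 s (18 ms) is an assumed sample value.
avg_read_sec=0.018
awk -v v="$avg_read_sec" 'BEGIN {
  ms = v * 1000
  if (ms < 10)       print "good (" ms " ms)"
  else if (ms <= 20) print "borderline (" ms " ms)"
  else               print "bad (" ms " ms)"
}'
# prints: borderline (18 ms)
```

Remember the caveat above: because queue time is folded into the perfmon number, a "bad" reading can mean a deep queue rather than a slow disk.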

How to mount several identical GFS volumes (same UUID) on Red Hat in a non-clustered configuration

All testing performed in this article is based on the following technologies:
Red Hat Enterprise Linux 5 Update 4
EqualLogic 6 Series SAN (iSCSI environment)
EMC CX4-240 (Fibre Channel)


Consider the following scenarios:


  • Can you mount GFS volumes in a non-clustered configuration? Yes, you can.
  • Snapshot and cloned LUNs are standard functionality on all newer SAN (storage area network) arrays, and many customers would like to mount these snapshot LUNs on the same server as the original LUN, for reasons such as recovery, read-only access, and performance. However, when you try to mount a snapshot LUN whose parent LUN is already mounted on the same system, you get the following error:

    Errors:

    The GFS2 file system on the snapshot LUN cannot be mounted because the file system name already exists.
    GFSLUNS:Test1 already exists on the source LUN (/dev/mapper/Test1).

    [root@hiflex-nd ~]# mount -t gfs2 /dev/mapper/GFSLUNS-Test1 /mnt/Testsnap1
    /sbin/mount.gfs2: error mounting /dev/mapper/GFSLUNS-Test1 on /mnt/Testsnap1: File exists

    GFS2: fsid=: error -17 adding sysfs files
    kobject_add failed for hiflex:daten with -EEXIST, don't try to register things with the same name in the same directory.
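The "error -17" in that kernel message is simply errno 17 (EEXIST, "File exists"): the kernel refuses to register a second sysfs object under an already-registered lock table name. You can confirm the errno mapping on any Linux box:

```shell
# errno 17 is EEXIST ("File exists"), the same error reported by mount.gfs2
# and by kobject_add in the kernel log excerpt above.
python3 -c 'import errno, os; print(errno.EEXIST, os.strerror(errno.EEXIST))'
# prints: 17 File exists
```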




Solution and testing performed:
There are two ways around this:
1)  Use "lock_nolock" without a lock table name when creating the GFS volume
2)  Use a different lock table name for the cloned/snapshotted LUN when creating the GFS volume


Below is a detailed explanation of each procedure.


Example 1) "lock_nolock" without a lock table name






-Allocated a 1 GB LUN to an unclustered RHEL 5 server
-Created a GFS volume as follows:

# mkfs.gfs2 -p lock_nolock -j 1  /dev/VolGroup01/LogVolGFS

-Mounted the volume and created some files
-Then created a clone of the LUN in the EqualLogic SAN
-Used the procedures in the following link to mask the existing LVM, present the cloned LUN, and change the UUIDs:


- Was then able to see both logical volumes on the same server (while the original was still mounted)
- Then successfully mounted the cloned logical volume without any error messages
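The link for the UUID-change procedure is missing from my notes, but on LVM2 the same result can be reached with vgimportclone, which renames a cloned volume group and regenerates its PV and VG UUIDs so it can be activated alongside the original. A hedged sketch only: the device path /dev/sdc and the VG name are hypothetical, it requires the cloned LUN to already be visible to the host, and for GFS volumes created with a lock table name you still need the lock table rename shown in Example 2 below.

```
# Hypothetical sketch: give the cloned VG a new name and fresh UUIDs so it
# can coexist with the original VG (/dev/sdc and group1_clone are examples).
vgimportclone --basevgname group1_clone /dev/sdc
vgchange -ay group1_clone
```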

Note:
I did not receive the naming conflict; this is because the GFS volume was created using "lock_nolock" with no lock table name.

Note that for my logical volumes, the GFS superblocks do not specify a lock table:

[root@node4 mnt]# gfs2_tool sb /dev/group1/production all
  mh_magic = 0x01161970
  mh_type = 1
  mh_format = 100
  sb_fs_format = 1801
  sb_multihost_format = 1900
  sb_bsize = 4096
  sb_bsize_shift = 12
  no_formal_ino = 2
  no_addr = 23
  no_formal_ino = 1
  no_addr = 22
  sb_lockproto = lock_nolock
  sb_locktable =

 [root@node4 mnt]# gfs2_tool sb /dev/group1_clone/production all
  mh_magic = 0x01161970
  mh_type = 1
  mh_format = 100
  sb_fs_format = 1801
  sb_multihost_format = 1900
  sb_bsize = 4096
  sb_bsize_shift = 12
  no_formal_ino = 2
  no_addr = 23
  no_formal_ino = 1
  no_addr = 22
  sb_lockproto = lock_nolock
  sb_locktable =

Example 2) Using a different lock table name for the cloned/snapshotted volume



To simulate the errors above, I performed the following steps, adding lock table names to both GFS volumes:

[root@node4 mnt]# gfs2_tool sb /dev/group1/production table gfsvolume
You shouldn't change any of these values if the filesystem is mounted.

Are you sure? [y/n] y

current lock table name = ""
new lock table name = "gfsvolume"
Done

[root@node4 mnt]# gfs2_tool sb /dev/group1/production all
  mh_magic = 0x01161970
  mh_type = 1
  mh_format = 100
  sb_fs_format = 1801
  sb_multihost_format = 1900
  sb_bsize = 4096
  sb_bsize_shift = 12
  no_formal_ino = 2
  no_addr = 23
  no_formal_ino = 1
  no_addr = 22
  sb_lockproto = lock_nolock
  sb_locktable = gfsvolume

[root@node4 mnt]# gfs2_tool sb /dev/group1_clone/production table gfsvolume
You shouldn't change any of these values if the filesystem is mounted.

Are you sure? [y/n] y

current lock table name = ""
new lock table name = "gfsvolume"
Done

[root@node4 mnt]# gfs2_tool sb /dev/group1_clone/production all
  mh_magic = 0x01161970
  mh_type = 1
  mh_format = 100
  sb_fs_format = 1801
  sb_multihost_format = 1900
  sb_bsize = 4096
  sb_bsize_shift = 12
  no_formal_ino = 2
  no_addr = 23
  no_formal_ino = 1
  no_addr = 22
  sb_lockproto = lock_nolock
  sb_locktable = gfsvolume


Both volumes now had the same lock table; trying to mount them both resulted in the following:

[root@node4 mnt]# mount /dev/group1/production gfs  <SUCCESS>

[root@node4 mnt]# mount /dev/group1_clone/production gfsclone/
/sbin/mount.gfs2: error 17 mounting /dev/mapper/group1_clone-production on /mnt/gfsclone  <FAILURE>
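This conflict can be caught before mounting by comparing the sb_locktable values of the two superblocks. A sketch, where the here-strings stand in for `gfs2_tool sb <device> all` output (which needs the real devices); the values shown match the state above:

```shell
# Warn when two superblock dumps carry the same sb_locktable value, the
# condition that produced the EEXIST mount failure. The two variables stand
# in for `gfs2_tool sb <dev> all` run against each device.
sb_orig='  sb_lockproto = lock_nolock
  sb_locktable = gfsvolume'
sb_clone='  sb_lockproto = lock_nolock
  sb_locktable = gfsvolume'
t1=$(printf '%s\n' "$sb_orig"  | awk '/sb_locktable/ { print $3 }')
t2=$(printf '%s\n' "$sb_clone" | awk '/sb_locktable/ { print $3 }')
if [ -n "$t1" ] && [ "$t1" = "$t2" ]; then
  echo "conflict: both volumes use lock table \"$t1\""
else
  echo "ok: lock tables differ or are unset"
fi
```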


To address the issue, I then changed the lock table name on the cloned GFS volume as follows:

[root@node4 mnt]# gfs2_tool sb /dev/group1_clone/production table gfsvolume_clone
You shouldn't change any of these values if the filesystem is mounted.

Are you sure? [y/n] y

current lock table name = "gfsvolume"
new lock table name = "gfsvolume_clone"
Done

[root@node4 mnt]# gfs2_tool sb /dev/group1_clone/production all
  mh_magic = 0x01161970
  mh_type = 1
  mh_format = 100
  sb_fs_format = 1801
  sb_multihost_format = 1900
  sb_bsize = 4096
  sb_bsize_shift = 12
  no_formal_ino = 2
  no_addr = 23
  no_formal_ino = 1
  no_addr = 22
  sb_lockproto = lock_nolock
  sb_locktable = gfsvolume_clone

[root@node4 mnt]# mount /dev/group1_clone/production gfsclone/

[root@node4 mnt]# mount
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/xvda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/mapper/group1-production on /mnt/gfs type gfs2 (rw,localflocks,localcaching)
/dev/mapper/group1_clone-production on /mnt/gfsclone type gfs2 (rw,localflocks,localcaching)
[root@node4 mnt]#


After changing the name of the lock table, I was able to mount the cloned GFS volume successfully.

This process is summarized in the following KB:





Hope this helps.
Huzeifa Bhai