The Gathering Technical blog

TG11 network: a closer look, part 2

19 Jan 2012, by Håvard Staub Nyhus from

Part one of this series is located here
Continuing from the previous post, we will look at more of the potential failure scenarios we took into account when designing the network. Apologies for the slightly different diagrams; these were created on a different computer.

Link aggregation towards the edge switches from the core

One would think this to be an easy process - just bundle ports on both ends, and you are good to go.
However, things get a bit more complicated when dealing with edge devices where "everyone" has access to the cables, and when the edge switches themselves are more or less working against us.
Our solution: Layer 3 port channels, LACP and a Perl script.
Let me explain:

  • To start with, we are lazy buggers.
  • We do not wish to configure every edge switch manually, and we have to take into account that a switch may break and need to be replaced. To help us achieve this goal, we have a Perl script made by our friends in tech:server that configures every device for us.
  • The D-Link DGS-3100-48 ships with spanning tree turned OFF by default. However, it does not COMPLETELY ignore spanning tree: by default the device also "eats" the BPDUs instead of passing them on. This effectively makes the switch a loop generator if cabled incorrectly.
    A normal link bundle scenario would look like this:

    Configuration-wise, on the Cisco side, one could use "channel-group X mode on", which bundles the ports unconditionally and load-balances traffic over the two links. Similarly, on the D-Link side, one would configure a link bundle without any channel protocol.
    This would work fine if we had full control of the cabling at all times but, as you can imagine, we do not.
    Let's say somebody moves one of the uplink cables to a port configured for an end host. Without a channel protocol, you would end up in a scenario where MAC addresses flap between the different ports and, depending on your design, frames might lose their VLAN tag. Needless to say, this would be BAD, and most of the time would not work at all.
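To make the MAC flapping concrete, here is a toy sketch of a learning table that counts how often an address moves between ports. It is only an illustration, not how any real switch is implemented; the port names (`Po10`, `Gi1/0/7`), the MAC address and the `simulate` helper are made up for the example.

```python
from collections import defaultdict


class MacTable:
    """Toy MAC learning table that counts how often an address changes port."""

    def __init__(self):
        self.port_of = {}              # MAC address -> port it was last seen on
        self.moves = defaultdict(int)  # MAC address -> number of port changes

    def learn(self, mac, port):
        previous = self.port_of.get(mac)
        if previous is not None and previous != port:
            self.moves[mac] += 1       # same MAC seen on a new port: a "flap"
        self.port_of[mac] = port


def simulate(ingress_ports, mac="00:11:22:33:44:55"):
    """Feed frames from one source MAC into the table, one ingress port per frame."""
    table = MacTable()
    for port in ingress_ports:
        table.learn(mac, port)
    return table.moves[mac]


# Both uplinks land in the bundle: the MAC is always learned on the port channel.
print(simulate(["Po10"] * 6))              # 0 moves
# One uplink was moved to an end-host port (hypothetical Gi1/0/7):
# frames from the same MAC now arrive on two different logical ports.
print(simulate(["Po10", "Gi1/0/7"] * 3))   # 5 moves
```

With a static bundle, neither side notices that the cabling changed, so nothing stops the address from bouncing back and forth like this.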

To help with this, we run LACP. On a Cisco device, this is configured as "channel-group X mode active" or "channel-group X mode passive" on the member interfaces (refer to the config pack for examples). The way LACP works is that one or both of the devices participating in the link bundle actively try to form a neighbor relationship with the other device on a per-port basis. If negotiation fails on a Layer 2 port channel, the links come up in "stand-alone" mode, as ordinary individual switch ports. This would be equivalent to a normal spanning-tree design with one blocking and one forwarding port:

If negotiation is successful, the switch will bundle its ports into one logical port, and load-balance between them.
Sounds all good, doesn't it? We'll use LACP!...
Well, there is a slight problem. Remember the D-Links? They drop our BPDUs by default. Thus, if the device is reset or running its default config, both links will go forwarding, and we would have a loop regardless.

So, what we need is a way to suspend the ports, should the switch be configured with its factory default configuration.

  • We cannot use BPDU-Guard for this, as the D-Link drops BPDUs.
  • There is no proprietary loop detection mechanism that works reliably on either the Cisco or the D-Link side.
  • Port-security would help in a large Layer 2 design. It would not help us, as we do not have a set number of user devices per switch, and a VLAN would never span more than one switch anyway. Thus we would never exceed the configured limit of MAC addresses.
    There is one approach that we have found to work, though. Cisco's LACP implementation operates in a slightly different manner when the port-channel interface is configured with "no switchport".
    This command effectively makes the port a "router" port (a Layer 3 port), which does not participate in spanning tree. Without spanning tree, the switch has no way of detecting a loop, so it has to assume the worst and keep the member interfaces suspended until LACP channel negotiation has completed and been agreed upon.
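The difference between the cases described so far can be summed up in a small decision function. This is a toy model of the behaviour as described in this post, not Cisco's implementation; the names `PortState`, `member_port_state` and `forwards_traffic` are invented for the sketch, and the platform-specific exception mentioned in the note further down is deliberately left out.

```python
from enum import Enum


class PortState(Enum):
    BUNDLED = "bundled"          # member of the logical port channel
    STANDALONE = "stand-alone"   # forwards as an ordinary individual port
    SUSPENDED = "suspended"      # passes no traffic


def member_port_state(mode, layer, lacp_negotiated):
    """State of one member port, following the behaviour described in the post.

    mode:            "on" (static bundle) or "active"/"passive" (LACP)
    layer:           2 for a switchport channel, 3 for "no switchport"
    lacp_negotiated: True if LACP negotiation with the peer succeeded
    """
    if layer not in (2, 3):
        raise ValueError("layer must be 2 or 3")
    if mode == "on":
        # Static bundle: no protocol, the port is bundled unconditionally.
        return PortState.BUNDLED
    if mode not in ("active", "passive"):
        raise ValueError("unknown channel-group mode: %r" % mode)
    if lacp_negotiated:
        return PortState.BUNDLED
    # Negotiation failed, e.g. the edge switch is running its factory default config.
    return PortState.STANDALONE if layer == 2 else PortState.SUSPENDED


def forwards_traffic(state):
    return state is not PortState.SUSPENDED


# Factory-default edge switch: LACP negotiation fails on both uplinks.
for layer in (2, 3):
    state = member_port_state("active", layer, lacp_negotiated=False)
    print(layer, state.value, forwards_traffic(state))
```

For the Layer 2 case both uplinks end up stand-alone and forwarding, which is exactly the loop we are trying to avoid once the BPDUs are being dropped. For the Layer 3 case both stay suspended.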
    Our configuration script performs all of this in order:

  • One link is brought up to the edge switch
  • LACP config is pushed to the edge switch
  • LACP config is pushed to the core/distro switch
  • Management IP and other nifty things are pushed to the edge switch
    This process, combined with clever use of the functionality provided by the Cisco switches, guarantees that we will not have a loop between the core switch and the edge, even if the D-Link device by default breaks spanning tree in the worst way possible.
    Note: some IOS code versions, as well as NX-OS, do this by default when configuring "channel-group X mode passive", even for a Layer 2 port channel.
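The provisioning order listed above can be sketched roughly like this. To be clear, this is not the actual Perl script from tech:server. `STEPS`, `provision` and the `run_step` callback are hypothetical names, and stopping at the first failed step is an assumption made for the example.

```python
# The four steps, in the order given in the post.
STEPS = (
    "bring up one link to the edge switch",
    "push LACP config to the edge switch",
    "push LACP config to the core/distro switch",
    "push management IP and other settings to the edge switch",
)


def provision(run_step):
    """Run the steps in order and return the ones that completed.

    run_step is a caller-supplied function (hypothetical) that performs one
    step and returns True on success. The sketch stops at the first failure
    so that later steps never run against a half-configured switch.
    """
    done = []
    for step in STEPS:
        if not run_step(step):
            break
        done.append(step)
    return done


# Dry run: pretend every step succeeds and just print what would happen.
def dry_run(step):
    print("would", step)
    return True


provision(dry_run)
```

Passing the step runner in as a function keeps the ordering separate from however the devices are actually reached, so the same sequence can be dry-run or tested without touching a switch.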

In the next part, we will move away from ugly potential failure scenarios and go into the Layer 3 routing configuration and design.
In the meantime, please leave your questions and comments in the comment field below :)

Håvard Staub Nyhus


TG - Technical Blog is the unofficial rambling place of the Info:Systems, Tech:Net and Tech:Server crews from The Gathering.
