ECMP vs VRRP?

  • Updated 3 years ago
We are looking at reworking our layout so that, as we grow, we have full redundancy.

MLAG between 2x aggregation x670s, feeding each remote site with 1x 10-gigabit uplink. We're good on that; simple.

MLAG between our L3 Extremes, which we plan to use for L3 routing. Those will connect via a standard LAG to both of the above aggregation switches. Still good here: simple LAGs + MLAGs.

Now we reach the L3 portion. Hanging off of those L3 Extremes we have our BRAS/PPPoE boxes, which are connected on their backend via a LAG to both of the aggregation x670s.

We bring up a VLAN to each of the L3 Extremes with a /30 (or /31, whatever), but we want to make sure that if one of our L3 x670s dies we have complete redundancy.

The simplest idea is to just enable VRRP, but I REALLY hate the idea of wasted tech resources, and the idea that one box is just sitting there idle irks me. So I thought: hey, why not just use OSPF ECMP to solve the issue?


And that led me to the question: why ever use VRRP if ECMP exists? What drawback to using OSPF ECMP am I missing? I get connection-based load balancing between the two main routers, and failover protection if one fails.

We already plan to use OSPF on the BRASes to deliver customer /32s based on RADIUS, so that we have no wasted IPv4 (as we won't be dedicating subnets to specific servers that might not use them all).

So the comparison:

BRAS -> LAG to 2x x670 -> VRRP
vs
BRAS -> VLAN on 2 ports of the BRAS, one to each x670, on a /30 each -> OSPF ECMP
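
To make the "connection-based load balancing plus failover" idea concrete, here is a minimal Python sketch of per-flow ECMP next-hop selection. It is only an illustration: the hash function, the 5-tuple layout, and all addresses are assumptions made for the example, and real switches use their own hardware hash, not this one.

```python
import hashlib

def pick_next_hop(flow, next_hops):
    """Pick one next hop for a flow by hashing its 5-tuple (per flow, not per packet)."""
    if not next_hops:
        raise ValueError("no usable next hop")
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]

# Hypothetical addressing: the BRAS has one /30 toward each x670.
next_hops = ["10.0.0.1", "10.0.0.5"]

# Hypothetical flows: (src IP, dst IP, protocol, src port, dst port).
flows = [("100.64.0.%d" % i, "198.51.100.10", 6, 40000 + i, 443) for i in range(1, 9)]

# Every packet of a given flow hashes to the same next hop.
for flow in flows:
    print(flow[0], "->", pick_next_hop(flow, next_hops))

# Failover: once OSPF withdraws the routes via the dead x670,
# only one next hop is left and every flow lands on it.
surviving = [hop for hop in next_hops if hop != "10.0.0.1"]
for flow in flows:
    print(flow[0], "->", pick_next_hop(flow, surviving))
```

Note that the split is per flow, so a few heavy flows can still load one link more than the other.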

Chris Chance


Posted 3 years ago


Bill Stritzinger, Alum

Chris,

In the layout you describe I would deploy (and have many times deployed) MLAG w/ VRRP.  The difference here is that you create a block on the ISC so that the two VRRP routers (master and backup) cannot talk to each other. This forces both ISC peers to show "active/active", and thus each device uses its own uplinks from then on.  To accomplish this, you create your VRRP instance and then create either a dynamic or static policy and apply it to the LAG (one side) of the ISC link.

entry MCASTBLOCK{
    if {
        destination-address 224.0.0.18/32;
    } then {
        deny;
    }
}

That is it. Make sense? Works like a champ. Now, if you talk about ECMP, that would change your design, and in that case I would stop doing MLAG altogether and go all L3.
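
As a rough illustration of why the block produces two masters, here is a small Python sketch of a simplified VRRP election. It is not EXOS code: the router names and priorities are hypothetical, tie-breaking by IP address is ignored, and it assumes the ISC is the only path over which the two peers could hear each other's advertisements.

```python
from dataclasses import dataclass

VRRP_GROUP = "224.0.0.18"  # multicast address VRRP advertisements are sent to

@dataclass
class Router:
    name: str
    priority: int

def isc_forwards(dst_ip, block_vrrp):
    """Return True if the ISC passes the packet (mirrors the deny entry in the policy)."""
    return not (block_vrrp and dst_ip == VRRP_GROUP)

def elect(routers, block_vrrp):
    """Return the VRRP state each router settles into (simplified model)."""
    states = {}
    for me in routers:
        # Priorities of the peers whose advertisements actually reach this router.
        heard = [peer.priority for peer in routers
                 if peer is not me and isc_forwards(VRRP_GROUP, block_vrrp)]
        # A router stays backup only while it hears a higher-priority master.
        states[me.name] = "BACKUP" if any(p > me.priority for p in heard) else "MASTER"
    return states

peers = [Router("core1", 110), Router("core2", 100)]  # hypothetical names/priorities
print(elect(peers, block_vrrp=False))  # normal VRRP: one master, one backup
print(elect(peers, block_vrrp=True))   # advertisements blocked: both go master
```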

Let me know if you have questions; happy to discuss.

Bill

Jeremy Homan

Bill, this interests me. As of right now, I have 5 floors all coming down to my core via MLAG. The core is a pair of 460s. I'm utilizing VRRP at my core for L3 redundancy.

So what you're saying is that if I create that policy on one side of my core ISC channel, the two switches won't be able to exchange their VRRP traffic and will bring up all VRRP instances as active?

Will this pose any other problems? By doing this, will both switches be actively forwarding packets?

Thanks,

Jeremy

Bill Stritzinger, Alum

Jeremy,

That is correct. It is a way to make it active/active into your core.  There are no adverse issues in doing so; we have done it a ton. The only caveat would be to make sure you are using 15.3 or later (older code works, but I would suggest that version and beyond).

Bill

Jeremy Homan

We're running 15.3.4.6 w/ latest patch. 

How does this work given that they share the same logical IP, e.g. 1.1.1.1?

For my deployment, I had 10 VLANs. I put 5 on VRID1 and 5 on VRID2. I made switch 1 master for all VRID1 sessions, and switch 2 master for all VRID2 sessions.

Is your method better?

Thanks,

Bill Stritzinger, Alum

Jeremy,

The virtual IP is then broadcast down both links. In your case all you have to do is create the .pol file and then apply it to the ISC link. You will see both VRRP sides go MSTR, and then you will see both links go active. There is no additional configuration necessary.

Bill

Jeremy Homan

Is this the suggested way to set up VRRP? Thanks for answering all my questions!

Bill Stritzinger, Alum

Jeremy, 

Yes. The standard VRRP configuration is MSTR/BKUP, but in the MLAG world, to get active/active we have to "trick" VRRP so that both nodes present the .1 (i.e. the default gateway address). To do that, you block VRRP on the ISC.  Here is a link to the documentation:

http://documentation.extremenetworks.com/exos/EXOS_All/VRRP/c_vrrp-active-active.shtml

Let me know if you have any more questions!

Bill

Grosjean, Stephane, Employee

Hi,
Just make sure the virtual IP is not an interface IP. Apart from that, you should be good.

Chris Chance

OK, so you can do load balancing with VRRP in active-active, but wouldn't this be more... static, as you would have to dictate which VLANs are primarily assigned to which master?


Versus a round robin from ECMP? I mean, sure, ECMP isn't smooth load balancing, but it seems to me a more... balanced, automated approach versus trying to balance based on average traffic per VLAN.


I've honestly never set up ECMP or VRRP before. The current network is pretty much flat. We had OSPF active but had issues where legs were going down and sticking at idle even though we could ping both remote sides, then randomly coming back up. So we converted everything to static while we replace a bunch of our old hardware (x450/BD12k going to x450-xgm2sf/x670s), upgrade to the 15.x branch, etc.


OSPF has to come back since, as I said, it's how I'm going to handle the /32 distribution to our BRAS servers. So the above issue doesn't rule it out for ECMP, as it has to end up working one way or another.


VRRP and ECMP still both seem to be accomplishing the same task, with the one caveat I mentioned above.


Also, wouldn't "An ARP request from 10.0.0.4 results in duplicate ARP replies (one from each MLAG switch)." cause some issues with the ARP tables on devices that are using the VRRP route?
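
One relevant detail for the duplicate-reply question: per RFC 5798, the IPv4 virtual router MAC is derived only from the VRID (00-00-5E-00-01-{VRID}), so two masters for the same VRID would answer with the same MAC. The tiny Python sketch below shows that, under the assumption that both MLAG peers reply with that virtual MAC; the virtual IP, VRID, and peer names are hypothetical.

```python
def vrrp_virtual_mac(vrid):
    """IPv4 VRRP virtual router MAC per RFC 5798: 00-00-5E-00-01-{VRID}."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in the range 1..255")
    return "00:00:5e:00:01:%02x" % vrid

# Hypothetical values: gateway (virtual) IP and VRID shared by both MLAG peers.
virtual_ip = "10.0.0.1"
vrid = 1

# The host's ARP table after it receives one reply from each MLAG peer.
arp_table = {}
for replying_switch in ("mlag-peer-1", "mlag-peer-2"):
    # Assumption: each master answers with the VRRP virtual MAC, not its own MAC.
    arp_table[virtual_ip] = vrrp_virtual_mac(vrid)

print(arp_table)  # a single entry; the second reply carries the same mapping
```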

Paul Russo, Alum

Hey Chris

I think you are mixing up the two protocols. 

VRRP is for default gateway (DG) redundancy for any end device that needs to be routed out to the rest of the network.  In this case VRRP is only configured on the core routers acting as the default gateway.

ECMP is to provide redundant, load-sharing connections between routers.  It allows the router to determine the best route to use based on the route table it gets from the other routers.  ECMP does not provide GW redundancy.  In addition, if a device is directly connected to a router, then its subnet is local, so the router would not see equal-cost routes for it in the route table.

The Active/Active VRRP that Bill and Stephane mention allows both routers to act as the default gateway for the directly attached hosts.  By doing this, each core router can route to the other subnets and the L3 network regardless of which MLAG peer port the traffic comes in on.
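
The point about local subnets can be sketched in a few lines of Python. This is a simplified route lookup (longest prefix, then protocol preference, then cost), not how any particular switch implements it; the prefixes, next hops, and preference numbers are illustrative assumptions.

```python
import ipaddress

# Lower preference wins; the numbers are illustrative, not EXOS's actual values.
PREFERENCE = {"connected": 0, "ospf": 110}

def best_routes(table, dst):
    """Return every route tied for best toward dst; more than one means ECMP."""
    addr = ipaddress.ip_address(dst)
    matches = [r for r in table if addr in ipaddress.ip_network(r["prefix"])]
    if not matches:
        return []
    longest = max(ipaddress.ip_network(r["prefix"]).prefixlen for r in matches)
    matches = [r for r in matches if ipaddress.ip_network(r["prefix"]).prefixlen == longest]
    best_pref = min(PREFERENCE[r["proto"]] for r in matches)
    matches = [r for r in matches if PREFERENCE[r["proto"]] == best_pref]
    best_cost = min(r["cost"] for r in matches)
    return [r for r in matches if r["cost"] == best_cost]

# Hypothetical route table on one core router.
table = [
    {"prefix": "10.10.10.0/24", "proto": "connected", "cost": 0, "via": "vlan users"},
    {"prefix": "192.0.2.0/24", "proto": "ospf", "cost": 20, "via": "10.0.0.1"},
    {"prefix": "192.0.2.0/24", "proto": "ospf", "cost": 20, "via": "10.0.0.5"},
    {"prefix": "192.0.2.0/24", "proto": "ospf", "cost": 30, "via": "10.0.0.9"},
]

# Remote prefix learned over two equal-cost OSPF paths: two routes, so ECMP applies.
print([r["via"] for r in best_routes(table, "192.0.2.7")])
# Directly attached subnet: one connected route, nothing to share the load across.
print([r["via"] for r in best_routes(table, "10.10.10.20")])
```

Hosts on the attached subnet do not run a routing protocol, which is why they still need a redundant gateway address such as the VRRP virtual IP.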

I hope that helps

P