Gracefully Upgrading JUNOS Devices with Dual REs

One of the coolest things about the routing plane on Juniper routers is that you can have dual, redundant, independent routing engines.

Routers are constantly making decisions, running algorithms, and updating their databases to work out the correct way for traffic to reach every possible destination.  They do this so they always have the quickest, most efficient route to a packet’s destination.  The hardware responsible for this decision-making is the routing engine.  The routing engine weighs the attributes of each candidate route, then forwards its choice of best route to the Packet Forwarding Engines, which do the actual work of moving packets.

The forwarding plane is always redundant (hence the need to make decisions on multiple routes for the same destination).  The reason for this is obvious.  You don’t want traffic to stop in case of a hardware failure.  The same goes for the routing plane.  You don’t want to stop making routing decisions if your routing-engine dies on you.  You want to fail over to a redundant routing engine.

There are a couple of ways to gracefully upgrade JUNOS on a Juniper platform with redundant routing engines.  The first, which I’ll cover in this article, is the manual way.  The second is an in-service software upgrade (ISSU, supported after 9.2 on larger platforms).  ISSU is typically the slower of the two, and when it works you never lose a packet.  However, when it fails, it fails miserably and typically involves consoling into the routing engine for some sort of disaster recovery.

Steps for a manual upgrade of dual routing engines:

  1. Load the newest version of code into the file system of each RE (I use /var/tmp/)
  2. Disable Graceful Routing-Engine Switchover (GRES) and Non-stop routing (NSR) (run these in configuration mode, then commit)


    deactivate chassis redundancy graceful-switchover
    deactivate chassis redundancy failover
    deactivate routing-options nonstop-routing

  3. Upgrade the backup routing-engine and reboot


    request routing-engine login other-routing-engine
    request system software add /var/tmp/jinstall-xyz.tgz
    request system reboot

  4. Wait for the backup routing engine to reboot and come up


    {master}
    bboyd@router> show chassis routing-engine
    Routing Engine status:
    Slot 0:
    Current state Master
    Election priority Master (default)
    Temperature 41 degrees C / 105 degrees F
    CPU temperature 43 degrees C / 109 degrees F
    DRAM 3584 MB
    Memory utilization 31 percent
    CPU utilization:
    User 1 percent
    Background 0 percent
    Kernel 5 percent
    Interrupt 1 percent
    Idle 93 percent
    Model RE-A-2000
    Serial ID 9009038820
    Start time 2014-01-19 10:43:11 UTC
    Uptime 33 minutes, 42 seconds
    Last reboot reason Router rebooted after a normal shutdown.
    Load averages: 1 minute 5 minute 15 minute
    0.15 0.19 0.27

    Routing Engine status:
    Slot 1:
    Current state Backup
    Election priority Backup (default)
    Temperature 39 degrees C / 102 degrees F
    CPU temperature 42 degrees C / 107 degrees F
    DRAM 3584 MB
    Memory utilization 28 percent
    CPU utilization:
    User 1 percent
    Background 0 percent
    Kernel 1 percent
    Interrupt 1 percent
    Idle 98 percent
    Model RE-A-2000
    Serial ID 9009037599
    Start time 2014-01-18 11:56:55 UTC
    Uptime 23 hours, 19 minutes, 55 seconds
    Last reboot reason Router rebooted after a normal shutdown.
    Load averages: 1 minute 5 minute 15 minute
    0.01 0.04 0.07

  5. Manually fail over the routing plane to the routing engine with the new software. Some downtime will occur while the routing protocols re-establish (typically ~60 seconds).


    {master}
    bboyd@router> request chassis routing-engine master switch check

    {master}
    bboyd@router> request chassis routing-engine master switch
    Toggle mastership between routing engines ? [yes,no] (no) yes

    Resolving mastership...

    Connection reset by peer

  6. Upgrade the original primary routing-engine and reboot


    request routing-engine login other-routing-engine
    request system software add /var/tmp/jinstall-xyz.tgz
    request system reboot

  7. Verify the newly upgraded routing engine has come back up


    {master}
    bboyd@router> show chassis routing-engine
    Routing Engine status:
    Slot 0:
    Current state Backup
    Election priority Backup (default)
    Temperature 41 degrees C / 105 degrees F
    CPU temperature 43 degrees C / 109 degrees F
    DRAM 3584 MB
    Memory utilization 31 percent
    CPU utilization:
    User 1 percent
    Background 0 percent
    Kernel 5 percent
    Interrupt 1 percent
    Idle 93 percent
    Model RE-A-2000
    Serial ID 9009038820
    Start time 2014-01-19 10:43:11 UTC
    Uptime 33 minutes, 42 seconds
    Last reboot reason Router rebooted after a normal shutdown.
    Load averages: 1 minute 5 minute 15 minute
    0.15 0.19 0.27

    Routing Engine status:
    Slot 1:
    Current state Master
    Election priority Master (default)
    Temperature 39 degrees C / 102 degrees F
    CPU temperature 42 degrees C / 107 degrees F
    DRAM 3584 MB
    Memory utilization 28 percent
    CPU utilization:
    User 1 percent
    Background 0 percent
    Kernel 1 percent
    Interrupt 1 percent
    Idle 98 percent
    Model RE-A-2000
    Serial ID 9009037599
    Start time 2014-01-18 11:56:55 UTC
    Uptime 23 hours, 19 minutes, 55 seconds
    Last reboot reason Router rebooted after a normal shutdown.
    Load averages: 1 minute 5 minute 15 minute
    0.01 0.04 0.07

  8. Re-enable GRES and NSR (run these in configuration mode, then commit)


    activate chassis redundancy graceful-switchover
    activate chassis redundancy failover
    activate routing-options nonstop-routing

  9. Log in to the newly upgraded routing engine and check switchover status


    {backup}
    bboyd@router> show system switchover
    Graceful switchover: On
    Configuration database: Ready
    Kernel database: Synchronizing
    Peer state: Steady State

  10. Perform the routing-engine switch on the master RE


    {master}
    bboyd@router> request chassis routing-engine master switch
    Toggle mastership between routing engines ? [yes,no] (no) yes

    Resolving mastership...

This process should give you minimal downtime.
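
If you script the checks in steps 4, 7, and 9 instead of eyeballing them, the idea can be sketched in a few lines of Python. This is only a rough sketch: the function names are my own, the sample text is trimmed from the output shown above, and treating "Ready" as the finished state for the kernel database is an assumption based on the "Configuration database: Ready" line. How you collect the CLI output (SSH, NETCONF, copy and paste) is left out entirely.

```python
import re

# Trimmed copies of the outputs shown in steps 7 and 9 of this article.
RE_OUTPUT = """\
Routing Engine status:
Slot 0:
Current state Backup
Election priority Backup (default)
Routing Engine status:
Slot 1:
Current state Master
Election priority Master (default)
"""

SWITCHOVER_OUTPUT = """\
Graceful switchover: On
Configuration database: Ready
Kernel database: Synchronizing
Peer state: Steady State
"""


def parse_re_states(output):
    """Map each RE slot number to its 'Current state' value."""
    states = {}
    slot = None
    for line in output.splitlines():
        line = line.strip()
        m = re.match(r"Slot (\d+):", line)
        if m:
            slot = int(m.group(1))
            continue
        m = re.match(r"Current state\s+(\S+)", line)
        if m and slot is not None:
            states[slot] = m.group(1)
    return states


def parse_switchover(output):
    """Turn the 'Key: value' lines of 'show system switchover' into a dict."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def safe_to_switch(re_output, switchover_output):
    """True only if one RE is Master, one is Backup, and sync looks finished."""
    states = parse_re_states(re_output)
    sw = parse_switchover(switchover_output)
    both_up = sorted(states.values()) == ["Backup", "Master"]
    # Assumption: "Ready" marks a fully synchronized kernel database.
    synced = (
        sw.get("Graceful switchover") == "On"
        and sw.get("Configuration database") == "Ready"
        and sw.get("Kernel database") == "Ready"
    )
    return both_up and synced


print(parse_re_states(RE_OUTPUT))
print(safe_to_switch(RE_OUTPUT, SWITCHOVER_OUTPUT))
```

With the step 9 output above, the check comes back false because the kernel database is still synchronizing, which is a hint to wait a bit before doing the final switch in step 10.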

If you want zero downtime, look into using ISSU, but be warned that if ISSU fails, your downtime grows considerably.

  • I wouldn’t recommend ISSU before JUNOS 12.1.
  • I also NEVER recommend ISSU from one major release to another (for example, 10 to 11 or 11 to 12).
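
Those two rules of thumb are easy to turn into a pre-check. The sketch below is just my own recommendation written as code, not anything Juniper ships: the function names are made up, the version strings are assumed to start with "major.minor" (as in 12.1R3.5), and I am reading "before 12.1" as applying to the release the router is currently running.

```python
import re


def major_minor(version):
    """Pull the leading 'major.minor' numbers out of a JUNOS version string."""
    m = re.match(r"(\d+)\.(\d+)", version)
    if not m:
        raise ValueError(f"unrecognized JUNOS version: {version!r}")
    return int(m.group(1)), int(m.group(2))


def issu_advisable(current, target):
    """Apply the two rules of thumb from this article (my opinion, not a vendor rule)."""
    cur = major_minor(current)
    tgt = major_minor(target)
    if cur < (12, 1):
        # Rule 1: no ISSU when running anything before 12.1.
        return False
    # Rule 2: no ISSU across major releases.
    return cur[0] == tgt[0]


print(issu_advisable("12.1R3.5", "12.3R2.5"))
print(issu_advisable("11.4R7.5", "12.1R1.9"))
```

Anything that fails this check is a candidate for the manual procedure above instead.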

Some reference links from Juniper’s Documentation:

JUNOS OS High Availability Configuration Guide

http://www.juniper.net/techpubs/en_US/junos11.4/topics/reference/requirements/gres-system-requirements.html

http://www.juniper.net/techpubs/en_US/junos13.1/topics/task/configuration/gres-configuring.html

http://www.juniper.net/techpubs/en_US/junos11.4/topics/concept/gres-overview.html

 

