Quality of service overview

Quality of Service (QoS) provides an appropriate level of service for packets as they flow inside the switch and between switches in the network. The required level of service depends on the application that generates the flow of packets, and can be defined by the application’s sensitivity to packet loss, delay, and jitter.

QoS functionality is supported on the 7250 IXR, 7220 IXR-D2 and D3, and the 7220 IXR-H2 and H3.

Note: In Release 22.6, the 7220 IXR-D5 supports the following subset of SR Linux QoS functionality:
  • DSCP classifier and rewrite-rule policies (VXLAN not supported)
  • Queue depth (unicast only)
  • WRED slope
  • ECN slope
  • WRR
  • Strict priority scheduling
  • Forwarding class peak rate (unicast only)

You can group packets that require a similar treatment (per-hop behavior) into a Forwarding Class (FC), also known as a behavior aggregate. You can specify up to eight FCs. Traffic is scheduled and can optionally be marked based on its FC.

A configurable drop probability expresses the packet loss sensitivity. Assign a low drop probability to packets that are sensitive to loss. To provide the required congestion management and intelligent discard decisions when congestion occurs, balance the traffic classifications between low, medium, and high drop probability.

How QoS works for transit traffic

This section describes how QoS applies to transit packets on the SR Linux.

  1. Packets are received on a subinterface.

  2. Each received packet is classified as belonging to one of eight forwarding classes (fc0 to fc7) and one of three drop probabilities (low, medium, or high).

    • If the configuration of the ingress subinterface refers to a DSCP classifier policy, then the forwarding class and drop probability level are determined from that policy.

      Note: If no entry in this policy matches the received DSCP, then the assigned forwarding class is fc0 and the assigned drop probability is low. This FC and drop probability classification corresponds to a best effort treatment.
    • If there is no DSCP classifier policy bound to the ingress subinterface, the FC and drop probability are determined from the default DSCP classifier policy. See System default DSCP classifier policy.

      Note: On all VLAN-based subinterfaces, the system currently ignores the 802.1p bits for purposes of forwarding class and drop-probability classification.
  3. A forwarding lookup on the packet determines its egress port.

  4. On the 7250 IXR, if the packet is a unicast packet, it is associated with a Virtual Output Queue (VOQ) based on the ingress port, egress port, and FC.

    On a 7220 IXR-D2, D3, and D5 or 7220 IXR-H2 and H3, the packet is associated directly with an Egress Queue (EGQ) of the egress port, based on the FC of the packet and its type (either unicast or multicast).

  5. While it waits for its VOQ or EGQ to be serviced, the packet is stored in buffer memory. The total amount of buffer memory varies by platform.

  6. The packet is dropped if the buffer memory is close to full or if the Maximum Burst Size (MBS) of the VOQ or EGQ is exceeded.

    The MBS is one of the parameters that is configurable in a queue template. When a queue template is applied to a set of queues, all of those queues have the MBS value specified in the template. If the MBS is not specified in a queue template, then the default value is platform dependent. The MBS is not a guaranteed allocation of buffer memory.

  7. When the packet is Explicit Congestion Notification (ECN)-capable, and ECN is enabled globally with the qos explicit-congestion-notification command, and the VOQ or EGQ has an active ECN slope that applies to the packet, the ECN field may be remarked depending on the current (weighted) queue depth.

    • If the current queue depth is below the configured min-threshold-percent of the ECN slope, the ECN field of the packet is unchanged.

    • If the current queue depth is above the configured max-threshold-percent of the ECN slope, the ECN field of the packet is (re)marked as Congestion Experienced (CE), ECN=11.

    • If the current queue depth is between the min-threshold-percent and max-threshold-percent of the ECN slope, the ECN field of the packet is (re)marked as CE, ECN=11, based on a probability function that increases linearly from 0% at the minimum threshold to n% at the maximum threshold, where n is the operational max-probability of marking the packet.

      Note: The operational values of the max-probability may be significantly different from the configured values based on internal hardware calculations. You can check the hardware configured values for any slope calculations.
  8. When the packet is non-ECN-capable (the ECN field is zero) and the egress queue has an active WRED slope for the drop probability of the packet, then the packet may be dropped by the WRED algorithm, which operates as follows:

    • If the current queue depth is below the configured min-threshold-percent of the WRED slope, then the packet is admitted to the queue.

    • If the current queue depth is above the configured max-threshold-percent of the WRED slope, then the packet is dropped.

    • If the current queue depth is between the min-threshold-percent and max-threshold-percent of the WRED slope, then the packet is dropped based on a probability function that increases linearly from 0% at the minimum threshold to n% at the maximum threshold, where n is the operational max-probability of dropping the packet.
      Note: The operational values of the max-probability may be significantly different from the configured values based on internal hardware calculations. You can check the hardware configured values for any WRED slope calculations.
  9. Each unicast queue and each multicast queue of an egress port is associated with a scheduler node. The mapping of queues to scheduler nodes is platform-dependent and cannot be configured. See Output queue scheduling.

  10. Each egress queue can be individually configured with a Peak Information Rate (PIR). The PIR is configured as a percentage of the egress port bandwidth.

    By default, the PIR of each queue is 100%. The operational PIR is stored by the peak-rate-bps leaf in bits per second. The bits counted in this rate include the Layer 2 framing of the packet (including the 14-byte Ethernet header, the 4-byte VLAN header, and the 4-byte CRC) but exclude the 20-byte Layer 1 overhead (SFD, preamble, IPG).

  11. The DSCP field in the IPv4 or IPv6 header of the outgoing packet can be rewritten. On the 7250 IXR, the DSCP field must be rewritten when ECN is enabled and the packet ECN field is non-zero. When there is a rewrite policy applied, the DSCP in the outgoing packet is based on the FC (and potentially also the drop probability) of the packet. If the FC (and drop-probability) matches an entry in the applied policy, then the new DSCP value is based on the policy entry. If there is no matching entry in the applied policy, then the new DSCP value is 0.
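The ECN slope in step 7 and the WRED slope in step 8 share the same piecewise-linear shape. As a rough illustration only, the following Python sketch computes the marking or drop probability for a given queue depth. The function and parameter names are hypothetical (they are not SR Linux configuration leaves), and the behavior exactly at a threshold is an assumption, because the steps above only describe depths below, between, and above the thresholds. Actual hardware uses operational values that can differ from the configured ones.

```python
def slope_probability(depth_pct, min_pct, max_pct, max_prob_pct):
    """Return the mark/drop probability (0.0 to 1.0) for a queue depth.

    All arguments are percentages in the range 0 to 100:
      depth_pct     current (weighted) queue depth
      min_pct       min-threshold-percent of the slope
      max_pct       max-threshold-percent of the slope
      max_prob_pct  operational max-probability of the slope
    """
    if depth_pct < min_pct:
        return 0.0  # below the minimum threshold: never marked or dropped
    if depth_pct > max_pct:
        return 1.0  # above the maximum threshold: always marked or dropped
    if max_pct == min_pct:
        # Degenerate slope; treating it as the maximum probability is an assumption.
        return max_prob_pct / 100.0
    # Linear ramp from 0% at the minimum threshold to max_prob_pct at the maximum.
    fraction = (depth_pct - min_pct) / (max_pct - min_pct)
    return fraction * max_prob_pct / 100.0


if __name__ == "__main__":
    for depth in (10, 20, 50, 80, 90):
        print(depth, slope_probability(depth, 20, 80, 50))
```

With a slope of 20% to 80% and a maximum probability of 50%, a queue that is 50% full sits halfway up the ramp, so the sketch returns 0.25. For an ECN slope the result is the chance that an ECN-capable packet is marked CE; for a WRED slope it is the chance that a non-ECN-capable packet is dropped.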

System default DSCP classifier policy

Table 1. System default DSCP classifier policy
| DSCP values | Included DSCP names | Forwarding class | Drop probability |
|---|---|---|---|
| 0, 2 to 7 | CS0/BE | fc0 | Low |
| 1 | LE | fc0 | High |
| 8 to 11 | CS1, AF11 | fc1 | Low |
| 12 to 13 | AF12 | fc1 | Medium |
| 14 to 15 | AF13 | fc1 | High |
| 16 to 19 | CS2, AF21 | fc2 | Low |
| 20 to 21 | AF22 | fc2 | Medium |
| 22 to 23 | AF23 | fc2 | High |
| 24 to 27 | CS3, AF31 | fc3 | Low |
| 28 to 29 | AF32 | fc3 | Medium |
| 30 to 31 | AF33 | fc3 | High |
| 32 to 35 | CS4, AF41 | fc4 | Low |
| 36 to 37 | AF42 | fc4 | Medium |
| 38 to 39 | AF43 | fc4 | High |
| 40 to 47 | CS5, EF | fc5 | Low |
| 48 to 55 | CS6/NC1 | fc6 | Low |
| 56 to 63 | CS7/NC2 | fc7 | Low |
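
As an illustration of how Table 1 is used, the following Python sketch expands the DSCP ranges into a 64-entry lookup and applies the fallback rules described for transit traffic: the default policy is used when no classifier policy is bound, and an unmatched DSCP in a bound policy is classified as fc0 with low drop probability. The data structure and the names DEFAULT_DSCP_RANGES, build_table, and classify are hypothetical and chosen for this sketch; they do not represent how SR Linux stores or evaluates the policy.

```python
# (first DSCP, last DSCP, forwarding class, drop probability), copied from Table 1.
DEFAULT_DSCP_RANGES = [
    (0, 0, "fc0", "low"),
    (1, 1, "fc0", "high"),
    (2, 7, "fc0", "low"),
    (8, 11, "fc1", "low"),
    (12, 13, "fc1", "medium"),
    (14, 15, "fc1", "high"),
    (16, 19, "fc2", "low"),
    (20, 21, "fc2", "medium"),
    (22, 23, "fc2", "high"),
    (24, 27, "fc3", "low"),
    (28, 29, "fc3", "medium"),
    (30, 31, "fc3", "high"),
    (32, 35, "fc4", "low"),
    (36, 37, "fc4", "medium"),
    (38, 39, "fc4", "high"),
    (40, 47, "fc5", "low"),
    (48, 55, "fc6", "low"),
    (56, 63, "fc7", "low"),
]


def build_table(ranges):
    """Expand inclusive DSCP ranges into a dict keyed by DSCP value."""
    table = {}
    for first, last, fc, drop in ranges:
        for dscp in range(first, last + 1):
            table[dscp] = (fc, drop)
    return table


DEFAULT_CLASSIFIER = build_table(DEFAULT_DSCP_RANGES)


def classify(dscp, policy=None):
    """Return (forwarding class, drop probability) for a received DSCP.

    policy is a dict for a classifier policy bound to the subinterface,
    or None when no policy is bound (the default policy is then used).
    """
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0 to 63")
    table = DEFAULT_CLASSIFIER if policy is None else policy
    # An unmatched DSCP gets best effort treatment: fc0, low.
    return table.get(dscp, ("fc0", "low"))


if __name__ == "__main__":
    print(classify(46))             # EF, default policy
    print(classify(10, policy={}))  # bound policy with no matching entry
```

Here classify(46) resolves EF to fc5 with low drop probability from the default policy, while the empty bound policy in the second call falls back to fc0 and low.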

How QoS works for VXLAN traffic

When the 7220 IXR-D2 and D3 receive a terminating VXLAN packet on a subinterface, they classify the packet to one of eight forwarding classes and one of three drop probabilities (low, medium, or high). The classification is based on the following considerations:

  • The outer IP header DSCP is ignored.

  • If the payload packet is non-IP, the classified FC is fc0 and the classified drop probability is low.

  • If the payload packet is IP, and the qos classifiers vxlan-default command references a classifier policy, that policy is used to determine the FC and drop probability from the header fields of the payload packet.

  • If the payload packet is IP, and the qos classifiers vxlan-default command does not reference a classifier policy, the default DSCP classifier policy is used to determine the FC and drop probability from the header fields of the payload packet.

When the 7220 IXR-D2 and D3 add VXLAN encapsulation to a packet and forward it out a subinterface, the inner header IP DSCP value is not modified if the payload packet is IP, even if the egress routed subinterface has a DSCP rewrite rule policy bound to it that matches the packet FC and drop probability. If a DSCP rewrite policy is bound to the egress routed subinterface, that policy modifies the outer header IP DSCP.
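
The classification rules for terminating VXLAN packets can be sketched in Python as a short decision function. This is only an illustration: the function name, the parameters, and the use of dictionaries to represent classifier policies are hypothetical, and the fc0/low fallback for a payload DSCP that has no entry in the selected policy is an assumption borrowed from the transit traffic description.

```python
BEST_EFFORT = ("fc0", "low")


def classify_terminating_vxlan(outer_dscp, payload_is_ip, payload_dscp,
                               default_policy, vxlan_default_policy=None):
    """Return (forwarding class, drop probability) for a terminating VXLAN packet.

    default_policy        dict modelling the default DSCP classifier policy
    vxlan_default_policy  dict modelling the policy referenced by
                          qos classifiers vxlan-default, or None if unset
    """
    # The outer IP header DSCP is ignored; the parameter exists only to show that.
    del outer_dscp
    if not payload_is_ip:
        return BEST_EFFORT  # non-IP payload: fc0, low
    if vxlan_default_policy is not None:
        policy = vxlan_default_policy
    else:
        policy = default_policy
    # Assumption: an unmatched payload DSCP is treated as best effort.
    return policy.get(payload_dscp, BEST_EFFORT)


if __name__ == "__main__":
    default = {46: ("fc5", "low")}   # one row of the default policy, for brevity
    custom = {46: ("fc7", "low")}    # hypothetical vxlan-default policy
    print(classify_terminating_vxlan(0, True, 46, default))
    print(classify_terminating_vxlan(0, True, 46, default, custom))
    print(classify_terminating_vxlan(46, False, 46, default, custom))
```

The three calls show the default policy being applied to an IP payload, a vxlan-default policy taking precedence when one is referenced, and a non-IP payload being classified as fc0 and low regardless of the outer DSCP.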

How QoS works for router-terminated traffic

This section describes how QoS applies to traffic that terminates on the SR Linux.

  1. A packet is received on a subinterface and is determined to need extraction toward the CPM. The packet is directed to one of the queues associated with the CPM as a destination “physical port” based on its protocol and type. The following traffic types have their own independent queue:

    • sFlow

    • ICMPv4 ping

    • BFD

    • ARP

    • ICMPv6 neighbor solicitation and neighbor advertisement

    • BGP

    • gRPC

    • LLDP

    • IPv4 packets with IP options and IPv6 packets with extension headers

    • DHCPv6

    • IS-IS hello PDUs

    • OSPF/OSPFv3 hello PDUs

  2. Some of the queues toward the CPM have a PIR shaping rate designed to prevent an overload of one type of traffic. The PIR shaping rates vary by platform.

How QoS works for router-originated traffic

This section describes how QoS applies to traffic that originates on the SR Linux.

  1. An application on the SR Linux CPM has an IPv4 or IPv6 packet to send to another system.

  2. The CPM datapath assigns a DSCP to the self-generated packet based on its protocol and the hard-coded mapping shown in Default forwarding class and DSCP marking for router-originated traffic.

    Except for ICMP and ICMPv6 echo-request packets, the DSCP values cannot be overridden. For originated echo-request packets, the DSCP override value can be configured as an optional parameter of the ping command.

  3. The CPM datapath looks up the DSCP from the previous step (either the fixed value or the override value for echo-request) in the default DSCP classifier policy (see System default DSCP classifier policy) to determine the FC and drop probability level.

  4. A forwarding lookup determines the egress port.

  5. On the 7250 IXR, the packet is sent to the egress line card and added to a Virtual Output Queue (VOQ) appropriate for its forwarding class and the egress port. The decision to drop or enqueue the packet in the VOQ and the scheduling of the VOQ follow the previous description for transit traffic. There is no scheduling differentiation between router-originated traffic and transit traffic of the same FC on the egress IMM.

  6. The packet is directed to the egress queue appropriate for its forwarding class and packet type. On the 7220 IXR-D2, D3, and D5 and the 7220 IXR-H2 and H3, the decision to drop or enqueue the packet in the egress queue and the scheduling of the egress queue follow the QoS treatment of transit traffic described in How QoS works for transit traffic.

  7. The DSCP field in the IPv4 or IPv6 header is always written based on the hard-coded mapping described in Default forwarding class and DSCP marking for router-originated traffic. If the packet also matches a DSCP rewrite-rule policy applied to the output subinterface, the rewrite-rule policy is ignored.

Default forwarding class and DSCP marking for router-originated traffic

Table 2. Default forwarding class and DSCP marking for router-originated traffic
| Protocol / message type | Forwarding class | Drop probability | DSCP marking |
|---|---|---|---|
| IPv4 ARP request/reply | 6 | Low | N/A |
| ICMPv4 including echo-request [1], echo-reply [2], dest-unreachable, redirect, time-exceeded, parameter-problem | 0 | Medium | 0 |
| ICMPv4 echo-request with ToS/DSCP override = x | look up x in system-default DSCP classifier | look up x in system-default DSCP classifier | x |
| ICMPv4 echo-reply to echo-request with non-zero DSCP x | look up x in system-default DSCP classifier | look up x in system-default DSCP classifier | x |
| UDP traceroute | 0 | Low | 0 |
| IPv6 neighbor solicitation | 6 | Low | 48 (CS6/NC1) |
| IPv6 neighbor advertisement | 6 | Low | 48 (CS6/NC1) |
| All other ICMPv6 including dest unreachable, packet-too-big, time-exceeded, parameter-problem, echo-request, echo-reply, router-solicitation, redirect | 0 | Medium | 0 |
| ICMPv6 echo-request with DSCP override = x | look up x in system-default DSCP classifier | look up x in system-default DSCP classifier | x |
| ICMPv6 echo-reply to echo-request with non-zero DSCP x | look up x in system-default DSCP classifier | look up x in system-default DSCP classifier | x |
| BFD | 6 | Low | 48 (CS6/NC1) |
| BGP | 6 | Low | 48 (CS6/NC1) |
| DNS query | 4 | Low | 34 (AF41) |
| FTP/TFTP | 4 | Low | 34 (AF41) |
| gNMI | 4 | Low | 34 (AF41) |
| JSON RPC | 4 | Low | 34 (AF41) |
| LLDP | N/A | Low | N/A |
| NTP | 4 | Low | 34 (AF41) |
| sFlow | 0 | Low | 0 |
| SNMP | 4 | Low | 34 (AF41) |
| SSH | 4 | Low | 34 (AF41) |
| Syslog | 4 | Low | 34 (AF41) |
| TACACS+ | 4 | Low | 34 (AF41) |

[1] Echo-request generated by a ping command with no DSCP parameter specified.
[2] Echo-reply to an echo-request packet with DSCP=0.
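
The marking rules in Table 2 can be sketched in Python as a fixed lookup with one exception for echo-request packets that carry a DSCP override. This is a minimal illustration that copies only a few rows of the table; the dictionary keys, the function name originated_marking, and the classify callback (standing in for a lookup in the system default DSCP classifier policy) are all hypothetical.

```python
# A few rows of Table 2: protocol -> (forwarding class, drop probability, DSCP).
# The keys are hypothetical labels chosen for this sketch.
FIXED_MARKING = {
    "icmpv4-echo-request": ("fc0", "medium", 0),
    "icmpv6-echo-request": ("fc0", "medium", 0),
    "bfd": ("fc6", "low", 48),
    "bgp": ("fc6", "low", 48),
    "ssh": ("fc4", "low", 34),
    "sflow": ("fc0", "low", 0),
}

# Only originated echo-request packets accept a DSCP override.
OVERRIDABLE = {"icmpv4-echo-request", "icmpv6-echo-request"}


def originated_marking(protocol, classify, dscp_override=None):
    """Return (forwarding class, drop probability, DSCP) for an originated packet.

    classify is a callable that maps a DSCP value to
    (forwarding class, drop probability); it is used only for overrides.
    """
    if dscp_override is None:
        return FIXED_MARKING[protocol]
    if protocol not in OVERRIDABLE:
        raise ValueError("DSCP can be overridden only for echo-request packets")
    fc, drop = classify(dscp_override)
    return (fc, drop, dscp_override)


if __name__ == "__main__":
    # Stand-in classifier that knows a single DSCP value, for demonstration only.
    def demo_classify(dscp):
        return {46: ("fc5", "low")}.get(dscp, ("fc0", "low"))

    print(originated_marking("bgp", demo_classify))
    print(originated_marking("icmpv4-echo-request", demo_classify, 46))
```

The fixed rows are returned as-is, while an echo-request with an override keeps the override as its DSCP and takes its forwarding class and drop probability from the classifier lookup.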