VXLAN v4
This chapter describes the implementation for VXLAN tunnels that use IPv4 in the underlay.
VXLAN configuration
VXLAN on SR Linux uses a tunnel model where VXLAN interfaces are bound to network instances, in the same way that subinterfaces are bound to network instances. Up to one VXLAN interface per network instance is supported.
Configuration of VXLAN on SR Linux is tied to EVPN. VXLAN configuration consists of the following steps:
1. Configure a tunnel interface and VXLAN interface.
A tunnel interface for VXLAN is configured as vxlan<N>, where N can be 0 to 255.
A VXLAN interface is configured under a tunnel interface. At a minimum, a VXLAN interface must have an index, type, and the ingress VXLAN network identifier (VNI).
The index can be a number in the range 0 to 4294967295.
The type can be bridged or routed and indicates whether the vxlan-interface can be linked to a MAC-VRF (bridged) or IP-VRF (routed).
The system looks for the ingress VNI in incoming VXLAN packets to classify them to this VXLAN interface and its network instance.
Configuration of an explicit VXLAN interface egress source IP is not permitted, given that the data path supports one source tunnel IP address for all VXLAN interfaces. The source IP used in the VXLAN interfaces is the IPv4 address of subinterface system0.0 in the default network instance.
2. Associate the VXLAN interface to a network instance.
A VXLAN interface can only be associated with one network instance, and a network instance can have only one VXLAN interface.
3. Associate the VXLAN interface to a BGP-EVPN instance.
The VXLAN interface must be linked to a BGP-EVPN instance so that VXLAN destinations can be dynamically discovered and used to forward traffic.
The following configuration example shows these steps:
tunnel-interface vxlan1 {
    // (Step 1) Creation of the tunnel-interface and vxlan-interface
    vxlan-interface 1 {
        type bridged
        ingress {
            vni 1
        }
        egress {
            source-ip use-system-ipv4-address
        }
    }
}
network-instance blue {
    type mac-vrf
    admin-state enable
    description "network instance blue"
    interface ethernet-1/2.1 {
    }
    vxlan-interface vxlan1.1 {
        // (Step 2) Association of the vxlan-interface to the network-instance
    }
    protocols {
        bgp-evpn {
            bgp-instance 1 {
                admin-state enable
                vxlan-interface vxlan1.1
                // (Step 3) Association of the vxlan-interface to the bgp-evpn instance
                evi 1
            }
        }
        bgp-vpn {
            bgp-instance 1 {
                route-distinguisher {
                    route-distinguisher 1.1.1.1:1
                }
                route-target {
                    export-rt target:1234:1
                    import-rt target:1234:1
                }
            }
        }
    }
}
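The value ranges and the one-to-one binding rule described above can be checked before a configuration is committed. The following Python sketch is illustrative only: the function names and the shape of the inputs are assumptions made for this example and are not part of SR Linux.

```python
import re

TUNNEL_NAME = re.compile(r"vxlan(\d+)")
VALID_TYPES = {"bridged", "routed"}


def check_vxlan_interface(tunnel_name, index, if_type):
    """Return a list of problems; an empty list means all values are in range."""
    problems = []
    match = TUNNEL_NAME.fullmatch(tunnel_name)
    if match is None or not 0 <= int(match.group(1)) <= 255:
        problems.append("tunnel interface must be vxlan<N> with N in 0..255")
    if not 0 <= index <= 4294967295:
        problems.append("vxlan-interface index must be in 0..4294967295")
    if if_type not in VALID_TYPES:
        problems.append("type must be bridged or routed")
    return problems


def check_bindings(bindings):
    """bindings: list of (network_instance, vxlan_interface) pairs.

    Enforces one VXLAN interface per network instance and one network
    instance per VXLAN interface.
    """
    problems = []
    by_instance, by_interface = {}, {}
    for instance, vxlan_if in bindings:
        if by_instance.setdefault(instance, vxlan_if) != vxlan_if:
            problems.append(f"{instance} has more than one VXLAN interface")
        if by_interface.setdefault(vxlan_if, instance) != instance:
            problems.append(f"{vxlan_if} is bound to more than one network instance")
    return problems
```

For the configuration example above, check_vxlan_interface("vxlan1", 1, "bridged") and check_bindings([("blue", "vxlan1.1")]) both return an empty list.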
When EVPN routes are received with VXLAN encapsulation, the SR Linux creates VXLAN Termination Endpoints (VTEPs) from the EVPN route next-hops, and each VTEP is allocated an index number (per source and destination tunnel IP addresses).
When a VTEP is created in the vxlan-tunnel table and a non-zero index is allocated, a tunnel-table entry is also created for the tunnel in the tunnel-table.
If the next hop is not resolved to any route in the network-instance default route-table, the index in the vxlan-tunnel table shows as 0 for the VTEP, and no tunnel-table entry is created in the tunnel-table for that VTEP.
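The relationship between the VTEP index and the tunnel-table can be summarized in a few lines of Python. This is a minimal sketch of the rule stated above, not the vxlan_mgr implementation; the function name is hypothetical and the third VTEP is an invented unresolved entry added for illustration.

```python
def tunnel_table_entries(vteps):
    """vteps maps a VTEP address to the index shown in the vxlan-tunnel table.

    An index of 0 means the next hop is unresolved, so no tunnel-table
    entry exists for that VTEP.
    """
    return [f"{address}/32" for address, index in vteps.items() if index != 0]


vteps = {
    "100.0.0.1": 433965152188,
    "100.0.0.2": 433965152189,
    "192.0.2.9": 0,  # hypothetical unresolved VTEP, not taken from the source
}
print(tunnel_table_entries(vteps))
```

Only the two VTEPs with a non-zero index produce a /32 tunnel-table prefix.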
The following example shows the vxlan-tunnel entries and tunnel-table entries created on reception of IMET routes from remote PEs (five VTEPs in this output).
--{ + candidate shared default }--[ ]--
A:root@leaf3# info from state with-context tunnel vxlan-tunnel vtep *
tunnel {
    vxlan-tunnel {
        vtep 100.0.0.1 {
            index 433965152188
            // Index allocated per source and destination tunnel IP addresses.
            // An index of 0 would mean that 100.0.0.1 is not resolved in the route-table
            // and no tunnel-table entry is created.
            last-change "2025-10-07T11:34:01.000Z (27 minutes ago)"
        }
        vtep 100.0.0.2 {
            index 433965152189
            last-change "2025-10-07T11:34:01.000Z (27 minutes ago)"
        }
        vtep 100.0.0.4 {
            index 433965152190
            last-change "2025-10-07T11:34:01.000Z (27 minutes ago)"
        }
        vtep 100.0.0.111 {
            index 433965152197
            last-change "2025-10-07T11:34:06.000Z (27 minutes ago)"
        }
        vtep 100.0.0.222 {
            index 433965152199
            last-change "2025-10-07T11:34:10.000Z (26 minutes ago)"
        }
    }
}
--{ + candidate shared default }--[ ]--
A:root@leaf3# info from state network-instance default tunnel-table ipv4
tunnel 100.0.0.1/32 type vxlan owner vxlan_mgr id 1 {
    // tunnel table entry for VTEP 100.0.0.1, created after the vxlan-tunnel vtep 100.0.0.1
    next-hop-group 433965152204
    metric 0
    preference 0
    last-app-update "2025-10-07T11:34:01.719Z (27 minutes ago)"
    resource-allocation-failed false
    fib-programming {
        last-successful-operation-type modify
        last-successful-operation-timestamp "2025-10-07T11:34:01.719Z (27 minutes ago)"
        pending-operation-type none
        last-failed-operation-type none
    }
    vxlan {
        destination-address 100.0.0.1
        source-address 100.0.0.3
        time-to-live 255
    }
}
tunnel 100.0.0.2/32 type vxlan owner vxlan_mgr id 1 {
    next-hop-group 433965152204
    metric 0
    preference 0
    last-app-update "2025-10-07T11:34:01.719Z (27 minutes ago)"
    resource-allocation-failed false
    fib-programming {
        last-successful-operation-type modify
        last-successful-operation-timestamp "2025-10-07T11:34:01.719Z (27 minutes ago)"
        pending-operation-type none
        last-failed-operation-type none
    }
    vxlan {
        destination-address 100.0.0.2
        source-address 100.0.0.3
        time-to-live 255
    }
}
tunnel 100.0.0.4/32 type vxlan owner vxlan_mgr id 1 {
    next-hop-group 433965152204
    metric 0
    preference 0
    last-app-update "2025-10-07T11:34:01.719Z (27 minutes ago)"
    resource-allocation-failed false
    fib-programming {
        last-successful-operation-type modify
        last-successful-operation-timestamp "2025-10-07T11:34:01.719Z (27 minutes ago)"
        pending-operation-type none
        last-failed-operation-type none
    }
    vxlan {
        destination-address 100.0.0.4
        source-address 100.0.0.3
        time-to-live 255
    }
}
tunnel 100.0.0.111/32 type vxlan owner vxlan_mgr id 1 {
    next-hop-group 433965152204
    metric 0
    preference 0
    last-app-update "2025-10-07T11:34:06.745Z (27 minutes ago)"
    resource-allocation-failed false
    fib-programming {
        last-successful-operation-type modify
        last-successful-operation-timestamp "2025-10-07T11:34:06.745Z (27 minutes ago)"
        pending-operation-type none
        last-failed-operation-type none
    }
    vxlan {
        destination-address 100.0.0.111
        source-address 100.0.0.3
        time-to-live 255
    }
}
tunnel 100.0.0.222/32 type vxlan owner vxlan_mgr id 1 {
    next-hop-group 433965152204
    metric 0
    preference 0
    last-app-update "2025-10-07T11:34:10.185Z (27 minutes ago)"
    resource-allocation-failed false
    fib-programming {
        last-successful-operation-type modify
        last-successful-operation-timestamp "2025-10-07T11:34:10.186Z (27 minutes ago)"
        pending-operation-type none
        last-failed-operation-type none
    }
    vxlan {
        destination-address 100.0.0.222
        source-address 100.0.0.3
        time-to-live 255
    }
}
statistics {
    active-tunnels 5
    total-tunnels 5
}
tunnel-summary {
    tunnel-type vxlan {
        active-tunnels 5
    }
}
--{ + candidate shared default }--[ ]--
A:root@leaf3# info from state network-instance default route-table next-hop-group 433965152204
backup-next-hop-group 0
backup-active false
fib-programming {
    last-successful-operation-type add
    last-successful-operation-timestamp "2025-10-07T11:34:01.719Z (28 minutes ago)"
    pending-operation-type none
    last-failed-operation-type none
}
next-hop 0 {
    next-hop 433965152203
    // NH ID allocated by fib_mgr for the NHG-ID
    resolved not-applicable
    resource-allocation-failed false
}
next-hop 1 {
    next-hop 433965152191
    resolved not-applicable
    resource-allocation-failed false
}
--{ + candidate shared default }--[ ]--
A:root@leaf3# info from state network-instance default route-table next-hop 433965152203
resource-allocation-failed false
type direct
// resolution of the NH ID
ip-address fe80::18f5:dff:feff:3
subinterface ethernet-1/11.0
Source and destination VTEP addresses
In the network egress direction, the vxlan-interface/egress/source IP leaf determines the
loopback interface that the system uses to source VXLAN packets (outer IP SA). The
source IP used in the VXLAN interfaces is the IPv4 address of subinterface
system0.0 in the default network instance.
The egress VTEP (outer IP DA) is determined by EVPN and must be of the same address family (IPv4) as the configured source IP.
Only unicast VXLAN tunnels are supported (outer IP DA is always unicast), and ingress replication is used to deliver BUM frames to the remote VTEPs in the current release.
In the network ingress direction, the outer IP DA must match one of the local loopback IP addresses in the default network instance for the packet to move to the VNI lookup stage. Only loopback interfaces qualify; other interfaces in the default network instance, such as IRB subinterfaces, do not. The loopback IP address does not need to match the configured source IP address in the VXLAN interface.
The system can terminate any VXLAN packet with an outer destination IP matching a local loopback address, with no set restriction on the number of IPs.
Ingress or egress VNI
The configured ingress VNI determines the value used by the ingress lookup to find the network instance for a further MAC/IP lookup. The egress VNI is specified by EVPN. The system ignores the value of the I flag on reception. According to RFC 7348, the I flag must be set to 1. However, the system accepts VXLAN packets with I flag set to 0; the I flag is set to 1 on transmission.
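The I flag handling can be illustrated with the 8-byte VXLAN header layout from RFC 7348 (8 flag bits, 24 reserved bits, a 24-bit VNI, and 8 reserved bits). The Python sketch below is a model of the behavior described above, not SR Linux code; the function names are chosen for this example.

```python
import struct

I_FLAG = 0x08  # RFC 7348: flag bit that marks the VNI field as valid


def build_vxlan_header(vni):
    """Build the 8-byte VXLAN header with the I flag set to 1 (transmit side)."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI is a 24-bit value")
    # Word 0: flags (8 bits) + reserved (24 bits)
    # Word 1: VNI (24 bits) + reserved (8 bits)
    return struct.pack("!II", I_FLAG << 24, vni << 8)


def parse_vni(header):
    """Return the VNI from a received header.

    The I flag is read but deliberately not enforced, mirroring a receiver
    that accepts packets with the I flag set to 0.
    """
    word0, word1 = struct.unpack("!II", header[:8])
    _i_flag_set = bool((word0 >> 24) & I_FLAG)  # ignored on reception
    return word1 >> 8
```

A header built for VNI 1 parses back to 1, and so does the same header with the flags byte cleared.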
VLAN tagging for VXLAN
Outer VLAN tagging is supported (one VLAN tag only), assuming that the egress subinterface in the default network instance uses VLAN tagging.
Inner VLAN tagging is transparent, and no specific handling is needed at network ingress for Layer 2 network instances. Inner VLAN tagging is not supported for VXLAN-originated traffic or VXLAN-terminated traffic in IP-VRF network instances that are BGP-EVPN IFL enabled.
Network instance and interface MTU
No specific MTU checks are done in network instances configured with VXLAN. Make the default network-instance interface MTU large enough to allow room for the VXLAN overhead. If the size of the egress VXLAN packets exceeds the IP MTU of the egress subinterface in the default network instance, the packets are still forwarded. No statistics are collected, other than those for forwarded packets.
IP MTU checks are used only for the overlay domain; that is, for interfaces doing inner packet modifications. IP MTU checks are not done for VXLAN-encapsulated packets on egress subinterfaces of the default network instance (which are in the underlay domain).
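Because the system does not check the underlay IP MTU for VXLAN-encapsulated packets, it helps to compute the required value when planning the default network-instance interfaces. The sketch below uses the standard header sizes for VXLAN over IPv4 without IP options; the function name is an assumption made for this example.

```python
OUTER_IPV4 = 20       # outer IPv4 header without options
OUTER_UDP = 8         # outer UDP header
VXLAN_HEADER = 8      # VXLAN header (RFC 7348)
ETHERNET_HEADER = 14  # inner Ethernet header carried as payload
VLAN_TAG = 4          # each 802.1Q tag on the inner frame


def underlay_ip_mtu(inner_ip_mtu, inner_vlan_tags=0):
    """Smallest underlay IP MTU that carries the encapsulated frame whole."""
    inner_frame = inner_ip_mtu + ETHERNET_HEADER + inner_vlan_tags * VLAN_TAG
    return inner_frame + VXLAN_HEADER + OUTER_UDP + OUTER_IPV4
```

For an overlay IP MTU of 1500 with untagged inner frames, the underlay subinterface needs an IP MTU of at least 1550 bytes; one inner VLAN tag raises this to 1554.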
Fragmentation for VXLAN traffic
Fragmentation for VXLAN traffic is handled as follows:
The Don't Fragment (DF) flag is set in the VXLAN outer IP header.
The TTL of the VXLAN outer IP header is always 255.
No reassembly is supported for VXLAN packets.
VXLAN and ECMP
Unicast traffic forwarded to VXLAN destinations can be load-balanced on network (underlay) ECMP links or overlay aliasing destinations.
Network LAGs, that is, LAG subinterfaces in the default network instance, are not supported when VXLAN is enabled on 7220 IXR-Dx platforms. LAG access subinterfaces on MAC-VRFs are supported on all platforms.
VXLAN-originated packets support double spraying based on overlay ECMP (or aliasing) and underlay ECMP on the default network instance.
Load-balancing is supported for the following:
encapsulated Layer 2 unicast frames (coming from a subinterface within the same broadcast domain)
Layer 3 frames coming from an IRB subinterface
For BUM frames, load balancing operates as follows:
BUM supports spraying in access LAGs based on a hash. That is, BUM flows received from a VXLAN or a Layer 2 subinterface of a MAC-VRF are sprayed across egress access LAG links.
BUM does not support spraying in underlay VXLAN next-hops. That is, BUM flows received from VXLAN or a Layer 2 subinterface of a MAC-VRF are sent to a single underlay subinterface.
BUM VXLAN packets are sent to a single member of the next-hop group (NHG) associated with a specific VXLAN multicast destination. The chosen member is based on a hash of the NHG-ID of the VXLAN destination and the number of links in the NHG.
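The single-member selection for BUM traffic can be pictured as follows. The text does not specify the hash function, so the modulo operation in this Python sketch is only a stand-in (an assumption) that shows the inputs involved: the NHG-ID and the number of links in the NHG.

```python
def pick_bum_member(nhg_id, members):
    """Pick the one NHG member used for all BUM traffic to a VXLAN destination.

    The modulo stands in for the real hash, which is not documented here.
    """
    if not members:
        raise ValueError("next-hop group has no members")
    return members[nhg_id % len(members)]
```

Because the result depends only on the NHG-ID and the member count, every BUM flow toward the same VXLAN destination leaves on the same underlay next hop, unlike unicast traffic, which is sprayed per flow.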
VXLAN ACLs
On 7220 IXR-Dx platforms, you can configure system filter ACLs to drop incoming VXLAN packets for reasons such as the following:
the source IP is not recognized
the destination IP is not an address to be used for termination
the default destination UDP port is not being used
SR Linux supports logging of VXLAN packets dropped by ACL policies.
A system filter ACL is an IPv4 or IPv6 ACL that is evaluated before tunnel termination has occurred and before interface ACLs have been applied. A system filter can match and drop unauthorized VXLAN tunnel packets before they are decapsulated. When a system filter ACL is created, its rules are evaluated against all transit and terminating IPv4 or IPv6 traffic that is arriving on any subinterface of the router, regardless of where that traffic entered in terms of network instance, subinterface, and so on.
System filters match the outer header of tunneled packets; they do not filter the payload of VXLAN tunnels. If the system filter does not drop the VXLAN-terminated packets, only egress IRB ACLs can match the inner packets. System filters can be applied only at ingress, not egress.
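The three drop reasons listed earlier can be modeled as a check on the outer header of a packet that is already known to be a VXLAN candidate. This Python sketch is not ACL syntax; the function and parameter names are hypothetical, and 4789 is the IANA-assigned default VXLAN UDP port.

```python
from ipaddress import ip_address

VXLAN_UDP_PORT = 4789  # IANA-assigned default VXLAN destination port


def system_filter_action(src, dst, udp_dport, known_vteps, termination_ips):
    """Return 'drop' or 'accept' based only on the outer header fields."""
    if ip_address(src) not in known_vteps:
        return "drop"  # the source IP is not recognized
    if ip_address(dst) not in termination_ips:
        return "drop"  # the destination IP is not a termination address
    if udp_dport != VXLAN_UDP_PORT:
        return "drop"  # the default destination UDP port is not being used
    return "accept"


known_vteps = {ip_address("100.0.0.1"), ip_address("100.0.0.2")}
termination_ips = {ip_address("100.0.0.3")}
```

With these sets, a packet from 100.0.0.1 to 100.0.0.3 on UDP port 4789 is accepted, and a packet from any other source is dropped before decapsulation.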
See the SR Linux Configuration Basics Guide for information about configuring system filter ACLs.
QoS for VXLAN tunnels
When the SR Linux receives a terminating VXLAN packet on a subinterface, it classifies the packet to one of eight forwarding classes and one of three drop probabilities (low, medium, or high). On 7220 IXR-Dx platforms, the classification is based on the following considerations:
The outer IP header DSCP is not considered.
If the payload packet is non-IP, the classified FC is fc0 and the classified drop probability is low.
If the payload packet is IP, and there is a classifier policy referenced by the qos classifiers vxlan-default command, that policy is used to determine the FC and drop probability from the header fields of the payload packet.
If the payload packet is IP, and there is no classifier policy referenced by the qos classifiers vxlan-default command, the default DSCP classifier policy is used to determine the FC and drop probability from the header fields of the payload packet.
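The classification rules in this list reduce to a short decision procedure. In the Python sketch below, classifier policies are modeled as dictionaries that map an inner DSCP value to a (forwarding class, drop probability) pair; that representation and the function name are assumptions made for illustration.

```python
def classify_terminated_packet(payload_is_ip, dscp, vxlan_default_policy,
                               default_dscp_policy):
    """Return (forwarding_class, drop_probability) for a decapsulated packet.

    The outer IP header DSCP is never an input to this decision.
    """
    if not payload_is_ip:
        # Non-IP payload: fc0 with the lowest drop probability.
        return ("fc0", "low")
    # A policy referenced by vxlan-default wins; otherwise fall back to
    # the default DSCP classifier policy.
    if vxlan_default_policy is not None:
        return vxlan_default_policy[dscp]
    return default_dscp_policy[dscp]


# Hypothetical policy contents, not taken from any SR Linux default.
default_policy = {0: ("fc0", "low"), 46: ("fc5", "low")}
custom_policy = {0: ("fc1", "medium"), 46: ("fc7", "low")}
```

For an IP payload with DSCP 46, the result is ("fc7", "low") when the custom policy is referenced and ("fc5", "low") when it is not.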
When the SR Linux adds VXLAN encapsulation to a packet and forwards it out a subinterface, the inner header IP DSCP value is not modified if the payload packet is IP, even if the egress routed subinterface has a DSCP rewrite rule policy bound to it that matches the packet FC and drop probability. The outer header IP DSCP is set to a static value or copied from the inner header IP DSCP. However, this static or copied value is modified by the DSCP rewrite rule policy that is bound to the egress routed subinterface, if the rule policy exists.
See the 7250 IXR Quality of Service Guide and the 7730 SXR SR Linux Quality of Service Guide for information about configuring QoS on other platforms.
Configuring a VXLAN classifier policy
On 7220 IXR-Dx platforms, you can specify a classifier policy that applies to ingress packets received from any remote VXLAN VTEP. The policy applies to payload packets after VXLAN decapsulation has been performed.
The following example specifies a VXLAN classifier policy:
--{ * candidate shared default }--[ ]--
# info with-context qos
qos {
    classifiers {
        vxlan-default p1 {
            traffic-class 1 {
                forwarding-class fc0
            }
        }
    }
}
Asymmetric VXLAN VNIs
Asymmetric VNIs refer to the capability of a network instance to use different VNIs for ingress and egress traffic, instead of a single common VNI. Asymmetric VNIs are supported in MAC-VRF and IP-VRF network instances.
In MAC-VRF network instances, received MAC/IP, IMET, or AD per EVI routes with VNIs that do not match the local VNI are accepted.
In IP-VRF network instances, received IP prefixes or MAC/IP Advertisement routes with VNIs that do not match the local VNI are accepted.
There is only one unique VNI per VTEP per network instance. If a network instance receives multiple egress VNIs for the same VTEP, only the lowest egress VNI is programmed for the network instance toward that VTEP. The entries that are not programmed due to this VNI conflict will expose a next hop with oper-state down-vni-conflict. In addition, the egress VNI may not be programmed if there are insufficient hardware resources. If that is the case, the non-programmed next hop will show an oper-state failed. This is valid for all routes that can cause a conflict in the network instance.
# info detail from state platform linecard 1 forwarding-complex 0 fib-table next-hop-group 24467420
oper-state down
backup-next-hop-group 0
backup-active false
next-hop 0 {
    next-hop 24467410
    oper-state down-vni-conflict
}
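The lowest-VNI rule can be expressed as a small Python sketch. It models only the conflict case; the hardware-resource case (oper-state failed) is not modeled, and the "up" state name for programmed next hops is an assumption, because the text names only the non-programmed states.

```python
def program_egress_vnis(next_hops):
    """next_hops: list of (vtep, egress_vni) pairs learned in one network instance.

    Returns {(vtep, vni): oper_state}. Only the lowest egress VNI per VTEP
    is programmed; the others are marked down-vni-conflict.
    """
    lowest = {}
    for vtep, vni in next_hops:
        lowest[vtep] = min(vni, lowest.get(vtep, vni))
    return {
        (vtep, vni): "up" if vni == lowest[vtep] else "down-vni-conflict"
        for vtep, vni in next_hops
    }
```

If VTEP 100.0.0.1 is learned with egress VNIs 10 and 20, only the next hop with VNI 10 is programmed, and the next hop with VNI 20 is reported as down-vni-conflict.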
VXLAN statistics collection
You can configure SR Linux to collect statistics for VXLAN tunnels on 7220 IXR-D2/D3/D2L/D3L platforms. By default, statistics collection for VXLAN tunnels is disabled.
System resources allocated to statistics collection are distributed among Layer 3 or IRB subinterface statistics (if configured), Layer 2 subinterface statistics, and VXLAN statistics.
When Layer 3 or IRB subinterfaces are configured, enabling VXLAN statistics collection reduces the amount of resources available for collecting Layer 2 subinterface statistics:
- If VXLAN statistics collection is disabled (the default), resources are available for collecting statistics for all supported Layer 2 subinterfaces.
- If VXLAN statistics collection is enabled, resources are shared between Layer 2 subinterfaces and VXLAN tunnels, reducing the Layer 2 subinterface scale.
If the maximum scale is required for Layer 2 subinterfaces, VXLAN statistics collection must be disabled. Disabling VXLAN statistics collection deallocates the resources for VXLAN statistics collection and allocates them to Layer 2 subinterface statistics collection. The VXLAN statistics are not cleared, however; to clear them, use the appropriate tools command. See Clearing VXLAN statistics.
- The amount of resources available for collecting Layer 3 or IRB subinterface statistics does not change if VXLAN statistics collection is enabled or disabled.
Enabling VXLAN statistics collection
To enable VXLAN statistics collection, use the following command. Enabling VXLAN statistics collection disables and re-enables statistics collection on the subinterfaces. During this transition, statistics are not counted for the subinterfaces.
After enabling VXLAN statistics collection, if you attempt to configure more than the maximum number of Layer 2 subinterfaces supported with VXLAN statistics, an error message is displayed. If more than the maximum number of Layer 2 subinterfaces with VXLAN statistics are already configured in the system, VXLAN statistics collection cannot be enabled.
--{ candidate shared default }--[ ]--
# info with-context tunnel vxlan-tunnel statistics
tunnel {
    vxlan-tunnel {
        statistics {
            admin-state enable
        }
    }
}
Displaying VXLAN statistics
To display statistics for all VXLAN tunnels or for a specified VTEP, use the info from state command in candidate or running mode, or the info command in state mode.
Display statistics for all VXLAN tunnels
--{ running }--[ ]--
# info from state with-context vxlan-tunnel statistics
vxlan-tunnel {
    statistics {
        in-octets 7296882
        in-packets 83012
        in-discarded-packets 5
        out-octets 7297496
        out-packets 83007
        last-clear 2021-01-29T21:58:40.919Z
    }
}
Display statistics for a specified VTEP
--{ running }--[ ]--
# info from state with-context vxlan-tunnel vtep 10.22.22.2
vxlan-tunnel {
    vtep {
        address 10.22.22.2
        index 677716962894
        last-change 2021-01-29T21:52:34.151Z
        statistics {
            in-octets 7296882
            in-packets 83012
            in-discarded-packets 5
            out-octets 7297496
            out-packets 83007
            last-clear 2021-01-29T21:58:40.919Z
        }
    }
}
Clearing VXLAN statistics
You can clear the statistics for all VXLAN tunnels or for a specific VTEP.
Clear statistics for all VXLAN tunnels
--{ running }--[ ]--
# tools vxlan-tunnel statistics clear
Clear statistics for a specific VTEP
--{ running }--[ ]--
# tools vxlan-tunnel vtep 10.22.22.2 clear