Fabric intents with unmanaged nodes

Typically, the process of creating a fabric intent assumes that the Fabric Services System itself is responsible for managing all of the nodes participating in the fabric. The Fabric Services System creates an initial configuration for each node, and then deploys and later updates that node configuration as part of the ongoing development and maintenance of the fabric. Typically, the Fabric Services System also uses its internal DHCP server to manage the IP addresses assigned to nodes.

In some cases, however, the management of the nodes within the fabric and the links between them might be reserved exclusively for some process external to the Fabric Services System, and the network also maintains its own, external DHCP server. The Fabric Services System describes this scenario as a fabric consisting of "unmanaged nodes".

In such a case, all of the capabilities of the Fabric Services System to monitor existing fabrics and to create and manage workloads can still be used, but any alteration to the configuration of the nodes and links that constitute the underlying fabric itself is scrupulously avoided, and the existing, externally managed configurations of all nodes are carefully protected.

The procedure to create a fabric intent consisting of unmanaged nodes is described in Creating a fabric intent for unmanaged nodes. Once fabric intent creation is complete, the Fabric Services System discovers all of the nodes within the fabric, including details regarding any workloads that may already exist on that fabric.

From that point onward the Fabric Services System assumes responsibility for any node configuration pertaining to workloads. It also offers, outside of fabric configuration, the full set of features that would be available for a typical fabric consisting of managed nodes. But the system leaves the underlying fabric configuration untouched.

Note: Currently, the Fabric Services System only supports fabrics consisting entirely of managed nodes, or entirely of unmanaged nodes. Fabrics consisting of some managed nodes and some unmanaged nodes are not supported.
Important: Unmanaged nodes must be running a version of SR Linux that is supported by the current release of the Fabric Services System. This includes any software version that has been manually added to the Fabric Services System's software catalog. It is important that you identify the SR Linux software version correctly in the manual topology file for every unmanaged node. It is the operator's responsibility to ensure that the software is identified correctly; the system does not validate the software version during deployment.

If the SR Linux software running on the unmanaged node does not match that indicated in the manual topology file, the system will continue to attempt automatic deployment to the node indefinitely. As a result, the region's deployment pipeline could become filled with failed automatic deployments and this could irretrievably affect the behavior of the system.

Manual topology files for fabrics with unmanaged nodes

Any fabric consisting of unmanaged nodes must be created within the Fabric Services System as a fabric intent based upon an imported, manual topology.

There are two unique aspects of any manual topology file that describes a fabric of unmanaged nodes:
  • Node descriptions in manual topologies support the optional "IsManaged" property. For unmanaged nodes, this property is mandatory; it must be included in the node description and its value set to false:

    "IsManaged": false

  • Because the Fabric Services System does not manage the Inter-Switch Links (ISLs) between nodes, it is not necessary to include ISL data in the manual topology file; only node data is mandatory.

    If link data is not present in the topology file, then no links are displayed in the topology view when creating the fabric intent; the nodes will appear disconnected. This will not affect the Fabric Services System's ability to work with the fabric, since the links exist and are managed by an external process.

    If link data is present in the topology file, then the links will display within the Fabric Services System normally when you create the fabric intent. However, this link data is not refreshed within the Fabric Services System if it is altered on the node.
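
The "IsManaged" requirement above can be checked before a topology file is imported. The following Python sketch is illustrative only: it assumes the node descriptions have already been extracted from the file into a list of dictionaries, and the "name" key and the function name are hypothetical. Only the "IsManaged" property comes from this document, and treating a string value such as "false" as invalid is also an assumption.

```python
import json


def check_unmanaged_nodes(nodes):
    """Return a list of problems for a fabric that must be fully unmanaged.

    `nodes` is a list of node-description dictionaries. How those
    descriptions are laid out in a real topology file is not shown
    here; the "name" key is a hypothetical label used for messages.
    """
    problems = []
    for index, node in enumerate(nodes):
        label = node.get("name", f"node #{index}")
        if "IsManaged" not in node:
            # Optional for managed nodes, mandatory for unmanaged ones.
            problems.append(f"{label}: IsManaged is missing")
        elif node["IsManaged"] is not False:
            # Assumption: only the JSON literal false is accepted.
            problems.append(f"{label}: IsManaged must be false")
    return problems


# Hypothetical sample data, for illustration only.
sample = json.loads(
    '[{"name": "leaf1", "IsManaged": false},'
    ' {"name": "leaf2"},'
    ' {"name": "spine1", "IsManaged": true}]'
)
problems = check_unmanaged_nodes(sample)
```

Because mixed fabrics are not supported, the sketch reports a node marked as managed as a problem in the same way as a node that omits the property.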

Node configuration data stored within the Fabric Services System

Although the Fabric Services System does not directly manage the nodes within the fabric, it does store the following sets of node configuration data that are required to support its capabilities:
  • the management IP address of each unmanaged node in the fabric, stored as part of the fabric's node inventory. This information, along with the necessary certificate data, is enough for the Fabric Services System to discover an unmanaged node, establish a gNMI connection, learn the full node configuration details, and consider the node to be in a Ready state.
    Note: For unmanaged nodes, the management IP address is mandatory but the serial number is optional. For more typical, managed nodes, the serial number is mandatory.
  • the complete set of configuration data for each node, stored as a system-generated Global Configuration Override (GCO). Typically GCOs are used to store expected variations in a node's configuration. But for unmanaged nodes, the GCO is used to store the entire node configuration, where it is available for consultation by components of the Fabric Services System. For example, it is from this configuration data that the Fabric Services System creates initial configuration files as part of a maintenance intent.
    Note: Any system-generated GCO is called a "system GCO", to distinguish it from those created manually by an operator.
  • when it is required by a maintenance intent, a separate set of basic, initial configuration data parsed from the full configuration of each node. This configuration data consists of the network instance, management interface, and system information, and is required to use the provisioning processes that are part of maintenance intent deployment.

Deployment of fabric intents with unmanaged nodes

The Fabric Services System does not alter any fabric configuration data for unmanaged nodes. Nevertheless, after loading the manual topology of an unmanaged fabric, it is necessary to "deploy" the resulting fabric intent. Although no configuration data is sent to the node during deployment to an unmanaged node, the act of "deploying" the fabric intent satisfies certain internal requirements of the Fabric Services System.

For the deployment of such a fabric intent to proceed, two conditions must be met:
  • all participating nodes must be in a Ready state
  • none of the system GCOs that represent the nodes within the fabric can be empty
If either of these conditions is not met, the attempt to deploy the fabric intent fails and results in an error message in the system log. Deployment completes for those nodes that satisfy the conditions, but is suspended for those that do not. Once the conditions are satisfied for any of the remaining nodes, the deployment on that node automatically resumes.
Note: An empty system GCO for an unmanaged node usually indicates some kind of delay in obtaining the configuration data from the node. This problem typically resolves itself when regular communication with the node is established.
Note: This check to ensure the system GCO is not empty is performed when creating the first version of a fabric intent that uses unmanaged nodes. However, any subsequent deployments will not repeat this check. For this reason, even though it is possible to manually delete any system GCO, it is important not to delete the system GCO corresponding to any deployed, unmanaged node. If you do inadvertently delete such a system GCO, replace that data with an equivalent user-created GCO.
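
The two deployment conditions can be pictured as a per-node gate. The sketch below is a simplified model, not the system's implementation: the NodeStatus class, its fields, and the plan_deployment function are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NodeStatus:
    """Hypothetical view of one unmanaged node at deployment time."""
    name: str
    ready: bool                                      # node is in a Ready state
    system_gco: dict = field(default_factory=dict)   # empty until config is learned


def plan_deployment(nodes):
    """Split nodes into those that can deploy now and those suspended.

    A node deploys only when it is Ready and its system GCO is not
    empty; every other node is suspended until both conditions hold.
    """
    deployable, suspended = [], []
    for node in nodes:
        if node.ready and node.system_gco:
            deployable.append(node.name)
        else:
            suspended.append(node.name)
    return deployable, suspended


# Hypothetical sample data, for illustration only.
fabric = [
    NodeStatus("leaf1", True, {"system": {"name": "leaf1"}}),
    NodeStatus("leaf2", True),                                   # empty system GCO
    NodeStatus("spine1", False, {"system": {"name": "spine1"}}), # not Ready
]
deployable, suspended = plan_deployment(fabric)
```

Running the same function again after a suspended node becomes Ready and its system GCO is populated would move that node into the deployable list, which mirrors the automatic resume described above.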

Subsequent updates to fabric configuration

If the external process that is managing the "unmanaged" nodes makes any change to their configuration, the Fabric Services System detects these changes as deviations. The system automatically accepts these deviations and incorporates them into the stored system GCO. However, because the Fabric Services System does not deploy fabric configurations to unmanaged nodes, the system does not deploy any updated version of the fabric intent after absorbing these deviations into the system GCO.
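
As a rough mental model of this behavior, the following sketch merges one detected deviation into a stored copy of a node's configuration and does nothing else. It assumes the system GCO can be represented as a nested dictionary and a deviation as a key path plus a new value; both representations, and the function name, are hypothetical.

```python
import copy


def absorb_deviation(system_gco, path, new_value):
    """Return a new system GCO with one deviation merged in.

    `path` is a sequence of keys leading to the changed setting.
    Nothing is pushed to the node: the stored copy is simply
    brought in line with what the external process configured.
    This sketch assumes intermediate values on the path are dicts.
    """
    updated = copy.deepcopy(system_gco)
    cursor = updated
    for key in path[:-1]:
        cursor = cursor.setdefault(key, {})
    cursor[path[-1]] = new_value
    return updated


# Hypothetical sample data, for illustration only.
stored = {"system": {"name": "leaf1"}}
after = absorb_deviation(stored, ["system", "name"], "leaf1-renamed")
```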

Workload VPN intents

Workload VPN intents on fabrics consisting of unmanaged nodes are created, deployed, and managed exactly like workload VPN intents on conventional fabrics.

The Fabric Services System does not discover pre-existing workloads on unmanaged nodes. However, the configuration of those workloads is accepted as part of the system GCO.

New workloads created with the Fabric Services System can be deployed to unmanaged fabrics as long as no conflicting configuration is present in the fabric from pre-existing workloads. If such a conflicting configuration is present, delete the pre-existing workload configuration on the node before deploying the new workload from the Fabric Services System.

Maintenance intents

Maintenance intents allow you to manage the replacement of nodes within a fabric, and the upgrading or downgrading of SR Linux software on existing nodes.

Maintenance intents for fabrics consisting of unmanaged nodes are created, deployed, and managed exactly like maintenance intents on conventional fabrics.

However, assigning a serial number to an unmanaged node, which is a requirement for maintenance intents to function, results in an entry for each of the affected nodes in the Fabric Services System's own, internal DHCP server. This is necessary for the function of the SR Linux Zero Touch Provisioning (ZTP) capability that is used by maintenance intents.

Any maintenance intent also requires an initial configuration file for each node affected by the intent. The Fabric Services System creates this initial configuration file by reading directly from the system GCO that contains the node's configuration immediately before the creation of the maintenance intent. The following node data is extracted from the system GCO to create the initial configuration file, which is required for the processes that deploy the maintenance intent:
  • network instance management information
  • interface management information
  • system information

After deployment, this initial configuration data is sufficient to allow the existing, external process to resume management of the node after the maintenance intent has completed successfully.
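
The extraction step can be sketched as selecting three sections from the stored configuration. The example below assumes the system GCO is available as a dictionary; the key names "network-instance", "interface", and "system" and the function name are assumptions made for illustration. The sketch copies each section whole, whereas the document indicates that only the management-related parts of the network instance and interface data are needed.

```python
# Assumed key names for the three sections listed above.
INITIAL_CONFIG_KEYS = ("network-instance", "interface", "system")


def build_initial_config(system_gco):
    """Return the subset of a system GCO used as an initial configuration.

    Raises ValueError when a required section is absent, since the
    maintenance intent could not be provisioned without it.
    """
    missing = [key for key in INITIAL_CONFIG_KEYS if key not in system_gco]
    if missing:
        raise ValueError("system GCO lacks: " + ", ".join(missing))
    return {key: system_gco[key] for key in INITIAL_CONFIG_KEYS}


# Hypothetical sample data, for illustration only.
stored = {
    "network-instance": [{"name": "mgmt"}],
    "interface": [{"name": "mgmt0"}],
    "system": {"name": "leaf1"},
    "acl": {"note": "not part of the initial configuration"},
}
initial = build_initial_config(stored)
```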

Alterations to workflows when using unmanaged nodes

For the most part, other than the management of the underlay fabric itself, operations within the Fabric Services System are unchanged when working with a fabric consisting of unmanaged nodes. Workload VPN intents and maintenance intents are managed almost identically, and other features for monitoring the fabric behave just as they do for fabrics consisting of nodes managed by the Fabric Services System.

However, there are a few exceptions to standard settings or procedures when working with a fabric of unmanaged nodes.

  • When configuring a Management IP Pool for use with unmanaged nodes, any CIDR block within that pool must have the Is Managed property disabled. This is a unique requirement for unmanaged nodes.
  • Domain names are typically set once for a fabric, and so the same domain name is configured on all of the nodes within the fabric. However, for unmanaged nodes, the Fabric Services System reads the domain name individually from each node configuration. As a result, for unmanaged nodes, domain names may vary among nodes within a fabric.
  • Global Configuration Overrides (GCOs) can only add new configurations that are not already present on the node (and reflected in the corresponding system GCO). Any modification or deletion of existing configurations contained within this system GCO is not supported.
  • When configuring a mirror destination for traffic mirroring, you must manually provide a Source IP address if the source is an unmanaged node.
  • For unmanaged nodes, the Operational Deviation view of the Operational Health and Insights page does not display deviations pertaining to Inter-Switch Links (ISLs). The tracking of deviations on ISLs is exclusively supported for managed nodes.
  • It is the operator's responsibility to resolve any conflict between an already-created or discovered workload in the system GCO, and the workload created by the system.
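
The restriction on GCOs in the list above (new configuration may be added, but existing configuration may not be modified or deleted) can be checked before a GCO is submitted. This sketch assumes both the system GCO and the proposed user GCO are nested dictionaries and compares their leaf paths; the function names and sample data are hypothetical, and the system's own validation may differ.

```python
def flatten(config, prefix=()):
    """Yield (path, value) for every leaf setting in a nested dict."""
    for key, value in config.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


def conflicting_paths(system_gco, user_gco):
    """Return the leaf paths in user_gco that already exist in system_gco.

    An empty result means the user GCO only adds new configuration.
    """
    existing = {path for path, _ in flatten(system_gco)}
    return sorted(path for path, _ in flatten(user_gco) if path in existing)


# Hypothetical sample data, for illustration only.
system_gco = {"system": {"name": "leaf1"}}
user_gco = {"system": {"name": "renamed", "banner": "maintenance window"}}
conflicts = conflicting_paths(system_gco, user_gco)
```

In this example the "banner" setting is new and would be acceptable, while the "name" setting overlaps the system GCO and is reported as a conflict.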