Audit

Connect provides a mechanism to audit the state of the Fabric Services System against the state of the Connect service. The state of these two components can diverge as a result of fabric deviations, general disconnects in the Kubernetes cluster, or manual intervention.

To recover from these scenarios, Connect introduces the audit mechanism.

To create an audit request, send a POST request that includes at least the deploymentId of the deployment to be audited.

Connect supports the following audit scopes:

  • CONNECT_ONLY: The audit examines the full relationship between Connect and the Fabric Services System. It also examines the LLDP information on all nodes and reconciles it with the information currently held in Connect. Note: this applies only to the HostPorts that are currently active in the Deployment.
  • ERROR_ONLY: The audit examines only Connect resources that are in an ERROR state. Resources can enter an ERROR state when Connect cannot create or update their equivalent on the Fabric Services System because of unforeseen circumstances. Note that this kind of audit does not detect DANGLING_RESOURCE deviations on the Fabric Services System. If few resources are in ERROR, this audit finishes much faster than a CONNECT_ONLY audit. However, if many resources are in ERROR, use the CONNECT_ONLY scope.
  • PLUGIN_ONLY: The audit examines only the relationship between the Connect Plugin and Connect itself. Use this type of audit when the connection between the Plugin and Connect was lost or when a backup is restored.
  • FULL: A full audit covers the Plugin-to-Connect relationship as well as the Connect-to-Fabric Services System relationship. Because it audits the full system, this type of audit can take considerable time.

By default, the scope of the audit is CONNECT_ONLY. To use a different scope, specify it in the POST request.
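As a minimal sketch of the request described above, the following Python builds the POST request with an explicit scope. The "scope" field name in the request body is an assumption inferred from the response format; the helper name build_audit_request and the base_url default are hypothetical, and any authentication headers are omitted.

```python
import json
import urllib.request

# Scopes listed in this section.
AUDIT_SCOPES = ("CONNECT_ONLY", "ERROR_ONLY", "PLUGIN_ONLY", "FULL")


def build_audit_request(deployment_id, scope=None, base_url="http://localhost"):
    """Build (but do not send) a POST request that creates an audit.

    Assumption: the request body accepts a "scope" field named like the one
    returned in the response. When scope is None the field is omitted, so
    the documented default (CONNECT_ONLY) applies.
    """
    body = {"deploymentId": deployment_id}
    if scope is not None:
        if scope not in AUDIT_SCOPES:
            raise ValueError(f"unknown audit scope: {scope!r}")
        body["scope"] = scope
    return urllib.request.Request(
        url=f"{base_url}/rest/connect/api/v1/admin/audits",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_audit_request("422199394960411946", scope="ERROR_ONLY")
# To send it: urllib.request.urlopen(request), then json.load() the response.
```

Building the request separately from sending it makes it easy to inspect the body before it reaches a live system.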

REQUEST: POST http://localhost/rest/connect/api/v1/admin/audits
 
{"deploymentId": "422199394960411946"}
 
RESPONSE:
{
  "id": "422199984495070506",
  "enqueueTime": "2022-08-22T17:00:01.005355194+02:00",
  "endTime": "0001-01-01T00:00:00Z",
  "status": "InProgress",
  "failureReason": "",
  "dryRun": false,
  "report": [],
  "deploymentId": "422199394960411946",
  "scope": "CONNECT_ONLY",
  "totalNumberOfDiscrepancies": 0,
  "totalNumberOfSuccessfulDiscrepancies": 0,
  "totalNumberOfFailedDiscrepancies": 0
}

REQUEST: GET http://localhost/rest/connect/api/v1/admin/audits/422199984495070506
 
RESPONSE:
 
{
  "id": "422199984495070506",
  "enqueueTime": "2022-08-22T15:00:01.005Z",
  "endTime": "2022-08-22T15:00:01.578Z",
  "status": "Success",
  "failureReason": "",
  "dryRun": false,
  "report": [],
  "deploymentId": "422199394960411946",
  "scope": "CONNECT_ONLY",
  "totalNumberOfDiscrepancies": 0,
  "totalNumberOfSuccessfulDiscrepancies": 0,
  "totalNumberOfFailedDiscrepancies": 0
}

The response contains the following fields:

  • enqueueTime and endTime allow you to monitor job execution time.
  • status indicates whether the audit was successful, errored, or is still in progress.
  • failureReason is populated only when the audit is in an error state; it indicates which error occurred during the audit.
  • report is a list of actions taken by the audit.
  • Report entries can be of the following types:
    • MISSING_RESOURCE: the Fabric Services System is missing a resource that is configured in Connect. To correct this, re-create the Fabric Services System resource.
    • DANGLING_RESOURCE: Connect is no longer aware of a resource that is configured in the Fabric Services System. To correct this, delete the Fabric Services System resource.
    • MISCONFIGURED_RESOURCE: Connect has a different configuration than the Fabric Services System. To correct this, update the Fabric Services System resource.
    • EVPN_UNDEPLOYED: Connect detected that one of the tenants/EVPNs it manages is not deployed correctly.
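
To illustrate how enqueueTime and endTime can be used to monitor execution time, here is a small Python sketch that parses the timestamp shapes seen in the sample responses and computes the duration. The helper names are hypothetical. Treating an endTime earlier than enqueueTime (such as "0001-01-01T00:00:00Z" in the InProgress sample) as "not finished yet" is an assumption inferred from that sample.

```python
import re
from datetime import datetime

# Matches the timestamp shapes seen in the sample responses: optional
# fractional seconds of any length, then "Z" or a numeric UTC offset.
_TIMESTAMP = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?(Z|[+-]\d{2}:\d{2})$"
)


def parse_timestamp(value):
    """Parse an audit timestamp into a timezone-aware datetime."""
    match = _TIMESTAMP.match(value)
    if match is None:
        raise ValueError(f"unsupported timestamp: {value!r}")
    base, fraction, offset = match.groups()
    # Pad or truncate the fraction to microseconds (six digits).
    micros = (fraction or "").ljust(6, "0")[:6]
    offset = "+00:00" if offset == "Z" else offset
    return datetime.fromisoformat(f"{base}.{micros}{offset}")


def audit_duration(audit):
    """Return endTime - enqueueTime, or None if the audit has not ended.

    Assumption: an endTime earlier than enqueueTime is a placeholder for
    an audit that is still running.
    """
    start = parse_timestamp(audit["enqueueTime"])
    end = parse_timestamp(audit["endTime"])
    if end < start:
        return None
    return end - start
```

Applied to the GET response above, audit_duration returns a timedelta of 573 milliseconds; applied to the InProgress POST response, it returns None.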

SubAudits

With the introduction of the PLUGIN_ONLY and FULL scopes, Connect offers a new system to report SubAudits. The Audit object is calculated from several SubAudit objects. For a FULL audit, these are:

  • the Plugin-performed Audit and report
  • the Connect-performed Audit and report for the Fabric Services System REST layer
  • the Connect-performed Audit for LLDP information (or Topology Audit)

Currently the following types of SubAudit are available in Connect:

  • CONNECT: The SubAudit process that checks data consistency between Connect and the Fabric Services System.
  • TOPOLOGY: The SubAudit process that checks data consistency between Connect and the LLDP information in SRL.
  • PLUGIN: The SubAudit process that checks data consistency between the Plugin and Connect.

The SubAudits of a particular audit can be queried using a GET request on the plugin API of Connect. For example:

REQUEST: GET http://localhost/rest/connect/api/v1/plugins/subaudits?parentAuditID=454820455069516076

RESPONSE:

[
  {
    "parentAuditID":"454820455069516076",
    "type":"CONNECT",
    "id":"454820455086358828",
    "enqueueTime":"2023-04-04T15:55:16.62Z",
    "endTime":"2023-04-04T15:55:20.198Z",
    "status":"Success",
    "failureReason":"",
    "dryRun":false,
    "report":[],
    "deploymentId":"454816541532225836",
    "scope":"ERROR_ONLY",
    "totalNumberOfDiscrepancies":2,
    "totalNumberOfSuccessfulDiscrepancies":2,
    "totalNumberOfFailedDiscrepancies":0
  }
]
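
The SubAudit query above can be sketched in Python as follows. The helper names and the base_url default are hypothetical, and the summary assumes at most one SubAudit of each type per parent audit, which the source does not state; a later entry of the same type would overwrite an earlier one.

```python
from urllib.parse import urlencode


def subaudits_url(parent_audit_id, base_url="http://localhost"):
    """Build the GET URL that lists the SubAudits of one parent audit."""
    query = urlencode({"parentAuditID": parent_audit_id})
    return f"{base_url}/rest/connect/api/v1/plugins/subaudits?{query}"


def summarize_subaudits(subaudits):
    """Map each SubAudit type to its status and discrepancy counters.

    Assumption: one SubAudit per type (CONNECT, TOPOLOGY, PLUGIN) for a
    given parent audit.
    """
    summary = {}
    for sub in subaudits:
        summary[sub["type"]] = {
            "status": sub["status"],
            "total": sub["totalNumberOfDiscrepancies"],
            "successful": sub["totalNumberOfSuccessfulDiscrepancies"],
            "failed": sub["totalNumberOfFailedDiscrepancies"],
        }
    return summary
```

For the sample response above, summarize_subaudits yields a single CONNECT entry with status "Success", two discrepancies in total, two successful, and none failed.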