SPOF NetworkX Detection

06.05.2024 16:30 automation graph ipv4 networkx ciscoconfparse python spof

Detecting Single Points of Failure using NetworkX

The ability to think of a TCP/IP Network as just a mathematical graph is underutilised by typical (network automation) engineers in my opinion. Graph theory is a well established field of mathematics, and converting a (computer) network to it’s graphical state enables us to reap benefits at minimal cost.

In this blog post, I introduce converting a (basic) network to a graph, at the IPv4 layer, using Cisco .cfg files. We can manipulate this graph to detect Single Points of Failure (SPoF) and audit network reliability. This will be the first post in using graphs to gain visibility into (computer) networks.

GitHub Repository for NetApollo, Python implementation of the approach discussed: https://github.com/sam-w-thomas/NetApollo

Is is worthwhile pointing out, networking modelling is already a well-practiced discipline. This is just my contribution!

Graph Theory

Many Network Engineers, without a background in Computer Science, may scratch their head at what a ‘graph’ is? That’s just a bar chart or something right? In the world of Computer Science, it has a different meaning - explained below from MIT.

"Informally, a graph is a bunch of dots and lines where the lines connect some pairs of dots"

“Graphs are ubiquitous in computer science because they provide a handy way to represent a relationship between pairs of objects. The objects represent items of interest such as programs, people, cities, or web pages, and we place an edge between a pair of nodes if they are related in a certain way”

MIT ## 6.042J Chapter 5: Graph theory

Graphs come in two typical flavours:

Directed Graphs - edges have a concept of direction
Undirected Graphs - edges have no concept of direction

Example of a basic graph.

Here is the graph, but it’s in it’s (physical) network form.

Graph converted to a (basic) physical view of a network

For the purposes of simplicity, all graphs in this post will be undirected graphs. Although, we could use a directed Graph to represent receive and transmit.

Converting a TCP/IP network to a graph

Now, you can’t convert a TCP/IP network into just a single graph; We have control, data and management planes for a reason.

However, we can convert parts of a network to a single graph. For example, the IPv4 layer or the OSPF Link State Database. We shall create a single graph at the IPv4 layer in this blog post.

However, to increase complexity and ability, you could layer graphs. As you’d expect, graph theory has a word and ability for this already - Multilayer graphs.

Converting a TCP/IP network to a graph - Implementation

As mentioned above, we shall create a graph at the IPv4 layer, and use this to detect SPoFs. I break this problem down into the following domains:

Gathering IPv4 addressing
Filtering interfaces based on state (Optional)
Determining neighbors using IPv4 addressing and convert to adjacency list
Convert adjacency list into a graph
Iterate through graph to find SPoFs

Gathering IPv4 addressing

Fundamentally, there are two approaches we can use to gather addressing, neither are mutually exclusive:

Configuration files
State information (e.g. “show” commands in Cisco)

In NetApollo I demonstrate using just configuration files. This has the following advantages:

Enables a third-party to audit configuration with no live connectivity requirements (or equivalent)
Provides a building block for other information. For example, we could add state information onto the configuration files

In NetApollo, IPv4 addressing is achieved using ciscoconfparse. This scrapes the configuration files in a folder, and is used to produce the following structure for each interface:

If we apply this logic to an actual network, the one below, we get the following result(s):

We now have a table of the IPv4 addressing for an entire network, an “interface-address table”. Using scale techniques, such as Pandas Dataframes and Numpy, we could increase the efficiency of processing.

Filtering interfaces based on state

I want to keep this relatively short, as I did not implement this approach. However, you could use keys in the static configuration (e.g. interface names) and combine this with tools, such as paramiko/TextFSM, which would enable you to filter links based on a state of your choosing, such as down. Alternatively, you could also layer on information such as errors; increasing graph complexity to achieve aims.

Determining neighbors using IPv4 addressing

Iterating through the interface-address table produced, enables us to build an adjacency list. An adjacency list of the network above looks like:
R1: R2, R3
R2: R1, R3
R3: R1, R4, R5
R4: R3, R5
R5: R3, R4

Please note, we are looking at this at the IPv4 layer. R4, R5 and R3 share an NBMA network segment so there is more actual complexity; however, we simplify for our purposes.

This adjacency list can then be converted to a (NetworkX graph)[https://networkx.org/].

To determine the adjacency list from the interface-address table produced, we follow the logic:

In practice, the adjacency list takes the form of a dictionary of lists. This demonstrated below for the example topology shown above.

{
  "R1": ["R2", "R3"],
  "R2": ["R1", "R3"],  
  "R3": ["R1", "R4", "R5"],  
  "R4": ["R3", "R5"],  
  "R5": ["R3", "R4"]  
}

Convert adjacency list into a graph

Here is when we are introduced to NetworkX. NetworkX describes itself as “.. a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks”. It provides the basis for (our) python graph manipulation, without us having to do the hard work.

To load our adjacency list as a NetworkX Graph, we only need the following code:

import networkx as nx

networkx_graph = nx.from_dict_of_lists(adj_list)

Iterate through the graph to find Single Points of Failure

To identify SPoFs, we iterate through every node. Removing the node, and seeing if it is a disconnected graph. A “disconnected graph” is a graph theory term which states the graph that has two distinct/seperate components, i.e. there are at least two nodes with no path between them.

Please note, when implementing this in python you need to use the .copy() function to create a separate graph each iteration. Otherwise, you would eventually remove all nodes from your original graph; We want a fresh (original) graph every iteration.

spof_nodes = []
    for node in original_graph.nodes:
        iteration_graph = original_graph.copy()
        iteration_graph.remove_node(node)
        if not nx.is_connected(demo_graph):
            spof_nodes.append(node)

Using the above example, spof_nodes now represents all nodes which are SPoFs at the IPv4 layer.

Scalability

I don’t want to delve to much into this, as I don’t have the time to demonstrate this approach on 10,000’s of nodes. However, given the simple data structures explored. I believe this approach could be scaled up to networks with 100,000’s of devices. This could potentially be achieved by breaking tasks and using tools such as Apache Hadoop.

Practical Applications

Whilst this approach could be used to pull static configuration files and quickly identify Single Points of Failure. There exist more complete solutions which may be more appropriate for your particular needs. For example, Batfish has a comprehensive approach to analyse the impact of network failure in greater complexity than is covered here. However, we hope this blog post can provide a reference for simpler analysis and fundamental learning. Furthermore, proprietary tools/products exist, such as IPFabric.