Graph Theory Explained – SMA

Welcome to Graph Theory!

Graphs are fundamental structures used to model relationships between objects. They are made up of:

  • Nodes (or Vertices): These represent the individual objects or entities. Think of them as points.
  • Edges (or Links/Arcs): These represent the connections or relationships between pairs of nodes. Think of them as lines connecting the points.

Real-world examples:

  • Social networks (people and their friendships).
  • The internet (web pages and hyperlinks).
  • Road networks (cities and roads connecting them).
  • Biological networks (proteins and their interactions).
A B C D

A simple graph with 4 nodes (A, B, C, D) and 4 edges.

Undirected Graphs

In an undirected graph, edges have no orientation or direction. If an edge connects node A to node B, it implies a mutual relationship: A is connected to B, and B is connected to A.

  • The relationship is symmetric.
  • Represented by a simple line between two nodes.

Use cases:

  • Facebook friendships (if you are friends with someone, they are friends with you).
  • Protein-protein interaction networks (if protein A interacts with protein B, B also interacts with A).
  • Co-authorship networks (if A co-authored a paper with B, B also co-authored with A).
A B (A, B) or (B, A)

An undirected edge between A and B.

Directed Graphs (Digraphs)

In a directed graph (or digraph), edges have a specific direction. An edge from node A to node B indicates a one-way relationship: A points to B, but B does not necessarily point back to A.

  • The relationship is asymmetric.
  • Represented by an arrow pointing from the source node to the target node.

Use cases:

  • Twitter followers (following someone doesn’t mean they follow you back).
  • Web page links (page A links to page B, but B might not link to A).
  • Task dependencies in a project (task A must be completed before task B can start).
  • Citation networks (paper A cites paper B).
A B A → B

A directed edge from A to B.

Network Measure: Degree & Degree Distribution

The degree of a node is the number of edges connected to it. It’s a basic measure of a node’s direct influence or activity.

  • Undirected Graphs: The degree of a node is simply the count of its neighbors.
  • Directed Graphs:
    • In-degree: Number of edges pointing *to* the node. (Popularity, Receptiveness)
    • Out-degree: Number of edges pointing *from* the node. (Influence, Activity)

The degree distribution P(k) of a network is the probability that a randomly selected node has degree k. It tells us how many nodes have a certain number of connections. Many real-world networks (like social networks or the internet) exhibit a “power-law” or “scale-free” degree distribution, meaning a few nodes (hubs) have a very high degree, while most nodes have a low degree.

A B C D

Example Directed Graph

Node A: In-degree=1 (from C), Out-degree=1 (to B)

Node B: In-degree=1 (from A), Out-degree=1 (to C)

Node C: In-degree=1 (from B), Out-degree=2 (to A, to D)

Node D: In-degree=1 (from C), Out-degree=0

Degree Distribution Example:

If we had 100 nodes: 50 nodes with degree 1, 30 nodes with degree 2, 15 nodes with degree 3, 5 nodes with degree 10. This describes the distribution.

Network Measure: Density

Density measures how connected a graph is by comparing the number of actual edges to the maximum possible number of edges.

  • A dense graph has many edges, close to the maximum possible.
  • A sparse graph has few edges.

Formulas:

  • For an undirected graph with N nodes and E edges: Density = 2E / (N * (N-1))
  • For a directed graph with N nodes and E edges: Density = E / (N * (N-1)) (Since each pair of nodes can have two directed edges: A→B and B→A)

The density value ranges from 0 (no edges) to 1 (all possible edges exist – a complete graph).

Interpretation:

  • High density implies a tightly-knit network where information or influence can spread quickly.
  • Low density implies a loosely connected network.
  • Most real-world large networks are sparse.

Network Measure: Connectivity

Connectivity refers to how well nodes in a graph can reach each other. It’s about the robustness of the network to disconnections.

For Undirected Graphs:

  • A graph is connected if there is a path between every pair of distinct nodes.
  • If not, it consists of several connected components – subgraphs that are internally connected but disconnected from each other.
  • Cut Vertices (Articulation Points): Nodes whose removal would increase the number of connected components.
  • Bridges: Edges whose removal would increase the number of connected components.

For Directed Graphs:

  • Strongly Connected: A directed graph is strongly connected if there is a directed path from any node A to any other node B, AND a directed path from B back to A.
  • Strongly Connected Components (SCCs): Maximal subgraphs where every node is reachable from every other node within that subgraph following directed paths.
  • Weakly Connected: A directed graph is weakly connected if replacing all its directed edges with undirected edges results in a connected undirected graph. (i.e., the underlying undirected structure is connected).

High connectivity is crucial for resilience and efficient flow of information or resources.

Network Measure: Centralization

Centralization describes the extent to which a network’s structure is dominated by a few central nodes. A highly centralized network has a clear “center” or “core”, while a decentralized network has a more distributed structure.

Centrality itself is a node-level property, indicating a node’s importance. Common centrality measures include:

  • Degree Centrality: Number of connections a node has. Nodes with high degree are “hubs”. (Already discussed under Degree).
  • Betweenness Centrality: Measures how often a node lies on the shortest path between other pairs of nodes. High betweenness nodes are “brokers” or “gatekeepers”.
  • Closeness Centrality: Measures the average shortest distance from a node to all other nodes in the network. High closeness nodes can spread information efficiently.
  • Eigenvector Centrality: Measures a node’s influence based on being connected to other influential nodes.

Network Centralization is a graph-level measure. It’s often calculated by looking at the variance or distribution of individual node centrality scores. For example, Freeman’s general centralization formula compares the sum of differences between the most central node and all other nodes, to the maximum possible such sum for a graph of the same size.

A star graph (one central node connected to all others) is an example of a highly centralized network.

Network Measure: Tie Strength & Trust

Tie Strength:

Tie strength refers to the intensity, importance, or quality of a relationship between two nodes. Edges in a graph can be weighted to represent tie strength.

  • Strong Ties: Typically represent close relationships, frequent interaction, emotional investment (e.g., close friends, family). They are often found in dense, clustered parts of a network and are important for social support and trust.
  • Weak Ties: Represent more distant relationships, less frequent interaction (e.g., acquaintances). Granovetter’s “Strength of Weak Ties” theory highlights that weak ties are crucial for bridging different social circles and accessing novel information or opportunities.

Tie strength can be measured by factors like: frequency of communication, duration of relationship, emotional closeness, or reciprocal services.

Trust:

Trust in a network context is the belief in the reliability, integrity, or capability of another node or the information flowing through a tie. It’s a crucial element for cooperation, information sharing, and transactions.

  • Trust can be direct (A trusts B based on past interactions) or indirect/propagated (A trusts C because A trusts B, and B trusts C).
  • Trust can be associated with nodes (how trustworthy a person is) or with edges (how trustworthy the communication channel or relationship is).
  • In computational systems, trust models are used to assess reliability, for example, in P2P networks, e-commerce reputation systems, or secure communication.

Both tie strength and trust add depth to network analysis beyond simple binary connections, allowing for more nuanced understanding of network dynamics.

© Graph Theory Learning Material

Ajink Gupta
Ajink Gupta

This account on Doubtly.in is managed by the core team of Doubtly.

Articles: 489