Monte Carlo Methods for Matrix Functions

The main focus of the MOCCA library is the computation of matrix functions using Monte Carlo methods. Let \(\mathbf{A}\) be an \(n \times n\) real matrix. A matrix function \(f : \mathbb{R}^{n \times n} \rightarrow \mathbb{R}^{n \times n}\) can be expressed as a power series:

\[f(\mathbf{A}) = \sum_{k = 0}^\infty {c_k \mathbf{A} ^ k}\]
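
For example, the matrix exponential and the Mittag-Leffler function (both appear in the solvers listed below) correspond to

\[\exp(\mathbf{A}) = \sum_{k = 0}^\infty \frac{\mathbf{A}^k}{k!}, \qquad ML_\alpha(\mathbf{A}) = \sum_{k = 0}^\infty \frac{\mathbf{A}^k}{\Gamma(\alpha k + 1)},\]

that is, \(c_k = 1/k!\) and \(c_k = 1/\Gamma(\alpha k + 1)\), respectively.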

Naturally, evaluating such a series directly is very expensive. Instead, we approximate the series using random walks over the input matrix \(\mathbf{A}\) [4] (i.e., a Monte Carlo method). Consider the graph constructed from the input matrix \(\mathbf{A}\): its nodes are labelled from \(1\) to \(n\) (each one corresponding to a row of the matrix), and there is a directed edge from node \(i\) to node \(j\) if and only if \(A_{ij} \neq 0\). Starting from a given node, a random walk is a sequence of nodes obtained by jumping from one node to the next along the directed edges of this graph, choosing the next node at random according to a transition probability matrix built from the values of \(\mathbf{A}\). The \(k\)-th step of the random walk corresponds to the \(k\)-th term of the series. Therefore, \(f(\mathbf{A})\) can be estimated by performing many random walks over the matrix \(\mathbf{A}\) and averaging their contributions. This is the basic idea behind the Monte Carlo methods for matrix functions.
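
To make the idea concrete, here is a minimal, self-contained sketch (not the library's implementation, which works on sparse matrices and uses the sampling schemes described in the referenced papers) that estimates a single entry of \(f(\mathbf{A})\vec{v}\) for a small dense matrix, using the common choice of transition probabilities \(p_{ij} = |A_{ij}| / \sum_j |A_{ij}|\) and compensating weights \(A_{ij} / p_{ij}\):

#include <cmath>
#include <cstdio>
#include <random>
#include <vector>

// Illustrative estimator of the i-th entry of f(A)v for f(A) = sum_k c_k A^k.
// Each jump multiplies the walk weight by A_ij / p_ij, so the average over
// many walks of sum_k c_k W_k v_{x_k} estimates the series truncated after
// c.size() terms.
double estimate_entry(const std::vector<std::vector<double>>& A,
                      const std::vector<double>& c,   // series coefficients c_0, c_1, ...
                      const std::vector<double>& v,
                      std::size_t i, long num_walks, unsigned seed)
{
    const std::size_t n = A.size();
    std::mt19937_64 rng(seed);
    std::uniform_real_distribution<double> unif(0.0, 1.0);

    double acc = 0.0;
    for (long w = 0; w < num_walks; ++w)
    {
        std::size_t state = i;
        double weight = 1.0;
        double sample = c[0] * v[state];              // k = 0 term of the series
        for (std::size_t k = 1; k < c.size(); ++k)
        {
            // Normalization of the transition probabilities of the current row
            double row_sum = 0.0;
            for (std::size_t j = 0; j < n; ++j) row_sum += std::fabs(A[state][j]);
            if (row_sum == 0.0) break;                // no outgoing edges: remaining terms vanish

            // Sample the next node j with probability |A_{state,j}| / row_sum
            double u = unif(rng) * row_sum;
            double cum = 0.0;
            std::size_t next = state;
            for (std::size_t j = 0; j < n; ++j)
            {
                if (A[state][j] == 0.0) continue;
                next = j;
                cum += std::fabs(A[state][j]);
                if (u <= cum) break;
            }

            // Compensate the sampling bias: A_ij / p_ij = sign(A_ij) * row_sum
            weight *= (A[state][next] > 0.0 ? row_sum : -row_sum);
            state = next;
            sample += c[k] * weight * v[state];       // k-th term of the series
        }
        acc += sample;
    }
    return acc / static_cast<double>(num_walks);
}

int main()
{
    // Tiny check: estimate (exp(A) v)_0 for a 2 x 2 matrix, truncating at 20 terms (c_k = 1/k!)
    std::vector<std::vector<double>> A = {{0.1, 0.4}, {0.3, 0.2}};
    std::vector<double> v = {1.0, 2.0};
    std::vector<double> c(20);
    double factorial = 1.0;
    for (std::size_t k = 0; k < c.size(); ++k) { c[k] = 1.0 / factorial; factorial *= static_cast<double>(k + 1); }

    std::printf("(exp(A) v)_0 ~= %.6f\n", estimate_entry(A, c, v, 0, 200000, 42));
    return 0;
}

Averaging over many independent walks converges at the usual Monte Carlo rate of \(O(1/\sqrt{N})\) in the number of samples \(N\); the solvers listed below follow this general principle but operate on sparse matrices and use the specific sampling strategies described in the related papers.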

Available Monte Carlo Solvers

| Solver | Method | Matrix Function | Notes | Related Papers |
|---|---|---|---|---|
| mocca::ClassicalMC | “Classical” Monte Carlo | \(f(\mathbf{A})\), \(diag(f(\mathbf{A}))\), \(f(\mathbf{A})\vec{v}\) | The sequence \(c_k\mathbf{A}^k\) must be monotonically decreasing. | [4] |
| mocca::RandFunm | Randomized Algorithm | \(f(\mathbf{A})\), \(diag(f(\mathbf{A}))\), \(f(\mathbf{A})\vec{v}\) | The sequence \(c_k\mathbf{A}^k\) must be monotonically decreasing. | |
| mocca::CTRW | Continuous-Time Random Walks | \(\exp(\mathbf{A}t)\), \(ML_\alpha(\mathbf{A}t^\alpha)\) | The diagonal of \(\mathbf{A}\) must not contain zeros. | [1], [2], [3] |
| mocca::CTRWAction | Continuous-Time Random Walks | \(\exp(\mathbf{A}t)\vec{v}\), \(ML_\alpha(\mathbf{A}t^\alpha)\vec{v}\) | The diagonal of \(\mathbf{A}\) must not contain zeros. | [1], [2], [3] |

Common Interface

All solvers follow the same general interface. Here is a typical example:

using namespace mocca;

CSRMatrix<double> A;
Vector<double> solution;
Vector<double> v;

// Initialize A and v //

// Creates a new instance of the solver
RandFunm<CSRMatrix<double>> monte_carlo;

// Set the parameters of the solver
monte_carlo.set_param({.num_samples = 1000000,                        // The number of random samples
                       .seed = PCG32_DEFAULT_SEED,                    // Seed for the PCG random number generator
                       .stream = PCG32_DEFAULT_STREAM });             // Stream for the PCG random number generator

// Set the input matrix.
monte_carlo.set_matrix(A);

// Compute the matrix function `expm`
solution = monte_carlo.funm(expm, v);

When creating the solver, you can pass additional flags to specify the desired behaviour. For example, you need to indicate the sampling type of the mocca::CTRWAction solver:

mocca::CTRWAction<mocca::CSRMatrix<double>, mocca::kBackwardSampling> monte_carlo;

When set_matrix() is called, the solver initializes all the parameters of the method that depend on the matrix \(\mathbf{A}\), such as constructing the transition probability matrix and calculating the weight associated with each row. Note that every time the values of the matrix \(\mathbf{A}\) change, you must call set_matrix() again.

Next, call the funm() method (for mocca::RandFunm) or the expm()/mittag_leffler() methods (for mocca::CTRW and mocca::CTRWAction) to compute the target matrix function \(f(\mathbf{A})\) or a related operation (e.g., its action on a vector). These methods can be called several times, reusing the same input matrix \(\mathbf{A}\):

monte_carlo.set_matrix(A);
sol1 = monte_carlo.funm(expm, v1);
sol2 = monte_carlo.funm(expm, v2);
// ... //
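
Conversely, if the values of \(\mathbf{A}\) are modified between computations, set_matrix() must be called again so that the solver rebuilds its internal data. A sketch of this pattern, reusing the names from the example above:

monte_carlo.set_matrix(A);
sol1 = monte_carlo.funm(expm, v1);

A *= 2;                      // the values of A have changed ...
monte_carlo.set_matrix(A);   // ... so the solver must be re-initialized
sol2 = monte_carlo.funm(expm, v1);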

Example - Heat Equation

The heat equation is a classical example in computer science and applied mathematics. For a domain \(\Omega = [-\mu, \mu]^2\), \(t > 0\), \(\vec{x} = (x, y) \in \mathbb{R}^2\), with boundary condition \(u(\vec{x}, t)\rvert_{\partial\Omega} = 0\) and initial condition \(u(\vec{x}, 0) = u_0(\vec{x})\), the 2D heat equation is

\[\frac{\partial u(\vec{x}, t)}{\partial t} = \nabla^2 u(\vec{x}, t)\]

After discretizing the domain with a grid spacing \(\Delta x = \Delta y = 2\mu / n_x\) and applying the standard 5-point stencil finite difference approximation, we can write the approximate solution of the heat equation as

\[\vec{u}(\vec{x}, t) = \exp \left(\frac{n_x ^ 2}{4\mu^2} \, t \, \textbf{L} \right) \, \vec{u}_0(\vec{x})\]

where \(\mathbf{L}\) denotes the corresponding discretized Laplacian operator. Let us define \(\mathbf{A} = \frac{n_x ^ 2}{4 \mu ^ 2} \mathbf{L}\). The code below calculates the solution \(\hat{u}(\vec{x}, t)\) using the Monte Carlo method, considering \(\Omega = [-1, 1]^2\), \(n_x = 64\), \(t = 0.1\), and a discrete impulse at the centre of the mesh as the initial value function \(u_0(\vec{x})\).
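
For reference, the 5-point stencil approximates the Laplacian at an interior grid node \((i, j)\) as

\[\nabla^2 u_{i,j} \approx \frac{u_{i+1,j} + u_{i-1,j} + u_{i,j+1} + u_{i,j-1} - 4 u_{i,j}}{\Delta x^2},\]

so each row of \(\mathbf{L}\) has \(-4\) on the diagonal and \(1\) in the columns of the neighbouring nodes that lie inside the domain, and the prefactor \(1/\Delta x^2 = n_x^2 / (4\mu^2)\) is exactly the scalar absorbed into \(\mathbf{A}\). In particular, the diagonal of \(\mathbf{A}\) contains no zeros, so the matrix satisfies the requirement of the mocca::CTRW and mocca::CTRWAction solvers.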

Because \(u_0(\vec{x})\) contains mostly zeros, it is better to store it as a sparse vector and compute the random walks backwards (i.e., the program computes the random walks going from \(t\) to \(0\)). Otherwise, when using forward sampling, a random walk may end at a zero entry of the vector \(u_0(\vec{x})\). Such a random walk is “wasted” in the sense that it contributes nothing to the solution vector, and its computation could have been better spent elsewhere.

Backward sampling requires the matrix \(\mathbf{A}\) to be transposed before calculating the matrix exponential, which is done automatically when calling the set_matrix() method. However, we can indicate that \(\mathbf{A}\) is symmetric, in which case the transposition is not needed. See expm_poisson.cpp and ml_poisson.cpp in the examples directory.

#include "../mocca/monte_carlo/ctrw_action.h"
#include "../mocca/io/mtx.h"
#include "../mocca/io/txt.h"

using index_t = mocca::index_t;
using CSRMatrix = mocca::CSRMatrix<double>;
using Vector = mocca::Vector<double>;

int main(int argc, char* argv[])
{
    if (argc != 2)
    {
        fmt::print(stderr, "Invalid Parameters. Usage: ./{} <num samples>\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    std::string filename = "poisson-2D-64";
    double time = 0.1;
    index_t nsamples = atoll(argv[1]);  // The number of random samples is passed as argument

    // Read the discretized Laplacian operator L from the file
    CSRMatrix A = mocca::io::read_sparse_mtx<double>("input/" + filename + ".mtx");
    SparseVector u0(A.rows(), 1);
    Vector solution;

    // Scale the Laplacian: A = n_x^2 / (4 * mu^2) * L, with n_x = 64 and mu = 1
    A *= (64 * 64) / 4;

    // Add a discrete impulse at the centre of the mesh as the initial condition
    u0.insert(31 + 32 * 64, 1);

    mocca::CTRWAction<CSRMatrix, mocca::kBackwardSampling> monte_carlo;

    // Set the parameters of the solver
    monte_carlo.set_param({.num_samples = nsamples,                       // The number of random samples
                           .seed = PCG32_DEFAULT_SEED,                    // Seed for the PCG random number generator
                           .stream = PCG32_DEFAULT_STREAM });             // Stream for the PCG random number generator

    // Set the input matrix, indicating that it is symmetric (so no transposition is needed)
    monte_carlo.set_matrix(A, mocca::kSymmetric);

    // Compute the solution for `t = 0.1`
    solution = monte_carlo.expm(u0, time);

    // Write the results to a TXT file.
    mocca::io::write_txt(solution, "output/results_expm.vec");

    return EXIT_SUCCESS;
}
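
For instance, assuming the program is built as expm_poisson (matching the source file name mentioned above), running ./expm_poisson 1000000 draws \(10^6\) random samples and writes the estimated solution to output/results_expm.vec.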

Example - Node Centrality

Another application of matrix functions is the analysis of complex networks, in particular, the identification of the most important nodes in a graph, such as essential proteins, vulnerable infrastructures, etc. Measures of node importance are referred to as node centrality, and there are many strategies for calculating the centrality of a node. Here, we are particularly interested in two centrality measures: subgraph centrality [6], [7] and total communicability [5].

  • The subgraph centrality measures the importance of a node based on its participation in all subgraphs of the network and, for a node \(i\), is defined as

\[SC(i) = (e^{\gamma \mathbf{A}})_{ii}\]

  • The total communicability measures the importance of a node based on how well it can communicate with the rest of the network and, for a node \(i\), is defined as

\[TC(i) = (e^{\gamma \mathbf{A}} \vec{1})_i\]

where \(\vec{1}\) is a vector of ones. The parameter \(\gamma\) is the “inverse temperature” of the network and it is used to control the penalization of longer walks.
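
Expanding the matrix exponential as a power series makes the walk-counting interpretation explicit:

\[(e^{\gamma \mathbf{A}})_{ii} = \sum_{k = 0}^\infty \frac{\gamma^k}{k!} \, (\mathbf{A}^k)_{ii}, \qquad (e^{\gamma \mathbf{A}} \vec{1})_i = \sum_{k = 0}^\infty \frac{\gamma^k}{k!} \, (\mathbf{A}^k \vec{1})_i,\]

where, for an unweighted adjacency matrix, \((\mathbf{A}^k)_{ii}\) counts the closed walks of length \(k\) that start and end at node \(i\), and \((\mathbf{A}^k \vec{1})_i\) counts all walks of length \(k\) that start at node \(i\); the factor \(\gamma^k / k!\) is what downweights the longer walks.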

The mocca::RandFunm solver is particularly efficient for estimating these two metrics. The subgraph centrality only requires the diagonal of \(e^{\gamma \mathbf{A}}\) and can be calculated with the funm_diag(mocca::expm, GAMMA) method, while the total communicability is equal to \(e^{\gamma \mathbf{A}} \vec{v}\) with \(\vec{v} = \vec{1}\) and can be estimated with the funm(mocca::expm, v, GAMMA) method. Here the adjacency matrix \(\mathbf{A}\) is stored in CSR format in an HDF5 file, which was generated with the io::write_hdf5 routine. The full code is available in examples/node_centrality.cpp.

#include "mocca/monte_carlo/rand_funm.h"
#include "mocca/core.h"
#include "mocca/io/hdf5.h"

#define MAX_TERMS 256           // Define the maximum number of terms in the series
#define GAMMA 1E-3              // The "inverse temperature" of the network
#define Wc 1E-6                 // Weight cutoff. The random walk terminates at step `k` if its weight `Wk` falls below `Wc * W0`

using index_t = mocca::index_t;
using Vector = mocca::Vector<double>;
using CSRMatrix = mocca::CSRMatrix<double>;

int main(int argc, char* argv[])
{
    if (argc != 3)
    {
        fmt::print(stderr, "Invalid Parameters. Usage: ./{} <filename> <num_samples>\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    std::string filename = argv[1];
    index_t samples = atol(argv[2]);

    mocca::RandFunm<CSRMatrix> monte_carlo;

    Vector res;
    CSRMatrix A;
    Vector ref_subgraph, ref_total_comm;

    // The network is stored as a CSR matrix in an HDF5 file, generated with `io::write_hdf5`
    mocca::io::read_hdf5("input/" + filename + ".h5", "A", A);

    // The vector for the total communicability contains only `1`
    Vector vec(A.cols(), 1);

    monte_carlo.set_matrix(A);
    monte_carlo.set_param({.num_samples = samples,                      // Number of random walks
                           .max_num_terms = MAX_TERMS,                  // Maximum number of terms in the series. Exceeding this value triggers a `std::runtime_error` exception.
                           .weight_cutoff = Wc,                         // Weight cutoff
                           .seed = PCG32_DEFAULT_SEED,                  // Seed for the PCG random number generator
                           .stream = PCG32_DEFAULT_STREAM});            // Stream for the PCG random number generator

    res = monte_carlo.funm_diag(mocca::expm, GAMMA);
    fmt::print("Subgraph centrality:\n {}\n\n", res);

    res = monte_carlo.funm(mocca::expm, vec, GAMMA);
    fmt::print("Total Communicability:\n {}\n\n", res);

    return EXIT_SUCCESS;
}
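
For example, assuming a network stored as input/my_network.h5 (a hypothetical file name), running ./node_centrality my_network 1000000 prints both centrality estimates computed from \(10^6\) random samples.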

References