Software:GraphBLAS

Short description: API for graph data and graph operations


GraphBLAS Specification
Status: Released
First published: 29 May 2017
Latest version: 2.1.0 (22 December 2023)
Domain: Graph algorithms
License: Creative Commons Attribution (CC BY) 4.0
Website: graphblas.org

GraphBLAS (/ˈɡræfˌblɑːz/) is an API specification that defines standard building blocks for graph algorithms in the language of linear algebra.[1][2] GraphBLAS is built upon the notion that a sparse matrix can be used to represent graphs as either an adjacency matrix or an incidence matrix. The GraphBLAS specification describes how graph operations (e.g. traversing and transforming graphs) can be efficiently implemented via linear algebraic methods (e.g. matrix multiplication) over different semirings.[3]

The development of GraphBLAS and its various implementations is an ongoing community effort, including representatives from industry, academia, and government research labs.[4][5]

Background

Graph algorithms have long taken advantage of the idea that a graph can be represented as a matrix, and graph operations can be performed as linear transformations and other linear algebraic operations on sparse matrices.[6](pp. xxv–xxvi) For example, matrix-vector multiplication can be used to perform a step in a breadth-first search.[6](pp. 32–33)

The GraphBLAS specification (and the various libraries that implement it) provides data structures and functions to compute these linear algebraic operations. In particular, GraphBLAS specifies sparse matrix objects which map well to graphs where vertices are likely connected to relatively few neighbors (i.e. the degree of a vertex is significantly smaller than the total number of vertices in the graph). The specification also allows for the use of different semirings to accomplish operations in a variety of mathematical contexts.
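To illustrate why sparse storage suits graphs whose vertices have few neighbors, the following is a minimal sketch of compressed sparse row (CSR) storage in plain C. It is an illustration only: GraphBLAS objects are opaque and the specification does not mandate any storage format, and the names `CsrGraph`, `csr_out_degree`, and `csr_has_edge` are hypothetical, not part of the GraphBLAS API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical CSR adjacency structure (not part of the GraphBLAS API).
 * The neighbors of vertex i are col_idx[row_ptr[i]] .. col_idx[row_ptr[i+1]-1],
 * so storage grows with vertices + edges rather than vertices squared. */
typedef struct {
    size_t n;               /* number of vertices */
    const size_t *row_ptr;  /* n + 1 offsets into col_idx */
    const size_t *col_idx;  /* one entry per stored edge */
} CsrGraph;

/* Out-degree of vertex i: the number of stored entries in row i. */
size_t csr_out_degree(const CsrGraph *g, size_t i)
{
    assert(i < g->n);
    return g->row_ptr[i + 1] - g->row_ptr[i];
}

/* Returns true if edge (i, j) is stored; scans only the entries of row i. */
bool csr_has_edge(const CsrGraph *g, size_t i, size_t j)
{
    assert(i < g->n && j < g->n);
    for (size_t k = g->row_ptr[i]; k < g->row_ptr[i + 1]; ++k) {
        if (g->col_idx[k] == j) {
            return true;
        }
    }
    return false;
}
```

With this layout, the cost of inspecting a vertex depends on its degree rather than on the total number of vertices, which is the property the paragraph above relies on.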

Originally motivated by the need for standardization in graph analytics, similar to its namesake BLAS,[7] the GraphBLAS standard has also begun to interest people outside the graph community, including researchers in machine learning[8] and bioinformatics.[9] GraphBLAS implementations have also been used in high-performance graph database applications such as RedisGraph.[10][11][12][13][14]

Specification

The GraphBLAS specification has been in development since 2013,[15] and has reached version 2.1.0 as of December 2023.[16] While formally a specification for the C programming language, a variety of programming languages have been used to develop implementations in the spirit of GraphBLAS, including C++,[17] Java,[18] and Nvidia CUDA.[19]

Compliant implementations and language bindings

There are currently two fully compliant reference implementations of the GraphBLAS specification.[20][21] Bindings that rely on a compliant implementation exist for the Python,[22] MATLAB,[23] and Julia[24][25] programming languages.

Linear algebraic foundations

Computing a single step in a breadth-first search of a graph. Matrix-vector multiplication can be used to compute the outbound neighbors (vertices 1 and 3, shown in blue) of a given source vertex (shown in red). Note that the matrix [math]\displaystyle{ A }[/math] is the adjacency matrix of the graph shown to the left, with outbound edges (4,1) and (4,3) shown in green.
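The step described in this caption can be sketched in plain C on a dense Boolean matrix. This is an illustration under stated assumptions rather than GraphBLAS code: the function name `bfs_step` is hypothetical, vertices are numbered from 0 (so vertex 4 of the figure is index 3), and the product is written as a row vector times the matrix, with entry (i, j) set when an edge runs from i to j.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: one BFS step as a vector-matrix product over the
 * Boolean (OR, AND) semiring. a is a dense n x n row-major adjacency matrix
 * with a[i*n + j] true when edge (i, j) exists. On return, next[j] is true
 * when some frontier vertex i has an edge to j. frontier and next must not
 * overlap. */
void bfs_step(size_t n, const bool *a, const bool *frontier, bool *next)
{
    assert(a != NULL && frontier != NULL && next != NULL);
    for (size_t j = 0; j < n; ++j) {
        bool reached = false;                 /* identity element of OR */
        for (size_t i = 0; i < n; ++i) {
            /* "multiply" with AND, "add" with OR */
            reached = reached || (frontier[i] && a[i * n + j]);
        }
        next[j] = reached;
    }
}
```

Calling `bfs_step` with a frontier that contains only index 3, on a matrix holding the edges (3, 0) and (3, 2), yields a result with exactly indices 0 and 2 set, matching the outbound neighbors 1 and 3 in the figure's 1-based numbering.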

The mathematical foundations of GraphBLAS are based in linear algebra and the duality between matrices and graphs.[26][27]

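Each graph operation in GraphBLAS operates on a semiring, which is made up of the following elements:

  • A scalar addition operator [math]\displaystyle{ \oplus }[/math]
  • A scalar multiplication operator [math]\displaystyle{ \otimes }[/math]
  • A set (or domain) of values on which the two operators are defined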

Note that the zero element (i.e. the element that represents the absence of an edge in the graph) can also be reinterpreted.[26]("VII. 0-Element: No Graph Edge") For example, the following algebras can be implemented in GraphBLAS:

Algebra | [math]\displaystyle{ \oplus }[/math] | [math]\displaystyle{ \otimes }[/math] | Domain | Zero Element
Standard arithmetic | [math]\displaystyle{ + }[/math] | [math]\displaystyle{ \times }[/math] | [math]\displaystyle{ \R }[/math] | 0
Max–plus algebra | [math]\displaystyle{ \max }[/math] | [math]\displaystyle{ + }[/math] | [math]\displaystyle{ \{-\infty\} \cup \R }[/math] | [math]\displaystyle{ -\infty }[/math]
Min–plus algebra | [math]\displaystyle{ \min }[/math] | [math]\displaystyle{ + }[/math] | [math]\displaystyle{ \{+\infty\} \cup \R }[/math] | [math]\displaystyle{ +\infty }[/math]
Max–min algebra | [math]\displaystyle{ \max }[/math] | [math]\displaystyle{ \min }[/math] | [math]\displaystyle{ [0, +\infty) }[/math] | 0
Min–max algebra | [math]\displaystyle{ \min }[/math] | [math]\displaystyle{ \max }[/math] | [math]\displaystyle{ (-\infty, 0] }[/math] | 0
Galois field | XOR | AND | [math]\displaystyle{ \{0,1\} }[/math] | 0

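All the examples above satisfy the following two conditions in their respective domains, where 0 denotes the zero element of the algebra:

  • Additive identity: [math]\displaystyle{ a \oplus 0 = a }[/math]
  • Multiplicative annihilation: [math]\displaystyle{ a \otimes 0 = 0 }[/math]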

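For instance, a user can specify the min-plus algebra over the domain of double-precision floating point numbers with GrB_Semiring_new(&min_plus_semiring, GrB_MIN_MONOID_FP64, GrB_PLUS_FP64). The additive operator is passed as a monoid (an operator together with its identity element), while the multiplicative operator is passed as a binary operator.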
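The role of the semiring can be made concrete with a short plain C sketch in which one multiplication routine is parameterized by the addition operator, the multiplication operator, and the zero element. Everything here is hypothetical and chosen for illustration (the `Semiring` struct, `semiring_vxm`, and the two constants); it is not the GraphBLAS API and it uses dense arrays instead of sparse objects.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical semiring description (not a GraphBLAS type). */
typedef struct {
    double (*add)(double, double);  /* the "oplus" operator */
    double (*mul)(double, double);  /* the "otimes" operator */
    double zero;                    /* identity of add, annihilator of mul */
} Semiring;

static double op_min(double x, double y)   { return x < y ? x : y; }
static double op_plus(double x, double y)  { return x + y; }
static double op_times(double x, double y) { return x * y; }

/* Two rows of the table above: standard arithmetic and min-plus. */
static const Semiring PLUS_TIMES = { op_plus, op_times, 0.0 };
static const Semiring MIN_PLUS   = { op_min,  op_plus,  INFINITY };

/* w[j] = "sum" over i of u[i] "times" a[i*n + j], using the operators of s.
 * a is a dense n x n row-major matrix; u and w have n entries each and
 * must not overlap. */
void semiring_vxm(const Semiring *s, size_t n,
                  const double *u, const double *a, double *w)
{
    assert(s != NULL && s->add != NULL && s->mul != NULL);
    for (size_t j = 0; j < n; ++j) {
        double acc = s->zero;                    /* start from the zero element */
        for (size_t i = 0; i < n; ++i) {
            acc = s->add(acc, s->mul(u[i], a[i * n + j]));
        }
        w[j] = acc;
    }
}
```

With `MIN_PLUS`, a vector `u` of tentative path costs, and a matrix of edge weights in which a missing edge is stored as `INFINITY`, each call computes the cheapest cost of reaching every vertex by extending a path by exactly one edge. With `PLUS_TIMES`, the same routine is an ordinary vector-matrix product.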

Functionality

While the GraphBLAS specification generally allows significant flexibility in implementation, some functionality and implementation details are explicitly described:

  • GraphBLAS objects, including matrices and vectors, are opaque data structures.[16](2.4 GraphBLAS Opaque Objects)
  • Non-blocking execution mode, which permits lazy or asynchronous evaluation of certain operations.[16](2.5.1 Execution modes)
  • Masked assignment, denoted [math]\displaystyle{ A\langle M \rangle = B }[/math], which assigns elements of matrix [math]\displaystyle{ B }[/math] to matrix [math]\displaystyle{ A }[/math] only in positions where the mask matrix [math]\displaystyle{ M }[/math] is non-zero.[16](3.5.4 Masks)

The GraphBLAS specification also prescribes that library implementations be thread-safe.[16](2.5.2 Multi-threaded execution)
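The masked assignment listed above can be sketched on dense arrays in plain C. This is a simplified illustration, not the GraphBLAS API: the function name `masked_assign` is hypothetical, and the sketch ignores the sparse case (where an entry may be absent) as well as the descriptor options, such as the complemented mask used in the breadth-first search example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical dense sketch of A<M> = B (not the GraphBLAS API):
 * copy b[k] into a[k] only where mask[k] is true; all other entries of a
 * are left unchanged. Each array holds count = rows * cols elements. */
void masked_assign(size_t count, int *a, const bool *mask, const int *b)
{
    assert(a != NULL && mask != NULL && b != NULL);
    for (size_t k = 0; k < count; ++k) {
        if (mask[k]) {
            a[k] = b[k];
        }
    }
}
```

For example, assigning {10, 20, 30, 40} into {1, 2, 3, 4} under the mask {true, false, false, true} leaves {10, 2, 3, 40}.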

Example code

The following is a GraphBLAS 2.1-compliant example of a breadth-first search in the C programming language.[16](p. 294)

#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include "GraphBLAS.h"

/*
 * Given a boolean n x n adjacency matrix A and a source vertex s, performs a BFS traversal
 * of the graph and sets v[i] to the level in which vertex i is visited (v[s] == 1).
 * If i is not reachable from s, then v[i] does not have a stored element.
 * Vector v should be uninitialized on input.
 */
GrB_Info BFS(GrB_Vector *v, GrB_Matrix A, GrB_Index s)
{
  GrB_Index n;
  GrB_Matrix_nrows(&n,A);                  // n = # of rows of A

  GrB_Vector_new(v,GrB_INT32,n);           // Vector<int32_t> v(n)

  GrB_Vector q;                            // vertices visited in each level
  GrB_Vector_new(&q, GrB_BOOL, n);         // Vector<bool> q(n)
  GrB_Vector_setElement(q, (bool)true, s); // q[s] = true, false everywhere else

  /*
   * BFS traversal and label the vertices.
   */
  int32_t level = 0;                                       // level = depth in BFS traversal
  GrB_Index nvals;
  do {
    ++level;                                               // next level (start with 1)
    GrB_apply(*v, GrB_NULL, GrB_PLUS_INT32,
              GrB_SECOND_INT32, q, level, GrB_NULL);       // v[q] = level
    GrB_vxm(q, *v, GrB_NULL, GrB_LOR_LAND_SEMIRING_BOOL,
            q, A, GrB_DESC_RC);                            // q[!v] = q ||.&& A; finds all the 
                                                           // unvisited successors from current q
    GrB_Vector_nvals(&nvals, q);
  } while (nvals);                                         // if there is no successor in q, we are done.

  GrB_free(&q);                                            // q vector no longer needed

  return GrB_SUCCESS;
}
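To make the data flow of the loop above easier to follow, the same level-synchronous iteration can be restated on dense arrays in plain C. This is a sketch under assumptions, not part of the specification's example: the name `bfs_levels_dense` is hypothetical, no GraphBLAS calls are used, and a level of 0 stands in for "no stored element".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical dense restatement of the BFS loop (not the GraphBLAS API).
 * a is an n x n row-major Boolean adjacency matrix; level[i] receives the
 * BFS level of vertex i (the source has level 1), or 0 if i is unreachable.
 * Returns false if memory allocation fails. */
bool bfs_levels_dense(size_t n, const bool *a, size_t s, int32_t *level)
{
    assert(s < n);
    bool *q = calloc(n, sizeof *q);        /* current frontier */
    bool *next = calloc(n, sizeof *next);  /* frontier being built */
    if (q == NULL || next == NULL) {
        free(q);
        free(next);
        return false;
    }
    for (size_t i = 0; i < n; ++i) {
        level[i] = 0;
    }
    q[s] = true;

    int32_t depth = 0;
    size_t nvals;
    do {
        ++depth;
        for (size_t i = 0; i < n; ++i) {   /* v[q] = level */
            if (q[i]) {
                level[i] = depth;
            }
        }
        nvals = 0;
        for (size_t j = 0; j < n; ++j) {   /* q[!v] = q ||.&& A */
            bool reached = false;
            for (size_t i = 0; i < n && !reached; ++i) {
                reached = q[i] && a[i * n + j];
            }
            next[j] = reached && level[j] == 0;  /* keep unvisited only */
            if (next[j]) {
                ++nvals;
            }
        }
        bool *tmp = q;                     /* the new frontier becomes q */
        q = next;
        next = tmp;
    } while (nvals > 0);                   /* stop when the frontier is empty */

    free(q);
    free(next);
    return true;
}
```

The check `level[j] == 0` plays the role of the complemented mask in the `GrB_vxm` call: vertices that already have a level are excluded from the next frontier, which also guarantees termination on graphs with cycles.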

References

  1. "GraphBLAS". https://graphblas.org/. 
  2. "GraphBLAS: A Programming Specification for Graph Analysis". https://www.sei.cmu.edu/research-capabilities/all-work/display.cfm?customel_datapageid_4050=6501. 
  3. Pereira, Juliana. "High-Performance Graph Algorithms Using Linear Algebra". Central European University, Department of Network and Data Science. https://networkdatascience.ceu.edu/article/2020-01-15/high-performance-graph-algorithms-using-linear-algebra. Retrieved 13 February 2020. 
  4. "People of ACM - Tim Davis". Association for Computing Machinery. https://www.acm.org/articles/people-of-acm/2019/tim-davis. Retrieved 8 November 2019. 
  5. Mattson, Tim; Gabb, Henry. "Graph Analytics: A Foundational Building Block for the Data Analytics World". Intel. https://techdecoded.intel.io/big-picture/graph-analytics-a-foundational-building-block-for-the-data-analytics-world/#gs.watx6b. Retrieved 14 February 2020. 
  6. Kepner, Jeremy; Gilbert, John (2011). Graph Algorithms in the Language of Linear Algebra. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics. ISBN 9780898719901. https://dl.acm.org/citation.cfm?id=2039367. Retrieved 8 November 2019.
  7. Vu, Linda. "GraphBLAS: Building Blocks for High Performance Graph Analytics". https://crd.lbl.gov/news-and-publications/news/2017/graphblas-building-blocks-for-high-performance-graph-analytics/. Retrieved 8 November 2019. "In subsequent years, various research collaborations created a variety of BLAS libraries for different tasks. Realizing the benefits to users, vendors also worked with researchers to optimize these building blocks to run on their hardware. GraphBLAS is essentially a continuation of this BLAS heritage."
  8. Kepner, Jeremy; Kumar, Manoj; Moreira, José; Pattnaik, Pratap; Serrano, Mauricio; Tufo, Henry (12–14 September 2017). "Enabling massive deep neural networks with the GraphBLAS". 2017 IEEE High Performance Extreme Computing Conference (HPEC). pp. 1–10. doi:10.1109/HPEC.2017.8091098. ISBN 978-1-5386-3472-1. Bibcode: 2017arXiv170802937K. "In this paper we have shown that the key [deep neural network] computations can be represented in GraphBLAS, a library interface defined for sparse matrix algebra. Furthermore, we have shown that the key step of forward propagation, with ReLU as the nonlinearity, can be performed much more efficiently with GraphBLAS implementation as compared to BLAS implementation when the weight matrices are sparse."
  9. Vu, Linda (12 March 2018). "A Game Changer: Metagenomic Clustering Powered by Supercomputers". Lawrence Berkeley National Laboratory News Center. https://newscenter.lbl.gov/2018/03/12/metagenomic-clustering-powered-by-supercomputers/. Retrieved 10 November 2019. 
  10. "RedisGraph". https://redislabs.com/redis-enterprise/redis-modules/redis-enterprise-modules/redisgraph/. Retrieved 11 November 2019. 
  11. Anadiotis, George (24 October 2019). "Redis Labs goes Google Cloud, Graph, and other interesting places". ZDNet. https://www.zdnet.com/article/redis-labs-goes-google-cloud-graph-and-other-interesting-places/. Retrieved 8 November 2019. 
  12. "Redis Labs Introduces RedisGraph and Streams to Support a Zero Latency Future". DevOps.com. 16 November 2018. https://devops.com/redis-labs-introduces-redisgraph-and-streams-to-support-a-zero-latency-future/. Retrieved 10 November 2019. "Built on GraphBLAS, an open-source library that employs linear algebra including matrix multiplication, RedisGraph can complete calculations up to 600 times faster than any alternate graph solution according to benchmark results."
  13. Woodie, Alex (28 September 2018). "Redis Speeds Towards a Multi-Model Future". Datanami. https://www.datanami.com/2018/09/28/redis-speeds-towards-a-multi-model-future/. Retrieved 10 November 2019. "One of the newest modules to emerge from Redis Labs turns the key value store into a graph database. The module, called RedisGraph, will be based on the GraphBLAS technology that emerged out of academia and industry."
  14. Dsouza, Melisha (20 November 2018). "RedisGraph v1.0 released, benchmarking proves its 6-600 times faster than existing graph databases". Packt. https://hub.packtpub.com/redisgraph-v1-0-released-benchmarking-proves-its-6-600-times-faster-than-existing-graph-databases/. Retrieved 10 November 2019. "RedisGraph is a Redis module that adds a graph database functionality to Redis. RedisGraph delivers a fast and efficient way to store, manage and process graphs, around 6 to 600 times faster than existing graph databases. RedisGraph represents connected data as adjacency matrices and employs the power of GraphBLAS which is a highly optimized library for sparse matrix operations."
  15. Mattson, Tim; Bader, David; Berry, Jon; Buluç, Aydin; Dongarra, Jack; Faloutsos, Christos; Feo, John; Gilbert, John et al. (10–12 September 2013). "Standards for graph algorithm primitives". 2013 IEEE High Performance Extreme Computing Conference (HPEC). pp. 1–2. doi:10.1109/HPEC.2013.6670338. ISBN 978-1-4799-1365-7. "It is our view that the state of the art in constructing a large collection of graph algorithms in terms of linear algebraic operations is mature enough to support the emergence of a standard set of primitive building blocks. This paper is a position paper defining the problem and announcing our intention to launch an open effort to define this standard."
  16. Brock, Benjamin; Buluç, Aydın; Kimmerer, Raye; Kitchen, Jim; Kumar, Manoj; Mattson, Timothy; McMillan, Scott; Moreira, José et al. "The GraphBLAS C API Specification: Version 2.1.0". https://graphblas.org/docs/GraphBLAS_API_C_v2.1.0.pdf. Retrieved 22 December 2023.
  17. "GraphBLAS Template Library (GBTL)". https://github.com/cmu-sei/gbtl. Retrieved 8 November 2019. 
  18. "Graphulo: Graph Processing on Accumulo". http://graphulo.mit.edu. Retrieved 8 November 2019. 
  19. "GraphBLAST". https://github.com/gunrock/graphblast. Retrieved 8 November 2019. 
  20. Davis, Timothy. "SuiteSparse:GraphBLAS". http://faculty.cse.tamu.edu/davis/GraphBLAS.html. Retrieved 11 November 2019. "SuiteSparse:GraphBLAS is a full implementation of the GraphBLAS standard (graphblas.org), which defines a set of sparse matrix operations on an extended algebra of semirings using an almost unlimited variety of operators and types."
  21. Moreira, Jose; Horn, Bill. "ibmgraphblas". https://github.com/IBM/ibmgraphblas. Retrieved 19 November 2019. 
  22. Pelletier, Michel. "GraphBLAS for Python". https://github.com/michelp/pygraphblas. Retrieved 11 November 2019. 
  23. Davis, Timothy. "SuiteSparse:GraphBLAS". http://faculty.cse.tamu.edu/davis/GraphBLAS.html. Retrieved 11 November 2019. "Now with OpenMP parallelism and a MATLAB interface"
  24. Mehndiratta, Abhinav. "GraphBLAS Implementation". https://summerofcode.withgoogle.com/archive/2019/projects/5106075190165504/. Retrieved 11 November 2019. 
  25. Mehndiratta, Abhinav (7 June 2019). "An introduction to GraphBLAS". https://abhinavmehndiratta.github.io/2019-06-07/an-introduction-to-graphblas. Retrieved 11 November 2019. 
  26. Kepner, Jeremy; Aaltonen, Peter; Bader, David; Buluç, Aydın; Franchetti, Franz; Gilbert, John; Hutchison, Dylan; Kumar, Manoj et al. (13–15 September 2016). "Mathematical foundations of the GraphBLAS". 2016 IEEE High Performance Extreme Computing Conference (HPEC). pp. 1–9. doi:10.1109/HPEC.2016.7761646. ISBN 978-1-5090-3525-0. Bibcode: 2016arXiv160605790K.
  27. For additional mathematical background, see Kepner, Jeremy; Jananthan, Hayden (17 July 2018). Mathematics of Big Data: Spreadsheets, Databases, Matrices, and Graphs. The MIT Press. pp. 81–168. ISBN 978-0262038393. https://mitpress.mit.edu/books/mathematics-big-data. Retrieved 10 November 2019. 
