DMelt:Math/7 Graphs
Graphs
Graphs belong to a branch of mathematics, graph theory. For data analysis that requires searches of particular patterns, graph-based data mining becomes an important technique. Read Graph theory.
In real life, most of the data we have to deal with can be represented as graphs. A typical graph consists of vertices (nodes, cells), and of edges that are the connecting lines between the nodes. Analysis of graphs includes determining certain details about the graph structure. For example, determining all routes or the shortest path between two nodes or cells.
DMelt supports the following graph libraries
- Medusa [1][ (via the class [http:/datamelt.org/api/doc.php/jhplot/HMedusa HMedusa]). See Medusa API
- JGraphT library [2] (see JGraphT API)
- JGraphX library [3] (see JGraphX API)
- Jung library [4](see Jung API)
These libraries support graph creation using cells and edges, as well as their visualization. Note that DMelt includes a Swing GUI to draw graphs. Look at the menu [Tools]->[Graph Editor]. In this section we will show how to draw Graphs and analyze them.
Graph code examples |
Drawing graphs
The drawing of graphs in DMelt is based on the Java class jhplot.HGraph. Here is a simple example to draw vertices and edges:
The code is shown below:
from java.awt import Color from java.util import Random from jhplot import HGraph c1 = HGraph("Canvas",600,400,1, 1) c1.setGTitle("Connected objects") c1.visible(1) v1="v1" v2="v2" v3="v3" v4="v4" c1.addVertex( v1 ) c1.addVertex( v2 ) c1.addVertex( v3 ) c1.addVertex( v4 ) c1.setPos( v1, 130, 40 ) c1.setPos( v2, 60, 200 ) c1.setPos( v3, 310, 230 ) c1.setPos( v4, 380, 70 ) c1.addEdge( v1, v2 ) c1.addEdge( v2, v3 ) c1.addEdge( v3, v1 ) c1.addEdge( v4, v3 )
The above example shows how to build a graph using a predefined methods of jhplot.HGraph. Generally, there are two types of programs that can be used to work with graphs:
- Using the predefined graphical canvas that comes with jhplot.HGraph as shows above
- Building first graph without visualization, and the visualize it (if required)
Let us consider the latter case:
from jhplot import * import sys g=HGraph().buildDirectedGraph() v1,v2,v3,v4 = "v1","v2","v3","v4" # add vertexes g.addVertex(v1); g.addVertex(v2); g.addVertex(v3); g.addVertex(v4); g.addEdge(v1, v2); g.addEdge(v2, v3); g.addEdge(v3, v1); g.addEdge(v4, v3); HGraph().showGraph(g) # show the graph in GUI
In this example we create a graph, and work with them, and later visualize. It creates the following image:
Analyzing the graphs
There are a number of algorithms that can be used to work with the graphs. Let us consider a typical example: finding a shortest path between two verticies, Look at the API summary org.jgrapht.alg.shortestpath.package-summary and org.jgrapht.alg.package-summary.
We will use the graphical approach when we construct the graph on the canvas, access it, and run a few algorithms to find the shortest path:
# Finding a shortest distance in a graph from jhplot import HGraph c1 = HGraph("Canvas",600,600,1, 1) v1="v1" v2="v2" v3="v3" v4="v4" v5="v5" v6="v6" v7="v7" v8="v8" c1.addVertex( v1 ) c1.addVertex( v2 ) c1.addVertex( v3 ) c1.addVertex( v4 ) c1.addVertex( v5 ) c1.addVertex( v6 ) c1.addVertex( v7 ) c1.addVertex( v8 ) c1.setPos( v1, 130, 40 ) c1.setPos( v2, 60, 200 ) c1.setPos( v3, 310, 330 ) c1.setPos( v4, 380, 70 ) c1.setPos( v5, 300, 150 ) c1.setPos( v6, 50, 20 ) c1.setPos( v7, 400, 380 ) c1.setPos( v8, 50, 400 ) c1.addEdge( v1, v2 ) c1.addEdge( v2, v3 ) c1.addEdge( v3, v1 ) c1.addEdge( v4, v3 ) c1.addEdge( v1, v5 ) c1.addEdge( v2, v6 ) c1.addEdge( v6, v7 ) c1.addEdge( v4, v8 ) c1.visible() # see: http://jgrapht.org/javadoc/org/jgrapht/alg/package-summary # http://jgrapht.org/javadoc/org/jgrapht/alg/shortestpath/package-summary graph=c1.getListenableGraph() print graph from org.jgrapht import GraphTests print "Is complete=", GraphTests.isSimple(graph) print "Is complete=", GraphTests.isComplete(graph) from org.jgrapht.alg.shortestpath import DijkstraShortestPath print "Find path between ",v1, " and ", v3 d=DijkstraShortestPath(graph) short=d.getPath(v1,v3) # distance measured in the number of edges. print "Shortest path=",short, " Distance=",short.getLength() from org.jgrapht.alg.shortestpath import BidirectionalDijkstraShortestPath print "Find bi-directional shortest path between ",v1, " and ", v3 d=BidirectionalDijkstraShortestPath(graph) short=d.getPath(v1,v3) print "Shortest path=",short, " Distance=",short.getLength()
This example prints the pass using org.jgrapht.alg.shortestpath.DijkstraShortestPath (Dijkstra Shortest Path) algorithm.
Weighted graphs
In many applications, each edge of a graph can have an associated numerical value, called a weight Weighted graphs may be either directed or undirected. The weight of an edge is often referred to as the "cost" of the edge. For example, the weight may be a measure of the length of a route, the used energy that is required to drive between locations along a route, etc.
Here is an example of how to build a graph with weighted edges. Then we run several algorithms to identify a path with smallest weight, For example, KShortestPaths algorithm finds a path that has smallest weight (event although it has several edges). We also run the BellmanFordShortestPath algorithm that can be used with graph with edges.
# Find a shortest path in weighted graph from jhplot import * g=HGraph().buildWeightedPseudograph() v1,v2,v3,v4,v5,v6,v7 = "v1","v2","v3","v4","v5","v6","v7" # add vertecies g.addVertex(v1) g.addVertex(v2) g.addVertex(v3) g.addVertex(v4) g.addVertex(v5) g.addVertex(v6) g.addVertex(v7) factory = g.getEdgeFactory() e = factory.createEdge(v1,v2) g.setEdgeWeight(e, 5) g.addEdge(v1, v2, e) e = factory.createEdge(v1,v2) g.setEdgeWeight(e, 8.8) g.addEdge(v1, v5, e) e = factory.createEdge(v1,v2) g.setEdgeWeight(e, 0.1) g.addEdge(v2, v5, e) e = factory.createEdge(v1,v3) g.setEdgeWeight(e, 2) g.addEdge(v1, v3, e) e = factory.createEdge(v3,v5) g.setEdgeWeight(e, 10.5) g.addEdge(v3, v5, e) e = factory.createEdge(v1,v7) g.setEdgeWeight(e, 3) g.addEdge(v1, v7, e) e = factory.createEdge(v7,v5) g.setEdgeWeight(e, 0.1) g.addEdge(v7, v5, e) e = factory.createEdge(v2,v6) g.setEdgeWeight(e, 2.1) g.addEdge(v2, v6, e) e = factory.createEdge(v6,v5) g.setEdgeWeight(e, 1) g.addEdge(v6, v5, e) e = factory.createEdge(v3,v4) g.setEdgeWeight(e, 1) g.addEdge(v3, v4, e) e = factory.createEdge(v4,v5) g.setEdgeWeight(e, 5) g.addEdge(v4, v5, e) HGraph().showGraph(g) # show the graph in GUI from org.jgrapht.alg.shortestpath import BellmanFordShortestPath print "Find weighted path between ",v1, " and ", v5 d=BellmanFordShortestPath(g,10) short=d.getPath(v1,v5) # distance measured in the number of edges. print "Shortest path=",short, " Distance=",short.getLength(), "Weight=",d.getPathWeight(v1,v5) from org.jgrapht.alg.shortestpath import KShortestPaths print "KShortestPaths with weights path between ",v1, " and ", v5 d=KShortestPaths(g, 10) short=d.getPaths(v1,v5) print "List shortest path using weights:" for l in d.getPaths(v1,v5): print l, "length=", l.getLength(), " weight=",l.getWeight()
The last line of the code prints all paths, ordered them according to the weight.
Graph input and output
Graphs can be saved and can be read back from a files. For example, graphs can be saved to GraphML (i.e. XML-like) format using as shown in this example:
# Exports a graph with weights as GraphML. # For a description of the format see http://en.wikipedia.org/wiki/ GraphML. # see: http://jgrapht.org/javadoc/org/jgrapht/ext/package-summary from jhplot import * g=HGraph().buildWeightedPseudograph() # add vertices v1,v2,v3 = "v1","v2","v3" g.addVertex(v1), g.addVertex(v2), g.addVertex(v3) factory = g.getEdgeFactory() e = factory.createEdge(v1,v2) g.setEdgeWeight(e, 5) g.addEdge(v1, v2, e) e = factory.createEdge(v2,v3) g.setEdgeWeight(e, 8.8) g.addEdge(v2, v3, e) e = factory.createEdge(v2,v1) g.setEdgeWeight(e, 0.1) g.addEdge(v2, v1, e) HGraph().showGraph(g) # show the graph in GUI from org.jgrapht.ext import GraphMLExporter from java.io import * file="example.ml" writer=FileWriter(file) f=GraphMLExporter() f.setExportEdgeWeights(True) f.exportGraph(g, writer) writer.close() print "File ",file," is created"
Here we create a weighted graph and save the into GraphXL.
One convenient method is to save files is to use CVS files. You do not need any particular library to do this. Here we show how to save and restore a weighted graph in a CVS using the Python syntax:
from jhplot import * g=HGraph().buildWeightedPseudograph() v1,v2,v3 = "v1","v2","v3" g.addVertex(v1), g.addVertex(v2), g.addVertex(v3) factory = g.getEdgeFactory() e = factory.createEdge(v1,v2) g.setEdgeWeight(e, 5) g.addEdge(v1, v2, e) e = factory.createEdge(v2,v3) g.setEdgeWeight(e, 8.8) g.addEdge(v2, v3, e) e = factory.createEdge(v2,v1) g.setEdgeWeight(e, 0.1) g.addEdge(v2, v1, e) HGraph().showGraph(g) # show the graph in GUI name='csvfile.csv' # write to CVS with weights with open(name,'wb') as file: for e in g.edgeSet(): line=str(g.getEdgeSource(e))+","+str(g.getEdgeTarget(e))+","+str(g.getEdgeWeight(e)) file.write(line) file.write('\n') # reading from CVS gg=HGraph().buildWeightedPseudograph() vertex=set() graph=[] with open(name) as f: alist = f.read().splitlines() for l in alist: words = l.split(',') v1,v2=str(words[0]),str(words[1]) vertex.add(v1) vertex.add(v2) w=float(words[2]) graph.append([v1,v2,w]) # add vertex for v in vertex: gg.addVertex(v) for j in graph: v1,v2,w=j[0],j[1],j[2] e = factory.createEdge(v1,v2) gg.setEdgeWeight(e, w) gg.addEdge(v1, v2, e) print "New graph was created from file ",name HGraph().showGraph(gg) # show the graph in GUI
The result of this script is a graph which is is identical to the original one.
Using Medusa
Let us give another complex example: building a graph using the class jhplot.HMedusa, that calls the Medusa Java library. This Java library is a powerful framework for visualization of interactive graphs and, primary, for clustering analysis of large-scale biological networks. Here is is an image of simple interactive graph
with the code shown bellow:
from medusa.graph import Graph,Node,Edge graph=Graph() n1="v1" n2="v2" e=Edge(n1,n2,1) graph.addEdge(e) n1="v2" n2="v7" e=Edge(n1,n2,4) graph.addEdge(e) n1="v7" n2="v8" e=Edge(n1,n2,1) graph.addEdge(e) print graph.report() from jhplot import * c=HMedusa() c.visible() c.add(graph)
One can also build a complex graphs from files as will shown in the Dmelt code example. Read DMelt:ExternalGuides/Medusa.
Using Jung
Jung library [5](see Jung API) is another powerful library to build and visualize graphs. To work with it, you will need a to 1) Build a graph; 2) Create a visualizer 3) Attach a visualizer to a Java JFrame. This small code shows this:
from edu.uci.ics.jung.graph import UndirectedSparseGraph from edu.uci.ics.jung.graph.util import EdgeType graph = UndirectedSparseGraph(); graph.addEdge(1, "v1", "v2", EdgeType.UNDIRECTED) graph.addEdge(2, "v2", "v3", EdgeType.UNDIRECTED) graph.addEdge(3, "v3", "v1", EdgeType.UNDIRECTED) graph.addEdge(4, "v5", "v1", EdgeType.UNDIRECTED) from edu.uci.ics.jung.visualization import VisualizationViewer from edu.uci.ics.jung.algorithms.layout import FRLayout from javax.swing import JFrame jf = JFrame() vv = VisualizationViewer(FRLayout(graph)) jf.getContentPane().add(vv) jf.setDefaultCloseOperation(JFrame.DISPOSE_ON_CLOSE) jf.pack() jf.setVisible(True)
with the image given bellow:
References
- ↑ Medusa visualisation Medusa-Visualisation web site
- ↑ JGraphT web site
- ↑ JGraphX web site
- ↑ Jung web site
- ↑ Jung web site