DMelt:IO/8 Cross Platform IO
Cross platform I/O
HBook class
Many DMelt objects (data arrays, histograms, functions) can be saved and restored using XML files without using Java serialization. The Java class which writes data in XML is called jhplot.io.HBook or jhplot.io.CFBook (which has slightly different XML tags). The class jhplot.io.CFBook can read XML files with histograms created in C++ or Fortran programs.
These classes, unlike the standard Java serialization (including XML) discussed in Section Java Serialization, are very similar to the "AIDA" implementation. Notably:
- jhplot.io.HBook keeps only the information content of objects (values, titles, labels), without graphical attributes (colors, fonts).
- jhplot.io.HBook does not wrap each value in its own XML tag. As a result, this class generates substantially smaller outputs, since the standard tags for data values are not used. This is especially important for large data volumes, 2D matrices and histograms.
- jhplot.io.HBook is better suited for cross-platform use. In particular, C++ and Fortran packages are available to read and write data in this format. See the package CFBook.
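The size argument in the list above can be illustrated with a small CPython sketch. It does not use DataMelt at all: the functions tagged_xml and compact_xml and the per-value tags (point, x, y) are hypothetical names invented for this comparison, and only the compact layout (comma-separated rows inside a single data tag) mirrors what a JDAT file looks like.

```python
import random


def tagged_xml(points):
    """Hypothetical verbose layout: every value gets its own XML element."""
    rows = ["<point><x>%r</x><y>%r</y></point>" % (x, y) for x, y in points]
    return "<data>\n" + "\n".join(rows) + "\n</data>"


def compact_xml(points):
    """JDAT-like layout: one comma-separated row per point inside one tag."""
    rows = ["%r,%r" % (x, y) for x, y in points]
    return "<data>\n" + "\n".join(rows) + "\n</data>"


random.seed(1)
points = [(random.random(), random.random()) for _ in range(1000)]

verbose = tagged_xml(points)
compact = compact_xml(points)
print("per-value tags :", len(verbose), "characters")
print("single data tag:", len(compact), "characters")
```

The per-value markup adds a fixed overhead to every row, so the gap grows linearly with the number of points.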
Here is a small example of how to write a few DataMelt objects:
from jhplot import *
from jhplot.io import *

hb = HBook("output.jdat","w")  # HBook object
p1=P1D("test")
p1.add(1.0,2.0)
p1.add(2.0,3.0)
hb.write("data",p1)            # we add data using "data" key
f1=F1D("x*cos(x*10)")
hb.write(10,f1)                # we can also add a function using numeric key (integer)
hb.close()                     # write and close the file.
The JDAT output file looks like this:
<dmelt>
  <created-by>JDAT file. Work.ORG: @S.Chekanov</created-by>
  <created-on>Sat Jan 26 20:43:42 CST 2013</created-on>
  <description>writer</description>
  <version>2</version>
  <f1d>
    <id>10</id>
    <title>x*cos(x*10)</title>
    <name>x*cos(x*10)</name>
  </f1d>
  <p1d>
    <id>data</id>
    <title>test</title>
    <size>2</size>
    <dimen>2</dimen>
    <labelx>X</labelx>
    <labely>Y</labely>
    <data>
1.0,2.0
2.0,3.0
    </data>
  </p1d>
</dmelt>
All XML tags of JDAT are self-explanatory. Now we will read both objects using their keys:
from jhplot import *
from jhplot.io import *

hb = HBook("output.jdat")  # read HBook object
print hb.getKeys()         # print all the keys
p1=hb.get("data")
f1=hb.get('10')
print p1.toString()
print f1.getName()
The expected output is:
array(java.lang.String, [u'10', u'data'])
# title=test dimension=2
# X Y
1.0 2.0
2.0 3.0
x*cos(x*10)
You can browse the data as explained in Section Input and Output. Similarly, you can save histograms, i.e. jhplot.H1D, jhplot.H2D and other objects.
Here is a more detailed example. We write data arrays, a histogram and a function to three types of files with the extensions .jdat (HBook), .jser (HFile) and .jpbu (PFile, protocol buffer). Then we open the DataBrowser for each file automatically and plot the objects with mouse clicks.
from jhplot.io import *
from jhplot import *
import time

f=HFile("test.jser","w")
start = time.clock()
for i in range(10):
    p0= P0D('Random='+str(i))
    p0.randomNormal(1000,0.0,1.0)
    f.write(p0)
print 'HFile time (s)=',time.clock()-start  # time spent writing with HFile
f.close()

# browse objects
c1=HPlot("Browser")
c1.visible()
f=HFile("test.jser")
BrowserHFile(c1,f,1)
The data can be organized in "directories" inside the HBook XML files and viewed in the browser as a tree. To make a directory inside the HBook file, simply use "/" in the keys. The following simple example creates two folders, "folder" and "folder2", and puts objects into them. In the data browser, they are shown as a tree.
hb=HBook("output.jdat","w")
hb.write("folder/histo",h1)      # put histogram h1 to the folder "folder"
hb.write("folder/func",f1)       # also a function
hb.write("folder2/1d array",p)   # put data "p" to another folder.
hb.write("folder2/2d array",pn)
hb.close()
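How slash-separated keys turn into a folder tree can be sketched in plain CPython, without DataMelt. This is only an illustration of the idea: build_tree and show are hypothetical helper names, and the sketch makes no claim about how the DataMelt browser is implemented internally.

```python
def build_tree(keys):
    """Group slash-separated keys into nested dicts; each leaf maps to its full key."""
    tree = {}
    for key in keys:
        parts = key.split("/")
        node = tree
        for folder in parts[:-1]:
            # Create the folder on first use, then descend into it.
            node = node.setdefault(folder, {})
        node[parts[-1]] = key
    return tree


def show(node, indent=0):
    """Return the tree as indented text lines, sorted by name at each level."""
    lines = []
    for name, child in sorted(node.items()):
        if isinstance(child, dict):
            lines.append(" " * indent + name + "/")
            lines.extend(show(child, indent + 2))
        else:
            lines.append(" " * indent + name)
    return lines


keys = ["folder/histo", "folder/func", "folder2/1d array", "folder2/2d array"]
tree = build_tree(keys)
print("\n".join(show(tree)))
```

Keys without a "/" simply end up at the top level of the tree.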
Parse HBook files in CPython
HBook files with the extension ".jdat" can be read in CPython (Python implemented in C) using the "xml.dom" Python module. In this case, one can use the output in platform-specific languages (for example, CPython can be interfaced with C++/C libraries). Here is a small example of how to parse jdat files in CPython:
from xml.dom import minidom

xmldoc = minidom.parse('data.jdat')
itemlist = xmldoc.getElementsByTagName('p1d')  # all P1D objects in the file
print(len(itemlist))
alldata = {}
for staff in itemlist:
    ary = []
    # id, title, size, dimen and data are child elements, not XML attributes
    sid = staff.getElementsByTagName("id")[0].firstChild.data
    title = staff.getElementsByTagName("title")[0].firstChild.data
    size = staff.getElementsByTagName("size")[0].firstChild.data
    dimension = staff.getElementsByTagName("dimen")[0].firstChild.data
    values = staff.getElementsByTagName("data")[0].firstChild.data
    print("Read: id=%s title=%s size=%s" % (sid, title, size))
    for line in values.splitlines():
        line = line.strip()
        if not line:
            continue
        # the JDAT listing above separates values in a row with commas; whitespace is accepted too
        floats = [float(x) for x in line.replace(",", " ").split()]
        ary.append(floats)
    alldata[sid] = [title, int(size), int(dimension), ary]
print(alldata["eta1"])  # print all attributes of the object with the key "eta1"
In this example, the ".jdat" file includes several P1D objects. We read all of them and create a map in which the ID of each object is the key. The value stored for each key is a list holding the title, size, dimension and the data rows.
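The same idea can be sketched for Python 3 with the standard xml.etree.ElementTree module. To stay self-contained, the sketch parses a string copied from the JDAT listing earlier on this page (header tags omitted) instead of a file. The helper name read_p1d and the dictionary layout are choices made for this illustration, and the handling of commas assumes that rows look like those in that listing.

```python
import xml.etree.ElementTree as ET

# Sample text copied from the JDAT listing earlier on this page (header tags omitted).
JDAT_TEXT = """<dmelt>
<f1d>
<id>10</id>
<title>x*cos(x*10)</title>
<name>x*cos(x*10)</name>
</f1d>
<p1d>
<id>data</id>
<title>test</title>
<size>2</size>
<dimen>2</dimen>
<labelx>X</labelx>
<labely>Y</labely>
<data>
1.0,2.0
2.0,3.0
</data>
</p1d>
</dmelt>"""


def read_p1d(xml_text):
    """Return {id: {'title', 'size', 'dimen', 'rows'}} for every <p1d> element."""
    root = ET.fromstring(xml_text)
    result = {}
    for node in root.iter("p1d"):
        rows = []
        for line in node.findtext("data", default="").splitlines():
            line = line.strip()
            if not line:
                continue
            # Assumption: values in a row are separated by commas and/or whitespace.
            rows.append([float(v) for v in line.replace(",", " ").split()])
        result[node.findtext("id")] = {
            "title": node.findtext("title"),
            "size": int(node.findtext("size")),
            "dimen": int(node.findtext("dimen")),
            "rows": rows,
        }
    return result


arrays = read_p1d(JDAT_TEXT)
print(arrays["data"])
```

To read a real file, ET.parse("data.jdat").getroot() can replace the ET.fromstring call.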
CFBook class
jhplot.io.CFBook can visualize histograms created by Fortran or C++ code. For this, use a light stand-alone library called CFBook (see https://datamelt.org/?id=cfbook-library). It can be linked to C++ or Fortran code. You can compile it using gcc (for C++ programs) or gfortran (for Fortran programs). This library creates an XML file with 1D and 2D histograms that can be read by Jas4pp. Here is an example of reading a 1D histogram from the fortran.xml file:
from jhplot import *
from jhplot.io import *

hb = CFBook()
hb.read("fortran.xml")
print hb.listAll()
print hb.getKeysH1D()  # list keys
h1=hb.getH1D(1)        # use the key 1 to retrieve H1D
c1=HPlot("Test")
c1.setGTitle("Histograms from a file")
c1.visible(1)
c1.setAutoRange()
c1.draw(h1)
PFile class
This class is an attempt to build a multi-platform (and multi-language) I/O format based on Google's Protocol Buffers package. It can be used to write files in a C++ program (or any other language) and read them in Java, or to write data in Java and read them in C/C++ programs.
The DataMelt Java class which implements Google's protocol buffer format is called jhplot.io.PFile. The class jhplot.io.PFile is designed mainly to store DataMelt containers (arrays) described in Data Structures and data projected to 1, 2 and 3 dimensions (histograms) described in the Histograms section.
The C/C++ package used to write data arrays, histograms and structural data (ntuples) in C++, to be read by the Java jhplot.io.PFile class, is called CBook.
Once data are written in C/C++ with the help of the CBook package, one can read such data with the jhplot.io.PFile class, or even open the data in an interactive browser using the jhplot.io.PFileBrowser class.
Here is a more complicated example: we write a 2D histogram and a 1D array into a file. Then we read the objects back and plot them. One can use the class jhplot.io.HFile as well (just replace PFile by HFile). But, in the case of PFile, you can write histograms and arrays using C++ code and read the data back using Java/Jython!
Unlike the HFile class, only pre-defined data containers can be stored in PFile files (all jhplot arrays, functions, histograms). The PFile class is optimized for write/read speed and small output. In addition, one can read/write such files in C++.
from jhplot.io import *
from jhplot import *
from java.util import Random

c1 = HPlot3D("Canvas",600,400,2,1)
c1.visible(1)
c1.setAutoRange()
h1 = H2D("input2D",5,-2, 2.0, 5, -3,3)
r = Random(33)
for i in range(100):
    h1.fill(r.nextGaussian(), r.nextGaussian())
c1.draw(h1)

f=PFile('tmp.jpbu','w')
f.write(h1)
f.write(P0D("sss"))
f.close()

c1.cd(2,1)
c1.setAutoRange()
f=PFile('tmp.jpbu','r')
print f.size()         # how many objects are stored
print f.listEntries()  # list all entries, the size of objects and the compression level
c1.draw( f.read("input2D") )
f.close()
In this example, instead of reading the objects sequentially one by one, we retrieve an object using its key "input2D" (which is the title of the histogram). Open the created file in a browser for easy plotting:
from jhplot.io import *
from jhplot import *

c1=HPlot("Browser")
c1.visible()
f=PFile("tmp.jpbu")
PFileBrowser(c1,f,1)
You will see a pop-up window with all data entries. Click on an entry and the object will be plotted on the canvas.
Using DataBrowser to open ".jpbu" files
All DataMelt objects stored in compressed Java-serialized files can be viewed using a browser. For example, if a serialized file contains P1D, P0D, H1D, etc. objects, one can view them and plot them using a mouse-click approach.
If you have a file with the extension ".jpbu", you can view it using the DataBrowser. Go to the toolbar, select [Plot]->[HPlot canvas]->[File]->[Open data file].
Working with C++ external programs
Here we discuss how to create files with DataMelt data containers (arrays, histograms, functions) using C++ code, and then how to read them using the 100% Java code of DataMelt.