# 2.1.2. Python Examples using h5py¶

One way to gain a quick familiarity with NeXus is to start working with some data. For at least the first few examples in this section, we have a simple two-column set of 1-D data, collected as part of a series of alignment scans by the APS USAXS instrument during the time it was stationed at beam line 32ID. We will show how to write this data using the Python language and the h5py package 1 (using h5py calls directly rather than using the NeXus NAPI). The actual data to be written was extracted (elsewhere) from a spec 2 data file and read as a text block from a file by the Python source code. Our examples will start with the simplest case and add only mild complexity with each new case since these examples are meant for those who are unfamiliar with NeXus.

1
2

The data shown plotted in the next figure will be written to the NeXus HDF5 file using only two NeXus base classes, NXentry and NXdata, in the first example and then minor variations on this structure in the next two examples. The data model is identical to the one in the Introduction chapter except that the names will be different, as shown below:

data structure, (from Introduction)

our h5py example

 1 2 3 4 /entry:NXentry /mr_scan:NXdata /mr : float64[31] /I00 : int32[31] 

plot of our mr_scan

two-column data for our mr_scan

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 17.92608 1037 17.92591 1318 17.92575 1704 17.92558 2857 17.92541 4516 17.92525 9998 17.92508 23819 17.92491 31662 17.92475 40458 17.92458 49087 17.92441 56514 17.92425 63499 17.92408 66802 17.92391 66863 17.92375 66599 17.92358 66206 17.92341 65747 17.92325 65250 17.92308 64129 17.92291 63044 17.92275 60796 17.92258 56795 17.92241 51550 17.92225 43710 17.92208 29315 17.92191 19782 17.92175 12992 17.92158 6622 17.92141 4198 17.92125 2248 17.92108 1321 

## 2.1.2.1. Writing the simplest data using h5py¶

These two examples show how to write the simplest data (above). One example writes the data directly to the NXdata group while the other example writes the data to NXinstrument/NXdetector/data and then creates a soft link to that data in NXdata.

## 2.1.2.2. Complete h5py example writing and reading a NeXus data file¶

### Writing the HDF5 file using h5py¶

In the main code section of BasicWriter.py, a current time stamp is written in the format of ISO 8601 (yyyy-mm-ddTHH:MM:SS). For simplicity of this code example, we use a text string for the time, rather than computing it directly from Python support library calls. It is easier this way to see the exact type of string formatting for the time. When using the Python datetime package, one way to write the time stamp is:

 1 timestamp = "T".join( str( datetime.datetime.now() ).split() ) 

The data (mr is similar to “two_theta” and I00 is similar to “counts”) is collated into two Python lists. We use the numpy package to read the file and parse the two-column format.

The new HDF5 file is opened (and created if not already existing) for writing, setting common NeXus attributes in the same command from our support library. Proper HDF5+NeXus groups are created for /entry:NXentry/mr_scan:NXdata. Since we are not using the NAPI, our support library must create and set the NX_class attribute on each group.

Note

We want to create the desired structure of /entry:NXentry/mr_scan:NXdata/.

1. First, our support library calls f = h5py.File() to create the file and root level NeXus structure.

2. Then, it calls nxentry = f.create_group("entry") to create the NXentry group called entry at the root level.

3. Then, it calls nxdata = nxentry.create_group("mr_scan") to create the NXentry group called entry as a child of the NXentry group.

Next, we create a dataset called title to hold a title string that can appear on the default plot.

Next, we create datasets for mr and I00 using our support library. The data type of each, as represented in numpy, will be recognized by h5py and automatically converted to the proper HDF5 type in the file. A Python dictionary of attributes is given, specifying the engineering units and other values needed by NeXus to provide a default plot of this data. By setting signal="I00" as an attribute on the group, NeXus recognizes I00 as the default y axis for the plot. The axes="mr" attribute on the NXdata group connects the dataset to be used as the x axis.

Finally, we must remember to call f.close() or we might corrupt the file when the program quits.

BasicWriter.py: Write a NeXus HDF5 file using Python with h5py

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 #!/usr/bin/env python '''Writes a NeXus HDF5 file using h5py and numpy''' import h5py # HDF5 support import numpy import six print("Write a NeXus HDF5 file") fileName = u"prj_test.nexus.hdf5" timestamp = u"2010-10-18T17:17:04-0500" # load data from two column format data = numpy.loadtxt(u"input.dat").T mr_arr = data[0] i00_arr = numpy.asarray(data[1],'int32') # create the HDF5 NeXus file f = h5py.File(fileName, "w") # point to the default data to be plotted f.attrs[u'default'] = u'entry' # give the HDF5 root some more attributes f.attrs[u'file_name'] = fileName f.attrs[u'file_time'] = timestamp f.attrs[u'instrument'] = u'APS USAXS at 32ID-B' f.attrs[u'creator'] = u'BasicWriter.py' f.attrs[u'NeXus_version'] = u'4.3.0' f.attrs[u'HDF5_Version'] = six.u(h5py.version.hdf5_version) f.attrs[u'h5py_version'] = six.u(h5py.version.version) # create the NXentry group nxentry = f.create_group(u'entry') nxentry.attrs[u'NX_class'] = u'NXentry' nxentry.attrs[u'default'] = u'mr_scan' nxentry.create_dataset(u'title', data=u'1-D scan of I00 v. mr') # create the NXentry group nxdata = nxentry.create_group(u'mr_scan') nxdata.attrs[u'NX_class'] = u'NXdata' nxdata.attrs[u'signal'] = u'I00' # Y axis of default plot nxdata.attrs[u'axes'] = u'mr' # X axis of default plot nxdata.attrs[u'mr_indices'] = [0,] # use "mr" as the first dimension of I00 # X axis data ds = nxdata.create_dataset(u'mr', data=mr_arr) ds.attrs[u'units'] = u'degrees' ds.attrs[u'long_name'] = u'USAXS mr (degrees)' # suggested X axis plot label # Y axis data ds = nxdata.create_dataset(u'I00', data=i00_arr) ds.attrs[u'units'] = u'counts' ds.attrs[u'long_name'] = u'USAXS I00 (counts)' # suggested Y axis plot label f.close() # be CERTAIN to close the file print("wrote file:", fileName) 

### Reading the HDF5 file using h5py¶

The file reader, BasicReader.py, is very simple since the bulk of the work is done by h5py. Our code opens the HDF5 we wrote above, prints the HDF5 attributes from the file, reads the two datasets, and then prints them out as columns. As simple as that. Of course, real code might add some error-handling and extracting other useful stuff from the file.

Note

See that we identified each of the two datasets using HDF5 absolute path references (just using the group and dataset names). Also, while coding this example, we were reminded that HDF5 is sensitive to upper or lowercase. That is, I00 is not the same is i00.

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 #!/usr/bin/env python '''Reads NeXus HDF5 files using h5py and prints the contents''' import h5py # HDF5 support fileName = "prj_test.nexus.hdf5" f = h5py.File(fileName, "r") for item in f.attrs.keys(): print(item + ":", f.attrs[item]) mr = f['/entry/mr_scan/mr'] i00 = f['/entry/mr_scan/I00'] print("%s\t%s\t%s" % ("#", "mr", "I00")) for i in range(len(mr)): print("%d\t%g\t%d" % (i, mr[i], i00[i])) f.close() 

Output from BasicReader.py is shown next.

Output from BasicReader.py

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 file_name: prj_test.nexus.hdf5 file_time: 2010-10-18T17:17:04-0500 creator: BasicWriter.py HDF5_Version: 1.8.5 NeXus_version: 4.3.0 h5py_version: 1.2.1 instrument: APS USAXS at 32ID-B # mr I00 0 17.9261 1037 1 17.9259 1318 2 17.9258 1704 3 17.9256 2857 4 17.9254 4516 5 17.9252 9998 6 17.9251 23819 7 17.9249 31662 8 17.9247 40458 9 17.9246 49087 10 17.9244 56514 11 17.9243 63499 12 17.9241 66802 13 17.9239 66863 14 17.9237 66599 15 17.9236 66206 16 17.9234 65747 17 17.9232 65250 18 17.9231 64129 19 17.9229 63044 20 17.9228 60796 21 17.9226 56795 22 17.9224 51550 23 17.9222 43710 24 17.9221 29315 25 17.9219 19782 26 17.9217 12992 27 17.9216 6622 28 17.9214 4198 29 17.9213 2248 30 17.9211 1321 

#### Finding the default plottable data¶

Let’s make a new reader that follows the chain of attributes (@default, @signal, and @axes) to find the default plottable data. We’ll use the same data file as the previous example. Our demo here assumes one-dimensional data. (For higher dimensionality data, we’ll need more complexity when handling the @axes attribute and we’ll to check the field sizes. See section Find the plottable data, subsection Version 3, for the details.)

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26  import h5py with h5py.File("prj_test.nexus.hdf5", "r") as nx: # find the default NXentry group nx_entry = nx[nx.attrs["default"]] # find the default NXdata group nx_data = nx_entry[nx_entry.attrs["default"]] # find the signal field signal = nx_data[nx_data.attrs["signal"]] # find the axes field(s) attr_axes = nx_data.attrs["axes"] if isinstance(attr_axes, (set, tuple, list)): # but check that attr_axes only describes 1-D data if len(attr_axes) == 1: attr_axes = attr_axes[0] else: raise ValueError(f"expected 1-D data but @axes={attr_axes}") axes = nx_data[attr_axes] print(f"file: {nx.filename}") print(f"signal: {signal.name}") print(f"axes: {axes.name}") print(f"{axes.name} {signal.name}") for x, y in zip(axes, signal): print(x, y) 

Output from reader_attributes_trail.py is shown next.

Output from reader_attributes_trail.py

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 file: prj_test.nexus.hdf5 signal: /entry/mr_scan/I00 axes: /entry/mr_scan/mr /entry/mr_scan/mr /entry/mr_scan/I00 17.92608 1037 17.92591 1318 17.92575 1704 17.92558 2857 17.92541 4516 17.92525 9998 17.92508 23819 17.92491 31662 17.92475 40458 17.92458 49087 17.92441 56514 17.92425 63499 17.92408 66802 17.92391 66863 17.92375 66599 17.92358 66206 17.92341 65747 17.92325 65250 17.92308 64129 17.92291 63044 17.92275 60796 17.92258 56795 17.92241 51550 17.92225 43710 17.92208 29315 17.92191 19782 17.92175 12992 17.92158 6622 17.92141 4198 17.92125 2248 17.92108 1321 

### Plotting the HDF5 file¶

Now that we are certain our file conforms to the NeXus standard, let’s plot it using the NeXpy 3 client tool. To help label the plot, we added the long_name attributes to each of our datasets. We also added metadata to the root level of our HDF5 file similar to that written by the NAPI. It seemed to be a useful addition. Compare this with plot of our mr_scan and note that the horizontal axis of this plot is mirrored from that above. This is because the data is stored in the file in descending mr order and NeXpy has plotted it that way (in order of appearance) by default.

3

plot of our mr_scan using NeXpy