9. Appendices

9.1. HDF5 Format Specification

This section describes the format of the Hierarchical Data Format (HDF5) binary files exported by SMDExplorer. HDF5 files can be read (and written) by many commercial and non-commercial data analysis packages and provide a convenient container format for storing different kinds of numerical data.

This section is of interest to:

  • users who would like to analyze .hdf5 files in other software. While the files written by SMDExplorer are largely self-documenting, this section can serve as a reference for the expected output.

  • advanced users who would like to automate reading .hdf5 files written by SMDExplorer in other software.

  • advanced users who would like to export .hdf5 files from their analysis software of choice for processing in SMDExplorer.

9.1.1. HDF5 Spectral Mappings

        
classDiagram
direction TB


    class file["HDF5 file"]{
    }

    class Channel["1_{ChannelType}"]{
        array~float~: data
        array~float~: shift 
    }


    class data{
        HDF5 attributes:
        dimensionLabel~String~
        dimensionScale~float~
    }

    class shift{
        HDF5 attributes:
        laserWavelength~float~
        dimensionLabel~String~
    }

    file --> Channel : contains
    Channel ..|> Type
    Channel --> data 
    Channel --> shift 
     
    %% shift <|--|> intensity: HDF5 scale


class Type{
    ChannelType.Raman
    ChannelType.Brillouin
    ChannelType.FLIM
    ChannelType.Analogue
    ChannelType.Unknown
}
    

Fig. 9.1 Diagram of the HDF5 mapping container format

The .hdf5 container format for spectral mappings contains the following items:

  • for each scan region of interest / each different selected channel, a group with a name in the form of {idx}_{ChannelType}, where idx is the index of the scan area and ChannelType the type of data recorded, e.g. Raman. Each group contains the following datasets:

    • data: the spectral data as a multi-dimensional array of floats. The last dimension is the spectral dimension and holds the spectra. Every other dimension has a dimension scale and a dimension label attached.

    • shift: the x-data of the spectral axis. For Raman or Brillouin data, this contains the shift. The shift also contains the required HDF5 attribute laserWavelength as a floating-point number.
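As a minimal sketch of how such a mapping file could be read outside SMDExplorer, the following Python snippet uses the third-party h5py and NumPy packages (an assumption; any HDF5 library would work). The function name read_mappings and the layout of the returned dictionary are illustrative and not part of SMDExplorer. The group, dataset, and attribute names are the ones described above.

```python
import h5py
import numpy as np


def read_mappings(path):
    """Return {group_name: dict} for every '{idx}_{ChannelType}' group.

    Illustrative helper, not part of SMDExplorer.
    """
    mappings = {}
    with h5py.File(path, "r") as f:
        for name, group in f.items():
            # Skip anything that is not a group holding both datasets.
            if not isinstance(group, h5py.Group):
                continue
            if "data" not in group or "shift" not in group:
                continue

            # Group names have the form '{idx}_{ChannelType}', e.g. '1_Raman'.
            idx, _, channel_type = name.partition("_")
            if not idx.isdigit():
                continue

            data = group["data"]
            shift = group["shift"]
            mappings[name] = {
                "index": int(idx),
                "channel_type": channel_type,
                "data": data[...],    # last axis is the spectral axis
                "shift": shift[...],  # x-values of the spectral axis
                "laser_wavelength": float(shift.attrs["laserWavelength"]),
                # One label per dimension of the data array.
                "dimension_labels": [dim.label for dim in data.dims],
            }
    return mappings
```

Calling read_mappings on an exported file would return one entry per scan area and channel. For an XY mapping, data would then have three dimensions, with data[i, j, :] being the spectrum at point (i, j).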

9.1.2. HDF5 Multi-Spectrum Files

        classDiagram
direction RL


    class file["HDF5 file"]{
    }

    class shift["Shift (basename + _sh)"]{
        HDF5 attributes:
        - float: laserWavelength
        - string: Type
        dimensionlabel~String~
    }
    class intensity["Intensity (basename + _ct)"]{
        HDF5 attributes:
        -string: date
        dimensionlabel~String~
    }

    file -->  shift: contains
    file --> intensity : contains

    shift <|--|> intensity: HDF5 scale


class Type{
    ChannelType.Raman
    ChannelType.Brillouin
    ChannelType.FLIM
    ChannelType.Analogue
    ChannelType.Unknown
}
    

Fig. 9.2 Diagram of the HDF5 multispectrum container format

The root group (the HDF5 file) contains the spectra as pairs of datasets named basename + _ct for the intensity and basename + _sh for the spectral axis. Here, basename is the user-defined name in SMDExplorer, which is derived from the .s1d filename or the database entry in .mdt files. The two datasets are connected using an HDF5 dimension scale: the spectral axis is the dimension scale for the first dimension of the intensity dataset. Spectral units are attached to the data using the HDF5 dimension label attribute.

In addition, the spectral data (_sh) contains the following required HDF5 attributes:

  • laserWavelength: a floating-point number specifying the laser wavelength (in nm) used in the experiment. This is required for converting to and from the different spectral representations.

  • type: a string specifying the type of data the spectrum contains.
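To illustrate why the laser wavelength is needed, the sketch below converts between a Raman shift and the absolute wavelength of the scattered light. It assumes that the shift is given in cm^-1 with Stokes shifts counted as positive, which this section does not state explicitly; the function names are illustrative.

```python
def shift_to_wavelength(shift_cm1, laser_wavelength_nm):
    """Absolute wavelength (nm) of light at a given Raman shift (cm^-1)."""
    # 1 cm^-1 corresponds to 1e-7 nm^-1, hence the factor 1e7.
    return 1.0 / (1.0 / laser_wavelength_nm - shift_cm1 / 1e7)


def wavelength_to_shift(wavelength_nm, laser_wavelength_nm):
    """Raman shift (cm^-1) of light at a given absolute wavelength (nm)."""
    return 1e7 * (1.0 / laser_wavelength_nm - 1.0 / wavelength_nm)
```

A shift of zero maps back to the laser wavelength itself, and the two functions are inverses of each other for a fixed laser wavelength.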

The intensity data (_ct) contains the following required attributes:

  • date: an ISO-formatted string containing the date when the spectrum was recorded.
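The layout described above can be sketched in Python with the third-party h5py and NumPy packages (an assumption; SMDExplorer itself is not involved). The helper names write_spectrum and read_spectrum, the name passed to make_scale, and the choice to put the unit label on the first dimension of the intensity dataset are illustrative assumptions. Whether SMDExplorer accepts a file written this way has not been verified here.

```python
import h5py
import numpy as np


def write_spectrum(path, basename, shift, counts, laser_wavelength_nm,
                   data_type, date_iso, unit_label):
    """Append one spectrum as '<basename>_sh' / '<basename>_ct' datasets."""
    with h5py.File(path, "a") as f:
        sh = f.create_dataset(basename + "_sh",
                              data=np.asarray(shift, dtype=float))
        ct = f.create_dataset(basename + "_ct",
                              data=np.asarray(counts, dtype=float))

        # Required attributes on the spectral axis.
        sh.attrs["laserWavelength"] = float(laser_wavelength_nm)
        sh.attrs["type"] = data_type

        # Required attribute on the intensity data.
        ct.attrs["date"] = date_iso

        # Make the spectral axis the dimension scale of the first
        # intensity dimension and attach the unit as dimension label.
        sh.make_scale(basename + "_sh")
        ct.dims[0].attach_scale(sh)
        ct.dims[0].label = unit_label


def read_spectrum(path, basename):
    """Return (shift, counts, laser_wavelength_nm) for one spectrum."""
    with h5py.File(path, "r") as f:
        ct = f[basename + "_ct"]
        sh = ct.dims[0][0]  # first scale attached to the first dimension
        return sh[...], ct[...], float(sh.attrs["laserWavelength"])
```

read_spectrum follows the dimension scale from the intensity dataset instead of building the _sh name itself, so it relies only on the link between the two datasets.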

9.1.3. HDF5 Peak Fitting Results

        classDiagram
direction RL
note for result "data: [N x M x ... x (baselineParams + numPeaks * paramPerPeak)]"
note for rowLabels "[y0, m0, Peak_1_A, Peak_1_x0, Peak_1_width, ...]"

    class file["HDF5 file"]{
    }

    class roi["ROI"]{
        Group
    }

    class rowLabels{
        list~string~
    }

    class scale{
        array~float~
    }

    class result["result"]{
        array~float~: data
        HDF5 attributes:
        dimensionlabel~String~
        dimensionScale: scale
    }

    class error["fit_uncertainties"]{
        array~float~ : data
    }
     class lineshape{
        string~Lineshape~
     }

    

    file -->  roi: contains
    roi --> result: contains
    roi --> lineshape: contains
    roi --> error: contains
    roi --> rowLabels : contains
    result <--> scale


class Lineshape{
    string lor
    string gauss
    string psvoigt
}
    

Fig. 9.3 Diagram of the HDF5 peak fitting result container format

The peak fitting result format is constructed as follows: The root HDF5 file node contains HDF5 groups named ROI_1, ROI_2, etc. for each scan area contained in the original data. Each of the groups contains the following HDF5 datasets:

  • result: the dataset containing the peak fitting result. This is a multi-dimensional array that contains the coefficients of the fit function. The last dimension has the length \(N_\text{peaks} \cdot n_\text{paramPerPeak} + n_\text{paramBaseline}\). Here, \(n_\text{paramPerPeak}\) is the number of fit parameters per peak (3 for Gaussian and Lorentzian peaks, 4 for pseudo-Voigt peaks), \(N_\text{peaks}\) is the number of peaks, and \(n_\text{paramBaseline}\) is the number of baseline parameters (2 for a linear baseline \(y = mx + y_0\)). The other dimensions represent the size of the dataset. For an XY scan, for example, the first two dimensions are the number of points in the X and Y directions.
    The result dataset contains the following attributes:

    • dimension scales for each spatial / temporal dimension.

    • a dimension label for each spatial / temporal dimension.

  • fit_uncertainties: the uncertainties in the form of ± one standard deviation. The shape is the same as that of result. Each entry contains the uncertainty of the corresponding parameter in result.

  • lineshape: a string describing the line shape. Allowed values are

    • lor for Lorentzian

    • gauss for Gaussian

    • psvoigt for pseudo-Voigt

  • rowLabels: an array of strings that provides a textual description of each fit parameter, in the form of [y0, m, Peak_1_x0, …]. The number of entries is equal to the length of the last dimension of result.
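The relation between the length of the last dimension, the line shape, and the number of peaks can be sketched in plain Python as follows. The helper names are illustrative. The sketch assumes a linear baseline and that the baseline parameters come first, as the ordering of rowLabels suggests.

```python
# Number of fit parameters per peak for each allowed lineshape string.
PARAMS_PER_PEAK = {"lor": 3, "gauss": 3, "psvoigt": 4}
N_BASELINE_PARAMS = 2  # linear baseline: offset and slope


def count_peaks(last_dim_length, lineshape):
    """Number of peaks implied by the length of the last result dimension."""
    per_peak = PARAMS_PER_PEAK[lineshape]
    n_peak_params = last_dim_length - N_BASELINE_PARAMS
    if n_peak_params < 0 or n_peak_params % per_peak != 0:
        raise ValueError(
            f"length {last_dim_length} does not match lineshape {lineshape!r}"
        )
    return n_peak_params // per_peak


def split_parameters(params, lineshape):
    """Split one parameter vector into (baseline, [peak_1, peak_2, ...])."""
    per_peak = PARAMS_PER_PEAK[lineshape]
    n_peaks = count_peaks(len(params), lineshape)
    baseline = list(params[:N_BASELINE_PARAMS])
    peaks = []
    for i in range(n_peaks):
        start = N_BASELINE_PARAMS + i * per_peak
        peaks.append(list(params[start:start + per_peak]))
    return baseline, peaks
```

For an XY scan, split_parameters would be applied to result[i, j, :] for each point (i, j), and in the same way to fit_uncertainties to obtain the matching uncertainties.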

9.1.4. Igor Pro HDF5 export

To export data from the Igor Pro application, which controls the Phalanx instrument, to SMDExplorer, you can use the following code snippet.
After you paste the snippet into the Procedure window, it adds an ExportAsHDF5 item to the Igor Pro Macros menu. Selecting this item displays a prompt for the 4D data and 1D shift waves, followed by a prompt for the output file name.

Listing 9.1 A code snippet to export data obtained from the Phalanx instrument from Igor Pro to .hdf5
macro ExportAsHDF5()
    exportHDF5()
end

function exportHDF5()
    string dataWName
    string shiftWName

    // ask for the 4D data wave and the 1D shift wave
    Prompt dataWName,"Data",popup,wavelist("*",";","DIMS:4")
    Prompt shiftWName,"Shift",popup,wavelist("*",";","DIMS:1")

    DoPrompt "Select Data",dataWName,shiftWName

    if (V_Flag)
        return 0    // user canceled
    endif

    wave shift=$shiftWName
    wave data=$dataWName

    // look up the excitation wavelength in the wave note of the shift wave
    string shiftNote = note(shift)
    string wlString = stringbykey("Excitation_wavelength", shiftNote, ": ", "\r")
    variable laserWL = str2num(wlString)

    if(numtype(laserWL) != 0) // invalid laser WL

        Prompt laserWL, "Laser Wavelength"

        DoPrompt "Please enter the excitation wavelength", laserWL
        if (V_Flag)
            return 0    // user canceled
        endif
    endif

    // show a file dialog and create (or overwrite) the output file
    variable fileID
    HDF5CreateFile /I /O fileID as ""

    if (V_Flag)
        return 0    // user canceled
    endif

    // save both waves together with their Igor Pro wave attributes
    HDF5SaveData /IGOR = -1 /O data, fileID
    HDF5SaveData /IGOR = -1 /O shift, fileID, "RamanShift"

    HDF5CloseFile fileID

end
        classDiagram
direction RL
    note for RamanShift "Excitation_wavelength: 532.1"


    class file["HDF5 file"]{
    }



    class data["{datasetName}"]{
        HDF5 attributes:
        IGORWaveScaling ~array~float~~
    }

    class RamanShift{
        HDF5 attributes:
        IGORWaveNote~String~
    }

    file --> data
    file --> RamanShift 
    

Fig. 9.4 Diagram of the HDF5 Igor Pro Export Format

The Igor Pro .hdf5 format preserves Igor Pro scaling information and metadata. In addition to the scaling information, the only other required parameter is Excitation_wavelength, which is stored in the Igor Pro wave note in the Phalanx file format. The export code snippet displays a prompt if this information is not present.
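As a minimal sketch of how the excitation wavelength could be recovered from such a wave note outside Igor Pro, the following plain-Python helper parses a note string of the form shown in the diagram (Excitation_wavelength: 532.1). The function name is illustrative. It assumes that note entries are separated by carriage returns, as in the export snippet, and that the note text has already been read from the IGORWaveNote attribute with an HDF5 library of your choice.

```python
def parse_excitation_wavelength(wave_note):
    """Return the excitation wavelength from a wave note, or None."""
    # Entries are separated by carriage returns; also accept newlines.
    for entry in wave_note.replace("\n", "\r").split("\r"):
        key, sep, value = entry.partition(":")
        if sep and key.strip() == "Excitation_wavelength":
            try:
                return float(value)
            except ValueError:
                return None  # key present, but value is not a number
    return None  # key not present in the note
```

A return value of None corresponds to the case in which the export snippet would prompt for the wavelength.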

Note

The export code snippet has been tested with Igor Pro 9.