Commit

scripts: add h5py format script
convert the output xml to h5py (HDF5) format with the naming convention below

1. file name is the version number.
2. each board is defined as one attribute

usage:

pip3 install h5py
python3 ./h5py_demo.py --input <path to xml file> --output <path to output folder>

Signed-off-by: Hake Huang <hake.huang@oss.nxp.com>
hakehuang committed Jan 13, 2021
1 parent 2be7661 commit 8483448
Showing 1 changed file with 63 additions and 0 deletions.
63 changes: 63 additions & 0 deletions scripts/h5py_demo.py
@@ -0,0 +1,63 @@
"""Store an XML test report as an attribute of an HDF5 file named after the version.
Requires Python 3.7.5 or above.
"""
import sys
import os
import argparse
import logging

import xml.etree.ElementTree as ET
import h5py

def file_path(string):
    '''
    check whether string is an existing file
    '''
    if os.path.isfile(string):
        return string
    raise FileNotFoundError(string)

def dir_path(string):
    '''
    check whether string is a directory
    '''
    if os.path.isdir(string):
        return string
    raise NotADirectoryError(string)

def parser_args():
    '''
    parse command-line arguments
    '''
    parser = argparse.ArgumentParser(description='xml to h5py format convert')
    parser.add_argument('--input', type=file_path, required=True,
                        help='Path to the input xml file')
    parser.add_argument('--output', type=dir_path, default=".",
                        help='Path to the output h5py folder')
    return parser.parse_args()

def process_xml(opts):
    '''
    store the xml report as an attribute of <version>.h5
    '''
    # first child of the root (a test suite); its name becomes the attribute key
    summary = ET.parse(opts.input).getroot()[0]
    name_in_report = summary.attrib['name']
    # the first child of the suite is expected to hold the properties
    properties = summary[0]
    version = "unknown"
    for _p in properties:
        if _p.attrib.get('name') == "version":
            version = _p.attrib['value']
    logging.info(version)
    out_put = os.path.join(opts.output, version + ".h5")
    logging.info(out_put)
    # context managers close both files even if the read or the write fails
    with h5py.File(out_put, 'a') as _h5f, open(opts.input, 'rb') as _xmlfh:
        _h5f.attrs[name_in_report] = _xmlfh.read()


if __name__ == '__main__':
    sys.stdout.flush()
    logging.basicConfig(level=logging.DEBUG)
    m_opts = parser_args()
    process_xml(m_opts)

2 comments on commit 8483448


@PerMac PerMac commented on 8483448 Jan 18, 2021


I don't understand the reasoning behind this script. Right now you have just wrapped the XML in HDF5, which doesn't make the output "real" HDF5. What I meant during the testing WG meeting was:
If we decide to use a "database" approach for results management (i.e. creating a database of the results and querying it with custom scripts for different statistics and analyses), we might use an HDF5 structure for the database part.
We would then need to translate the XML reports into HDF5 data structures. That means writing a simple translator that loads the structure from the XML reports and converts it to an HDF5 database-type structure. E.g. an h5 file called results.h5 will contain entries named by Zephyr version. Each such entry will be populated with entries named by the corresponding platforms. Each "platform" entry will contain a table of test results, where fields like "module", "test suite", "test case", "duration", "verdict" etc. will be populated from the XML. A script for analyzing or plotting the data could then load the file and collect the required data (e.g. results of the "hello_world" test case, on selected platforms, in the last 10 Zephyr versions).
The benefit of using HDF5 instead of the XML structure should be easier analysis of the results by selected conditions and better performance when retrieving queried results.
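
A minimal sketch of the translator described in this comment, not the committed script. It assumes a JUnit-style report (`testsuite` and `testcase` elements with `classname`, `name` and `time` attributes, and `failure`/`error`/`skipped` child elements) and that the `testsuite` name is the platform, as `h5py_demo.py` already assumes. The function names, the `results` dataset name and the field names are hypothetical, and `h5py.string_dtype` needs h5py 2.10 or later.

```python
import xml.etree.ElementTree as ET


def verdict_of(testcase):
    """Derive a verdict from the children of a <testcase> (assumed JUnit-style)."""
    for tag in ("failure", "error", "skipped"):
        if testcase.find(tag) is not None:
            return tag
    return "passed"


def rows_from_report(xml_text):
    """Yield (platform, suite, case, duration, verdict) for every test case."""
    root = ET.fromstring(xml_text)
    for suite in root.iter("testsuite"):
        # Assumption: the <testsuite> name is the platform, as in h5py_demo.py.
        platform = suite.get("name", "unknown")
        for case in suite.iter("testcase"):
            yield (platform,
                   case.get("classname", ""),
                   case.get("name", ""),
                   float(case.get("time", "0")),
                   verdict_of(case))


def write_results(h5_path, version, rows):
    """Store rows under /<version>/<platform>/results (hypothetical layout)."""
    # Third-party imports are kept local so the parsing half runs without them.
    import h5py
    import numpy as np

    text = h5py.string_dtype(encoding="utf-8")  # variable-length UTF-8 strings
    row_type = np.dtype([("suite", text), ("case", text),
                         ("duration", "f8"), ("verdict", text)])

    # Group the rows by platform; each platform gets its own table.
    by_platform = {}
    for platform, *fields in rows:
        by_platform.setdefault(platform, []).append(tuple(fields))

    with h5py.File(h5_path, "a") as h5f:
        for platform, records in by_platform.items():
            group = h5f.require_group(f"{version}/{platform}")
            if "results" in group:
                del group["results"]  # replace results from an earlier run
            group.create_dataset("results",
                                 data=np.array(records, dtype=row_type))
```

With this layout, one call to `write_results("results.h5", version, rows_from_report(xml_text))` per report would add a group for that version, with one `results` table per platform. Variable-length strings are used because the reports hold mostly names and verdicts rather than numbers.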

@hakehuang

@PerMac, the problem is that converting XML to HDF5 is not a simple script: there are no ready-made rules, and HDF5 has only three kinds of objects:
- Datasets: very similar to NumPy arrays.
- Groups: the container mechanism by which HDF5 files are organized.
- Attributes: a critical part of what makes HDF5 a “self-describing” format.

So we would need to translate the XML tags to HDF5 attributes. But the major benefit of HDF5 is storing large amounts of numeric data, and our reports do not contain many numbers, only case names, pass/fail results, and logs.

I suppose we need generally accepted rules for converting XML to HDF5, which needs more discussion and agreement from the community.
