Multiple Divide by Zero in HDF5 (1.10.2, 1.10.3)

Loginsoft-2018-15672

September 24, 2018

CVE Number

CVE-2018-15672, CVE-2018-17237, CVE-2018-17233

CWE

CWE-369: Divide By Zero

Product Details

HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of datatypes and is designed for flexible and efficient I/O and for high volume and complex data. HDF5 is portable and is extensible, allowing applications to evolve in their use of HDF5. The HDF5 Technology suite includes tools and applications for managing, manipulating, viewing, and analyzing data in the HDF5 format.

URL: https://www.hdfgroup.org/downloads/hdf5/

Vulnerable Versions

HDF5 1.10.2 and 1.10.3

Vulnerability Details

A division by zero was discovered in H5D__chunk_init in H5Dchunk.c in the HDF HDF5 1.10.2 library. It could allow a remote denial of service attack.

SYNOPSIS

Datasets: Very similar to `NumPy` arrays, they are homogeneous collections of data elements, with an immutable datatype and (hyper)rectangular shape.

Attributes:
– shape
– size
– dtype

Chunk/chunking – Chunking refers to a storage layout where a dataset is partitioned into fixed-size multi-dimensional chunks.
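To make the partitioning concrete, here is a minimal sketch for a single dimension. The helper `locate_in_chunk` is hypothetical and not part of HDF5; it only illustrates that finding the chunk an element belongs to is a division by the chunk length, which is why a zero chunk length has to be rejected first.

```c
#include <stddef.h>

/* Hypothetical helper (not an HDF5 function): for one dimension, find which
 * chunk an element index falls in and its offset inside that chunk.
 * Returns 0 on success, -1 if chunk_len is zero (the division is undefined). */
int locate_in_chunk(size_t index, size_t chunk_len,
                    size_t *chunk_no, size_t *offset)
{
    if (chunk_len == 0)
        return -1;
    *chunk_no = index / chunk_len;  /* which chunk along this dimension */
    *offset   = index % chunk_len;  /* position inside that chunk */
    return 0;
}
```

For a multi-dimensional dataset the same step would be repeated once per dimension.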

Raw data chunk cache – When an application calls write many times for data that lies within the same chunk, performance would be poor without caching, so a raw data chunk cache layer was added to improve it. By default, the chunk cache will store 521 chunks or 1MB of data.

Ref: http://docs.h5py.org/en/latest/high/dataset.html

 

H5D_t * 
H5D__open_name(const H5G_loc_t *loc, const char *name, hid_t dapl_id, 
hid_t dxpl_id) 
{ 
. 
. 
. 
/* Open the dataset */ 
if(NULL == (dset = H5D_open(&dset_loc, dapl_id, dxpl_id))) [1] 
HGOTO_ERROR(H5E_DATASET, H5E_CANTINIT, NULL, "can't open dataset") 
static herr_t 
H5D__open_oid(H5D_t *dataset, hid_t dapl_id, hid_t dxpl_id) 
{ 
. 
. 
. 
/* Get the layout/pline/efl message information */ 
if(H5D__layout_oh_read(dataset, dxpl_id, dapl_id, plist) < 0) [2] 
HGOTO_ERROR(H5E_DATASET, H5E_CANTGET, FAIL, "can't get layout/pline/efl info") 
/* Initial scaled dimension sizes */ 
if(dset->shared->layout.u.chunk.dim[u] == 0) 
HGOTO_ERROR(H5E_DATASET, H5E_BADVALUE, FAIL, "chunk size must be > 0, dim = %u ", u) 
rdcc->scaled_dims[u] = dset->shared->curr_dims[u] / dset->shared->layout.u.chunk.dim[u];  [3] 

 

The `H5D__open_name()` function opens an existing dataset by name: it looks up the dataset object and checks that the object found is valid. If it is, it opens the dataset by calling `H5D_open()` [1], which internally calls `H5D__open_oid()`. That function opens the dataset object, loads the datatype and dataspace information, caches the dataspace info, and gets the layout/pline/efl message information, among other operations.

To get the layout/pline/efl message information, `H5D__layout_oh_read()` [2] is called. It invokes `H5D__chunk_init()`, the culprit, which initializes the raw data chunk cache for a dataset and is usually called when the dataset is initialized. While computing the scaled dimension sizes, it divides the dataset's current dimension `dset->shared->curr_dims[u]` by the dataset layout's chunk dimension `dset->shared->layout.u.chunk.dim[u]` [3]. If the chunk dimension is zero, this is a division by zero, which raises a floating-point exception.
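The computation described above can be sketched outside the library as follows. This is an illustration under assumptions, not the HDF5 source: `compute_scaled_dims` is a hypothetical stand-in for the per-dimension loop, and the integer widths chosen for the arrays are assumed. It rejects a zero chunk dimension before dividing and reports which dimension was bad.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-dimension loop in H5D__chunk_init():
 *   scaled[u] = curr_dims[u] / chunk_dims[u]
 * Returns 0 on success, or -1 with *bad_dim set if a chunk dimension is 0.
 * The array element types are assumptions made for this sketch. */
int compute_scaled_dims(const uint64_t *curr_dims, const uint32_t *chunk_dims,
                        unsigned ndims, uint64_t *scaled, unsigned *bad_dim)
{
    for (unsigned u = 0; u < ndims; u++) {
        if (chunk_dims[u] == 0) {   /* guard against a zero divisor */
            *bad_dim = u;
            return -1;
        }
        scaled[u] = curr_dims[u] / chunk_dims[u];
    }
    return 0;
}
```

With a current dimension of `0x101` and a chunk dimension of `0` in the fourth dimension, as in the debugger session for this issue, the sketch returns -1 with `*bad_dim == 3` instead of dividing.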

 

Fix
As part of the fix, a check was added to ensure that the dataset layout chunk dimension is non-zero before the division.

https://www.hdfgroup.org/2018/07/hdfview-3-0-pre-release-newsletter-163/

` (dset->shared->layout.u.chunk.dim[u] == 0)`

```
+ if(dset->shared->layout.u.chunk.dim[u] == 0)
+ HGOTO_ERROR(H5E_DATASET, H5E_BADVALUE, FAIL,
+ "chunk size must be > 0, dim = %u ", u)
rdcc->scaled_dims[u] = dset->shared->curr_dims[u] / dset->shared->layout.u.chunk.dim[u];
```

Analysis
break H5Dchunk.c:1022 if u == 3

Breakpoint 2, H5D__chunk_init (f=0x60700000de60, dxpl_id=0xa00000000000008, dset=0x606000000c20, dapl_id=0xa00000000000007) at H5Dchunk.c:1022
1022 - rdcc->scaled_dims[u] = dset->shared->curr_dims[u] / dset->shared->layout.u.chunk.dim[u];
1: dset->shared->layout.u.chunk.dim[u] = 0x0
2: dset->shared->curr_dims[u] = 0x101
3: u = 0x3

Backtrace
#0 H5D__chunk_init (f=0x60700000de60, dxpl_id=0xa00000000000008, dset=0x606000000c20, dapl_id=0xa00000000000007) at H5Dchunk.c:1022
#1 0x00007ffff6285bef in H5D__layout_oh_read (dataset=0x606000000c20, dxpl_id=0xa00000000000008, dapl_id=0xa00000000000007, plist=0x60400000cfd0) at H5Dlayout.c:653
#2 0x00007ffff62668fa in H5D__open_oid (dataset=0x606000000c20, dapl_id=0xa00000000000007, dxpl_id=0xa00000000000008) at H5Dint.c:1598
#3 0x00007ffff62644e5 in H5D_open (loc=0x7fffffffc3f0, dapl_id=0xa00000000000007, dxpl_id=0xa00000000000008) at H5Dint.c:1390
#4 0x00007ffff62638e5 in H5D__open_name (loc=0x7fffffffc580, name=0x602000009470 "/Dataset1", dapl_id=0xa00000000000007, dxpl_id=0xa00000000000008) at H5Dint.c:1324
#5 0x00007ffff61ea0b0 in H5Dopen2 (loc_id=0x100000000000000, name=0x602000009470 "/Dataset1", dapl_id=0xa00000000000007) at H5D.c:293
#6 0x00000000004063ac in ?? ()
#7 0x0000000000407563 in ?? ()
#8 0x0000000000438833 in ?? ()
#9 0x00007ffff64068e9 in H5G_visit_cb (lnk=0x7fffffffcd20, _udata=0x7fffffffd400) at H5Gint.c:925
#10 0x00007ffff641cc50 in H5G__node_iterate (f=0x60700000de60, dxpl_id=0xa00000000000008, _lt_key=0x61200001e048, addr=0x4e0, _rt_key=0x61200001e050, _udata=0x7fffffffd070) at H5Gnode.c:1004
#11 0x00007ffff61461fa in H5B__iterate_helper (f=0x60700000de60, dxpl_id=0xa00000000000008, type=0x7ffff6deba80 <H5B_SNODE>, addr=0x180, op=0x7ffff641c5eb <H5G__node_iterate>, udata=0x7fffffffd070) at H5B.c:1179
#12 0x00007ffff61465c7 in H5B_iterate (f=0x60700000de60, dxpl_id=0xa00000000000008, type=0x7ffff6deba80 <H5B_SNODE>, addr=0x180, op=0x7ffff641c5eb <H5G__node_iterate>, udata=0x7fffffffd070) at H5B.c:1224
#13 0x00007ffff64336ab in H5G__stab_iterate (oloc=0x606000000e68, dxpl_id=0xa00000000000008, order=H5_ITER_INC, skip=0x0, last_lnk=0x0, op=0x7ffff64061cc <H5G_visit_cb>, op_data=0x7fffffffd400) at H5Gstab.c:563
#14 0x00007ffff6425ac9 in H5G__obj_iterate (grp_oloc=0x606000000e68, idx_type=H5_INDEX_NAME, order=H5_ITER_INC, skip=0x0, last_lnk=0x0, op=0x7ffff64061cc <H5G_visit_cb>, op_data=0x7fffffffd400, dxpl_id=0xa00000000000008) at H5Gobj.c:706
#15 0x00007ffff6408508 in H5G_visit (loc_id=0x100000000000000, group_name=0x442e80 "/", idx_type=H5_INDEX_NAME, order=H5_ITER_INC, op=0x438308, op_data=0x7fffffffd630, lapl_id=0xa00000000000000, dxpl_id=0xa00000000000008) at H5Gint.c:1160
#16 0x00007ffff64e7b04 in H5Lvisit_by_name (loc_id=0x100000000000000, group_name=0x442e80 "/", idx_type=H5_INDEX_NAME, order=H5_ITER_INC, op=0x438308, op_data=0x7fffffffd630, lapl_id=0xa00000000000000) at H5L.c:1381
#17 0x0000000000438dca in ?? ()
#18 0x000000000043c6e4 in ?? ()
#19 0x000000000040c21e in ?? ()
#20 0x00007ffff5ba7830 in __libc_start_main (main=0x40bb52, argc=0x7, argv=0x7fffffffddf8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdde8) at ../csu/libc-start.c:291
#21 0x0000000000404e59 in ?? ()

 

Proof of concept

./h5stat -A -T -G -D -S $POC

-A prints attribute information
-T prints dataset’s datatype metadata
-G prints file space information for groups’ metadata
-D prints file space information for datasets’ metadata
-S prints summary of file space information

 

Vulnerability Details

A SIGFPE signal is raised in the function H5D__chunk_set_info_real() of H5Dchunk.c in the HDF HDF5 1.10.3 library during an attempted parse of a crafted HDF file, because of incorrect protection against division by zero. This issue is different from CVE-2018-11207.

 

SYNOPSIS
``` 
if(H5D__chunk_set_info(dset) < 0) [1] 
HGOTO_ERROR(H5E_DATASET, H5E_CANTINIT, FAIL, "unable to set # of chunks for dataset") 
``` 

``` 
if(H5D__chunk_set_info_real(&dset->shared->layout.u.chunk, dset->shared->ndims, dset->shared->curr_dims, dset->shared->max_dims) < 0) [2] 
HGOTO_ERROR(H5E_DATASET, H5E_CANTSET, FAIL, "can't set layout's chunk info") 
``` 

``` 
for(u = 0, layout->nchunks = 1, layout->max_nchunks = 1; u < ndims; u++) { 
/* Round up to the next integer # of chunks, to accomodate partial chunks */ 
layout->chunks[u] = ((curr_dims[u] + layout->dim[u]) - 1) / layout->dim[u]; [3] 
if(H5S_UNLIMITED == max_dims[u]) 
layout->max_chunks[u] = H5S_UNLIMITED; 
``` 

An issue similar to CVE-2018-15672 was discovered in the `H5D__chunk_set_info_real()` function in `src/H5Dchunk.c`.

The function `H5D__layout_oh_read()` invokes `H5D__chunk_init()`, which initializes the raw data chunk cache for a dataset and is called when the dataset is initialized.

It then computes the scaled dimension information and sets the number of chunks in the dataset, for which it calls `H5D__chunk_set_info()` [1], passing the dataset (`dset`).

`H5D__chunk_set_info()` calls `H5D__chunk_set_info_real()` [2], which sets the base layout information. While computing the number of chunks in each dataset dimension, it rounds up: the current dimension `curr_dims[u]` is added to the chunk dimension `layout->dim[u]`, 1 is subtracted, and the result is divided by `layout->dim[u]` [3]. If the chunk dimension is zero, this is a division by zero, which raises a floating-point exception.
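The round-up arithmetic can be reproduced in isolation. The sketch below is hypothetical (the name `count_chunks` and the integer widths are assumptions, and it is not the upstream patch); it applies the same formula as [3] but refuses a zero chunk dimension before dividing.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the round-up in H5D__chunk_set_info_real():
 *   chunks[u] = ((curr_dims[u] + dim[u]) - 1) / dim[u]
 * Stores the per-dimension chunk counts and their product in *nchunks.
 * Returns 0 on success, -1 if any chunk dimension is zero. */
int count_chunks(const uint64_t *curr_dims, const uint32_t *dim,
                 unsigned ndims, uint64_t *chunks, uint64_t *nchunks)
{
    *nchunks = 1;
    for (unsigned u = 0; u < ndims; u++) {
        if (dim[u] == 0)            /* would otherwise divide by zero */
            return -1;
        /* Round up so a partial chunk at the edge still counts as a chunk */
        chunks[u] = ((curr_dims[u] + dim[u]) - 1) / dim[u];
        *nchunks *= chunks[u];
    }
    return 0;
}
```

With `curr_dims[u] = 4` and `layout->dim[u] = 0`, the values printed in the analysis for this issue, the sketch returns -1 rather than reaching the division.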

 

Analysis
gef➤ p curr_dims[u]
$1 = 0x4
gef➤ p layout->dim[u]
$2 = 0x0
gef➤ p ((curr_dims[u] + layout->dim[u]) - 1) / layout->dim[u]
Division by zero

x/i $pc
=> 0x7ffff700b505 <H5D__chunk_set_info+437>:	div    r14

info registers 
rax            0x3	0x3
rbx            0x555555837ab0	0x555555837ab0
rcx            0x0	0x0
rdx            0x0	0x0
rsi            0x0	0x0
rdi            0x555555837a10	0x555555837a10
rbp            0x555555838678	0x555555838678
rsp            0x7fffffffd350	0x7fffffffd350
r8             0x1	0x1
r9             0x1	0x1
r10            0x11f	0x11f
r11            0x555555838478	0x555555838478
r12            0x1	0x1
r13            0x4	0x4
r14            0x0	0x0
r15            0xffffffffffffffff	0xffffffffffffffff
rip            0x7ffff700b505	0x7ffff700b505 <H5D__chunk_set_info+437>
eflags         0x10217	[ CF PF AF IF RF ]
cs             0x33	0x33
ss             0x2b	0x2b
ds             0x0	0x0
es             0x0	0x0
fs             0x0	0x0
gs             0x0	0x0


Backtrace
[#0] 0x7ffff700b505 → Name: H5D__chunk_set_info_real(max_dims=0x555555838678, curr_dims=0x555555838478, ndims=0x1, layout=0x555555837bb0)
[#1] 0x7ffff700b505 → Name: H5D__chunk_set_info(dset=0x555555837a10)
[#2] 0x7ffff700c42c → Name: H5D__chunk_init(f=<optimized out>, dset=0x555555837a10, dapl_id=<optimized out>)
[#3] 0x7ffff7093ec3 → Name: H5D__layout_oh_read(dataset=0x555555837a10, dapl_id=0xa00000000000007, plist=0x555555831d70)
[#4] 0x7ffff70807aa → Name: H5D__open_oid(dapl_id=0xa00000000000007, dataset=0x555555837a10)
[#5] 0x7ffff70807aa → Name: H5D_open(loc=0x7fffffffd530, dapl_id=0xa00000000000007)
[#6] 0x7ffff7082ceb → Name: H5D__open_name(loc=0x7fffffffd5c0, name=0x555555836f30 "/Dataset1", dapl_id=0xa00000000000007)
[#7] 0x7ffff6fe1d98 → Name: H5Dopen2(loc_id=0x100000000000000, name=0x555555836f30 "/Dataset1", dapl_id=<optimized out>)
[#8] 0x5555555d79ca → test rax, rax
[#9] 0x5555555db6e8 → test eax, eax


Proof of concept

./h5dump -H $POC

-H Prints the header but displays no data.

 

Vulnerability Details

A SIGFPE signal is raised in the function H5D__create_chunk_file_map_hyper() of H5Dchunk.c in the HDF HDF5 through 1.10.3 library during an attempted parse of a crafted HDF file, because of incorrect protection against division by zero. It could allow a remote denial of service attack.

 

SYNOPSIS
``` 
herr_t 
H5Dread(hid_t dset_id, hid_t mem_type_id, hid_t mem_space_id, [1] 
hid_t file_space_id, hid_t plist_id, void *buf/*out*/) 
{ 
. 
. 
. 
else { 
/* read raw data */ 
if(H5D__read(dset, mem_type_id, mem_space, file_space, plist_id, buf/*out*/) < 0) [2] 
HGOTO_ERROR(H5E_DATASET, H5E_READERROR, FAIL, "can't read data") 
} 
``` 

``` 
if(sel_hyper_flag) { 
/* Build the file selection for each chunk */ 
if(H5D__create_chunk_file_map_hyper(fm, io_info) < 0) [3] 
HGOTO_ERROR(H5E_DATASET, H5E_CANTINIT, FAIL, "unable to create file chunk selections") 
``` 

``` 
for(u = 0; u < fm->f_ndims; u++) { 
scaled[u] = start_scaled[u] = sel_start[u] / fm->layout->u.chunk.dim[u]; [4] 
coords[u] = start_coords[u] = scaled[u] * fm->layout->u.chunk.dim[u]; 
end[u] = (coords[u] + fm->chunk_dim[u]) - 1; 
``` 

`H5Dread()` [1] reads part of a dataset from the file into the application's memory buffer. It internally calls `H5D__read()` [2] to read the raw data.

`H5D__read()` in turn calls `H5D__chunk_io_init()`, which performs any initialization needed before I/O on the raw data. Inside `H5D__chunk_io_init()`, a check determines whether the file selection is a hyperslab selection (`sel_hyper_flag`); if so, it calls `H5D__create_chunk_file_map_hyper()` [3]. That function builds the file selection for each chunk and creates all chunk selections in the file. It gets the number of elements selected in the file and the bounding box for the selection, then sets the initial chunk location and hyperslab size, which is where things go wrong.

Here the offset of the low bound of the file selection, `sel_start[u]`, is divided by the chunk dimension of the dataset layout, `fm->layout->u.chunk.dim[u]` [4]. If that chunk dimension is zero, this is a division by zero, which raises a floating-point exception.
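As a rough illustration of the loop body at [4], the sketch below computes the same three quantities for a single dimension. `struct chunk_span` and `first_chunk_span` are hypothetical names introduced here, the integer widths are assumed, and the zero check is an assumption about what a guard could look like rather than the upstream fix.

```c
#include <stdint.h>

/* Hypothetical per-dimension view of the first chunk touched by a selection */
struct chunk_span {
    uint64_t scaled;  /* chunk index along this dimension */
    uint64_t coord;   /* element coordinate where that chunk begins */
    uint64_t end;     /* last element coordinate inside that chunk */
};

/* Mirrors scaled[u], coords[u] and end[u] from the loop in
 * H5D__create_chunk_file_map_hyper(), for one dimension.
 * Returns 0 on success, -1 if chunk_dim is zero. */
int first_chunk_span(uint64_t sel_start, uint32_t chunk_dim,
                     struct chunk_span *out)
{
    if (chunk_dim == 0)             /* would otherwise divide by zero */
        return -1;
    out->scaled = sel_start / chunk_dim;
    out->coord  = out->scaled * chunk_dim;
    out->end    = (out->coord + chunk_dim) - 1;
    return 0;
}
```

For example, a selection starting at element 250 with a chunk dimension of 100 falls in chunk 2, which covers elements 200 through 299.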

 

Analysis
Backtrace
{
DATASET "BAG_root/metadata" {
   DATATYPE  H5T_STRING {
      STRSIZE 1;
      STRPAD H5T_STR_NULLTERM;
      CSET H5T_CSET_ASCII;
      CTYPE H5T_C_S1;
}
   DATASPACE  SIMPLE { ( 4795 ) / ( H5S_UNLIMITED ) }

Program received signal SIGFPE, Arithmetic exception.
0x00007ffff6140acf in H5D__create_chunk_file_map_hyper (fm=0x61e000000c80, io_info=0x7fffffffb910) at H5Dchunk.c:1578
1578	        scaled[u] = start_scaled[u] = sel_start[u] / fm->layout->u.chunk.dim[u];

(gdb) x/i $pc
=> 0x7ffff6140acf :	div    rdi

(gdb) info registers 
rax            0x7ffff668b280	140737327444608
rbx            0x7fffffffb320	140737488335648
rcx            0x0	0
rdx            0x0	0
rsi            0x7ffff668b280	140737327444608
rdi            0x0	0
rbp            0x7fffffffb340	0x7fffffffb340
rsp            0x7fffffffaa30	0x7fffffffaa30
r8             0x7	7
r9             0x61e000000c80	107614700571776
r10            0x3d1	977
r11            0x7ffff66882e1	140737327432417
r12            0xffffffff550	17592186041680
r13            0x7fffffffaa80	140737488333440
r14            0x7fffffffaa80	140737488333440
r15            0x7fffffffb3e0	140737488335840
rip            0x7ffff6140acf	0x7ffff6140acf 
eflags         0x10206	[ PF IF RF ]
cs             0x33	51
ss             0x2b	43
ds             0x0	0
es             0x0	0
fs             0x0	0
gs             0x0	0
ASAN:DEADLYSIGNAL
=================================================================
==37286==ERROR: AddressSanitizer: FPE on unknown address 0x7ffff6140acf (pc 0x7ffff6140acf bp 0x7fffffffb340 sp 0x7fffffffaa30 T0)
    #0 0x7ffff6140ace in H5D__create_chunk_file_map_hyper /home/ethan/hdf5-1_10_3_gcc/src/H5Dchunk.c:1578
    #1 0x7ffff613dfa0 in H5D__chunk_io_init /home/ethan/hdf5-1_10_3_gcc/src/H5Dchunk.c:1169
    #2 0x7ffff61b6702 in H5D__read /home/ethan/hdf5-1_10_3_gcc/src/H5Dio.c:589
    #3 0x7ffff61b2515 in H5Dread /home/ethan/hdf5-1_10_3_gcc/src/H5Dio.c:198
    #4 0x5555555bce14  (/home/ethan/hdf5-1_10_3_gcc/hdf5/bin/h5dump+0x68e14)
    #5 0x5555555be2b4  (/home/ethan/hdf5-1_10_3_gcc/hdf5/bin/h5dump+0x6a2b4)
    #6 0x5555555cc6de  (/home/ethan/hdf5-1_10_3_gcc/hdf5/bin/h5dump+0x786de)
    #7 0x555555582a85  (/home/ethan/hdf5-1_10_3_gcc/hdf5/bin/h5dump+0x2ea85)
    #8 0x5555555881c1  (/home/ethan/hdf5-1_10_3_gcc/hdf5/bin/h5dump+0x341c1)
    #9 0x555555579872  (/home/ethan/hdf5-1_10_3_gcc/hdf5/bin/h5dump+0x25872)
    #10 0x7ffff5aa41c0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x211c0)
    #11 0x555555572129  (/home/ethan/hdf5-1_10_3_gcc/hdf5/bin/h5dump+0x1e129)

Proof of concept

h5dump -r -d BAG_root/metadata $POC

-r prints 1-byte integer datasets as ASCII.

-d dumps a dataset from a group in an HDF5 file.

  

Timeline

Vendor Disclosure: 2018-09-24

Patch Release: 2018-09-25

Public Disclosure: 2018-09-26

 

Credit

Discovered by ACE Team – Loginsoft