A Practical Guide for InSAR Data Storage - Version 2.0 with Multi-Track and Multi-Product Support
HDF5 is a file format designed to store and organize large amounts of scientific data. Think of it as a sophisticated container that can hold data arrays, the metadata that describes them, and a folder-like hierarchy that organizes both.
HDF5 is like a miniature file system inside a single file. Just as you organize files on your computer into folders, HDF5 lets you organize data arrays into groups.
| Advantage | What It Means for InSAR |
|---|---|
| Self-Describing | All metadata travels with the data - you know what satellite, what dates, what processing software and methods were used |
| Georeferenced | Built-in geographic coordinates (lon/lat) for every pixel - no separate geolocation files needed |
| Efficient Storage | Built-in compression can reduce file sizes by 50-90% |
| Multiple Datasets | Store unwrapped phase, wrapped phase, correlation, time series, velocity, AND coordinates all in one file |
| Multi-Product Support | NEW in v2.0: Store interferograms, time series, AND velocity in the same file |
| Partial Reading | Read just the part of the image you need without loading the whole file |
| Cross-Platform | Works on Linux, Mac, Windows - same file everywhere |
| Language Support | Can read with Python, MATLAB, R, C++, Java, etc. |
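
The "Efficient Storage" and "Partial Reading" rows can be illustrated with a minimal h5py sketch. It assumes h5py and NumPy are installed. The dataset names `raw` and `compressed`, the synthetic array, and the chunk and gzip settings are illustrative choices, not part of the v2.0 layout. Actual savings depend heavily on the data: real, noisy phase usually compresses far less than the smooth ramp used here.

```python
import os
import tempfile

import h5py
import numpy as np

# Synthetic, smooth array standing in for an interferogram (assumption:
# real phase data is noisier and will compress less well).
rows, cols = 1000, 1000
phase = np.tile(np.linspace(0.0, 20.0, cols, dtype=np.float32), (rows, 1))

# Write to a throwaway file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "compression_demo.h5")

with h5py.File(path, "w") as f:
    # Same data stored twice: once uncompressed, once chunked and gzip-compressed.
    f.create_dataset("raw", data=phase)
    f.create_dataset(
        "compressed",
        data=phase,
        chunks=(100, 100),      # compression in HDF5 works per chunk
        compression="gzip",
        compression_opts=4,     # gzip level, 0-9
    )

with h5py.File(path, "r") as f:
    # Bytes actually occupied on disk by each dataset.
    raw_bytes = f["raw"].id.get_storage_size()
    gz_bytes = f["compressed"].id.get_storage_size()

    # Partial read: only the chunks overlapping this window are decompressed.
    window = f["compressed"][200:300, 400:500]

print(f"raw: {raw_bytes} bytes, gzip: {gz_bytes} bytes")
print("window shape:", window.shape)
```

Slicing the dataset object (rather than loading it with `[...]` first) is what keeps the read partial.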
Groups organize your data hierarchically, just like folders on your computer.
For example: /ALOS2_073_A/INTERFEROGRAM/20240101_20240113/

Datasets are multi-dimensional arrays that hold your actual numerical data.
Attributes are small pieces of metadata attached to groups or datasets.
In this documentation, we use @ to indicate attributes: @platform means an attribute named "platform"
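
The three building blocks described above (groups, datasets, attributes) can be sketched with h5py as follows. This is a minimal illustration, not the reference writer for this format: the array contents, the `"radians"` value for `@units`, and the choice to attach `@platform` to the track group are assumptions made for the example. The file lives in memory only (`driver="core"`, `backing_store=False`), so nothing is written to disk.

```python
import h5py
import numpy as np

# In-memory HDF5 file: the file name is required but no file is created.
with h5py.File("concepts_demo.h5", "w", driver="core", backing_store=False) as f:
    # Group: intermediate groups in the path are created automatically.
    pair = f.create_group("ALOS2_073_A/INTERFEROGRAM/20240101_20240113")

    # Dataset: a small placeholder array (hypothetical values).
    dset = pair.create_dataset(
        "unwrapped_interferogram",
        data=np.zeros((4, 5), dtype=np.float32),
    )

    # Attributes: small metadata attached to a dataset or a group.
    dset.attrs["units"] = "radians"                # placeholder value (assumption)
    f["ALOS2_073_A"].attrs["platform"] = "ALOS-2"  # placement is an assumption

    # Read everything back while the file is still open.
    group_path = pair.name
    dataset_path = dset.name
    dataset_shape = dset.shape
    units = dset.attrs["units"]
    platform = f["ALOS2_073_A"].attrs["platform"]

print(group_path)
print(dataset_path, dataset_shape, units, platform)
```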
Computer File System → HDF5 Equivalent:

- /Users/username/Documents/Project/ → /ALOS2_073_A/INTERFEROGRAM/ (group)
- photo.jpg → unwrapped_interferogram (dataset)
- coordinates.txt → longitude and latitude (datasets)
- File properties → attributes (@units, @description)

Full path to a dataset: /ALOS2_073_A/INTERFEROGRAM/20240101_20240113/unwrapped_interferogram

In Your InSAR File (Version 2.0):

- /ALOS2_073_A/ - Contains all data for ALOS-2 track 73 ascending
- /ALOS2_073_A/INTERFEROGRAM/ - Contains all interferogram date pairs for this track
- /ALOS2_073_A/TIMESERIES/ - Contains displacement time series for this track
- /ALOS2_073_A/VELOCITY/ - Contains velocity products for this track
- /S1_064_D/ - Contains all data for Sentinel-1 track 64 descending

Example Dataset:
Coordinate Datasets:
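Use Attributes for:

- Product type declarations (@product_types)
- The coordinate reference system (@coordinate_reference_system)

Use Datasets for:

- Image arrays such as unwrapped_interferogram
- Coordinate arrays (longitude, latitude)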
Rule of Thumb: If it's bigger than a few KB, make it a dataset. If it's a description or label, make it an attribute. Coordinates are always datasets because they match the size of your data arrays.
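
The rule of thumb can be sketched in h5py: pixel-sized coordinate arrays go in as datasets, while short labels such as the CRS and the product list go in as attributes. The grid size and coordinate values below are invented for illustration, and storing `@product_types` as a JSON-encoded string is an assumption based on the string form shown in this guide.

```python
import json

import h5py
import numpy as np

# Hypothetical 3 x 4 grid; a real file would match the interferogram size.
rows, cols = 3, 4
lon_1d = np.linspace(138.0, 138.3, cols)
lat_1d = np.linspace(35.2, 35.0, rows)
lon_array, lat_array = np.meshgrid(lon_1d, lat_1d)  # both have shape (rows, cols)

# In-memory file, nothing is written to disk.
with h5py.File("coords_demo.h5", "w", driver="core", backing_store=False) as f:
    track = f.create_group("ALOS2_073_A")

    # Datasets: one longitude and one latitude value per pixel.
    track.create_dataset("longitude", data=lon_array)
    track.create_dataset("latitude", data=lat_array)

    # Attributes: short labels describing the track.
    track.attrs["coordinate_reference_system"] = "EPSG:4326"
    track.attrs["product_types"] = json.dumps(["INTERFEROGRAM"])  # JSON string (assumption)

    # Read back while the file is open.
    lon_shape = track["longitude"].shape
    crs = track.attrs["coordinate_reference_system"]
    products = json.loads(track.attrs["product_types"])

print(lon_shape, crs, products)
```

Decoding with `json.loads` turns the stored string back into a Python list, which makes it easy to check whether a track declares a given product.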
| Concept | What It Is | Example (V2.0) | Python |
|---|---|---|---|
| Group | Container (like a folder) | /ALOS2_073_A/INTERFEROGRAM/ | `track.create_group('INTERFEROGRAM')` |
| Dataset | Data array (like a file) | unwrapped_interferogram, longitude | `f.create_dataset('name', data=array)` |
| Attribute | Metadata (like file properties) | @platform = "ALOS-2" | `f.attrs['platform'] = 'ALOS-2'` |
| Coordinates | Geographic location of data | longitude, latitude | `track.create_dataset('longitude', data=lon_array)` |
| CRS | Coordinate reference system | @coordinate_reference_system = "EPSG:4326" | `track.attrs['coordinate_reference_system'] = 'EPSG:4326'` |
| Product Types | Declares available products | @product_types = ["INTERFEROGRAM"] | `track.attrs['product_types'] = '["INTERFEROGRAM"]'` |
| Track | One satellite/orbit combination | ALOS2_073_A, S1_064_D | `track = f.create_group('ALOS2_073_A')` |
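
Putting the calls from the table together, the sketch below builds a tiny two-track file and then walks it to list every group and dataset. It is an illustration under stated assumptions rather than the official writer: the array is a placeholder, the helper `record` and the list `found` are hypothetical names, and the file exists only in memory.

```python
import h5py
import numpy as np

found = []  # (path, kind) pairs collected while walking the file


def record(name, obj):
    """Callback for visititems: note whether each object is a group or a dataset."""
    kind = "dataset" if isinstance(obj, h5py.Dataset) else "group"
    found.append((name, kind))


# In-memory file, nothing is written to disk.
with h5py.File("layout_demo.h5", "w", driver="core", backing_store=False) as f:
    # Track -> product type -> date pair -> dataset.
    track = f.create_group("ALOS2_073_A")
    track.attrs["product_types"] = '["INTERFEROGRAM"]'
    ifg = track.create_group("INTERFEROGRAM")
    pair = ifg.create_group("20240101_20240113")
    pair.create_dataset(
        "unwrapped_interferogram",
        data=np.zeros((2, 3), dtype=np.float32),  # placeholder values
    )

    # A second, still empty track in the same file.
    f.create_group("S1_064_D")

    # Top-level keys are the tracks; visititems walks everything below the root.
    tracks = sorted(f.keys())
    f.visititems(record)

print("tracks:", tracks)
for path, kind in found:
    print(f"{kind:8s} /{path}")
```

`visititems` reports paths relative to the object it is called on, which is why the loop adds the leading `/` when printing.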