vptstools package
Subpackages
Submodules
vptstools.odimh5 module
- class vptstools.odimh5.ODIMReader(file_path: str)[source]
Bases:
object
Read ODIM (HDF5) files with context manager
Should be used with the “with” statement (context manager) to properly close the HDF5 file.
- hdf5
- Type:
HDF5 file object
- property root_object_str: str
Get the root what.object attribute as a string.
- Possible values according to the standard:
“PVOL” (Polar volume)
“CVOL” (Cartesian volume)
“SCAN” (Polar scan)
“RAY” (Single polar ray)
“AZIM” (Azimuthal object)
“ELEV” (Elevational object)
“IMAGE” (2-D cartesian image)
“COMP” (Cartesian composite image(s))
“XSEC” (2-D vertical cross section(s))
“VP” (1-D vertical profile)
“PIC” (Embedded graphical image)
- property root_source: Dict[str, str]
Get the root what.source attribute as a dict.
Example: {‘WMO’:’06477’, ‘NOD’:’bewid’, ‘RAD’:’BX41’, ‘PLC’:’Wideumont’}
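The context-manager behaviour described above can be sketched with a minimal stand-in class (illustrative only, not the library implementation): the reader opens the HDF5 resource on `__enter__` and guarantees it is closed on `__exit__`.

```python
# Minimal sketch of the context-manager pattern ODIMReader relies on
# (an illustrative stand-in, not the library implementation):
class Reader:
    def __init__(self, file_path: str):
        self.file_path = file_path
        self.closed = False  # stands in for the open HDF5 file handle

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.closed = True  # the real reader closes the HDF5 file here
        return False  # do not suppress exceptions raised in the block

reader = Reader("bejab_vp_20221111T233000Z_0x9.h5")
with reader:
    pass  # read the what/where/how groups inside the block
print(reader.closed)  # the resource is released once the block exits
```

Because `__exit__` runs even when the block raises, the HDF5 file is closed in all cases, which is why the class should always be used with the `with` statement.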
- vptstools.odimh5.check_vp_odim(source_odim: ODIMReader) → None[source]
Verify that an ODIM file is in HDF5 ODIM format and contains ‘VP’ data.
vptstools.s3 module
- class vptstools.s3.OdimFilePath(source: str, radar_code: str, data_type: str, year: str, month: str, day: str, hour: str = '00', minute: str = '00', file_name: str = '', file_type: str = '')[source]
Bases:
object
ODIM file path with translation from/to different S3 key paths
- Parameters:
source (str) – Data source, e.g. baltrad, ecog-04003,…
radar_code (str) – country + radar code
data_type (str) – ODIM data type, e.g. vp, pvol,…
year (str) – year, YYYY
month (str) – month, MM
day (str) – day, DD
hour (str = "00") – hour, HH
minute (str = "00") – minute, MM
file_name (str = "", optional) – File name from which the other properties were derived
file_type (str = "", optional) – File type from which the other properties were derived, e.g. hdf5
- property country
Country code
- property daily_vpts_file_name
Name of the corresponding daily VPTS file
- classmethod from_inventory(h5_file_path)[source]
Initialize class from S3 inventory which contains source and file_type
- classmethod from_s3fs_enlisting(h5_file_path)[source]
Initialize class from S3 inventory which contains bucket, source and file_type
- static parse_file_name(file_name)[source]
Parse an HDF5 file name into radar_code, data_type, year, month, day, hour, minute and file_name.
- Parameters:
file_name (str) – File name to be parsed. An eventual parent path and extension will be removed
- Return type:
radar_code, data_type, year, month, day, hour, minute, file_name
Notes
File names are expected to have the following format:
radar_type_yyyymmddThhmmextra.h5
with:
- radar: the 5-letter radar code
- type: the data type
- yyyymmdd: the date
- hhmm: the hours and minutes
The T is optional and extra is ignored.
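The naming convention above can be sketched with a small regex-based parser (an illustrative sketch, not the library's own implementation; the function name is hypothetical):

```python
import re
from pathlib import Path

def parse_odim_name(file_name):
    """Illustrative sketch of the naming convention (not the library parser)."""
    stem = Path(file_name).stem  # strip any parent path and the .h5 extension
    m = re.match(r"^([a-z]{5})_([a-z]+)_(\d{4})(\d{2})(\d{2})T?(\d{2})(\d{2})", stem)
    if m is None:
        raise ValueError(f"{file_name} does not follow the expected format")
    radar, data_type, year, month, day, hour, minute = m.groups()
    return radar, data_type, year, month, day, hour, minute

print(parse_odim_name("bejab_vp_20221111T233000Z_0x9.h5"))
# → ('bejab', 'vp', '2022', '11', '11', '23', '30')
```

Note how the optional `T` is consumed by `T?` and the trailing `00Z_0x9` part is simply ignored as `extra`.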
- property radar
Radar code
- property s3_file_path_daily_vpts
S3 key of the daily VPTS file corresponding to the HDF5 file
- property s3_file_path_monthly_vpts
S3 key of the monthly concatenated VPTS file corresponding to the HDF5 file
- property s3_folder_path_h5
S3 key with the folder containing the HDF5 file
- vptstools.s3.extract_daily_group_from_inventory(file_path)[source]
Extract file name components to define a group
The coverage file counts the number of files available per group (e.g. daily files per radar). This function is passed to the Pandas groupby to translate the file path to a countable set (e.g. source, radar_code, year, month and day for daily files per radar).
- Parameters:
file_path (str) – File path of the ODIM HDF5 file. Only the file name is taken into account and a folder-path is ignored.
- vptstools.s3.extract_daily_group_from_path(file_path)[source]
Extract file name components to define a group
The coverage file counts the number of files available per group (e.g. daily files per radar). This function is passed to the Pandas groupby to translate the file path to a countable set (e.g. source, radar_code, year, month and day for daily files per radar).
- Parameters:
file_path (str) – File path of the ODIM HDF5 file. Only the file name is taken into account and a folder-path is ignored.
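The grouping described above can be sketched with a hypothetical grouping function (the function name, path layout and group key are assumptions for illustration):

```python
import pandas as pd

# Hypothetical grouping function, analogous in spirit to
# extract_daily_group_from_path: map a file path to a countable
# (radar_code, date) tuple so files can be counted per radar per day.
def daily_group(file_path):
    file_name = file_path.split("/")[-1]  # the folder path is ignored
    radar_code, data_type, timestamp = file_name.split("_")[:3]
    return radar_code, timestamp[:8]  # (radar code, yyyymmdd)

paths = pd.Series([
    "baltrad/hdf5/2022/11/11/bejab_vp_20221111T233000Z_0x9.h5",
    "baltrad/hdf5/2022/11/11/bejab_vp_20221111T234500Z_0x9.h5",
    "baltrad/hdf5/2022/11/12/bejab_vp_20221112T000000Z_0x9.h5",
])
counts = paths.groupby(paths.map(daily_group)).count()
print(counts)  # number of files per (radar, day) group
```

Passing such a function to `groupby` is what turns a flat list of S3 keys into the per-day file counts stored in the coverage file.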
- vptstools.s3.handle_manifest(manifest_url, modified_days_ago='2day', storage_options=None)[source]
Extract modified days and coverage from a manifest file
- Parameters:
manifest_url (str) – URL of the S3 inventory manifest file to use; s3://…
modified_days_ago (str, default '2day') – Time period to check for ‘modified date’ to extract the subset of files that should trigger a rerun.
storage_options (dict, optional) – Additional parameters passed to read_csv to access the S3 manifest files, e.g. custom AWS profile options ({"profile": "inbo-prd"})
- Returns:
df_cov (pandas.DataFrame) – DataFrame with the ‘directory’ info (source, radar_code, year, month, day) and the number of files in the S3 bucket.
df_days_to_create_vpts (pandas.DataFrame) – DataFrame with the ‘directory’ info (source, radar_code, year, month, day) and the number of new files within the look back period.
Notes
Check https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-inventory.html for more information on S3 bucket inventory and manifest files.
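The look-back filtering that modified_days_ago implies can be sketched with pandas time arithmetic (the column names and reference date are assumptions; the real manifest parsing happens inside handle_manifest):

```python
import pandas as pd

# Sketch of the look-back filtering behind modified_days_ago (assumed
# column names; files modified within the period trigger a rerun).
df = pd.DataFrame({
    "file": ["a.h5", "b.h5", "c.h5"],
    "modified": pd.to_datetime(["2023-01-03", "2023-01-09", "2023-01-10"]),
})
reference = pd.Timestamp("2023-01-10")          # fixed date for reproducibility
cutoff = reference - pd.Timedelta("2day")       # e.g. modified_days_ago="2day"
recent = df[df["modified"] >= cutoff]
print(recent["file"].tolist())  # files modified within the look-back period
```

A string such as "2day" is directly interpretable by `pd.Timedelta`, which is why the parameter can be expressed in that form.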
vptstools.vpts module
- class vptstools.vpts.BirdProfile(identifiers: dict, datetime: datetime, what: dict, where: dict, how: dict, levels: List[int], variables: dict, source_file: str = '')[source]
Bases:
object
Represent ODIM source file
Data class representing a single input ODIM source file (see the ODIM bird profile format specification, https://github.com/adokter/vol2bird/wiki/ODIM-bird-profile-format-specification): a single datetime and a single radar, with multiple altitudes and variables (dd, ff, …) for each altitude.
This object aims to stay as close as possible to the HDF5 file (no data simplification/loss at this stage). Use the from_odim method for convenient instantiation.
- classmethod from_odim(source_odim: ODIMReader, source_file=None)[source]
Extract BirdProfile information from ODIM with OdimReader
- Parameters:
source_odim (ODIMReader) – ODIM file reader interface.
source_file (str, optional) – URL or path to the source file from which the data were derived.
- to_vp(vpts_csv_version)[source]
Convert the profile data to the VPTS CSV data format
- Parameters:
vpts_csv_version (AbstractVptsCsv) – VPTS CSV ruleset to use, e.g. v1.0
Notes
When ‘NaN’ or ‘NA’ values are present in a column, the column keeps the object data type. Otherwise the distinction between NaN and NA would be lost; this also avoids int-to-float conversion when NaN values occur, as Pandas does not support integer NaN.
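The note above can be illustrated with a small pandas example showing both effects: the int-to-float upcast and the NaN/NA distinction that the object dtype preserves.

```python
import numpy as np
import pandas as pd

# An integer column with NaN is silently upcast to float by pandas,
# because there is no integer NaN:
print(pd.Series([1, 2, np.nan]).dtype)  # float64

# Keeping the object dtype preserves the int values and keeps
# NaN and NA distinct:
s = pd.Series([1, np.nan, pd.NA], dtype=object)
print(s.dtype)       # object
print(s[1] is s[2])  # False: NaN and NA stay different objects
```

Both values still count as missing (`pd.isna` is True for each), but only the object dtype remembers which marker was used.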
- vptstools.vpts.validate_vpts(df, schema_version='v1.0')[source]
Validate VPTS DataFrame against the frictionless data schema and return report
- Parameters:
df (pandas.DataFrame) – DataFrame as created by the vp or vpts functions
schema_version (str, default "v1.0") – Version according to a release tag of https://github.com/enram/vpts-csv/tags
- Returns:
Frictionless validation report
- vptstools.vpts.vp(file_path, vpts_csv_version='v1.0', source_file='')[source]
Convert ODIM HDF5 file to a DataFrame
- Parameters:
file_path (Path) – File path of the ODIM HDF5 file
vpts_csv_version (str, default "v1.0") – VPTS CSV ruleset to use, e.g. v1.0
source_file (str | callable) – URL or path to the source file from which the data were derived or a callable that converts the file_path to the source_file. See https://aloftdata.eu/vpts-csv/#source_file for more information on the source file field.
Examples
>>> file_path = Path("bejab_vp_20221111T233000Z_0x9.h5")
>>> vp(file_path)
>>> vp(file_path,
...    source_file="s3://aloftdata/baltrad/hdf5/2022/11/11/bejab_vp_20221111T233000Z_0x9.h5")
Use file name itself as source_file representation in VP file using a custom callable function
>>> vp(file_path, source_file=lambda x: Path(x).name)
- vptstools.vpts.vpts(file_paths, vpts_csv_version='v1.0', source_file=None)[source]
Convert a set of ODIM HDF5 files to a single DataFrame with all values as strings
- Parameters:
file_paths (Iterable of file paths) – Iterable of ODIM HDF5 file paths
vpts_csv_version (str) – VPTS CSV ruleset to use, e.g. v1.0
source_file (callable, optional) – A callable that converts the file_path to the source_file. When None, the file name itself (without parent folder reference) is used.
Notes
Due to the multiprocessing support, the source_file callable can not be an anonymous (lambda) function.
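The lambda restriction follows from how multiprocessing ships the callable to worker processes: it serializes it with pickle, which stores functions by their importable name. A named, module-level function (like `path_to_source` in the example below) has such a name; an anonymous lambda does not, so pickling it fails. A quick check:

```python
import pickle

# multiprocessing pickles the source_file callable to send it to worker
# processes; pickle serializes functions by reference to an importable
# name, which a lambda lacks:
try:
    pickle.dumps(lambda file_path: file_path)
    print("lambda pickled")
except Exception as err:
    print(f"lambda rejected: {type(err).__name__}")
```

Defining the callable with `def` at module level, as in the vpts examples, avoids the problem.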
Examples
>>> file_paths = sorted(Path("../data/raw/baltrad/").rglob("*.h5"))
>>> vpts(file_paths)
Use file name itself as source_file representation in VP file using a custom callable function
>>> def path_to_source(file_path):
...     return Path(file_path).name
>>> vpts(file_paths, source_file=path_to_source)
- vptstools.vpts.vpts_to_csv(df, file_path)[source]
Write VP or VPTS to file
- Parameters:
df (pandas.DataFrame) – DataFrame with VP or VPTS data
file_path (Path | str) – File path to store the VPTS file
vptstools.vpts_csv module
- class vptstools.vpts_csv.AbstractVptsCsv[source]
Bases:
ABC
Abstract class to define VPTS CSV conversion rules with a certain version
- abstract mapping(bird_profile) dict [source]
Translation from ODIM bird profile to VPTS CSV data format.
Data columns can be derived from the different attributes of the bird profile:
- identifiers: radar identification metadata
- datetime: the timestamp
- levels: the heights or levels of the measurement
- variables: the variables in the data (e.g. dd, ff, u, …)
- how: ODIM5 metadata
- where: ODIM5 metadata
- what: ODIM5 metadata
An example of the dict to return:
dict(
    radar=bird_profile.identifiers["NOD"],
    height=bird_profile.levels,
    u=bird_profile.variables["u"],
    v=bird_profile.variables["v"],
    vcp=int(bird_profile.how["vcp"])
)
As the data is extracted as such, additional helper functions can be applied as well, e.g.:
...
datetime=datetime_to_proper8601(bird_profile.datetime),
gap=number_to_bool_str(bird_profile.variables["gap"]),
radar_latitude=np.round(bird_profile.where["lat"], 6)
...
Notes
The order of the variables matters, as this defines the column order.
- abstract property sort: dict
Columns to define row order
The dict needs to provide the column name in combination with the data type to use for the sorting, e.g.:
dict(radar=str, datetime=str, height=int, source_file=str)
As the data is returned as strings, values are cast to these data types before sorting, after which they are cast back to str.
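The cast-sort-cast-back pattern described above can be sketched on a small string-typed DataFrame (column names follow the sort example; this is an illustration, not the library code):

```python
import pandas as pd

# The cast -> sort -> cast-back pattern: without the int cast, "1000"
# would sort lexically before "200".
sort = dict(radar=str, height=int)
df = pd.DataFrame({
    "radar": ["bejab", "bejab", "bejab"],
    "height": ["1000", "200", "0"],
})
df_sorted = (
    df.astype(sort)                # cast to the sorting data types
      .sort_values(by=list(sort))  # height now sorts numerically
      .astype(str)                 # back to the all-string representation
)
print(df_sorted["height"].tolist())  # ['0', '200', '1000']
```

This is why the sort property pairs each column name with a data type rather than just listing the column names.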
- source_file_regex = '.*'
- class vptstools.vpts_csv.VptsCsvV1[source]
Bases:
AbstractVptsCsv
- mapping(bird_profile)[source]
Translation from ODIM bird profile to VPTS CSV data format.
Notes
The order of the variables matters, as this defines the column order.
- source_file_regex = '^(?=^[^.\\/~])(^((?!\\.{2}).)*$).*$'
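The source_file_regex above rejects source_file values that start with `.`, `/` or `~` or that contain `..` anywhere. A quick check of that behaviour (the regex is copied from the attribute; the example paths are illustrative):

```python
import re

# source_file_regex from VptsCsvV1: a lookahead forbids a leading '.',
# '/' or '~', and the negative lookahead forbids '..' anywhere.
regex = r"^(?=^[^.\/~])(^((?!\.{2}).)*$).*$"

print(bool(re.match(regex, "s3://aloftdata/baltrad/hdf5/2022/11/11/"
                           "bejab_vp_20221111T233000Z_0x9.h5")))   # True
print(bool(re.match(regex, "../bejab_vp_20221111T233000Z_0x9.h5")))  # False
print(bool(re.match(regex, "~/bejab_vp_20221111T233000Z_0x9.h5")))   # False
```

The intent is to keep path-traversal-style values (relative parent references, home-directory shortcuts) out of the source_file column.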
- exception vptstools.vpts_csv.VptsCsvVersionError[source]
Bases:
Exception
Raised when a non-supported VPTS CSV version is requested
- vptstools.vpts_csv.check_source_file(source_file, regex)[source]
Raise an exception when the source_file string does not match the regex.
- Parameters:
source_file (str) – source file path or URL to check
regex (str) – regular expression the source_file should match
- Returns:
source_file
- Raises:
ValueError – source_file not according to regex
Examples
>>> check_source_file("s3://aloftdata/baltrad/2023/01/01/"
...                   "bejab_vp_20230101T000500Z_0x9.h5",
...                   r".*h5")
's3://aloftdata/baltrad/2023/01/01/bejab_vp_20230101T000500Z_0x9.h5'
- vptstools.vpts_csv.datetime_to_proper8601(timestamp)[source]
Convert datetime to ISO8601 standard
- Parameters:
timestamp (datetime.datetime) – datetime to represent to ISO8601 standard.
Notes
See https://stackoverflow.com/questions/19654578/python-utc-datetime-objects-iso-format-doesnt-include-z-zulu-or-zero-offset
Examples
>>> from datetime import datetime
>>> datetime_to_proper8601(datetime(2021, 1, 1, 4, 0))
'2021-01-01T04:00:00Z'
- vptstools.vpts_csv.get_vpts_version(version: str)[source]
Link version ID (v1, v2,..) with correct AbstractVptsCsv child class
- Parameters:
version (str) – e.g. v1.0, v2.0,…
- Returns:
VptsCsvVx
- Return type:
child class of AbstractVptsCsv
- Raises:
VptsCsvVersionError – Version of the VPTS CSV is not supported by an implementation
- vptstools.vpts_csv.int_to_nodata(value, nodata_values, nodata='')[source]
Convert a string value to an integer, or to the nodata value when the string is enlisted in nodata_values.
- Parameters:
value (str) – value to convert to int
nodata_values (list) – values to be converted to the nodata value
nodata (str, default "") – nodata value returned for enlisted values
- Return type:
int | str
Examples
>>> int_to_nodata("0", ["0", 'NULL'], nodata="")
''
>>> int_to_nodata("12", ["0", 'NULL'], nodata="")
12
>>> int_to_nodata('NULL', ["0", 'NULL'], nodata="")
''
Module contents
- vptstools.validate_vpts(df, schema_version='v1.0')[source]
Validate VPTS DataFrame against the frictionless data schema and return report
- Parameters:
df (pandas.DataFrame) – DataFrame as created by the vp or vpts functions
schema_version (str, default "v1.0") – Version according to a release tag of https://github.com/enram/vpts-csv/tags
- Returns:
Frictionless validation report
- vptstools.vp(file_path, vpts_csv_version='v1.0', source_file='')[source]
Convert ODIM HDF5 file to a DataFrame
- Parameters:
file_path (Path) – File path of the ODIM HDF5 file
vpts_csv_version (str, default "v1.0") – VPTS CSV ruleset to use, e.g. v1.0
source_file (str | callable) – URL or path to the source file from which the data were derived or a callable that converts the file_path to the source_file. See https://aloftdata.eu/vpts-csv/#source_file for more information on the source file field.
Examples
>>> file_path = Path("bejab_vp_20221111T233000Z_0x9.h5")
>>> vp(file_path)
>>> vp(file_path,
...    source_file="s3://aloftdata/baltrad/hdf5/2022/11/11/bejab_vp_20221111T233000Z_0x9.h5")
Use file name itself as source_file representation in VP file using a custom callable function
>>> vp(file_path, source_file=lambda x: Path(x).name)
- vptstools.vpts(file_paths, vpts_csv_version='v1.0', source_file=None)[source]
Convert a set of ODIM HDF5 files to a single DataFrame with all values as strings
- Parameters:
file_paths (Iterable of file paths) – Iterable of ODIM HDF5 file paths
vpts_csv_version (str) – VPTS CSV ruleset to use, e.g. v1.0
source_file (callable, optional) – A callable that converts the file_path to the source_file. When None, the file name itself (without parent folder reference) is used.
Notes
Due to the multiprocessing support, the source_file callable can not be an anonymous (lambda) function.
Examples
>>> file_paths = sorted(Path("../data/raw/baltrad/").rglob("*.h5"))
>>> vpts(file_paths)
Use file name itself as source_file representation in VP file using a custom callable function
>>> def path_to_source(file_path):
...     return Path(file_path).name
>>> vpts(file_paths, source_file=path_to_source)
- vptstools.vpts_to_csv(df, file_path)[source]
Write VP or VPTS to file
- Parameters:
df (pandas.DataFrame) – DataFrame with VP or VPTS data
file_path (Path | str) – File path to store the VPTS file