EMADDC 3.0 is a newly developed near-real-time processing system that uses algorithms originally designed for R2.2. Several modifications to the code were implemented to support real-time data processing, including deduplication, outlier detection, and various quality control measures. These enhancements have increased the number of observations, particularly for temperature data, while preserving data integrity and quality. Once EMADDC 3.0 is fully operational, future development is planned to further improve data quality and quantity using improved corrections, explore the creation of new products, explore new data delivery methods to reduce delay even further, and provide additional metadata.
The following datasets are now available:
- EMADDC General and Met Office Global Data
  - Dataset Identifier: emaddc_general_and_mo_global_data
  - Dataset Version: 2.0
  - This dataset is available to users with licenses signed with KNMI and the UK Met Office.
- EMADDC Met Office Global Data
  - Dataset Identifier: emaddc_mo_global_data
  - Dataset Version: 2.0
  - This dataset is available to users with a license signed with the UK Met Office.
- EMADDC General Data
  - Dataset Identifier: emaddc_general_data
  - Dataset Version: 2.0
  - This dataset is available to users with a license signed with KNMI.
Access to these datasets is granted based on the signed user license(s). All datasets are now global in scope, and duplicate observations have been removed. Users are therefore strongly advised not to combine data from multiple datasets, as this may reintroduce duplicates. Each dataset also contains all three file formats (CSV, NetCDF, and BUFR), instead of a separate dataset for each file format.
Existing user registrations from the earlier EMADDC datasets have been transferred to the new datasets where possible. Users who previously had access should therefore be able to continue working with the new datasets on KDP without submitting a new access request.
New users who wish to access EMADDC data can request access by contacting opendata@knmi.nl. Users who experience any issues related to access, permissions, or data downloads are also requested to contact this address.
EMADDC 3.0 introduces a revised data delivery structure and updated file formats. The most significant changes apply to the CSV format, while NetCDF and BUFR formats remain broadly aligned with the previous R2.2 structure. Further details on the EMADDC 3.0 data formats are available in the accompanying documentation.
In the EMADDC 3.0 CSV format, variable names have been updated to more descriptive forms. For example, short names such as lat, lon, wspd, and wl_flag have been replaced by latitude, longitude, wind_speed, and whitelist_flag. Users who process EMADDC CSV data operationally are advised to review and update their ingestion workflows accordingly before transitioning to the new format.
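As an illustration of such an ingestion update, the sketch below renames R2.2 short column names to their EMADDC 3.0 equivalents. It covers only the four renames mentioned above; the full mapping should be taken from the format documentation. The names `R22_TO_V3_NAMES` and `rename_columns`, and the sample data, are hypothetical and not part of any EMADDC tool.

```python
import csv
import io

# Only the four renames named in the text above. Any further mappings are
# unknown here and would have to come from the EMADDC 3.0 documentation.
R22_TO_V3_NAMES = {
    "lat": "latitude",
    "lon": "longitude",
    "wspd": "wind_speed",
    "wl_flag": "whitelist_flag",
}


def rename_columns(row: dict) -> dict:
    """Return a copy of a row with R2.2 short names replaced by 3.0 names.

    Column names without a known mapping are passed through unchanged.
    """
    return {R22_TO_V3_NAMES.get(key, key): value for key, value in row.items()}


# Hypothetical two-line sample, not a real EMADDC file. Real files carry
# additional header lines that must be handled before the column name row.
sample = "lat,lon,wspd,wl_flag\n52.1,5.18,12.3,1\n"
rows = [rename_columns(r) for r in csv.DictReader(io.StringIO(sample))]
print(rows[0])
```

A lookup table keeps the mapping in one place, so code written against the new names can still read archived R2.2 files during the transition.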
The CSV header structure has also been revised compared to R2.2. Users should verify that their software correctly handles the updated metadata line, sequence number, obs_id offset information, and the revised column name row. The emaddc-readers Python package will be updated in due course to support these changes.
An additional important update concerns the harmonization of wind speed units. In EMADDC 3.0, all wind speed values are provided in meters per second (m/s). This includes MRAR data, which in R2.2 is incorrectly provided in knots. The previous inconsistency in wind speed units across products has therefore been resolved.
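Users who compare R2.2 and EMADDC 3.0 data during the transition may need to convert R2.2 MRAR wind speeds themselves. The following is a minimal sketch of that conversion, using the standard definition of the knot (one nautical mile of 1852 m per hour). The function names and the `"MRAR"` source label are illustrative assumptions; how the data source is actually encoded in the files should be checked against the format documentation.

```python
# 1 knot = 1 nautical mile (1852 m) per hour = 1852 / 3600 m/s
KNOT_IN_MS = 1852.0 / 3600.0


def knots_to_ms(speed_knots: float) -> float:
    """Convert a wind speed from knots to meters per second."""
    return speed_knots * KNOT_IN_MS


def harmonize_wind_speed(value: float, source: str, is_r22: bool) -> float:
    """Return the wind speed in m/s.

    Assumption: only R2.2 MRAR values are in knots; everything else,
    including all EMADDC 3.0 data, is already in m/s. The "MRAR" label
    is a placeholder for however the source is identified in the data.
    """
    if is_r22 and source == "MRAR":
        return knots_to_ms(value)
    return value


print(round(harmonize_wind_speed(10.0, "MRAR", is_r22=True), 3))
print(harmonize_wind_speed(10.0, "MRAR", is_r22=False))
```

Applying the conversion to EMADDC 3.0 data would introduce an error of roughly a factor of two, so the conversion is guarded by an explicit R2.2 flag.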
EMADDC 3.0 no longer differentiates between "fast" and "regular" files. It now generates files at five-minute intervals, with an approximate delay of one and a half minutes, so data generally becomes accessible at most six and a half minutes after the start of the time window. This is a substantial reduction compared with the current "fast" files provided by R2.2. When data arrives late, new files may be uploaded for the same time window with an increased Sequence Number, as described in section R2.2. Additionally, EMADDC 3.0 appends the Sequence Number to the filename when applicable, which prevents files from being overwritten. In these cases, users are encouraged to select the appropriate file for use in their systems to prevent duplication.
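One possible way to avoid ingesting the same time window twice is to keep only the file with the highest Sequence Number per window. The sketch below assumes the window start and Sequence Number have already been extracted from each file (for example from the metadata line); the `ProductFile` type, the filenames, and the convention that the first file has Sequence Number 0 are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ProductFile:
    """Hypothetical record describing one downloaded file."""
    window_start: datetime
    sequence_number: int
    filename: str


def latest_per_window(files):
    """Keep, for each time window, only the file with the highest sequence number."""
    latest = {}
    for f in files:
        current = latest.get(f.window_start)
        if current is None or f.sequence_number > current.sequence_number:
            latest[f.window_start] = f
    # Return the selected files ordered by window start time.
    return [latest[start] for start in sorted(latest)]


# Placeholder filenames; the real naming scheme is defined by EMADDC.
files = [
    ProductFile(datetime(2025, 1, 1, 12, 0), 0, "window_1200"),
    ProductFile(datetime(2025, 1, 1, 12, 0), 1, "window_1200_seq1"),
    ProductFile(datetime(2025, 1, 1, 12, 5), 0, "window_1205"),
]
selected = latest_per_window(files)
print([f.filename for f in selected])
```

Whether the highest Sequence Number is always the appropriate choice depends on the use case; systems that have already processed an earlier file for a window may prefer to replace its observations rather than append.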
NetCDF and BUFR outputs remain largely consistent with the earlier R2.2 formats. Nevertheless, users are encouraged to review the updated format documentation prior to operational use of the new datasets. Note that BUFR is supported for the time being and is planned to be replaced by a conversion tool in the future. The NetCDF format is also expected to be updated to become CF- and WIS 2.0 compliant.
For downloading data from KDP, users are strongly encouraged to use the official EMADDC KDP download client available at: https://gitlab.com/KNMI-OSS/emaddc/kdp-notification-client. Existing users of the tool can reuse their existing logic and only need to change the dataset details. Since all three file formats are now incorporated into a single dataset, users may need to filter their downloaded files by format. This filtering functionality will be implemented by EMADDC at a later stage.
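Until format filtering is available, a local filter on the downloaded filenames may be sufficient. The sketch below selects files by extension; the extensions `.csv`, `.nc`, and `.bufr`, the example filenames, and the function name are assumptions and should be checked against the actual filenames in the dataset.

```python
from pathlib import Path

# Assumed extensions per format; verify against the real EMADDC filenames.
FORMAT_SUFFIXES = {
    "csv": {".csv"},
    "netcdf": {".nc"},
    "bufr": {".bufr"},
}


def filter_by_format(filenames, wanted: str):
    """Return only the filenames whose extension matches the wanted format."""
    suffixes = FORMAT_SUFFIXES[wanted]
    return [name for name in filenames if Path(name).suffix.lower() in suffixes]


# Placeholder filenames for illustration only.
downloaded = ["obs_a.csv", "obs_a.nc", "obs_a.bufr", "obs_b.CSV"]
print(filter_by_format(downloaded, "csv"))
```

Filtering before download, for example inside the notification handler, would save bandwidth; filtering after download as shown here is simpler but transfers all three formats.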
This download script has recently been updated to support a larger SessionExpirationInterval, in line with recent changes on KDP. The session expiry period has been extended from one hour to one day, improving reliability for long-running and large-volume downloads. Users are advised to ensure they are using the latest version of the script. Users who perform a large number of downloads should also request bulk access by contacting opendata@knmi.nl.
Users are encouraged to test the EMADDC 3.0 datasets on KDP before fully transitioning production workflows, particularly where CSV processing or wind speed variables are involved.
EMADDC looks forward to receiving feedback on this initial phase of the transition from R2.2 to EMADDC 3.0. Updates regarding the next steps in this process, including the operationalization of 3.0 and the discontinuation of R2.2, will be communicated in a timely manner.