easy-era5-trck

A super lightweight Lagrangian model for calculating millions of trajectories using ERA5 data

https://github.com/lzhenn/easy-era5-trck

Science Score: 39.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.2%) to scientific vocabulary

Keywords

era5 lagrangian multiprocessing python trajectory
Last synced: 6 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: lzhenn
  • License: mit
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 35.3 MB
Statistics
  • Stars: 41
  • Watchers: 2
  • Forks: 13
  • Open Issues: 0
  • Releases: 1
Created over 5 years ago · Last pushed 6 months ago
Metadata Files
Readme License

ReadMe.md

Easy-ERA5-Trck

Disclaimer: The open-source tool utilizes a trajectory calculation methodology based on nearest-neighbor interpolation and first-guess velocity estimation to optimize computational efficiency. This tool does not account for atmospheric diffusion or convection processes. No warranties, express or implied, are provided regarding the accuracy, reliability, or suitability of the results generated by this tool. The tool is not intended for scientific or professional applications unless the user independently verifies the results through rigorous validation procedures.

For accurate trajectory calculations, users are directed to consult peer-reviewed methodologies, such as those described in the publication available at http://journals.ametsoc.org/doi/abs/10.1175/BAMS-D-14-00110.1, or to employ established professional models, such as NOAA HYSPLIT. The developers and contributors of this tool disclaim any liability for damages, losses, or consequences arising from the use of this tool or reliance on its outputs.

Easy-ERA5-Trck is a super lightweight Lagrangian model for calculating thousands (even millions) of trajectories simultaneously and efficiently using ERA5 data sets. It implements highly simplified equations of 3-D motion to accelerate integration, and uses Python multiprocessing to parallelize the integration tasks. Thanks to this simplification and parallelization, Easy-ERA5-Trck traces massive numbers of air parcels quickly, which makes area-wide tracing possible.
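To make the "simplified equations of motion" concrete, here is a minimal sketch of one first-order (Euler) step using nearest-neighbor wind lookup, as mentioned in the disclaimer. It is not the code in ./core/lagrange.py: the function names, the `field[level][lat][lon]` layout, and the spherical-Earth conversion are assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6.371e6  # mean Earth radius (assumed spherical Earth)


def nearest_index(coords, value):
    """Return the index of the grid coordinate closest to value."""
    return min(range(len(coords)), key=lambda i: abs(coords[i] - value))


def euler_step(lat, lon, p_hpa, lats, lons, levs, u, v, w, dt_s, direction=1):
    """Advance one parcel by one first-order step.

    u and v are in m/s, w is in Pa/s, all indexed as field[k][j][i] for
    (level, latitude, longitude). direction is 1 (forward) or -1 (backward).
    """
    k = nearest_index(levs, p_hpa)
    j = nearest_index(lats, lat)
    i = nearest_index(lons, lon)
    dt = direction * dt_s
    # Convert the distance travelled (m) into degrees on the sphere.
    dlat = math.degrees(v[k][j][i] * dt / EARTH_RADIUS_M)
    dlon = math.degrees(
        u[k][j][i] * dt / (EARTH_RADIUS_M * math.cos(math.radians(lat)))
    )
    dp_hpa = w[k][j][i] * dt / 100.0  # Pa -> hPa
    return lat + dlat, lon + dlon, p_hpa + dp_hpa
```

Calling `euler_step(..., direction=-1)` walks the parcel backward in time with the same winds. The sketch ignores the poles (where `cos(lat)` approaches zero) and domain boundaries.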

Another version using WRF output to drive the model can be found here.

If you have any questions, please contact Zhenning LI (zhenningli91@gmail.com).

Galleries

Tibetan Plateau Air Source Tracers

tp_tracer

Tibetan Plateau Air Source Tracers (3D)

tp_tracer_3d

Install

If you wish to run easy-era5-trck using grib2 data, please install ecCodes first.

Please install python3 using the Anaconda3 distribution. Anaconda3 with python3.8 has been fully tested; lower versions of python3 may also work but are untested.

We recommend creating a new environment in Anaconda and installing the packages listed in requirements.txt:

```bash
conda create -n test_era5trck python=3.8
conda activate test_era5trck
pip install -r requirements.txt
```

If everything goes smoothly, first cd to the repo root path, and run config.py:

```bash
python3 config.py
```

This writes the fundamental configuration parameters to ./conf/config_sys.ini.

Usage

test case

Once the package is installed, you may first want to try the test case. config.ini has been set up for the test case, which is a very simple run:

```python
[INPUT]
input_era5_case = ./testcase/
input_parcel_file=./input/input.csv

[CORE]
# timestep in min
time_step = 30
precession = 1-order
# 1 for forward, -1 for backward
forward_option = -1
# for forward, this is the initial time; otherwise, terminating time
start_ymdh = 2015080212
# integration length in hours
integration_length = 24
# how many processors are willing to work for you
ntasks = 4
# not used yet
boundary_check = False

[OUTPUT]
# output format, nc/csv, nc recommended for large-scale tracing
out_fmt = nc
out_prefix = testcase
# output frequency in min
out_frq = 60
# when out_fmt=csv, how many parcel tracks will be organized in a csv file.
sep_num = 5000
```

When you type `python3 run.py`, Easy-ERA5-Trck reads the above configuration and imports the ERA5 UVW data in `./testcase` to drive the Lagrangian integration.
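As a quick sanity check of the [CORE] and [OUTPUT] settings, the sketch below parses a trimmed copy of the test-case values with Python's standard `configparser` and derives the number of integration steps and output frames. The `summarize` helper is hypothetical, and the `+ 1` (counting the starting position as an output frame) is an assumption, not something the tool documents.

```python
import configparser

# Trimmed copy of the test-case settings (only keys used below).
SAMPLE = """
[CORE]
# timestep in min
time_step = 30
# 1 for forward, -1 for backward
forward_option = -1
start_ymdh = 2015080212
# integration length in hours
integration_length = 24
ntasks = 4

[OUTPUT]
# output frequency in min
out_frq = 60
"""


def summarize(cfg_text):
    """Derive a few run properties from the INI text."""
    cfg = configparser.ConfigParser()
    cfg.read_string(cfg_text)
    core, out = cfg["CORE"], cfg["OUTPUT"]
    minutes = core.getint("integration_length") * 60
    return {
        "direction": "forward" if core.getint("forward_option") == 1 else "backward",
        "n_steps": minutes // core.getint("time_step"),
        # Assumption: the starting position is written as the first frame.
        "n_outputs": minutes // out.getint("out_frq") + 1,
        "ntasks": core.getint("ntasks"),
    }


print(summarize(SAMPLE))
```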

Now you will see your workers are dedicated to tracing the air parcels. After several seconds, if you see something like:

```bash
2021-05-31 17:32:14,015 - INFO : All subprocesses done.
2021-05-31 17:32:14,015 - INFO : Output...
2021-05-31 17:32:14,307 - INFO : Easy ERA5 Track Completed Successfully!
```

Congratulations! The testcase works smoothly on your machine!
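The ntasks setting decides how many worker processes share the parcels. A minimal sketch of that idea with the standard `multiprocessing.Pool` follows; `split_parcels`, `trace_chunk`, and `run_parallel` are hypothetical names, and the worker here is only a placeholder for the real integration.

```python
from multiprocessing import Pool


def split_parcels(parcel_ids, ntasks):
    """Split parcel ids into ntasks contiguous, nearly equal chunks."""
    size, extra = divmod(len(parcel_ids), ntasks)
    chunks, start = [], 0
    for rank in range(ntasks):
        stop = start + size + (1 if rank < extra else 0)
        chunks.append(parcel_ids[start:stop])
        start = stop
    return chunks


def trace_chunk(chunk):
    """Placeholder worker: the real model would integrate each parcel here."""
    return [(parcel_id, "traced") for parcel_id in chunk]


def run_parallel(parcel_ids, ntasks):
    """Trace all parcels using ntasks worker processes."""
    with Pool(processes=ntasks) as pool:
        parts = pool.map(trace_chunk, split_parcels(parcel_ids, ntasks))
    # Flatten the per-worker results back into one list.
    return [item for part in parts for item in part]
```

When running this as a script, call `run_parallel(list(range(413)), 4)` from inside an `if __name__ == "__main__":` guard so that worker processes can be started safely on every platform.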

Now you can check the output file in ./output, named testcase.I20150802120000.E20150801120000.nc|csv, which encodes the initial time (I) and the ending time (E). For backward tracing, I > E, and vice versa.
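The file name can be reproduced from the test-case settings (start_ymdh = 2015080212, integration_length = 24, forward_option = -1). The helper below is a hypothetical sketch of that naming rule, inferred from the example name above rather than taken from the source code.

```python
from datetime import datetime, timedelta


def output_name(prefix, start_ymdh, integration_length_hr, forward_option, fmt):
    """Build '<prefix>.I<init>.E<end>.<fmt>' from the run settings."""
    t_init = datetime.strptime(start_ymdh, "%Y%m%d%H")
    # forward_option is 1 (forward) or -1 (backward), so it sets the sign.
    t_end = t_init + timedelta(hours=forward_option * integration_length_hr)
    stamp = "%Y%m%d%H%M%S"
    return f"{prefix}.I{t_init.strftime(stamp)}.E{t_end.strftime(stamp)}.{fmt}"


print(output_name("testcase", "2015080212", 24, -1, "nc"))
```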

You can write the output as plain ASCII csv or as netCDF (recommended). The netCDF output metadata looks like:

```bash
{
dimensions:
    time = 121 ;
    parcel_id = 413 ;
variables:
    double xlat(time, parcel_id) ;
        xlat:_FillValue = NaN ;
    double xlon(time, parcel_id) ;
        xlon:_FillValue = NaN ;
    double xh(time, parcel_id) ;
        xh:_FillValue = NaN ;
    int64 time(time) ;
        time:units = "hours since 1998-06-10 00:00:00" ;
        time:calendar = "proleptic_gregorian" ;
    int64 parcel_id(parcel_id) ;
}
```

setup your case

Congratulations! After successfully running the toy case, you are probably eager to set up your own case. First, build your own case directory, for example, in the repo root dir:

```bash
mkdir mycase
```

Now please make sure you have configured the ECMWF CDS API correctly, both in your shell environment and in the Python interface.

Next, set the [DOWNLOAD] section in config.ini to fit your desired period, levels, and region for downloading.

```python
[DOWNLOAD]
store_path=./mycase/
start_ymd = 20151220
end_ymd = 20160101
pres=[700, 750, 800, 850, 900, 925, 950, 975, 1000]

# area: [North, West, South, East]
area=[-10, 0, -90, 360]

# data frame frequency: recommend 1, 2, 3, 6.
# lower frequency will download faster but less accurate in tracing
freq_hr=3
```

Here we download 1000-700 hPa UVW data from ERA5 CDS, from 20151220 to 20160101, at 3-hr temporal frequency.

./utils/getERA5-UVW.py will help you download the ERA5 reanalysis data for your case, as daily files with freq_hr temporal frequency.

```bash
cd utils
python3 getERA5-UVW.py
```
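For orientation, a daily request of this kind could be assembled roughly as follows. This is a sketch, not the contents of getERA5-UVW.py: `build_request`, `download_day`, and the output file name are hypothetical, and the request keys follow the commonly used `reanalysis-era5-pressure-levels` form of the `cdsapi` client, which may differ between CDS API versions (for example, newer versions use `data_format` instead of `format`).

```python
def build_request(ymd, pres, area, freq_hr):
    """Build a one-day request dict for U, V, W on pressure levels."""
    return {
        "product_type": "reanalysis",
        "variable": [
            "u_component_of_wind",
            "v_component_of_wind",
            "vertical_velocity",
        ],
        "pressure_level": [str(p) for p in pres],
        "year": ymd[:4],
        "month": ymd[4:6],
        "day": ymd[6:8],
        # One entry every freq_hr hours, e.g. 00:00, 03:00, ...
        "time": [f"{hour:02d}:00" for hour in range(0, 24, freq_hr)],
        "area": area,  # [North, West, South, East]
        "format": "grib",
    }


def download_day(ymd, pres, area, freq_hr, store_path):
    """Download one day of data; needs a configured CDS API key."""
    import cdsapi  # imported here so build_request works without cdsapi

    client = cdsapi.Client()
    client.retrieve(
        "reanalysis-era5-pressure-levels",
        build_request(ymd, pres, area, freq_hr),
        f"{store_path}/{ymd}.grib",  # hypothetical file name
    )
```

Building the request in a separate function makes it easy to print and inspect before any download starts.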

While the machine is downloading your data, you may want to determine the destinations or initial points of your targeted air parcels.

./input/input.csv: This file is the default file prescribing the air parcels for trajectory simulation. Alternatively, you can assign another file with input_parcel_file in config.ini.

The format of this file:

`airp_id, init_lat, init_lon, init_h0 (hPa)`

For forward trajectories, init_{lat|lon|h0} denote the initial positions; for backward trajectories, they indicate the ending positions. You can write the file yourself. Alternatively, the utility `./utils/takebox_grid.py` will help you take air parcels in a rectangular domain.
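If you prefer to write the parcel file yourself, a sketch like the one below generates a regular latitude/longitude box at one pressure level using only the standard library. The function names are hypothetical, and whether the real file carries a header row is an assumption; compare with the shipped ./input/input.csv before using it.

```python
import csv
import io


def box_parcels(south, north, west, east, step_deg, pres_hpa):
    """Return (airp_id, init_lat, init_lon, init_h0) rows on a regular box."""
    n_lat = int(round((north - south) / step_deg)) + 1
    n_lon = int(round((east - west) / step_deg)) + 1
    rows = []
    for j in range(n_lat):
        for i in range(n_lon):
            rows.append(
                (len(rows), south + j * step_deg, west + i * step_deg, pres_hpa)
            )
    return rows


def write_parcels(rows, handle):
    """Write the rows as csv; the header row is an assumption."""
    writer = csv.writer(handle)
    writer.writerow(["airp_id", "init_lat", "init_lon", "init_h0"])
    writer.writerows(rows)


buffer = io.StringIO()
write_parcels(box_parcels(20.0, 22.0, 100.0, 101.0, 1.0, 850), buffer)
print(buffer.getvalue())
```

To write a real file, pass `open("./input/input.csv", "w", newline="")` as the handle instead of the in-memory buffer.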

Please also set the other sections in config.ini accordingly. These air parcels are now waiting for your command python3 run.py to travel the world!

In addition, ./utils/control_multi_run.py will help you run multiple series of simulations. There are some postprocessing scripts for visualization in post_process; you may need to modify them to fit your visualization needs.

Repository Structure

run.py

./run.py: Main script to run the Easy-ERA5-Trck.

conf

  • ./conf/config.ini: Configuration file for the model. You may set the ERA5 input file, input frequency, integration time steps, and other settings in this file.
  • ./conf/config_sys.ini: Configuration file for the system, generated by running config.py.
  • ./conf/logging_config.ini: Configuration file for the logging module.

core

  • ./core/lagrange.py: Core module for calculating the air parcels Lagrangian trajectories.

lib

  • ./lib/cfgparser.py: Module file containing read/write method of the config.ini
  • ./lib/air_parcel.py: Module file containing definition of air parcel class and related methods such as march and output.
  • ./lib/preprocess_era5inp.py: Module file that defines the field_hdl class, which contains useful fields data (U, V, W...) and related method, including ERA5 grib file IO operations.
  • ./lib/utils.py: utility functions for the model.

post_process

Some visualization scripts.

utils

Utils for downloading, generating input.csv, etc.

Version iteration

Oct 28, 2020

  • Fundamental pipeline design, multiprocessing, and I/O.
  • MVP v0.01

May 31, 2021

  • Major Revision, logging module, and exception treatment
  • test case
  • Major documentation update
  • Utility for data downloading
  • Utility for taking grids in a box
  • Basic functions done, v0.10

Jun 09, 2021

  • The automatic detection of longitude range is added, allowing users to adopt two different ranges of longitude: [-180°, 180°] or [0°, 360°].
  • Currently, if you want to use the [-180°, 180°] data version, you can only set ntasks = 1 in the config.ini file.

Oct 19, 2021

  • Modify requirements.txt to fit updated version of libs.

Jun 24, 2022

  • Add Administrative Grid and sparse matrix match utils: ./utils/assign_nodes_to_city.py and ./utils/assign_sparse_nodes.py.

Owner

  • Name: Zhenning Li
  • Login: lzhenn
  • Kind: user
  • Location: Hong Kong
  • Company: HKUST

Wind extinguishes a candle but energizes fire.

GitHub Events

Total
  • Watch event: 2
  • Push event: 2
  • Fork event: 1
Last Year
  • Watch event: 2
  • Push event: 2
  • Fork event: 1

Committers

Last synced: 6 months ago

All Time
  • Total Commits: 42
  • Total Committers: 2
  • Avg Commits per committer: 21.0
  • Development Distribution Score (DDS): 0.048
Past Year
  • Commits: 2
  • Committers: 1
  • Avg Commits per committer: 2.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
novarizark n****k@g****m 40
Junbin Wang 4****g@u****m 2

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 3
  • Average time to close issues: N/A
  • Average time to close pull requests: 1 day
  • Total issue authors: 0
  • Total pull request authors: 2
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • WanJubing (2)
  • lzhenn (1)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

requirements.txt pypi
  • attrs >=21.2.0
  • certifi >=2020.12.5
  • cffi >=1.14.5
  • cfgrib >=0.9.9.0
  • cftime >=1.4.1
  • click >=8.0.1
  • eccodes >=1.3.2
  • findlibs >=0.0.2
  • netCDF4 >=1.5.6
  • numpy >=1.20.3
  • pandas >=1.2.4
  • pycparser >=2.20
  • python-dateutil >=2.8.1
  • pytz >=2021.1
  • six >=1.16.0
  • xarray >=0.17.0