get-station-data

Easily grab weather station data from around the globe (e.g. GHCN)

https://github.com/scotthosking/get-station-data

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    1 of 7 committers (14.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.4%) to scientific vocabulary

Keywords

python
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: scotthosking
  • License: mit
  • Language: Jupyter Notebook
  • Default Branch: main
  • Homepage:
  • Size: 1.03 MB
Statistics
  • Stars: 28
  • Watchers: 1
  • Forks: 11
  • Open Issues: 1
  • Releases: 0
Topics
python
Created about 9 years ago · Last pushed over 2 years ago
Metadata Files
Readme License

README.md

Get daily weather station data (Global)

A set of Python tools that make it easier to extract weather station data (e.g., temperature, precipitation) from the Global Historical Climatology Network daily (GHCNd) database.

"The Global Historical Climatology Network daily (GHCNd) is an integrated database of daily climate summaries from land surface stations across the globe. GHCNd is made up of daily climate records from numerous sources that have been integrated and subjected to a common suite of quality assurance reviews. GHCNd contains records from more than 100,000 stations in 180 countries and territories. NCEI provides numerous daily variables, including maximum and minimum temperature, total daily precipitation, snowfall, and snow depth. About half the stations only report precipitation. Both record length and period of record vary by station and cover intervals ranging from less than a year to more than 175 years." (Source: NCEI)

More information on the data can be found here

Installation

  1. Install from the source code:
  • Clone the repository source code:

```bash
git clone https://github.com/scotthosking/get-station-data.git
```

  • Install along with its dependencies:

```bash
cd /path/to/my/get-station-data
pip install -v -e .
```

Worked through example

```python
from get_station_data import ghcnd
from get_station_data.util import nearest_stn

%matplotlib inline
```

Read station metadata

```python
stn_md = ghcnd.get_stn_metadata()
```

Choose a location (lon/lat) and number of nearest neighbours

```python
london_lon_lat = -0.1278, 51.5074
my_stns = nearest_stn(stn_md,
                      london_lon_lat[0], london_lon_lat[1],
                      n_neighbours=5 )
my_stns
```

|  | station | lat | lon | elev | name |
|---|---|---|---|---|---|
| 52113 | UKE00105915 | 51.5608 | 0.1789 | 137.0 | HAMPSTEAD |
| 52165 | UKM00003772 | 51.4780 | -0.4610 | 25.3 | HEATHROW |
| 52098 | UKE00105900 | 51.8067 | 0.3581 | 128.0 | ROTHAMSTED |
| 52191 | UKW00035054 | 51.2833 | 0.4000 | 91.1 | WEST MALLING |
| 52131 | UKE00107650 | 51.4789 | 0.4489 | 25.0 | HEATHROW |
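
To make the nearest-neighbour step concrete, here is a minimal sketch of one way such a lookup could work, using the haversine great-circle distance. It is not the implementation behind `nearest_stn`; the helper names `haversine_km` and `nearest_by_haversine` are hypothetical, and the station coordinates are copied exactly as printed in the table above.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (an approximation)


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_by_haversine(stations, lon, lat, n_neighbours=5):
    """Return the n closest (station_id, distance_km) pairs, nearest first.

    Hypothetical helper for illustration; not part of get_station_data.
    """
    distances = [(stn_id, haversine_km(lon, lat, stn_lon, stn_lat))
                 for stn_id, stn_lat, stn_lon in stations]
    return sorted(distances, key=lambda pair: pair[1])[:n_neighbours]


# (station, lat, lon) as printed in the table above
stations = [
    ("UKE00105915", 51.5608, 0.1789),
    ("UKM00003772", 51.4780, -0.4610),
    ("UKE00105900", 51.8067, 0.3581),
    ("UKW00035054", 51.2833, 0.4000),
    ("UKE00107650", 51.4789, 0.4489),
]

london_lon_lat = -0.1278, 51.5074
nearest = nearest_by_haversine(stations, london_lon_lat[0], london_lon_lat[1],
                               n_neighbours=3)
for stn_id, dist_km in nearest:
    print(f"{stn_id}: {dist_km:.1f} km")
```

A brute-force scan like this is fine for a handful of stations; for the full station list a spatial index would likely be a better fit.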

Download and extract data into a pandas DataFrame

```python
df = ghcnd.get_data(my_stns)

df.head()
```

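|  | station | year | month | day | element | value | mflag | qflag | sflag | date | lon | lat | elev | name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | UKE00105915 | 1959 | 12 | 1 | TMAX | NaN |  |  |  | 1959-12-01 | 0.1789 | 51.5608 | 137.0 | HAMPSTEAD |
| 1 | UKE00105915 | 1959 | 12 | 2 | TMAX | NaN |  |  |  | 1959-12-02 | 0.1789 | 51.5608 | 137.0 | HAMPSTEAD |
| 2 | UKE00105915 | 1959 | 12 | 3 | TMAX | NaN |  |  |  | 1959-12-03 | 0.1789 | 51.5608 | 137.0 | HAMPSTEAD |
| 3 | UKE00105915 | 1959 | 12 | 4 | TMAX | NaN |  |  |  | 1959-12-04 | 0.1789 | 51.5608 | 137.0 | HAMPSTEAD |
| 4 | UKE00105915 | 1959 | 12 | 5 | TMAX | NaN |  |  |  | 1959-12-05 | 0.1789 | 51.5608 | 137.0 | HAMPSTEAD |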
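
The columns above mirror the fixed-width records of the GHCN-Daily `.dly` station files, where each line holds one station, month, and element followed by 31 daily value/flag groups. The sketch below shows how one such record could be unpacked into daily rows. It assumes the layout described in the GHCN-Daily readme (11-character ID, 4-digit year, 2-digit month, 4-character element, then 31 groups of a 5-character value plus `mflag`, `qflag`, and `sflag`, with `-9999` marking missing values). It is not this package's own parser: `parse_dly_line` and `make_field` are hypothetical helpers, and the sample line is synthetic, built to resemble a precipitation record.

```python
MISSING = -9999
# Elements stored in tenths of a unit (tenths of mm or tenths of degrees C)
TENTHS_ELEMENTS = {"PRCP", "TMAX", "TMIN"}


def parse_dly_line(line):
    """Unpack one fixed-width .dly record into a list of daily rows (dicts)."""
    station = line[0:11]
    year = int(line[11:15])
    month = int(line[15:17])
    element = line[17:21]
    rows = []
    for day in range(1, 32):
        start = 21 + (day - 1) * 8          # each daily group is 8 characters
        field = line[start:start + 8]
        raw = int(field[0:5])
        if raw == MISSING:
            value = None
        elif element in TENTHS_ELEMENTS:
            value = raw / 10.0              # convert tenths to whole units
        else:
            value = raw
        rows.append({
            "station": station, "year": year, "month": month, "day": day,
            "element": element, "value": value,
            "mflag": field[5].strip(),
            "qflag": field[6].strip(),
            "sflag": field[7].strip(),
        })
    return rows


def make_field(value, mflag=" ", qflag=" ", sflag=" "):
    """Build one 8-character value/flag group (used only for the synthetic sample)."""
    return f"{value:5d}{mflag}{qflag}{sflag}"


# Synthetic record: day 1 has 25 tenths of mm with source flag 'E', the rest are missing
sample = ("UKE00105915" + "1960" + "01" + "PRCP"
          + make_field(25, sflag="E") + make_field(MISSING) * 30)

rows = parse_dly_line(sample)
print(rows[0])
```

Records always carry 31 daily groups, so days that do not exist in a given month appear as missing values and would need to be dropped when building dates.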

Filter data for, e.g., a single variable

```python
var = 'PRCP'   # precipitation
df = df[ df['element'] == var ]

### Tidy up columns
df = df.rename(index=str, columns={"value": var})
df = df.drop(['element'], axis=1)

df.head()
```

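|  | station | year | month | day | PRCP | mflag | qflag | sflag | date | lon | lat | elev | name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 93 | UKE00105915 | 1960 | 1 | 1 | 2.5 |  |  | E | 1960-01-01 | 0.1789 | 51.5608 | 137.0 | HAMPSTEAD |
| 94 | UKE00105915 | 1960 | 1 | 2 | 1.5 |  |  | E | 1960-01-02 | 0.1789 | 51.5608 | 137.0 | HAMPSTEAD |
| 95 | UKE00105915 | 1960 | 1 | 3 | 1.0 |  |  | E | 1960-01-03 | 0.1789 | 51.5608 | 137.0 | HAMPSTEAD |
| 96 | UKE00105915 | 1960 | 1 | 4 | 0.8 |  |  | E | 1960-01-04 | 0.1789 | 51.5608 | 137.0 | HAMPSTEAD |
| 97 | UKE00105915 | 1960 | 1 | 5 | 0.0 |  |  | E | 1960-01-05 | 0.1789 | 51.5608 | 137.0 | HAMPSTEAD |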

```python
df.drop(columns=['mflag','qflag','sflag']).tail(n=10)
```

|  | station | year | month | day | PRCP | date | lon | lat | elev | name |
|---|---|---|---|---|---|---|---|---|---|---|
| 83938 | UKE00107650 | 2016 | 12 | 22 | 0.0 | 2016-12-22 | 0.4489 | 51.4789 | 25.0 | HEATHROW |
| 83939 | UKE00107650 | 2016 | 12 | 23 | 1.4 | 2016-12-23 | 0.4489 | 51.4789 | 25.0 | HEATHROW |
| 83940 | UKE00107650 | 2016 | 12 | 24 | 0.0 | 2016-12-24 | 0.4489 | 51.4789 | 25.0 | HEATHROW |
| 83941 | UKE00107650 | 2016 | 12 | 25 | 1.0 | 2016-12-25 | 0.4489 | 51.4789 | 25.0 | HEATHROW |
| 83942 | UKE00107650 | 2016 | 12 | 26 | 0.0 | 2016-12-26 | 0.4489 | 51.4789 | 25.0 | HEATHROW |
| 83943 | UKE00107650 | 2016 | 12 | 27 | 0.0 | 2016-12-27 | 0.4489 | 51.4789 | 25.0 | HEATHROW |
| 83944 | UKE00107650 | 2016 | 12 | 28 | 0.2 | 2016-12-28 | 0.4489 | 51.4789 | 25.0 | HEATHROW |
| 83945 | UKE00107650 | 2016 | 12 | 29 | 0.4 | 2016-12-29 | 0.4489 | 51.4789 | 25.0 | HEATHROW |
| 83946 | UKE00107650 | 2016 | 12 | 30 | 0.0 | 2016-12-30 | 0.4489 | 51.4789 | 25.0 | HEATHROW |
| 83947 | UKE00107650 | 2016 | 12 | 31 | 0.4 | 2016-12-31 | 0.4489 | 51.4789 | 25.0 | HEATHROW |

Save to file

```python
df.to_csv('London_5stns_GHCN-D.csv', index=False)
```

Plot histogram of all data

```python
df['PRCP'].plot.hist(bins=40)
```

Plot time series for one station

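```python
# Note: two of the selected stations share the name 'HEATHROW',
# so this filter returns rows from both of them
heathrow = df[ df['name'] == 'HEATHROW' ]
heathrow['PRCP'].plot()
```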

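
Because two of the five stations are named HEATHROW, selecting by station ID and indexing by date gives an unambiguous series for a single station. The following is a small sketch of that idea on a toy DataFrame shaped like the one above. The rows for `UKE00107650` are taken from the table printed earlier, while the values for `UKM00003772` are made-up placeholders used only to show the name clash.

```python
import pandas as pd

# Toy data shaped like the README's DataFrame.
# UKE00107650 values come from the table above; UKM00003772 values are invented.
df = pd.DataFrame({
    "station": ["UKM00003772", "UKM00003772", "UKE00107650", "UKE00107650"],
    "name": ["HEATHROW"] * 4,
    "date": ["2016-12-30", "2016-12-31", "2016-12-30", "2016-12-31"],
    "PRCP": [0.2, 0.6, 0.0, 0.4],
})
df["date"] = pd.to_datetime(df["date"])

# Filtering by name matches both stations
by_name = df[df["name"] == "HEATHROW"]

# Filtering by station ID keeps exactly one station; index by date for time-series work
one_station = (df[df["station"] == "UKE00107650"]
               .set_index("date")
               .sort_index())

# Monthly precipitation totals, labelled by the first day of each month
monthly_total = one_station["PRCP"].resample("MS").sum()
print(monthly_total)
```

Calling `one_station["PRCP"].plot()` on the date-indexed series would then put dates on the x-axis rather than row numbers.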

Owner

  • Name: Scott Hosking
  • Login: scotthosking
  • Kind: user
  • Location: Cambridge, UK

Environmental AI, British Antarctic Survey & The Alan Turing Institute

GitHub Events

Total
  • Watch event: 3
Last Year
  • Watch event: 3

Committers

All Time
  • Total Commits: 97
  • Total Committers: 7
  • Avg Commits per committer: 13.857
  • Development Distribution Score (DDS): 0.33
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Scott Hosking j****g@g****m 65
Magnus m****s@g****m 14
Tom Andersson t****d@b****k 6
Scott Hosking s****t@S****e 5
Tom Andersson t****3@g****m 4
Scott Hosking s****t@S****l 2
Alejandro © a****c@g****m 1

Issues and Pull Requests

All Time
  • Total issues: 4
  • Total pull requests: 6
  • Average time to close issues: 4 months
  • Average time to close pull requests: 6 days
  • Total issue authors: 3
  • Total pull request authors: 4
  • Average comments per issue: 3.0
  • Average comments per pull request: 1.0
  • Merged pull requests: 5
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • manmeet3591 (1)
  • scotthosking (1)
  • magnusross (1)
  • lpeJhrrby (1)
Pull Request Authors
  • magnusross (3)
  • tom-andersson (1)
  • scotthosking (1)
  • acocac (1)