get-station-data
Easily grab weather station data from around the globe (e.g. GHCN)
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 1 of 7 committers (14.3%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.4%) to scientific vocabulary
Repository
Statistics
- Stars: 28
- Watchers: 1
- Forks: 11
- Open Issues: 1
- Releases: 0
README.md
Get daily weather station data (Global)
A set of Python tools that make it easier to extract weather station data (e.g., temperature, precipitation) from the Global Historical Climatology Network - Daily (GHCNd).
"The Global Historical Climatology Network daily (GHCNd) is an integrated database of daily climate summaries from land surface stations across the globe. GHCNd is made up of daily climate records from numerous sources that have been integrated and subjected to a common suite of quality assurance reviews. GHCNd contains records from more than 100,000 stations in 180 countries and territories. NCEI provides numerous daily variables, including maximum and minimum temperature, total daily precipitation, snowfall, and snow depth. About half the stations only report precipitation. Both record length and period of record vary by station and cover intervals ranging from less than a year to more than 175 years." source
More information on the data can be found here
Installation
- Install from the source code:
  - Clone the repository source code:
    ```bash
    git clone https://github.com/scotthosking/get-station-data.git
    ```
  - Install along with its dependencies:
    ```bash
    cd /path/to/my/get-station-data
    pip install -v -e .
    ```
Worked through example
```python
from get_station_data import ghcnd
from get_station_data.util import nearest_stn

%matplotlib inline
```
Read station metadata
```python
stn_md = ghcnd.get_stn_metadata()
```
Choose a location (lon/lat) and number of nearest neighbours
```python
london_lon_lat = -0.1278, 51.5074
my_stns = nearest_stn(stn_md,
                      london_lon_lat[0], london_lon_lat[1],
                      n_neighbours=5)
my_stns
```
| | station | lat | lon | elev | name |
|---|---|---|---|---|---|
| 52113 | UKE00105915 | 51.5608 | 0.1789 | 137.0 | HAMPSTEAD |
| 52165 | UKM00003772 | 51.4780 | -0.4610 | 25.3 | HEATHROW |
| 52098 | UKE00105900 | 51.8067 | 0.3581 | 128.0 | ROTHAMSTED |
| 52191 | UKW00035054 | 51.2833 | 0.4000 | 91.1 | WEST MALLING |
| 52131 | UKE00107650 | 51.4789 | 0.4489 | 25.0 | HEATHROW |
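To illustrate what a nearest-station lookup involves, here is a minimal standard-library sketch that ranks stations by great-circle (haversine) distance. It is not the package's own `nearest_stn` implementation; the helper names `haversine_km` and `nearest_stations` are hypothetical, and the three records simply reuse coordinates from the table above.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_stations(stations, lon, lat, n_neighbours=5):
    """Return the n_neighbours closest stations, nearest first (hypothetical helper)."""
    ranked = sorted(
        stations,
        key=lambda s: haversine_km(lon, lat, s["lon"], s["lat"]),
    )
    return ranked[:n_neighbours]


# Illustrative records: coordinates copied from the table above.
stations = [
    {"station": "UKW00035054", "name": "WEST MALLING", "lon": 0.4000, "lat": 51.2833},
    {"station": "UKM00003772", "name": "HEATHROW", "lon": -0.4610, "lat": 51.4780},
    {"station": "UKE00105915", "name": "HAMPSTEAD", "lon": 0.1789, "lat": 51.5608},
]

closest = nearest_stations(stations, -0.1278, 51.5074, n_neighbours=2)
print([s["name"] for s in closest])
```

A brute-force sort like this is adequate for a few stations; for the full station list a spatial index would likely be a better fit.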
Download and extract data into a pandas DataFrame
```python
df = ghcnd.get_data(my_stns)

df.head()
```
| | station | year | month | day | element | value | mflag | qflag | sflag | date | lon | lat | elev | name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | UKE00105915 | 1959 | 12 | 1 | TMAX | NaN | | | | 1959-12-01 | 0.1789 | 51.5608 | 137.0 | HAMPSTEAD |
| 1 | UKE00105915 | 1959 | 12 | 2 | TMAX | NaN | | | | 1959-12-02 | 0.1789 | 51.5608 | 137.0 | HAMPSTEAD |
| 2 | UKE00105915 | 1959 | 12 | 3 | TMAX | NaN | | | | 1959-12-03 | 0.1789 | 51.5608 | 137.0 | HAMPSTEAD |
| 3 | UKE00105915 | 1959 | 12 | 4 | TMAX | NaN | | | | 1959-12-04 | 0.1789 | 51.5608 | 137.0 | HAMPSTEAD |
| 4 | UKE00105915 | 1959 | 12 | 5 | TMAX | NaN | | | | 1959-12-05 | 0.1789 | 51.5608 | 137.0 | HAMPSTEAD |
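The columns above mirror the raw GHCN-Daily records, where each line of a station file holds one month of one element. The sketch below shows how such a line could be unpacked, assuming the fixed-width `.dly` layout described in the GHCN-Daily documentation (11-character station ID, year, month, element, then 31 groups of a 5-character value plus three 1-character flags, with `-9999` marking a missing value). It is not the package's own parser; `parse_dly_line` is a hypothetical name and the sample line is hand-built for illustration.

```python
MISSING = -9999  # sentinel used for missing values in the raw files


def parse_dly_line(line):
    """Split one fixed-width monthly record into per-day dictionaries."""
    station = line[0:11]
    year = int(line[11:15])
    month = int(line[15:17])
    element = line[17:21]
    records = []
    for day in range(1, 32):
        start = 21 + (day - 1) * 8      # each day occupies 8 characters
        chunk = line[start:start + 8]
        if len(chunk) < 8:              # tolerate truncated lines
            break
        raw = int(chunk[0:5])
        records.append({
            "station": station, "year": year, "month": month, "day": day,
            "element": element,
            "value": None if raw == MISSING else raw,
            "mflag": chunk[5].strip(),  # measurement flag
            "qflag": chunk[6].strip(),  # quality flag
            "sflag": chunk[7].strip(),  # source flag
        })
    return records


# Hand-built sample line: five days of PRCP in tenths of a millimetre,
# source flag "E", and the remaining 26 days marked as missing.
line = (
    "UKE00105915" + "1960" + "01" + "PRCP"
    + "".join(f"{v:5d}  E" for v in (25, 15, 10, 8, 0))
    + "-9999   " * 26
)

records = parse_dly_line(line)
print(records[0]["value"] / 10, records[0]["sflag"])  # raw tenths of mm converted to mm
```

Raw precipitation and temperature values are stored in tenths of a unit, which is why the sketch divides by 10 to obtain millimetres.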
Filter data for, e.g., a single variable
```python
var = 'PRCP'   # precipitation
df = df[ df['element'] == var ]

# Tidy up columns
df = df.rename(index=str, columns={"value": var})
df = df.drop(['element'], axis=1)

df.head()
```
| | station | year | month | day | PRCP | mflag | qflag | sflag | date | lon | lat | elev | name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 93 | UKE00105915 | 1960 | 1 | 1 | 2.5 | | | E | 1960-01-01 | 0.1789 | 51.5608 | 137.0 | HAMPSTEAD |
| 94 | UKE00105915 | 1960 | 1 | 2 | 1.5 | | | E | 1960-01-02 | 0.1789 | 51.5608 | 137.0 | HAMPSTEAD |
| 95 | UKE00105915 | 1960 | 1 | 3 | 1.0 | | | E | 1960-01-03 | 0.1789 | 51.5608 | 137.0 | HAMPSTEAD |
| 96 | UKE00105915 | 1960 | 1 | 4 | 0.8 | | | E | 1960-01-04 | 0.1789 | 51.5608 | 137.0 | HAMPSTEAD |
| 97 | UKE00105915 | 1960 | 1 | 5 | 0.0 | | | E | 1960-01-05 | 0.1789 | 51.5608 | 137.0 | HAMPSTEAD |
```python
df.drop(columns=['mflag','qflag','sflag']).tail(n=10)
```
| | station | year | month | day | PRCP | date | lon | lat | elev | name |
|---|---|---|---|---|---|---|---|---|---|---|
| 83938 | UKE00107650 | 2016 | 12 | 22 | 0.0 | 2016-12-22 | 0.4489 | 51.4789 | 25.0 | HEATHROW |
| 83939 | UKE00107650 | 2016 | 12 | 23 | 1.4 | 2016-12-23 | 0.4489 | 51.4789 | 25.0 | HEATHROW |
| 83940 | UKE00107650 | 2016 | 12 | 24 | 0.0 | 2016-12-24 | 0.4489 | 51.4789 | 25.0 | HEATHROW |
| 83941 | UKE00107650 | 2016 | 12 | 25 | 1.0 | 2016-12-25 | 0.4489 | 51.4789 | 25.0 | HEATHROW |
| 83942 | UKE00107650 | 2016 | 12 | 26 | 0.0 | 2016-12-26 | 0.4489 | 51.4789 | 25.0 | HEATHROW |
| 83943 | UKE00107650 | 2016 | 12 | 27 | 0.0 | 2016-12-27 | 0.4489 | 51.4789 | 25.0 | HEATHROW |
| 83944 | UKE00107650 | 2016 | 12 | 28 | 0.2 | 2016-12-28 | 0.4489 | 51.4789 | 25.0 | HEATHROW |
| 83945 | UKE00107650 | 2016 | 12 | 29 | 0.4 | 2016-12-29 | 0.4489 | 51.4789 | 25.0 | HEATHROW |
| 83946 | UKE00107650 | 2016 | 12 | 30 | 0.0 | 2016-12-30 | 0.4489 | 51.4789 | 25.0 | HEATHROW |
| 83947 | UKE00107650 | 2016 | 12 | 31 | 0.4 | 2016-12-31 | 0.4489 | 51.4789 | 25.0 | HEATHROW |
Save to file
```python
df.to_csv('London_5stns_GHCN-D.csv', index=False)
```
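Once the data is saved, it can be read back without pandas. The following sketch sums precipitation per station and month using only the standard library. It assumes the CSV has the column names shown in the tables above and that missing values appear as empty fields (the pandas `to_csv` default); the in-memory sample stands in for the real file and carries a reduced set of columns.

```python
import csv
import io
from collections import defaultdict

# In-memory stand-in for 'London_5stns_GHCN-D.csv' (reduced columns,
# rows copied from the table above).
sample_csv = io.StringIO(
    "station,year,month,day,PRCP,date,name\n"
    "UKE00107650,2016,12,29,0.4,2016-12-29,HEATHROW\n"
    "UKE00107650,2016,12,30,0.0,2016-12-30,HEATHROW\n"
    "UKE00107650,2016,12,31,0.4,2016-12-31,HEATHROW\n"
)

totals = defaultdict(float)
for row in csv.DictReader(sample_csv):
    if row["PRCP"] == "":  # skip missing observations
        continue
    key = (row["station"], int(row["year"]), int(row["month"]))
    totals[key] += float(row["PRCP"])

# Round to one decimal place to match the precision of the data.
monthly = {key: round(total, 1) for key, total in totals.items()}
print(monthly)
```

To run this against the saved file, replace `sample_csv` with `open('London_5stns_GHCN-D.csv', newline='')`.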
Plot histogram of all data
```python
df['PRCP'].plot.hist(bins=40)
```

Plot time series for one station
```python
heathrow = df[ df['name'] == 'HEATHROW' ]
heathrow['PRCP'].plot()
```

Owner
- Name: Scott Hosking
- Login: scotthosking
- Kind: user
- Location: Cambridge, UK
- Website: https://scotthosking.com
- Twitter: scotthosking
- Repositories: 3
- Profile: https://github.com/scotthosking
Environmental AI, British Antarctic Survey & The Alan Turing Institute
GitHub Events
Total
- Watch event: 3
Last Year
- Watch event: 3
Committers
Last synced: 6 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Scott Hosking | j****g@g****m | 65 |
| Magnus | m****s@g****m | 14 |
| Tom Andersson | t****d@b****k | 6 |
| Scott Hosking | s****t@S****e | 5 |
| Tom Andersson | t****3@g****m | 4 |
| Scott Hosking | s****t@S****l | 2 |
| Alejandro © | a****c@g****m | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 4
- Total pull requests: 6
- Average time to close issues: 4 months
- Average time to close pull requests: 6 days
- Total issue authors: 3
- Total pull request authors: 4
- Average comments per issue: 3.0
- Average comments per pull request: 1.0
- Merged pull requests: 5
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- manmeet3591 (1)
- scotthosking (1)
- magnusross (1)
- lpeJhrrby (1)
Pull Request Authors
- magnusross (3)
- tom-andersson (1)
- scotthosking (1)
- acocac (1)