georelate
Repository
Basic Info
- Host: GitHub
- Owner: edkrueger
- License: mit
- Language: Python
- Default Branch: main
- Size: 147 KB
Statistics
- Stars: 2
- Watchers: 2
- Forks: 1
- Open Issues: 5
- Releases: 0
Metadata Files
README.md
GEORELATE
Georelate constructs design matrices from geographical data.
Introduction
Along with an explosion in the availability of granular data, researchers in the social sciences, public health and demography are increasingly interested in using location data to identify diverse spatial treatment effects, from the local health benefits of new hospitals to the economic costs of natural disasters. Yet spatial data can only be as useful as it is easy to use, and working with it can require sophisticated techniques. Enter GEORELATE. To facilitate spatial analysis, GEORELATE efficiently computes millions of distance pairs on a single CPU core using only Python, NumPy and Pandas. This User Guide explains the package’s functionality, demonstrates its use with datasets built into the package (also shown below) and explains how users can apply GEORELATE to their own work.
User Guide
Description
GEORELATE takes any two datasets, each containing a set of geographic points, and outputs a new dataset with variables indicating which points in one dataset are closest to each point in the other. The output dataset also includes variables giving the distance from each point to those nearest points in the other dataset.
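The idea described above can be sketched in a few lines of NumPy. This is only an illustration of the general technique (a pairwise haversine distance matrix followed by a sort), not GEORELATE's actual implementation. The function names `haversine_matrix` and `k_nearest` and the Earth radius constant are assumptions made for this sketch; the coordinates are the sample polling station and aid project points shown later in this guide.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius; the package may use another value


def haversine_matrix(lat1, lon1, lat2, lon2):
    """Great-circle distance (km) from every point in set 1 to every point in set 2."""
    lat1, lon1, lat2, lon2 = (
        np.radians(np.asarray(a, dtype=float)) for a in (lat1, lon1, lat2, lon2)
    )
    # Broadcasting an (n, 1) column against a (1, m) row yields an (n, m) matrix.
    dlat = lat2[None, :] - lat1[:, None]
    dlon = lon2[None, :] - lon1[:, None]
    a = (
        np.sin(dlat / 2) ** 2
        + np.cos(lat1)[:, None] * np.cos(lat2)[None, :] * np.sin(dlon / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def k_nearest(dist, k):
    """For each row, return the column indices and distances of the k nearest points."""
    order = np.argsort(dist, axis=1)[:, :k]
    return order, np.take_along_axis(dist, order, axis=1)


# Sample coordinates from the polling station and aid project tables in this guide.
station_lat = [-6.69255, -4.76871, -3.28926]
station_lon = [-39.76566, -39.61186, -40.75443]
aid_ids = np.array(["p1", "p2", "p3"])
aid_lat = [-6.61616, -4.76871, -4.26065]
aid_lon = [-39.97990, -39.77116, -39.39030]

dist = haversine_matrix(station_lat, station_lon, aid_lat, aid_lon)
idx, nearest_km = k_nearest(dist, k=3)
nearest_ids = aid_ids[idx]  # one row per polling station, nearest project first

print(nearest_ids)
print(nearest_km.round(2))
```

The printed distances should land close to those in the output table in the example section; small differences are expected if the package uses a different Earth radius. For very large inputs, a full sort of every row is more work than necessary, and `np.argpartition` is a common alternative when only a few nearest neighbours are needed.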
Installation
GEORELATE can be installed by running the line below:
pip install georelate
Requirements:
Python 3.10+
GEORELATE in action
To demonstrate GEORELATE’s functionality more clearly, the sections below include an example of how to apply the package, with sample files and code.
The two input files below contain fictionalized data for Brazil. The first contains the IDs and geographic coordinates of polling stations. The second contains the same information for new foreign aid projects. The image below them shows the two sets of points overlaid on a map of Brazil. Finally, GEORELATE was used to create the output dataset shown below the map. The output dataset contains the IDs of the 3 closest aid projects to each polling station, along with the distance to each.
The input and output files are below:
Polling Stations
| | local_id | lat | lon |
|---|---|---|---|
| 0 | 1 | -6.69255 | -39.76566 |
| 1 | 2 | -4.76871 | -39.61186 |
| 2 | 3 | -3.28926 | -40.75443 |
Aid Projects
| | project_id_aid | lat_aid | long_aid |
|---|---|---|---|
| 0 | p1 | -6.61616 | -39.97990 |
| 1 | p2 | -4.76871 | -39.77116 |
| 2 | p3 | -4.26065 | -39.39030 |
Map
Blue = polling stations
Red = aid projects

Output
| | local_id | project_id_aid_1_closest | distance_1_closest | project_id_aid_2_closest | distance_2_closest | project_id_aid_3_closest | distance_3_closest |
|---|---|---|---|---|---|---|---|
| 0 | 1 | p1 | 25.124568 | p2 | 213.787803 | p3 | 273.415835 |
| 1 | 3 | p3 | 185.826237 | p2 | 197.251459 | p1 | 379.513297 |
| 2 | 2 | p2 | 17.640953 | p3 | 61.562643 | p1 | 209.292590 |
Example Code
Below I include the code used to create the output file (output.csv).
```python
from georelate import design_matrix
from georelate.data import load_poll_aid_data

# Load the polling station data (left_df) and the foreign aid data (right_df)
left_df, right_df = load_poll_aid_data()

# Create the output dataset
df = design_matrix(
    left=left_df,
    right=right_df,
    left_id="local_id",
    right_id="project_id_aid",
    right_lat="lat_aid",
    right_lon="long_aid",
    k_closest=3,
)
```
A closer look at the code above...
The call to design_matrix creates the final output dataset and takes 7 arguments.
The first argument (left = left_df) specifies the polling station dataset. For each point in this dataset, GEORELATE identifies the nearest points in the dataset specified in the second argument.
The second argument (right = right_df) specifies the foreign aid projects dataset.
The third argument (left_id = "local_id") specifies the name of the column containing the IDs of the polling stations.
The fourth argument (right_id = "project_id_aid") specifies the name of the column containing the IDs of the foreign aid projects.
The fifth argument (right_lat = "lat_aid") specifies the name of the column containing the latitude of the foreign aid projects.
The sixth argument (right_lon = "long_aid") specifies the name of the column containing the longitude of the foreign aid projects.
The last argument (k_closest = 3) sets the number of nearest foreign aid projects GEORELATE will include in the final output file. In the example above it is set to 3, so the output file includes information about the 3 nearest foreign aid projects to each polling station.
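Once the output dataset exists, its distance columns can feed directly into a design matrix. The sketch below shows one possible next step: turning the distance to the nearest aid project into a binary indicator. The values are copied from the output table above, the column names assume the underscore-separated form (for example `distance_1_closest`), and the 50 km cutoff and the `aid_within_50km` column are hypothetical choices made only for illustration.

```python
import pandas as pd

# Values copied from the output table above (first-closest columns only).
df = pd.DataFrame(
    {
        "local_id": [1, 3, 2],
        "project_id_aid_1_closest": ["p1", "p3", "p2"],
        "distance_1_closest": [25.124568, 185.826237, 17.640953],
    }
)

# Hypothetical cutoff, in the same units as the distance columns
# (kilometres, judging by the sample coordinates).
THRESHOLD_KM = 50

# 1 if the nearest aid project lies within the cutoff, otherwise 0.
df["aid_within_50km"] = (df["distance_1_closest"] <= THRESHOLD_KM).astype(int)

print(df)
```

An indicator like this could then be used as a treatment variable in a regression, with the raw distance columns kept alongside it for robustness checks at other cutoffs.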
Development
Local Dev Instructions
Run `poetry install` to install the env.
Run `poetry run pre-commit install` to initialize the git hooks.
Run `poetry run pre-commit run --all-files` if there are files that were committed before adding the git hooks.
Activate the shell with: `poetry shell`
Lint with: `poetry run pylint georelate/ tests/`
Test with: `poetry run pytest --cov=georelate`
Pushing to PyPI
Environmental variables
Environment variables are a good way to keep our tokens secret and our options configurable. To set them, copy `sample.envrc` to `.envrc` and change the values. Leave `PYPI_USERNAME=__token__` and change the value of `PYPI_PASSWORD` to your PyPI API token.
To load `.envrc`, one could just run `.envrc` as a shell script, but direnv makes things easier by automatically loading the variables when you enter the directory, once allowed.
To install direnv on a Mac running zsh, install it with `brew install direnv` and hook it into your shell by adding `eval "$(direnv hook zsh)"` to your `.zshrc` file. For other install instructions see: https://direnv.net/.
To allow direnv to load `.envrc` in a directory, run `direnv allow`.
Alternatively, set the values of the environment variables using `export PYPI_USERNAME=__token__ && export PYPI_PASSWORD=<your-pypi-token>`
Push to PyPI
Run `poetry publish --build --username $PYPI_USERNAME --password $PYPI_PASSWORD` to build and push the package to PyPI.
Owner
- Name: Edward Krueger
- Login: edkrueger
- Kind: user
- Location: Austin, TX
- Company: Peak Values
- Website: https://medium.edkruegerdata.com/
- Twitter: edkruegerdata
- Repositories: 34
- Profile: https://github.com/edkrueger
Citation (CITATION.cff)
cff-version: 1.2.0
title: georelate
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Edward
    family-names: Krueger
    email: edwardkrueger@utexas.edu
    affiliation: University of Texas at Austin
    orcid: 'https://orcid.org/0000-0002-9112-496X'
  - given-names: Alexander
    family-names: Wais
    email: alexwais@utexas.edu
    affiliation: University of Texas at Austin
    orcid: 'https://orcid.org/0009-0006-1913-7287'
Dependencies
- astroid 2.15.5 develop
- black 23.3.0 develop
- cfgv 3.3.1 develop
- click 8.1.3 develop
- colorama 0.4.6 develop
- coverage 7.2.5 develop
- dill 0.3.6 develop
- distlib 0.3.6 develop
- exceptiongroup 1.1.1 develop
- filelock 3.12.0 develop
- identify 2.5.24 develop
- iniconfig 2.0.0 develop
- isort 5.12.0 develop
- lazy-object-proxy 1.9.0 develop
- mccabe 0.7.0 develop
- mypy-extensions 1.0.0 develop
- nodeenv 1.8.0 develop
- packaging 23.1 develop
- pathspec 0.11.1 develop
- platformdirs 3.5.1 develop
- pluggy 1.0.0 develop
- pre-commit 3.3.1 develop
- pylint 2.17.4 develop
- pytest 7.3.1 develop
- pytest-cov 4.0.0 develop
- pyyaml 6.0 develop
- setuptools 67.7.2 develop
- tomli 2.0.1 develop
- tomlkit 0.11.8 develop
- typing-extensions 4.5.0 develop
- virtualenv 20.23.0 develop
- wrapt 1.15.0 develop
- numpy 1.24.3
- pandas 2.0.1
- python-dateutil 2.8.2
- pytz 2023.3
- six 1.16.0
- tzdata 2023.3
- black * develop
- pre-commit * develop
- pylint * develop
- pytest * develop
- pytest-cov * develop
- numpy ^1.24.3
- pandas ^2.0.1
- python ^3.10