demography

Quickly load demographic data based on UK post codes to enrich your dataset. Based on data made available by the UK's Office for National Statistics (ONS).

https://github.com/markdouthwaite/demography

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
○ DOI references
○ Academic publication links
✓ Committers with academic emails: 1 of 2 committers (50.0%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (12.9%) to scientific vocabulary

Keywords

data-enrichment data-utilities demographics python

Last synced: 10 months ago

Repository


Basic Info

Host: GitHub
Owner: markdouthwaite
License: mit
Language: Python
Default Branch: master
Homepage:
Size: 6.72 MB

Statistics

Stars: 1
Watchers: 0
Forks: 0
Open Issues: 2
Releases: 1

Topics

data-enrichment data-utilities demographics python

Created almost 6 years ago · Last pushed over 2 years ago

Metadata Files

Readme License

demography

This package implements a simple mechanism for quickly loading demographic data based on post codes. This is currently only implemented for the UK. It is based on data made available by the UK's Office for National Statistics (ONS).

The data was taken from Geoportal.

To jump straight to using this package with pandas, see the "Playing with pandas" section below.

The package comes with built-in caching, makes extensive use of hash maps (i.e. dictionaries), and should generally be pretty fast!
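To make the idea of cached dictionary lookups concrete, here is a minimal sketch. It is not the package's implementation: `load_table` and `lookup` are hypothetical names, and the toy rows stand in for the real ONS-derived data.

```python
from functools import lru_cache

# Toy rows standing in for a large lookup file. Only the first pair appears
# in this README; the second is a made-up placeholder.
_ROWS = [("SW1A 0AA", "2D2"), ("ZZ99 9ZZ", "placeholder")]


@lru_cache(maxsize=1)
def load_table():
    """Build the postcode -> code dictionary once; later calls reuse it."""
    return dict(_ROWS)


def lookup(postcode, default=None):
    """Constant-time dictionary lookup against the cached table."""
    return load_table().get(postcode, default)


print(lookup("SW1A 0AA"))
print(lookup("NOPE", default="unknown"))
```

The table is built on the first call only, so repeated lookups cost a single dictionary access each.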

As well as mapping postcodes to OAC11 groups (demographic codes), the package can map to the lower-level groups within these codes. See below for examples.

Hopefully it'll save you having to repeatedly find, load and transform ONS census data!

Getting started

You can install demography with:

```bash
pip install demography
```

There's only really one main function in this package, and it works like this:

```python
import demography

demography.get("SW1A 0AA", using="groups")
```

You'll get something like:

['Cosmopolitans', 'Aspiring and affluent', 'Highly-qualified quaternary workers']

These are Classification for Output Areas (OAC) groups: demographic groupings provided by ONS for specific regions. If no OAC group can be found for the full postcode, the lookup falls back to the postcode prefix (i.e. area-level demographics). If that also returns nothing, it returns the value passed as the default parameter.
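The fallback order described above can be sketched as follows. This is an illustration only, not the package's code: `resolve`, `by_postcode` and `by_prefix` are hypothetical names, the prefix is assumed to be the part of the postcode before the space, and the prefix value is a placeholder.

```python
def resolve(postcode, by_postcode, by_prefix, default=None):
    """Try the full postcode, then its prefix, then fall back to default."""
    key = postcode.strip().upper()
    if key in by_postcode:
        return by_postcode[key]
    prefix = key.split(" ")[0]  # assumed prefix: the part before the space
    return by_prefix.get(prefix, default)


# Toy tables: the first pair is taken from this README, the second is a placeholder.
by_postcode = {"SW1A 0AA": "2D2"}
by_prefix = {"SW1A": "area-level placeholder"}

print(resolve("SW1A 0AA", by_postcode, by_prefix))                    # full match
print(resolve("SW1A 9ZZ", by_postcode, by_prefix))                    # prefix fallback
print(resolve("ZZ1 1ZZ", by_postcode, by_prefix, default="unknown"))  # default
```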

You can also get the group codes:

```python
demography.get("SW1A 0AA", using="oac")
```

And you'd get:

```text
2D2
```

If you want to access the mappings between OAC codes and the groups together, you can use:

```python
demography.groups("uk")
```

To give:

```text
{'1A1': ['Rural residents', 'Farming communities', 'Rural workers and families'],
 '1A2': ['Rural residents', 'Farming communities', 'Established farming communities'] ...
```

Finally, it can be useful to have these groups encoded with:

```python
demography.get("SW1A 0AA", using="encoded_groups")
```

To give:

```text
[30, 55, 59]
```

To retrieve the encodings for this, you can use:

```python
demography.encoded_groups("uk")
```
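The return shape of `encoded_groups` is not shown here, so the sketch below does not call it. It only illustrates the general idea of encoding group labels as integers and decoding them again, using the two sample entries from the mapping above. `build_encoder` is a hypothetical helper, and the integers it assigns will not match the package's own encodings.

```python
# Two sample entries copied from the groups mapping shown earlier.
groups = {
    "1A1": ["Rural residents", "Farming communities", "Rural workers and families"],
    "1A2": ["Rural residents", "Farming communities", "Established farming communities"],
}


def build_encoder(groups):
    """Assign each distinct label an integer, in first-seen order."""
    encoder = {}
    for labels in groups.values():
        for label in labels:
            encoder.setdefault(label, len(encoder))
    return encoder


encoder = build_encoder(groups)
decoder = {index: label for label, index in encoder.items()}

encoded = [encoder[label] for label in groups["1A2"]]
decoded = [decoder[index] for index in encoded]
print(encoded)  # [0, 1, 3]
print(decoded)
```

Keeping the inverse dictionary alongside the encoder makes it cheap to turn model inputs back into readable labels.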

Validation

As an additional benefit, you can enable validation for postcodes with:

```python
demography.get("SW1A 0AA", using="encoded_groups", validate=True)
```
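What `validate=True` checks internally is not described here. As a rough illustration of what a postcode format check can look like, the sketch below uses a simplified regular expression. It is an assumption, not the package's logic: `looks_like_postcode` is a hypothetical helper, and the pattern accepts some strings that are not real postcodes.

```python
import re

# Simplified shape check: 1-2 letters, a digit, an optional letter or digit,
# an optional space, then a digit and two letters.
_POSTCODE_SHAPE = re.compile(r"[A-Z]{1,2}[0-9][A-Z0-9]? ?[0-9][A-Z]{2}")


def looks_like_postcode(value):
    """Return True if value has the rough shape of a UK postcode."""
    return _POSTCODE_SHAPE.fullmatch(value.strip().upper()) is not None


print(looks_like_postcode("SW1A 0AA"))        # True
print(looks_like_postcode("not a postcode"))  # False
```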

Playing with pandas

You can use demography to encode pandas.DataFrame columns pretty easily too:

```python
import pandas as pd
import demography as dm

df = pd.read_csv("my-dataset.csv")

# get the encoded 'super group', 'group', 'sub group' set.
datagen = (dm.get(code, using="encoded_groups") for code in df["postcode"])

# build a dataframe
dm_df = pd.DataFrame(data=datagen, columns=["supergroup", "group", "subgroup"])

# horizontally concatenate the groups dataframe to your original frame.
df = pd.concat([df, dm_df], axis=1)
```

Alternatively, if you only need OAC11 codes, you can use:

```python
df["demographic"] = df["postcode"].apply(lambda _: dm.get(_))
```

Note that you'll need to replace "postcode" with the name of the postcode column in your own data!

Owner

Name: Mark Douthwaite
Login: markdouthwaite
Kind: user
Location: Manchester, UK
Company: Peak AI

Website: mark.douthwaite.io
Twitter: markldouthwaite
Repositories: 53
Profile: https://github.com/markdouthwaite

AI, Product, Software. Currently a Principal Product Manager in AI (Platform & Gen AI) @ Peak AI.


Committers

Last synced: over 2 years ago

All Time

Total Commits: 11
Total Committers: 2
Avg Commits per committer: 5.5
Development Distribution Score (DDS): 0.182

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Mark Douthwaite	m**k@d**o	9
Mark Douthwaite	m**7@y**k	2

Committer Domains (Top 20 + Academic)

york.ac.uk: 1
douthwaite.io: 1

Issues and Pull Requests

Last synced: over 1 year ago

All Time

Total issues: 1
Total pull requests: 1
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 1
Total pull request authors: 1
Average comments per issue: 0.0
Average comments per pull request: 0.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 1

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

markdouthwaite (1)

Pull Request Authors

dependabot[bot] (2)

Top Labels

Issue Labels

Pull Request Labels

dependencies (2)

Packages

Total packages: 1
Total downloads:
- pypi 41 last-month

Total dependent packages: 0
Total dependent repositories: 1
Total versions: 4
Total maintainers: 1

pypi.org: demography

Demographic mapping based on UK ONS & census data.

Homepage: https://github.com/markdouthwaite/demography
Documentation: https://demography.readthedocs.io/
License: MIT
Latest release: 0.0.2
published almost 6 years ago

Versions: 4
Dependent Packages: 0
Dependent Repositories: 1
Downloads: 41 Last month

Rankings

Dependent packages count: 10.0%

Dependent repos count: 21.7%

Average: 25.3%

Forks count: 29.8%

Stargazers count: 31.9%

Downloads: 33.0%

Maintainers (1)

MarkDouthwaite

Last synced: 11 months ago

Dependencies

requirements.txt pypi

black ==19.10b0
pytest ==5.4.3

setup.py pypi

demography
