cbsodata

Unofficial Statistics Netherlands (CBS) open data API client for Python

https://github.com/j535d165/cbsodata

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    1 of 5 committers (20.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.9%) to scientific vocabulary

Keywords

census-api census-data data national-statistics netherlands open-data python-library
Last synced: 4 months ago

Repository

Basic Info
Statistics
  • Stars: 46
  • Watchers: 11
  • Forks: 21
  • Open Issues: 5
  • Releases: 12
Created about 9 years ago · Last pushed 4 months ago
Metadata Files
Readme License Citation

README.md

Statistics Netherlands opendata API client for Python

Retrieve data from the open data interface of Statistics Netherlands (Centraal Bureau voor de Statistiek) with Python. The data is identical in content to the tables which can be retrieved and downloaded from StatLine. CBS datasets are accessed via the CBS open data portal.

The documentation of this package is available on this page and on readthedocs.io.

R user? Use cbsodataR.

Installation

From PyPI:

``` sh
pip install cbsodata
```

Usage

Load the package with

``` python
import cbsodata
```

Tables

Statistics Netherlands (CBS) publishes a large number of publicly available data tables (more than 4000 at the time of writing). Each table is identified by a unique identifier (the Identifier field).

``` python
tables = cbsodata.get_table_list()
print(tables[0])
# Output:
# {'Catalog': 'CBS',
#  'ColumnCount': 18,
#  'DefaultPresentation': 'la=nl&si=&gu=&ed=LandVanUiteindelijkeZeggenschapUCI&td=Perioden&graphType=line',
#  'DefaultSelection': "$filter=((LandVanUiteindelijkeZeggenschapUCI eq '11111') or (LandVanUiteindelijkeZeggenschapUCI eq '22222')) and (Bedrijfsgrootte eq '10000') and (substringof('JJ',Perioden))&$select=LandVanUiteindelijkeZeggenschapUCI, Bedrijfsgrootte, Perioden, FiscaalJaarloonPerBaan15",
#  'ExplanatoryText': '',
#  'Frequency': 'Perjaar',
#  'GraphTypes': 'Table,Bar,Line',
#  'ID': 0,
#  'Identifier': '82010NED',
#  'Language': 'nl',
#  'MetaDataModified': '2014-02-04T02:00:00',
#  'Modified': '2014-02-04T02:00:00',
#  'OutputStatus': 'Regulier',
#  'Period': '2008 t/m 2011',
#  'ReasonDelivery': 'Actualisering',
#  'RecordCount': 32,
#  'SearchPriority': '2',
#  'ShortDescription': '\nDeze tabel bevat informatie over banen en lonen bij bedrijven in Nederland, uitgesplitst naar het land van uiteindelijke zeggenschap van die bedrijven. Hierbij wordt onderscheid gemaakt tussen bedrijven onder Nederlandse zeggenschap en bedrijven onder buitenlandse zeggenschap. In de tabel zijn alleen de bedrijven met werknemers in loondienst meegenomen. De cijfers hebben betrekking op het totale aantal banen bij deze bedrijven en de samenstelling van die banen naar kenmerken van de werknemers (baanstatus, geslacht, leeftijd, herkomst en hoogte van het loon). Ook het gemiddelde fiscale jaarloon per baan is in de tabel te vinden. \n\nGegevens beschikbaar vanaf: 2008 \n\nStatus van de cijfers: \nDe cijfers in deze tabel zijn definitief.\n\nWijzigingen per 4 februari 2014\nDe cijfers van 2011 zijn toegevoegd.\n\nWanneer komen er nieuwe cijfers?\nDe cijfers over 2012 verschijnen in de eerste helft van 2015.\n',
#  'ShortTitle': 'Zeggenschap bedrijven; banen, grootte',
#  'Source': 'CBS.',
#  'Summary': 'Banen en lonen van werknemers bij bedrijven in Nederland\nnaar land van uiteindelijke zeggenschap en bedrijfsgrootte',
#  'SummaryAndLinks': 'Banen en lonen van werknemers bij bedrijven in Nederland
#    naar land van uiteindelijke zeggenschap en bedrijfsgrootte
#    http://opendata.cbs.nl/ODataApi/OData/82010NED
#    http://opendata.cbs.nl/ODataFeed/OData/82010NED',
#  'Title': 'Zeggenschap bedrijven in Nederland; banen en lonen, bedrijfsgrootte',
#  'Updated': '2014-02-04T02:00:00'}
```
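Since the table list is a plain list of dicts, it can be searched with ordinary Python. The sketch below looks up tables by a keyword in the Title field. The helper `find_tables` is hypothetical (it is not part of cbsodata), and `sample_tables` is a trimmed stand-in for the real result so that the example runs without network access; its identifiers and titles are the ones quoted in this README.

```python
def find_tables(tables, keyword):
    """Return (Identifier, Title) pairs whose title contains keyword, ignoring case."""
    needle = keyword.lower()
    return [
        (table["Identifier"], table["Title"])
        for table in tables
        if needle in table.get("Title", "").lower()
    ]


# Stand-in for cbsodata.get_table_list(), reduced to the two fields used here.
sample_tables = [
    {
        "Identifier": "82010NED",
        "Title": "Zeggenschap bedrijven in Nederland; banen en lonen, bedrijfsgrootte",
    },
    {
        "Identifier": "82070ENG",
        "Title": "Caribbean Netherlands; employed labour force characteristics 2012",
    },
]

matches = find_tables(sample_tables, "caribbean")
print(matches)
```

With a live connection, the same helper could presumably be called as `find_tables(cbsodata.get_table_list(), "caribbean")`.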

Info

Get information about a table with the get_info function.

``` python
info = cbsodata.get_info('82070ENG')  # Returns a dict with info
info['Title']
# 'Caribbean Netherlands; employed labour force characteristics 2012'
info['Modified']
# '2013-11-28T15:00:00'
```
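The Modified value can be used to check whether a table has changed since a given date. The sketch below assumes the timestamp is always a naive ISO 8601 string, as in the example above. `modified_since` is a hypothetical helper, and `sample_info` copies the two values shown above so the example runs offline.

```python
from datetime import datetime


def modified_since(info, since):
    """Return True if the table's 'Modified' timestamp is later than `since`."""
    modified = datetime.fromisoformat(info["Modified"])
    return modified > since


# Stand-in for cbsodata.get_info('82070ENG'), reduced to the fields shown above.
sample_info = {
    "Title": "Caribbean Netherlands; employed labour force characteristics 2012",
    "Modified": "2013-11-28T15:00:00",
}

changed_in_2013 = modified_since(sample_info, datetime(2013, 1, 1))
changed_after_2013 = modified_since(sample_info, datetime(2014, 1, 1))
print(changed_in_2013, changed_after_2013)
```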

Data

This is probably the function you are looking for: get_data returns the table data as a list of dicts, one dict per record.

``` python
data = cbsodata.get_data('82070ENG')
# data is a list of dicts:
# [{'CaribbeanNetherlands': 'Bonaire',
#   'EmployedLabourForceInternatDef_1': 8837,
#   'EmployedLabourForceNationalDef_2': 8559,
#   'Gender': 'Total male and female',
#   'ID': 0,
#   'Periods': '2012',
#   'PersonalCharacteristics': 'Total personal characteristics'},
#  {'CaribbeanNetherlands': 'St. Eustatius',
#   'EmployedLabourForceInternatDef_1': 2099,
#   'EmployedLabourForceNationalDef_2': 1940,
#   'Gender': 'Total male and female',
#   'ID': 1,
#   'Periods': '2012',
#   'PersonalCharacteristics': 'Total personal characteristics'},
#  {'CaribbeanNetherlands': 'Saba',
#   'EmployedLabourForceInternatDef_1': 1045,
#   'EmployedLabourForceNationalDef_2': 971,
#   'Gender': 'Total male and female',
#   'ID': 2,
#   'Periods': '2012',
#   'PersonalCharacteristics': 'Total personal characteristics'},
#  # ...
# ]
```
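Because each record is a flat dict, selecting and reshaping rows needs nothing beyond the standard library. The sketch below filters records on field values and builds a per-island lookup. `select` is a hypothetical helper, and `sample_rows` contains only the three records shown above (trimmed to four fields), not the full table.

```python
def select(records, **criteria):
    """Keep the records whose fields equal every given criterion."""
    return [
        record
        for record in records
        if all(record.get(key) == value for key, value in criteria.items())
    ]


# Stand-in for cbsodata.get_data('82070ENG'): the three records shown above.
sample_rows = [
    {"CaribbeanNetherlands": "Bonaire", "EmployedLabourForceNationalDef_2": 8559,
     "Gender": "Total male and female", "Periods": "2012"},
    {"CaribbeanNetherlands": "St. Eustatius", "EmployedLabourForceNationalDef_2": 1940,
     "Gender": "Total male and female", "Periods": "2012"},
    {"CaribbeanNetherlands": "Saba", "EmployedLabourForceNationalDef_2": 971,
     "Gender": "Total male and female", "Periods": "2012"},
]

totals = select(sample_rows, Gender="Total male and female", Periods="2012")
per_island = {
    row["CaribbeanNetherlands"]: row["EmployedLabourForceNationalDef_2"]
    for row in totals
}
print(per_island)
print(sum(per_island.values()))  # sum over the three sample rows only
```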

The keyword argument dir can be used to download the data directly to your file system.

``` python
data = cbsodata.get_data('82070ENG', dir="dir_to_save_data")
```

Catalogs (dataderden)

There are multiple ways to retrieve data from catalogs other than 'opendata.cbs.nl'. The code below shows 3 different ways to retrieve data from the catalog 'dataderden.cbs.nl' (known from Iv3).

On module level.

``` python
cbsodata.options.catalog_url = 'dataderden.cbs.nl'

# list tables
cbsodata.get_table_list()

# get dataset 47003NED
cbsodata.get_data('47003NED')
```

With context managers.

``` python
with cbsodata.catalog('dataderden.cbs.nl'):
    # list tables
    cbsodata.get_table_list()

    # get dataset 47003NED
    cbsodata.get_data('47003NED')
```

As a function argument.

``` python
# list tables
cbsodata.get_table_list(catalog_url='dataderden.cbs.nl')

# get dataset 47003NED
cbsodata.get_data('47003NED', catalog_url='dataderden.cbs.nl')
```
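The module-level option and the context manager differ mainly in scope: the former changes the catalog for everything that follows, the latter only inside the `with` block. The sketch below illustrates that pattern in isolation. It is not cbsodata's actual implementation; `_Options`, `options`, and `catalog` are stand-ins that only borrow the names used above, and the default value is an assumption based on the catalog name mentioned in this section.

```python
from contextlib import contextmanager


class _Options:
    """Illustrative stand-in for a module-level options object."""
    catalog_url = "opendata.cbs.nl"  # assumed default


options = _Options()


@contextmanager
def catalog(url):
    """Temporarily switch the catalog and restore the previous one on exit."""
    previous = options.catalog_url
    options.catalog_url = url
    try:
        yield
    finally:
        # Runs even if the body raises, so the setting never leaks.
        options.catalog_url = previous


with catalog("dataderden.cbs.nl"):
    inside = options.catalog_url
after = options.catalog_url
print(inside, after)
```

The function-argument form avoids shared state altogether, which may be the safer choice when several catalogs are queried in the same program.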

Pandas users

The package works well with Pandas. Convert the result easily into a pandas DataFrame with the code below.

``` python
import pandas

data = pandas.DataFrame(cbsodata.get_data('82070ENG'))
data.head()
```

The list of tables can be turned into a pandas DataFrame as well.

``` python
tables = pandas.DataFrame(cbsodata.get_table_list())
tables.head()
```

Command Line Interface

This library ships with a Command Line Interface (CLI).

``` bash
cbsodata -h
usage: cbsodata [-h] [--version] [subcommand]

CBS Open Data: Command Line Interface

positional arguments:
  subcommand  the subcommand (one of 'data', 'info', 'list')

optional arguments:
  -h, --help  show this help message and exit
  --version   show the package version
```

Download data:

``` bash
cbsodata data 82010NED
```

Retrieve table information:

``` bash
cbsodata info 82010NED
```

Retrieve a list with all tables:

``` bash
cbsodata list
```

Export data

Use the -o flag to write the data to a file (JSON lines).

``` bash
cbsodata data 82010NED -o table_82010NED.jl
```
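A JSON lines file can be read back with the standard library. The sketch below assumes the exported file holds one JSON object per line, which is what the JSON lines format implies; the exact fields written by the CLI are not shown in this README. `read_json_lines` is a hypothetical helper, and the demo writes its own small file (with values borrowed from the Data section) so that it runs without the CLI.

```python
import json
import os
import tempfile


def read_json_lines(path):
    """Yield one decoded object per non-empty line of a JSON lines file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


# Demo data: a stand-in for a file produced with `cbsodata data ... -o file.jl`.
rows = [
    {"CaribbeanNetherlands": "Bonaire", "Periods": "2012"},
    {"CaribbeanNetherlands": "Saba", "Periods": "2012"},
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.jl")
    with open(path, "w", encoding="utf-8") as handle:
        for row in rows:
            handle.write(json.dumps(row) + "\n")
    loaded = list(read_json_lines(path))

print(loaded == rows)
```

Reading line by line keeps memory use low for large tables, since only one record is decoded at a time.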

Owner

  • Name: Jonathan de Bruin
  • Login: J535D165
  • Kind: user
  • Location: Netherlands
  • Company: Utrecht University

Research engineer working on software, datasets, and tools to advance open science 👐 @UtrechtUniversity @asreview

Citation (CITATION.cff)

cff-version: 1.2.0
title: >-
  cbsodata - Statistics Netherlands opendata API client for
  Python
message: 'We appreciate, but do not require, attribution.'
type: software
authors:
  - family-names: De Bruin
    given-names: Jonathan
    orcid: 'https://orcid.org/0000-0002-4297-0502'
repository-code: 'https://github.com/J535D165/cbsodata'
url: 'https://github.com/J535D165/cbsodata'
repository-artifact: 'https://pypi.org/project/cbsodata/'
keywords:
  - census-data
  - national-statistics
  - census
  - dataset
  - open-data
  - netherlands
license: MIT

GitHub Events

Total
  • Watch event: 2
  • Issue comment event: 6
  • Push event: 7
  • Pull request review event: 1
  • Pull request event: 9
  • Fork event: 3
  • Create event: 1
Last Year
  • Watch event: 2
  • Issue comment event: 6
  • Push event: 7
  • Pull request review event: 1
  • Pull request event: 9
  • Fork event: 3
  • Create event: 1

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 71
  • Total Committers: 5
  • Avg Commits per committer: 14.2
  • Development Distribution Score (DDS): 0.07
Past Year
  • Commits: 4
  • Committers: 2
  • Avg Commits per committer: 2.0
  • Development Distribution Score (DDS): 0.5
Top Committers
Name Email Commits
Jonathan j****e@g****m 66
Ewout ter Hoeven E****n@s****l 2
nieuwenhoven f****n@g****m 1
ncvanegmond n****d@g****m 1
huisman 2****n 1

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 13
  • Total pull requests: 20
  • Average time to close issues: 4 months
  • Average time to close pull requests: 9 months
  • Total issue authors: 10
  • Total pull request authors: 7
  • Average comments per issue: 2.38
  • Average comments per pull request: 1.15
  • Merged pull requests: 13
  • Bot issues: 0
  • Bot pull requests: 2
Past Year
  • Issues: 0
  • Pull requests: 9
  • Average time to close issues: N/A
  • Average time to close pull requests: 3 months
  • Issue authors: 0
  • Pull request authors: 3
  • Average comments per issue: 0
  • Average comments per pull request: 0.67
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 2
Top Authors
Issue Authors
  • J535D165 (4)
  • zufanka (1)
  • TheHunter896 (1)
  • wbijster (1)
  • ericjanbelier (1)
  • rcsmit (1)
  • dkorpershoek (1)
  • jpvandervelden (1)
  • wesselhuising (1)
  • joosbuijsNL (1)
Pull Request Authors
  • J535D165 (8)
  • EwoutH (6)
  • pre-commit-ci[bot] (2)
  • reidhin (1)
  • huisman (1)
  • nieuwenhoven (1)
  • ncvanegmond (1)
Top Labels
Issue Labels
Pull Request Labels
bug (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 8,008 last-month
  • Total dependent packages: 2
  • Total dependent repositories: 16
  • Total versions: 17
  • Total maintainers: 1
pypi.org: cbsodata

Statistics Netherlands opendata API client for Python

  • Versions: 17
  • Dependent Packages: 2
  • Dependent Repositories: 16
  • Downloads: 8,008 Last month
Rankings
Dependent repos count: 3.6%
Downloads: 3.7%
Average: 4.0%
Dependent packages count: 4.8%
Maintainers (1)
Last synced: 4 months ago

Dependencies

docs/requirements.txt pypi
  • requests *
  • sphinx-rtd-theme *
setup.py pypi
  • requests *