https://github.com/inseefrlab/pynsee
pynsee package contains tools to easily search and download French data from INSEE and IGN APIs
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (15.5%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: InseeFrLab
- License: mit
- Language: Python
- Default Branch: master
- Homepage: https://pynsee.readthedocs.io/en/latest/
- Size: 134 MB
Statistics
- Stars: 83
- Watchers: 10
- Forks: 12
- Open Issues: 21
- Releases: 11
Metadata Files
README.md
pynsee gives quick access to more than 150,000 macroeconomic series,
a dozen local datasets, numerous sources available on insee.fr,
the geographical boundaries of administrative areas from IGN,
as well as key metadata and the SIRENE database, which contains data on all French companies.
Have a look at the detailed API page portail-api.insee.fr.
This package is a contribution to reproducible research and public data transparency. It benefits from the developments made by teams working on APIs at INSEE and IGN.
Installation & API subscription
Credentials are required only for the SIRENE API, which pynsee exposes through its sirene module. They can be created at portail-api.insee.fr. All other modules are freely accessible.
```python
# Install the PyPI package (run in a shell):
# pip install pynsee[full]

# Or get the development version from GitHub (run in a shell):
# git clone https://github.com/InseeFrLab/pynsee.git
# cd pynsee
# pip install .[full]

# Subscribe to portail-api.insee.fr and get your credentials!
# Save your credentials with the init_conn function:
from pynsee.utils import init_conn
init_conn(sirene_key="my_sirene_key")

# Beware: any change to the keys should be tested after having cleared the cache
# Please do: from pynsee.utils import clear_all_cache; clear_all_cache()
```
Data Search and Collection Advice
- Macroeconomic data: first, use `get_dataset_list` to find the datasets you are interested in, then get the series list with `get_series_list`. Alternatively, you can make a keyword-based search with `search_macrodata`, e.g. `search_macrodata('GDP')`. Then, get the data with `get_dataset` or `get_series`
- Local data: first use `get_local_metadata`, then get the data with `get_local_data`
- Metadata: e.g. `get_activity_list`, the function that returns the classification of economic activities (Naf/Nace Rev2)
- Sirene (French companies database): first use `get_dimension_list`, then use `search_sirene` with dimensions as filtering variables
- Geodata: get the list of available geographical data with `get_geodata_list`, then retrieve it with `get_geodata`
- Files on insee.fr: get the list of available files on insee.fr with `get_file_list`, then download one with `download_file`
For further advice, have a look at the documentation and the gallery of examples.
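The macroeconomic workflow described above could be wrapped as in the following minimal sketch. It assumes that `get_series_list` and `get_series` are importable from `pynsee.macrodata` and that the series list exposes an `IDBANK` column. The helper name `fetch_series_for_dataset` and its `max_series` parameter are hypothetical. The calls need network access, so the imports are kept inside the function.

```python
def fetch_series_for_dataset(dataset_id, max_series=3):
    """Hypothetical helper: list the series of one dataset, then download a few."""
    # Imported lazily so that defining the helper does not require pynsee.
    # Assumption: both functions live in pynsee.macrodata.
    from pynsee.macrodata import get_series, get_series_list

    # Step 1: list the series that belong to the dataset.
    series_list = get_series_list(dataset_id)
    # Assumption: series identifiers are stored in the IDBANK column.
    idbanks = series_list["IDBANK"].head(max_series).tolist()
    # Step 2: download the data for the selected series.
    return get_series(idbanks)
```

A dataset identifier would typically be picked from the table returned by `get_dataset_list()` before calling the helper.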
Example - Population Map
```python
import math

import matplotlib.cm as cm
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from pynsee.geodata import get_geodata_list, get_geodata

# get geographical data list
geodata_list = get_geodata_list()

# get the geographical limits of communes
mapcom = get_geodata("ADMINEXPRESS-COG-CARTO.LATEST:commune").to_crs(epsg=3035)

# area calculations depend on crs which fits metropolitan france but not overseas departements
# figures should not be considered as official statistics
mapcom.attrs["area"] = mapcom.area / 10**6
mapcom = mapcom.to_crs(epsg=3857)

mapcom['REF_AREA'] = 'D' + mapcom['code_insee']
mapcom['density'] = mapcom['population'] / mapcom.attrs["area"]

# move and rescale overseas departements so that they fit on the map
mapcom = mapcom.transform_overseas(
    departement=['971', '972', '974', '973', '976', 'NR'],
    factor=[1.5, 1.5, 1.5, 0.35, 1.5, 1.5])

# add a zoomed view of the Paris region
mapcom = mapcom.zoom(
    departement=["75", "92", "93", "91", "77", "78", "95", "94"],
    factor=1.5, startAngle=math.pi * (1 - 3.5 * 1/9))

density_ranges = [
    40, 80, 100, 120, 150, 200, 250, 400, 600, 1000, 2000, 5000, 10000, 20000
]

# every commune starts in the lowest class, then is moved to its own class
rvals = np.full(len(mapcom), "< 40", dtype=object)
list_ranges = ["< 40"]

for rmin, rmax in zip(density_ranges, density_ranges[1:]):
    range_string = f"[{rmin}, {rmax}["
    list_ranges.append(range_string)
    rvals[(mapcom.density >= rmin) & (mapcom.density < rmax)] = range_string

rvals[mapcom.density.values > density_ranges[-1]] = "> 20 000"
list_ranges.append("> 20 000")

mapcom.loc[:, "range"] = pd.Categorical(rvals, ordered=True, categories=list_ranges)

fig, ax = plt.subplots(1, 1, figsize=(15, 15))
lgd = {'bbox_to_anchor': (1.1, 0.8), 'title': 'density per km2'}
mapcom.plot(column="range", cmap=cm.viridis, legend=True, ax=ax, legend_kwds=lgd)
ax.set_axis_off()
ax.set(title='Distribution of population in France')
plt.show()

fig.savefig('pop_france.svg', format='svg', dpi=1200, bbox_inches='tight', pad_inches=0)
```
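The density classes used for the legend can also be computed one value at a time. The sketch below is a standard-library alternative to the loop in the example, not the package's method: it reuses the same break points, while the names `DENSITY_RANGES` and `density_label` are hypothetical. The top class is labelled `>= 20000` here so that a value exactly equal to the last break is not left unclassified.

```python
from bisect import bisect_right

# Same break points as in the population map example.
DENSITY_RANGES = [40, 80, 100, 120, 150, 200, 250, 400, 600, 1000,
                  2000, 5000, 10000, 20000]


def density_label(density, breaks=DENSITY_RANGES):
    """Return the legend label of the class that contains `density`."""
    # bisect_right counts how many break points are <= density.
    idx = bisect_right(breaks, density)
    if idx == 0:
        return f"< {breaks[0]}"
    if idx == len(breaks):
        return f">= {breaks[-1]}"
    # Classes are closed on the left and open on the right.
    return f"[{breaks[idx - 1]}, {breaks[idx]}["


for value in (10, 40, 150, 20000):
    print(value, density_label(value))
```

Applied to a pandas column, such a function could be passed to `Series.map` to build the class labels before converting them to an ordered categorical.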
How to avoid proxy issues?
```python
# Use the proxy arguments of the init_conn function (http_proxy, https_proxy)
# to change the proxy server address
from pynsee.utils import init_conn
init_conn(sirene_key="my_sirene_key",
          http_proxy="http://my_proxy_server:port",
          https_proxy="http://my_proxy_server:port")

# Beware: any change to the keys should be tested after having cleared the cache
# Please do: from pynsee.utils import clear_all_cache; clear_all_cache()

# Alternatively, you can set environment variables directly, as follows.
# Beware not to commit your credentials!
import os
os.environ['sirene_key'] = 'my_sirene_key'
os.environ['http_proxy'] = "http://my_proxy_server:port"
os.environ['https_proxy'] = "http://my_proxy_server:port"
```
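To keep credentials out of committed code, the settings can be read from the environment at run time. The following is a minimal sketch under assumptions: the variable names `sirene_key`, `http_proxy` and `https_proxy` are those shown above, while the helper `connection_kwargs_from_env` is hypothetical and not part of pynsee.

```python
import os

# Keyword names taken from the snippet above; the helper itself is hypothetical.
CONNECTION_KEYS = ("sirene_key", "http_proxy", "https_proxy")


def connection_kwargs_from_env(environ=os.environ):
    """Collect the connection settings that are set and non-empty in `environ`."""
    return {key: environ[key] for key in CONNECTION_KEYS if environ.get(key)}


# Demonstration with a plain dict standing in for the real environment.
kwargs = connection_kwargs_from_env(
    {"sirene_key": "my_sirene_key", "http_proxy": "", "PATH": "/usr/bin"}
)
print(kwargs)  # {'sirene_key': 'my_sirene_key'}
```

Assuming `init_conn` accepts these keywords as in the snippet above, the result could then be passed on with `init_conn(**connection_kwargs_from_env())`.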
Support
Feel free to open an issue on the GitHub repository if you have any question about this package.
Contributing
All contributions, whatever their forms, are welcome. See CONTRIBUTING.md
Owner
- Name: InseeFrLab
- Login: InseeFrLab
- Kind: organization
- Email: innovation@insee.fr
- Location: France
- Website: https://insee.fr
- Repositories: 214
- Profile: https://github.com/InseeFrLab
Lab of @InseeFr
GitHub Events
Total
- Create event: 27
- Commit comment event: 1
- Release event: 7
- Issues event: 43
- Watch event: 12
- Delete event: 32
- Issue comment event: 136
- Push event: 182
- Pull request review event: 118
- Pull request review comment event: 132
- Pull request event: 75
- Fork event: 2
Last Year
- Create event: 27
- Commit comment event: 1
- Release event: 7
- Issues event: 43
- Watch event: 12
- Delete event: 32
- Issue comment event: 136
- Push event: 182
- Pull request review event: 118
- Pull request review comment event: 132
- Pull request event: 75
- Fork event: 2
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Hadrien Leclerc | l****n@g****m | 1,647 |
| Thomas Grandjean | t****e@g****m | 35 |
| linogaliana | l****a@y****r | 30 |
| tfardet | 7****t | 17 |
| hadrilec | h****c@i****r | 3 |
| raphaeleadjerad | r****d@g****m | 3 |
| elias showk | e****s@s****e | 1 |
| Milena Suarez Castillo | 5****t | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 78
- Total pull requests: 188
- Average time to close issues: 4 months
- Average time to close pull requests: about 1 month
- Total issue authors: 13
- Total pull request authors: 8
- Average comments per issue: 2.28
- Average comments per pull request: 1.69
- Merged pull requests: 135
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 24
- Pull requests: 78
- Average time to close issues: about 1 month
- Average time to close pull requests: 8 days
- Issue authors: 6
- Pull request authors: 5
- Average comments per issue: 1.92
- Average comments per pull request: 1.35
- Merged pull requests: 53
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- hadrilec (20)
- tgrandje (15)
- raphaeleadjerad (14)
- tfardet (12)
- linogaliana (7)
- souhir-am (2)
- FFredericAL (2)
- cthiounn (1)
- MelineeTS (1)
- strayMat (1)
- elGringo11 (1)
- daniel-odc (1)
- fbmfbm (1)
Pull Request Authors
- hadrilec (78)
- tgrandje (52)
- tfardet (37)
- linogaliana (14)
- raphaeleadjerad (2)
- cthiounn (2)
- milena-git (2)
- elishowk (1)
Dependencies
- Jinja2 >=3.0
- appdirs >=1.4.4
- datetime >=3.5.9
- descartes *
- geopandas *
- ipykernel ==6.13.0
- ipython >=7.16.1
- jupyter ==1.0.0
- jupyter-cache ==0.5.0
- m2r2 ==0.2.7
- markupsafe ==2.0.1
- nbclient ==0.5.13
- nbconvert ==6.5.0
- nbformat ==5.3.0
- nbsphinx ==0.8.7
- openpyxl *
- pandas >=0.24.2
- pathlib2 >=2.3.5
- py7zr *
- pyyaml >=5.4.1
- readthedocs-sphinx-search ==0.1.0rc3
- requests >=2.25.1
- seaborn *
- shapely ==1.8.0
- sphinx ==4.4.0
- sphinx-gallery ==0.10.0
- sphinx_copybutton ==0.5.0
- sphinx_rtd_theme ==0.5.1
- sphinxcontrib-svg2pdfconverter *
- tqdm >=4.56.0
- unidecode >=1.2.0
- openpyxl *
- xlrd >=2.0.1
- appdirs >=1.4.4
- datetime >=3.5.9
- pandas >=0.24.2
- pathlib2 >=2.3.5
- requests >=2.25.1
- shapely ==1.8.0
- tqdm >=4.56.0
- unidecode >=1.2.0
- appdirs >=1.4.4
- datetime >=3.5.9
- pandas >=0.24.2
- pathlib2 >=2.3.5
- requests >=2.25.1
- shapely ==1.8.0
- tqdm >=4.56.0
- unidecode >=1.2.0
- docker/build-push-action v2 composite
- docker/login-action v1 composite
- docker/setup-buildx-action v1 composite
- docker/setup-qemu-action v1 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/checkout v3 composite
- actions/upload-artifact v1 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- codecov/codecov-action v1 composite
- actions/checkout v2 composite
- actions/create-release v1 composite
- actions/download-artifact v2 composite
- actions/setup-python v2 composite
- actions/upload-artifact v2 composite
- actions/upload-release-asset v1 composite
- pypa/gh-action-pypi-publish v1.4.2 composite
- python 3.9-slim-bullseye build
- actions/checkout v4 composite
- actions/setup-python v2 composite
- codecov/codecov-action v1 composite