python-datascientist
Repository for the course Python for data scientists (ENSAE, 2nd year)
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: links to zenodo.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (7.7%) to scientific vocabulary
Repository
Repository for the course Python for data scientists (ENSAE, 2nd year)
Basic Info
- Host: GitHub
- Owner: linogaliana
- License: other
- Language: Python
- Default Branch: main
- Homepage: https://pythonds.linogaliana.fr/
- Size: 1.01 GB
Statistics
- Stars: 134
- Watchers: 1
- Forks: 49
- Open Issues: 5
- Releases: 10
Metadata Files
README.md
Data Science with Python 
[!NOTE]
This is the English 🇬🇧🇺🇸 version of the README.
To see the French 🇫🇷 version, click here:
📚 About
This repository contains the source files for the course Python for Data Science
taught in the second year (Master 1) at ENSAE.
The course website is available here:
🌐 https://pythonds.linogaliana.fr/
🎨 Gallery
Some visualizations produced during the course:
📖 Course content
This course is suitable for both beginners and advanced learners.
The syllabus below is fully clickable and collapsible.
1. Getting started: why Python for data science?
🔗 https://pythonds.linogaliana.fr/en/content/getting-started/
- Getting a functional Python environment for data science
- How to deal with a data set
- Python basics
2. Data wrangling
🔗 https://pythonds.linogaliana.fr/en/content/manipulation/
- Numpy, the foundation of data science
- Introduction to Pandas
- Data wrangling with Pandas
- Spatial data with GeoPandas
- Webscraping with Python
- Retrieving data with APIs
- Mastering regular expressions
- Importing data from Parquet and S3
3. Data visualisation and communication
🔗 https://pythonds.linogaliana.fr/en/content/visualisation/
- Building graphics with Python
- Introduction to cartography
4. Modeling
🔗 https://pythonds.linogaliana.fr/en/content/modelisation/
- Why preprocessing matters
- Evaluating model quality
- Introduction to classification
- Introduction to regression
- Feature selection
- Clustering
5. Natural Language Processing (NLP)
🔗 https://pythonds.linogaliana.fr/en/content/nlp/
- Cleaning and structuring texts
- Bag-of-words approach
- Text embeddings
🔗 Resources
The course content relies heavily on open data, including French datasets (from data.gouv and Insee) and American datasets.
Complementary course with Romain Avouac (@avouacr):
https://ensae-reproductibilite.github.io/website/
🚀 Accessing the course in Jupyter Notebooks
[!TIP]
Run examples instantly on SSP Cloud or Google Colab. Here is an example for the Pandas chapter:
🤝 Contributing
I welcome contributions!
Owner
- Name: Lino Galiana
- Login: linogaliana
- Kind: user
- Location: Paris
- Company: Insee
- Website: https://linogaliana.fr/
- Twitter: linogaliana
- Repositories: 14
- Profile: https://github.com/linogaliana
Data Scientist Insee - Teaching at ENSAE
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use some content of this repository, please cite it as below."
authors:
  - family-names: "Galiana"
    given-names: "Lino"
    orcid: "https://orcid.org/0000-0001-8663-5100"
title: "Python pour la data science"
doi: 10.5281/zenodo.5386096
date-released: 2024-06-01
url: "https://github.com/linogaliana/python-datascientist"
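As an illustration of how the citation metadata above could be turned into a reference string, here is a minimal Python sketch. The field values are copied from the CITATION.cff entry. The dictionary layout, the `format_citation` helper, and the output style are assumptions made for this example, not something the repository defines.

```python
from datetime import date

# Field values copied from the CITATION.cff entry above.
# The dictionary layout itself is an assumption made for this sketch.
citation = {
    "family_names": "Galiana",
    "given_names": "Lino",
    "title": "Python pour la data science",
    "doi": "10.5281/zenodo.5386096",
    "date_released": date(2024, 6, 1),
    "url": "https://github.com/linogaliana/python-datascientist",
}


def format_citation(meta: dict) -> str:
    """Hypothetical helper: build a short author-year reference string."""
    author = f"{meta['family_names']}, {meta['given_names']}"
    year = meta["date_released"].year
    # A DOI resolves through the https://doi.org/ prefix.
    return f"{author} ({year}). {meta['title']}. https://doi.org/{meta['doi']}"


reference = format_citation(citation)
print(reference)
```

In practice the fields would more likely be read from the CITATION.cff file with a YAML parser. The hard-coded dictionary only keeps the sketch free of third-party dependencies.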
GitHub Events
Total
- Issues event: 28
- Watch event: 23
- Delete event: 40
- Issue comment event: 16
- Push event: 315
- Pull request event: 71
- Fork event: 3
- Create event: 39
Last Year
- Issues event: 28
- Watch event: 23
- Delete event: 40
- Issue comment event: 16
- Push event: 315
- Pull request event: 71
- Fork event: 3
- Create event: 39
Committers
Last synced: 11 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Lino Galiana | l****a@i****r | 769 |
| Romain Avouac | 4****r | 27 |
| Antoine Palazzolo | 9****z | 12 |
| Julien PRAMIL | 1****l | 7 |
| Thomas Faria | 5****a | 5 |
| Kim A | k****y@l****t | 2 |
| Raphaele Adjerad | 5****d | 2 |
| lbaudin | 1****n | 2 |
| tomseimandi | t****i@g****m | 2 |
| Expressso | 9****o | 1 |
| Idrissa KONKOBO | 9****a | 1 |
| Mélissa Tamine | 9****a | 1 |
| jblaval | l****e@g****m | 1 |
| romanegajdos | 7****s | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 64
- Total pull requests: 102
- Average time to close issues: 7 months
- Average time to close pull requests: 4 days
- Total issue authors: 7
- Total pull request authors: 5
- Average comments per issue: 0.63
- Average comments per pull request: 0.01
- Merged pull requests: 88
- Bot issues: 0
- Bot pull requests: 5
Past Year
- Issues: 12
- Pull requests: 52
- Average time to close issues: 30 days
- Average time to close pull requests: 3 days
- Issue authors: 5
- Pull request authors: 4
- Average comments per issue: 0.92
- Average comments per pull request: 0.0
- Merged pull requests: 38
- Bot issues: 0
- Bot pull requests: 5
Top Authors
Issue Authors
- linogaliana (74)
- fa5fou5 (3)
- jpramil (3)
- daniel-odc (3)
- jaerdoster (1)
- leomignot (1)
- avouacr (1)
- antoine-palazz (1)
- bpezet (1)
- Orlogskapten (1)
- raphaelfournier (1)
Pull Request Authors
- linogaliana (151)
- jpramil (5)
- dependabot[bot] (5)
- avouacr (4)
- ThomasFaria (2)
- antoine-palazz (2)
- lbaudin (1)
- ntoulemonde (1)
- fa5fou5 (1)
- romanegajdos (1)
Dependencies
- actions/checkout v2 composite
- actions/setup-node v2 composite
- actions/upload-artifact v2 composite
- actions/checkout v2 composite
- actions/setup-node v2 composite
- actions/upload-artifact v1 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- actions/upload-artifact v3 composite
- actions/checkout v3 composite
- actions/upload-artifact v2 composite
- linogaliana/github-action-push-to-another-repository main composite
- contextily *
- geoplot *
- graphviz *
- kaleido *
- plotnine *
- pynsee *
- pywaffle *
- wordcloud *
- xlrd *
- yellowbrick *