https://github.com/aphp/spark-etl
A better bridge between Apache Spark and PostgreSQL
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (1.4%) to scientific vocabulary
Keywords
etl
postgresql
spark
Last synced: 5 months ago
Repository
Basic Info
- Host: GitHub
- Owner: aphp
- License: apache-2.0
- Language: Scala
- Default Branch: master
- Size: 6.85 MB
Statistics
- Stars: 23
- Watchers: 4
- Forks: 8
- Open Issues: 10
- Releases: 0
Topics
etl
postgresql
spark
Created about 7 years ago · Last pushed over 2 years ago
Metadata Files
- Readme
- License
README.md
SPARK-ETL
This repository contains several modules for ETL processes, with a focus on scalability and quality. It builds on several technologies, including:
- Apache Spark
- Apache Hive
- Apache Solr
- PostgreSQL
Owner
- Name: Greater Paris University Hospitals (AP-HP)
- Login: aphp
- Kind: organization
- Location: Paris
- Website: https://www.aphp.fr/
- Repositories: 35
- Profile: https://github.com/aphp
Committers
Last synced: 8 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Rasmey SARETH | r****h@g****m | 268 |
| parisni | n****s@r****t | 162 |
| Jean-François YUEN | j****n@d****m | 72 |
| Joseph Allemandou | j****u@g****m | 19 |
| tlama | t****a@g****m | 9 |
| Adrien Lavoillotte | a****e@d****m | 8 |
| LAMA | t****a@c****m | 7 |
| Nicolas Paris | n****s@a****r | 6 |
| Saad ELBASSITI | s****t@a****r | 3 |
| Adrien Lavoillotte | s****c@f****r | 3 |
| saad elba | e****d@g****m | 1 |
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 15
- Total pull requests: 20
- Average time to close issues: about 1 month
- Average time to close pull requests: 3 months
- Total issue authors: 3
- Total pull request authors: 2
- Average comments per issue: 0.27
- Average comments per pull request: 0.85
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 19
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- parisni (13)
- selmi2 (1)
- sgorantla (1)
Pull Request Authors
- dependabot[bot] (19)
- ivan-veselovsky (1)
Top Labels
Issue Labels
- enhancement (8)
Pull Request Labels
- dependencies (19)
- java (6)
- javascript (1)
Dependencies
pom.xml
maven
- com.opentable.components:otj-pg-embedded 0.13.3 provided
- com.sksamuel.scapegoat:scalac-scapegoat-plugin_2.11 provided
- org.apache.hadoop:hadoop-client 2.6.5 provided
- org.apache.hadoop:hadoop-common 2.6.5 provided
- org.apache.spark:spark-core_2.11 2.4.3 provided
- org.apache.spark:spark-sql_2.11 2.4.3 provided
- com.amazon.deequ:deequ 1.0.2
- com.esotericsoftware:kryo-shaded 4.0.2
- com.fasterxml.jackson.core:jackson-core 2.6.7
- com.fasterxml.jackson.core:jackson-databind 2.6.7
- com.fasterxml.jackson.module:jackson-module-scala_2.11 2.6.7
- com.github.tomakehurst:wiremock 1.56
- com.sksamuel.scapegoat:scalac-scapegoat-plugin_2.11 1.3.10
- com.typesafe.scala-logging:scala-logging_2.11 3.9.2
- de.bytefish:pgbulkinsert 4.1
- io.delta:delta-core_2.11 0.6.1
- io.frama.parisni:spark-csv 1.0.13-SNAPSHOT
- io.frama.parisni:spark-dataframe 1.0.13-SNAPSHOT
- io.frama.parisni:spark-hive 1.0.13-SNAPSHOT
- io.frama.parisni:spark-meta 1.0.13-SNAPSHOT
- io.frama.parisni:spark-postgres 1.0.13-SNAPSHOT
- io.frama.parisni:spark-quality 1.0.13-SNAPSHOT
- io.frama.parisni:spark-sync 1.0.13-SNAPSHOT
- net.jcazevedo:moultingyaml_2.11 0.4.1
- org.apache.solr:solr-core 8.5.1
- org.apache.solr:solr-solrj 8.5.1
- org.apache.solr:solr-test-framework 8.5.1
- org.joda:joda-convert 1.2
- org.postgresql:postgresql 42.2.5
- org.scala-lang:scala-library 2.11.11
- com.lucidworks.spark:spark-solr 3.7.6 test
- com.opentable.components:otj-pg-embedded 0.13.3 test
- junit:junit test
- junit:junit 4.13 test
- org.apache.spark:spark-catalyst_2.11 2.4.3 test
- org.apache.spark:spark-core_2.11 2.4.3 test
- org.apache.spark:spark-sql_2.11 2.4.3 test
- org.junit.jupiter:junit-jupiter-engine 5.1.0 test
- org.scalatest:scalatest_2.11 test
- org.scalatest:scalatest_2.11 3.0.8 test
spark-csv/pom.xml
maven
- io.frama.parisni:spark-dataframe
spark-dataframe/pom.xml
maven
- io.delta:delta-core_${scala.tools.version}
- io.frama.parisni:spark-quality
spark-hive/pom.xml
maven
- io.delta:delta-core_${scala.tools.version}
- io.frama.parisni:spark-dataframe
- io.frama.parisni:spark-postgres
- net.jcazevedo:moultingyaml_${scala.tools.version}
spark-meta/pom.xml
maven
- com.opentable.components:otj-pg-embedded
- io.frama.parisni:spark-csv
- io.frama.parisni:spark-dataframe
- io.frama.parisni:spark-postgres
- net.jcazevedo:moultingyaml_${scala.tools.version}
spark-postgres/pom.xml
maven
- com.opentable.components:otj-pg-embedded provided
- org.apache.spark:spark-core_${scala.tools.version} provided
- org.apache.spark:spark-sql_${scala.tools.version} provided
- de.bytefish:pgbulkinsert
- io.frama.parisni:spark-dataframe
- org.postgresql:postgresql
- com.opentable.components:otj-pg-embedded test
- org.apache.spark:spark-sql_${scala.tools.version} test
- org.scalatest:scalatest_${scala.tools.version} test
spark-quality/pom.xml
maven
- com.amazon.deequ:deequ
- net.jcazevedo:moultingyaml_${scala.tools.version}
spark-sync/pom.xml
maven
- org.apache.solr:solr-core compile
- com.esotericsoftware:kryo-shaded
- com.fasterxml.jackson.core:jackson-core
- com.fasterxml.jackson.core:jackson-databind
- com.fasterxml.jackson.module:jackson-module-scala_${scala.tools.version}
- io.delta:delta-core_${scala.tools.version}
- io.frama.parisni:spark-dataframe
- io.frama.parisni:spark-postgres
- net.jcazevedo:moultingyaml_${scala.tools.version}
- org.apache.solr:solr-solrj
- org.postgresql:postgresql
- com.github.tomakehurst:wiremock test
- com.lucidworks.spark:spark-solr test
- com.opentable.components:otj-pg-embedded test
- org.apache.solr:solr-test-framework test
- org.apache.spark:spark-sql_${scala.tools.version} test
spark-meta-frontend/package-lock.json
npm
- 1518 dependencies
spark-meta-frontend/package.json
npm
- @emotion/core ^10.0.28
- @emotion/styled ^10.0.27
- @material-ui/core ^4.9.7
- @material-ui/icons ^4.9.1
- @material-ui/lab ^4.0.0-alpha.48
- @projectstorm/react-diagrams ^6.0.2
- @projectstorm/react-diagrams-core ^6.0.2
- @testing-library/jest-dom ^4.2.4
- @testing-library/react ^9.5.0
- @testing-library/user-event ^7.2.1
- closest 0.0.1
- dagre ^0.8.5
- express ^4.17.1
- mathjs ^6.6.1
- pathfinding ^0.4.18
- paths-js ^0.4.10
- pg ^8.0.3
- react ^16.13.1
- react-dom ^16.13.1
- react-router-dom ^5.1.2
- react-scripts 3.4.1
- resize-observer-polyfill ^1.5.1
- tmp ^0.2.1
- typescript ^3.8.3
spark-meta-frontend/Dockerfile
docker
- debian buster-slim build
spark-query/pom.xml
maven