arx

ARX is a comprehensive open source data anonymization tool aiming to provide scalability and usability. It supports various anonymization techniques, methods for analyzing data quality and re-identification risks, and well-known privacy models such as k-anonymity, l-diversity, t-closeness and differential privacy.

https://github.com/arx-deidentifier/arx

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
    4 of 32 committers (12.5%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.7%) to scientific vocabulary

Keywords

arx cross-platform data-analytics data-anonymization de-identification open-source privacy
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: arx-deidentifier
  • License: apache-2.0
  • Language: Java
  • Default Branch: master
  • Homepage: http://arx.deidentifier.org/
  • Size: 375 MB
Statistics
  • Stars: 670
  • Watchers: 33
  • Forks: 227
  • Open Issues: 62
  • Releases: 25
Created almost 13 years ago · Last pushed 6 months ago
Metadata Files
Readme Contributing License Code of conduct Citation

readme.md

ARX - Open Source Data Anonymization Software

Note

ARX is the result of a research project. To support our research, please cite one of our papers instead of referencing our website in scientific articles. You can find an overview of papers about ARX here. If you are not sure which paper to cite, we recommend this one:

Prasser F., Eicher J., Spengler H., Bild R., Kuhn K. A. (2020) Flexible data anonymization using ARX - Current status and challenges ahead. Software: Practice and Experience 50(7):1277-1304. doi:10.1002/spe.2812

Thanks!

Introduction

ARX is comprehensive open source software for anonymizing sensitive personal data. It has been designed from the ground up to provide high scalability, ease of use and tight integration of the many different aspects relevant to data anonymization. Its highlights include:

  • Utility-focused anonymization using different statistical models
  • Syntactic privacy models, such as k-anonymity, ℓ-diversity, t-closeness and δ-presence
  • Semantic privacy models, such as (ε, δ)-differential privacy
  • Methods for optimizing the profitability of data publishing based on monetary cost-benefit analyses
  • Data transformation with generalization, suppression, microaggregation and top/bottom coding as well as global and local recoding
  • Methods for analyzing data utility
  • Methods for analyzing re-identification risks
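
To make the interplay between generalization and k-anonymity from the list above concrete, here is a minimal, self-contained Java sketch. It does not use the ARX API: the class `KAnonymitySketch`, its methods, and the sample ages and ZIP codes are all hypothetical and chosen only for illustration. A table is treated as k-anonymous when every combination of quasi-identifier values is shared by at least k rows.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: these names are hypothetical and not part of the ARX API.
final class KAnonymitySketch {

    // Generalizes a non-negative age to its 10-year interval, e.g. 34 becomes "30-39".
    static String generalizeAge(int age) {
        int lower = (age / 10) * 10;
        return lower + "-" + (lower + 9);
    }

    // Masks the last `digits` characters of a ZIP code, e.g. "81675" becomes "816**".
    // Assumes digits >= 0; masks the whole string if digits exceeds its length.
    static String generalizeZip(String zip, int digits) {
        int keep = Math.max(0, zip.length() - digits);
        return zip.substring(0, keep) + "*".repeat(zip.length() - keep);
    }

    // True if every combination of quasi-identifier values is shared by at least k rows.
    static boolean isKAnonymous(List<String[]> rows, int k) {
        Map<String, Integer> classSizes = new HashMap<>();
        for (String[] row : rows) {
            classSizes.merge(String.join("|", row), 1, Integer::sum);
        }
        return classSizes.values().stream().allMatch(size -> size >= k);
    }

    // Applies both generalizations to parallel arrays of ages and ZIP codes.
    static List<String[]> generalize(int[] ages, String[] zips, int maskedDigits) {
        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < ages.length; i++) {
            rows.add(new String[] { generalizeAge(ages[i]), generalizeZip(zips[i], maskedDigits) });
        }
        return rows;
    }

    public static void main(String[] args) {
        int[] ages = { 34, 36, 45, 47 };
        String[] zips = { "81675", "81677", "81925", "81931" };

        // The raw table: every row is unique on (age, ZIP).
        List<String[]> raw = new ArrayList<>();
        for (int i = 0; i < ages.length; i++) {
            raw.add(new String[] { String.valueOf(ages[i]), zips[i] });
        }

        System.out.println("raw data 2-anonymous: " + isKAnonymous(raw, 2));
        System.out.println("generalized data 2-anonymous: "
                + isKAnonymous(generalize(ages, zips, 2), 2));
    }
}
```

In this toy table the raw rows are all unique, so the first check fails. After generalizing ages to decades and masking two ZIP digits, the rows collapse into two groups of two, so the second check passes. A real anonymization tool additionally searches for the transformation that loses the least information, which this sketch does not attempt.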

The software is able to handle very large datasets on commodity hardware and features an intuitive cross-platform graphical user interface. You can find further information on the project website.
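
As a rough illustration of what analyzing re-identification risks can mean, the following Java sketch computes two simple measures under the so-called prosecutor attacker model, where a record's risk is taken to be 1 divided by the number of records sharing its quasi-identifier values. This is a textbook simplification and not ARX's own risk analysis or API; the class `ReidentificationRiskSketch` and the sample keys are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: these names are hypothetical and not part of the ARX API.
final class ReidentificationRiskSketch {

    // Counts how many records share each quasi-identifier combination (equivalence class).
    static Map<String, Integer> classSizes(List<String> quasiIdentifierKeys) {
        if (quasiIdentifierKeys.isEmpty()) {
            throw new IllegalArgumentException("at least one record is required");
        }
        Map<String, Integer> sizes = new HashMap<>();
        for (String key : quasiIdentifierKeys) {
            sizes.merge(key, 1, Integer::sum);
        }
        return sizes;
    }

    // Highest per-record risk: 1 / size of the smallest equivalence class.
    static double highestRisk(Map<String, Integer> sizes) {
        return 1.0 / Collections.min(sizes.values());
    }

    // Average per-record risk. Each class of size s contributes s records with
    // risk 1/s, so the average simplifies to (number of classes) / (number of records).
    static double averageRisk(Map<String, Integer> sizes) {
        int records = 0;
        for (int size : sizes.values()) {
            records += size;
        }
        return (double) sizes.size() / records;
    }

    public static void main(String[] args) {
        // Three records share one generalized key; one record is unique.
        List<String> keys = List.of("30-39|816**", "30-39|816**", "30-39|816**", "40-49|819**");
        Map<String, Integer> sizes = classSizes(keys);
        System.out.println("highest risk: " + highestRisk(sizes));
        System.out.println("average risk: " + averageRisk(sizes));
    }
}
```

With the sample keys, the unique record drives the highest risk to 1.0, while the average risk is 2 classes over 4 records, or 0.5.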

Development setup

Currently, the main development of ARX is carried out using Eclipse as an IDE and Ant as a build tool. Support for other IDEs, such as IntelliJ IDEA, and for Maven as a build tool is experimental.

The Ant build script features various targets that can be used to build different versions of ARX (e.g. with or without GUI code). To build only the core code using Maven, set the system property core to true. This builds a platform-independent jar containing the ARX main code module and no GUI components:

$ mvn compile -Dcore=true

Contributing and code of conduct

See the Contributing and Code of conduct files in the repository.

License

ARX (C) 2012 - 2025 Fabian Prasser and Contributors.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

External Libraries

ARX uses external libraries. Their licenses are listed in the respective folders.

Owner

  • Name: ARX
  • Login: arx-deidentifier
  • Kind: user
  • Location: Berlin, Germany

Open source data de-identification / anonymization software providing scalability, usability and support for various anonymization techniques.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
preferred-citation:
  type: article
  authors:
  - family-names: "Prasser"
    given-names: "Fabian"
    orcid: "https://orcid.org/0000-0003-3172-3095"
  - family-names: "Eicher"
    given-names: "Johanna"
    orcid: "https://orcid.org/0000-0003-4871-0282"
  - family-names: "Spengler"
    given-names: "Helmut"
    orcid: "https://orcid.org/0000-0002-1389-3755"
  - family-names: "Bild"
    given-names: "Raffael"
    orcid: "https://orcid.org/0000-0002-7398-5598"
  - family-names: "Kuhn"
    given-names: "Klaus A"
    orcid: "https://orcid.org/0000-0003-2575-8507"
  doi: "10.1002/spe.2812"
  journal: "Software: Practice and Experience"
  month: 6
  start: 1277
  end: 1304
  title: "Flexible data anonymization using ARX - Current status and challenges ahead"
  issue: 7
  volume: 50
  year: 2020

GitHub Events

Total
  • Watch event: 41
  • Issue comment event: 3
  • Push event: 10
  • Pull request event: 3
  • Fork event: 13
Last Year
  • Watch event: 41
  • Issue comment event: 3
  • Push event: 10
  • Pull request event: 3
  • Fork event: 13

Committers

Last synced: almost 3 years ago

All Time
  • Total Commits: 4,004
  • Total Committers: 32
  • Avg Commits per committer: 125.125
  • Development Distribution Score (DDS): 0.243
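
The two derived figures above can be reproduced from the raw counts. The sketch below assumes that the Development Distribution Score is defined as 1 minus the top committer's share of all commits; this definition is an assumption, not something stated on this page, although it does reproduce the listed value when the top committer's 3,032 commits are used. The class and method names are hypothetical.

```java
import java.util.Locale;

// Illustrative sketch only: reproduces the derived statistics from the raw counts.
final class CommitStatsSketch {

    // Average number of commits per committer.
    static double averageCommits(int totalCommits, int committers) {
        return (double) totalCommits / committers;
    }

    // Assumed definition: 1 - (commits by the top committer / total commits).
    static double developmentDistributionScore(int topCommitterCommits, int totalCommits) {
        return 1.0 - (double) topCommitterCommits / totalCommits;
    }

    public static void main(String[] args) {
        int totalCommits = 4004;        // "Total Commits" on this page
        int committers = 32;            // "Total Committers" on this page
        int topCommitterCommits = 3032; // largest entry under "Top Committers"

        System.out.printf(Locale.ROOT, "avg=%.3f dds=%.3f%n",
                averageCommits(totalCommits, committers),
                developmentDistributionScore(topCommitterCommits, totalCommits));
    }
}
```

Under this assumed definition, a score near 0 means one person authored almost all commits, and a higher score means commits are spread more evenly across committers.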
Top Committers
| Name | Email | Commits |
| --- | --- | --- |
| Fabian Prasser | m****l@f****e | 3,032 |
| Florian Kohlmayer | k****r@g****m | 554 |
| Karol Babioch | k****l@b****e | 196 |
| eicherj | j****r@g****m | 50 |
| ga82bos | m****n@t****e | 48 |
| Johanna Eicher | e****j@u****m | 27 |
| waltl | w****l@w****l | 15 |
| Fabian Prasser | m****l@f****e | 14 |
| dependabot[bot] | 4****]@u****m | 8 |
| Sebastian Stammler | s****r@c****e | 6 |
| Thierry Meurers | t****s@c****e | 6 |
| jgaupp | j****p@v****u | 5 |
| ARX | a****r@g****m | 5 |
| Raffael Bild | r****d@t****e | 5 |
| ElMuto | h****r@g****m | 5 |
| Ibraheem Al-Dhamari | i****x@y****m | 4 |
| Martin Waltl | M****l@1****5 | 3 |
| nartz | n****d@g****m | 3 |
| Martin Waltl | M****l@M****x | 2 |
| marklackey | m****y@u****m | 2 |
| Craig Hawco | c****o@g****m | 2 |
| dfirman | d****n@h****a | 2 |
| Florian Kohlmayer | f****r@g****m | 1 |
| Di | w****4@m****a | 1 |
| iylee71 | 5****1@u****m | 1 |
| tanjascats | 4****s@u****m | 1 |
| Shu | w****s@y****m | 1 |
| Rourke101 | k****3@m****m | 1 |
| Igor Vujosevic | i****r@s****m | 1 |
| Nenad Jevdjenic | n****c@g****m | 1 |
and 2 more...

Dependencies

pom.xml maven
  • colt:colt 1.2.0
  • com.carrotsearch:hppc 0.6.0
  • com.github.haifengl:smile 1.3.1
  • com.github.ralfstuckert.pdfbox-layout:pdfbox2-layout 1.0.0
  • com.google.guava:guava 18.0
  • com.oracle:ojdbc7 12.1.0.2
  • com.univocity:univocity-parsers 2.8.4
  • commons-io:commons-io 2.7
  • commons-lang:commons-lang 2.6
  • commons-validator:commons-validator 1.4.1
  • de.linearbits.swt:swtchoicesdialog 0.0.1
  • de.linearbits.swt:swtknob 1.0.0
  • de.linearbits.swt:swtpreferences 0.0.1
  • de.linearbits.swt:swtrangeslider 0.0.1
  • de.linearbits.swt:swtsimplebrowser 0.0.1
  • de.linearbits.swt:swttable 0.0.1
  • de.linearbits.swt:swttiles 0.0.1
  • de.linearbits:jhpl 0.0.1
  • de.linearbits:libjhc swt
  • de.linearbits:newtonraphson 0.0.1
  • de.linearbits:objectselector lib
  • jtds:jtds 1.3.1
  • mysql:mysql-connector-java 8.0.28
  • net.objecthunter:exp4j 0.4.8
  • org.apache.commons:commons-lang3 3.12.0
  • org.apache.commons:commons-math3 3.6.1
  • org.apache.hadoop:hadoop-core 1.2.1
  • org.apache.logging.log4j:log4j-1.2-api 2.17.2
  • org.apache.logging.log4j:log4j-api 2.17.2
  • org.apache.logging.log4j:log4j-core 2.17.2
  • org.apache.mahout:mahout-core 0.9
  • org.apache.mahout:mahout-math 0.11.1
  • org.apache.pdfbox:pdfbox-app 2.0.1
  • org.apache.poi:poi 4.1.1
  • org.apache.poi:poi-ooxml 3.10-FINAL
  • org.apache.poi:poi-ooxml-schemas 3.10-FINAL
  • org.apache.xmlbeans:xmlbeans 3.0.0
  • org.eclipse.nebula.visualization:widgets 1.1.0.202006092019
  • org.eclipse.nebula.visualization:xygraph 3.1.0.202006092019
  • org.eclipse.nebula.widgets.nattable:core 1.6.0
  • org.eclipse.nebula.widgets.pagination:core 1.0.0
  • org.eclipse.platform:org.eclipse.core.commands 3.9.700
  • org.eclipse.platform:org.eclipse.core.runtime 3.18.0
  • org.eclipse.platform:org.eclipse.equinox.common 3.12.0
  • org.eclipse.platform:org.eclipse.jface 3.20.0
  • org.eclipse:draw2d 3.2.100-v20070529
  • org.hamcrest:hamcrest-core 1.3
  • org.postgresql:postgresql 42.2.25
  • org.slf4j:slf4j-api 1.7.13
  • org.swtchart:ext 0.8.0.v20120301
  • org.swtchart:swtchart 0.8.0.v20120301
  • org.xerial:sqlite-jdbc 3.7.2
  • junit:junit 4.13.1 test
.github/workflows/codeql.yml actions
  • actions/checkout v3 composite
  • github/codeql-action/analyze v2 composite
  • github/codeql-action/autobuild v2 composite
  • github/codeql-action/init v2 composite
.github/workflows/pmd.yml actions
  • actions/checkout v3 composite
  • actions/setup-java v3 composite
  • github/codeql-action/upload-sarif v2 composite
  • pmd/pmd-github-action 40392a149b9cfa24bf4c03989cc762e6814668bd composite