io.github.andrewquijano:level-site-ppdt
Enhanced Outsourced and Secure Inference for Tall Sparse Decision Trees
Statistics
- Stars: 0
- Watchers: 2
- Forks: 2
- Open Issues: 2
- Releases: 3
Topics
Metadata Files
README.md
Level-Site-PPDT
Implementation of the PPDT in the paper "Evaluating Outsourced Decision Trees by a Level-Based Approach"
Installation
SDKMAN! is required to install Gradle. Install the following packages to ensure everything works as expected:
```bash
sudo apt-get install -y default-jdk default-jre graphviz curl python3-pip
pip3 install pyyaml
pip3 install configobj
curl -s "https://get.sdkman.io" | bash
source "$HOME/.sdkman/bin/sdkman-init.sh"
```
In a new terminal, run this command:
```bash
sdk install gradle
```
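Before moving on, it may help to confirm that the tools installed above are actually on your PATH. The following is a minimal sketch and not part of the repository: the function name `missing_tools` is hypothetical, and the command names checked (`java`, `dot` from Graphviz, `curl`, `pip3`, `gradle`) are assumptions based on the packages listed above.

```shell
#!/usr/bin/env bash
# Print every command from the argument list that is not found on PATH.
# Prints nothing when all of them are available.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
    return 0
}

# Report which prerequisites (assumed command names) are still missing.
missing_tools java dot curl pip3 gradle
```

An empty output suggests the prerequisites are in place; any name printed is a tool the shell could not find.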
Run this command and all future commands from the Level-Site-PPDT folder. Run the following command once to install Docker and Minikube.
Reboot your machine, then re-run the command to install Minikube.
bash
bash setup.sh
Also, remember to install Sealed Secrets.
```bash
sudo apt-get install jq

# Fetch the latest sealed-secrets version using GitHub API
KUBESEAL_VERSION=$(curl -s https://api.github.com/repos/bitnami-labs/sealed-secrets/tags | jq -r '.[0].name' | cut -c 2-)

# Check if the version was fetched successfully
if [ -z "$KUBESEAL_VERSION" ]; then
    echo "Failed to fetch the latest KUBESEAL_VERSION"
    exit 1
fi

wget "https://github.com/bitnami-labs/sealed-secrets/releases/download/v${KUBESEAL_VERSION}/kubeseal-${KUBESEAL_VERSION}-linux-amd64.tar.gz"
tar -xvzf kubeseal-"${KUBESEAL_VERSION}"-linux-amd64.tar.gz kubeseal
sudo install -m 755 kubeseal /usr/local/bin/kubeseal
rm kubeseal*

# Install Helm
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh
rm ./get_helm.sh

# Add Sealed Secret Cluster
helm repo add sealed-secrets https://bitnami-labs.github.io/sealed-secrets
helm install sealed-secrets -n kube-system --set-string fullnameOverride=sealed-secrets-controller sealed-secrets/sealed-secrets
```
Before you run the PPDT, make sure to create your keystore; this is necessary because the level-sites use TLS sockets.
Either run the create_keystore.sh script, making sure the password is consistent with the Kubernetes secret, or just use the Sealed Secret.
Running PPDT locally
- Check that the `config.properties` file is set to your needs. Currently:
  - It assumes level-site 0 would use port 9000, level-site 1 would use port 9001, etc.
    - If you modify this, provide a comma-separated string of all the ports for each level-site.
    - Currently, it assumes ports 9000–9009 will be used.
  - key_size corresponds to the key size of both DGK and Paillier keys.
  - precision controls how many decimal places of a threshold are kept. If a value was 100.1, then a precision of 1 would set this value to 1001.
  - The data would point to the directory with the `answers.csv` file and all the training and testing data.
- Currently, the test file will read from the `data/answers.csv` file.
  - The first column is the training data set; it is required to be a .arff file to be compatible with Weka. Alternatively, you can pass a .model file, which is a pre-trained Weka model. It is assumed this is a J48 classifier tree model.
  - The second column would be the name of an input file that is tab-separated with the feature name and value.
  - The third column would be the expected classification given the input from the second column. If there is a mismatch, there will be an assertion error.
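The port and precision settings described above can be illustrated with a short sketch. Neither helper is part of the repository: the names `port_list` and `scale_threshold` are hypothetical, and rounding to the nearest integer is an assumption about how the scaling behaves rather than something confirmed from the source.

```shell
#!/usr/bin/env bash
# Build a comma-separated port list: one port per level-site,
# counting up from a base port (level-site 0 gets the base port).
port_list() {
    local base=$1 count=$2 i out=""
    for ((i = 0; i < count; i++)); do
        out+="${out:+,}$((base + i))"
    done
    echo "$out"
}

# Scale a decimal threshold to an integer: value * 10^precision.
# Assumption: the result is rounded to the nearest integer.
scale_threshold() {
    awk -v v="$1" -v p="$2" 'BEGIN { printf "%.0f\n", v * 10 ^ p }'
}

port_list 9000 10        # ports 9000 through 9009, comma-separated
scale_threshold 100.1 1  # prints 1001
```

The first call produces the kind of string the ports setting expects for ten level-sites; the second reproduces the 100.1 to 1001 example for a precision of 1.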
To run the end-to-end test, run the following:
bash
sh gradlew build
When the testing is done, you will have an output directory containing both the DT model and a text file on how to draw your tree. Input the contents of the text file into the website here to get a drawing of what the DT looks like.
If you want to analyze the level of each classification in a pre-trained decision tree from the data folder,
run the following (where argument is the name of the dataset):
bash
./gradlew run -PchooseRole=weka.finito.utils.depth_analysis --args spambase
This will read the DT in data/spambase.model which was trained from the data set data/spambase.arff.
It will classify all the data in the training set,
and get the level (1, ..., d) of the classification within the DT model.
In the paper, I used this to argue that, assuming most testing data resembles the training data,
you will rarely need to traverse the whole tree.
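To run the depth analysis over every pre-trained model instead of one dataset at a time, something like the following could work. It is a sketch under assumptions: the helper names `list_datasets` and `analyze_all` are hypothetical, and it assumes each dataset is identified by a `<name>.model` file in the data folder, as with data/spambase.model above.

```shell
#!/usr/bin/env bash
# List dataset names (file names without the .model suffix) found in a directory.
list_datasets() {
    local dir=$1 f
    for f in "$dir"/*.model; do
        [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
        basename "$f" .model
    done
}

# Run the depth analysis once per dataset.
# Requires the Gradle wrapper from this repository in the current directory.
analyze_all() {
    local name
    for name in $(list_datasets "${1:-data}"); do
        ./gradlew run -PchooseRole=weka.finito.utils.depth_analysis --args "$name"
    done
}
```

Calling `analyze_all` from the Level-Site-PPDT folder would then invoke the same command shown above once for each model file.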
Running PPDT on Kubernetes clusters
To make deploying on the cloud easier, we also provide a way to run our system on Kubernetes. This assumes one execution rather than multiple executions.
Option 1 - Using Minikube
You will need to start and configure Minikube. When writing the paper, we provided 8 CPUs and 20 GB of memory; adjust the arguments below to fit your computer's specs.
minikube start --cpus 8 --memory 20000
eval $(minikube docker-env)
Option 2 - Running it on an EKS Cluster
First, install eksctl.
Create a user with sufficient permissions. Go to IAM, select Users, Create User, then Attach policies directly; for a quick experiment, select all permissions.
Obtain the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY of the user account. See the documentation provided here
Run aws configure to input the access ID and credential.
Run the following command to create the cluster:
eksctl create cluster --config-file eks-config/single-cluster.yaml
Confirm the EKS cluster exists using the following:
eksctl get clusters --region us-east-1
Once you confirm the cluster is created, you need to register the cluster with kubectl:
aws eks update-kubeconfig --name ppdt --region us-east-1
Using/Creating a Kubernetes Sealed Secret
It is suggested you use the existing sealed secret. The password in this secret is aligned with what is on the keystore.
commandline
kubectl apply -f ppdt-sealedsecret.yaml
Alternatively, you can create a new sealed secret as follows:
bash
kubectl create secret generic ppdt-secrets --from-literal=keystore-pass=<SECRET_VALUE>
kubectl get secret ppdt-secrets -o yaml | kubeseal --scope cluster-wide > ppdt-sealedsecret.yaml
However, if you make a new sealed secret, you should re-make the keystore as well. Note that, by default, sealed secrets do not work across multiple clusters.
Running Kubernetes Commands
The next step is to deploy all the components by running the following:
kubectl apply -f k8/server
kubectl apply -f k8/level_sites
kubectl apply -f k8/client
You will then need to wait until all the level sites are launched. To verify this, run the following command. All the pods with level-site in their name should have the status Running.
kubectl get pods
The output of kubectl get pods would look something like:
NAME READY STATUS RESTARTS AGE
ppdt-level-site-01-deploy-7dbf5b4cdd-wz6q7 1/1 Running 1 (2m39s ago) 16h
ppdt-level-site-02-deploy-69bb8fd5c6-wjjbs 1/1 Running 1 (2m39s ago) 16h
ppdt-level-site-03-deploy-74f7d95768-r6tn8 1/1 Running 1 (16h ago) 16h
ppdt-level-site-04-deploy-6d99df8d7b-d6qlj 1/1 Running 1 (2m39s ago) 16h
ppdt-level-site-05-deploy-855b649896-82hlm 1/1 Running 1 (2m39s ago) 16h
ppdt-level-site-06-deploy-6578fc8c9b-ntzhn 1/1 Running 1 (16h ago) 16h
ppdt-level-site-07-deploy-6f57496cdd-hlggh 1/1 Running 1 (16h ago) 16h
ppdt-level-site-08-deploy-6d596967b8-mh9hz 1/1 Running 1 (2m39s ago) 16h
ppdt-level-site-09-deploy-8555c56976-752pn 1/1 Running 1 (16h ago) 16h
ppdt-level-site-10-deploy-67b7c5689b-rkl6r 1/1 Running 1 (2m39s ago) 16h
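Rather than re-running the command by hand, the check can be scripted. The sketch below is not part of the repository: the function names are hypothetical, and it assumes the column layout shown in the sample output above (pod name in the first column, STATUS in the third).

```shell
#!/usr/bin/env bash
# Read `kubectl get pods` output on stdin and print the names of
# level-site pods whose STATUS column is not "Running".
pending_level_sites() {
    awk 'NR > 1 && $1 ~ /level-site/ && $3 != "Running" { print $1 }'
}

# Poll every 5 seconds until no level-site pod is reported as pending.
wait_for_level_sites() {
    while [ -n "$(kubectl get pods | pending_level_sites)" ]; do
        sleep 5
    done
}
```

Note that `wait_for_level_sites` returns immediately if no level-site pods exist yet, so it is only meaningful after the `kubectl apply` commands have been run.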
It takes time for a level-site to be able to accept connections. Run the following commands (shown here for level-sites 01 and 10),
and wait for a line in standard output saying LEVEL SITE SERVER STARTED!. Use CTRL+C to stop following the logs.
kubectl logs -f $(kubectl get pod -l "pod=ppdt-level-site-01-deploy" -o name)
kubectl logs -f $(kubectl get pod -l "pod=ppdt-level-site-10-deploy" -o name)
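If you would rather not watch the logs manually, the wait can be approximated with `grep`. This is a sketch only: the function names are hypothetical, and the label selector passed in is assumed to follow the pattern used in the commands above.

```shell
#!/usr/bin/env bash
# Succeed as soon as the given fixed string appears on stdin;
# fail if the input ends without it.
wait_for_line() {
    grep -q -F -- "$1"
}

# Follow a level-site pod's logs until its start-up message appears.
# $1 is a label selector such as "pod=ppdt-level-site-01-deploy".
wait_for_level_site() {
    local selector=$1
    kubectl logs -f "$(kubectl get pod -l "$selector" -o name)" |
        wait_for_line "LEVEL SITE SERVER STARTED!"
}
```

One caveat: `kubectl logs -f` only notices that `grep` has exited when it next tries to write, so the call may linger until the pod logs another line.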
Next, you need to run the server to create the decision tree and split the model among the level-sites. You can do this by connecting to the pod via a terminal using the commands below.
kubectl exec -i -t $(kubectl get pod -l "pod=ppdt-server-deploy" -o name) -- /bin/bash
gradle run -PchooseRole=weka.finito.server --args <TRAINING-FILE>
Alternatively, you can combine the above commands as follows:
kubectl exec -i -t $(kubectl get pod -l "pod=ppdt-server-deploy" -o name) -- bash -c "gradle run -PchooseRole=weka.finito.server --args <TRAINING-FILE>"
Once you see the output Server ready to get public keys from client-site, you need to run the client.
In a NEW terminal, start the client by running the following commands to complete an evaluation.
Set <VALUES-FILE> to something like /data/hypothyroid.values.
kubectl exec -i -t $(kubectl get pod -l "pod=ppdt-client-deploy" -o name) -- /bin/bash
gradle run -PchooseRole=weka.finito.client --args <VALUES-FILE>
# Test WITHOUT level-sites
gradle run -PchooseRole=weka.finito.client --args '<VALUES-FILE> --server'
Alternatively, you can combine both commands in one go as follows:
kubectl exec -i -t $(kubectl get pod -l "pod=ppdt-client-deploy" -o name) -- bash -c "gradle run -PchooseRole=weka.finito.client --args <VALUES-FILE>"
# Test WITHOUT level-sites
kubectl exec -i -t $(kubectl get pod -l "pod=ppdt-client-deploy" -o name) -- bash -c "gradle run -PchooseRole=weka.finito.client --args '<VALUES-FILE> --server'"
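The values file passed to the client is described earlier as tab-separated, with a feature name and a value on each line. A quick format check before launching the client might look like the sketch below. The function name `check_values_file` is hypothetical, and treating blank lines as acceptable is an assumption.

```shell
#!/usr/bin/env bash
# Return success only if every non-empty line has exactly two
# tab-separated fields (feature name, value).
# Prints the first offending line number on failure.
check_values_file() {
    awk -F '\t' '
        NF > 0 && NF != 2 {
            print "line " NR ": expected <feature><TAB><value>"
            bad = 1
            exit
        }
        END { exit bad }
    ' "$1"
}
```

For example, `check_values_file data/hypothyroid.values && echo ok` would print ok only when the file has the expected two-column layout.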
Re-running with different experiments
If you are just re-running the client with the same or a different values file, re-run the above command. However, if you want to test with another data set, it is best to rebuild the environment by deleting everything first.
bash
kubectl delete -f k8/client
kubectl delete -f k8/server
kubectl delete -f k8/level_sites
Then repeat the instructions in the previous section.
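The teardown and redeploy sequence can be wrapped in a single function. This is a sketch, not a script shipped with the repository: `redeploy` and the `KUBECTL` override variable are hypothetical names, and the apply order is copied from the deployment commands earlier in this README.

```shell
#!/usr/bin/env bash
# Command used to talk to the cluster; set KUBECTL=echo for a dry run
# that only prints the kubectl arguments.
KUBECTL=${KUBECTL:-kubectl}

# Delete all components, then re-apply them in deployment order.
redeploy() {
    local part
    for part in client server level_sites; do
        "$KUBECTL" delete -f "k8/$part"
    done
    for part in server level_sites client; do
        "$KUBECTL" apply -f "k8/$part"
    done
}
```

Running `KUBECTL=echo redeploy` first lets you inspect the six commands before touching the cluster. After a real redeploy, you still need to wait for the level-sites to start before running the server and client.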
Clean up
Destroy the EKS cluster using the following:
bash
eksctl delete cluster --config-file eks-config/single-cluster.yaml --wait
Destroy the MiniKube environment as follows:
bash
minikube delete
Authors and Acknowledgement
Code Authors: Andrew Quijano, Spyros T. Halkidis, Kevin Gallagher
Kevin Gallagher is supported by NOVA LINCS ref. UIDB/04516/2020 and ref. UIDP/04516/2020 with the financial support of FCT.IP.
License
Project status
The project is fully tested.
Current Issues
- The encryption library seems to have a bug in some specific comparisons in spambase and hypothyroid; the cause is not yet known. I will debug these soon, but overall this works well.
- TLS sockets do not work on EKS, but I will fix this eventually. They work on all connections except when level-site 1 reaches out to the client for evaluation.
- A much bigger issue: in the first few runs of this application on EKS, comparisons are fast, taking about 0.5 seconds each. After 10 or more comparisons, performance drops sharply, to about 1 second per comparison. The only way I have found to restore the original performance is to rebuild the EKS cluster. I have no idea why this drop occurs; deleting and rebuilding the pods, and even restarting the EC2 instances, did not help.
Owner
- Name: Advanced Wireless and Security Lab
- Login: adwise-fiu
- Kind: organization
- Location: United States of America
- Repositories: 3
- Profile: https://github.com/adwise-fiu
ADWISE laboratory at Florida International University - Department of Electrical and Computer Engineering.
Citation (CITATION.cff)
cff-version: 1.0.0
message: "If you use this software, please cite the paper."
authors:
  - family-names: "Quijano"
    given-names: "Andrew"
    orcid: "https://orcid.org/0000-0002-6673-4934"
  - family-names: "Halkidis"
    given-names: "Spyros T."
    orcid: "https://orcid.org/0000-0001-9983-1012"
  - family-names: "Gallagher"
    given-names: "Kevin"
    orcid: "https://orcid.org/0000-0002-2714-7841"
  - family-names: "Akkaya"
    given-names: "Kemal"
    orcid: "https://orcid.org/0000-0002-7103-4545"
  - family-names: "Samaras"
    given-names: "Nikolaos"
    orcid: "https://orcid.org/0000-0001-8201-7081"
title: "Enhanced Outsourced and Secure Inference for Tall Sparse Decision Trees"
version: 2.0.0
doi: TBD
date-released: 2024-03-23
url: "https://github.com/AndrewQuijano/Level-Site-PPDT"
preferred-citation:
  type: conference-paper
  authors:
    - family-names: "Quijano"
      given-names: "Andrew"
      orcid: "https://orcid.org/0000-0002-6673-4934"
    - family-names: "Halkidis"
      given-names: "Spyros T."
      orcid: "https://orcid.org/0000-0001-9983-1012"
    - family-names: "Gallagher"
      given-names: "Kevin"
      orcid: "https://orcid.org/0000-0002-2714-7841"
    - family-names: "Akkaya"
      given-names: "Kemal"
      orcid: "https://orcid.org/0000-0002-7103-4545"
    - family-names: "Samaras"
      given-names: "Nikolaos"
      orcid: "https://orcid.org/0000-0001-8201-7081"
  journal: "ESORICS 2024"
  month: 09
  start: 1 # First page number
  end: 19 # Last page number
  title: "Enhanced Outsourced and Secure Inference for Tall Sparse Decision Trees"
  issue: 1
  volume: 1
  year: 2024
GitHub Events
Total
- Release event: 3
- Delete event: 2
- Issue comment event: 6
- Push event: 22
- Pull request review event: 5
- Pull request event: 18
- Fork event: 2
- Create event: 6
Last Year
- Release event: 3
- Delete event: 2
- Issue comment event: 6
- Push event: 22
- Pull request review event: 5
- Pull request event: 18
- Fork event: 2
- Create event: 6
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 0
- Total pull requests: 8
- Average time to close issues: N/A
- Average time to close pull requests: 14 minutes
- Total issue authors: 0
- Total pull request authors: 2
- Average comments per issue: 0
- Average comments per pull request: 0.25
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 8
- Average time to close issues: N/A
- Average time to close pull requests: 14 minutes
- Issue authors: 0
- Pull request authors: 2
- Average comments per issue: 0
- Average comments per pull request: 0.25
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- msthilaire5 (6)
- AndrewQuijano (3)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads: unknown
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 1
repo1.maven.org: io.github.andrewquijano:level-site-ppdt
This JAR file is used for implementing the Level-Site Privacy-Preserving Decision Trees (PPDT). See the paper "Enhanced Outsourced and Secure Inference for Tall Sparse Decision Trees" which describes the implementation and performance gains this approach has in avoiding the conversion to a complete binary tree as in the Joye and Salehi paper.
- Homepage: https://github.com/adwise-fiu/Level-Site-PPDT
- Documentation: https://appdoc.app/artifact/io.github.andrewquijano/level-site-ppdt/
- License: MIT License
- Latest release: 1.0.1 (published 4 months ago)
Rankings
Dependencies
- actions/checkout v3 composite
- actions/setup-java v3 composite
- codecov/codecov-action v3 composite
- docker/build-push-action v4 composite
- docker/login-action v2 composite
- docker/setup-buildx-action v2 composite
- gradle latest build
- commons-io:commons-io 2.14.0 implementation
- junit:junit 4.13.1 implementation
- org.junit.jupiter:junit-jupiter-api 5.8.1 testImplementation
- org.junit.jupiter:junit-jupiter-engine 5.8.1 testRuntimeOnly