https://github.com/google-parfait/confidential-federated-compute

TEE-hosted binaries for verifiable server-side computation.

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file (found)
✓ .zenodo.json file (found)
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (10.0%) to scientific vocabulary

Last synced: 9 months ago

Repository

Basic Info

Host: GitHub
Owner: google-parfait
License: apache-2.0
Language: C++
Default Branch: main
Homepage:
Size: 3.62 MB

Statistics

Stars: 15
Watchers: 6
Forks: 6
Open Issues: 7
Releases: 1

Created over 2 years ago · Last pushed 9 months ago

Metadata Files

Readme Contributing License

Confidential Federated Compute

The Confidential Federated Compute project enables Federated Learning and Analytics using Confidential Computing. This repository holds publicly verifiable components that run within Trusted Execution Environments (TEEs) and interact with user data. In order to run the components in this repository on TEEs, this repository depends on the Project Oak platform. In the Project Oak terminology, this repository contains Trusted Applications that run on the Oak Infrastructure.

Design and Code Structure

The goal of this project is to give end users meaningful control over how their data can be used once it has been uploaded to a central server. Trusted Execution Environments make it possible to prove what is running on a central server. Our contribution is a distributed system design that lets us make claims about a system of TEEs, some of which may not begin execution until after the data has been uploaded, creating a "chain of trust" from the user who uploads data to the TEEs that subsequently process it.

Uploaded data is encrypted and bound to a policy stating how the data can be used. This policy is a directed graph of allowed transformations on the data. The data can be decrypted only by components that prove, through remote attestation with a hardware root of trust, that the binary they are running is allowed by the policy.
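To make the "directed graph of allowed transformations" concrete, here is a minimal sketch. It is not the repository's policy format: the AccessPolicy class, the state names, and the digest strings are all hypothetical, and real attestation evidence is reduced to a plain string comparison.

```python
from dataclasses import dataclass, field


@dataclass
class AccessPolicy:
    """Toy policy graph (hypothetical model, not the real policy schema).

    Nodes are data states; each edge names the binary digest that is
    allowed to consume data in the source state and the resulting state.
    """
    edges: dict = field(default_factory=dict)

    def allow(self, src, binary_digest, dst):
        """Add an edge: `binary_digest` may transform `src` data into `dst`."""
        self.edges.setdefault(src, []).append((binary_digest, dst))

    def next_state(self, src, attested_digest):
        """Return the destination state if the attested binary is allowed
        to read data in state `src`; otherwise return None."""
        for digest, dst in self.edges.get(src, []):
            if digest == attested_digest:
                return dst
        return None


# Illustrative two-step pipeline; names and digests are placeholders.
policy = AccessPolicy()
policy.allow("uploaded", "digest-of-aggregation-binary", "aggregated")
policy.allow("aggregated", "digest-of-release-binary", "released")
```

With this model, policy.next_state("uploaded", "digest-of-aggregation-binary") yields "aggregated", while any digest not listed on an outgoing edge yields None, which corresponds to refusing decryption.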

Key Management Service

The component responsible for enforcing the policy on the data is called the Confidential Federated Compute Key Management Service (CFC KMS). When data is uploaded, it is initially decryptable only by the KMS, which replicates and stores the corresponding decryption keys in memory. This means that if the KMS restarts, access to the uploaded data is lost forever. Other, short-lived components running in Trusted Execution Environments can provide attestations to the KMS and request access to these keys, which are shared only if the attestation matches the policy under which the data was uploaded. The KMS also maintains rollback-protected state for each pipeline for tracking privacy budgets.
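The key-release behavior described above can be illustrated with a small sketch. ToyKms and its methods are hypothetical and do not reflect the API in containers/kms; replication, real attestation verification, and budget state are deliberately left out.

```python
import os


class ToyKms:
    """Hypothetical stand-in for the key-release rule described above."""

    def __init__(self):
        # Keys are held only in memory: a fresh instance knows none of
        # the keys created by a previous one.
        self._keys = {}  # key_id -> (key_bytes, allowed binary digests)

    def create_key(self, allowed_digests):
        """Create a decryption key bound to the digests a policy allows."""
        key_id = os.urandom(8).hex()
        self._keys[key_id] = (os.urandom(32), frozenset(allowed_digests))
        return key_id

    def request_key(self, key_id, attested_digest):
        """Release the key only if the attested digest matches the policy."""
        entry = self._keys.get(key_id)
        if entry is None:
            raise KeyError("unknown key (for example, after a restart)")
        key, allowed = entry
        if attested_digest not in allowed:
            raise PermissionError("attestation does not match the policy")
        return key
```

Creating a second ToyKms() models a restart: a request for a key created by the first instance fails, which mirrors the statement that access to uploaded data is lost if the KMS restarts.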

The code for the KMS is located in the containers/kms directory; see this directory for additional documentation. The KMS runs in a TEE using Oak Containers.

Ledger

The ledger is the KMS's predecessor and plays a similar role. The main difference is that the ledger tracks privacy budgets itself on a per-upload basis instead of delegating budget tracking to each pipeline. Unfortunately, this requires ledger operations and state to scale with the amount of uploaded data, making the ledger a bottleneck for large-scale data processing.
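The scaling difference can be sketched as follows. Both classes are hypothetical simplifications: the real ledger and KMS state formats are not shown in this document, and the per-pipeline state is modeled as an opaque blob.

```python
class PerUploadBudgets:
    """Ledger-style tracking (simplified): one budget entry per upload."""

    def __init__(self, uses_per_upload):
        self.uses_per_upload = uses_per_upload
        self.remaining = {}

    def record_upload(self, blob_id):
        self.remaining[blob_id] = self.uses_per_upload

    def consume(self, blob_id):
        """Spend one use of the blob's budget; return False if exhausted."""
        if self.remaining.get(blob_id, 0) <= 0:
            return False
        self.remaining[blob_id] -= 1
        return True

    def state_entries(self):
        return len(self.remaining)


class PerPipelineState:
    """KMS-style tracking (simplified): one opaque entry per pipeline."""

    def __init__(self):
        self.state = {}

    def update(self, pipeline_id, opaque_state):
        self.state[pipeline_id] = opaque_state

    def state_entries(self):
        return len(self.state)
```

In this model, the per-upload table grows by one entry for every uploaded blob, while the per-pipeline table grows only with the number of pipelines, which is the bottleneck the paragraph above describes.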

The code for the ledger is located in the ledger_service and ledger_enclave_app directories; see the latter for additional documentation. The ledger runs in a TEE using the Oak Restricted Kernel.

Transformations

This repository also contains code for components that run transformations over data within TEEs, if those transformations are allowed by the KMS- or ledger-enforced policy. Transforms are implemented using Oak Containers.

[containers/confidential_transform_test_concat] Example transform that concatenates its inputs.
[containers/fed_sql] Transform that aggregates data using both SQLite and Aggregation Cores.

See each transform's README for more details.

Inspecting attestation verification records and endorsement transparency log entries

See docs/README.md for instructions on mapping attestation verification records logged by Federated Compute clients, as well as transparency log entries for KMS, ledger, and data access policy endorsements, to the reproducibly buildable binaries in this repository.

Building

The following section provides instructions for building artifacts that can be run within Trusted Execution Environments from the source code in this repo.

The eventual goal is to achieve binary transparency; that is, we would like to verifiably link a binary with the source code that produced the binary. This way, when a Trusted Execution Environment remotely attests that it is running a particular binary, anyone can verify that the binary was produced from a particular version of the source code, and thus be convinced that the Trusted Execution Environment is in fact running a particular application.

There are different strategies to achieve binary transparency. One strategy consists of the use of a trusted builder along with provenance and endorsement statements that are signed and published in a transparency log, as described in detail in Oak's Transparent Release documentation. Another strategy is making the build process fully reproducible, so that given a particular version of the source code, bitwise identical artifacts will be produced by the build process regardless of where or when the build process runs. This allows an external auditor to run the build process themselves in order to verify that the source code at a particular version produces the specified binary.
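For the reproducible-build strategy, an auditor's check amounts to comparing cryptographic digests of two independently produced output trees. The sketch below assumes both builds have been written to local directories; the function names are illustrative and are not part of this repository's tooling.

```python
import hashlib
from pathlib import Path


def sha256_file(path):
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def digests(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_builds(dir_a, dir_b):
    """Return relative paths that differ or exist in only one build."""
    a, b = digests(dir_a), digests(dir_b)
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))
```

An empty result from compare_builds means the two trees are bitwise identical file by file. It does not by itself tie a digest to what a TEE attests; that link depends on the attestation evidence.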

We provide the following instructions for building the code. As we make progress toward binary transparency, we will refine these instructions to provide details on how one can verify that a particular version of the source code produces a particular binary.

Prerequisites

Bazelisk

Bazelisk is the Bazel-recommended way to obtain a specific Bazel version. See https://github.com/bazelbuild/bazelisk#installation for installation instructions.

The build scripts in this repository require either bazelisk to be in your PATH or the BAZELISK environment variable to be set to the location of the bazelisk binary.
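The lookup rule above can be expressed as a short pre-flight check. This is a sketch, not the logic in the repository's build scripts: find_bazelisk is a hypothetical helper, and giving the BAZELISK variable precedence over PATH is an assumption.

```python
import os
import shutil


def find_bazelisk(env=None):
    """Return a path to bazelisk, or None if it cannot be located.

    Assumption: an explicit BAZELISK variable wins over a PATH lookup.
    """
    env = os.environ if env is None else env
    explicit = env.get("BAZELISK")
    if explicit:
        return explicit
    # shutil.which searches the given PATH string for an executable.
    return shutil.which("bazelisk", path=env.get("PATH"))
```

Calling find_bazelisk() before starting a build gives an early, readable failure (a None result) when neither condition is met.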

Building Artifacts

Clone this repository to your developer machine. The following commands should all be run from within the repository root.

To specify a directory where the build artifacts will be output, run the following command. Consider adding the line below to your ~/.bashrc or ~/.zshrc so you don't have to run this step every time you enter a new shell.

export BINARY_OUTPUTS_DIR=/tmp/confidential-federated-compute/binaries

To build all artifacts which can be run in enclaves, run the following command:

./scripts/build.sh release

After the above command completes, use the following command to list the output binaries:

ls "${BINARY_OUTPUTS_DIR}"

You should see a subdirectory for each server, including the following:

confidential_transform_test_concat fed_sql kms

These correspond to the components described in Design and Code Structure.
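As a quick sanity check after the build, the expected subdirectories can be verified programmatically. The helper below is illustrative only; the directory names are taken from the listing above, which the text says is not exhaustive.

```python
from pathlib import Path

# Subdirectories named in the listing above (other servers may also appear).
EXPECTED_OUTPUTS = ("confidential_transform_test_concat", "fed_sql", "kms")


def missing_outputs(outputs_dir, expected=EXPECTED_OUTPUTS):
    """Return the expected subdirectory names absent from outputs_dir."""
    root = Path(outputs_dir)
    return [name for name in expected if not (root / name).is_dir()]
```

For example, missing_outputs(os.environ["BINARY_OUTPUTS_DIR"]) (after import os) returns an empty list when all three directories are present.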

Contributing

See CONTRIBUTING.md for details.

License

Apache 2.0; see LICENSE for details.

Disclaimer

This is not an officially supported Google product.

Owner

Name: Parfait
Login: google-parfait
Kind: organization

Website: federated.withgoogle.com
Repositories: 1
Profile: https://github.com/google-parfait

Private aggregation & retrieval, federated analytics, inference, & training from Google.

GitHub Events

Total

Release event: 1
Watch event: 12
Delete event: 1
Issue comment event: 2
Push event: 286
Pull request event: 5
Fork event: 7
Create event: 7

Last Year

Release event: 1
Watch event: 12
Delete event: 1
Issue comment event: 2
Push event: 286
Pull request event: 5
Fork event: 7
Create event: 7

Issues and Pull Requests

Last synced: over 1 year ago

All Time

Total issues: 0
Total pull requests: 2
Average time to close issues: N/A
Average time to close pull requests: 2 months
Total issue authors: 0
Total pull request authors: 2
Average comments per issue: 0
Average comments per pull request: 0.5
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 2
Average time to close issues: N/A
Average time to close pull requests: 2 months
Issue authors: 0
Pull request authors: 2
Average comments per issue: 0
Average comments per pull request: 0.5
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

Pull Request Authors

dependabot[bot] (3)
bmclarnon (2)
jul-sh (2)
fml100 (1)
google-admin (1)

Top Labels

Issue Labels

Pull Request Labels

dependencies (3) python (3)

Dependencies

Cargo.lock cargo

169 dependencies

Cargo.toml cargo

cfc_crypto/Cargo.toml cargo

googletest * development
aes-gcm-siv *
anyhow *
coset *
hpke *

examples/square_enclave_app/Cargo.toml cargo

examples/square_service/Cargo.toml cargo

anyhow * development
coset * development
sha2 * development
byteorder *
prost *
prost-types *

examples/sum_enclave_app/Cargo.toml cargo

examples/sum_service/Cargo.toml cargo

anyhow * development
coset * development
sha2 * development
byteorder *

ledger_enclave_app/Cargo.toml cargo

ledger_service/Cargo.toml cargo

googletest * development
anyhow *
coset *
p256 *
prost *
prost-types *
rand *
sha2 *

pipeline_transforms/Cargo.toml cargo

googletest * development
sha2 * development
anyhow *
bitmask *
core2 *
coset *
libflate 2
prost *
prost-types *
rand *

third_party/federated_compute/Cargo.toml cargo

.github/workflows/provenance.yaml actions

.github/workflows/reusable_provenance.yaml actions