privacy - Providing Differential Privacy Solutions for Machine Learning with TensorFlow Privacy

TensorFlow Privacy Project Overview

TensorFlow Privacy is a Python library for training machine learning models with differential privacy. It provides differentially private implementations of TensorFlow optimizers, along with tutorials and analysis tools for computing the privacy guarantees that a training run provides.

Latest Developments

As of version 0.9.0 (announced February 14, 2024), the TensorFlow Privacy repository is published as two separate PyPI packages. The tensorflow-privacy package covers the training of differentially private (DP) models, while tensorflow-empirical-privacy covers empirical privacy testing.

An update from February 21, 2023 added an efficient implementation of per-example gradient clipping for DP Keras models that consist only of Dense and Embedding layers. It is intended to allow DP training of such models without meaningful memory or runtime overhead.
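
To make the term concrete, here is a minimal pure-Python sketch of what per-example gradient clipping computes in DP training: each example's gradient is scaled to a maximum L2 norm, the clipped gradients are summed, Gaussian noise is added, and the result is averaged. This is a naive illustration and not the library's efficient implementation; the function names are hypothetical, and the parameter names l2_norm_clip and noise_multiplier are assumed here to follow common DP-SGD terminology.

```python
import math
import random


def clip_to_l2_norm(grad, l2_norm_clip):
    """Scale one example's gradient so its L2 norm is at most l2_norm_clip."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, l2_norm_clip / norm) if norm > 0.0 else 1.0
    return [g * scale for g in grad]


def dp_average_gradient(per_example_grads, l2_norm_clip, noise_multiplier, rng):
    """Clip each example's gradient, sum, add Gaussian noise, then average."""
    clipped = [clip_to_l2_norm(g, l2_norm_clip) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # The noise scale is tied to the clipping norm, which bounds how much
    # any single example can contribute to the sum.
    stddev = noise_multiplier * l2_norm_clip
    noisy = [s + rng.gauss(0.0, stddev) for s in summed]
    return [v / len(per_example_grads) for v in noisy]


# Illustrative gradients for a batch of three examples (made-up values).
grads = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
print(dp_average_gradient(grads, 1.0, 1.1, random.Random(0)))
```

The sketch materializes every per-example gradient, which is the memory and runtime cost that an efficient implementation aims to avoid.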

Setting Up TensorFlow Privacy

Dependencies

TensorFlow Privacy builds on TensorFlow, so TensorFlow (version 1.14 or higher) must be installed first; GPU support is optional. The TensorFlow 2.x integration has a higher minimum version, described under TensorFlow 2.x Compatibility. Detailed installation instructions can be found on the TensorFlow website.

Installation

To use TensorFlow Privacy as a library, install it with pip:

pip install tensorflow-privacy

Alternatively, the library can be cloned from GitHub for local installation and development:

git clone https://github.com/tensorflow/privacy
cd privacy
pip install -e .

Contributors should fork the repository on GitHub and clone their fork instead, so that changes can be submitted as pull requests.
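
Either installation route can be checked from Python. The sketch below uses only the standard library's importlib.metadata to report which of the relevant distributions are installed. The helper name installed_version is hypothetical, and the sketch assumes that an editable install from the cloned repository registers under the same distribution name as the PyPI package.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as published on PyPI.
for name in ("tensorflow", "tensorflow-privacy", "tensorflow-empirical-privacy"):
    version = installed_version(name)
    print(f"{name}: {version if version is not None else 'not installed'}")
```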

Contribution Guidelines

Contributions to TensorFlow Privacy are welcome. Following these guidelines helps pull requests get reviewed quickly:

Follow PEP8 with two-space indentation, as TensorFlow does, and check code with autopep8.
Validate code using pylint with TensorFlow's configuration file.
Sign the Google Contributor License Agreement (CLA) before submitting a first pull request.
Avoid adding git submodules due to maintenance issues.

Learning with Tutorials

The repository's tutorials directory contains examples that show how to wrap existing optimizers into their differentially private versions. They also cover how to tune the parameters introduced by differentially private optimization and how to measure the resulting privacy guarantees with the included analysis tools.
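
As a rough illustration of the parameters involved, the following sketch derives the quantities that DP-SGD privacy analysis typically takes as input: the sampling rate, the number of training steps, and the standard deviation of the added noise. The DpSgdConfig class is hypothetical and not part of TensorFlow Privacy, the field names are assumed to follow common DP-SGD terminology, and all values are illustrative. It deliberately does not compute a privacy budget, which is the job of the analysis tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DpSgdConfig:
    """Hypothetical container for the main DP-SGD tuning parameters."""
    num_examples: int
    batch_size: int
    epochs: int
    l2_norm_clip: float
    noise_multiplier: float

    @property
    def sampling_rate(self) -> float:
        # Fraction of the training set seen in one step.
        return self.batch_size / self.num_examples

    @property
    def steps(self) -> int:
        # Total number of optimizer steps over all epochs.
        return self.epochs * self.num_examples // self.batch_size

    @property
    def noise_stddev(self) -> float:
        # Standard deviation of the Gaussian noise added to the summed,
        # clipped gradients.
        return self.noise_multiplier * self.l2_norm_clip


config = DpSgdConfig(
    num_examples=60000,
    batch_size=250,
    epochs=15,
    l2_norm_clip=1.5,
    noise_multiplier=1.3,
)
print(config.sampling_rate, config.steps, config.noise_stddev)
```

Raising noise_multiplier strengthens the privacy guarantee at the cost of noisier updates, while l2_norm_clip trades off how much signal survives clipping against how much noise is added.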

Note: The tutorials are updated periodically and offer no API stability guarantees, so third-party code should not depend on them.

Research Insights

The research directory hosts code to reproduce results from academic papers on privacy in machine learning. It is not maintained as carefully as the tutorials, but it serves as a convenient archive.

TensorFlow 2.x Compatibility

TensorFlow Privacy is compatible with TensorFlow 2 and provides Keras-based estimators for it. Using them with tf.keras.Model and tf.estimator.Estimator requires TensorFlow 2.4 or later.
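
A version requirement like this can be checked programmatically before training starts. The sketch below compares a version string against the 2.4 minimum stated above; the helper meets_minimum is hypothetical and only inspects the leading major and minor numbers.

```python
import re


def meets_minimum(version, minimum=(2, 4)):
    """Return True if a 'major.minor...' version string is at least `minimum`."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


# Illustrative version strings.
for candidate in ("1.14.0", "2.3.4", "2.4.0", "2.15.0"):
    print(candidate, meets_minimum(candidate))
```

Comparing integer tuples rather than raw strings matters here: as plain strings, "2.15.0" would sort before "2.4.0". In practice the argument could be tf.__version__ from an installed TensorFlow.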

Communication and Support

For questions or issues that remain unresolved, contact:

Galen Andrew (@galenmandrew)
Steve Chien (@schien1729)
Nicolas Papernot (@npapernot)

TensorFlow Privacy is a Google LLC project that supports ongoing work in privacy-focused machine learning.