TensorFlow Privacy Project Overview
TensorFlow Privacy is a Python library for training machine learning models with differential privacy. It provides implementations of TensorFlow optimizers that data scientists and engineers can use to train privacy-preserving models, along with tutorials and analysis tools for understanding the privacy guarantees those models offer.
Latest Developments
As of February 14, 2024 (version 0.9.0), the library is split into two distinct PyPI packages. The tensorflow-privacy package focuses on differentially private (DP) model training, while tensorflow-empirical-privacy targets empirical privacy testing.
Additionally, an update from February 21, 2023 added efficient per-example gradient clipping for DP Keras models that consist only of Dense and Embedding layers, improving training for these models without added memory or runtime cost.
Setting Up TensorFlow Privacy
Dependencies
To utilize TensorFlow Privacy, TensorFlow (version 1.14 or higher) must be installed, with the option to leverage GPU support for enhanced performance. Detailed installation instructions can be found on the TensorFlow website.
Installation
For those intending to use TensorFlow Privacy as a library, installation is streamlined through the command:
pip install tensorflow-privacy
Alternatively, the library can be cloned from GitHub for local installation and development:
git clone https://github.com/tensorflow/privacy
cd privacy
pip install -e .
For contributors, it's recommended to fork the repository before cloning to facilitate contribution management.
Contribution Guidelines
Contributions are openly welcomed in the TensorFlow Privacy community. Prospective contributors should adhere to the following guidelines to streamline the review process:
- Follow the PEP8 with two spaces coding style, similar to TensorFlow's, and verify code with autopep8.
- Validate code using pylint with TensorFlow's configuration file.
- Sign the Google Contributor License Agreement (CLA) for first-time pull requests.
- Avoid adding git submodules due to maintenance issues.
Learning with Tutorials
The library includes a set of tutorials that guide users through converting existing optimizers into their differentially private versions. They also cover tuning the parameters that differentially private optimization introduces and evaluating the resulting privacy guarantees with the built-in analysis tools.
Note: Tutorials are updated periodically and should not be treated as a stable API for third-party code.
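To make concrete what a differentially private version of an optimizer does, the following is a minimal pure-Python sketch of a single DP-SGD update: each example's gradient is clipped to a maximum L2 norm, the clipped gradients are summed, Gaussian noise scaled to the clipping norm is added, and the result is averaged over the batch. This is an illustration under stated assumptions, not TensorFlow Privacy's implementation or API. The function names (clip_by_l2_norm, dp_sgd_step) are hypothetical, and the parameter names l2_norm_clip and noise_multiplier are chosen here only to label the two quantities that DP training asks the user to tune.

```python
import math
import random


def clip_by_l2_norm(grad, l2_norm_clip):
    """Scale a gradient vector so that its L2 norm is at most l2_norm_clip."""
    norm = math.sqrt(sum(g * g for g in grad))
    # Gradients already within the bound (or all-zero) are left unchanged.
    scale = min(1.0, l2_norm_clip / norm) if norm > 0.0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(params, per_example_grads, learning_rate,
                l2_norm_clip, noise_multiplier, rng):
    """Illustrative DP-SGD update (hypothetical helper, not a library call)."""
    # 1. Clip every example's gradient separately to bound its influence.
    clipped = [clip_by_l2_norm(g, l2_norm_clip) for g in per_example_grads]
    # 2. Sum the clipped gradients coordinate by coordinate.
    summed = [sum(column) for column in zip(*clipped)]
    # 3. Add Gaussian noise with stddev = noise_multiplier * l2_norm_clip.
    stddev = noise_multiplier * l2_norm_clip
    noisy = [s + rng.gauss(0.0, stddev) for s in summed]
    # 4. Average over the batch and take an ordinary gradient step.
    batch_size = len(per_example_grads)
    return [p - learning_rate * n / batch_size for p, n in zip(params, noisy)]


# Example: two parameters, a batch of two per-example gradients.
rng = random.Random(0)  # seeded only so the sketch is repeatable
new_params = dp_sgd_step(
    params=[0.0, 0.0],
    per_example_grads=[[3.0, 4.0], [0.3, 0.4]],
    learning_rate=0.1,
    l2_norm_clip=1.0,
    noise_multiplier=1.1,
    rng=rng,
)
```

A larger noise multiplier generally gives a stronger privacy guarantee at the cost of noisier updates, and a smaller clipping norm limits each example's influence but can bias the gradient. That trade-off is what the tuning guidance in the tutorials addresses.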
Research Insights
The research directory hosts code for reproducing results from academic papers on privacy in machine learning. It is not maintained as carefully as the tutorials, but it serves as a useful research archive.
TensorFlow 2.x Compatibility
TensorFlow Privacy is compatible with TensorFlow 2, with new Keras-based estimators available for use. It requires TensorFlow version 2.4 or higher to function correctly with tf.keras.Model and tf.estimator.Estimator.
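One way to guard against an unsupported TensorFlow version is to compare the installed version string with the 2.4 minimum before training. The sketch below is an assumption-laden illustration, not part of TensorFlow Privacy: parse_version and meets_minimum are hypothetical helpers, and only the leading numeric part of each version component is considered. Comparing integer tuples rather than raw strings avoids ranking "2.10" below "2.4".

```python
def parse_version(version):
    """Turn a string such as '2.4.1' or '2.16.0-rc0' into a 3-tuple of ints."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # ignore suffixes such as '-rc0'
            digits += ch
        parts.append(int(digits) if digits else 0)
    # Pad so that '2.4' and '2.4.0' compare as equal.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum="2.4"):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


# With TensorFlow installed, the check would typically be:
#   import tensorflow as tf
#   if not meets_minimum(tf.__version__):
#       raise RuntimeError("TensorFlow 2.4 or higher is required")
```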
Communication and Support
For questions or issues that remain unresolved, contact:
- Galen Andrew (@galenmandrew)
- Steve Chien (@schien1729)
- Nicolas Papernot (@npapernot)
This project, a significant contribution from Google LLC, continues to foster developments in privacy-focused machine learning.