Differential Privacy in Machine Learning Algorithms

Meena Vyas
Jun 12, 2020 · 3 min read


(Image taken from the TensorFlow blog: https://blog.tensorflow.org/2019/03/introducing-tensorflow-privacy-learning.html)

I was checking out machine learning with differential privacy in TensorFlow, following this post: http://www.cleverhans.io/privacy/2019/03/26/machine-learning-with-differential-privacy-in-tensorflow.html

Differential Privacy is a framework for measuring the privacy guarantees provided by an algorithm. Through the lens of differential privacy, we can design machine learning algorithms that responsibly train models on private data. Learning with differential privacy provides provable guarantees of privacy, mitigating the risk of exposing sensitive training data in machine learning.

A model trained with differential privacy should not be affected by any single training example, or small set of training examples, in its data set. If a single training point does not affect the outcome of learning, the information contained in that training point cannot be memorized and the privacy of the individual who contributed this data point to our dataset is respected.

I took a simple CNN that classifies the MNIST database of handwritten digits using a plain gradient descent optimizer (notebook: SimpleMnistWithoutDifferentialPrivacy.ipynb). I then trained the same model with the DP gradient descent Gaussian optimizer, which adds random noise, using a slightly modified version of the code given in the TensorFlow Privacy examples (notebook: Classification_Privacy.ipynb). Finally, I compared the two.
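
For reference, this is roughly how the DP optimizer is configured in the Classification_Privacy notebook. The snippet below is a sketch based on the TensorFlow Privacy tutorial; the hyperparameter values are illustrative and the import path may differ between tensorflow_privacy versions.

```python
import tensorflow as tf
from tensorflow_privacy.privacy.optimizers.dp_optimizer import (
    DPGradientDescentGaussianOptimizer)

# DP-SGD hyperparameters (illustrative values, in the spirit of the tutorial).
l2_norm_clip = 1.0      # maximum L2 norm allowed for each per-example gradient
noise_multiplier = 1.1  # ratio of noise standard deviation to the clipping norm
num_microbatches = 250  # must evenly divide the batch size
learning_rate = 0.15

optimizer = DPGradientDescentGaussianOptimizer(
    l2_norm_clip=l2_norm_clip,
    noise_multiplier=noise_multiplier,
    num_microbatches=num_microbatches,
    learning_rate=learning_rate)

# The loss must be computed per example (no reduction) so that gradients can
# be clipped per microbatch before they are averaged.
loss = tf.keras.losses.CategoricalCrossentropy(
    from_logits=True, reduction=tf.losses.Reduction.NONE)

# model is the same simple CNN used in the non-private notebook, e.g.:
# model.compile(optimizer=optimizer, loss=loss, metrics=['accuracy'])
```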

Time

There are 60,000 training images, each of size 28x28x1.

Time taken to train the normal model for 240 epochs: 98 seconds

Time taken to train with differential privacy for 240 epochs: 106 seconds

Training with differential privacy takes around 8% more time.

Accuracy and Loss

The basic idea of differentially private stochastic gradient descent (DP-SGD) is to modify the gradients used in stochastic gradient descent (SGD). Models trained with DP-SGD provide provable differential privacy guarantees for their input data. Two modifications are made to the vanilla SGD algorithm (a minimal code sketch follows the list):

• First, the sensitivity of each gradient needs to be bounded. In other words, we need to limit how much each individual training point sampled in a mini-batch can influence gradient computations and the resulting updates applied to model parameters. This can be done by clipping the gradient computed on each individual training point.

• Second, random noise is sampled and added to the clipped gradients, making it statistically impossible to tell whether or not a particular data point was included in the training dataset by comparing the updates SGD applies with and without that data point.
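
To make the two steps concrete, here is a minimal, hand-rolled sketch of a single DP-SGD update in plain TensorFlow (eager mode). It is only illustrative; the TensorFlow Privacy optimizers do this per microbatch and much more efficiently.

```python
import tensorflow as tf

def dp_sgd_step(model, loss_fn, x_batch, y_batch,
                l2_norm_clip=1.0, noise_multiplier=1.1, learning_rate=0.15):
    """One illustrative DP-SGD update: clip per-example gradients, add noise, step."""
    batch_size = x_batch.shape[0]
    summed_grads = [tf.zeros_like(v) for v in model.trainable_variables]

    # Step 1: bound the sensitivity by clipping every per-example gradient.
    for i in range(batch_size):
        with tf.GradientTape() as tape:
            loss = loss_fn(y_batch[i:i + 1], model(x_batch[i:i + 1], training=True))
        grads = tape.gradient(loss, model.trainable_variables)
        global_norm = tf.linalg.global_norm(grads)
        scale = tf.minimum(1.0, l2_norm_clip / (global_norm + 1e-12))
        summed_grads = [s + g * scale for s, g in zip(summed_grads, grads)]

    # Step 2: add Gaussian noise calibrated to the clipping norm, then average.
    noisy_grads = [
        (s + tf.random.normal(tf.shape(s), stddev=noise_multiplier * l2_norm_clip))
        / batch_size
        for s in summed_grads
    ]

    # Apply the noisy, averaged gradient as a plain SGD update.
    for v, g in zip(model.trainable_variables, noisy_grads):
        v.assign_sub(learning_rate * g)
```

Because the noise scale is tied to the clipping norm, the update computed with or without any single example looks statistically the same.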

An alternative approach, BoltOn privacy, adds randomness to the model weights rather than to the gradients. Refer to https://github.com/tensorflow/privacy/blob/master/tutorials/bolton_tutorial.py
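
I have not reproduced the BoltOn API here. Conceptually it is an output-perturbation method, so a toy (non-library) sketch of the idea is simply to add noise to the learned weights after training; the real implementation requires a strongly convex loss and derives the noise scale from its properties.

```python
import tensorflow as tf

def output_perturbation(model, noise_stddev=0.1):
    """Toy illustration of output perturbation: add noise to trained weights.

    This is NOT the tensorflow_privacy BoltOn implementation; noise_stddev is
    an arbitrary, illustrative value rather than a calibrated privacy budget.
    """
    for v in model.trainable_variables:
        v.assign_add(tf.random.normal(tf.shape(v), stddev=noise_stddev))
```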

There is a nice video on this topic at https://towardsdatascience.com/building-differentially-private-machine-learning-models-using-tensorflow-privacy-52068ff6a88e

References

• https://blog.tensorflow.org/2019/03/introducing-tensorflow-privacy-learning.html

• http://www.cleverhans.io/privacy/2019/03/26/machine-learning-with-differential-privacy-in-tensorflow.html

• https://github.com/tensorflow/privacy/blob/master/tutorials/bolton_tutorial.py

• https://towardsdatascience.com/building-differentially-private-machine-learning-models-using-tensorflow-privacy-52068ff6a88e
