Noise reduction in differentially private learning

Huang, Zhanliang (2023). Noise reduction in differentially private learning. University of Birmingham. Ph.D.

Huang2023PhD.pdf
Text - Accepted Version
Available under License All rights reserved.

Abstract

Privacy protection is a growing concern that has gained increasing attention over the past decade due to the widespread use of machine learning applications. Differential privacy (DP) is an emerging notion of privacy that provides a rigorous information-theoretic privacy guarantee. While DP is a very useful privacy guarantee that many practitioners aim to achieve, DP algorithms typically require more data samples to perform well. Moreover, the magnitude of the noise injected by DP can scale with the dimensionality, further increasing the difficulty of private learning in high dimensions. In addition, high-dimensional learning incurs a problem of its own, known as the 'curse of dimensionality'.
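As a rough illustration of how injected noise can grow with dimensionality, the sketch below uses the classical Gaussian mechanism, which is one standard DP mechanism and not necessarily the one analysed in the thesis. The function names are hypothetical, and the L2 sensitivity is assumed to be fixed at 1 regardless of the dimension d.

```python
import math
import random


def gaussian_sigma(epsilon, delta, l2_sensitivity=1.0):
    """Per-coordinate noise scale of the classical Gaussian mechanism.

    Uses sigma = sqrt(2 ln(1.25 / delta)) * sensitivity / epsilon,
    a calibration that is valid for 0 < epsilon < 1.
    """
    if not (0 < epsilon < 1) or not (0 < delta < 1):
        raise ValueError("this calibration assumes 0 < epsilon < 1 and 0 < delta < 1")
    return math.sqrt(2.0 * math.log(1.25 / delta)) * l2_sensitivity / epsilon


def rms_noise_norm(d, sigma):
    """Root-mean-square L2 norm of N(0, sigma^2 I_d) noise, i.e. sigma * sqrt(d)."""
    return sigma * math.sqrt(d)


def sample_noise_norm(d, sigma, rng):
    """L2 norm of one d-dimensional Gaussian noise draw."""
    return math.sqrt(sum(rng.gauss(0.0, sigma) ** 2 for _ in range(d)))


sigma = gaussian_sigma(epsilon=0.5, delta=1e-5)
rng = random.Random(0)
for d in (10, 100, 1000):
    # The per-coordinate scale is fixed, but the total noise norm grows like sqrt(d).
    print(d, rms_noise_norm(d, sigma), sample_noise_norm(d, sigma, rng))
```

Under these assumptions, the noise added to each coordinate does not change with d, but the norm of the whole noise vector grows proportionally to the square root of d.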

The notion of compressed learning, which aims to achieve a low-dimensional representation of high-dimensional data samples, has shed some light on this problem. However, although the notion of random projection has existed for a long time, little is known about randomly compressed models in the DP framework. In this thesis, we develop theories that quantify the effect of random projections on learning performance, and the excess error that privacy incurs. The theory developed in this thesis demonstrates the interplay between generalisation performance, random projection and differential privacy. We quantify the effect of random projection and the effect of DP implementation on the generalisation performance of learning algorithms. We show that random projection can reduce the magnitude of the required noise injection in DP algorithms while also exploiting structure, which results in a reduced dimensionality dependence in the generalisation guarantees. Finally, we also introduce a novel machine ensemble with a known in-built structure-exploiting capability, which utilises the privacy budget efficiently and can use unlabelled data samples to boost accuracy without compressing the data samples.
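The idea of random projection can be sketched in a few lines. The example below is a generic Gaussian random projection, not the specific construction or analysis from the thesis; the names and the dimensions d = 400 and k = 200 are assumptions chosen for illustration. It also assumes, for the final comparison only, that the per-coordinate noise scale is the same before and after projection, which depends on how the sensitivity behaves in a given algorithm.

```python
import math
import random


def gaussian_projection_matrix(k, d, rng):
    """k x d matrix with i.i.d. N(0, 1/k) entries, so that E||Rx||^2 = ||x||^2."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, scale) for _ in range(d)] for _ in range(k)]


def project(R, x):
    """Map a d-dimensional vector x to the k-dimensional vector Rx."""
    return [sum(r_ij * x_j for r_ij, x_j in zip(row, x)) for row in R]


def squared_norm(v):
    return sum(v_i * v_i for v_i in v)


rng = random.Random(0)
d, k = 400, 200  # original and compressed dimensions (illustrative values)
x = [rng.gauss(0.0, 1.0) for _ in range(d)]
R = gaussian_projection_matrix(k, d, rng)
z = project(R, x)

# The squared norm is preserved in expectation; the ratio fluctuates around 1.
ratio = squared_norm(z) / squared_norm(x)

# With an equal per-coordinate noise scale, isotropic Gaussian noise in k
# dimensions has an RMS norm smaller by a factor sqrt(k / d) than in d dimensions.
noise_ratio = math.sqrt(k / d)
print(len(z), ratio, noise_ratio)
```

The point of the sketch is only that the compressed representation has fewer coordinates to perturb while roughly preserving norms; quantifying the resulting effect on generalisation is the subject of the thesis itself.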

Type of Work:

Thesis (Doctorates > Ph.D.)

Award Type:

Doctorates > Ph.D.

Supervisor(s):