Abstract:
The classic model of data integrity assumes a simple sender-receiver channel where threats are limited and verification is straightforward. The modern Internet has reshaped this paradigm. In an era of viral misinformation, encrypted messaging, and decentralized finance, integrity is no longer only about who sent the data, but also about what it means and whether it can be trusted. Can we verify the authenticity of a screenshot from Trump's Truth Social? Is Signal's end-to-end encryption truly end-to-end when a central server distributes keys? Can blockchain protocols, which secure trillions of dollars in capital, sustain trust under attacks that rival the world's largest heists?
This talk presents a modern perspective on data integrity, arguing for a comprehensive rethinking along three axes:
- Cryptographic foundations: developing efficient primitives such as vector commitments, which compress large databases into short digests while supporting secure, dynamic updates (a minimal sketch follows this list).
- Protocol engineering: designing practical protocols that bring cryptographic insights to deployment, including TLS-based oracles that dramatically cut verification costs without weakening security.
- Empirical security analysis: examining deployed systems to uncover overlooked risks, such as equivocation attacks in the Tor directory protocol, and working with developers to patch them.
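
To make the vector-commitment interface concrete, here is a minimal Python sketch built from a Merkle tree: commit to a vector, obtain a short digest, open and verify individual positions, and apply batched updates. This hash-based toy only illustrates the interface; the constructions discussed in the talk (including Cauchyproofs below) are algebraic, with different proof sizes and update costs, and all names here (`MerkleVC`, `_h`) are hypothetical.

```python
import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

class MerkleVC:
    """Toy vector commitment over a fixed-length vector, using a Merkle
    tree. The root is the short digest; an opening for one index is its
    sibling path, of size logarithmic in the vector length."""

    def __init__(self, values):
        n = 1
        while n < len(values):  # pad to a power of two
            n *= 2
        leaves = [_h(v) for v in values] + [_h(b"")] * (n - len(values))
        self.layers = [leaves]
        while len(self.layers[-1]) > 1:
            prev = self.layers[-1]
            self.layers.append(
                [_h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)]
            )

    def digest(self) -> bytes:
        return self.layers[-1][0]

    def open(self, i: int) -> list:
        path = []  # sibling hashes, leaf to root
        for layer in self.layers[:-1]:
            path.append(layer[i ^ 1])
            i //= 2
        return path

    @staticmethod
    def verify(digest: bytes, i: int, value: bytes, path: list) -> bool:
        node = _h(value)
        for sib in path:
            node = _h(node + sib) if i % 2 == 0 else _h(sib + node)
            i //= 2
        return node == digest

    def batch_update(self, updates) -> bytes:
        """Apply many (index, value) updates at once, rehashing each
        affected internal node exactly once: shared ancestors of the
        updated leaves are not recomputed per update."""
        dirty = set()
        for i, v in updates:
            self.layers[0][i] = _h(v)
            dirty.add(i)
        for d in range(1, len(self.layers)):
            dirty = {i // 2 for i in dirty}
            for p in dirty:
                left = self.layers[d - 1][2 * p]
                right = self.layers[d - 1][2 * p + 1]
                self.layers[d][p] = _h(left + right)
        return self.digest()

# Usage: commit, open one position, verify, then update two positions.
vc = MerkleVC([b"alice", b"bob", b"carol", b"dave"])
d = vc.digest()
assert MerkleVC.verify(d, 2, b"carol", vc.open(2))
d2 = vc.batch_update([(0, b"eve"), (3, b"frank")])
assert MerkleVC.verify(d2, 0, b"eve", vc.open(0))
```

The batch-update walk rehashes each dirty internal node once per level, which is the amortization that batch-update schemes make rigorous.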
To ground these ideas, I will highlight three recent works: (1) Cauchyproofs, which enables batch updates for vector commitments via structured matrix multiplications; (2) Proxying is Enough, an efficient TLS oracle construction that leverages the structure of HTTP transcripts to avoid expensive zero-knowledge proofs; and (3) an empirical study of the Tor directory protocol that revealed, and helped patch, a critical anonymity-breaking vulnerability. Together, these works illustrate both the fragility and the opportunity in today's integrity landscape, and show how cryptographic insight can bridge the gap from theory to practice.
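
For intuition about the TLS-oracle setting, the sketch below shows the general proxy-based flow, with AES-GCM standing in for the TLS record layer. This is an assumption-laden toy, not the protocol from Proxying is Enough (which must, among other things, bind the revealed keys to the observed session); it requires the third-party `cryptography` package, and the header name and claim are invented for illustration.

```python
# Toy proxy-based TLS oracle flow. The proxy stores only ciphertext during
# the session; afterwards the client reveals the one-time key, and the
# proxy checks the claim directly against the decrypted HTTP transcript.
# HTTP's rigid header structure lets the proxy locate the claimed field
# deterministically, so no zero-knowledge proof over the plaintext is needed.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# --- Session: the client fetches a response over the encrypted channel ---
key = AESGCM.generate_key(bit_length=128)
nonce = os.urandom(12)
http_response = b"HTTP/1.1 200 OK\r\nX-Balance: 1000\r\n\r\nok"
ciphertext = AESGCM(key).encrypt(nonce, http_response, None)
stored = (nonce, ciphertext)  # all the proxy observed and retained

# --- Verification: the client reveals the key plus a claim about the data ---
claim = b"X-Balance: 1000"
plaintext = AESGCM(key).decrypt(*stored, None)
headers = plaintext.split(b"\r\n\r\n")[0].split(b"\r\n")[1:]
assert claim in headers, "claimed header absent from the recorded transcript"
print("claim verified against the proxied transcript")
```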