How can we overcome the limitations of current natural language privacy policies without imposing new requirements on website operators?

SC Faculty and Researchers

Norman Sadeh

Travis Breaux

Lorrie Cranor

Natural language privacy policies have become a de facto standard to address expectations of “notice and choice” on the Web. Yet there is ample evidence that users generally do not read these policies, and that those who occasionally do read them struggle to understand what they read. Initiatives aimed at addressing this problem through machine-implementable standards or other solutions that require website operators to adhere to more stringent requirements have run into obstacles, with many website operators reluctant to commit to anything more than what they currently do.

This NSF Frontier project builds on recent advances in natural language processing, privacy preference modeling, crowdsourcing, formal methods, and privacy interfaces to overcome this situation. It combines fundamental research with the development of scalable technologies to:

  1. Semi-automatically extract key privacy policy features from natural language website privacy policies, and
  2. Present these features to users in an easy-to-digest format that enables them to make more informed privacy decisions as they interact with different websites.

As such, this project offers the prospect of overcoming the limitations of current natural language privacy policies without imposing new requirements on website operators. Work in this project will also involve the systematic collection and analysis of website privacy policies, looking for trends and deficiencies in both the wording and the content of these policies across different sectors, and using this analysis to inform ongoing public policy debates. A transition phase will enable the transfer of these technologies to industry for large-scale deployment and to regulators and policy makers interested in tracking practices.

We are proud to be working with the following faculty and researchers from across Carnegie Mellon and the world:
Alessandro Acquisti (CMU Heinz College)
Ed Hovy (CMU LTI)
Joel Reidenberg (Fordham)
Florian Schaub (U. Michigan)
Barbara van Schewick (Stanford)
Noah Smith (Washington)
Shomir Wilson (U. Cincinnati)

Learn More About This Project

Project Publications

N. Sadeh, A. Acquisti, T.D. Breaux, L.F. Cranor, A.M. McDonald, J. Reidenberg, N.A. Smith, F. Liu, N.C. Russell, F. Schaub, S. Wilson, "The Usable Privacy Policy Project: Combining Crowdsourcing, Machine Learning and Natural Language Processing to Semi-Automatically Answer Those Privacy Questions Users Care About", Tech. report CMU-ISR-13-119, Dec 2013

F. Liu, S. Wilson, F. Schaub, N. Sadeh, "Analyzing Vocabulary Intersections of Expert Annotations and Topic Models for Data Practices in Privacy Policies", AAAI Fall Symposium on Privacy and Language Technologies, Nov 2016

K.M. Sathyendra, S. Wilson, F. Schaub, N. Sadeh, "Automatic Extraction of Opt-Out Choices from Privacy Policies", AAAI Fall Symposium on Privacy and Language Technologies, Nov 2016

S. Zimmeck, Z. Wang, L. Zou, R. Iyengar, B. Liu, F. Schaub, S. Wilson, N. Sadeh, S.M. Bellovin, J.R. Reidenberg, "Analyzing and Predicting Privacy Law Compliance of Mobile Apps", AAAI Fall Symposium on Privacy and Language Technologies, Nov 2016

J. Bhatia, T. D. Breaux, J. R. Reidenberg, T. B. Norton, "A Theory of Vagueness and Privacy Risk Perception", IEEE 24th International Requirements Engineering Conference (RE'16), Sep 2016

L. F. Cranor, P. G. Leon, B. Ur, "A Large-Scale Evaluation of U.S. Financial Institutions Standardized Privacy Notices", ACM Transactions on the Web (TWEB), Aug 2016 (forthcoming) [pdf] [website]

S. Wilson, F. Schaub, A. Dara, F. Liu, S. Cherivirala, P.G. Leon, M.S. Andersen, S. Zimmeck, K. Sathyendra, N.C. Russell, T.B. Norton, E. Hovy, J.R. Reidenberg, N. Sadeh, "The Creation and Analysis of a Website Privacy Policy Corpus", ACL '16: Annual Meeting of the Association for Computational Linguistics, Aug 2016

F. Schaub, T.D. Breaux, N. Sadeh, "Crowdsourcing Privacy Policy Analysis: Potential, Challenges and Best Practices", it – Information Technology, Jun 2016 [doi]

A. Rao, F. Schaub, N. Sadeh, A. Acquisti, R. Kang, "Expecting the Unexpected: Understanding Mismatched Privacy Expectations Online", Symposium on Usable Privacy and Security (SOUPS '16), Denver, CO, Jun 2016 [doi] [pdf]

J. Gluck, F. Schaub, A. Friedman, H. Habib, N. Sadeh, L.F. Cranor, Y. Agarwal, "How Short is Too Short? Implications of Length and Framing on the Effectiveness of Privacy Notices", Symposium on Usable Privacy and Security (SOUPS '16), Denver, CO, Jun 2016 [doi] [pdf]

B. Liu, M.S. Andersen, F. Schaub, H. Almuhimedi, S. Zhang, N. Sadeh, A. Acquisti, Y. Agarwal, "Follow My Recommendations: A Personalized Assistant for Mobile App Permissions", Symposium on Usable Privacy and Security (SOUPS '16), Denver, CO, Jun 2016 [doi] [pdf]

S.K. Cherivirala, F. Schaub, M.S. Andersen, S. Wilson, N. Sadeh, J.R. Reidenberg, "Visualization and Interactive Exploration of Data Practices in Privacy Policies", SOUPS '16 Poster Session, Jun 2016

J.R. Reidenberg, N.C. Russell, T.B. Norton, "Rating Indicator Criteria for Privacy Policies", SOUPS 2016 Workshop on Privacy Indicators, Jun 2016 [doi] [pdf]

More project publications available here.