Things We Learned at the 2023 IAPP Global Privacy Summit

The International Association of Privacy Professionals held its annual Global Privacy Summit on April 4-5 in Washington, D.C. Here are some things we learned.

  1. Generative Artificial Intelligence (“AI”) is Ubiquitous in the Privacy Community.
    • Organizations are scrambling to deploy generative AI tools. Given the huge volume of data needed to train the large language models (“LLMs”) powering these tools, chief privacy officers (“CPOs”) are being tapped to lead their organization’s efforts regarding AI governance and ethical uses of AI.
  2. Data Scraped from the Internet to Train Large Language Models May Violate Privacy Laws.
    • Privacy laws such as the GDPR place restrictions on when and how organizations can collect and otherwise process personal data. Data scraped from the Internet and used to train LLMs includes personal data. Unless that data is processed in a manner compliant with the GDPR (specifically, Article 14), its use will in nearly all circumstances be impermissible.
    • However, according to OpenAI Deputy General Counsel Che Chang, the format of personal data in training sets matters more than its content. For example, LLMs may use personal data to learn where a name should appear in a sentence or what an email address looks like – the model is not concerned with who the person is. Accounting for that disconnect (between the personal data collected and what the model actually needs) may be an important way forward to minimize risks to personal data.
  3. Governments are Trying to Determine How to Regulate AI.
    • While some technologists have called for a pause on further AI development, governments are also wrestling with how to regulate AI.
    • There are several new pieces of legislation that directly address AI, such as the New York City Council’s updates to Local Law 144 requiring employers and employment agencies to conduct annual independent bias audits of automated employment decision tools used to screen candidates or employees for employment decisions. Likewise, the European Union proposed a new legal framework (the EU AI Act) to regulate the use of AI, primarily focused on “high risk” applications such as the use of AI systems in autonomous vehicles, medical devices, and infrastructure. Developers and users of such high-risk AI systems must conduct rigorous testing, document the quality of data used to train the system, and create an accountability framework that includes human oversight. The EU AI Act also prohibits the use of AI systems for certain applications such as use in social scoring systems.
    • Existing regulations can be used to remedy some of the harms caused by biased AI systems. For example, Title VIII of the Civil Rights Act of 1968 is the primary federal law banning discrimination in housing. Meanwhile, the Council of the District of Columbia has proposed the Stop Discrimination by Algorithms Act (“SDAA”), which would provide transparency to D.C. residents regarding how algorithms are used to determine outcomes in credit, housing, and employment decisions. The SDAA could bridge the gap between misuse of AI systems and the Civil Rights Act. Generally, there was a sense among some EU panelists that the GDPR and other existing digital technology laws already do a good job of regulating some uses of AI.
    • There may already be differences in approach between the EU and U.S. on AI regulation. One U.S. official summarized it as follows: the U.S. favors regulating based on impacts, while the EU favors regulating based on categories of AI systems.
  4. AI Governance Frameworks are Needed.
    • CPOs can play a critical role in AI governance given their board engagement and position in senior management. The CPO is already tasked with monitoring the regulatory landscape and can morph from a “data protection officer” to a “responsible AI officer.”
    • The privacy community can look to leverage tools such as data protection impact assessments to develop generative AI risk assessment templates. NIST recently released its AI Risk Management Framework that organizations can use to improve trust in the development and use of AI systems.
    • Given the volume of data used in AI systems, detailed human oversight is not sustainable. Consequently, technology will be developed to monitor other technology. However, AI systems cannot be left entirely alone – some human intervention is still needed to sample and review data.
    • AI governance requires transparency, but this presents a challenge for AI systems that can be a “black box” in terms of understanding what data is processed. For example, technologists believed they had removed all international data from the set used to train GPT-2, yet the model proved very adept at translating English text to French – a surprise explained only when they later discovered that some French data remained in the training set. AI systems are not rule-based, and transparency requires that developers be able to explain elements such as training methods and the data sets used.