Web3 – Safeguarding Our Identity and Personal Data in the Digital World

Transformative technologies are generally accompanied by a mixture of excitement and confusion in their early years. Something important is going on out there, although there’s no consensus on what it is yet. A major reason for the lack of consensus is that there’s no single dimension around which to define an emerging technology or business model. It’s like the fable of the blind men and the elephant. Each one touches a different part of the elephant. They then compare notes on what they felt, and learn that they are in complete disagreement.

This was the case with the advent of the commercial internet in the early 1990s. A lot was starting to happen around the internet, but we weren’t sure where things were heading. It was pretty clear that a communications revolution was underway: after all, the internet was fundamentally a network of networks, and email was one of its earliest and most popular applications. It was also an information revolution: anyone with a browser, a PC, and an internet connection could now access all kinds of content on the new World Wide Web. And, above all, it promised to be an economic revolution: the internet ushered in a historic transition to a new kind of digital economy, including many innovative e-business applications.

Over the past few decades, the term internet has come to encompass a number of related technologies including broadband networks, mobile devices, social media, cloud computing, e-commerce platforms, big data, AI, and more. More recently, we’ve seen the emergence of a new set of technologies and business models that are once more generating excitement, confusion, and multiple opinions on what they’re all about: Web3.

Last week I wrote about web3, referencing two recent books: the just-published Digital Asset Revolution by Alex Tapscott, and the soon-to-be-published Think Blockchain by Jerry Cuomo.

I wrote that web3 aims to usher in a more open, entrepreneurial internet and digital economy by replacing today’s corporate mega-platforms with blockchain-based decentralized networks. Web3 would thus give creators, developers and users a way to monetize their contributions, involve them in the governance and decision-making of the platforms supporting their work, and give individuals more privacy and control over their data.

I now want to discuss another perspective on web3 based on The Emerging New Economy: Causes and Consequences of Web 3.0, a recent Stanford seminar by Alex (Sandy) Pentland, MIT professor and faculty director of the MIT Connection Science Research initiative, with which I’ve long been affiliated as a Connection Science Fellow.

In the seminar, Pentland cited a number of web3-related projects that his research group has been involved with over the past few years. I’d like to focus my discussion on two closely intertwined projects in particular: safeguarding an individual’s digital identity and protecting their personal data.

Identity plays a major role in everyday life. Think about logging on to a website, making an online purchase, or getting on a plane. As explained in A Blueprint for Digital Identity, a report by the World Economic Forum (WEF), identity is essentially a collection of data attributes associated with an individual, enabling them to participate in specific transactions by proving that they have the attributes required to do so. Identity attributes fall into three main categories: inherent (e.g., height, age, date of birth, biometrics); accumulated (e.g., job history, health records, home addresses, education); and assigned (e.g., email IDs, phone numbers, Social Security number, driver’s license, passport).
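
As a rough illustration of that definition, here is a minimal Python sketch of an identity as a set of categorized attributes, with a transaction that checks only for the attributes it requires. The names `Category`, `Attribute`, `Identity`, and `can_participate`, as well as the sample values, are hypothetical and are not taken from the WEF report.

```python
from dataclasses import dataclass, field
from enum import Enum


class Category(Enum):
    INHERENT = "inherent"        # e.g., date of birth, biometrics
    ACCUMULATED = "accumulated"  # e.g., job history, health records
    ASSIGNED = "assigned"        # e.g., email ID, passport number


@dataclass(frozen=True)
class Attribute:
    name: str
    category: Category
    value: object


@dataclass
class Identity:
    # Maps attribute name -> Attribute (hypothetical structure).
    attributes: dict = field(default_factory=dict)

    def add(self, attribute: Attribute) -> None:
        self.attributes[attribute.name] = attribute

    def can_participate(self, required: set) -> bool:
        # A transaction is allowed only if every required attribute is present.
        return required.issubset(self.attributes.keys())


alice = Identity()
alice.add(Attribute("date_of_birth", Category.INHERENT, "1990-04-01"))
alice.add(Attribute("passport_number", Category.ASSIGNED, "X1234567"))

# Boarding a plane needs attributes Alice holds; a loan needs one she lacks.
boarding_ok = alice.can_participate({"date_of_birth", "passport_number"})
loan_ok = alice.can_participate({"date_of_birth", "job_history"})
```

In this sketch, checking presence stands in for proving an attribute; a real system would also have to verify who issued each attribute and whether it is still valid.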

“In the Web 2 paradigm, third parties like banks, social media companies, and digital conglomerates give us our identities and allow us to access their services,” wrote Tapscott in Digital Asset Revolution. “Web 2’s Faustian bargain was signing our own data over to these intermediaries (via their terms of use and service). We gave them rights to use our data for their own gain, and they undermined our privacy in the process. We never get to own our identity. Rather, we simply rent it in the walled gardens.”

Self-sovereign identity, one of the most important objectives of the web3 paradigm, gives individuals control over their digital identity. “Anonymous single-sign-on will allow one username and authentication method across all web sites and accounts, rather than individual logins for each site,” wrote Cuomo in Think Blockchain. “This login would not require you to relinquish control of sensitive personal data.” With web3 wallets backed by the appropriate type of blockchain network, users always retain control of their personally identifiable information (PII) and login credentials.

However, the various data attributes necessary to establish a self-sovereign digital identity are siloed within different private and public sector institutions. These institutions will not want to give up their data for a variety of competitive and legal reasons. Thus, to achieve the level of privacy and security envisioned in a web3 framework, it’s necessary to establish a federated ecosystem of institutions that can access the attributes necessary to validate an identity while preserving the privacy of the data. The more data sources such an ecosystem has access to, the higher the probability of detecting fraud and identity theft while reducing false positives.

Open Algorithms (OPAL) is a governance framework for validating identities developed by Pentland and his students and collaborators. OPAL enables the institutions in a federated ecosystem to jointly run computations on the data while keeping the data completely private. The OPAL framework is described in Open Algorithms for Identity Federation, a 2017 paper by Pentland and Connection Science CTO Thomas Hardjono.

“The identity problem today is a data-sharing problem,” wrote the authors. “Today the fixed attributes approach adopted by the consumer identity management industry provides only limited information about an individual, and therefore is of limited value to the service providers and other participants in the identity ecosystem. This paper proposes the use of the Open Algorithms (OPAL) paradigm to address the increasing need for individuals and organizations to share data in a privacy-preserving manner. Instead of exchanging static or fixed attributes, participants in the ecosystem will be able to obtain better insight through a collective sharing of algorithms, governed through a trust network. Algorithms for specific datasets must be vetted to be privacy-preserving, fair and free from bias.”

OPAL is the kind of technical and governance innovation that’s required to develop a trustworthy web3 framework. It is based on several key principles, including:

Move the algorithm to the data. Instead of gathering raw data into a central location for processing, the algorithms or queries should be sent to the repositories and be processed there.
Decentralized data architecture. Raw data must always remain in its permanent repository under the control of the repository owners. Only the results of applying the algorithm or query against the data are returned.
Open, vetted algorithms. Algorithms must be openly published, agreed to, and vetted by experts to be safe from privacy violations, bias, and other unintended consequences.
Subject consent. Data repositories must obtain explicit consent from the subjects whose data they hold for the execution of an algorithm against their data; the vetted algorithms should be made available and understandable to subjects.
Data Federation. In a group-based trust network ecosystem, algorithms must be vetted collectively by all the members of the ecosystem; each member must observe the OPAL principles and legal frameworks.
Data is always in an encrypted state. Data must be encrypted while stored, transmitted and when algorithms are applied against it.
Transparency and regulatory compliance. All requests and responses must be stored in a public blockchain to provide a shared, immutable log of events that enables the auditing of all interactions, as well as proof of regulatory compliance.
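
Several of these principles can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions, not the OPAL implementation: the `Repository`, `AuditLog`, and `federated_query` names, the `count_matching_addresses` algorithm, and the sample records are all hypothetical. A simple hash-chained list stands in for a public blockchain, and encryption of data at rest, in transit, and during computation is left out entirely.

```python
import hashlib
import json


def count_matching_addresses(records, subject_id, claimed_address):
    # Hypothetical vetted algorithm: returns only an aggregate count,
    # never the raw records themselves.
    return sum(
        1 for r in records
        if r["subject"] == subject_id and r["address"] == claimed_address
    )


# Registry of algorithms the federation has (hypothetically) vetted.
VETTED_ALGORITHMS = {"count_matching_addresses": count_matching_addresses}


class Repository:
    """Holds raw data locally; only query results ever leave it."""

    def __init__(self, name, records, consenting_subjects):
        self.name = name
        self._records = records              # raw data stays in place
        self._consent = consenting_subjects  # subjects who gave consent

    def run(self, algorithm_name, subject_id, *args):
        if algorithm_name not in VETTED_ALGORITHMS:
            raise PermissionError("algorithm has not been vetted")
        if subject_id not in self._consent:
            raise PermissionError("subject has not consented")
        return VETTED_ALGORITHMS[algorithm_name](self._records, subject_id, *args)


class AuditLog:
    """Append-only, hash-chained log standing in for a public blockchain."""

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(prev_hash, event):
        payload = json.dumps(event, sort_keys=True)
        return hashlib.sha256((prev_hash + payload).encode()).hexdigest()

    def append(self, event):
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        self.entries.append({"event": event, "prev": prev_hash,
                             "hash": self._digest(prev_hash, event)})

    def verify(self):
        # Recompute the chain; any edited entry breaks every later hash.
        prev_hash = "0" * 64
        for entry in self.entries:
            if (entry["prev"] != prev_hash
                    or entry["hash"] != self._digest(prev_hash, entry["event"])):
                return False
            prev_hash = entry["hash"]
        return True


def federated_query(repositories, log, algorithm_name, subject_id, *args):
    # Send the algorithm to each repository and combine only the results.
    total = 0
    for repo in repositories:
        result = repo.run(algorithm_name, subject_id, *args)
        log.append({"repo": repo.name, "algorithm": algorithm_name,
                    "result": result})
        total += result
    return total


bank = Repository("bank", [{"subject": "alice", "address": "1 Main St"}], {"alice"})
telco = Repository("telco", [{"subject": "alice", "address": "1 Main St"},
                             {"subject": "alice", "address": "9 Elm Rd"}], {"alice"})
log = AuditLog()
matches = federated_query([bank, telco], log, "count_matching_addresses",
                          "alice", "1 Main St")
```

Here the requester learns only how many repositories' records agree with the claimed address, while each institution's raw records never leave its own `Repository` object. Every request and response is appended to the log, so later tampering with an entry would be detected by `verify()`.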

“The OPAL paradigm offers a possible way forward for industry and government to begin addressing the core issues around privacy preserving data sharing,” noted Hardjono and Pentland. “Some of these challenges include siloed data, the limited type/domain of data, and the prohibitive situation of cross-organization sharing of raw data. Instead of sharing fixed-attributes regarding a user or subject, the OPAL paradigm offers a way for Identity Providers, Relying Parties and Data Providers to share vetted algorithms. This in turn provides better insight into the user’s behavior, with their consent.

“It also allows for the development of a trust network ecosystem consisting of these entities, providing new revenue sources, governed by relevant legal agreements and contracts that form the basis for an information sharing legal trust framework. Finally, a new set of legal rules and system-specific rules must be devised that must clearly articulate the required combination of technical standards and systems, business processes and procedures, and legal rules that, taken together, establish a trustworthy system for information sharing in a federation based on the OPAL model.”

Irving Wladawsky-Berger
