BY DR SOHA MAAD
Introduction
Global data governance is the main gateway to the global digital economy. This article considers global data governance and the development of the global data infrastructure for the global digital economy platform. It takes the reader on an introductory journey from data to global data governance and infrastructure, emphasizing the importance of data security and protection. Following this introduction, the article presents the steps for building a data infrastructure and its various building blocks. It then draws attention to the main challenges facing global data and global data infrastructure and reviews initiatives that address these challenges, including the G20 initiative, UNCTAD, and the European Union's GAIA-X project to develop Europe's global data infrastructure. The article concludes with a roadmap for Arab banks and authorities to lead the development of the global data infrastructure for the global digital economy platform, serving the needs of all sectors including finance, global trade, climate, healthcare and other vital economic sectors.
FROM DATA TO GLOBAL DATA INFRASTRUCTURE AND GOVERNANCE
In this section we briefly overview the evolution of the concept of data to global data infrastructure. The terms “data”, “data governance”, “data localisation”, “data infrastructure”, “global data”, “global data infrastructure”, and “global data governance” are overviewed.
Data. The term “data” refers to information created, processed, saved, and stored digitally by a computer in ones and zeros—or binary format. Network connections or devices allow this data to be transferred from one computer to another. There is also a distinction that needs to be drawn between “data” (machine-readable ones and zeros, or “code”) and “information” (what that data means to humans).
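The distinction between data and information can be illustrated with a minimal Python sketch. The two byte values are chosen purely for illustration; the point is that the same stored bits yield different information depending on how they are interpreted.

```python
# Two bytes of "data": nothing more than ones and zeros in storage.
raw = bytes([0x48, 0x69])

# The machine-level view: the binary digits themselves.
bits = " ".join(f"{byte:08b}" for byte in raw)

# Two different "information" readings of the same data.
as_text = raw.decode("ascii")            # interpreted as ASCII characters
as_number = int.from_bytes(raw, "big")   # interpreted as one unsigned integer

print(bits)       # 01001000 01101001
print(as_text)    # Hi
print(as_number)  # 18537
```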
Data governance. Data governance comprises the rules, standards, policies, and laws set by various authorities for managing data, including its access and use. The term “data governance” has different meanings depending on the context and the perspective of the stakeholders involved. Key areas of data governance include:
- national security and law enforcement: ensuring access to data for purposes of domestic and international security; avoiding misuse of that data; and protecting data against illicit collection.
- economic growth and innovation: creating and accessing large datasets for research and development of data-intensive technologies such as machine learning and artificial intelligence, as well as for cross-border transactions and e-commerce.
- policies and practices for data flow: governing cross-border data flow, which is a major concern because it raises security and data protection issues.
Global Data Governance. Global data governance comprises the rules, standards, policies, and laws governing how data is collected, used, stored, and transferred across borders. Trust is a major concern in ensuring “Data Free Flow with Trust”. Global data governance involves how governments support or restrict data flows across borders. It also covers data localisation, that is, restrictions on the ability of firms to transfer data from domestic sources to foreign countries (the opposite of free data flow). Some countries require that a copy of data be stored on a server within that country before it is allowed to be sent out. Restrictions also exist for certain domains and sectors such as health or finance. For instance, China requires firms to store certain kinds of data on servers inside the country, while allowing transfer in or out under certain conditions. Russia and India impose local storage and local processing of data while prohibiting outbound transfer altogether, especially for payment data. Cross-border data flow is a major area of global data governance, involving the governance of relationships at supranational, national, and sub-national levels.
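The kinds of localisation rule described above (free flow, transfer only after a local copy is stored, and outright prohibition for certain sectors) can be sketched as a small rule table. This is an illustrative sketch only: the jurisdictions "Country A", "Country B" and "Country C", the sector names, and the function `may_transfer_abroad` are hypothetical and do not encode the actual law of any country.

```python
from dataclasses import dataclass, field
from enum import Enum


class TransferRule(Enum):
    FREE = "free"              # no restriction on outbound transfer
    LOCAL_COPY = "local_copy"  # transfer allowed once a copy is stored locally
    PROHIBITED = "prohibited"  # outbound transfer not allowed


@dataclass(frozen=True)
class Policy:
    default: TransferRule
    # Sector-specific overrides, e.g. stricter rules for health or payments.
    sector_rules: dict = field(default_factory=dict)


# Hypothetical jurisdictions, for illustration only.
POLICIES = {
    "Country A": Policy(TransferRule.FREE),
    "Country B": Policy(TransferRule.FREE, {"health": TransferRule.LOCAL_COPY}),
    "Country C": Policy(TransferRule.LOCAL_COPY, {"payments": TransferRule.PROHIBITED}),
}


def may_transfer_abroad(country: str, sector: str, local_copy_stored: bool) -> bool:
    """Return True if data of this sector may leave the given country."""
    policy = POLICIES[country]
    rule = policy.sector_rules.get(sector, policy.default)
    if rule is TransferRule.FREE:
        return True
    if rule is TransferRule.LOCAL_COPY:
        return local_copy_stored
    return False  # TransferRule.PROHIBITED


print(may_transfer_abroad("Country B", "health", local_copy_stored=False))
print(may_transfer_abroad("Country C", "payments", local_copy_stored=True))
```

A firm operating in several jurisdictions would need one such check per origin country before moving a dataset, which is one way to see why divergent national rules raise the cost of cross-border data flow.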
Data infrastructure. A data infrastructure is a digital infrastructure promoting data sharing and consumption. Similar to other infrastructures, it is a structure needed for the operation of a society as well as the services and facilities necessary for the data economy. Research Infrastructures are layered hardware and software systems which support sharing of a wide spectrum of resources, spanning from networks, storage, computing resources, and system-level middleware software, to structured information within collections, archives, and databases.
Global data infrastructures. Data drives global businesses, economies, and countries. A global data infrastructure consolidates world data and provides global access to all countries. Global data infrastructures may serve various purposes. For instance, global data infrastructures for research are based on principles of global collaboration and shared resources to meet the sharing needs of all research activities. A global data infrastructure for business and finance is intended to facilitate cross-border trade, payments, and fund transfers. Other economic sectors also benefit from global data infrastructure, in particular the healthcare sector, where the cross-border flow of sensitive healthcare data is an important concern for the welfare of nations. A recent example is the need for a global data infrastructure for COVID-19 vaccine digital certificates.
Building data infrastructure
Data infrastructure can be seen as the complete technology, process, and setup used to store, maintain, organize, and distribute data in the form of insightful information. It involves properties, the entities responsible for its operation and maintenance, and the policies and guidelines that describe how to manage the data and make the most of it. It is formed through the systematic structuring of data so that the data yields valuable, high-quality insights that support accurate decisions.
Steps for building data infrastructures
Alcor Fund, supporting innovation and start-ups worldwide, suggests the following steps to build data infrastructure.
- Collection of Clean Data: To get the maximum benefit and make correct, informed decisions, we need to collect clean, relevant, and accurate data. A streamlined system for collecting clean data mitigates the risk of skewed or duplicated statistics.
- Define Goals: Data comes in various forms and is collected through manual or automated integration, so business goals have to be defined for its use.
- Leveraging SQL & BI Tools: Structured Query Language (SQL) and Business Intelligence (BI) tools make data easy to understand and analyse.
- Build ETL Pipeline: An “Extract, Transform, and Load” (ETL) pipeline is used to turn data into valuable information. This involves extracting data from sources, transforming it into standardized formats, and loading it into tables that can be queried with SQL.
- Utilizing a Data Warehouse: Setting up a data warehouse involves cleaning and processing data and storing it in tables.
- Security and Auditing: Data has to be audited regularly to remove ambiguities. A strong authentication and authorization process needs to be built. Row-level security lets the data owner restrict access to specific rows within a dataset.
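The clean-data, ETL, and SQL steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the sample records, the `payments` table and its columns are hypothetical, and an in-memory SQLite database from the Python standard library stands in for a real data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract: it contains a duplicate row, a missing country
# and a non-numeric amount, the kinds of defect that skew statistics.
RAW_CSV = """customer_id,country,amount
1,lb,100.50
2,EG,  75
2,EG,  75
3,,20
4,AE,not-a-number
"""


def extract(text):
    """Extract: read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: standardize formats, drop invalid and duplicate rows."""
    seen = set()
    clean = []
    for row in rows:
        country = row["country"].strip().upper()
        try:
            record = (int(row["customer_id"]), country, float(row["amount"]))
        except ValueError:
            continue  # skip rows whose id or amount is not numeric
        if not country or record in seen:
            continue  # skip rows with no country, and exact duplicates
        seen.add(record)
        clean.append(record)
    return clean


def load(rows, conn):
    """Load: store the clean rows in a table that SQL tools can query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments "
        "(customer_id INTEGER, country TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)

# A simple SQL query of the kind a BI tool would issue.
totals = conn.execute(
    "SELECT country, SUM(amount) FROM payments GROUP BY country ORDER BY country"
).fetchall()
print(totals)  # [('EG', 75.0), ('LB', 100.5)]
```

Of the five raw rows only two survive the transform step, which shows why cleaning belongs before loading: the duplicate payment would otherwise double the total reported for one country.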
Data Infrastructure Building Blocks
Data infrastructure is composed of various building blocks as detailed below:
- Data Ingestion: It is an infrastructure to transport data from one or many sources to a destination where it can be stored correctly for further data analysis.
- Data Access: It is an interface to retrieve, modify, copy, or move data held in IT systems in response to an access request.
- API Integration: An Application Programming Interface (API) is an interface that processes requests and ensures the seamless distribution of information across systems. It also interacts and communicates with backend systems as well as various applications, devices, and programs.
- Data Storage: It refers to the physical retention and storage of data through various equipment and software.
- Data Processing: It covers the handling of data after collection in order to derive meaningful information from it.
- Databases: Databases are the organized and systematic collection of data that can be accessed electronically through computer systems.
- Networks: They are the connections between computers, servers, mainframes, network devices, peripherals, etc. that allow data to be shared.
- Data Security: It includes all systems, applications, hardware and software for protecting data from unauthorized access, thus reducing the risk of data corruption throughout the data lifecycle. It includes encryption, hashing, tokenization, and key management practices for the comprehensive protection of data across all platforms.
- Data Management: It involves the collection, retention, and usage of data in a secure and cost-efficient manner.
- Data Quality: Data is regarded as high quality when it serves its intended purpose and accurately describes the real-world construct it represents.
- Data Centres: A data centre is the physical facility or dedicated space in which an organization houses its applications and data. Data centre hardware includes routers, switches, firewalls, storage systems, servers, application delivery controllers, etc.
- Data Analysis: It is a process of intense inspection of data for supporting the decision making.
- Data Visualization: It refers to the representation of data in graphical form, including graphs, charts, maps, etc. It makes numbers easier to communicate by showing the relations among the data graphically.
- Cloud Platforms: They are servers and operating environments hosted in Internet-based data centres that provide storage and processing.
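Two of the data security practices named above, hashing and tokenization, can be sketched with the Python standard library alone. This is an illustrative sketch: the `TokenVault` class, the `tok_` prefix and the sample account number are hypothetical, and encryption is left out because it would normally rely on a vetted third-party library and a key management service.

```python
import hashlib
import hmac
import secrets


def sha256_hex(data: bytes) -> str:
    """Hashing: a fixed-length fingerprint used to detect any change in the data."""
    return hashlib.sha256(data).hexdigest()


def keyed_digest(key: bytes, data: bytes) -> str:
    """Keyed hashing (HMAC): only holders of the key can produce a valid digest."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


class TokenVault:
    """Tokenization: replace a sensitive value with a random stand-in.

    The mapping lives only inside the vault, so systems that handle tokens
    never see the original value. (Hypothetical in-memory sketch.)
    """

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(8)  # random, carries no information
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        return self._token_to_value[token]


record = b"account=1234567890;amount=250.00"
key = secrets.token_bytes(32)  # in practice issued by a key management service

fingerprint = sha256_hex(record)
signature = keyed_digest(key, record)
# A tampered record no longer matches the signature.
tampered_ok = hmac.compare_digest(signature, keyed_digest(key, record + b"0"))

vault = TokenVault()
token = vault.tokenize("1234567890")
print(fingerprint, tampered_ok, token)
```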
Data Infrastructure as a Service
Data Infrastructure as a Service provides computing infrastructure and various computing resources over the internet. It complements the other forms of cloud computing service, namely software as a service (SaaS) and platform as a service (PaaS). Data Infrastructure as a Service includes compute resources, storage, and networking, along with self-service interfaces, web-based user interfaces, APIs, management tools, and cloud software infrastructure delivered as services.
The global data challenge
The G20 Policy Brief offered to the Saudi T20 (G20 Think 20) process in 2020 highlights the various challenges facing global data and data infrastructures. These challenges are overviewed below:
- Lack of unified data governance practice: Data is increasingly becoming one of the most important resources of the 21st century, greatly affecting how industries, nations and societies develop. In 2019, there were about 4.13 billion internet users around the world, a 900 percent increase on the figure two decades earlier. Goods and services flow across borders through digital platforms at an unprecedented rate. Data governance varies significantly across countries and, even in countries where data-related laws exist, enforcement mechanisms might not be in place. Such divergence in regulation ultimately leads to a divergence in standards and experiences in the Internet of Things (IoT). For instance, while some nations treat privacy as a human right and make great efforts to preserve users’ information and anonymity, other governments enforce strong state control and content censorship. Many nations demand that data be stored locally and impose limitations on data transfers outside their borders. Governments want as much control over data as possible for reasons of security and protectionism. Nevertheless, fragmented approaches to governance do more harm than good. Nations could and should take part in international agreements to converge data-related policies, while still preserving their policy sphere and sovereignty over data governance. Under a governance framework that facilitates free data flows while ensuring cybersecurity and trust, all nations would benefit and could still decide how to regulate data.
- Lack of data Interoperability on a Global Level: There is no universally accepted definition of data interoperability. With interoperability, transaction costs are reduced and data is more secure, since agents no longer need to invest in a new system just to “read” new information. Agents also no longer need the services of a third party to convert the information into a format that is readable by their systems. This promotes efficiency through cost savings and improves security because information is exchanged directly.
- Constraint on data flows: Data localization, which implies that data collected in a country should be stored within the borders of that country, is a major challenge. Under extreme localization, data must be stored within the borders of a country and may under no circumstances be transferred elsewhere. At the other end of the spectrum, no localization means there is no regulation as to where collected data should be stored. Countries occupy significantly different points on the spectrum. India has advocated relatively strong localization of personal data, whereas the United States (US), Mexico, and Canada have recently adopted a provision that entirely rejects localization. A consensus on data protection that does not bind or limit a country’s localization policy is needed.
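The interoperability point above can be made concrete with a small sketch in which two systems export the same payment record in different formats and each is mapped directly to one shared schema, so no third-party converter is needed. Everything here is hypothetical and for illustration only: the two export formats, the field names, and the common schema `COMMON_FIELDS`.

```python
import json

# Hypothetical shared schema agreed between the participants.
COMMON_FIELDS = ("record_id", "country", "amount")


def from_system_a(line: str) -> dict:
    """System A exports semicolon-separated text: id;country;amount."""
    record_id, country, amount = line.split(";")
    return {"record_id": record_id, "country": country.upper(), "amount": float(amount)}


def from_system_b(payload: str) -> dict:
    """System B exports JSON with its own field names and amounts in cents."""
    doc = json.loads(payload)
    return {
        "record_id": str(doc["id"]),
        "country": doc["ctry"].upper(),
        "amount": doc["amt_cents"] / 100,
    }


def validate(record: dict) -> dict:
    """Reject records that do not carry exactly the shared fields."""
    if set(record) != set(COMMON_FIELDS):
        raise ValueError(f"record does not match the common schema: {sorted(record)}")
    return record


a = validate(from_system_a("42;lb;10.5"))
b = validate(from_system_b('{"id": 42, "ctry": "LB", "amt_cents": 1050}'))
print(a == b)  # True: both systems now describe the same record identically
```

With a shared schema each participant writes one adapter to the common format, rather than one converter for every other participant's format.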
ADDRESSING THE GLOBAL DATA CHALLENGE
In this section we review the various approaches to addressing the challenges facing global data and global data infrastructure. These include building new data value chains, leveraging Artificial Intelligence (AI) technologies, adopting innovative approaches to governing data flow, and building Europe's global data infrastructure.
New data value chains
The G20 Policy Brief No. 155 of September 2019 sets out standards for the digital economy and for developing an architecture for data collection, access and analytics. G20 leaders view digitization as a way to enhance the competitiveness of the economy and improve the delivery of services. To achieve this vision, new data value chains are needed.
The G20 policy brief confirms a growing demand for the creation of data value chains. It then proposes a common approach to facilitate the standardization of data collection, data access and data analytics, which could be implemented in specific sectors of the economy and outlines some of the key data governance themes that need to be addressed to facilitate data sharing between organizations.
Data value chains that cut across organizations currently do not exist. Foundational standards are needed to bring clarity to intended users across new data value chains, establish common parameters, allow for interoperability, and set verifiable data governance rules to establish and maintain trust between participants and with regulators.
In order to be successful, new data value chains will have to address a number of data governance issues that are currently impeding data sharing and exchange between organizations. Standards framing data collection and grading; data access, storage and retention; and data analytics and solutions will need to provide guidance on the following themes:
- ownership, intellectual property (IP), and copyright
- data quality and valuation
- interoperability
- safe use
Big data create new opportunities for businesses and consumers, and new challenges for security and privacy. The use of data, whether sold to third parties or used by firms to advertise or tailor their own products, has become integral to business models.
Data-intensive technologies such as artificial intelligence (AI) and the Internet of Things (IoT) offer greater consumer choice and personalisation. At the same time, they pose new risks to safety, privacy and security, and may discriminate against disadvantaged groups such as women and ethnic minorities.
The G20 is developing policies to raise awareness about privacy and data protection frameworks and strengthen their enforcement, while promoting accountability for data controllers.
Leveraging Artificial Intelligence
The OECD Digital Economy Outlook report 2021 considers all aspects of the digital transformation following the COVID-19 pandemic and sets out measures to contain the pandemic's impact on OECD countries' relationship with digital technologies. Global dependency on digital technology has touched all aspects of society, from education to health. Teleworking, distance learning and e-commerce have surged across the OECD, as has the uptake of digital tools in businesses. Governments, businesses and academia have been quick to grasp the potential of artificial intelligence (AI) to contribute to the crisis response, as well as the need for timely, secure and reliable access to data within nations and across borders. Global sharing and collaboration in research data have reached unprecedented levels.
Innovative approaches to governing data and data flows
The Digital Economy Report 2021, entitled “Cross-border data flows and development: For whom the data flow” and published by the Division on Technology and Logistics of the United Nations Conference on Trade and Development (UNCTAD), calls for innovative approaches to governing data and data flows to ensure a more equitable distribution of the gains from data flows while addressing risks and concerns. It points to the complexities involved in governing data and data flows across borders in ways that can bring sustainable development benefits.
The division carries out policy-oriented analytical work on the development implications of information and communications technologies (ICTs) and e-commerce. It promotes international dialogue on issues related to ICTs for development, contributes to building developing countries’ capacities to measure e-commerce and the digital economy and to design and implement relevant policies and legal frameworks, and manages the eTrade for all initiative.
The COVID-19 pandemic has accelerated the process of digital transformation and added urgency for Governments to respond. The key challenge addressed in the UNCTAD report is how to govern and harness the surge in digital data for the global good. It has been estimated that global Internet traffic in 2022 will exceed all the Internet traffic up to 2016. Data have become a key strategic asset for the creation of both private and social value. How these data are handled will greatly affect the world's ability to achieve the Sustainable Development Goals. Data are multidimensional, and their use has implications not just for trade and economic development but also for human rights, peace and security. Responses are also needed to mitigate the risk of abuse and misuse of data by States, non-State actors or the private sector.
The Digital Economy Report of the United Nations Conference on Trade and Development examines the implications of growing cross-border data flows, especially for developing countries. It proposes to reframe and broaden the international policy debate with a view to building multilateral consensus, and to embark on a new path for digital and data governance.
Europe Global Data Infrastructure
The European project GAIA-X aims to build a federated data infrastructure. The project involves representatives from business, science and politics at the European level in developing the next generation of data infrastructure: a secure, federated system that meets the highest standards of digital sovereignty while promoting innovation. This paves the way for an open, transparent digital ecosystem, where data and services can be made available, collated and shared in an environment of trust.
The project identifies the requirements a data infrastructure must meet to ensure openness, transparency and the ability to connect to other countries. An open digital ecosystem is needed to enable companies and business models to compete globally. This ecosystem should support both the digital sovereignty of cloud service users and the scalability of European cloud providers.
The federated, open data infrastructure connects centralized and decentralized infrastructures in order to turn them into a homogeneous, user-friendly system. The resulting federated form of data infrastructure strengthens the ability to both access and share data securely and confidently.
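The idea of a federated infrastructure, in which each participant keeps control of its own data while a shared layer makes it discoverable, can be sketched as follows. This is a conceptual illustration only and does not represent the GAIA-X architecture or any GAIA-X interface: the `Node` and `Federation` classes, the node names and the dataset descriptions are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A participant that hosts its own data and decides what to expose."""
    name: str
    # dataset name -> {"description": str, "shareable": bool}
    datasets: dict = field(default_factory=dict)

    def describe(self):
        """Publish metadata only, and only for datasets marked shareable."""
        return [
            {"node": self.name, "dataset": name, "description": meta["description"]}
            for name, meta in self.datasets.items()
            if meta["shareable"]
        ]


class Federation:
    """A shared catalogue over independent nodes; it stores no data itself."""

    def __init__(self):
        self.nodes = []

    def join(self, node: Node) -> None:
        self.nodes.append(node)

    def search(self, keyword: str):
        """Return catalogue entries from all nodes that match the keyword."""
        keyword = keyword.lower()
        return [
            entry
            for node in self.nodes
            for entry in node.describe()
            if keyword in entry["description"].lower()
        ]


bank = Node("bank-node", {
    "payments_2021": {"description": "Cross-border payment statistics", "shareable": True},
    "kyc_raw": {"description": "Raw customer identity records", "shareable": False},
})
health = Node("health-node", {
    "vaccine_certs": {"description": "Aggregated vaccine certificate statistics", "shareable": True},
})

federation = Federation()
federation.join(bank)
federation.join(health)

for entry in federation.search("statistics"):
    print(entry["node"], entry["dataset"])
```

The sensitive `kyc_raw` dataset never appears in any search result, because the decision to share stays with the node that owns the data; that is the sense in which a federated design preserves sovereignty while still enabling discovery.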
ROADMAP FOR LEADERSHIP OF THE GLOBAL DATA INFRASTRUCTURE
In this section, we trace a roadmap for Arab banks and authorities to lead the development of the global data infrastructure for the global digital economy platform, serving the needs of all sectors including finance, global trade, climate, healthcare and other vital economic sectors. This roadmap is based on the initiatives proposed by the G20, OECD, and UNCTAD. It comprises the following steps to be undertaken by Arab banks and authorities to ensure a leadership role in building the global data infrastructure for the global digital economy platform:
- Step #1. Contribution to the development of global data governance rules and norms.
- Step #2. Contribution to international rules for free movement of data across borders.
- Step #3. Enforcing laws and cooperation mechanisms to store, process and transfer data globally. Cooperation procedures must be revised and enforcing mechanisms standardized.
- Step #4. Establishing a legal framework to facilitate access to data stored in other jurisdictions.
- Step #5. Negotiating new multilateral agreements and establishing cooperation protocols while preserving countries’ sovereignty.
- Step #6. Developing global mechanisms of digital cooperation. A holistic approach is fundamental to acknowledge the interdependence of stakeholders and sectors, and enable the world to move together towards better digital solutions.
- Step #7. Investing in technological innovations, content production, cybersecurity measures, ethical implications, societal change, economic development, network management and privacy of global data infrastructures.
- Step #8. Adapting digital assessment methodologies to overcome obstacles to efficient multilateral data governance and policy-making. Introducing a common methodology will enable an even flow of data and ensure consistency in the measures taken in digital assessment.
- Step #9. Promoting interoperability on a global level to achieve security and efficiency.
- Step #10. Using digital management tools that support cross-border data sharing between businesses for future business transformations and pandemic responses.
REFERENCES
New America Cybersecurity Initiative on Global Data Governance, Berkeley University research on global data infrastructures, Nippon Telegraph and Telephone Corporation (NTT) R&D, G20 Insights, G20 policies, the OECD Digital Economy Outlook report 2021, the UNCTAD Digital Economy Report 2021 entitled “Cross-border data flows and development: For whom the data flow”, Europe GAIA-X Project, ScienceDirect, Alcor Fund Worldwide, Wikipedia, Investopedia.