HARNESSING THE POTENTIAL OF BIG DATA IN POST-PANDEMIC SOUTHEAST ASIA

This publication is the last of four reports from a regional study completed in 2021 and funded by the technical assistance of the Asian Development Bank (ADB) on Policy Advice for COVID-19 Economic Recovery in Southeast Asia. The project supports the recovery efforts of Southeast Asian countries to return to their economic performance before the coronavirus disease (COVID-19) pandemic. It also assists countries in preparing for national, regional, or global transformations that may take place post-COVID-19. The focus countries are Cambodia, Indonesia, Myanmar, the Philippines, and Thailand, which tapped ADB’s COVID-19 Pandemic Recovery Option facility. The study produced four reports on the following thematic areas: [...]
[...] relevant to the focus countries in light of the current COVID-19 pandemic, but also critical in the long term to ensure effective targeting of vulnerable populations. This is particularly important where the shares of the population living below national poverty lines are relatively high, at 13.5% for Cambodia, 25% for Myanmar, and 16.7% for the Philippines. 121 The study conducted by ADB in the Philippines and Thailand has shown the potential of using innovative datasets such as satellite imagery to complement traditional statistics in identifying poor households. This could be expanded to other countries through a pilot program to test the effectiveness of satellite data in mapping different poverty profiles. The program could rely on publicly accessible satellite imagery and open-source data analytics tools to demonstrate the feasibility of the approach to national statistics offices and policy makers.
This report illustrates why Southeast Asian countries need big data for pandemic recovery to radically transform the delivery of key services such as health care, social welfare and protection, and education.
The final report of a four-part series, it looks at the impact of COVID-19 on Cambodia, Indonesia, Myanmar, the Philippines, and Thailand to determine how big data could be an invaluable tool to help governments analyze the challenges they face. It outlines policy reforms and recommendations to help capture the benefits of big data. These include drawing up digital road maps, improving technical infrastructure, increasing data quality, and ramping up training programs to create a skilled workforce to lead the digital transformation.

For the focus countries, there are three main areas of opportunity where big data can be particularly useful in the health care sector. First, it can be used to improve the monitoring of infectious disease outbreaks. For example, it can be employed for preventative health surveillance by monitoring the health conditions of populations to detect possible epidemics before they occur. Second, it could enhance the prevention and detection of noncommunicable diseases. For example, United Nations (UN) Global Pulse (an initiative by the UN to bring big data and artificial intelligence to development programs) and the World Health Organization showed the potential of using big data to monitor risk factors associated with noncommunicable diseases (e.g., tobacco and alcohol use, diet, and physical activity). Finally, it can help improve treatment capacity through remote patient monitoring. For instance, big data-reliant devices can help alert physicians when there are potential issues such as heart failure.

Social Welfare and Protection
The COVID-19 pandemic has highlighted the need to improve the delivery of social welfare and protection programs. In the Philippines, ₱207 billion ($4.2 billion) was allocated as of December 2020 to the Department of Social Welfare and Development for cash assistance to low-income families. 1 However, a number of existing challenges have affected the targeting and delivery of social assistance programs (e.g., lack of reach to the informal sector and inadequate granularity of poverty data). For example, many informal sector workers in Cambodia who have lost their jobs due to COVID-19 cannot benefit from the government's cash support due to their unregistered status. 2 Big data has three applications that can address the challenges in the development and delivery of social welfare and protection programs. First, it could help in identifying beneficiaries. A study conducted by the UN Global Pulse and World Food Program on the Tabasco flood in Mexico found that real-time information derived from mobile phone usage patterns can help authorities and humanitarian agencies pinpoint areas of acute need with a high level of speed and precision. 3 Second, it could improve program delivery and detect fraud. For instance, rule-based algorithms can flag suspicious correlations such as a person receiving unemployment benefits while filing for a work-related accident. Third, it could help in assessing program effectiveness. For example, using anonymized financial account data from banks or digital wallet providers allows the analysis of household behavior shifts after receiving cash transfers and how this money was spent.
1 Government of the Republic of the Philippines. Investor Relations Office. https://iro.ph/articledetails.php?articleid=3617&catid=11.
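The rule-based fraud check mentioned above can be illustrated with a minimal sketch. It is not drawn from any actual government system: the `BenefitClaim` record, the claim labels, and the sample data are all hypothetical, and a flagged case would still need human review before any action is taken.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class BenefitClaim:
    person_id: str
    kind: str      # hypothetical labels: "unemployment" or "work_accident"
    start: date
    end: date


def overlaps(a: BenefitClaim, b: BenefitClaim) -> bool:
    """True when the two claim periods share at least one day."""
    return a.start <= b.end and b.start <= a.end


def flag_conflicting_claims(claims: list[BenefitClaim]) -> set[str]:
    """Return IDs of people whose unemployment claim overlaps a
    work-accident claim. This is a single illustrative rule."""
    by_person: dict[str, list[BenefitClaim]] = {}
    for claim in claims:
        by_person.setdefault(claim.person_id, []).append(claim)

    flagged = set()
    for person_id, person_claims in by_person.items():
        unemployment = [c for c in person_claims if c.kind == "unemployment"]
        accidents = [c for c in person_claims if c.kind == "work_accident"]
        if any(overlaps(u, a) for u in unemployment for a in accidents):
            flagged.add(person_id)
    return flagged


# Invented sample records: person A's accident falls inside the
# unemployment period, person B's does not.
claims = [
    BenefitClaim("A", "unemployment", date(2020, 4, 1), date(2020, 9, 30)),
    BenefitClaim("A", "work_accident", date(2020, 6, 15), date(2020, 6, 15)),
    BenefitClaim("B", "unemployment", date(2020, 4, 1), date(2020, 5, 31)),
    BenefitClaim("B", "work_accident", date(2020, 8, 1), date(2020, 8, 1)),
]
print(flag_conflicting_claims(claims))
```

In practice such rules would run over linked administrative records from several agencies, which is why the data exchange and data quality enablers discussed in this report matter.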

Education
In the education sector, there is a potential to make use of social media platforms as well as online job portals which provide a large amount of data on jobs and skills trends. While student records are not yet fully digitized in most countries, the increasing adoption of online learning solutions has presented valuable opportunities for obtaining insightful data for education policies. For example, many educational institutes in Southeast Asia shifted to online learning platforms during the COVID-19 pandemic. Such initiatives to expand the adoption of digital technologies help build a foundation for applying data-driven analytics in the focus countries. The benefits brought about by data-driven technologies, such as personalized learning and online job matching, are estimated at an annual $77.1 billion across Southeast Asia by 2030.
There are three main areas where big data can be particularly useful in the education sector. First, big data can help in identifying skill gaps. For example, data from online job portals can be analyzed to understand employment and skill trends and develop training courses that respond to industry needs. Second, big data could help increase graduation rates and prevent dropouts. By leveraging big data analytics, schools can look into the vast number of student records to identify early warning signs and provide targeted support to those in need. Finally, big data could provide a personalized learning experience to cater to the unique needs of students. For instance, by collecting data on how students interact with the virtual learning environment, the online sources they use for research, their participation in chats and forums, the areas that they struggle with, as well as the way they present information, schools can have a better understanding of their abilities and learning styles.
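As a rough illustration of the skill gap analysis described above, the sketch below counts how often each skill appears in job postings and lists the in-demand skills that no training course covers. The function name, the input format, and the sample data are assumptions made for illustration; real job portal data would first need cleaning and a common skills taxonomy.

```python
from collections import Counter


def skill_gaps(postings, courses, top_n=3):
    """Rank skills by the number of postings that mention them and
    keep the ones that no course teaches.

    postings: list of skill lists, one per job posting.
    courses:  list of skill lists, one per training course.
    """
    # Count each skill at most once per posting.
    demand = Counter(skill for posting in postings for skill in set(posting))
    taught = {skill for course in courses for skill in course}
    # Sort by demand (highest first), then alphabetically for stable output.
    ranked = sorted(demand.items(), key=lambda item: (-item[1], item[0]))
    return [(skill, count) for skill, count in ranked if skill not in taught][:top_n]


# Invented sample data.
postings = [
    ["sql", "python"],
    ["python", "data visualization"],
    ["python", "sql", "cloud"],
    ["cloud"],
]
courses = [["sql", "spreadsheets"]]

for skill, count in skill_gaps(postings, courses):
    print(f"{skill}: mentioned in {count} postings, not covered by any course")
```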

Policy Enablers
However, to unlock the potential of big data in public service delivery, seven policy enablers would be required. These are (i) strategic governance, (ii) availability and quality of data, (iii) risk mechanisms, (iv) human capital for big data, (v) access to relevant technologies, (vi) data-driven culture, and (vii) information and communications technology (ICT) infrastructure.
The digitization of public sector data and access to big data are growing, and advances in technology could facilitate wider application. However, governments need to lay the strategic and technical groundwork to maximize the opportunities of big data and mitigate its risks, including those related to data privacy, fraud, and cybersecurity. By 2025, the value of Southeast Asia's internet economy could triple to $300 billion. The broader impact on the rest of the economy could, however, be far larger than this. It is estimated that Southeast Asia's digital economy 2 has the potential to increase annual regional gross domestic product by $1 trillion through 2025 as compared to 2015. 3 Governments are actively trying to drive this digital transformation, foster the growth of their digital economy, and leverage a range of technologies to improve public service delivery. Big data and other technologies that build on it such as artificial intelligence (AI) can play a transformative role in the public sector. In particular, in the response to the coronavirus disease (COVID-19) pandemic and the subsequent recovery, big data can generate unique insights and help public institutions stay on top of the wide range of challenges they are facing.
This section focuses on big data applications and opportunities in three public service sectors-health care, social protection and assistance, and education. These three sectors were selected as the focus of this report for two reasons: (i) their importance in the economic recovery from COVID-19 and building resilience to future pandemics and (ii) their relevance for policy makers (informed by consultations with policy makers in the focus countries). A set of enabling factors that are crucial for deriving the maximum value from big data for governments are then discussed and finally, a set of recommendations, including pilots and policy reforms, are outlined to help countries capture the big data opportunity.

Big Data and Big Data Analytics
Big data presents large opportunities for public service improvements.
Simply put, big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze (Box 1 provides further details on the definition of big data).
There are four drivers influencing the potential value of big data by sector:
• Volume of data. The larger the amount of data in the sector, the more it indicates the potential to benefit from utilizing big data analytics. This depends not only on the volume of potential data, but also on how much is currently digitized.
• Variety of data. The more different forms of data available in the sector (e.g., social media, video content, and structured data), the more potential value there could be in combining them to generate unique insights.
• Veracity of data. The higher the quality or accuracy of the data, the better the potential insights.
• Value of applications. The degree to which there are specific applications in that sector that can deliver value.

Box 1: What are Big Data and Big Data Analytics?
"Big data" refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze. a This incorporates an evolving definition of how big a dataset needs to be in order to be considered big data as the size of datasets that qualify as big data will increase as technology advances over time.
Depending on what kinds of software tools are commonly available and what sizes of datasets are used in a particular industry, big data can range from a few dozen terabytes to multiple petabytes (thousands of terabytes). Some practitioners consider a set of data "big" if its size is larger than the size of their computer's processing power.
"Big data analytics" refers to the process of collecting, organizing, and analyzing big data to discover trends and patterns in large amounts of raw data to help make data-informed decisions. The collection and storage of big data have been facilitated by cloud computing which allows larger storage capacity, faster computing power, and flexible scaling of resources without the need for on-premises hardware. Meanwhile, the process of analyzing big data requires new data analysis methods such as data mining (i.e., looking through large datasets to identify patterns and relationships by identifying anomalies and creating data clusters), predictive analytics (i.e., using an organization's historical data to make predictions about the future and identify upcoming risks and opportunities), and deep learning (i.e., a subset of artificial intelligence which involves using multiple layers of neural network algorithms to find complex patterns in various datasets).
Our research has shown that the opportunity for capturing value from big data in three public service sectors-health care, social welfare and protection, and education-is significant (Table 1; see Table A1 of Appendix 1 for detailed methodology). In the health care sector, for example, there is a large amount of data from health records that can be leveraged for big data analyses. This database can be combined with data from other sources such as social media, smartphone applications, and remote monitoring systems to support policies related to prevention, detection, and treatment of diseases. In the education sector, there is a potential to make use of social media platforms as well as online job portals which provide a large amount of data on jobs and skills trends. While student records are not yet fully digitized in most countries, the increasing adoption of online learning solutions has presented valuable opportunities for obtaining insightful data for education policies. 
For example, most schools in Indonesia shifted to online learning platforms during the COVID-19 pandemic. The transition was supported by the Ministry of Education and Culture which partnered with EdTech companies to provide free access to online learning platforms and with telecommunications operators to provide free internet quotas for teachers and students. 4 Meanwhile, in the Philippines, the government issued a memorandum to close the gap in resources and facilities in "last mile schools," including improving internet connection and installing computerized program packages, even before the COVID-19 pandemic. 5 Such initiatives to address the digital divide and expand the adoption of digital technologies help build a foundation for applying data-driven analytics in the focus countries. Regarding data on social assistance [...]
4 D. Gupta and N. Khairina. 2020. COVID-19 and Learning Inequities in Indonesia: Four Ways to Bridge the Gap. World Bank Blogs. 21 August. https://blogs.worldbank.org/eastasiapacific/covid-19-and-learning-inequities-indonesia-four-ways-bridge-gap.
5 Government of the Philippines, Department of Education. https://www.deped.gov.ph/2019/05/22/may-22-2019-dm-059-s-2019-prioritizingthe-development-of-the-last-mile-schools-in-2020-2021-reaching-out-and-closing-the-gap/.

A. Digital Economy Trends
Big data applications in health care can improve the monitoring of infectious diseases, enhance the prevention and detection of noncommunicable diseases, and improve treatment capacity.
Prior to COVID-19, total government expenditure on health care in Southeast Asian countries was estimated at between 0.8% and 3.4% of the gross domestic product (GDP) of these countries. 6 Health care spending in the region has significantly increased due to the COVID-19 pandemic. For example, Indonesia reallocated Rp 27 trillion ($1.8 billion) to fund the health care system in March 2020 to manage the impact of the pandemic, and announced a plan to reallocate additional budget to provide free vaccines to Indonesians. 7 Data-driven technologies, particularly big data applications, could help improve the delivery of health care services and reduce costs. Across Southeast Asia, the benefits of rolling out data-driven public health interventions and remote patient monitoring alone are estimated to be worth $24.9 billion annually by 2030 (Figure 1; see Table A2.1 of Appendix 2 for detailed methodology).
• Improving the monitoring of infectious diseases. Ensuring effective health service delivery has become even more critical in light of the COVID-19 pandemic. For example, more than 1 million people across the five focus countries have been infected with the virus as of December 2020. 8 As a result, health systems of these countries are faced with a number of difficulties that affect their ability to manage infectious disease outbreaks. For instance, the Ministry of Health of Cambodia and the World Health Organization (WHO) have identified deficiencies in the country's prevention and detection capacities, including human resource constraints, a lack of capacity for case detection and contact tracing, and challenges in health emergency coordination at subnational levels. 9 There are several ways in which big data can play a role in the monitoring of infectious diseases. Big data can be employed for preventative health surveillance by monitoring health conditions of populations to detect possible epidemics before they occur. 
For instance, data from smartphone-connected thermometers can allow for real-time tracking of influenza activity. A study analyzing data from over eight million temperature readings generated by almost 450,000 smartphone-connected thermometers used by households 10 in the United States showed that the data were highly correlated with information obtained from traditional disease surveillance systems and could potentially predict influenza activity up to 2 to 3 weeks in advance. 11 Furthermore, search engines and social media are also valuable data sources that can help detect the emergence of possible epidemics. In a study that collected data of Google searches and Twitter messages related to influenza in Greece and compared them against official statistics, it was found that Google and Twitter data produced precise estimates of the influenza development and had the potential to predict influenza before it is observed in the population. 12 Big data has also been essential for contact tracing during the COVID-19 pandemic. Data from mobile phones, transport systems, or social media can be used to identify location and track travel patterns of diagnosed or suspected cases to support disease tracking and provide early warning for populations at risk. A number of countries have leveraged location data (e.g., global positioning system [GPS] data) and Bluetooth data from mobile phones for contact tracing during the pandemic. For instance, Singapore's TraceTogether app collects proximity data based on exchanges of Bluetooth signals to identify people who have prolonged proximity with infected cases. 13 Box 2 provides another example of using big data for contact tracing and surveillance in the Republic of Korea. In addition, countries can also stay one step ahead by leveraging big data to support the delivery and implementation of vaccines. 
Big data can be used to ensure that vaccines are stored within a precise range of temperatures, preserving their quality from the manufacturer to the point of use. [...] data including origin, destination, delivery route, external weather, and logistics providers. 14 In another partnership between Google.org-the charitable arm of Google-and Gavi, a wireless temperature monitoring system called ColdTrace was built to provide real-time data on refrigerators used to store vaccines. 15 The system collects data from sensors placed inside vaccine refrigerators and will notify key personnel when vaccines are in danger. 16 Regarding the implementation of vaccination programs, big data can be used to identify priority populations such as particularly vulnerable groups (e.g., elderly people with preconditions) and "super spreaders" (i.e., individuals with high levels of close contact with many others such as those working in the service industry). Box 3 shows an example of an application used to identify vulnerable populations in the early days of the COVID-19 pandemic, which can be deployed for vaccination programs. When it comes to mass vaccination campaigns, data from social media platforms can be used to understand public perceptions and concerns around vaccines, and develop strategies to address them.
10 The study used data from commercially available Kinsa Smart Thermometers which record and store temperature measurements, using the Kinsa smartphone application. When recording temperatures, users can assign readings to profiles by age and sex, allowing readings from multiple users within a household to be distinguished. Readings are geocoded using Global Positioning System location (for enabled devices) or by Internet Protocol address.
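The kind of alerting that a refrigerator monitoring system performs can be sketched as follows. This is an illustrative example only and does not describe how ColdTrace itself is implemented. The 2°C to 8°C safe range, the 30-minute tolerance, the function name, and the sample readings are all assumptions; actual limits depend on the vaccine and on national cold chain guidelines.

```python
from datetime import datetime, timedelta

SAFE_MIN_C = 2.0  # assumed lower bound; actual limits depend on the vaccine
SAFE_MAX_C = 8.0  # assumed upper bound


def find_excursions(readings, min_duration=timedelta(minutes=30)):
    """Return (start, end) periods during which the temperature stayed
    outside the safe range for at least min_duration.

    readings: list of (datetime, temperature_celsius), sorted by time.
    """
    excursions = []
    start = None  # first out-of-range reading of the current run
    last = None   # most recent out-of-range reading of the current run
    for timestamp, temp_c in readings:
        if temp_c < SAFE_MIN_C or temp_c > SAFE_MAX_C:
            if start is None:
                start = timestamp
            last = timestamp
        else:
            if start is not None and last - start >= min_duration:
                excursions.append((start, last))
            start = None
    # Close a run that is still open at the end of the data.
    if start is not None and last - start >= min_duration:
        excursions.append((start, last))
    return excursions


# Invented sensor readings taken every 15 minutes.
first_reading = datetime(2021, 1, 1, 8, 0)
temps = [4.5, 5.0, 8.6, 9.1, 9.4, 6.0, 8.2, 5.5]
readings = [(first_reading + timedelta(minutes=15 * i), t) for i, t in enumerate(temps)]

for begin, end in find_excursions(readings):
    print(f"Alert: out of range from {begin:%H:%M} to {end:%H:%M}")
```

Requiring a minimum duration is one simple way to avoid alerts for brief spikes such as a door opening; a real deployment would tune this tolerance to the storage requirements of each vaccine.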
• Enhancing the prevention and detection of noncommunicable diseases. There are longer term pressures on each country's health care system such as the rising burden of noncommunicable diseases. In Thailand, for example, the prevalence of noncommunicable diseases (e.g., cardiovascular diseases, cancers, diabetes, and chronic respiratory diseases), driven by the country's high level of alcohol and tobacco use, has become a pressing public health issue. 17
Note: Total health loss is a measure of how much healthy life is lost due to early death, illness, or disability as a result of certain health conditions and their consequences.

Box 2: Leveraging Big Data for Disease Surveillance during COVID-19 in the Republic of Korea
Ten minutes is all it takes for the Republic of Korea authorities to track the travel history of a coronavirus disease (COVID-19) patient using big data, compared to about 1 day if a manual epidemiological survey were used. This is enabled by the COVID-19 smart management system (SMS), a system that leverages big data provided by 28 organizations, including the police, credit card companies, and telecom service providers.
Jointly developed by the Korea Centers for Disease Control and Prevention, the Ministry of Land, and the Ministry of Science and ICT, the system was rolled out in March 2020, a month after the number of cases spiked. The system analyzes a person's credit card transactions and mobile phone location records to instantly map out a virus transmission route and identify potential infection hot spots. Big data was used in combination with direct interviews of the infected people. This allowed the Korea Centers for Disease Control and Prevention to identify and isolate potential cases early, and openly share risk alerts to help other citizens stay safe. By relying on aggressive contact tracing and widespread testing, the Republic of Korea was able to contain the outbreak without resorting to drastic measures such as blanket lockdowns.
[...] risks when hit by pandemics such as COVID-19. 19 Big data can play an important role in the prevention and detection of health risks related to noncommunicable diseases. One project collected data from hospitals and population studies of five European countries and used it to detect the risk of developing diabetes in a given population (Box 4). In another example, United Nations (UN) Global Pulse (an initiative by the United Nations to bring big data and artificial intelligence to development programs) and the WHO showed the potential of using big data to monitor risk factors associated with noncommunicable diseases (e.g., tobacco and alcohol use, diet, and physical activity). The study found that indices for risk factors could be built and tracked over time on social media such as Twitter. Internet search traffic using keywords such as "stop smoking" could also be analyzed to provide faster and cheaper information on noncommunicable diseases.

Box 3: Analyzing Big Data to Identify Priority Populations for Vaccination Programs
Big data can be used to identify priority populations for vaccination programs such as people with pre-existing chronic diseases and those working in the service industry. For example, Blue Shield of California, an insurer in the United States, used big data analysis to identify clients who were most vulnerable to coronavirus disease (COVID-19). a The company leveraged a machine learning platform to analyze different factors such as individuals' health history combined with social and environmental conditions, and the most up-to-date medical literature on COVID-19. This helped identify a number of risk factors that could have been overlooked such as location (e.g., individuals who did not live within a close-enough vicinity to a grocery store were at an increased risk of ending up in the hospital, on a ventilator, or even dying from COVID-19) and underlying conditions (e.g., individuals who had experienced severe mental health issues were at greater risk). Based on these findings, Blue Shield was able to provide targeted health counseling and support services to its members, including free meal delivery, medication delivery, telemedicine, and in-home clinical visits.
• Improving treatment capacity through remote patient monitoring. When it comes to the treatment of diseases, the lack of facilities and lack of human resources are critical challenges across many countries in Southeast Asia, which have been exacerbated during the pandemic. For example, with only eight hospital beds per 10,000 persons, Cambodia is faced with a shortage of facilities, particularly at the national and provincial levels. 21 Meanwhile, in the Philippines, the limited number of health care workers, at only 17 health care workers (four doctors, nine nurses, and four midwives) per 10,000 persons, is a major constraint leading to low treatment capacity. 22 Systems that leverage big data to improve treatment capacity through remote patient monitoring can include devices that monitor heart conditions and blood-sugar levels and then feed data in near real-time to electronic medical record databases. They can also alert physicians when there are potential issues such as heart failure. The use of data from remote monitoring systems can improve productivity, reduce patient in-hospital bed days, and cut emergency department visits. A study conducted by Singapore General Hospital in 2019, which piloted vital signs trackers (including a blood pressure cuff and biosensor) among a group of patients, showed significant productivity improvements. As compared to in-person checks, approximately 9 minutes were saved by remotely monitoring a patient hourly for 6 hours, and up to 22 minutes were saved when the hourly monitoring stretched across 12 hours. 23 Remote patient monitoring has been shown to have significant economic impact through reduced hospital visits, length of patients' stays, and medical procedures. The McKinsey Global Institute (MGI) estimates that applying remote patient monitoring systems could save health care systems 10%-20% through fewer hospital visits, shorter patient stays, and fewer procedures. 
Applying this impact to Southeast Asia's context, one could expect cost reductions of $9.4 billion annually by 2030.
An assessment of the extent of big data usage in health care across the five focus countries reveals several important initiatives. For example, Thailand's Ministry of Public Health highlighted the use of big data as one of the major reforms in its eHealth Strategy (2017-2026). 24 The ministry is also encouraging hospitals under its supervision to tap the power of big data in tackling challenges in the prevention and treatment of diseases. Meanwhile, in Indonesia, social media data were analyzed to provide real-time insights on public perceptions on immunization in a study conducted by the Ministry of Development Planning (Bappenas), Ministry of Health, United Nations Children's Fund, WHO, and Pulse Lab Jakarta. 25 In particular, analysis of relevant conversations on Twitter shed light on public concerns around immunization such as religious issues and side effects of vaccines. The data obtained from public tweets helped identify a network of Twitter influencers (accounts with a large number of engaged followers) that could be leveraged by public health practitioners for rapid response to public concerns and misinformation related to vaccines and immunization.

Social Welfare and Protection
Social protection and assistance schemes can leverage big data during design, delivery, and review of programs.
The COVID-19 pandemic has highlighted the need to improve the delivery of social welfare and protection programs. In the Philippines, ₱35 billion ($730 million) was allocated to aid displaced workers through cash-for-work and other forms of assistance while ₱241 billion ($5 billion) was allocated to the Department of Social Welfare and Development for cash assistance in areas under lockdown. 26 Cambodia also launched the Cash Transfer Program for Poor and Vulnerable Households during COVID-19 to provide a monthly allowance to poor households affected by the pandemic, with total spending estimated at KR100-KR120 billion ($25-$30 million) per month. 27 In addition, $100 million was allocated to a cash-for-work program to absorb workers who have lost employment at factories or enterprises and those who have returned home from abroad, while a monthly allowance of $40 will be given to each worker in the garment and tourism industries until March 2021. 28 However, a number of existing challenges have affected the effectiveness of social assistance programs (e.g., lack of reach to the informal sector and partial coverage of poverty data). For example, many informal sector workers in Cambodia who have lost their jobs due to COVID-19 cannot benefit from the government's cash support due to their unregistered status. 29 Social protection and assistance programs in Asia and the Pacific will come under further strain in the years to come. 
21 World Health Organization data. https://www.who.int/data/gho/data/indicators/indicator-details/GHO/hospital-beds-(per-10-000-population
While developing countries in the region have seen tremendous progress on extreme poverty reduction over the last decade (from 1.1 billion or 33.5% of the population in 2002 to 263 million or 6.9% in 2015), COVID-19 is estimated to have reversed over 50% of the progress on poverty reduction in the last 5 years, seeing 78 million additional people above pre-COVID-19 estimates slip back into extreme poverty in 2020 alone ( Figure 2  Big data has several applications that can address challenges in the development and delivery of social welfare and protection programs, including better targeting of beneficiaries and improving the design of these programs through more tailored interventions. These include: • Identifying beneficiaries. In addition to the increased need for social protection and assistance, there exist major challenges that may hinder the impact of social welfare and protection response in the focus countries. For example, while Thailand provides a relatively high level of social protection and is making use of electronic payments to improve service delivery, the country is faced with challenges in identifying target populations, particularly for the Child Support Grant which offers cash to households with children less than 6 years old and the Social Welfare Card which provides transport and gas subsidies as well as subsidies on food and other necessities to low-income individuals at designated stores. The fragmented and inconsistent management of social protection data, coupled with limited coordination among government agencies, have resulted in information silos and affected the accuracy of targeting. 31 In the Philippines, the lack of granularity of poverty data obtained from household surveys is a key constraint in identifying populations living under poverty, particularly in remote areas. 
32 Identifying which segments of the population should be targeted with social assistance programs can be difficult in the absence of frequently updated databases on the socioeconomic conditions of different groups. Alternative data sources such as cell phone data or satellite imagery can complement official statistics by providing more granular and up-to-date insights into vulnerable populations. Such innovative datasets have supported important applications in poverty reduction as well as disaster risk management. For example, in a World Bank project in Guatemala, cell phone call records were analyzed to assess users' socioeconomic behavior, including consumption, mobility, and social patterns, to produce poverty estimates that were more cost-effective and more up to date than traditional survey data. 33 Innovative data sources could also play a vital role in disaster response and recovery. For instance, a study conducted by the UN Global Pulse and World Food Program on the Tabasco flood in Mexico found that real-time information derived from mobile phone usage patterns can help authorities and humanitarian agencies pinpoint areas of acute need with a high level of speed and precision. 34 By analyzing millions of aggregated and de-identified mobile phone datasets, the research team was able to map the flow of people moving across a region, identify the most damaged areas, and gain insights into affected populations.

Mobile phone data | Potential insight
Price of handset | Proxy for income
Mobility between cell sites (including internationally) | Travel patterns and movements between regions
Top-up amount, denomination, and frequency | Monthly expenditure and usage as proxy for socioeconomic status
Use of services and applications (e.g., voice, SMS, data, 2G, 3G, and 4G) | Basic education or literacy profile and consumption propensity
Branchless banking remittances (inward and outward) | Estimation of receipts or payments to augment income estimation
Source: AlphaBeta analysis.
• Another potential application of big data involves analyzing existing social welfare and protection programs to identify gaps and improve their design. In particular, governments can analyze records of past beneficiaries and the results of interventions, and use these insights to develop tailored programs. Box 5 provides a case study on the use of big data in managing unemployment benefits and job placement services.
• Improving program delivery and detecting fraud. Beyond providing an alternative source of data on at-risk populations, big data can also improve transparency and reduce risks of errors and fraud in identifying beneficiaries. This can be achieved by using algorithms to crawl through big data from a variety of sources to detect inconsistencies. For instance, rule-based algorithms can flag suspicious correlations such as a person receiving unemployment benefits while filing for a work-related accident. Other potential data sources such as bank statements or digital wallets can support the real-time refinement and tailoring of assistance provided to beneficiaries and optimize graduation or phasing out of cash transfers.
• Assessing program effectiveness. In the longer term, countries will also need to strengthen their social welfare and protection efforts to extend other forms of support to the broader population in a post-COVID world (e.g., helping unemployed individuals or informal sector workers find new employment). Big data can be used to better evaluate the impact and success of social assistance programs. For example, anonymized financial account data from banks or digital wallet providers allow analysis of how household behavior shifts after receiving cash transfers and how this money is spent. Similar approaches have been used in Australia to understand the impact of tax cuts on small business spending, using data from accounting software. 35
There have been several initiatives across the five focus countries to leverage big data in the area of social welfare and protection; however, most are still at an exploratory or early phase. For example, a pilot study in the Philippines and Thailand that explored the use of satellite imagery in mapping poverty levels has shown the potential of using innovative data sources to complement traditional poverty statistics (Box 6).
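The rule-based consistency check mentioned under program delivery and fraud detection can be illustrated with a minimal sketch. It assumes two hypothetical record types, `BenefitPeriod` and `AccidentClaim`, and flags any person whose work-related accident claim falls within a period in which the same person received unemployment benefits. The record layout and function name are illustrative assumptions, not a description of any agency's actual system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class BenefitPeriod:
    """Hypothetical record: a period during which unemployment benefits were paid."""
    person_id: str
    start: date
    end: date


@dataclass(frozen=True)
class AccidentClaim:
    """Hypothetical record: a claim filed for a work-related accident."""
    person_id: str
    accident_date: date


def flag_inconsistencies(benefits, claims):
    """Return (person_id, accident_date) pairs where a work-related accident
    is dated inside a period of unemployment benefits for the same person."""
    periods_by_person = {}
    for period in benefits:
        periods_by_person.setdefault(period.person_id, []).append(period)

    flagged = []
    for claim in claims:
        for period in periods_by_person.get(claim.person_id, []):
            if period.start <= claim.accident_date <= period.end:
                flagged.append((claim.person_id, claim.accident_date))
                break  # one flag per claim is enough
    return flagged
```

A flag produced this way is only a prompt for manual review: the overlap may have a legitimate explanation, such as a data entry error in one of the source systems.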

Box 5: Improving the Management of Unemployment Benefits and Job Placement Services Using Big Data
The Bundesagentur für Arbeit (BA) or the Federal Employment Agency is the provider of labor market services in Germany with a network of 156 employment agencies and approximately 600 branches nationwide. a The agency is responsible for providing job placement and career counseling, promoting vocational and further training, and distributing unemployment benefits.
To improve its services, BA analyzed a large amount of historical data of its customers, including the histories of unemployed workers, the interventions that were implemented, as well as the outcomes (e.g., how long it took people to find a job). Based on this analysis, the agency was able to evaluate the characteristics of its unemployed and partially employed customers and develop a segmented approach that offered more effective placement and counseling services to targeted customer segments. After analyzing the outcome data of its placement programs, BA identified programs that were ineffective and removed or refined them using new approaches.
This allowed the agency to optimize its programs and reduce total spending by €10 billion ($12 billion). b The amount of time that unemployed workers took to find employment was also reduced, leading to higher satisfaction ratings.

In Indonesia, the application of mobile data in mapping migration patterns was explored in a study conducted by Pulse Lab Jakarta and Bappenas in 2019. Using pseudonymous cellular data, the study provided a high level of granularity that allowed the Government of Indonesia to see the origins of individuals who migrated to large cities such as Jakarta, Medan, and Makassar. In addition to identifying migrant source communities and destination cities, the research also revealed essential insights on the volume and direction of rural-to-urban migration across Indonesia's vast archipelago. 36 The Government of Cambodia has also taken the first step in developing the infrastructure required to enable big data applications in social welfare and protection. Led by the Ministry of Planning, the country implemented a digital identification system called "IDPoor" that serves as a single basis for targeting the poor population. This digital database has been used during the COVID-19 pandemic to identify more than 560,000 poor households for the "Cash Transfer Program for Poor and Vulnerable Households during COVID-19." 37 As a next step, there are plans to integrate "IDPoor" with databases from other government agencies such as the National Social Security Fund and the Ministry of Health to support the design and delivery of social welfare and protection programs in Cambodia.

Box 6: Mapping Poverty Levels in the Philippines and Thailand Using Satellite Imagery
Poverty statistics play a crucial role in identifying people who are at risk of socioeconomic exclusion and supporting the design of social protection programs. However, there are a number of challenges in the compilation of poverty statistics through household-based surveys. In particular, survey sample sizes are typically not large enough to provide reliable estimates at granular levels (e.g., municipalities and villages), and may, therefore, not be able to assist policy makers in efficiently targeting population segments that have the greatest need for poverty reduction programs. Furthermore, the relevance and timeliness of poverty statistics require that such surveys be conducted frequently, which can be costly for national statistics offices.
Big data can address these limitations by providing an alternative source of poverty estimates to complement traditional statistics. A study conducted by the Asian Development Bank (ADB) has shown the potential of using satellite imagery to generate richer insights on poverty levels. This approach was tested in two countries with different poverty profiles, the Philippines and Thailand, in collaboration with the countries' national statistics offices. Using machine-learning algorithms, the study predicted night-time light intensity, a proxy for human settlements and economic wealth, using daytime satellite images as input. It was found that even with publicly accessible satellite imagery, whose resolution is not as fine as that of proprietary images, this method enabled more granular predictions of poverty levels than those currently being compiled by national statistics offices in both countries. The poverty map generated using satellite data has practical uses in mitigating the impact of coronavirus disease (COVID-19), such as identifying vulnerable households for food distribution.

Education
The use of big data could transform outcomes in education and employment.
COVID-19 has had a significant impact on employment in Southeast Asia. For example, youth across six Southeast Asian countries (Cambodia, Indonesia, the Lao People's Democratic Republic, the Philippines, Thailand, and Viet Nam) are expected to face a total of 4.3 million job losses in 2020 (Figure 3).
Much of the unemployment caused by COVID-19 could become long term, as the pandemic has in many cases accelerated underlying trends of retrenchment driven by automation and changes in the industry composition of economies. Certain types of workers across the five focus countries have been particularly affected: youth, informal workers, and migrant workers have experienced the largest unemployment impacts from the pandemic (Table 3). Youth, particularly graduates entering the labor force, face gloomy job market prospects as hiring declines. Young workers who were employed before COVID-19 have been particularly affected, as many work in the areas most impacted by the pandemic, such as services. A global study by the International Labour Organization (ILO) found that one in six young people (17%) who were employed before the outbreak stopped working altogether, most notably younger workers aged 18-24. 38 Furthermore, freelance and informal workers whose income relies on ad hoc projects and daily work have seen a drastic decline in demand for their work. In addition to typically being accorded less priority in government COVID-19 policy, foreign workers face a range of newly imposed restrictions on work permits.
Education and skill development will be key tools for Southeast Asian countries to deal with this surge in unemployment and get affected populations back into the workforce. Beyond the need to reskill individuals impacted by the COVID-19 pandemic (where their roles may not return), there are also broader challenges facing the education sector in the long term such as ensuring the responsiveness of the education system to changing skill needs as well as increasing graduation rates. Data-driven applications could help address these challenges and improve the effectiveness of the education systems. The benefits brought about by such applications are estimated at an annual $77.1 billion across Southeast Asia by 2030 (Figure 4; see Table A2.2 of Appendix 2 for detailed methodology).
There are three main areas of big data applications that can help address challenges posed by the pandemic as well as improve the effectiveness and resilience of the education system in the long term. These include bridging the existing and emerging skills gap, increasing graduation rates and preventing dropouts, and providing a personalized learning experience for students.
38 ILO. 2020. Youth and COVID-19: Impacts on Jobs, Education, Rights and Mental Well-Being. https://www.ilo.org/wcmsp5/groups/public/---ed_emp/documents/publication/wcms_753057.pdf.
Notes to Table 3: COVID-19 = coronavirus disease. a "Large" refers to groups of workers that are assessed to have experienced significant unemployment impact due to COVID-19. "Moderate" refers to groups of workers that are assessed to have experienced some unemployment impact due to COVID-19. "Limited" refers to groups of workers that are assessed to be minimally affected by COVID-19. Sources: Review of publicly available information on the impact of COVID-19 on employment; AlphaBeta analysis.
• Identifying skills gap. Education is a key sector accounting for between 9% and 20% of total government spending in the five focus countries. 39 Yet, many Southeast Asian countries still experience significant mismatches between the skill profiles generated by the education system and those demanded by the market. For example, Indonesia reported a skills gap due to the lack of industry involvement in skills development and in developing labor market information to capture emerging skills. 40 Cambodia, while having achieved notable progress in education, faces a worsening skills gap as the skills taught by technical and vocational education and training institutions do not appear to be adequately linked to those required by industry. 41 In the Philippines, the mismatch between jobs and skills has resulted in unemployment and underemployment among college-educated individuals. 42 This skills gap has widened due to COVID-19 as the pandemic has significantly accelerated digital transformation among businesses. 43 Digital skills such as using information technology (IT) tools, analyzing digital information, and collaborating through digital platforms are increasingly a basic requirement across sectors.
Big data can be used to address this skill mismatch through better skills gap identification, career advice, tailored learning, and better job matching, as well as in developing a responsive education system in the long term. For example, data from online job portals can be analyzed to understand employment and skill trends and develop training courses that respond to industry needs. These job portals contain valuable information about the jobs and skills that companies demand, which can offer detailed insights into employers' needs according to sectors, professions, and types of skills. 44 Additionally, digital platforms such as LinkedIn and Facebook are increasingly becoming innovative recruitment channels, driven by growing opportunities for building relations and facilitating communication via social networks. With hundreds of millions of members, LinkedIn has the potential to offer a new, timely, and granular source of data about emerging industries, workers' changing skills composition, and how they are engaging with the labor market. The company has partnered with the World Bank Group in an initiative called "Digital Data for Development" to support innovative policy decisions in developing countries by providing insights on employment trends, skill needs and skill adoption across industries, and talent migration. 45 Box 7 shows an example of how governments can make use of such data in developing education and training policies.
Furthermore, some countries have leveraged emerging technology and analytics solutions to collect data on skills supply and demand, which can be used to develop education and training policies. An example of such an initiative is Nurturing Expert Talent (NEXT) launched by TalentCorp, a national agency that is driving Malaysia's talent strategy. NEXT is a proprietary assessment system to help individuals identify their strengths, passion, and the career choices that are most suited to their skill sets. 46 The data gathered by the NEXT initiative could help TalentCorp shape future policies with evidence-based talent supply and demand data for key job roles.
• Increasing graduation rates and preventing dropouts. Another issue pertinent to educators in the five focus countries is the need to provide timely support for disadvantaged and underperforming students to increase graduation rates and prevent dropouts.

Box 7: Using LinkedIn Data to Identify Skills Gap in South Africa
South Africa was faced with a critical task of managing double-digit unemployment rates and double-digit rates of individuals who were neither educated nor employed. The government, however, lacked the granular and actionable data on the types of talent and skills demanded by industry. To tackle this challenge, the World Bank partnered with LinkedIn to provide insights into the skills supply and demand in the country.
The analysis of LinkedIn data showed that South Africa had a strong global comparative advantage in traditional areas such as energy, mining, transport, and logistics and was slowly expanding as a regional leader in finance. However, the country lagged in sectors requiring digital skills (e.g., computer software and semiconductors). LinkedIn data also allowed for analysis of skill trends at subnational levels to identify unique growth capabilities of each region or city. For example, Cape Town's workforce was found to be competitive in areas related to business services, tourism, and creative work.
Based on the insights drawn from the analysis of LinkedIn data, policy makers were able to identify the most in-demand skills and develop strategies to produce a pipeline of talent in these areas. The low supply combined with strong demand for digital skills indicated significant upskilling and reskilling opportunities for the local workforce that would enable them to participate in the digital economy. The government also recognized the need to strengthen the education system from primary to tertiary levels to equip graduates with in-demand skills.

disadvantaged students in all subjects, with many disadvantaged students displaying low ambition or self-confidence. 47 This is demonstrated by the fact that about one in six high-achieving disadvantaged students (versus one in 100 high-achieving advantaged students) do not expect to complete tertiary education. 48 Cambodia also saw a high dropout rate of 18% and a low completion rate of 21% at the upper secondary level, with many students leaving school with insufficient cognitive and workplace skills to meet the expectations of employers (footnote 48). These problems are particularly dire in rural areas and among ethnic minority communities. In the long term, countries will need to improve the effectiveness of their education systems and build capabilities to stay resilient to future shocks. By leveraging big data analytics, schools can examine large numbers of student records to identify early warning signs and provide targeted support to those in need. This approach has been effectively adopted by Georgia State University in the United States to spot students who are at risk of dropping out, even before they realize it, and provide them with timely academic and financial support (Box 8).
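As a rough illustration of how early warning signs might be derived from student records, the sketch below applies two simple rules modeled on the kinds of indicators reported in Box 8 (poor results in first-year courses and late tuition fee payments). The record fields, thresholds, and function names are hypothetical and chosen only for illustration; a real system would rely on indicators and cutoffs estimated from historical data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudentRecord:
    """Hypothetical student record used for illustration."""
    student_id: str
    first_year_gpa: float      # grade point average on a 0.0-4.0 scale
    days_tuition_overdue: int  # 0 if tuition was paid on time


# Hypothetical thresholds; real cutoffs would be estimated from past outcomes.
GPA_THRESHOLD = 2.0
OVERDUE_THRESHOLD_DAYS = 14


def warning_signs(record):
    """Return the list of warning signs triggered by one student record."""
    signs = []
    if record.first_year_gpa < GPA_THRESHOLD:
        signs.append("low first-year grades")
    if record.days_tuition_overdue > OVERDUE_THRESHOLD_DAYS:
        signs.append("late tuition payment")
    return signs


def students_needing_support(records):
    """Map student_id to warning signs for students with at least one sign."""
    flagged = {}
    for record in records:
        signs = warning_signs(record)
        if signs:
            flagged[record.student_id] = signs
    return flagged
```

Returning the triggered signs, rather than a single yes-or-no flag, lets an adviser see why a student was flagged and choose between academic and financial support.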

Box 8: Georgia State University Leveraged Big Data to Prevent Dropouts and Increase Graduation Rates
Georgia State University saw a drop in graduation rates when it accepted more students with disadvantaged backgrounds, many of whom came from low-income families and were the first in their families to attend college. a While the university realized the need to provide support for students who were underperforming or facing financial issues, it lacked the resources to frequently monitor tens of thousands of students every semester (with only one adviser for every 1,000 students).
Big data presented an effective solution to this problem by pinpointing exactly which students required help and when. The university analyzed 140,000 student records containing 2.5 million grades from 10 years to identify indicators for predicting when students might be in danger of dropping out or failing out. b This analysis enabled the university to identify 800 unique circumstances that increase the likelihood of dropping out, such as poor academic results in first-year courses or late tuition fee payments. These insights were then used by advisers to develop targeted interventions to get students back on track. Students who might be in danger of dropping out would be contacted by an adviser within 48 hours for an in-person meeting. During the meeting, the adviser would probe to find out the student's particular challenges and help connect him or her with the appropriate resources (e.g., tutoring, emergency financial aid, or advice about other majors).
By utilizing big data in monitoring students' performance and understanding their specific situation, Georgia State University was able to increase graduation rates, especially among students from diverse racial, ethnic, and socioeconomic backgrounds.

• Providing personalized learning experience. The COVID-19 pandemic had drastic implications for how teaching was conducted. In the Philippines alone, 25 million students had to be taught remotely at home. 49 But the switch to remote learning has posed major challenges. According to the World Economic Forum's Association of Southeast Asian Nations (ASEAN) Youth Survey 2020, 69% of youths aged between 16 and 35 in Southeast Asian countries found it difficult to work or study remotely, including 7% who said it was impossible. While 31% of survey respondents found working or studying from home easy, only 13% reported no constraints at all. 50
Big data can help make remote learning more effective. Educators can analyze data on students' learning styles, areas of interest, abilities, and progress to customize teaching methods and curriculums to individuals' needs. For example, by collecting data on how students interact with the virtual learning environment, the online sources they use for research, their participation in chats and forums, the areas that they struggle with, as well as the way they present information, schools can have a better understanding of their abilities and learning styles. 51 Personalized learning paths can be developed where students, depending on their abilities, interests, priorities, and progress, can delve deeper into each subject with more relevant and effective learning methods. Arizona State University in the United States provides a good example of using big data to develop customized teaching methods.
Standard lectures on the university's general level mathematics course were replaced with a "mathematics emporium," which involves students sitting at their computers and working through course content in their own time, with the help of tutors. Each student was placed at the appropriate starting point based on their abilities, and then continually assessed as they progressed through the course. As a result, the class's success rate increased from about 65% to 85%. The system can also integrate courseware from different classes to identify gaps and direct students to the specific parts that they need to revise (e.g., if an engineering student is struggling in a physics class because of a misunderstanding of calculus, they would be guided to revise relevant parts of the calculus syllabus). 52
Across the five focus countries, the use of big data in education is currently limited. One example is observed in Thailand, where data from job portals were leveraged to improve the responsiveness of the education system during the COVID-19 pandemic. In particular, its Ministry of Education funded an initiative to examine the labor market situation and build a database of skill needs using data from 12 online job portals. This enabled policy makers to identify skills gaps and develop programs to retrain the workforce, particularly in the digital skills required to operate during the pandemic.

Seven policy enablers are crucial to unlocking the full potential of big data in public service delivery.
To capture the benefits of using big data, there is a need to create an enabling environment that enhances the availability and quality of data and encourages applications of big data analytics in government services. Seven policy enablers have been identified to support big data usage in government: (i) strategic governance, (ii) availability and quality of data, (iii) risk mechanisms, (iv) human capital for big data, (v) access to relevant technologies, (vi) data-driven culture, and (vii) ICT infrastructure. 54 Based on these seven policy enablers and an assessment of a range of indicators, several improvement opportunities emerge for the five focus countries (Table 4; see Table A3 of Appendix 3 for detailed methodology).
The following section explains these seven policy enablers in detail and provides an assessment of the five focus countries with regard to each policy enabler.
• Strategic governance. From the outset, governments should create a clear plan, road map, or national strategy to foster the digital transformation of public services and promote the use of big data applications in public service delivery. For example, a lack of clear strategic focus related to digital technologies has been cited as one of the main factors preventing governments in Latin America from increasing the adoption of cloud computing services. 55
enhance data management practices. 58 In Indonesia, there has also been a shift to online services in many government offices across the country over the past 5 years, with business licensing and civil registration services (e.g., providing national identity cards) leading the way. 59 Furthermore, the Ministry of Research and Technology has launched a national strategy to promote the use of big data in the public and private sectors, including health, education, and food security. 60 Meanwhile, in the Philippines, the E-Government
coordination, and improving the capacity of government agencies in utilizing ICTs in their operations. 61 The government also established a cross-ministerial task force composed of seven agencies to develop the country's AI road map, which is expected to be implemented in 2021. 62 The road map aims to transform the country into an "AI center for excellence," leveraging its local talent pool and entrepreneurship ecosystem. 63 In Cambodia, the government has developed a long-term digital transformation strategy. 64 Under the digital government pillar, the framework outlines the importance of data-driven governance, including leveraging big data applications.
https://www.thejakartapost.com/news/2020/10/15/indonesia-works-on-use-of-big-data-for-better-decision-making-process.html.
• Availability and quality of data. There is a need to broaden access to data and improve the quality of data to capture the full potential of big data applications. This includes adopting open data policies, improving data collection processes, creating an integrated data platform to facilitate data sharing between government agencies, as well as improving collaboration mechanisms for private sector engagement. Based on the Global Open Data Index, which assesses the availability and openness of government data across multiple aspects such as ease of access, cost of access, frequency of updates, and technical usability (e.g., machine-readable format), the five focus countries have varying levels of data availability. Thailand scored 34 out of 100, the Philippines 30, and Indonesia 25. 65 Cambodia scored lower at 17 and Myanmar at 1. The five focus countries showed a gap of between 62% and 99% to the frontier, that is, the best-performing country assessed. In some countries, while the government has been collecting multiple datasets, there is no system to standardize, consolidate, and share them across different agencies.
To date, several initiatives to improve data availability and quality have been observed across these countries. For example, in Thailand, the government announced a plan to collect and standardize data from 20 ministries into a centralized system to facilitate the use of data in policymaking and improve transparency. 66 Meanwhile, Cambodia Data Exchange (CamDX) is an important initiative by the Government of Cambodia to improve data collection and data sharing. CamDX curates data from different information systems into a unified and decentralized data exchange platform to provide a secure and standardized way of accessing data (Box 9). 67 Furthermore, mechanisms to engage the private sector are crucial to improving access to innovative datasets. This approach has been implemented in Indonesia and the Philippines.
In Indonesia, Bappenas has partnered with the United Nations to launch Pulse Lab Jakarta, a joint facility to pilot innovative data solutions through partnering with the private sector and academia. Using datasets drawn from mobile communications, remote sensing, and social media, Pulse Lab Jakarta has generated insights for policymaking on topics ranging from fuel subsidies to natural disasters. 68 ADB and Gojek have signed a collaboration agreement to conduct joint research on the impact of digitization and COVID-19 on the operations and development of Indonesia's micro, small, and medium enterprises, leveraging Gojek's big data. 69 The Government of the Philippines also partnered with Grab to develop OpenTraffic, a smart data platform that provides free-of-cost GPS information for better analysis of travel speeds and journey times in Metro Manila and Cebu City. The real-time GPS data from Grab drivers could be used to address traffic congestion and identify road incident blackspots to improve emergency response timing. 70
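The "gap to the frontier" reported for the Global Open Data Index scores can be reproduced with a short calculation. The sketch below assumes a frontier score of 90 out of 100 for the best-performing country; this value is not stated in the report and is an assumption chosen because it is consistent with the reported range of 62% to 99%. The country scores are those cited in the text.

```python
# Global Open Data Index scores (out of 100) as cited in the text.
scores = {
    "Thailand": 34,
    "Philippines": 30,
    "Indonesia": 25,
    "Cambodia": 17,
    "Myanmar": 1,
}

# Assumption: score of the best-performing country (the "frontier").
# Not stated in the report; 90 is consistent with the reported 62%-99% range.
FRONTIER_SCORE = 90


def gap_to_frontier(score, frontier=FRONTIER_SCORE):
    """Percentage shortfall of a country's score relative to the frontier."""
    return (frontier - score) / frontier * 100


gaps = {country: round(gap_to_frontier(score)) for country, score in scores.items()}

for country, gap in gaps.items():
    print(f"{country}: {gap}% gap to frontier")
```

Under this assumption, the smallest gap (Thailand) rounds to 62% and the largest (Myanmar) to 99%, matching the range given in the text.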

• Risk mechanisms. Mechanisms need to be put in place to allow data sharing while minimizing the risk of unintended consequences such as data privacy infringements, security violations, and unethical usage of data. In the health and social welfare and protection sectors, for example, it is critical to impose strict data privacy and data security requirements to prevent breaches of sensitive information such as health records and social security data. On the other hand, governments also need to balance data protection with the need to facilitate data sharing during crises such as natural disasters or pandemics to support crisis management. To address these concerns, a comprehensive set of data protection laws needs to be developed and effectively enforced to address the risks associated with using big data while allowing for efficient sharing of data during crises that require timely and actionable insights. Across the five focus countries, while data protection laws have been developed in several countries, there is a need to strengthen the enforcement of these laws to ensure personal data are protected. For example, while the Philippines' Data Privacy Act was passed in 2012, there have been concerns around the enforcement of the law and how data breaches and data privacy violations were resolved. 71 In Thailand, the Personal Data Protection Act was developed in 2019; however, the enforcement of most chapters of the law has
71 J. Santos. 2020. Philippine Privacy Regime Fails to Live Up to Expectations. MLex. 14 October. https://mlexmarketinsight.com/insights-center/editors-picks/area-of-expertise/data-privacy-and-security/philippine-privacy-regime-fails-to-live-up-to-expectations.

Box 9: CamDX: An Integrated and Secure Data Exchange Platform to Enable Big Data Usage
Designed based on the X-Road model of Estonia, CamDX connects fragmented information systems into an ecosystem to promote data exchanges in a standardized and secure way. The main goal of CamDX is to build an infrastructure that allows for easy access to data in government databases without compromising the security and ownership of the data.
As a data exchange platform, CamDX does not store or know the content of the data, but instead allows each member to connect with information systems of other members and exchange data directly through a secure server. The platform also establishes a foundation for collaboration between the public and private sectors. Government databases such as population and business registries as well as tax, real estate, and/or vehicle registration can be used in combination with data from telecommunication companies, banks, and insurance companies for purposes such as electronic Know Your Customer.

CamDX has so far supported two important applications in Cambodia: Online Business Registration (OBR) and Validation Application on Payment Guarantee.
• OBR is a new business registration platform that allows business owners to register and obtain the licenses to operate their business using a single portal. Through CamDX, the portal can efficiently distribute the data registered by business owners to the respective information systems of the Ministry of Commerce, the Ministry of Interior, the General Department of Taxation, and the Ministry of Labour and Vocational Training, thereby reducing the cost and time of business registration. The first phase of the OBR has been completed, with an estimated approval time of 8 days and a 40% reduction in the registration fee.
• Validation Application on Payment Guarantee allows companies registered in Cambodia to provide a guarantee for their staff to enter Cambodia through the "fast lane" during the COVID-19 pandemic, a special lane at airports that exempts travelers from a deposit and an insurance policy related to COVID-19 upon arrival in Cambodia. a The application forms submitted by companies are verified through CamDX against data such as tax identification number, company name, and shareholder information from various government agencies, including the General Department of Taxation and the Council for the Development of Cambodia.
been postponed due to the COVID-19 pandemic. 72 Meanwhile, data protection laws in Indonesia and Cambodia are currently absent or still under development. For instance, the Personal Data Protection Bill in Indonesia, which aims to provide a comprehensive set of provisions for the protection of personal data both via electronic and nonelectronic systems, is being finalized by the Indonesian House of Representatives. 73 In Cambodia, while data protection is covered under different laws (e.g., Civil Code of Cambodia), there is a lack of comprehensive data protection legislation. 74
• Human capital for big data. Building the capacity of public sector employees is key to realizing the potential of big data. Governments need to implement initiatives in education and training to increase the pipeline of graduates with the right skills who can join the civil service. In particular, it is crucial to expand the supply of talent with advanced digital skills such as data science and machine learning to take advantage of big data applications.
• Data-driven culture. It is important for countries to instill a culture of making policy decisions on the basis of rigorous evidence. A paradigm shift at the highest level of government may be required to promote a data-driven culture and increase awareness of applications and technologies used in managing and analyzing data. There are currently considerable gaps in terms of promoting emerging technologies and applications (e.g., big data analytics, AI, machine learning, and cloud computing) across the focus countries (Table 4). To date, some countries have taken steps to foster a data-driven culture within government through developing a strategic road map for data-driven governance or even establishing an agency dedicated to promoting the use of data in policymaking.
For example, the Philippines is promoting data-driven governance with its ICT Statistics Roadmap 2020-2022, a 3-year plan to create a robust evidence base to guide the country's ICT development policies and initiatives. 85 In Thailand, the DEPA has been tasked to raise awareness of big data applications and improve data management practices across government agencies.
• ICT infrastructure to support big data. The application of big data requires a strong ICT infrastructure that is capable of collecting, storing, transferring, and processing large amounts of data at much faster rates than traditional data systems. It is, therefore, necessary to invest in ICT infrastructure, in particular by improving cloud computing capabilities in government to provide a cost-effective and scalable way to store big data and enable efficient cloud-based big data analytics. The current level of investment in ICT infrastructure varies across countries; however, sizable gaps need to be addressed, particularly in Cambodia and Myanmar (Table 4) (footnote 80). The countries are taking action to improve the quality of ICT infrastructure to support digital transformation, including big data applications. For instance, the Ministry of Public Health in Thailand is leveraging cloud infrastructure to store health data from various sources, which can be efficiently accessed to support big data applications. 86 With its Cloud First policy, the Philippines is also encouraging all government agencies to move to cloud computing as the preferred ICT deployment strategy for internal administrative use and external delivery of government services. 87 Meanwhile, under the digital transformation strategy of Cambodia, the government has also placed an emphasis on developing ICT infrastructure to build a foundation for digital transformation.
Targeted policy reforms could strengthen country performance on the key policy enablers.
Based on the assessment of key policy enablers in the previous section and a review of government initiatives in each of the five focus countries, a number of policy reforms have been identified. The following is an explanation of the policy levers that have been assessed as most relevant to achieving each opportunity and potential policy recommendations:
1. Policy reforms for improving strategic governance. Improving strategic governance involves two aspects: establishing clear senior leadership to oversee the implementation of big data applications in public service delivery, and involving a multisector group of stakeholders in the process.
• Designate a digital transformation champion in government. Countries should identify a clear lead official or lead agency that drives the digital transformation process and the adoption of big data in government. The analogy would be a Chief Data Officer or Chief Information Officer in private sector organizations. The appointment of a senior official to this position ensures that various government services are linked to each other and that information is shared between government ministries. The champion also plays a role in standardizing operating procedures across government departments. A key risk to avoid is appointing champions who may have public visibility but lack decision-making powers, making the role essentially superficial. One way of ensuring this does not occur is to provide support from the highest levels of government (e.g., Prime Minister or President's office), as has been the case in the digital transformation of Singapore, as well as establishing clear reporting lines and granting the appointed digital champions the necessary powers. Countries are already making progress on this. In Thailand, the DGA and the DEPA have stepped into this role and Cambodia, with the support of ADB, has convened a transitional "Big Data Sub-Committee" which is chaired by the Secretary of State of the Ministry of Economy and Finance. Similarly, Indonesia's Ministry of Research and Technology, which is leading the national strategy to promote big data usage, and the Philippines' DICT, which launched the E-Government Masterplan 2022, are taking the lead in promoting digital transformation in these countries.
Note: [...] Table 4 with a review of government initiatives across the five focus countries. Where governments were found to have well-established programs related to the recommendation, relevance was set to "Low"; where there were early plans, but no implementation had begun or the plans had been implemented in small scale, the assessment was set to "Medium"; relevance was set to "High" if the assessment of policy enablers showed a significant improvement opportunity and the country had no active plans of similar policies. Sources: Review of existing government initiatives; AlphaBeta analysis.
• Establish a national multistakeholder task force. It is important to set up a multistakeholder task force involving different government agencies as well as the private sector and academia to foster collaboration and explore big data applications that can be used to improve public service delivery. Not only will such a task force promote innovative approaches by drawing on a broad set of inputs, but it will also ensure that national big data ambitions tie into sector-specific road maps and address coordination challenges across sectors. Once a multistakeholder task force is established to lead the big data transformation process on the policy front, the next step is to appoint a department or agency to be in charge of the implementation of big data strategies and road maps on the ground. In addition, it is also critical to have strong engagement with the private sector such as industry associations, technology companies, and start-ups, as well as to ensure that there is a strong fact base on the specific challenges and local context that will shape the implementation in each sector. 88 While some of the focus countries have taken steps in convening a platform involving different government agencies as well as the private sector (e.g., Thailand's Digital Economy Promotion Agency is working with other government agencies to promote big data, and the Big Data Sub-Committee in Cambodia includes both public and private sector representatives), there is currently no multistakeholder task force with clearly defined roles and strategies to engage various partners in driving big data adoption in the country. These initial efforts should be expanded in each country to establish a multistakeholder task force which brings together government agencies, the private sector, and academia to explore and test big data applications in priority sectors.
2. Policy reforms for improving availability and quality of data. Countries are required to improve the availability and quality of data to enable meaningful big data applications. This could involve creating an integrated platform for open big data to facilitate cross-ministerial and cross-sector data sharing, as well as improving mechanisms to partner with and crowd-source data from the private sector.
To ensure that open data efforts result in impactful and practical applications, it is important that governments engage the private sector and citizens to identify the most relevant data that need to be collected and analyzed. This avoids investing in data that are not useful for stakeholders, which has been noted as a concern in the past across Southeast Asia.
• Create integrated data platforms (i.e., one-stop shops) for open big data. Having a single portal to access information can play a crucial role in disseminating data. Singapore, for example, operates an Open Data Resources portal that provides access to an array of government data from over 70 public agencies, direct developer support, and special subportals for data from tax authorities, land transport, monetary authority, and geo-spatial data, to name a few. 90 Colombia also operates an open data resources portal ("Datos Abiertos Colombia") that provides access to an array of government data from over 1,200 public agencies, developer support, and special subportals for niche data from government entities. 91 Furthermore, the interoperability of data should be improved to create synergies from existing databases collected by various government agencies. For example, in the Philippines, data from the Philippine Identification System, the government's central identification platform, can be combined with data from social protection programs and health data to establish a unified beneficiary database. 92 In addition, countries could consider creating a distributed model where data are stored in different information systems and can be shared via a data exchange layer. Box 10 shows an example of such a model which has been implemented in a number of countries.
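The idea of a unified beneficiary database can be illustrated with a minimal sketch. It assumes, purely for illustration, that each program's records carry a shared identifier field (called `national_id` here) that matches a key in the identification registry. The function name, field names, and sample values are hypothetical and do not describe the actual Philippine Identification System or any program database.

```python
def build_unified_beneficiaries(id_registry, *program_records):
    """Join program records onto an ID registry keyed by a shared identifier.

    id_registry: dict mapping identifier -> person name (hypothetical layout).
    program_records: (program_name, list_of_record_dicts) pairs; every record
    is assumed to contain a "national_id" field.
    Returns (unified, unmatched), where unmatched lists records whose
    identifier is not in the registry and therefore needs manual review.
    """
    unified = {
        pid: {"name": name, "programs": {}} for pid, name in id_registry.items()
    }
    unmatched = []
    for program_name, records in program_records:
        for record in records:
            pid = record["national_id"]
            if pid in unified:
                # Keep the program-specific fields, drop the join key itself.
                details = {k: v for k, v in record.items() if k != "national_id"}
                unified[pid]["programs"][program_name] = details
            else:
                unmatched.append((program_name, pid))
    return unified, unmatched


# Illustrative data only.
registry = {"ID-001": "Person A", "ID-002": "Person B"}
cash_transfers = [{"national_id": "ID-001", "monthly_grant": 1500}]
health_cover = [
    {"national_id": "ID-001", "insured": True},
    {"national_id": "ID-999", "insured": False},  # not in the registry
]
unified, unmatched = build_unified_beneficiaries(
    registry, ("cash_transfer", cash_transfers), ("health_insurance", health_cover)
)
```

In this sketch, people with no program records (such as `ID-002`) remain visible in the unified view, which is the property that helps identify coverage gaps, while unmatched identifiers are surfaced rather than silently dropped.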

Box 10: X-Road, a Data Exchange Layer that Enables Unified and Secure Data Sharing
X-Road is a centrally managed distributed Data Exchange Layer between information systems, which allows organizations to exchange information over the internet while ensuring confidentiality, integrity, and interoperability between data exchange parties. Originally developed and launched by the Estonian State Information Systems Department (at the Ministry of Economy and Communications) in 2001, the model has so far been expanded and implemented in a number of countries in Europe, Africa, Asia, and Latin America. a X-Road was developed to address the issue of information silos where data are generated and administered separately by different government agencies. The platform allows authorized agencies to exchange important data in an efficient and secure manner while maintaining the integrity and confidentiality of data while it is in transit. b X-Road is resilient to cyberattacks and service interruptions as data are stored locally by data exchange parties and no third parties have access to the data. c X-Road's distributed architecture also makes it highly scalable and is, therefore, a good fit for all sizes of implementation. A number of trainings and e-resources are made available online for the public to learn about the X-Road data exchange layer. X-Road also provides X-Road Playground, which allows any individual or organization to test a preconfigured X-Road environment free of charge.
The X-Road model was adopted by Cambodia to develop CamDX, a unified and decentralized data exchange layer that has supported several government digital services. CamDX was built and set up locally in the data center of the Ministry of Economy and Finance. The implementation of CamDX has also involved other ministries such as the Ministry of Commerce, General Department of Taxation, and Ministry of Labour and Vocational Training which provide data on businesses, the Ministry of Interior which allows CamDX to verify Khmer National Identification, as well as the Council for the Development of Cambodia which allows CamDX to combine investment data with company registration data. d
a X-Road. https://x-road.global/xroad-history.
b E-Estonia. https://e-estonia.com/solutions/interoperability-services/x-road/.
c Nordic Institute for Interoperability Solutions (NIIS). https://www.niis.org/blog.
d CamDX. https://camdx.gov.kh/#what_is_camdx.
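The distributed pattern described in Boxes 9 and 10 can be sketched in a few lines. This is a simplified illustration, not the X-Road or CamDX implementation: real deployments add security servers, certificates, and message signing, none of which is modeled here. All class names, member names, and the `tin_lookup` service are hypothetical. The point shown is that the central registry only knows who the members are and who may call which service, while the data stay with the owning member and travel directly between the two parties.

```python
import hashlib
import json


class Registry:
    """Central directory of members and access grants. Holds no member data."""

    def __init__(self):
        self._members = {}
        self._grants = set()  # (client, provider, service) triples

    def register(self, member):
        self._members[member.name] = member

    def grant(self, client, provider, service):
        self._grants.add((client, provider, service))

    def lookup(self, client, provider, service):
        if (client, provider, service) not in self._grants:
            raise PermissionError(f"{client} may not call {provider}/{service}")
        return self._members[provider]


class Member:
    """An agency whose records stay in its own information system."""

    def __init__(self, name, records=None):
        self.name = name
        self._records = records or {}
        self.audit_log = []  # digests of exchanged messages, for later audit

    def _log(self, message):
        encoded = json.dumps(message, sort_keys=True).encode()
        self.audit_log.append(hashlib.sha256(encoded).hexdigest())

    def serve(self, service, key):
        value = self._records.get(service, {}).get(key)
        response = {"service": service, "key": key, "value": value}
        self._log(response)
        return response

    def request(self, registry, provider, service, key):
        # The registry only authorizes and resolves the provider;
        # the response is returned directly by the provider.
        target = registry.lookup(self.name, provider, service)
        response = target.serve(service, key)
        self._log(response)
        return response


registry = Registry()
tax_office = Member("tax_office", {"tin_lookup": {"K001": "Example Co."}})
business_portal = Member("business_portal")
registry.register(tax_office)
registry.register(business_portal)
registry.grant("business_portal", "tax_office", "tin_lookup")
reply = business_portal.request(registry, "tax_office", "tin_lookup", "K001")
```

Both parties end up with the same message digest in their audit logs, which gives a simple way to check afterward that the message received is the one that was sent.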
• Establish forums to interact and crowd-source data from the private sector. Governments should engage and empower private sector organizations to participate in sharing data. While it can be challenging to establish data sharing agreements between the government and the private sector, pilot programs could be created to test potential public-private partnerships before engaging in long-term agreements. One potential approach is to establish a "data insights unit" that can work across different government agencies and with the private sector as well as academic partners to mobilize data and test potential applications of big data. A similar model has been implemented in Indonesia and can be expanded across the focus countries to foster collaboration and mobilize resources (Box 11).
3. Policy reforms for improving risk mechanisms. Mechanisms are needed to minimize the risk of unintended consequences such as data privacy infringements, IP concerns, and unethical usage of data. However, decisions regarding data privacy and protection vary by country context, perceived values, and potential for misuse. It is important to ensure that policy addresses public concerns while not relying on overly restrictive regulations.
• Establish data protection frameworks. Comprehensive data protection frameworks need to be developed and effectively enforced to address the risks associated with using big data. For example, the United States introduced the Health Information Technology for Economic and Clinical Health Act to encourage health care providers to adopt health information technology while improving the protection of electronic health records through increased penalties for violation of privacy and security rules. 93 In Canada, the Privacy Act ensures that the government collects, uses, and discloses personal information according to strict rules that preserve individuals' right to privacy. 94 It is also important to balance data protection with the
93 Government of the United States, Department of Health and Human Services. HITECH Act Enforcement Interim Final Rule. https://www.hhs.gov/hipaa/for-professionals/special-topics/hitech-act-enforcement-interim-final-rule/index.html.
94 Office of the Privacy Commissioner of Canada. The Privacy Act. https://www.priv.gc.ca/en/privacy-topics/privacy-laws-in-canada/the-privacy-act/pa_brief/.

Box 11: Crowd-Sourcing Data through a "Data Insights Unit"
A "data insights and innovations unit" could be established to connect different government agencies with the private sector as well as academic partners to mobilize resources and cocreate practical big data solutions. This will help demonstrate potential applications of big data and allow the government to incorporate them in their existing operations. A similar model has been implemented in Indonesia, and resulted in a number of collaborations and pilot programs to test the application of big data in public service delivery.
Founded in 2012, Pulse Lab Jakarta is a joint data innovation facility that brings together the Government of Indonesia, its development partners (e.g., the United Nations and World Bank), the private sector (e.g., Grab, Visa, and Twitter), and academia to explore and promote the adoption of data-driven applications. Leveraging various data sources such as mobile communications, remote sensing, and social media, Pulse Lab Jakarta has generated insights for policy and practice on a range of topics related to development and humanitarian actions (e.g., disaster response and climate change, food security and agriculture, and financial inclusion). This model has successfully mobilized data insights from the private sector to support policymaking. For example, Pulse Lab Jakarta partnered with Grab to analyze drivers' anonymized global positioning system traces in Greater Jakarta to create a set of interactive visualizations of traffic flows across different subdistricts. Leveraging the partnership with Grab, further research was conducted to show the feasibility of using ride-hailing data to inform transportation policy and planning, as well as to develop proxy measures of air quality.
need to ensure efficient sharing of critical data during times of crises (e.g., sharing personal location data during pandemics to support disease tracking). In the Republic of Korea, for example, this was achieved through a set of supportive legislation that allows the sharing of personal data during a public health crisis.
In particular, the Infectious Disease Control and Prevention Act allows authorities to access personal data that could help prevent the spread of infectious diseases, including credit card transactions and travel, medical, and location records from public and private organizations. 95 Countries also need to ensure strong data governance in public-private collaboration. In the Pulse Lab Jakarta model, for example, data protection is incorporated into innovation projects through Risks, Harms, and Benefits Assessments which are designed to identify anticipated or actual ethical and human rights issues that may occur at any stage of a data innovation process. The tool also helps develop a risk mitigation strategy and ensure that the risks do not outweigh the benefits of a given project. 96 When it comes to the implementation of data protection measures, there are opportunities to leverage new technologies such as blockchain to manage access to and use of public sector data while maintaining the security of this information. Estonia provides a relevant case in point with its Keyless Signature Infrastructure. Keyless Signature Infrastructure leverages blockchain technology to safeguard public sector data including electronic health records of all Estonian citizens by allowing government officials to track and monitor changes within various databases (e.g., who changed a record, what changes were implemented, and when they were made). 97 IT departments in government agencies may also be able to create rules and algorithms that allow data in a blockchain to be automatically shared with third parties once predefined conditions are met.
• Cooperate with the international community on common standards and approaches. Standards are crucial to ensure minimum safeguards for safety and security around data.
In the United States, for example, memoranda are used to update security protocols for data released by federal agencies, providing adequate controls to ensure that information is "resistant to tampering, to preserve accuracy, to maintain confidentiality as necessary, and to ensure that the information or service is available as intended by the Agency and as expected by users." 98 Standards are also particularly crucial for easing international cooperation, given that data may be utilized across borders. Standards impact everything from security issues to the provision of open data. For example, adopting international security and privacy standards not only assists governments in the design and development of their own frameworks, but also provides comfort and reassurance to organizations. Furthermore, having common standards lowers the barriers for domestic firms to expand their operations abroad, as their security standards are likely to already comply with those of international markets, and conversely reduces the barriers to entry for foreign firms for the same reasons. Cooperating on standard setting can also facilitate the provision of open data. For example, the ASEAN Secretariat is currently developing an open data dictionary (with common standards across the ASEAN Member States) to share available government data with the public (footnote 89). Even if international standards are adopted, however, it is important that countries engage in thorough reviews with extensive local and international stakeholder consultations (with industry and standards governing bodies in particular) to solicit feedback and ensure that standards are fit for the local context.
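The tamper-evidence property attributed above to Estonia's Keyless Signature Infrastructure (tracking who changed a record, what was changed, and when) can be illustrated with a minimal hash-chained audit log. This is a teaching sketch under simplifying assumptions, not the Keyless Signature Infrastructure design, which is considerably more elaborate. The function names and field names are hypothetical. Each log entry stores the hash of the previous entry, so altering any past entry breaks verification of the chain.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(entry):
    """Hash the auditable fields of an entry together with the previous hash."""
    body = {k: entry[k] for k in ("who", "what", "when", "prev_hash")}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_change(log, who, what, when):
    """Record a change and link it to the previous entry in the log."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"who": who, "what": what, "when": when, "prev_hash": prev_hash}
    entry["hash"] = _entry_hash(entry)
    log.append(entry)
    return entry


def verify(log):
    """Return True only if no entry was altered, removed, or reordered."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash or entry["hash"] != _entry_hash(entry):
            return False
        prev_hash = entry["hash"]
    return True


# Illustrative entries only.
audit_log = []
append_change(audit_log, "clinic_user_17", "updated allergy field", "2021-03-01T09:30:00Z")
append_change(audit_log, "lab_user_04", "added test result", "2021-03-02T14:05:00Z")
```

Calling `verify(audit_log)` succeeds on the untouched log and fails if any earlier entry is edited afterward, which is the behavior that lets officials detect unauthorized changes.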
4. Policy reforms for improving human capital for big data. [...] provide training for existing civil servants to equip them with the necessary skills required to make data-driven decisions. Governments should also leverage big data expertise of the private sector.
• Provide targeted training and incentives for civil servants to acquire relevant skills. There is a significant lack of data science skills in the five focus countries in general, not to mention the public sector. Countries need to develop strategies to increase the availability of such skills, particularly among public sector employees to enable them to make data-driven decisions. This could involve providing training for civil servants on data management and analysis skills. For example, the United Kingdom launched a Data Science Accelerator program in 2015 which offers training in a range of data analysis techniques for people working in government, with a particular focus on the National Health Service. 99 Across the focus countries, there have been initiatives to train government officials on relevant skills in Thailand and the Philippines which could be expanded and replicated in other countries (the DEPA and Government Big Data Institute in Thailand provided training on big data skills while the DICT in the Philippines conducted the National Training of Trainers on data-driven governance for participants from both central and local governments). 100 To ensure the effectiveness of training programs, countries will need to identify credible courses that offer well-structured training on relevant skills from introductory to advanced levels (e.g., online resources could be leveraged to lower costs and allow for better access and flexibility). For instance, in the United States, the Federal Data Science Training Program run by the Office of Management and Budget has launched a program to teach government employees data science skills such as coding, graph analytics, and data visualization and ethics, which will be conducted entirely online. 101 The Philippines has also emphasized the use of e-learning platforms in building the digital capacity of government in its 2021 Budget Priorities Framework. 
102 It is recommended to focus initial efforts on providing targeted training for officials who work in dedicated data-related agencies (e.g., government data offices and data insights units), and subsequently expand the coverage of such training (e.g., through a "train-the-trainer" model). Countries could leverage private sector expertise in conducting training for government officials. For example, the Government Technology Agency of Singapore has been partnering with Tableau since 2017 to train and deepen public officers' capabilities in data analytics and visualization. 103 In addition, to encourage civil servants to engage in upskilling activities, governments could consider incentives such as providing flexible time-off to attend training, establishing mechanisms to recognize training efforts as key milestones for career progression (e.g., offering promotion and salary increases for people who obtain certifications in data skills), and offering scholarships to those who want to deepen their knowledge and skills through further studies. In the Public Service Division in Singapore, for example, employees can attend up to 100 hours of training, of which 60% should be related to their current job responsibilities while the remaining 40% can be for personal development or to prepare themselves for their future career in the service. 104 Scholarships are also made available under the Training Awards and Sponsorships Scheme for postgraduate, degree, and diploma courses. However, a major risk associated with such programs is the loss of talent to the private sector after employees have obtained important data skills. This could be addressed by requiring people who receive training and scholarships to commit to working in the public sector for a certain period after they finish their programs.

5. Policy reforms for improving access to relevant technologies. Working with big data requires adequate technologies and analytical techniques (i.e., software). Many of these exist in the private sector or in research but are often proprietary and not accessible to governments, and many current data analysis tools are neither suitable for nor effective at dealing with large datasets. This could be addressed by establishing mechanisms to crowd-source innovations and technologies from the private sector, academia, and even citizens.
• Establish mechanisms to crowd-source innovations and technologies. While multistakeholder task forces and forums will be important to provide access to innovations and technologies currently not present in government, the five focus countries will need to go beyond this. Possible approaches include providing incentives for public sector innovations, encouraging open-source research, improving transparency around government procurement of technologies, and providing regulatory flexibility for experimentation and pilots. For example, while not targeted at big data specifically, Bangladesh's "Innovation for All (a2i)" fund provides financing for low-cost, user-centric, homegrown innovations to leverage digital innovation to solve policy problems. To date, a total of $4.5 million worth of grants have been awarded to government agencies, development organizations, nongovernmental organizations, academic institutions, private companies, and even individuals to design and implement their solutions across 22 development areas such as agriculture, environment, education, health, and government services. 105 This has resulted in a number of innovations being implemented on the ground (e.g., solar-powered multimedia classrooms have been set up in off-grid locations, an online platform for Environment Clearance Certificate has been implemented to facilitate the application process, and a 3D printer has been used to print artificial limbs for disabled children from low-income families). Furthermore, there is a need to improve the transparency of how governments will reward the private sector or eventually procure big data solutions developed by the private sector (i.e., what is the monetization model for developing AI solutions and machine learning algorithms for the public sector). Before big data solutions are launched at a large scale, they first need to be piloted and evaluated within a defined environment. 
This may require the use of regulatory sandboxes which allow time-bound testing of big data approaches in the real world. Box 12 provides a more detailed explanation of how regulatory sandboxes can be used to promote innovations.

Box 12: Establishing Regulatory Sandboxes to Promote Innovations
A regulatory sandbox provides an environment for companies to test innovative products, services, or business models within a clearly defined space (e.g., for a limited period with a limited number of users). Following successful testing, companies can make their new products, services, or business models available to a wider customer base. These sandboxes are useful policy tools to understand the implications of introducing certain new analytical tools and applications while continuing to promote technological innovation and limiting any negative unforeseen consequences (e.g., breaches of data privacy or potentially harmful biases of artificial intelligence applications). They can help regulators better understand new approaches and work collaboratively with the private sector to develop appropriate rules and regulations for emerging big data solutions. From a private sector perspective, sandboxes reduce the costs of production and time-to-market. When structuring the sandboxes, it is important to have early engagement with the private sector, research and academic institutions, civil society, and consumer protection agencies to evaluate their potential impact. Furthermore, a thorough review process is required to assess the costs and benefits of innovations before bringing lessons from pilot sandboxes into broad-based implementation.
6. Policy reforms for improving data-driven culture. Countries need to develop a culture that promotes and rewards evidence-based policymaking, which refers to establishing policies grounded on objective and scientific research and ensuring they are designed and implemented based on concrete data. 106
• Provide incentive schemes for data-driven decision-making. Shifting government agencies and individual civil servants toward a more data-driven culture requires incentives, i.e., policy makers need to have "skin in the game." Several approaches can be considered, one of which is tying promotions and career progress to the use of data in decision-making. South Africa, for example, introduced performance rewards linked to the use of data in decision-making in government. 107 Competition among different government entities could be another method. Different agencies can be empowered to develop their own big data solutions to address common challenges (e.g., poverty reduction) based on real-world context. 108
7. Policy reforms for improving information and communications technology (ICT) infrastructure to support big data. Working with big data requires adequate technological infrastructure. Governments that intend to use big data will thus need improved ICT infrastructure to store, organize, and process complex data sets in an efficient manner.
• Go 100% cloud first for government. Cloud has emerged as an ideal computing environment for big data as it provides vast quantities of computing power at low cost and on an as-needed basis, without major hardware investments. Furthermore, an increasing number of software tools held in a hybrid cloud are also capable of performing the processing and data integration tasks. 109 Cloud computing can also lead to significant efficiency gains and cost savings for governments' ICT budgets beyond its importance for big data. For example, Saudi Arabia's Ministry of Communications and Information Technology has put forward a "Cloud First Policy" which encourages government entities to consider cloud solutions first for every new IT investment. This is expected to provide savings of around 30% of the total cost of ownership. 110 Across the focus countries, the Philippines has launched a Cloud First policy to encourage all government agencies to move to cloud computing for internal administrative use and external delivery of government services. Similar initiatives could be introduced in other countries to develop the necessary infrastructure for big data.
Five pilot programs for using big data in health care, social welfare and protection, and education can be prioritized for the focus countries. Table 7 shows a summary of the assessment results for the five opportunities that have been prioritized for the focus countries. See Table A4 of Appendix 4 for detailed methodology.
Below is an explanation of each prioritized opportunity for potential pilot programs in the five focus countries.
• Using social media and search data to analyze COVID-19 activity. Leveraging real-time data from social media and search engines to analyze COVID-19 activity is highly relevant for the five focus countries, particularly Indonesia, Myanmar, and the Philippines, where the number of active cases in the population is significantly higher than the Southeast Asia average. 111 In particular, as of mid-December 2020, Indonesia had over 94,000 active cases, the highest in the region, while the Philippines had more than 24,000 and Myanmar had more than 18,000 active cases. 112 The application of social media data to track infectious disease activity has proven feasible in a number of countries. For example, the study conducted in Greece described in the previous section showed the potential of using both Twitter data and Google data to track influenza activity and predict the emergence of possible outbreaks. 113 In addition, there have been several studies in the People's Republic of China exploring the use of real-time data such as Baidu searches to analyze COVID-19 activity, predict potential infections, and locate high-risk areas. A study that analyzed Baidu searches related to COVID-19 (e.g., using key words such as dry cough, fever, chest distress, coronavirus, and pneumonia) from December 2019 to February 2020 found that search data could predict new suspected COVID-19 cases 6 to 9 days earlier. 114 To enable such applications, countries need to obtain a large amount of social media data, which can be accessed through the application programming interfaces (APIs) of social media companies. In the study of influenza activity in Greece, for instance, Twitter data were obtained using the Twitter Streaming API, while search data were collected from Google Trends, a tool that provides access to an anonymized sample of Google search data.
There is a potential opportunity to conduct a pilot study to explore the use of social media data in analyzing COVID-19 activity in the most affected countries such as Indonesia and the Philippines. The Philippines, for example, has also highlighted the use of big data analytics to understand the spread of diseases (e.g.,  to streamline medical care and allow for real-time collection and analysis of health data in the updated Philippine Development Plan 2017-2022. 115
• Using social media data to provide insights into public perceptions of vaccines.  Vaccines Global Access and has procured vaccines through private purchases to support its goal of completing vaccine rollout by June 2022 or before. 118 Understanding public perceptions of vaccines is particularly crucial to enable effective vaccine rollout. This could be achieved by analyzing real-time social media data such as Twitter conversations, which can be obtained through the company's API. Such analysis proved to yield valuable insights as shown in a study conducted by the Government of Indonesia and its development partners (e.g., information about public concerns around vaccination such as religious issues and side effects of vaccines, identification of influencers that could be leveraged for rapid response to public concerns, and misinformation related to vaccines) (see the previous section). This analysis could be applied across the five focus countries to provide insights into public perceptions around COVID-19 vaccines.
• Using data from social media and search engines to detect the risk of developing noncommunicable diseases. While addressing noncommunicable diseases is a longer-term goal for the five focus countries, it is an important public health issue that requires timely interventions. In particular, mortality rates from noncommunicable diseases across Cambodia, Indonesia, Myanmar, and the Philippines are considerably higher than the Southeast Asia average. 119 Big data can contribute to the prevention and detection of noncommunicable diseases by allowing public health authorities to monitor relevant risk factors and provide targeted interventions. For example, social media can provide timely community-level data on health information seeking and changes in behaviors. These data can be combined with other data such as demographics, environment, diet, and physical activity indicators from other digital sources (e.g., mobile applications and wearables) to monitor health behaviors and supplement delayed estimates from traditional surveillance systems. 120 Social media data such as Twitter posts and search data from Google are valuable data sources that could be used for such analysis. A pilot study leveraging Twitter data and Google search data could be developed to explore the potential application of these datasets in monitoring noncommunicable diseases in the focus countries.
• Strengthening the identification of poor households using big data. Building a database of granular and updated poverty statistics is not only relevant to the focus countries in light of the current COVID-19 pandemic, but also critical in the long term to ensure effective targeting of vulnerable populations. This is particularly important where the shares of the population living below national poverty lines are relatively high at 13.5% for Cambodia, 25% for Myanmar, and 16.7% for the Philippines. 121 The study conducted by ADB in the Philippines and Thailand has shown the potential of using innovative datasets such as satellite imagery to complement traditional statistics in identifying poor households. This could be expanded to other countries through a pilot program to test the effectiveness of satellite data in mapping different poverty profiles. The program could rely on publicly accessible satellite imagery and open-source data analytics tools to demonstrate the feasibility of the approach to national statistics offices and policy makers.
• Analyzing data from online job portals and social networks to identify skills gaps. The use of data from job portals and social networks such as LinkedIn to identify skills gaps and develop training programs that equip students and workers with in-demand skills is highly relevant to the focus countries, both during the COVID-19 pandemic and when economies start to recover. To address unemployment issues caused by COVID-19, Thailand has started analyzing data from job portals to build a database of skill needs, which will be used to develop programs to retrain the workforce (see the previous section). A similar approach could be implemented in other countries to identify in-demand skills and guide education and training efforts. This will potentially involve partnerships with online job portals to obtain data on jobs and skills. In addition, there is a potential opportunity to partner with social network sites such as LinkedIn to provide insights into the labor market situation in each country. Such a collaboration was piloted in South Africa and produced important insights that enabled the government to identify emerging skills and develop strategies to produce a pipeline of talent in these areas.
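The lead-time finding cited for search data in the first pilot opportunity above can be illustrated with a simple lagged-correlation check. The sketch below is not the method used in the cited studies; it is a minimal illustration, assuming daily search volumes and daily case counts are already available as two aligned lists. The function names and the synthetic numbers are hypothetical.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def best_lead_days(search, cases, max_lag=14):
    """Return (lag, r): the lag in days at which search[t] correlates
    most strongly with cases[t + lag]."""
    best = (0, -1.0)
    for lag in range(max_lag + 1):
        xs = search[: len(search) - lag]  # search volume on day t
        ys = cases[lag:]                  # cases on day t + lag
        r = pearson(xs, ys)
        if r > best[1]:
            best = (lag, r)
    return best


# Synthetic (hypothetical) data: cases follow search volume with a 7-day delay.
search = [10, 12, 15, 20, 28, 40, 55, 70, 82, 90,
          85, 75, 60, 48, 38, 30, 24, 20, 17, 15,
          13, 12, 11, 10, 10, 10, 10, 10, 10, 10]
cases = [5] * 7 + [0.5 * s for s in search[:-7]]

lag, r = best_lead_days(search, cases)
print(f"search volume leads cases by {lag} days (r = {r:.3f})")
```

In a real pilot, the search series would come from a tool such as Google Trends and the case series from official surveillance data; a high correlation at a positive lag would suggest, but not prove, that search activity carries early-warning information.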

Conclusion
Digital transformation will be critical for governments to undertake economic recovery in the post-pandemic environment. Big data could be a key tool to help countries initiate their digital transformation. This report provides policy recommendations on how big data can be used by public institutions in three public service sectors: health care, social welfare and protection, and education. These recommendations could help the focus countries in their post-COVID economic recovery and contribute to the existing plans of ADB, such as its Strategy 2030.
Digital technologies are an important element in ADB's Strategy 2030. 122 For example, "promoting innovative technology" is one of ADB's guiding principles. Moreover, digital technologies can play an important role in helping ADB address its operational priorities. For example, one of its seven operational priorities is "addressing remaining poverty and reducing inequality" through "improving education and training." As discussed in the report, big data can play a key role in improving education. For instance, by leveraging big data analytics, schools can examine large volumes of student records to identify early warning signs and provide targeted support to those in need.
To distinguish between "low-hanging fruits" that governments can capture in the near term versus policies that require longer timeframes to implement and take effect, the policy recommendations in the report have been further analyzed in terms of their implementation timeframe. This takes into account multiple factors including the policy's potential for short-term impact and political feasibility.
The policy recommendations have been prioritized as follows:
• Short-term. Implementation should begin within the next 12 months as these actions are important for providing the foundation for other actions, and have the potential for near-term impact. The first step is to establish strategic governance mechanisms such as designating a digital transformation champion in government and establishing a national multistakeholder task force to drive big data adoption. For example, lessons from other countries and consultations with policy advisors and big data experts highlighted the risk of pilot programs being ineffective if governance mechanisms are not present to ensure they are prioritized appropriately and there are follow-up actions to scale up the implementation.
• Medium-term. Implementation should begin over the next 1-2 years as these actions rely to some extent on the short-term actions. These include creating integrated data platforms and forums to crowd-source data from the private sector, developing data protection frameworks and collaborating with the international community on common standards and approaches, and providing targeted training and incentives for civil servants to acquire relevant skills. As countries are at different stages of readiness, the specific implementation timelines of these policy actions will also depend on the capacity and priorities of each government.
• Long-term. Implementation can take place over the next 3-5 years as these actions require more resources and long-term commitment from government. These include developing cloud first policies, providing incentive schemes for data-driven decision-making in government (e.g., performance rewards linked to the use of data in decision-making), and establishing mechanisms to crowd-source innovations and technologies (e.g., improving public procurement guidelines and establishing regulatory sandboxes). While such actions are expected to result in significant long-term impact, they require strong political will and a whole-of-government approach to be effective.

Assessment of the Potential Value of Big Data
The potential value of big data was assessed for each sector based on four criteria:
• Volume of data. The larger the amount of data in the sector, the greater the potential to benefit from big data analytics. This depends not only on the volume of potential data, but also on how much is currently digitized.
• Variety of data. The more different forms of data available in the sector (e.g., social media, video content, and structured data), the more potential value there could be in combining them to generate unique insights.
• Veracity of data. The higher the quality or accuracy of the data, the better the potential insights.
• Value of applications. The degree to which there are specific applications in that sector that can deliver value.
Table A1 provides a discussion of the methodology used to assess the potential value of big data in each sector based on these criteria.

APPENDIX 2
Sizing of Opportunities from Using Data-Driven Technologies
The potential opportunities brought about by data-driven technologies were sized for the health and education sectors across 10 Southeast Asian countries. Tables A2.1 and A2.2 summarize the key metrics and sources used to size the potential benefits of data-driven technologies in the health and education sectors by 2030.

Assessment of Policy Enablers for Big Data
The improvement opportunities with regard to the seven policy enablers for big data were assessed for each of the focus countries based on a review of existing government policies and strategies as well as a range of international indices. Table A3 provides a summary of the methodology used for the assessment of each policy enabler.
Legend: distance to frontier of more than 50%; distance to frontier of between 20% and 50%; distance to frontier of less than 20%.
ICT = information and communications technology.
a Assessment for Myanmar was conducted based on its score on quality of science, technology, engineering, and mathematics education dimension in World Economic Forum's Networked Readiness Index 2016 due to data availability.
b Assessment for Myanmar was conducted based on its score on availability of latest technologies in World Economic Forum's Networked Readiness Index 2016 due to data availability.
c Assessment for Myanmar was conducted based on its score on importance of ICT to government vision in World Economic Forum's Networked Readiness Index 2016 due to data availability.
d Assessment for Myanmar was conducted based on its score on ICT use by government to improve efficiency in World Economic Forum's Networked Readiness Index 2016 due to data availability.
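The three distance-to-frontier bands in the legend can be expressed as a small classification rule. The sketch below is illustrative only. The report does not define how distance to frontier is calculated, so the formula here (the gap to the best score as a share of that best score) is an assumption, as are the function names, the example scores, and the treatment of values falling exactly on 20% or 50%.

```python
def distance_to_frontier(score, frontier):
    """Gap to the frontier score as a share of the frontier score.

    Assumed definition; the report does not state the exact formula.
    """
    if frontier <= 0:
        raise ValueError("frontier score must be positive")
    return (frontier - score) / frontier


def opportunity_band(distance):
    """Map a distance-to-frontier share to one of the three legend bands.

    Boundary values (exactly 0.20 or 0.50) are placed in the middle band
    by assumption.
    """
    if distance > 0.50:
        return "more than 50%"
    if distance >= 0.20:
        return "between 20% and 50%"
    return "less than 20%"


# Hypothetical index scores on a 0-100 scale, with the frontier at 100.
for score in (40, 70, 90):
    d = distance_to_frontier(score, 100)
    print(score, f"{d:.0%}", opportunity_band(d))
```

A larger distance to frontier indicates a larger improvement opportunity for that policy enabler.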

Prioritization of Big Data Applications
A list of potential big data applications was assessed against a set of criteria to select the most suitable applications for pilot programs (Table A4).
Health care
• Using data from smartphone-connected thermometers to track influenza activity and detect emergence of possible epidemics. Data required: Data from companies providing smartphone-connected thermometers which can be obtained through collaboration
• Using location and transport data as well as credit card transactions for contact tracing. Data required: Data from telecom service providers, transport providers, financial institutions, and citizens
• Using social media data to provide insights into public perceptions of vaccines. Data required: Social media data which can be accessed publicly or purchased
• Using data from hospitals and population studies to detect the risk of developing noncommunicable diseases. Data required: Multiple databases and health records from hospitals, public health agencies, and population studies
• Using data from social media and search engines to detect the risk of developing noncommunicable diseases. Data required: Social media and search data which can be accessed publicly or purchased
• Using data from remote monitoring systems to improve productivity and reduce patient in-hospital bed days. Data required: Data from sensors to monitor heart conditions, blood-sugar levels, temperatures

Social welfare and protection
• Strengthening the identification of poor individuals and households using big data. Data required: Satellite data and/or data from mobile phone operators (e.g., mobility, use of services and applications, monthly expenditure)
• Improving program design by analyzing historical data of social welfare programs to identify gaps and develop more tailored approaches. Data required: Electronic records of background of beneficiaries, interventions conducted, benefits received, and impact of past programs

Education
• Analyzing data from online job portals and social networks to identify skills gap and develop training programs. Data required: Job portal data which can be accessed publicly
• Analyzing student records to identify signs of dropping out and provide targeted support to those in need. Data required: Electronic records of students' background, financial status, class attendance, and academic results
• Analyzing data on students' learning styles, areas of interest, abilities, and progress to customize teaching methods and curriculums to individuals' needs. Data required: Data on students' interactions with virtual learning environments and academic records

a This includes big data applications that have been implemented as well as exploratory research and/or pilot studies that have been conducted in the country.
Sources: Literature review; AlphaBeta analysis.

Three criteria were used to prioritize the most suitable big data applications:
1. Evidence of impact. This was assessed through literature research and interviews with relevant stakeholders to identify examples of each opportunity (including research studies and pilot programs). The level of evidence is determined based on whether there are examples of such application in relevant countries (i.e., low or middle-income countries).
2. Access to data. This was assessed based on whether the data required in each big data opportunity is already available across the five focus countries (e.g., social media data and satellite data) or it needs to be collected by the government (e.g., digital health records) or obtained through collaboration with the private sector (e.g., traffic data from shared mobility platforms).
3. Relevance. The relevance of each big data opportunity was assessed based on a range of indicators such as the number of active coronavirus disease cases in the population and the mortality rate of noncommunicable diseases. For each indicator, the team evaluated whether there is a large gap between each of the focus countries and the Southeast Asia average. This is supplemented by literature research and a review of existing government initiatives to understand if the application is relevant to the local context and is aligned with government priorities (e.g., rolling out vaccines).
Table A5 provides an explanation of the methodology used in the assessment.
Access to data
• Data required for the big data application needs to be obtained through partnering with the private sector (e.g., telecommunication companies and shared mobility companies) or collected by the government.
• It is unclear whether there are any mechanisms or infrastructure to collect the data required for the big data application in the assessed country (e.g., digital health records and data on students' interactions with e-learning platforms).

Relevance
• There is a significant challenge in the assessed country (as compared to the Southeast Asia average) that can be addressed by the big data application (e.g., a high number of active COVID-19 cases and a high poverty rate). In some cases, there is already a plan by the government to leverage big data in that area.
• The current state of the assessed country is similar to the Southeast Asia average (within 15% of the Southeast Asia average).
• The current state of the assessed country is significantly better than the Southeast Asia average.
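The relevance rule described above, which compares a country indicator with the Southeast Asia average using a 15% band, can be sketched as follows. This is an illustrative reading, not the study's own implementation. The function name, the band labels, the `higher_is_worse` flag, and the example values are assumptions introduced here.

```python
def relevance_band(country_value, regional_average,
                   higher_is_worse=True, tolerance=0.15):
    """Classify an indicator relative to the regional average.

    Values within `tolerance` (15% by default) of the average count as
    similar. Otherwise the country is either facing a significant
    challenge or doing better than average, depending on direction.
    """
    if regional_average == 0:
        raise ValueError("regional average must be non-zero")
    gap = (country_value - regional_average) / regional_average
    if not higher_is_worse:
        gap = -gap  # e.g., for an indicator where a lower value is the worse outcome
    if abs(gap) <= tolerance:
        return "similar to average"
    return "significant challenge" if gap > 0 else "better than average"


# Hypothetical indicator values against a regional average of 100.
print(relevance_band(130, 100))  # significant challenge
print(relevance_band(110, 100))  # similar to average
print(relevance_band(70, 100))   # better than average
```

For indicators such as active cases or poverty rates, a higher value is worse, which is the default here. The qualitative inputs the report mentions, such as existing government plans, would still need to be assessed separately.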

About the Asian Development Bank
ADB is committed to achieving a prosperous, inclusive, resilient, and sustainable Asia and the Pacific, while sustaining its efforts to eradicate extreme poverty. Established in 1966, it is owned by 68 members, 49 from the region. Its main instruments for helping its developing member countries are policy dialogue, loans, equity investments, guarantees, grants, and technical assistance.