Don't ignore the potential of data science

Data science has the potential to support quicker, more effective and more accurate solutions and should be part of UN discussions on digital cooperation, say Krittika D'Silva and Deepakshi Rawat.

Technology impacts our lives in an unprecedented - and unregulated - way. The High Level Panel on Digital Cooperation, established by UN Secretary General António Guterres, convened for the first time in New York in September to discuss the impact of digital technologies on the world.

The star-studded panel, co-chaired by Melinda Gates and Jack Ma, brings together experts from government, the private sector, civil society and academia to identify gaps within policy, research and information spheres and to make “proposals to strengthen international cooperation in the digital space”.

While documents from the panel highlight blockchain and the internet of things as ways to bridge these gaps, none mentions data science. However, data science should be a priority: it has greater potential to provide faster and more effective solutions and to help build individual, as well as governmental, abilities to adapt to the digital age.

Although there is no consensus on the exact definition of data science, it can broadly be defined as “the science of extracting knowledge from data”. This alone isn’t particularly exciting, but if we compare the data we have now with what existed 20 years ago, the differences are quite remarkable.

We generate data in countless aspects of our lives, from the items we purchase to the places we visit and the friends we meet. The potential to extract knowledge while preserving user anonymity is enormous. Below are two examples of data generated by the technology we use in our daily lives and how it has the potential, on a large scale, to inform policy and transform our understanding of the modern world.

Twenty years ago, cell phone penetration worldwide was 5%. Now it is around 66%, and each phone call and text creates a unique record. Since each call and text connects to a cell tower, we also have location data associated with each record. Accumulated over time, these records enable fine-grained representations of the strength of the connections between individuals and of how they move between different areas.

Let us imagine, for example, a natural disaster like a tsunami hitting an area. Currently, calculating the impact of the disaster can take many days. Computing the number of individual fatalities, individuals displaced and houses damaged is not an easy, cheap or quick task.

However, with phone records, support and relief after a disaster can be provided significantly faster. For example, after the 2015 earthquake in Nepal, researchers used call data to build fine-grained models of how populations were displaced just a few days after the disaster. This data could also be used to quantify the impact of the disaster as well as the recovery time of different areas.
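To make the idea concrete, here is a minimal Python sketch of how displacement might be estimated from call records. It is an illustration only, not the method the Nepal researchers used: the record layout (an anonymised user ID, a day relative to the disaster and a tower region), the sample values and the function names are all assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical, already-anonymised call records: (user_id, day, tower_region).
# Negative days fall before the disaster, non-negative days after it.
records = [
    ("u1", -3, "A"), ("u1", -2, "A"), ("u1", 1, "B"), ("u1", 2, "B"),
    ("u2", -3, "A"), ("u2", -1, "A"), ("u2", 1, "A"),
    ("u3", -2, "B"), ("u3", 1, "B"), ("u3", 2, "C"), ("u3", 3, "C"),
]


def usual_region(records, keep):
    """Most frequently used tower region per user, over days where keep(day) is true."""
    counts = defaultdict(Counter)
    for user, day, region in records:
        if keep(day):
            counts[user][region] += 1
    return {user: c.most_common(1)[0][0] for user, c in counts.items()}


def displacement_flows(records):
    """Count users whose usual region changed after the disaster, per (from, to) pair."""
    before = usual_region(records, lambda day: day < 0)
    after = usual_region(records, lambda day: day >= 0)
    flows = Counter()
    for user, home in before.items():
        new_region = after.get(user)
        if new_region is not None and new_region != home:
            flows[(home, new_region)] += 1
    return flows


flows = displacement_flows(records)
for (origin, destination), users in sorted(flows.items()):
    print(f"{origin} -> {destination}: {users} user(s)")
```

Only the aggregated flows between regions are reported, never the movements of an individual user, which is the kind of output that could inform where relief is sent.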

Using modern technology in a methodical, scalable and efficient way has the potential to save lives. Recognising the potential of data science to significantly improve past practices in this space is important. Mathematically modelling complex human activities with call data can help support governments and policymakers in efficient decision-making, not only when called upon to provide urgent humanitarian aid but also with long-term efforts such as resource allocation and infrastructure development.

In much the same way as our communications, financial transactions have also become increasingly digital. Credit cards, which are now ubiquitous, only gained traction in the 1970s. These cards not only create a capacity for easy and cashless transactions, but also create digital traces of user spending habits. When anonymised and aggregated, this dataset represents so much more than just a series of transactions.

Previously, understanding the economic lifestyles of women was done through questionnaires or national censuses. Now, with details of credit card transactions we can see how the spending behaviour of women in different communities or regions differs and changes over time.
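As a rough illustration of what "anonymised and aggregated" can mean in practice, the Python sketch below computes average spend per card for each region and month, and suppresses any group with too few cards to report safely. The transaction layout, the sample values and the threshold of three cards are assumptions made for this example; real anonymisation involves considerably more than a minimum group size.

```python
from collections import defaultdict

# Hypothetical transactions: (card_id, region, month, amount).
transactions = [
    ("c1", "North", "2018-01", 40.0),
    ("c2", "North", "2018-01", 60.0),
    ("c3", "North", "2018-01", 20.0),
    ("c1", "North", "2018-02", 50.0),
    ("c4", "South", "2018-01", 30.0),
]

MIN_CARDS = 3  # assumed threshold: hide groups with fewer distinct cards


def mean_spend_by_group(transactions, min_cards=MIN_CARDS):
    """Average spend per card for each (region, month), omitting small groups."""
    totals = defaultdict(float)
    cards = defaultdict(set)
    for card, region, month, amount in transactions:
        key = (region, month)
        totals[key] += amount
        cards[key].add(card)
    return {
        key: totals[key] / len(cards[key])
        for key in totals
        if len(cards[key]) >= min_cards
    }


summary = mean_spend_by_group(transactions)
for (region, month), mean_spend in sorted(summary.items()):
    print(f"{region} {month}: {mean_spend:.2f} per card")
```

Comparing these group averages across regions and months is one simple way to see how spending behaviour differs and changes over time without exposing any single cardholder.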

Previously, building economic indices was done at a national level with metrics such as GDP, or at finer spatial granularity by conducting surveys. Now, with large datasets we can build social, economic and well-being measures which are representative of these ground-truth indices and are available at finer spatiotemporal granularities.

Financial transactions can also be used to detect fraud, suggest saving habits and track government subsidies. Trends generated at a national or state level from these transactions would be extremely helpful, especially in developing countries, for example in planning policies for financial literacy and inclusion.

The panel will still meet a few more times and the final report is tentatively scheduled to be released in April 2019. It is imperative that it does not overlook the importance of data science. The examples outlined above with mobile network and credit card data are just a few of the many examples of how data can be used by governments and policymakers.

Although not as flashy as artificial intelligence, blockchain or robotics, data science has the potential to support quicker, more effective and more accurate solutions. Understanding trends in large datasets can help reduce costs, find new markets and support better decision-making in governments around the world.

*This article was written by Gates Cambridge Scholar Krittika D’Silva [2016], who is doing a PhD in Computer Science, and Deepakshi Rawat, a junior research consultant at Pulse Lab Jakarta. It was first published on Apolitical.co. Picture credit: Defense Advanced Research Projects Agency (DARPA) and Wikimedia Commons.