Physical-Social-Cyber Computing Systems

Posted by Ramesh on April 20th, 2013

As difficult as it may be for many people to believe, computing is a relatively young discipline. Until recently, most computing systems received input from people and represented it in cyberspace mostly in the form of abstract information. For efficiency reasons, most representations of the real world in computing systems captured only the application-specific, essential abstract aspects of the problem. These systems were much closer to mathematics, which defines operations on numbers representing abstractions of a problem. Moreover, these systems assumed that a human would transform data from the real physical world, or from sensors capturing aspects of the real world, into a form acceptable to the computing system. This has been the common mode of input in computing systems even as they evolved from traditional computers to the Web and then came to include social networks. Even in those transaction-oriented systems where data was entered directly into the system, the data was already available in an abstract form compatible with the representations in the computing system. In a real sense, cyber representations of the real world entered by human beings have been the predominant mode of interface for computing systems.

Increasingly, computing systems are expected to interface directly with sensors and other information sources, without the mediation of a human. This data could be in the form of GPS coordinates, physical activity levels, flood levels, temperature, rainfall rate, heart rate as a function of time, photographs, audio, and video. Clearly, these data may arrive at different rates and may represent measurements over different regions or people. Since the system is supposed to take this input directly from sources, without any human mediators, it is important that the system can deal with the representations and the semantics of the data. Using these diverse representations and semantics, the system should be able to extract information that helps in detecting situations of different kinds, and then express that information in forms suitable for reasoning about and analyzing the situation.

Compared to traditional computing systems, emerging cyber-physical-social systems must deal with diverse real-time data streams that have different semantics and are tied to spatial locations. The semantics of such data are inherently based on the sensor producing the data, the location of the sensor, and the time of measurement. The system must extract information from this data that is also related to time and location. Traditional cyber systems preferred abstractions that minimized the importance of absolute location and time. In emerging cyber-physical-social systems, time and location are independent variables that are used to represent measured, observed, and reported data; in traditional computing systems, entities were the independent variables and time and location were their attributes. These two approaches to representation are significantly different and emphasize different characteristics of the world that we try to model.
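
To make the contrast between the two representations concrete, here is a minimal sketch in Python. The Observation record, its field names, and the sample readings are all hypothetical and chosen only for illustration; the point is just that the same data can be keyed by (time, location) or keyed by entity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    # Hypothetical record layout, assumed for this illustration only.
    time: int      # seconds since some reference instant
    lat: float
    lon: float
    sensor: str    # e.g. "temperature" or "rainfall"
    value: float


def index_by_time_and_place(observations):
    """Time and location are the keys; measurements hang off them."""
    grid = defaultdict(dict)
    for ob in observations:
        grid[(ob.time, ob.lat, ob.lon)][ob.sensor] = ob.value
    return dict(grid)


def index_by_entity(observations):
    """The entity (here, the sensor type) is the key; time and place are attributes."""
    table = defaultdict(list)
    for ob in observations:
        table[ob.sensor].append(
            {"time": ob.time, "lat": ob.lat, "lon": ob.lon, "value": ob.value}
        )
    return dict(table)


# Made-up sample readings.
observations = [
    Observation(0, 33.64, -117.84, "temperature", 21.5),
    Observation(0, 33.64, -117.84, "rainfall", 0.2),
    Observation(60, 33.64, -117.84, "temperature", 21.7),
]

by_place_time = index_by_time_and_place(observations)
by_entity = index_by_entity(observations)
```

With the first index, the natural question is "what was happening here at this moment?"; with the second, it is "what do we know about this entity?". Neither is the right one in general, which is why keeping both can be useful.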

Scientists and engineers dealing with signals and systems learned to deal with this challenge a long time ago. It was considered natural to maintain multiple representations and use them opportunistically for solving a problem. The most common example of this is the use of both time-domain and frequency-domain analysis in designing audio and video systems. By maintaining and using both representations, one can design systems that help in understanding the requirements and steer the functionality to make these systems more useful for humans.
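
As a small illustration of keeping two representations of the same data, the sketch below builds a pure tone in the time domain and moves it to the frequency domain with a naive discrete Fourier transform. The window length and the number of cycles are arbitrary values assumed for the example; a real system would use an FFT library rather than this O(n^2) loop.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform; fine for a short illustration."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


n = 64       # assumed window length
cycles = 5   # assumed: the tone completes 5 full cycles in the window

# Time-domain representation: amplitude as a function of sample index.
signal = [math.sin(2 * math.pi * cycles * t / n) for t in range(n)]

# Frequency-domain representation: magnitude as a function of frequency bin.
spectrum = dft(signal)
magnitudes = [abs(c) for c in spectrum[: n // 2]]
dominant_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
```

In the time domain the tone is 64 oscillating numbers; in the frequency domain it collapses to a single dominant bin (dominant_bin equals cycles). Which view is more convenient depends on the question being asked.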

Using Big Data for Storytelling

Posted by Ramesh on April 5th, 2013

One of the best posts that I have read in some time on ‘data’ is by Om Malik. See here.

This post made me think about all the buzz about Big Data. The situation is exactly the same as that described in the famous fable of ‘Six Blind Men and an Elephant’. Different communities approach big data from their own very limited perspectives. So here is my perspective on one aspect of big data: its use for telling compelling stories. I recently wrote something on this topic: see Micro and Mega Stories.

We collect data and use data effectively to enhance our experiences and tell stories. But this requires understanding relationships among disparate data items. And that is where the importance of Big Data really lies.
A major transformation currently going on in society is in the nature of storytelling. At one time, storytelling was based on an individual’s subjective experiences, which were available to the storyteller only qualitatively. The last decade has seen the arrival of ‘data’ and objective storytelling. Human memory is powerful but has its limitations. Subjective human memory results in powerful anecdotes. Add to this the availability of large volumes of data and the ability to attach that data to strengthen those anecdotes, and you get powerful and compelling factual stories.

Storytelling is changing — whether for entertainment or for business or for education. It is going to be more objective – more based on data.

Correlation is the Mother of Causality

Posted by Ramesh on March 27th, 2013

I have always heard, for as long as I can remember, that correlation and causality are not the same. That is OK. They are indeed different.
But there is always this implication by many, if not most, people that correlation is second class and causality is first class: causality should be respected, while one should be cautious about correlations.

It bothers me whenever I hear that implication about the superiority of causality. To me it is like saying that invention is better than necessity or that a baby is better than its mother. Clearly there is a strong relationship between a mother and a baby, and every mother wants her baby to go much farther than she did. It is important to remember, however, that the baby follows the mother. Similarly, if there is no necessity then there is no need for invention; without an established causal necessity, an invention is worthless.

Correlation is observed in data, and only then do people think of finding the causal chain. Except in purely theoretical reasoning, correlation seeds the need for causality. Hence, I say that ‘Correlation is the Mother of Causality’.

In these days of Big Data (hype), if there is something really important, it is to understand correlations, because only then may causality be explored and established.

Building on the Disruption in Learning and Education

Posted by Ramesh on March 22nd, 2013

The last decades have been decades of disruption. Books (including encyclopedias and newspapers), music, phones, and retail are among the areas that have already been metamorphosed, thanks to rapid advances in technology, and have had to be creative just to survive. For some time now, people have been talking about the transformation of one of the most basic human activities: learning. Clearly, learning and education are going to be transformed very soon, possibly as soon as 2020. Recognition of Sugata Mitra’s ‘Hole in the Wall’ computers for learning and the spread of MOOCs (Massive Open Online Courses) are just harbingers of this change in learning and education.

Education and learning are supposed to go together and be synergistic. Ironically, however, education often gets in the way of learning. Education is a top-down activity that reflects societal needs, while learning is the most primitive human desire to satisfy curiosity and grow. Unfortunately, as education evolved in society, it started taming or even hindering human learning. Institutionalization of education resulted in a stronger focus on grades and degrees than on what they are supposed to reflect: the learning. Just as we have seen disintermediation transform some other basic societal functions, we are soon likely to see it in education and learning as well.

The disruption in education and learning is significant. Just as Gutenberg’s invention resulted in the democratization of knowledge creation and dissemination, the impending transformation will address the acquisition and transmission of knowledge. This is the time for our society to seriously consider this issue for its own growth. It is important to treat society’s need for educated people as synergistic with the basic human desire to learn and explore. Technology can play a crucial role in developing the next generation of systems for this purpose.

Developing countries, like India, are poised to be most affected by this imminent shift because of their currently inadequate education systems as well as their current demographic dividend. They are at a cusp: take advantage of the demographic dividend, or let it wither away and turn into a demographic curse. Fortunately, the changes brought by technology offer them an opportunity to leapfrog and use their demographic advantage to become leaders.

Focused MicroBlogs (FMBs): Going Beyond Twitter

Posted by Ramesh on March 11th, 2013

Twitter became a very powerful medium and its popularity soared rapidly. It became a major news source for the latest events, and it remains a popular site. Many academic researchers are exploring all aspects of Twitter: how it is used, how information propagates on it, how one can get information from it, how one can detect sentiment, and so on. Twitter data has become a good source of research papers for many budding researchers in different fields. Meanwhile, many companies build products using Twitter data, ranging from early detection of trends, to tracking the popularity of politicians, to gauging people’s sentiment on a particular topic. However, Twitter data has some limitations for many applications.

Twitter data is so broad in scope that any topic can be found in tweets, but unfortunately all topics are mixed together. If we treat the relevant information as signal and everything else as noise, Twitter data has a very low signal-to-noise ratio (SNR). Tools like hashtags help, but do not really solve the problem. We believe that all current Twitter-like social networks suffer from this serious issue of low SNR, which makes them less effective as information sources. Therefore, we propose the concept of Focused Micro Blogs (FMBs). FMBs are on a focused topic and are targeted to a specific source. Since the topic is fixed, the information in an FMB can be easily structured and parsed. FMBs retain the important feature of tweets, sending brief real-time information, while overcoming their weakness of combining too many different topics in one place. This information is aggregated at a server designed for a specific task.
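
To show what "easily structured and parsed" could look like, here is a minimal sketch. The pipe-delimited message format, the FocusedMicroBlog fields, and the parse_fmb function are all hypothetical; no FMB wire format is defined in this post. The idea is only that a fixed topic allows a fixed schema, so off-topic or malformed input can be rejected at the door.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FocusedMicroBlog:
    # Hypothetical schema for a traffic-focused micro-blog.
    topic: str
    event: str
    lat: float
    lon: float
    timestamp: int


def parse_fmb(line, expected_topic):
    """Parse 'topic|event|lat|lon|timestamp'; return None if off-topic or malformed."""
    parts = line.strip().split("|")
    if len(parts) != 5 or parts[0] != expected_topic:
        return None
    topic, event, lat, lon, ts = parts
    try:
        return FocusedMicroBlog(topic, event, float(lat), float(lon), int(ts))
    except ValueError:
        return None


# Made-up incoming messages; the last one is off-topic noise.
incoming = [
    "traffic|stalled_car|33.6405|-117.8443|1363000000",
    "traffic|police|33.6500|-117.8300|1363000060",
    "lunch|great tacos today",
]

accepted = [m for m in (parse_fmb(s, "traffic") for s in incoming) if m is not None]
```

Because the server only accepts messages that fit the schema for its one topic, everything it aggregates is signal by construction, which is the SNR argument made above.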

Let’s consider one concrete example of an FMB in a popular application for traffic: Waze. Waze is a phone application that periodically takes GPS data from the phone and uses it to compute the speed of the device. This is aggregated at the server to determine traffic conditions in different areas. A user may also post a micro-blog using specific input mechanisms so that the system knows about a stalled car or a police officer at a specific location. By using sensor data from the phone and specific micro-blog inputs, a Waze server provides very useful information to its users. This is an FMB in action.
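
The speed computation and aggregation described here can be sketched as follows. This is not how Waze actually implements it; it is a generic illustration that assumes periodic (timestamp, latitude, longitude) fixes, uses the standard haversine great-circle formula for distance, and averages speeds per road segment using a made-up segment identifier.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def speeds_kmh(fixes):
    """fixes: list of (timestamp_s, lat, lon) sorted by time; one speed per interval."""
    out = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(fixes, fixes[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order fixes
        out.append(haversine_m(la0, lo0, la1, lo1) / dt * 3.6)
    return out


def mean_speed_by_segment(reports):
    """reports: iterable of (segment_id, speed_kmh); segment ids are hypothetical."""
    totals = {}
    for segment, speed in reports:
        s, c = totals.get(segment, (0.0, 0))
        totals[segment] = (s + speed, c + 1)
    return {seg: s / c for seg, (s, c) in totals.items()}


# One phone reporting a fix every 10 seconds while moving north (made-up data).
fixes = [(0, 33.640, -117.840), (10, 33.641, -117.840), (20, 33.642, -117.840)]
device_speeds = speeds_kmh(fixes)

# Server side: combine reports from many devices per road segment.
traffic = mean_speed_by_segment([("A", 40.0), ("A", 20.0), ("B", 60.0)])
```

Each phone contributes only a few numbers per report, but averaged over many devices on the same segment they give the server a usable picture of traffic conditions.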