It isn't enough to collect data. Making sense of it is vital to the survival of organizations large and small. (Image via Vitalflux)
Large amounts of data are continuously being generated by organizations, governments, internet users and many others.
This data includes both open sets such as the ones that can be freely accessed online, and perhaps even larger private sets that are proprietary to the organizations that generate them (McKinsey claims that 15 out of 17 sectors in the United States have more data stored per company than the US Library of Congress). The fact of the matter is that there is so much data out there today that the term big data has been coined to address it.
Until recently, the main challenge for most organizations was accessing data. That is no longer the case; the challenge now is how to make sense of data and put it to use.
Data analysis is particularly relevant to business executives who may want to convert data into information that can be used in decision making. By harnessing this data through sophisticated analytics, and by presenting the key metrics in an efficient, easily consumable fashion, we are afforded unprecedented understanding and insight into various facets of operational and strategic business planning.
Business intelligence aims at gathering data relevant to one or more fields of interest in order to identify past trends, create predictive models, and apply those models to make business decisions. In retail environments, for example, such an analysis process can build behavioral profiles of an individual consumer or a group of consumers (e.g., identify business travelers, outdoor enthusiasts or frequent fast food restaurant visitors), then use the result to tailor marketing campaigns, advertisements, promotions, and even in-store product display and corresponding inventory management.
Let us take a step back. Say you have a data set. How do you actually proceed with the analysis? The field is obviously vast, but a practical application may help shed light on the type of skill sets you should source or grow in house to harness the advantages of data science.
One way to take a first step is to build a driver tree, that is, a relationship analysis between one or more metrics of interest and the parameters that affect those metrics. For example, say you are interested in your firm’s profits. Profits depend on revenue and cost; in turn, revenue depends on sales volume and per-unit price, whereas cost depends on fixed company costs and on variable costs, which again depend on sales volume and per-unit cost. Once the relationships are properly determined, one is able to map them into an Excel spreadsheet where actual analysis can begin.
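The driver tree described above can also be sketched in a few lines of code rather than a spreadsheet. The snippet below is a minimal illustration, not a prescribed method: the class name, the field names and every number in it are made-up assumptions chosen only to show how each node of the tree is computed from the nodes beneath it.

```python
from dataclasses import dataclass


@dataclass
class Drivers:
    """Leaf nodes of the driver tree (all values are illustrative assumptions)."""
    volume: float      # units sold
    unit_price: float  # selling price per unit
    unit_cost: float   # variable cost per unit
    fixed_cost: float  # fixed company costs


def revenue(d: Drivers) -> float:
    # Revenue depends on sales volume and per-unit price.
    return d.volume * d.unit_price


def cost(d: Drivers) -> float:
    # Cost is fixed cost plus variable cost (volume times per-unit cost).
    return d.fixed_cost + d.volume * d.unit_cost


def profit(d: Drivers) -> float:
    # Profit sits at the top of the tree: revenue minus cost.
    return revenue(d) - cost(d)


# Hypothetical base case, not taken from any real company.
base = Drivers(volume=10_000, unit_price=25.0, unit_cost=15.0, fixed_cost=60_000.0)
print(f"Revenue: {revenue(base):,.0f}")
print(f"Cost:    {cost(base):,.0f}")
print(f"Profit:  {profit(base):,.0f}")
```

Keeping each relationship in its own small function mirrors the way each cell in the spreadsheet version would hold one formula.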
This may seem like a simple model, and indeed it is. What is important to remember, however, is that the challenge is not to translate a complex business problem into a complex model but to translate it into a series of meaningful yet simple models. With the model above at hand, tasks such as sensitivity analysis and simulation become possible. Sensitivity analyses are achieved by changing an input cell in a spreadsheet (e.g., Volume) and recording the effects on output cells (e.g., Profit). This is where previously collected data sets come into play: past information on volume, for example, combined with the Data Tables tool in Excel, can help produce graphs such as Tornado diagrams.
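For readers who prefer code to spreadsheets, a one-at-a-time sensitivity analysis might look roughly like the sketch below. It assumes the simple profit model from the text; the base values and the plus or minus 10% swing are illustrative assumptions, whereas in practice the ranges would come from past data. The rows it produces, sorted by spread, are the bars of a Tornado diagram.

```python
def profit(volume, unit_price, unit_cost, fixed_cost):
    """Simple profit model: revenue minus fixed and variable costs."""
    return volume * unit_price - (fixed_cost + volume * unit_cost)


# Hypothetical base case (illustrative numbers only).
BASE = {"volume": 10_000, "unit_price": 25.0, "unit_cost": 15.0, "fixed_cost": 60_000.0}


def sensitivity(base, swing=0.10):
    """Vary one input at a time by +/- swing and record the resulting profit."""
    rows = []
    for name, value in base.items():
        low = profit(**{**base, name: value * (1 - swing)})
        high = profit(**{**base, name: value * (1 + swing)})
        rows.append((name, low, high, abs(high - low)))
    # Largest spread first, which is the ordering of a Tornado diagram.
    return sorted(rows, key=lambda row: row[3], reverse=True)


rows = sensitivity(BASE)
for name, low, high, spread in rows:
    print(f"{name:<12} profit at -10%: {low:>10,.0f}   at +10%: {high:>10,.0f}   spread: {spread:>10,.0f}")
```

Each row changes a single input while holding the others at their base values, which is exactly the limitation that simulation techniques are meant to address.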
While sensitivity analysis is useful, it does not capture the effects of simultaneous changes in several parameters, nor does it capture probabilities. Techniques such as Monte Carlo simulation allow you to investigate the effects of simultaneous changes in many inputs and to incorporate the probabilistic variations of parameters, as derived from past data sets. A Monte Carlo simulation can calculate the value of profit resulting from a large number of possible values for each of the variable driving parameters (i.e., price, volume, and per unit variable cost). The result of the simulation is a probability distribution of profit.
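As a rough sketch of the idea, the snippet below runs a small Monte Carlo simulation of the profit model using only the Python standard library. The choice of distributions (triangular for volume, normal for price and unit cost) and all of their parameters are assumptions made for illustration; in a real analysis they would be fitted to past data.

```python
import random
import statistics


def profit(volume, unit_price, unit_cost, fixed_cost):
    """Simple profit model: revenue minus fixed and variable costs."""
    return volume * unit_price - (fixed_cost + volume * unit_cost)


def simulate_profit(n_trials=10_000, seed=42):
    """Draw all uncertain inputs at once, n_trials times, and collect profits."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    results = []
    for _ in range(n_trials):
        # Assumed distributions, for illustration only.
        volume = rng.triangular(8_000, 12_000, 10_000)  # low, high, mode
        unit_price = rng.gauss(25.0, 1.5)               # mean, std deviation
        unit_cost = rng.gauss(15.0, 1.0)
        results.append(profit(volume, unit_price, unit_cost, fixed_cost=60_000.0))
    return results


profits = simulate_profit()
mean_profit = statistics.fmean(profits)
stdev_profit = statistics.stdev(profits)
prob_loss = sum(p < 0 for p in profits) / len(profits)

print(f"Average profit:      {mean_profit:,.0f}")
print(f"Standard deviation:  {stdev_profit:,.0f}")
print(f"Probability of loss: {prob_loss:.1%}")
```

The average gives an estimate of expected return, while the standard deviation and the probability of a loss give a sense of riskiness. Plotting a histogram of `profits` would show the full distribution.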
Companies such as General Motors, Eli Lilly, Procter and Gamble, and Sears use simulation to estimate both the average return and the riskiness of new products: GM uses simulation for activities such as forecasting net income for the corporation, predicting structural costs and purchasing costs, and determining its susceptibility to different kinds of risk (such as interest rate changes and exchange rate fluctuations).
Lilly uses simulation to determine the optimal plant capacity that should be built for each drug; Procter and Gamble uses simulation to model and optimally hedge foreign exchange risk; and Sears uses simulation to determine how many units of each product line should be ordered from suppliers - for example, how many pairs of Dockers should be ordered in a given year. The metrics one can focus on are practically boundless: customer complaints, waiting time to check out, flight delays, delivery time, billing accuracy, and so on.
Big Data also means big opportunities for entrepreneurs. (Image via Syracomm.de)
There is quite a bit more that can be done in terms of data analysis - the above has barely scratched the surface. Other types of analysis can involve data visualization and plotting, regression, forecasting, decision trees, and many others. Regression analysis, for example, builds mathematical models from past data in order to generate predictions and hypothetical data sets against which various foreseen courses of action can be vetted. Data sets can therefore help analyze the past, but they can also help plan for the future.
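To make the regression idea concrete, here is a minimal sketch of a simple linear regression fitted by ordinary least squares. The price and volume figures are invented for illustration and do not come from any real data set; the function names are likewise hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up past observations: unit price charged and units sold at that price.
past_prices = [20, 22, 24, 26, 28, 30]
past_volumes = [12_100, 11_350, 10_500, 9_800, 8_900, 8_250]

slope, intercept = fit_line(past_prices, past_volumes)


def predict_volume(price):
    """Use the fitted line to estimate volume at a price not yet tried."""
    return slope * price + intercept


print(f"Fitted model: volume = {slope:.1f} * price + {intercept:.1f}")
print(f"Predicted volume at a price of 25: {predict_volume(25):,.0f}")
```

A fitted relationship like this one could feed the volume input of a profit model, so that a proposed price change can be vetted before it is made.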
The takeaways for business executives are twofold. First, data can generate a significant competitive advantage that ought not to be ignored; to achieve that potential, companies may need to build internal expertise or source it from third parties, whether in the form of tools or of people.
It is equally important to set up proper processes for gathering internal data across the entire spectrum of business activities, such as feedback from sales teams, suppliers, and customers, and suggestions from engineering groups, marketing, executive management, and the rest. Finally, it is crucial to disseminate conclusions throughout the organization, while tailoring results to the specific interests of various internal audiences.
The second takeaway is that Big Data has opened the door to a large spectrum of entrepreneurship opportunities. Today, new ventures are going after gaps in pretty much the entire spectrum of Big Data related fields and applications.
Companies such as Platfora, Datameer, and Trifacta are enabling faster collection and aggregation of data for analysis purposes; on the analytics side, Tableau Software has achieved a solid market presence, but other firms are also carving out their niche in terms of visualization (e.g., CartoDB, Qlik), online reputation monitoring (e.g., Socialmention, IceRocket), or building structured data from the unstructured online content offered by the Internet.
On that last front, we see GNIP and Datasift mining through social media networks to create data sets, EQLIM (Beirut based) and Cytora doing the same with a focus on geopolitical risk, and Brate (Beirut based) offering an online discovery platform that harvests online content to surface consumer retail trends.
Every few years, new technological advancements disrupt existing ways of conducting business. Broadband, ubiquitous wireless connectivity, and smartphones have radically altered our world. Big Data is poised to do the same; it has already started.