Big data is the large-scale collection of a wide range of data points. These data points are collected constantly, from nearly every facet of modern life. Data is collected every time a person searches the internet, shops online, posts or browses on social media, or uses GPS on a phone or in a vehicle, and there are hundreds of other collection points. With the rise of the Internet of Things, even more data will be collected in the coming months and years. Data can even come from sources that appear unconventional; for example, sensors placed on a bridge can track its condition and indicate when it is likely to fail or require maintenance.

The collection of data has been on the rise in recent years, due in large part to the steadily decreasing costs of technology and data storage. In the past, big data and the predictive analytics that come with it were available only to large corporations and parts of the government. With the increased ability to collect and store information, the use of big data and predictive analytics has spread to smaller companies and organizations.

What type of information is being collected?

Information is being collected nearly all the time: whenever an electronic device is used, data is likely being collected. Information may even be collected without direct use of an electronic device. Some make the distinction between data that is “born digital” and data that is “born analog”. Information that is born digital was created, either by humans or by computers, specifically for use by a computer or other processing system. Examples of information that is “born digital” include emails; text messages; GPS locations; metadata associated with phone calls, including the numbers dialed and the length of calls; data associated with typical commercial transactions, including credit card swipes and bar-code scans; data from cars, televisions and appliances connected to the Internet of Things; and much more.

Information that is born analog arises from characteristics of the physical world and cannot be accessed electronically until a sensor is applied. A sensor is any device that observes these physical characteristics and converts them into digital form. Examples of information that is “born analog” include the voice content of a phone call; personal health data, including heartbeat, respiration and number of steps taken; video from surveillance cameras; recordings captured by cell phones, drones, microphones and cameras; medical imaging; and more.

What do companies and the government do with the data?

The most common use of data points and the underlying information is to create profiles of individuals, which can then be used in predictive analytics. Predictive analytics is the use of algorithms applied to data points to predict future outcomes. These algorithms can be used in virtually any area. In marketing, predictive analytics can be used to identify the customers who are most likely to purchase specific products. In an employment setting, it can be used to find the candidates most likely to excel in a given position. In education, it is being used to help college students choose a major and to identify the courses they are likely to do well in and the courses they are likely to fail. Predictive analytics is also used in crime and terrorism prevention. There, the collected data can be used to try to identify high-crime areas and to create profiles of individuals who are considered likely to commit crimes.
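To make the idea of applying an algorithm to profile data concrete, here is a minimal sketch of the marketing case. Everything in it is an assumption made for illustration: the customer records are invented, the two features (site visits per month and prior purchases) are hypothetical, and the nearest-neighbor method is just one simple technique among many, not the method any particular company uses.

```python
from math import dist

# Hypothetical past customer profiles (invented for illustration):
# (site visits per month, prior purchases) -> did the customer buy the product?
HISTORY = [
    ((1, 0), False),
    ((2, 0), False),
    ((3, 1), False),
    ((8, 2), True),
    ((10, 3), True),
    ((12, 5), True),
]


def predict_purchase(profile, history=HISTORY, k=3):
    """Estimate the chance of a purchase as the share of buyers
    among the k past profiles most similar to this one."""
    # Sort past profiles by straight-line distance to the new profile.
    nearest = sorted(history, key=lambda item: dist(item[0], profile))[:k]
    buyers = sum(1 for _, bought in nearest if bought)
    return buyers / k


def rank_customers(profiles, history=HISTORY, k=3):
    """Order profiles from most to least likely to purchase."""
    return sorted(
        profiles,
        key=lambda p: predict_purchase(p, history, k),
        reverse=True,
    )


if __name__ == "__main__":
    prospects = [(2, 1), (9, 3)]
    for prospect in rank_customers(prospects):
        print(prospect, predict_purchase(prospect))
```

The sketch scores a new profile by looking at what similar people did in the past, which is the basic logic behind much profile-based prediction. Real systems use far more data points and more sophisticated models.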