
Simplifying the buzzword - Big Data

Updated: Apr 2, 2021


 

In this exabyte-scale era, data grows at an exponential rate, which has posed massive challenges to traditional approaches to data handling and processing. Big Data, huge data of immense complexity and a natural outcome of a plethora of experiments ranging from modern accelerators to astronomy, points to the necessity of upgrading existing data-analysis methodology with state-of-the-art innovations in data science. In this article, the author gives a brief outline of the growth of Big Data, its associated challenges, and how various tech-driven organizations are adopting this change.

 

Posted by Dipayan Dev, a Data Engineer at Walmart Labs and author of the well-known book 'Deep Learning with Hadoop', Packt Publishing (2017). He frequently publishes research papers, book chapters, and conference articles in top-tier computer science venues. (Email: dev.dipayan16@gmail.com)

 

In a recent estimate, IBM calculated that 2.5 quintillion bytes of data are created every day - so much that 90% of the data in the world today was created in the last two years. It is a mind-boggling figure, and the irony is that we feel less informed despite having more information available today.


The surprising growth in data volumes has put today's businesses under severe strain. Online users create content such as blog posts, tweets, social networking interactions and photos, while servers continuously log what those users are doing.

Online data comes from posts on social media sites such as Facebook and Twitter, YouTube videos, cell phone conversation records and so on. This massive amount of data is formally termed Big Data.


WHAT IS BIG DATA?

Big Data simply means datasets that grow so large that they become difficult to manage using existing database management concepts and tools. The difficulty can relate to data capture, storage, search, sharing, analytics, visualization and more.

Big Data spans three major dimensions: Volume, Velocity and Variety.

  • Volume - The size of the data is very large, typically measured in terabytes and petabytes.

  • Velocity - Data should be processed as it streams into the enterprise in order to maximize its value to the business; the role of time is critical here (a small illustrative sketch follows below).

  • Variety - Data extends beyond structured records to unstructured data of all varieties: text, audio, video, posts, log files and more.

 

3Vs of Big Data
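
To make the Velocity point a little more concrete, here is a minimal Python sketch; the click stream, user IDs and five-second window are invented for illustration only. It keeps a running count over a short time window while records stream in, instead of waiting until the whole dataset has been collected.

```python
# Minimal sketch of the "Velocity" dimension (illustrative only): events are
# processed as they arrive rather than after they have all been collected.
# The event stream here is simulated with random click records.

import random
import time
from collections import deque

def simulated_click_stream(n_events: int):
    """Yield (timestamp, user_id) pairs, standing in for a live feed."""
    for _ in range(n_events):
        yield time.time(), f"user-{random.randint(1, 50)}"

def clicks_in_last_seconds(stream, window_seconds: float = 5.0):
    """Maintain a sliding-window count while consuming the stream."""
    window: deque[float] = deque()
    for timestamp, _user in stream:
        window.append(timestamp)
        # Drop events that have fallen out of the time window.
        while window and window[0] < timestamp - window_seconds:
            window.popleft()
        yield len(window)

if __name__ == "__main__":
    for current_count in clicks_in_last_seconds(simulated_click_stream(20)):
        print(f"clicks in the last 5 seconds: {current_count}")
```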

 

WHY BIG DATA?

When an enterprise can leverage all of the information available to it, rather than just a subset of its data, it gains a powerful advantage over its market competitors. Big Data helps organizations gain insights and make better decisions.

Big Data presents an opportunity to create unprecedented business advantage and better service delivery. It also requires new infrastructure and a new way of thinking about how business and the IT industry work. The concept of Big Data has changed the way we do things today.


An International Data Corporation (IDC) study predicted that overall data would grow 50-fold by 2020, driven in large part by more embedded systems such as sensors in clothing, medical devices and structures like buildings and bridges. The study also determined that unstructured information - such as files, email and video - would account for 90% of all data created over the next decade, while the number of IT professionals available to manage all that data would grow to only 1.5 times today's levels.

The digital universe is about 1.8 trillion gigabytes in size, stored in 500 quadrillion files, and it more than doubles in size every two years. Comparing the digital universe with our physical universe, there are nearly as many bits of information in the digital universe as there are stars in our physical universe.
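
As a rough back-of-the-envelope illustration of that doubling rate, the short Python sketch below projects forward from the 1.8-trillion-gigabyte baseline; the baseline figure and two-year doubling period are the article's approximate numbers, and the projection horizons are arbitrary.

```python
# Minimal sketch: project the size of the digital universe assuming it
# doubles every two years from a 1.8 trillion GB (~1.8 zettabyte) baseline.
# The baseline and doubling period are the article's approximate figures.

BASELINE_GB = 1.8e12      # ~1.8 trillion gigabytes
DOUBLING_YEARS = 2        # "more than doubles every two years"

def projected_size_gb(years_from_now: float) -> float:
    """Projected size in gigabytes after the given number of years."""
    return BASELINE_GB * 2 ** (years_from_now / DOUBLING_YEARS)

for years in (2, 4, 10):
    trillions_of_gb = projected_size_gb(years) / 1e12
    print(f"In {years} years: roughly {trillions_of_gb:.1f} trillion GB")
```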


BIG DATA CHALLENGES

The main challenges of Big Data are data variety, volume, analytical workload complexity and agility. Many organizations are struggling to deal with the increasing volumes of data. To solve this problem, organizations need to carefully design their data storage, exploit new storage techniques, and make the stored data faster to consume later, which can further improve performance and storage utilization.
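
As a rough illustration of what designing the data storage can mean in practice, below is a minimal Python sketch using only the standard library; the record format, date-based partitioning scheme and file layout are assumptions chosen for illustration, not any specific product's design. It writes incoming records into compressed, per-day partitions so that later consumers can read only the slice they need, which is one common way to improve both performance and storage utilization.

```python
# Minimal sketch (illustration only): incoming records are partitioned by date
# and written gzip-compressed, so later queries open only the partitions they
# need and the data takes less space on disk.

import gzip
import json
import os

def write_partitioned(records, out_dir: str = "events") -> None:
    """Group records by their 'date' field and write one compressed file per day."""
    by_date: dict[str, list[dict]] = {}
    for record in records:
        by_date.setdefault(record["date"], []).append(record)

    os.makedirs(out_dir, exist_ok=True)
    for date, day_records in by_date.items():
        path = os.path.join(out_dir, f"date={date}.jsonl.gz")
        with gzip.open(path, mode="wt", encoding="utf-8") as out:
            for record in day_records:
                out.write(json.dumps(record) + "\n")

def read_partition(date: str, out_dir: str = "events"):
    """Read back only one day's partition instead of scanning everything."""
    path = os.path.join(out_dir, f"date={date}.jsonl.gz")
    with gzip.open(path, mode="rt", encoding="utf-8") as src:
        for line in src:
            yield json.loads(line)

if __name__ == "__main__":
    sample = [
        {"date": "2021-04-01", "user": "u1", "action": "click"},
        {"date": "2021-04-02", "user": "u2", "action": "view"},
    ]
    write_partitioned(sample)
    print(list(read_partition("2021-04-02")))
```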


SUMMARY AND CONCLUSION

Big Data is a new gold rush and a key enabler of social business. Without leveraging Big Data analytics, a large or medium-sized company can neither make sense of all the user-generated content online nor collaborate effectively with customers, suppliers and partners on social media channels. Collaboration with customers and insights from user-generated online content are critical for success in the age of social media.

A study by McKinsey's Business Technology Office and the McKinsey Global Institute (MGI) calculated that the U.S. faces a shortage of 140,000 to 190,000 people with analytical expertise and 1.5 million managers and analysts with the skills to understand and make decisions based on the analysis of Big Data.


The biggest gap, by a factor of roughly ten, is the shortage of skilled managers who can make decisions based on that analysis. Growing talent and building teams that make analytics-based decisions is the key to realizing the value of Big Data.

 

Join us to create relevant content:

Writing on an academic issue, whether as a study report, a news piece or a journal article, brings your work to the attention of technical people in your field as well as general readers. We invite you to associate with the research and educational programs undertaken by CASILab and captivate readers' attention through your positive thoughts and activities.

