More than twice as much data will be generated in 2025 as in 2022, reaching some 180 zettabytes, approximately 36,000 times more than the data Google stores today.
Since the 1990s, large companies have recognised the importance of data as a key differentiator in their business plans. This trend intensified with the democratisation of the internet and the rise of websites, and it has evolved significantly over the past 30 years.
A report by Statista projects that more than twice as much data will be generated in 2025 as in 2022, reaching nearly 180 zettabytes, approximately 36,000 times more than the volume of data that Google stores today.
On the occasion of its 25th anniversary, PUE, the leading Spanish technology company in consulting and implementation of Data & Machine Learning projects, has compiled the key milestones it has witnessed in the transition from the early days of big data to today’s capabilities. These milestones have enabled the company to respond innovatively to the growing demands of its customers.
- Transition to greater efficiency: From Hadoop to the Data Cloud: In 2006, Apache Hadoop emerged as an innovation focused on batch ETL processes for on-premise environments. However, its limited ecosystem and complex Java development left it looking like a rudimentary framework. By 2023, the Data Cloud had consolidated as an evolution that maintains the spirit of Hadoop while covering both batch processing and streaming. This real-time capability, combined with artificial intelligence, enables processes to be developed in a broad ecosystem, both in the cloud and in hybrid environments, and it offers data quality and data governance capabilities, marking a significant advance from the days of Hadoop.
- Evolution in data management: From on-premise to cloud and hybrid models: In early data projects, companies were limited to the on-premise option. With the development of the Cloud, more agile, flexible and efficient data management emerged, with potentially lower investment. Today, the key to an optimised data strategy is to analyse the nature of each project to determine whether a cloud, on-premise or hybrid environment is the best option. A report by Cloudera reveals that 72% of companies are considering repatriating their data from the Cloud to on-premise, while 94% plan to move more data to the Cloud.
- Smart Data: The ultimate Big Data efficiency: As more companies look to leverage their data to expand their business, simply being a data-driven enterprise is no longer enough. Massive data collection and storage must now go hand in hand with innovative analytics techniques that make the data more useful, accessible and intuitive to work with. The transition from big data to smart data becomes the differentiating factor. It drives process optimisation, minimises risks and threats, opens up new business opportunities, improves user experience and ultimately strengthens brand reputation.
- New challenge in the short term: Growth of unstructured data: In the early days, companies dealt with small and manageable amounts of data. However, with advances such as IoT and connected devices, the volume of data has grown exponentially. Today, high volumes of unstructured data are common, accounting for 80-90% of new data captured by enterprises, according to Gartner. This growth makes data governance difficult and highlights the need for the right technology and analytical approaches to extract maximum value from this data.