Computational Storage: Potential Benefits of Reducing Data Movement

Computational Storage is the talk of the storage town today.

In the previous blog, we looked at how storage architectures have evolved and which new ones are emerging. One of the most widely discussed of these is computational storage. If you attended SNIA SDC USA 2020, you would have noticed how many of the sessions and discussions leaned towards this trending storage architecture.

In another earlier blog, we covered the 5Ws you need to know to understand Computational Storage. We saw that, because it uses distributed processing, Computational Storage provides higher-capacity solutions with lower power consumption. As a result, it offers improved efficiency and performance.

In one of the keynote sessions at SNIA SDC USA 2020, JB Baker from ScaleFlux talked about the advantages of reducing data movement in Computational Storage. He grouped the advantages into two categories: Saving Time and Saving Money.

  1. Saving Time: Even as storage media, interfaces, and networks grow and bandwidth increases, data movement is becoming sluggish. Moving terabytes and zettabytes of data to perform mission-critical tasks such as transactional processing, big data analytics, and machine learning can become time-consuming and reduce efficiency.
  2. Saving Money: The infrastructure needed to support all this massive data movement requires continual investment, which creates a lot of challenges for those managing the data.

So, reducing data movement might help reduce both processing time and infrastructure costs. This can be a boon for IT departments and data center architects. Baker went on to give an example of using Computational Storage to reduce data movement by means of a Data Filtering Computational Storage Service (CSS). Let’s dig deeper into the example and the results Baker presented.

Let’s take a 12 TB data set that represents all the worldwide purchase transactions of the past several years. Suppose a data analyst needs to run a query that covers just 4 months in 2016, which comes to only about 100 GB of data relevant to the query, or <1% of the entire data set. With ordinary storage, we might have to push all 12 TB up to the CPU so that it can do the filtering and then complete the query, which invites a bottleneck at the CPU.
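The arithmetic behind this example can be checked with a rough back-of-the-envelope sketch. The data set and query sizes come from the example above; the link bandwidth is a purely hypothetical figure chosen for illustration and was not quoted in the keynote.

```python
# Sizes from the example above, in decimal units.
TB = 10**12
GB = 10**9

dataset_bytes = 12 * TB    # the whole transaction data set
relevant_bytes = 100 * GB  # the data relevant to the 4-month query

# Share of the data set that the query actually needs.
relevant_fraction = relevant_bytes / dataset_bytes
print(f"Relevant share of the data set: {relevant_fraction:.2%}")

# ASSUMPTION: a hypothetical sustained bandwidth of 3 GB/s between
# the drive and the CPU. Real systems will differ.
ASSUMED_BANDWIDTH_BYTES_PER_S = 3 * GB


def transfer_seconds(num_bytes, bandwidth=ASSUMED_BANDWIDTH_BYTES_PER_S):
    """Idealized time to move num_bytes at a constant bandwidth."""
    return num_bytes / bandwidth


full_scan_s = transfer_seconds(dataset_bytes)
relevant_only_s = transfer_seconds(relevant_bytes)

print(f"Moving all 12 TB:       {full_scan_s:,.0f} s")
print(f"Moving only the 100 GB: {relevant_only_s:,.1f} s")
```

Whatever bandwidth is assumed, the ratio between the two transfer times stays the same, because it depends only on how much of the data set the query really needs.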

Instead, if we implement a data filtering CSS down at the drives, it selects the relevant data before anything leaves the drive. We then have to move only the roughly 1% of data relevant to the query across the PCIe bus to the CPU. In this case, that reduces the total data movement by about 99% and also reduces the processing the CPU must do to finish the query, resulting in a faster query completion time. It also allows more queries to run in parallel and lets the system scale more rapidly. Baker backed the theory with a practical implementation of the above example, measuring the data movement, CPU utilization, and query completion time for ordinary storage and for a data filtering CSS.
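The difference between the two approaches can be illustrated with a small simulation. This is only a sketch under stated assumptions, not ScaleFlux's implementation or any real computational storage API: the `SimulatedDrive` class, the fixed 64-byte record size, the one-record-per-month toy data set, and the choice of January to April as the 4 months are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date

RECORD_BYTES = 64  # ASSUMPTION: every record has a fixed size


@dataclass(frozen=True)
class Transaction:
    day: date
    amount: float


class SimulatedDrive:
    """Hypothetical drive that counts the bytes it sends to the host."""

    def __init__(self, records):
        self._records = list(records)
        self.bytes_sent_to_host = 0

    def read_all(self):
        # Ordinary storage: every record crosses the bus.
        self.bytes_sent_to_host += len(self._records) * RECORD_BYTES
        return list(self._records)

    def read_filtered(self, predicate):
        # Data filtering on the drive: only matching records cross the bus.
        matches = [r for r in self._records if predicate(r)]
        self.bytes_sent_to_host += len(matches) * RECORD_BYTES
        return matches


def in_query_window(t):
    # ASSUMPTION: the 4 months of 2016 are January through April.
    return date(2016, 1, 1) <= t.day <= date(2016, 4, 30)


# Toy data set: one transaction per month from 2010 through 2019.
records = [
    Transaction(date(year, month, 15), 10.0)
    for year in range(2010, 2020)
    for month in range(1, 13)
]

# Ordinary storage: move everything, then filter on the CPU.
host_drive = SimulatedDrive(records)
host_result = [t for t in host_drive.read_all() if in_query_window(t)]

# Data filtering on the drive: filter first, move only the matches.
css_drive = SimulatedDrive(records)
drive_result = css_drive.read_filtered(in_query_window)

print("Same query result:", host_result == drive_result)
print("Bytes moved, ordinary storage:", host_drive.bytes_sent_to_host)
print("Bytes moved, drive-side filter:", css_drive.bytes_sent_to_host)
```

Both paths return the same rows; only the number of bytes that cross the bus changes. In this toy data set the query window is a larger share of the data than in the 12 TB example, so the saving is smaller than 99%.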

The results were as below:

Metric | Ordinary Storage | Data Filtering CSS
--- | --- | ---
Data Movement | High bandwidth for a very long period to move massive data | High data movement, but for a very short period
CPU Utilization | Experienced bottleneck | CPU scaled nicely due to less data movement
Query Completion Time | Slower query completion | Rapid query completion (2-4 times faster than the usual)

This example clearly illustrates the potential benefit of using Computational Storage with a Data Filtering Service over ordinary storage. To learn more about the advantages of Computational Storage, don’t forget to register for our upcoming keynote session at SNIA SDC India 2020, in which we will elaborate on the idea of computational storage and its position in the market. You can reach out to our speaker, Rohit Srivastava, during the session and ask your questions about CS at the event.

You can also watch the complete keynote session by Baker at SNIA SDC USA 2020 here and read more on SNIA.org/computational

 