Standards-Based Parallel Global File Systems and Automated Data Orchestration with NFS

Abstract

High-performance computing applications, web-scale storage systems, and modern enterprises increasingly need a data architecture that unifies data at the edge, in data centers, and in clouds. Organizations with massive-scale data requirements need the performance of a parallel file system coupled with a standards-based solution that is easy to deploy on machines with diverse security and build environments.

Standards-Based Parallel Global File System - No Proprietary Clients

The Linux community, with contributions from Hammerspace, has developed an embedded parallel file system client as part of the NFS protocol. With NFS 4.2, standard Linux clients can now read and write directly to the storage and scale out performance linearly for both IOPS and throughput, saturating the limits of both storage and network infrastructures. Proprietary client software is no longer needed to create a high-performance parallel file system, because NFS 4.2 is an open standard and its client is included in Linux distributions. NFS 4.2 is a commercially driven follow-on to pNFS concepts.
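Because the client ships with the Linux kernel, one quick way to confirm that a machine is actually using NFS 4.2 is to look at its mount table. The following is a minimal sketch, assuming text in the `/proc/mounts` format; the function name `nfs42_mounts`, the server names, and the paths in `SAMPLE` are hypothetical and chosen only for illustration.

```python
def nfs42_mounts(mounts_text):
    """Return (mount_point, options) for each NFS mount that uses version 4.2."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each entry has: device, mount point, fs type, options, dump, pass.
        if len(fields) < 4:
            continue
        mount_point, fs_type = fields[1], fields[2]
        options = fields[3].split(",")
        if fs_type in ("nfs", "nfs4") and "vers=4.2" in options:
            found.append((mount_point, options))
    return found


# Hypothetical mount table used only to exercise the parser.
SAMPLE = """\
server1:/export /mnt/data nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,proto=tcp 0 0
server2:/home /mnt/home nfs rw,relatime,vers=3,proto=tcp 0 0
/dev/sda1 / ext4 rw,relatime 0 0
"""

if __name__ == "__main__":
    for mount_point, _ in nfs42_mounts(SAMPLE):
        print(mount_point)
```

On a live Linux system the same function could be given the contents of `/proc/mounts` instead of `SAMPLE`.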

Today’s data architectures span multiple types of storage systems at the edge, in the data center, and in the cloud. With the rise of data orchestration systems that place data on the appropriate storage in the optimal geographic location, NFS 4.2 is a must-have technology for high-performance workflows that operate on distributed data sets.

Automated Data Orchestration - Across Any Storage System

Hammerspace developed and contributed the Flexible Files technology, which gives applications and users uninterrupted access to data while that data is being orchestrated, even for live files moving across incompatible storage tiers from different vendors and between multiple geographic locations.

Flexible Files, along with mirroring, built-in real-time performance telemetry, and attribute delegation (to name a few), are put to work in a global data environment to recall layouts non-disruptively, which keeps data accessible and maintains data integrity even as files are moved or copied. This has enormous ramifications for enterprises, as it can eliminate the downtime traditionally associated with data migrations and technology upgrades. Enterprises can combine this capability with software, such as a metadata engine, that can virtualize data across heterogeneous storage types and automate the movement and placement of data according to IT-defined business objectives.
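The idea of recalling a layout while a file is moved can be illustrated with a toy model. This is a simplified sketch, not Hammerspace's implementation and not the NFS wire protocol: the classes `MetadataServer` and `Client`, the store names, and the file name are all hypothetical, and the model ignores in-flight I/O, writes, and mirroring.

```python
class MetadataServer:
    """Toy metadata service: tracks where each file lives and who holds a layout."""

    def __init__(self):
        self.location = {}  # file name -> storage system currently holding the data
        self.holders = {}   # file name -> clients holding a layout for that file

    def get_layout(self, client, name):
        self.holders.setdefault(name, set()).add(client)
        return self.location[name]

    def move(self, name, new_store, stores):
        old_store = self.location[name]
        # Copy the data first so it is valid in the new location.
        stores[new_store][name] = stores[old_store][name]
        self.location[name] = new_store
        # Recall outstanding layouts so no client keeps using the old location.
        for client in self.holders.pop(name, set()):
            client.recall(name)
        del stores[old_store][name]


class Client:
    """Toy client: caches layouts and reads directly from the storage they name."""

    def __init__(self, mds, stores):
        self.mds, self.stores = mds, stores
        self.layouts = {}

    def recall(self, name):
        self.layouts.pop(name, None)

    def read(self, name):
        # After a recall, the next access fetches a fresh layout.
        if name not in self.layouts:
            self.layouts[name] = self.mds.get_layout(self, name)
        return self.stores[self.layouts[name]][name]


# Hypothetical storage systems and file, for illustration only.
stores = {"nas_a": {"report.dat": b"payload"}, "object_b": {}}
mds = MetadataServer()
mds.location["report.dat"] = "nas_a"
client = Client(mds, stores)

first = client.read("report.dat")            # served from nas_a
mds.move("report.dat", "object_b", stores)   # data moves, layout is recalled
second = client.read("report.dat")           # served from object_b, same path
```

In this model the client never changes the name it reads; only the cached layout is replaced, which is the property that lets the file stay accessible across the move.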

Building a Global Data and Storage Architecture

Hammerspace brings NFSv4.2 connectivity (in addition to SMB and NFSv3) to its parallel global file system to build a standards-based, high-performance file system that spans multiple existing, otherwise incompatible storage systems from any vendor, as well as decentralized locations. In this way it can intelligently and efficiently automate the orchestration of data to the applications, compute clusters, or users that need it, enabling global access for analysis, distributed workloads, or AI-driven insights.

David Flynn
Hammerspace