High-performance data centers have been aggressively moving
toward parallel technologies like clustered computing and multi-core
processors. While this increased use of parallelism
overcomes the vast majority of computational bottlenecks, it shifts the
performance bottlenecks to the storage I/O system. To ensure that compute
clusters deliver the maximum performance, storage systems must be
optimized for parallelism. Legacy Network Attached Storage (NAS)
architectures based on NFS v4.0 and earlier have serious performance
bottlenecks and management challenges when implemented in conjunction
with large-scale, high-performance compute clusters.
A consortium of storage industry technology leaders created
a parallel NFS (pNFS) protocol as an optional extension of the NFS v4.1
standard. pNFS takes a different approach by allowing compute clients to read and write directly to the
storage, eliminating filer head bottlenecks and allowing single file
system capacity and performance to scale linearly.
In order to understand how pNFS works it is first necessary
to understand what takes place in a typical NFS architecture when a
client attempts to access a file. A traditional NFS architecture consists
of a filer head placed in front of disk drives and exporting a file
system via NFS. When large numbers of clients want to access the data, or
if the data set grows too large, the NFS server quickly becomes the
bottleneck and significantly impacts system performance because the NFS
server sits in the data path between the client computer and the physical
storage devices.
pNFS removes the performance bottleneck in traditional NAS
systems by allowing the compute clients to read and write data directly,
and in parallel, to and from the physical storage devices. The NFS server
is used only to control metadata and coordinate access, allowing fast
access to very large data sets from many clients.
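The difference between the two architectures can be illustrated with a back-of-the-envelope model. The sketch below is only an assumption-laden simplification: it treats aggregate throughput as the smaller of what the clients can demand and what the data path can supply, and the function names and bandwidth figures are hypothetical, not measurements.

```python
def aggregate_throughput(clients, client_gbps, paths, path_gbps):
    """Aggregate throughput is capped by the smaller of demand and supply.

    Simplifying assumption: every client and every data path runs at
    its full nominal bandwidth, with no protocol or contention overhead.
    """
    demand = clients * client_gbps   # what the clients could consume
    supply = paths * path_gbps       # what the data path can deliver
    return min(demand, supply)


def traditional_nfs(clients, client_gbps, server_gbps):
    # All data flows through one filer head, so there is a single path.
    return aggregate_throughput(clients, client_gbps, 1, server_gbps)


def pnfs(clients, client_gbps, devices, device_gbps):
    # Clients talk to the storage devices directly, so each device is a path.
    return aggregate_throughput(clients, client_gbps, devices, device_gbps)


if __name__ == "__main__":
    # Hypothetical numbers: 64 clients at 1 Gb/s, 10 Gb/s per server or device.
    print("traditional NFS:", traditional_nfs(64, 1, 10), "Gb/s")
    print("pNFS, 16 devices:", pnfs(64, 1, 16, 10), "Gb/s")
```

Under these assumed figures the single server caps the cluster at its own link speed, while the pNFS case is limited by the clients instead, which is the shift in bottleneck the paragraph above describes.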
When a client wants to access a file, it
first queries the metadata server, which provides it with a map of where
to find the data and with credentials regarding its rights to read,
modify, and write the data. Once the client has those two components, it
communicates directly with the storage devices when accessing the data.
With traditional NFS, every bit of data flows through the NFS server;
with pNFS, the NFS server is removed from the primary data path, allowing
free and fast access to data. All the advantages of NFS are maintained,
but bottlenecks are removed and data can be accessed in parallel, allowing
for very fast throughput rates; system capacity can be easily scaled
without impacting overall performance.
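The access flow described above can be sketched as a toy, in-memory simulation. This is not the NFSv4.1 wire protocol (where the layout request is the LAYOUTGET operation); the classes `Layout`, `StorageDevice`, and `MetadataServer`, the round-robin striping, and the `pnfs_read` helper are all hypothetical names chosen for illustration, and the credential check is omitted for brevity.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Layout:
    """Hypothetical stand-in for a pNFS layout: the map of where data lives."""
    stripe_size: int   # bytes per stripe unit
    devices: list      # device ids, assigned to stripes round-robin
    file_size: int     # total file length in bytes


class StorageDevice:
    """Toy storage device holding stripe units keyed by (path, stripe index)."""
    def __init__(self):
        self.stripes = {}

    def read(self, path, index):
        return self.stripes[(path, index)]


class MetadataServer:
    """Toy metadata server: hands out layouts and never serves file data."""
    def __init__(self, devices, stripe_size):
        self.devices = devices          # dict of device id -> StorageDevice
        self.stripe_size = stripe_size
        self.sizes = {}

    def store(self, path, data):
        # Setup helper for the demo: stripe the data across all devices.
        ids = sorted(self.devices)
        for offset in range(0, len(data), self.stripe_size):
            index = offset // self.stripe_size
            device = self.devices[ids[index % len(ids)]]
            device.stripes[(path, index)] = data[offset:offset + self.stripe_size]
        self.sizes[path] = len(data)

    def get_layout(self, path):
        return Layout(self.stripe_size, sorted(self.devices), self.sizes[path])


def pnfs_read(mds, devices, path):
    # Step 1: ask the metadata server where the data is.
    layout = mds.get_layout(path)
    count = -(-layout.file_size // layout.stripe_size)  # ceiling division

    def fetch(index):
        # Step 2: read each stripe directly from the device that holds it.
        device_id = layout.devices[index % len(layout.devices)]
        return devices[device_id].read(path, index)

    # Stripes are fetched concurrently; map() returns them in stripe order.
    with ThreadPoolExecutor(max_workers=len(layout.devices)) as pool:
        return b"".join(pool.map(fetch, range(count)))


if __name__ == "__main__":
    devices = {"ds0": StorageDevice(), "ds1": StorageDevice(), "ds2": StorageDevice()}
    mds = MetadataServer(devices, stripe_size=4)
    mds.store("/data/file", b"parallel NFS stripes data")
    print(pnfs_read(mds, devices, "/data/file"))
```

The point of the design is that `MetadataServer` appears only in step 1; every byte of file data moves between the client and a `StorageDevice`, so adding devices adds data paths without adding load on the metadata server.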
Why is pNFS important?
pNFS is important because it brings together the benefits of
parallel I/O with the benefits of the ubiquitous standard for network
file systems (NFS). This will allow users to experience increased
performance and scalability in their storage infrastructure with the
added assurance that their investment is safe and their ability to choose
best-of-breed solutions remains intact.
NFS is the communications protocol standard when it comes to
network file systems. It is widely used in both HPC and Enterprise
markets today. The pNFS standard is appealing to both vendors and
customers alike. It allows HPC-centric storage vendors such as Panasas to
deliver the advantages previously delivered only via proprietary protocols
into the NFS markets. It allows Enterprise-focused storage vendors to
penetrate the HPC market more deeply. So for vendors it broadens their
markets. For customers, it means more options and competition for their
business. It also allows customers to simplify their IT environments by
standardizing on pNFS as their NAS protocol.
Benefits of Parallel I/O
- Delivers very high application performance
- Allows for massive scalability without diminished performance
- Takes advantage of available bandwidth
- Increases streams to parallel storage by using more than one client
- Lets clusters grow so they can perform larger calculations
NFSv4 Working group RFCs for pNFS
The NFSv4.1 standard document is large
because it includes a complete description of all of NFSv4 as well as the
new 4.1 features. There are two companion documents that describe the
object layout and block layout for pNFS storage.
- RFC 5661 - describes NFS version 4 minor version 1, including features retained from the base protocol and protocol extensions made subsequently.
- RFC 5662 - contains the machine readable XDR definitions for the protocol.
- RFC 5663 - provides a specification of a block based layout type definition to be used with the NFSv4.1 protocol. As such, this is a companion specification to NFS version 4 Minor Version 1.
- RFC 5664 - provides a specification of an object based layout type definition to be used with the NFSv4.1 protocol. As such, this is a companion specification to NFS version 4 Minor Version 1.
Download Source Code for pNFS
Download the latest development source code for the pNFS-enabled Linux kernel from linux-nfs.org
(provided under the GNU General Public License, Version 2) and the OpenSolaris code from opensolaris.org (under the OpenSolaris Binary License).