Network-attached storage (NAS), or file storage, provides shared storage in which access to data takes place at the level of entire files.
NAS is defined by the fact that it runs a file system and manages access to the storage media behind it. Meanwhile, users and application servers are presented with storage in the familiar drive letter or shared folder format.
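In practice, file-level access means mounting a share that the NAS exports over a protocol such as NFS or SMB. As an illustrative sketch (the hostname, export path and mount point below are hypothetical), a Linux client might mount an NFS share with an /etc/fstab entry like this:

```shell
# /etc/fstab entry mounting an NFS share exported by a NAS at boot
# "nas01", "/volume1/shared" and "/mnt/shared" are hypothetical names
nas01:/volume1/shared  /mnt/shared  nfs  defaults,_netdev  0  0
```

A Windows client would typically map the same share over SMB to a drive letter instead, which is where the familiar drive letter presentation comes from.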
NAS boxes originally emerged as standalone appliances. Later, scale-out NAS made it possible to link many NAS nodes into a cluster with a single file system.
More recently it has become possible to deploy NAS via software-defined storage and as part of a hyper-converged infrastructure setup.
It is also possible to consume file-access storage as a service in the public cloud.
If all you want is shared storage for several terabytes (TB) up to maybe the low tens of terabytes then a standalone NAS product could fit the bill. This could be for general file storage, or more specialised use cases such as storing surveillance footage.
Products range from consumer-grade offerings to those aimed at the smaller end of the small to medium-sized enterprise (SME) market, with suppliers that include Synology, Buffalo and QNAP. Form factor is usually some kind of micro server at the low end, with rackmount possible at the high end.
It is also theoretically possible to build your own, using commodity hardware with software from the likes of FreeNAS and XigmaNAS (formerly NAS4Free), both of which are based on FreeBSD.
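To give a flavour of the DIY route, a FreeBSD-based build typically pools disks and then exports a directory over NFS to clients on the local network. A minimal sketch, in which the export path and subnet are hypothetical values:

```shell
# /etc/exports on a FreeBSD-based DIY NAS
# "/tank/shared" and the 192.168.1.0/24 subnet are hypothetical values
/tank/shared -network 192.168.1.0 -mask 255.255.255.0
```

In practice, FreeNAS and XigmaNAS wrap this kind of configuration in a web interface, so such files are rarely edited by hand.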
The limitation of a standalone NAS appliance is what to do when you’ve filled it. If you get another NAS, you can end up with silos of storage distributed between hardware instances.
That’s exactly the limitation scale-out NAS was developed to sidestep. In scale-out NAS, discrete appliances share the same file system, a so-called parallel file system to which new hardware instances can be added.
Scale-out NAS is largely the preserve of larger SMEs and enterprises. It gets around the problem of NAS silos forming, and allows much larger volumes of shared storage capacity to be created.
For those reasons, scale-out NAS is the go-to choice for anyone who can afford it.
Key suppliers are all the big five storage players – NetApp, Dell EMC, HPE, Hitachi Vantara and IBM – but there are specialists too, such as Cloudian, Qumulo and WekaIO, which we will come across again when we look at hybrid and multicloud options.
Key use cases are anything where large volumes of unstructured data need to be kept.
Distributed file storage
In the past couple of years we have seen the rise of scale-out NAS products built to work across on-premise and cloud deployments. Examples here include the likes of Cloudian, Elastifile, Qumulo and WekaIO.
Cloudian’s HyperFile provides POSIX/Windows-compliant file access that can be deployed on-premise and in the cloud. It works via the supplier’s HyperStore environment, which is based on object storage and offers hybrid cloud operations across the Microsoft, Amazon and Google clouds, with claimed data portability between them.
Elastifile’s Cloud File System (ECFS) is software built to scale across thousands of compute nodes, and offers file, block and object storage. ECFS is designed to support heterogeneous environments, including public and private cloud environments, under a single global namespace.
The Qumulo File Fabric (QF2) is scale-out software that can be deployed on commodity hardware or in the public cloud. Cross-platform capabilities are provided through the ability to replicate file shares between physical locations.
WekaIO’s scale-out POSIX-compliant parallel file system is called Matrix. Matrix runs across a cluster of commodity storage servers or can be deployed in the public cloud and run on standard compute instances using local SSD block storage. It also claims hybrid operations are possible, with the ability to tier to public cloud services.
Cloud file storage
All the big three public cloud providers – Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform (GCP) – offer native NAS storage services. All three also offer better-performing file storage based on NetApp storage. Where Azure is different is that it provides file storage caching, aimed at providing low-latency access to a set of files in a single namespace at a number of service levels.
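As an example of consuming a native cloud file service, an AWS EFS file system can be mounted from a Linux instance over NFSv4.1. A sketch of the fstab entry, in which the file system ID and region are hypothetical and the mount options follow AWS’s recommended defaults:

```shell
# /etc/fstab entry mounting an AWS EFS file system over NFSv4.1
# the file system ID (fs-12345678) and region (us-east-1) are hypothetical
fs-12345678.efs.us-east-1.amazonaws.com:/  /mnt/efs  nfs4  nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport  0  0
```

Azure Files and Google Cloud Filestore are mounted in much the same way, over SMB and NFS respectively.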
Use cases targeted by AWS and Azure include big data analytics, web serving and content management, application development and testing, media workflows, database backups, and container storage. Google is a bit more modest in its proposed use cases than AWS and Azure: GCP targets video rendering, application workloads, web content management and home directories.
NAS in the cloud
Here, we’re talking about storage suppliers’ NAS products that are available in the cloud.
A recent heavyweight entrant is Dell EMC, which launched a Google Cloud version of its Isilon scale-out NAS, with OneFS cloud storage ready for production workloads.
The move allows customers to burst production use cases on Isilon storage to GCP, with up to 50PB (petabytes) of capacity available in a single namespace.
Other mainstream storage suppliers also offer instances of their file systems in the big three public clouds – AWS, Azure and GCP. NetApp has Cloud Volumes, while IBM has its Spectrum Scale parallel file system in AWS.
Cloud NAS gateways
Another way to leverage the cloud for file storage is via offerings from suppliers such as Nasuni, Ctera and Panzura, which provide access through software or hardware gateways.
Nasuni was a pioneer in the space, with a virtual appliance cloud gateway that cached active data on-premise and stored less frequently accessed files in public clouds. Nasuni’s cloud-native UniFS file system provides a single namespace for unstructured data. Nasuni is delivered via on-site filers that stage data off to the AWS cloud.
Meanwhile, Ctera’s HC range comprises so-called edge filers, which act as local capacity that can sync with other Ctera nodes globally and with cloud storage capacity. Ctera targets cloud bursting and collaboration across geographies.
Access can be from desktop and remote/mobile devices via Ctera Drive, application programming interfaces (APIs)/web access and a mobile app, with data held in the Ctera Global File System. From there, it can be accessed by users via Ctera instances or auto-synced to S3-compatible cloud storage.
Panzura is deployed on a server at each location where a local copy is held, with Google Cloud Storage as the main data retention location. Panzura’s main focus is cross-office collaboration delivered via an on-premise gateway and a global file-locking system with cloud storage.