How Many Spare Drives Should I Allocate For My Storage System?


This question comes up frequently and the answer is somewhat complex.  It depends on the number of drives in the storage system and the characteristics of those drives.

Remember, if you are clustering your storage system, each head in the cluster will need its own spares.  Additionally, if you are using syncmirror, you will need spares for each storage pool.

Additionally, if your storage system has both fibre channel and SATA drives you need different spares for each technology.  Fibre channel drives cannot act as spares for SATA drives and SATA drives cannot act as spares for fibre channel drives.

Ideally, if you have disks of different speeds or different sizes, you should have spares for each size and speed. It is possible to mix drive speeds, though. For example, if you have a mixture of 10k and 15k fibre channel drives in a storage system, you should allocate the different speed drives to different aggregates. If a 10k drive fails, a 15k drive can act as its spare. You will not be able to use the additional performance of the 15k drive, but the raid group of 10k drives will perform as before. You are just not getting the performance you paid for when you bought the 15k drive. The reverse is not true. If a 15k drive fails and the only spare available is a 10k drive, then the effective speed of the raid group of 15k drives will be reduced by the 10k drive. This is better than no spare at all, but now you have reduced performance.

It is also possible to use a large drive as a spare for a smaller drive. Again, this is supported but not recommended. The usable capacity of the large drive will be reduced to the size of the drive it is replacing. Of course, a small drive cannot act as a spare for a larger drive.
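The compatibility rules above can be sketched as a small Python function. This is only an illustration of the reasoning: the Drive class, the field names and the four result labels are my own invention, not anything Data ONTAP exposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drive:
    technology: str   # "FC" or "SATA"
    rpm: int          # e.g. 10000 or 15000
    size_gb: int


def spare_fit(failed: Drive, spare: Drive) -> str:
    """Classify a spare as 'no', 'degraded', 'wasteful' or 'ideal' (hypothetical labels)."""
    if spare.technology != failed.technology:
        return "no"          # FC and SATA cannot stand in for each other
    if spare.size_gb < failed.size_gb:
        return "no"          # a smaller drive cannot replace a larger one
    if spare.rpm < failed.rpm:
        return "degraded"    # a slower spare slows down the whole raid group
    if spare.rpm > failed.rpm or spare.size_gb > failed.size_gb:
        return "wasteful"    # works, but the extra speed or capacity goes unused
    return "ideal"


if __name__ == "__main__":
    failed = Drive("FC", 10000, 300)
    print(spare_fit(failed, Drive("FC", 15000, 300)))   # faster spare for a slower drive
    print(spare_fit(failed, Drive("SATA", 7200, 500)))  # wrong technology
```

Anything other than "ideal" is a reason to keep a dedicated spare for that size and speed.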

After this it becomes just a question of numbers.

You should have two spare drives in each category, up to 100 drives. Beyond that, NetApp recommends that you add one additional spare for each additional 84 drives. Here is a table from TR-3437:

Number of Shelves    Number of Disks    Recommended Spares
2                    28                 2
6                    84                 2
8                    112                3
12                   167                3
24                   336                4
36                   504                6
72                   1008               12

Data ONTAP and Space Management – Part 8

Let’s take a look at the snap autodelete command.  We will begin by checking the autodelete status:

The command: “snap autodelete vol1 show” returns the current status of the autodelete options for vol1.

By default, autodelete is turned off, so let’s start by turning it on:

It is now turned on. The commitment option is set to try. The available values are try and disrupt. This option controls which snapshots Data ONTAP may delete if it needs to recover space in the volume. When set to try, only snapshots which are not owned by data protection utilities may be deleted. Data protection utilities include the dump command, ndmpcopy and the mirroring utilities. Snapshots owned by backing utilities such as LUN or volume clones are also protected. If the option is set to disrupt, then snapshots which are not locked may be deleted.

The trigger option determines when Data ONTAP will start deleting snapshots for that volume. Trigger can be set to volume, snap_reserve or space_reserve. If the trigger is set to volume, then once the volume has reached 98% of its capacity, Data ONTAP will begin deleting snapshots. If the trigger is set to snap_reserve, then Data ONTAP will begin deleting snapshots when the volume's snapshot reserve is using 98% of its capacity. The space_reserve option is a little more complex. If the trigger is set to space_reserve, then Data ONTAP begins deleting snapshots once the space reserved is at 98% capacity and the volume has used all of its snap reserve capacity.
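To make the three trigger settings concrete, here is a small Python sketch of the decision as I have described it. It is a model for reasoning, not Data ONTAP code: the function name and the percentage arguments are made up for the example.

```python
THRESHOLD_PCT = 98  # the fill level described in the paragraph above


def autodelete_triggered(trigger: str,
                         volume_used_pct: float,
                         snap_reserve_used_pct: float,
                         space_reserve_used_pct: float) -> bool:
    """Return True if snapshot deletion would start under the given trigger."""
    if trigger == "volume":
        return volume_used_pct >= THRESHOLD_PCT
    if trigger == "snap_reserve":
        return snap_reserve_used_pct >= THRESHOLD_PCT
    if trigger == "space_reserve":
        # both conditions must hold: reserved space nearly full AND
        # the snap reserve completely used
        return (space_reserve_used_pct >= THRESHOLD_PCT
                and snap_reserve_used_pct >= 100)
    raise ValueError(f"unknown trigger: {trigger}")


if __name__ == "__main__":
    print(autodelete_triggered("volume", 99, 40, 10))
    print(autodelete_triggered("space_reserve", 99, 80, 99))
```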

Once Data ONTAP begins deleting snapshots it will continue until it reaches the value set in the target_free_space option.  It will apply this value against the “container” that reached the 98% threshold.  By default, this is set to 20% free space.  In our example, it would continue deleting snapshots until the volume had 20% free space available.

The delete_order option controls whether Data ONTAP will delete snapshots from oldest to newest or newest to oldest.  Valid values for this option are oldest_first or newest_first.

There may be some snapshots which we would like to avoid deleting if possible. The defer_delete option will save these snapshots if possible. Valid options are scheduled, user_created, prefix or none. If we set this value to scheduled, then snapshots that were created by Data ONTAP’s snapshot scheduler will be the last ones deleted. If set to user_created, then manual snapshots will be the last ones deleted. If set to prefix, then snapshots whose names begin with the prefix designated by the prefix option will be saved for last. None means we are not trying to protect any particular classification of snapshots.

Finally, if we are trying to protect a class of snapshots by their name prefix, the prefix value is a string which contains the prefix name.  This string can be up to 15 characters long.
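Putting delete_order, defer_delete and target_free_space together, the selection logic can be sketched roughly like this in Python. This is my own simplified model: the Snapshot class is hypothetical, the commitment option is ignored, and I assume that deleting a snapshot frees its whole size, which is only true for blocks that no other snapshot references.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    name: str
    created: int      # larger means newer
    size_mb: int
    scheduled: bool   # True if made by the snapshot scheduler


def deletion_order(snaps, delete_order="oldest_first",
                   defer_delete="none", prefix=""):
    """Return snapshots in the order they would be considered for deletion."""
    def deferred(s):
        if defer_delete == "scheduled":
            return s.scheduled
        if defer_delete == "user_created":
            return not s.scheduled
        if defer_delete == "prefix":
            return s.name.startswith(prefix)
        return False  # "none": nothing is held back

    newest = delete_order == "newest_first"
    by_age = sorted(snaps, key=lambda s: s.created, reverse=newest)
    # stable sort: non-deferred snapshots first, age order kept within each group
    return sorted(by_age, key=deferred)


def autodelete(snaps, free_mb, volume_mb, target_free_pct=20, **options):
    """Delete snapshots until the free-space target is met; return (names, free_mb)."""
    deleted = []
    for snap in deletion_order(snaps, **options):
        if free_mb * 100 >= target_free_pct * volume_mb:
            break
        free_mb += snap.size_mb
        deleted.append(snap.name)
    return deleted, free_mb
```

With a 100 MB volume that has 2 MB free and three 10 MB snapshots, this model deletes the two oldest snapshots and stops once 22 MB is free, since that is past the 20% target.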


Data ONTAP and Space Management – Part 7

With the combination of Data ONTAP 7.2 and flexible volumes we have some new options for managing space in volumes that contain LUNs.  Obviously, we still have to provide a free block when the host wants to write, but now we have a new block pool in the aggregate we can pull from.  We also have the option of automatically deleting snapshots as a source of free blocks.

If we set the fractional reserve to 0% and then set the autosize option on a volume, the volume can grow when it needs more space by pulling it from the aggregate.  Effectively the aggregate is supplying the pool of available blocks.

The aggregate's pool can be shared across all the volumes it contains. It is also very unlikely that all the LUNs in all the volumes will suddenly become extremely active at the same time. Therefore it may make sense to set aside a free space pool that is less than what we would have provided to support a 100% space reserve for every LUN in the aggregate.

This is especially true when we consider that the autosize option is typically combined with automatic deletion of snapshots.

The following command configures vol2’s autosize options:

In this case I have configured vol2 to grow to a max size of 1 Gigabyte in 100 megabyte steps.  I can limit the maximum size the volume can achieve and I can configure the size of the increments it uses to grow.

Automatically growing volumes is generally combined with automatic deletion of snapshots.  You can control which policy you want to implement first with the following option:

In this example vol2 will first try growing the volume before deleting snapshots. Once the volume has reached maximum size it will start deleting snapshots to free blocks. Remember, we are not changing the size of the LUN; we are adding fresh blocks to the volume which can be used to provide the LUN with substitute blocks for block updates or overwrites.

Which is better depends on your storage environment.  You can control this behavior individually for each volume.  If you tend to keep a lot of snapshots, perhaps some longer than necessary just to be safe, then it might be better to delete snapshots first.  If you have some extra space in your aggregate, perhaps for performance reasons, then it might be better to auto-grow.
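Here is a rough Python sketch of the two behaviors working together. The policy names "grow_first" and "delete_first" are labels I made up for the illustration, and I treat 1 gigabyte as 1000 MB to keep the numbers round.

```python
def grow_steps(size_mb: int, max_mb: int, increment_mb: int):
    """Yield each size the volume passes through while auto-growing."""
    while size_mb < max_mb:
        size_mb = min(size_mb + increment_mb, max_mb)
        yield size_mb


def next_action(size_mb: int, max_mb: int, policy: str, snapshots_left: int) -> str:
    """Pick what to do when the volume needs free blocks (hypothetical model)."""
    can_grow = size_mb < max_mb
    can_delete = snapshots_left > 0
    if policy == "grow_first":
        preferred = [("grow", can_grow), ("delete_snapshot", can_delete)]
    else:  # "delete_first"
        preferred = [("delete_snapshot", can_delete), ("grow", can_grow)]
    for action, possible in preferred:
        if possible:
            return action
    return "out_of_space"


if __name__ == "__main__":
    # a volume at 800 MB growing toward a 1000 MB maximum in 100 MB steps
    print(list(grow_steps(800, 1000, 100)))
    print(next_action(1000, 1000, "grow_first", snapshots_left=3))
```

Once the volume is at its maximum size, the grow-first policy falls through to deleting snapshots, which matches the behavior described for vol2.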

Next we’ll take a look at how to control which snapshots are deleted.


Space Management and Data ONTAP – Part 6

We have been looking at snapshot space management primarily from a NAS perspective.  From this perspective the primary technique used to deal with storage tied to the snapshots is the mechanism of the snapshot reserve.  And we have seen that all the snapshot reserve really does is set aside some space when the volume is created so that it will be available for overwrites of blocks that are tied to snapshots.  Essentially this is used to hide space usage from snapshots, but it does not affect our ability to execute a snapshot.

The situation with SAN and LUNs drives off of these ideas, but is slightly more complex.  We will be looking at this from the perspective of traditional volumes and some new capabilities available with flexible volumes.

Essentially, the problem is the same.  Once blocks in the WAFL file system are part of a snapshot they are protected.  They become read-only blocks, so to do updates we have to be able to supply unused blocks.  There is more than one way to do this.

In the world of traditional volumes we have limited options.  Traditional volumes are also aggregates.  They own their own disks and the problem has to be resolved within this context.  This is done through space reservations and the fractional reserve.

Here is an example of the lun setup script. One of the prompts asks whether we want the LUN to be space reserved and explains that this will guarantee that writes to the LUN will never fail.

Why do we need such a guarantee?  If we are not using snapshots then we really don’t need this guarantee.  But if we are taking snapshots then some blocks within the LUN will become read-only when the snapshot is taken.  Since the host operating system is not aware of the snapshot, Data ONTAP must have a pool of blocks to supply the host for overwriting blocks held in snapshots.  This pool is set aside when the LUN is created when we say “yes” to the space reserved prompt.

By default the size of this pool is equivalent to the size of the LUN. This means the volume must be large enough to contain the LUN, the space reserved for overwrites and the blocks tied up by snapshots. To support a 100 MB LUN we would need a volume of at least 200 MB plus some amount to support the blocks tied up in snapshots.

We can adjust the space reservations with a parameter called the fractional reserve. The fractional reserve is applied to space reservations. It is set to 100% by default so the block pool is able to support the LUN even if the host were to update every block in the LUN while simultaneously every block in the LUN were locked up in one or many snapshots. This is why we can say that data writes will never fail. If taking a snapshot would leave fewer free blocks in the volume than the amount of space reserved, then the snapshot would fail. That way the number of free blocks in the volume never drops below the number of blocks allocated to the LUN.

We can reduce the amount of space reserved by adjusting fractional reserve.  The following command would set the fractional reserve to 70% instead of 100% for vol2:

This decreases the amount of space reserved, but it is possible to create a situation where there may not be free blocks available in the volume when the host tries to update a block in the LUN.  If this happens the write will fail.
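The sizing arithmetic can be written down in a few lines of Python. This follows the simplified model used in this post, where the overwrite pool is the fractional reserve percentage applied to the full LUN size. The function name is mine.

```python
def min_volume_mb(lun_mb: float,
                  fractional_reserve_pct: float = 100,
                  snapshot_mb: float = 0) -> float:
    """Smallest volume that holds the LUN, its overwrite pool and snapshot blocks."""
    overwrite_reserve = lun_mb * fractional_reserve_pct / 100
    return lun_mb + overwrite_reserve + snapshot_mb


if __name__ == "__main__":
    print(min_volume_mb(100))        # default 100% fractional reserve
    print(min_volume_mb(100, 70))    # fractional reserve lowered to 70%
```

For the 100 MB LUN this gives 200 MB at the default and 170 MB at 70%, before adding whatever the snapshots themselves hold. The 30 MB saved is exactly the part of the LUN that is no longer covered if every block were overwritten.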

With traditional volumes, space reservations combined with a fractional reserve of 100% was the only way we could be sure LUN writes would never fail.  With flexible volumes and Data ONTAP 7.2, we have some new options.


Space Management and Data ONTAP – Part 5

Eventually, as more files are changed, more and more blocks will be assigned from the snapshot reserve.  The df -k command will show the amount of space being used within the active file system and in the .snapshot directory.

There is nothing magical about the 20% number.  Sometimes we may want to decrease the percentage allocated to the snap reserve.  Sometimes we may want to increase it.  It all depends on the rate at which data changes in the volume and how long we want to keep our snapshots available.

Suppose I want to change the snapshot reserve in vol3 to 30% instead of 20%.

Notice the space allocation on vol3 has now changed.

As you can see, it is possible that more than 20% of the blocks within the volume may actually be assigned to the snapshot reserve.  This reduces the amount of space available to the active file system.  In this case we increased the space allocated to the snapshot reserve.  We could just as easily reduce the size of the snapshot reserve.

If these are NAS volumes – used for CIFS or NFS – then the creation and deletion of snapshots is probably being controlled by Data ONTAP’s snapshot scheduler.  This is configured either with the snap sched command or from filerview.  Filerview’s screen to modify the parameters for vol3 looks like this:

Notice we can change the percentage of space preallocated for snapshot blocks here as well as with the snap reserve command.

What I’m interested in here is the Number of Scheduled Snapshots to Keep options.  Logically, the longer I keep snapshots around the greater the difference will be between the blocks in the active filesystem and the oldest snapshot.  Remember that blocks pointed to by any snapshot cannot be updated – they are read-only - as long as the snapshot exists.

The snapshot scheduler controls how long the snapshots will be kept and when they will occur.  For example, weekly snapshots occur at midnight on Sunday nights.  If the scheduler is set to retain the last 6 weekly snapshots it will delete the oldest weekly snapshot after it has accumulated 6 weekly snapshots. The last 6 weekly snapshots will be maintained and older snapshots will be automatically deleted.

The same is true for nightly snapshots (which are taken at midnight every day except Sunday) and hourly backups (which are taken on the hours specified).  The number scheduled to be kept will limit how far back we can go to recover files.
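The retention behavior is easy to picture as a fixed-length queue. The Python sketch below uses made-up snapshot names (weekly-1, weekly-2, ...) purely for illustration; it is not how Data ONTAP names or rotates its snapshots.

```python
from collections import deque


def run_schedule(keep: int, weeks: int):
    """Return the weekly snapshots still retained after the given number of weeks."""
    kept = deque(maxlen=keep)   # the oldest entry drops off automatically
    for week in range(1, weeks + 1):
        kept.append(f"weekly-{week}")
    return list(kept)


if __name__ == "__main__":
    # keep the last 6 weekly snapshots, run for 8 weeks
    print(run_schedule(keep=6, weeks=8))
```

After 8 weeks only weeks 3 through 8 remain, so the furthest back we can recover a file is 6 weeks.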

Ideally equilibrium is reached, with the percentage of blocks allocated to the snapshot reserve adequate for the length of time we want to have the snapshots available.  As snapshots are automatically deleted blocks which are pointed to uniquely from that snapshot will be returned to the free space pool in the active file system.

If more blocks are needed than have been set aside by the snapshot reserve, then they will be taken from the active file system. (This is the default behavior and can be changed.) A user will see a situation where deleting files may not actually free any space. This is because there are no available free blocks in the snapshot reserve. Free blocks will not be available until some snapshots are deleted and, if no other snapshot points to them, the blocks will be returned for reuse.
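A quick Python sketch shows how snapshot usage beyond the reserve eats into the active file system. The function and its field names are my own, and the numbers in the example are hypothetical.

```python
def df_view(volume_mb: float, reserve_pct: float,
            active_used_mb: float, snapshot_used_mb: float) -> dict:
    """Model how space is split between the active file system and snapshots."""
    reserve_mb = volume_mb * reserve_pct / 100
    active_total = volume_mb - reserve_mb
    # snapshot blocks beyond the reserve spill into the active file system
    spill = max(0, snapshot_used_mb - reserve_mb)
    active_free = max(0, active_total - active_used_mb - spill)
    return {"reserve_mb": reserve_mb, "spill_mb": spill, "active_free_mb": active_free}


if __name__ == "__main__":
    # 100 MB volume, 20% reserve, 50 MB of live data, 30 MB held by snapshots
    print(df_view(100, 20, 50, 30))
```

In this example the snapshots hold 10 MB more than the reserve, so the active file system has 20 MB free instead of 30 MB.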

In extreme situations, it is possible that the active file system may actually run out of blocks and it will be impossible to change files in the volume or add new files.  The storage system administrator will have to intervene, either by growing the volume or by deleting old snapshots so that blocks associated with them can be reused.

For volumes used to support NAS applications this is usually adequate. This situation should be extremely rare. After the system administrator deletes some snapshots or grows the volume, users can proceed normally. For volumes that contain LUNs, the situation is more complex, and we’ll be looking at some additional options that might be useful for these volumes next time.


Space Management and Data ONTAP – Part 4

We’ve been discussing how the WAFL file system allocates space for volumes, including the meaning of space guarantees, space reservations and fractional reserve. Now I’d like to look a little deeper at snapshots and space management from a snapshot perspective.

Basically, when a snapshot occurs all the blocks associated with data in the volume are frozen.  Sometimes people use the term snapshot “copy”.  This is accurate in a certain sense.  The snapshot looks like a copy of the data in the file system.  For the most part, we can generally treat the snapshot as if it were a read-only copy of the data in the file system.

In fact, the blocks of data were not copied.  They were simply frozen in place.  Each snapshot creates a version of the file system.  The blocks that belong to files associated with each version cannot be altered.  Currently WAFL supports up to 255 versions, or snapshots, per volume.

After the snapshot occurs we can continue to change files and create new files in the active version of the file system.  We can do this because WAFL supports multiple virtual inode systems to track each version of the filesystem.  Blocks are allocated from free space available to the volume.  By design, WAFL does not overwrite data.  It writes updates into free blocks.  Obviously this integrates beautifully with snapshots, but it does raise an interesting question:  since we can have multiple versions of the files in a volume how do we manage space utilization of the volume?  Across all the “versions” of the file system, I may have only a single version of one file while I could have 256 versions (counting the one in the active file system) of another.

Generally the way this is done is through something called the snapshot reserve.

By default, when a volume is created, 20% of the space within the volume is assigned to the snapshot reserve.  If I want to check this for a volume – vol3 in my example – I would use the following command:

You can also check with the df command.  For example:

You can see the amount of space allocated to each volume is divided into two categories: the active file system and the .snapshot directory.  When volumes vol1, vol2 and vol3 were created 100 megabytes were allocated to each.  For each volume, this has been subdivided into approximately 80 MB for the active file system and 20 MB for the .snapshot directory.

Suppose we were to create a file in vol3. Some blocks would be allocated to contain this file in vol3. If we take a snapshot of vol3, there is still only one version of the file. No blocks will be allocated from the snapshot reserve at this point. However, if we make changes to this file, then there will be some blocks which are different between the file that exists in the snapshot and the file that exists in the active file system. The blocks associated with the file in the snapshot that are different from the blocks in the active file system will be assigned, not from the active file system, but from the snapshot reserve.

This is even clearer with the example of a deleted file. Suppose we take a snapshot of a volume that contains some files and then we delete the files from the active file system. The blocks containing the files cannot actually be freed because the snapshot still references them. As long as the snapshot exists the files will still be there. However, the space occupied by the blocks will now be counted against the snapshot reserve and an equivalent amount of free space in the snapshot reserve will be allocated to the active file system. This makes it look, from a space perspective, like the files were actually deleted when in fact they continue to be stored in the volume and are now referenced through a .snapshot directory.
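The deleted-file case can be modeled with a toy Python class. This is a deliberately simplified sketch of the idea, not of WAFL itself: the Volume class is hypothetical, and it rewrites whole files into fresh blocks where WAFL works block by block.

```python
class Volume:
    def __init__(self):
        self.active = {}       # file name -> tuple of block ids
        self.snapshots = {}    # snapshot name -> frozen file map
        self._next_block = 0

    def write(self, name, nblocks):
        """Write a file version into fresh blocks; nothing is overwritten."""
        start = self._next_block
        self._next_block += nblocks
        self.active[name] = tuple(range(start, start + nblocks))

    def delete(self, name):
        del self.active[name]

    def snapshot(self, snap_name):
        self.snapshots[snap_name] = dict(self.active)  # copies pointers, not data

    def delete_snapshot(self, snap_name):
        del self.snapshots[snap_name]

    @staticmethod
    def _blocks(file_map):
        return {b for blocks in file_map.values() for b in blocks}

    def active_blocks(self):
        return self._blocks(self.active)

    def snapshot_only_blocks(self):
        """Blocks kept alive only because some snapshot still points at them."""
        held = set()
        for file_map in self.snapshots.values():
            held |= self._blocks(file_map)
        return held - self.active_blocks()


if __name__ == "__main__":
    vol = Volume()
    vol.write("report", 4)
    vol.snapshot("s1")
    vol.delete("report")
    print(len(vol.active_blocks()), len(vol.snapshot_only_blocks()))
```

After the delete, the active file system holds no blocks for the file, yet the four blocks are still in the volume and are charged to the snapshot. They are only released when the snapshot itself is deleted.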


Space Management and Data ONTAP – Part 3

The next term I want to take a look at is fractional reserve.  This term is often misunderstood.

Fractional reserve is applied to the space reserved blocks.  Without space reservations, fractional reserve has no meaning.  In the previous discussion, I described how space reservation is used to make sure there are enough free blocks in the volume to overwrite all the blocks in a LUN which are bound by snapshots.  Let’s take a look at a sample volume in my simulator:

Notice the fractional reserve is set to 100%.  This is the default.  It means that all of the blocks (100%) protected by space reservations will be shielded with free blocks.

If we were to fill up all the blocks in the LUN and then take a snapshot, space reservations would apply to all of the blocks in the LUN and the snapshot would fail unless the number of free blocks in the volume were at least equal to the number of blocks (in this case, all of them) that would be needed for overwrites in case the entire LUN were to be changed.

But this is a worst case scenario.  This is the default behavior and if we implement this solution then writes to the LUN will always succeed.  We will never be without free blocks, no matter how high the rate of change in the LUN.

In some scenarios, this solution is overkill.  We may know that the rate of change for this LUN is predictable and is relatively slow.  It is reasonable to set the fractional reserve to less than 100% in such a situation.

For example, suppose we are keeping 5 daily snapshots online for a database that is updating at a rate of 5% per day.  We can get a very accurate estimate of the historical rate of change by using the snap delta command.  Here is an example of the snap delta command:

Notice we are given the change rate in both KB changed between snapshots and also the rate of change in KB/hour.

Using SnapDrive, we can change the fractional reserve to 50% instead of 100%. Now snapshots will succeed as long as we have the blocks necessary to overwrite half of the blocks in the LUN. Since we are only updating 5% per day, this provides enough blocks for 10 days. As long as we are automatically deleting snapshots after 5 days, we should never run out of space and we have an adequate safety margin even if the rate of change were to double for two or three days. SnapDrive also has the ability to monitor and notify the storage administrator if thresholds for free space or rate of change are exceeded.
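The arithmetic behind that estimate is simple enough to check in Python. The sketch assumes the worst case where every day's changes land on different blocks; the function names are mine.

```python
def days_of_headroom(fractional_reserve_pct: float, daily_change_pct: float) -> float:
    """Days of overwrites the reserve can absorb if no snapshots are deleted."""
    return fractional_reserve_pct / daily_change_pct


def reserve_is_adequate(fractional_reserve_pct: float,
                        daily_change_pct: float,
                        days_retained: int) -> bool:
    """True if the reserve outlasts the snapshot retention period."""
    return days_of_headroom(fractional_reserve_pct, daily_change_pct) >= days_retained


if __name__ == "__main__":
    print(days_of_headroom(50, 5))          # the example in the text
    print(reserve_is_adequate(50, 10, 5))   # change rate doubled for the whole period
```

With a 50% reserve and 5% daily change the reserve covers 10 days, twice the 5-day retention. Even a sustained doubling of the change rate still just fits, but a tripling would not.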

Leaving fractional reserve set to 100% is the most conservative solution and will ensure the LUN remains writable in even the worst case scenario. In some situations this is unnecessary and we can recover that space for other purposes.


Space Management and Data ONTAP – Part Two

The next topic I’d like to examine is space reservations.  Space reservation applies, not to volumes, but to files and LUNs.  In many ways, space reservations for files (and LUNs are a special case of a file) are very similar to space guarantees for volumes. 

Volume space guarantees mean that space is set aside to support the volume at the time it is created.   The same is true for files with space reservations enabled.  Space is set aside to support the file/LUN at the time it is created and before it is actually used.

A difference between the two is the way they interact with snapshots.  Blocks that are allocated to snapshots are taken from the pool of blocks that are allocated to the volume.  By default, 20% of the blocks in the volume are set aside for this purpose. 

When a snapshot occurs, blocks associated with files in the active file system get another pointer to identify them to the snapshot.  If blocks associated with a file are updated, they cannot be overwritten because that would change the snapshot data.  New blocks are allocated to the file in the active filesystem to support the update and the changed blocks associated with the previous version of the snapshot are allocated to the snapshot reserve.  As long as there are free blocks available within the volume updates succeed.  But if there were no free blocks available then we would not be able to update the file, because the blocks already associated to the file are locked by a snapshot and cannot be changed.

Let us suppose the file in question is a LUN. The host operating system assigned to the LUN assumes it has exclusive control of the blocks assigned to it. It assumes the blocks are part of a disk drive to which it has exclusive access. It is completely unaware of the existence of snapshots taking place under the control of Data ONTAP, so it is not aware that some of the blocks within the LUN are not changeable.

The purpose of space reservation is to ensure that space remains available in the volume so that updates to changed blocks will succeed.  The volume will have to substitute free blocks from the volume to the LUN for overwrites. 

This is implemented at the time the snapshot takes place. Before the snapshot succeeds, Data ONTAP checks to see if there will be enough blocks free after the snapshot to update the entire LUN. If a single LUN were the sole occupant of the volume, then space reservations would ensure that the number of free blocks in the volume was equal to or greater than the number of blocks in the LUN that are tied to snapshots. This is an important distinction. Available blocks in the LUN that are not tied to snapshots do not need to be protected. They can be changed. Only the blocks that are tied to snapshots need to be protected by free blocks. If this rule is violated (that is, there are not enough free blocks to provide overwrite protection), then the snapshot fails.
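The rule can be stated as a one-line check. Here is a minimal Python sketch for the single-LUN case described above; the function name and the block counts in the example are hypothetical.

```python
def snapshot_allowed(free_blocks: int, lun_blocks_tied_to_snapshots: int) -> bool:
    """A snapshot succeeds only if free blocks can cover every snapshot-held LUN block."""
    return free_blocks >= lun_blocks_tied_to_snapshots


if __name__ == "__main__":
    # a LUN with 1,000 blocks written, all of which the new snapshot would lock
    print(snapshot_allowed(free_blocks=1200, lun_blocks_tied_to_snapshots=1000))
    print(snapshot_allowed(free_blocks=800, lun_blocks_tied_to_snapshots=1000))
```

Note that the count on the right is the number of LUN blocks tied to snapshots, not the nominal size of the LUN. Blocks the host has never written do not need protection.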

 


Space Management and Data ONTAP – Part One

The subject of space management always seems to generate some interesting discussions, particularly in my Data ONTAP Fundamentals class.  I think part of the reason for this is the terminology surrounding some of these concepts.

The first concept I want to clarify is space guarantees as they apply to flexible volumes. A flexible volume is actually a WAFL file system inside an aggregate. It is possible to have one or more of these volumes/filesystems inside a single aggregate. This seems simple enough, but, as with many things WAFL related, it is not as simple beneath the surface. WAFL just behaves differently compared with other filesystems that I am familiar with. For example, one of the things I noticed when I first began working with Data ONTAP is that WAFL filesystems were never formatted.

Normally, when a filesystem is created, a particular set of blocks is assigned to that filesystem and then organized by writing certain data structures onto those blocks. A WAFL filesystem behaves more like the concept of sparse files in UNIX. When a sparse file is created, the metadata is updated like a normal file, but blocks containing empty space are not updated. The actual size of the file and the apparent size of the file will be different. The actual size of the file will change as data is updated or written into the file, but the apparent size of the file, as reported in the file system, will not change.

WAFL volumes are like this. When the volume is created, metadata within the aggregate is updated to reflect the presence of the volume, but specific blocks are not actually assigned to the volume until we need them to store something. When a volume is created we have several options that control when the space for that volume will be set aside, but specific blocks will not be assigned until they are needed.
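If you have never looked at a sparse file, the following Python sketch creates one and compares its apparent size with the space actually allocated. It illustrates the UNIX analogy only, not WAFL. Whether the gap is really left unallocated depends on the file system holding the temporary directory, and st_blocks is a POSIX field that is not available on Windows.

```python
import os
import tempfile


def sparse_file_sizes(apparent_bytes: int):
    """Create a sparse file and return (apparent size, bytes allocated or None)."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
        f.seek(apparent_bytes - 1)   # jump ahead without writing the gap
        f.write(b"\0")               # one real byte at the very end
    try:
        st = os.stat(path)
        # st_blocks counts 512-byte units on POSIX systems
        blocks = getattr(st, "st_blocks", None)
        allocated = None if blocks is None else blocks * 512
        return st.st_size, allocated
    finally:
        os.remove(path)


if __name__ == "__main__":
    apparent, allocated = sparse_file_sizes(10 * 1024 * 1024)
    print("apparent size:", apparent)
    print("allocated:", allocated)
```

On a file system that supports sparse files, the allocated figure is typically a tiny fraction of the 10 MB apparent size.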

If space guarantee is set to volume – the default behavior - then a certain number of blocks from the total in the aggregate will be set aside for use by that volume. This “guarantees” that I will have enough space within the aggregate to supply that volume. Here is what that looks like in FilerView:

img1_112309

We can check the settings on a volume by using the vol options command:

img2_112309

Notice here that guarantee=volume is set for vol3.

If space guarantees are set to none when the volume is created, metadata within the aggregate is updated to reflect the presence of the volume, but the number of blocks that would reflect the size of the volume is not subtracted from the total number of blocks in the aggregate. In this case, it is possible to create a volume bigger than the aggregate. This is called “thin provisioning.” If this volume were a Windows share, then a Windows user who checked the properties value for the volume would see the volume size that the storage administrator used to create the volume, but he would not be able to actually write more data into the volume than could be accommodated by the aggregate.

It is also possible to have more than one volume in an aggregate without space guarantees. In this situation, blocks will be assigned to the volumes as they are needed on a first come, first served basis. Anyone looking at the volume from the user’s perspective would see the volume size the storage admin created, even though there are not enough blocks to supply the space reported.

In either case, it is up to the storage administrator to stay ahead of the situation and grow the aggregate before the blocks are actually needed.
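The difference between the two guarantee settings can be sketched with a toy Python model. The Aggregate class below is hypothetical and heavily simplified: it only tracks space set aside at creation time and ignores the blocks that thin volumes consume later.

```python
class Aggregate:
    def __init__(self, size_mb: int):
        self.size_mb = size_mb
        self.set_aside_mb = 0    # space held back for guarantee=volume volumes
        self.volumes = {}        # name -> (size_mb, guarantee)

    def create_volume(self, name: str, size_mb: int, guarantee: str = "volume"):
        if guarantee == "volume":
            if self.set_aside_mb + size_mb > self.size_mb:
                raise ValueError("not enough space in the aggregate to guarantee this volume")
            self.set_aside_mb += size_mb
        elif guarantee != "none":
            raise ValueError(f"unknown guarantee: {guarantee}")
        self.volumes[name] = (size_mb, guarantee)

    def provisioned_mb(self) -> int:
        return sum(size for size, _ in self.volumes.values())

    def overcommitted(self) -> bool:
        """True when the volumes promise more space than the aggregate holds."""
        return self.provisioned_mb() > self.size_mb


if __name__ == "__main__":
    aggr = Aggregate(1000)
    aggr.create_volume("thin1", 800, guarantee="none")
    aggr.create_volume("thin2", 800, guarantee="none")
    print(aggr.provisioned_mb(), aggr.overcommitted())
```

Two 800 MB thin volumes fit happily in a 1000 MB aggregate on paper. The overcommitted flag is the administrator's cue to watch usage and grow the aggregate in time.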


Flash Drives and NetApp

NetApp recently announced their direction on flash drives. The most interesting aspect of this was not just that they intend to support them, but that support will be implemented not just as plug in disk drives, but also as an extension of the WAFL cache.

One of the problems with spinning disk drives as we know them is the capacity/performance tradeoff. Drive capacity has been increasing at a faster rate than drive performance.

In many, if not most, workloads there is a relationship between capacity and demand. It may or may not be linear, but it is there. The more capacity I have the more requests there will be to access that data. If we put larger and larger amounts of data behind fewer spindles then we are bound to have performance problems. We will be constrained by the performance of the spindles. This problem will be greatest for heavy random read workloads. WAFL is very efficient at turning random writes into sequential writes, so the problem is more likely to occur with reads.

The traditional way to attack this problem is to increase the number of disk drives. This means buying more capacity than we need to support the number of IOPs we require. In addition to the up-front cost of the drives, we are going to be spending more for power and cooling to keep those drives spinning.
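A back-of-the-envelope calculation in Python shows why this happens. All of the numbers in the example are hypothetical, chosen only to make the point, and the sketch ignores RAID overhead and spares.

```python
import math


def drives_needed(capacity_gb: float, iops: float,
                  drive_gb: float, drive_iops: float):
    """Return (drives to buy, drives needed for capacity, drives needed for IOPs)."""
    for_capacity = math.ceil(capacity_gb / drive_gb)
    for_iops = math.ceil(iops / drive_iops)
    return max(for_capacity, for_iops), for_capacity, for_iops


if __name__ == "__main__":
    # hypothetical workload: 10,000 GB of data that must sustain 20,000 random-read IOPs
    # hypothetical drive: 1,000 GB and 100 IOPs per spindle
    total, for_capacity, for_iops = drives_needed(10_000, 20_000, 1_000, 100)
    print(total, for_capacity, for_iops)
```

In this made-up case 10 drives would hold the data, but 200 are needed to serve the reads, so 95% of the purchased capacity exists only to provide spindles.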

Recently, NetApp announced Performance Accelerator Modules. These are 16GB DRAM-based intelligent read cache boards that plug into PCIe slots in the storage controller. By increasing the memory available for WAFL to cache read data, PAM modules can substantially reduce the latency of read requests. Since these requests are fulfilled without ever reaching the storage stack, they are very fast, faster even than the request could be returned from a flash drive. NetApp is currently working on second-generation PAM cards that will use higher density Flash technology to increase the capacity of these cards.

The issues with Flash drives revolve primarily around price/performance. Flash drives can provide 20-30 times the random-read performance of spinning disks at 10 times the cost per megabyte. The current high cost eliminates many applications, but they are suitable for high ROI applications. Where they are cost effective, flash drives may be an excellent solution.

In lesser environments, caching solutions may be more cost effective. Caches dynamically adjust to changing workloads, holding the hottest data at any given point in time. This may provide the optimal solution when resources are scarce.
