Performance Acceleration Module

Since I come from a bit of a database background, I often look at storage performance through a DBA’s lens.  I look askance at the trend toward ever larger disk drives.  More spindles mean better performance, and drive capacity has been growing much faster than drive performance.

In the database world, one of the ways we deal with this problem is with a large memory cache.  (In Oracle, this cache lives in the SGA.)  Ideally, most database read requests can be satisfied from local cache memory.  Not only does this dramatically reduce latency on read requests, but the load is removed from the storage system so it can use the freed bandwidth for other operations.  Obviously, this doesn’t work for every workload, but it makes a dramatic difference in many common situations.
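To put rough numbers on the caching argument, here is a minimal Python sketch. The latency figures (0.1 ms for a cache hit, 8 ms for a disk read) and both function names are my own assumptions for illustration, not measurements from any real system.

```python
def effective_read_latency_ms(hit_ratio, cache_ms, disk_ms):
    """Average read latency when a fraction `hit_ratio` of reads hit the cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms


def disk_reads_per_second(total_reads_per_second, hit_ratio):
    """Reads that still reach the storage system after the cache absorbs the hits."""
    return total_reads_per_second * (1.0 - hit_ratio)


# Illustrative figures only: 0.1 ms per cache hit, 8 ms per disk read.
for ratio in (0.0, 0.5, 0.9, 0.99):
    print(ratio, round(effective_read_latency_ms(ratio, 0.1, 8.0), 3))
```

The second function shows the other half of the benefit: every read served from cache is a read the spindles never see.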

For other workloads the problem is not so easily solved.  We depend on the memory and read algorithms of the storage system.  But what if these are not enough?  In the past we have not been able to increase the RAM in a Netapp storage controller, so our only real option to increase the performance of the storage system has been to increase the number of drives and spread the load across more spindles.  This may be expensive, not only in terms of initial cost, but also in cooling and electrical requirements.

So NetApp has come up with a solution.  It’s called PAM, for Performance Acceleration Module.  PAM is a PCIe card with 16GB of DDR2 SDRAM.  It is integrated into Data ONTAP with a new technology called FlexScale.  FlexScale is a licensed product, so implementing PAM requires both the hardware and a license.

Basically, PAM extends Data ONTAP’s read cache.  It will have the greatest impact on highly random, file-intensive workloads such as home directories, but it should improve performance in a wide variety of situations.

Three different algorithms are supported.  The default mode works basically as an extension of Data ONTAP’s normal read cache.  In this mode both data and metadata are cached.

The second algorithm caches only metadata on the accelerator.  This may be more effective with workloads where data is not re-accessed but the metadata may be reused.  It may also be effective when the working set is too large to fit in memory but its associated metadata will fit.

Finally, there is something called Low-Priority mode.  This causes data that the standard algorithms discard to be saved in memory.  Normally this data is discarded to prevent it from pushing higher-value data (data more likely to be reused) out of the cache.  If the access pattern is such that data is commonly reused after a time lag, this mode may be effective.
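The three modes can be summarized as a toy decision function in Python. This is only a sketch of the behavior described above, not how Data ONTAP implements it. The mode labels and the function name are hypothetical, and I am assuming that Low-Priority mode keeps everything the default mode keeps plus the low-value data.

```python
def kept_in_pam(block_kind, standard_would_discard, mode):
    """Toy model: does a block stay in the accelerator under a given mode?

    block_kind: "data" or "metadata"
    standard_would_discard: True if the normal algorithm rates the block low value
    mode: "default", "metadata_only" or "low_priority" (hypothetical labels)
    """
    if mode == "metadata_only":
        # Only metadata is cached; user data never lands on the card.
        return block_kind == "metadata"
    if mode == "default":
        # Data and metadata are cached, minus what the standard algorithm discards.
        return not standard_would_discard
    if mode == "low_priority":
        # Assumption: the default behavior plus the normally discarded data.
        return True
    raise ValueError(f"unknown mode: {mode}")


print(kept_in_pam("data", True, "default"))       # low-value data in default mode
print(kept_in_pam("data", True, "low_priority"))  # same block in Low-Priority mode
```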

Different storage systems support different numbers of PAMs, with the larger systems able to use more of them.

If this sounds interesting to you, here is a link to get more information.


FAS3100 series

Those of you who follow Netapp may have noticed the arrival of two new models, the FAS3140 and FAS3170.  These systems are interesting for a number of reasons, but the one that particularly caught my eye is the form factor.

For many years now, NetApp has offered systems in the lower part of its range that are often referred to as “shrunken heads”.  These are the FAS250 and the FAS270.  They stand out from the rest of NetApp’s product line because they are not “dedicated” heads, but share space in the back of a disk shelf.  From the front, all you see are drives.  And in the case of the FAS270C, it is possible to have a NetApp active-active cluster contained within the back of a single drive tray.

This makes for an economical system, but there is no slot available on a 270, so expansion is limited to additional disk trays.  The FAS2020 carries on this idea, as does the FAS2050, which adds a single slot for expansion.  In all these cases, the cluster interconnect between the heads is entirely internal.

Contrast this with the rest of NetApp’s lineup.  On all the midrange and high-end systems there is a combination card that takes up a slot for the cluster interconnect.  This means that a FAS3000 series machine, which has 4 slots available in a standalone configuration, drops down to three when configured as an active-active cluster.  Each head loses a slot to the NVRAM/cluster interconnect card.

Not with the FAS3140 and FAS3170.  Like the previous mid-range and higher machines, the head electronics get a dedicated “shelf”, but unlike the previous machines, the form factor for this “shelf” contains the electronics for two heads.  And like the lower-end systems, the cluster interconnect is internal so it doesn’t subtract from the slot count.

This is great news.

A FAS3000 series running in active-active cluster configuration would have a total of 6 slots available (3 in each head).  The FAS3100 machines will have eight (4 in each head) in the same configuration, providing an improvement in flexibility as well as performance.


Traditional volumes in Data ONTAP 7.0

In my previous entry I discussed RAID groups and aggregates in the context of flexible volumes.  For the sake of completeness, I am now going to discuss traditional volumes.  This should not be taken as an endorsement.  You should be using flexible volumes.  They have many significant advantages over traditional volumes.  Flexible volumes were introduced in Data ONTAP 7.0 and hopefully by now everyone has converted over to them.

In Data ONTAP 7.0, the concepts of aggregates and flexible volumes were introduced. The older traditional volumes had dedicated RAID groups that belonged to the volume.  In effect, the aggregate was implied.  In Data ONTAP 7.0 and later we can see this directly in the command output.

I have created a traditional volume on my simulated filer.  I used the following command to create this volume:

vol create 3

Notice what happens if I do an aggr status:

The traditional volume is listed as an aggregate.  Notice what happens after a vol status command:

The same vol3 is also listed as a volume.  So a traditional volume has the qualities of an aggregate and the qualities of a volume.

Let’s take a look at the output of sysconfig -r:

Look at /vol3/plex0/rg0.  You can see that vol3 has its own dedicated RAID group.

This is what defines a traditional volume.  It is both an aggregate with associated disks that are assigned to RAID groups and a volume that presents logical storage for applications to access through a protocol stack.

Notice also that it has a single parity disk.  By default, traditional volumes use RAID-4 rather than RAID-DP.  This makes sense when you consider that by dedicating RAID groups to each volume you are more likely to have smaller and more numerous RAID groups.

You can see from this the great advantage of aggregates and flexible volumes.

Instead of dedicating disk spindles to a volume, I can group a large number of disks into an aggregate.  Then I can assign a volume to that aggregate, reserving exactly the amount of space that it needs, rather than a multiple of the drive size.  My volume will have better performance, because its I/O demands are spread over a large number of disks.  Many volumes can share the space available in the aggregate, which provides a large pool of IOPS that all the volumes draw from.

Also, by building larger RAID groups and then protecting those RAID groups with two parity drives, the aggregate potentially improves efficiency (possibly fewer parity drives needed in total) and provides greater protection when a drive fails.
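The parity arithmetic can be sketched in a few lines of Python. I am assuming each traditional volume is small enough to sit in a single RAID-4 group (one parity disk), and that the aggregate uses RAID-DP (two parity disks per group) with a RAID group size of 16. The function names are mine, for illustration only.

```python
import math


def parity_disks_traditional(volume_disk_counts, parity_per_group=1):
    """One RAID-4 group per traditional volume, so one parity disk per volume."""
    return len(volume_disk_counts) * parity_per_group


def parity_disks_aggregate(total_disks, raid_group_size=16, parity_per_group=2):
    """RAID-DP aggregate: two parity disks for every (possibly partial) RAID group."""
    groups = math.ceil(total_disks / raid_group_size)
    return groups * parity_per_group


# Six traditional volumes of four disks each, versus one 24-disk aggregate.
volumes = [4, 4, 4, 4, 4, 4]
print(parity_disks_traditional(volumes))     # parity disks across the six volumes
print(parity_disks_aggregate(sum(volumes)))  # parity disks in the single aggregate
```

With these made-up numbers the aggregate spends fewer disks on parity, and every RAID group in it can survive two drive failures instead of one.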


RAID Groups and Aggregates

In the course of teaching NetApp’s Data ONTAP Fundamentals course I have noticed that one of the areas students sometimes struggle with is RAID groups as they exist in Data ONTAP.

To begin with, Netapp uses dedicated parity drives, unlike many other storage vendors. Parity information is constructed for a horizontal stripe of WAFL blocks in a RAID group within an aggregate and then written to disk at the same time the data disks are updated. The width of the RAID group – the number of data disks – is independent of the parity disk or disks. Take a look at this print screen from Filerview:

Notice that the RAID group size is 16. This is the default RAID group size for RAID-DP with Fibre Channel disks. Notice also that the number of disks in aggr1 is actually 5.

When I created aggr1 I used the command:

aggr create aggr1 5

This caused Data ONTAP to create an aggregate named aggr1 with five disks in it. Let’s take a look at this with sysconfig -r:

If you look at aggr1, you can see that it contains 5 disks. Three disks are data disks and there are two parity disks, parity and dparity. The RAID group was created automatically to support the aggregate. It is a partial RAID group in the sense that the RAID group size is 16 (look at the Filerview screen shot) but I only asked for an aggregate with 5 disks, so aggr1 is an aggregate with one RAID group and 5 disk drives in it.

It is fully usable in this state. I can create volumes for NAS or SAN use and they are fully functional. If I need more space, I can add disks to the aggregate and they will be inserted into the existing RAID group within the aggregate. I can add 3 disks with the following command:

aggr add aggr1 3

Look at the following output:

Notice that I have added three more data disks to /aggr1/plex0/rg0.

The same parity disks are protecting the RAID group.

Before disks can be added to a RAID group, they must be zeroed. If the spare disks are pre-zeroed, Data ONTAP can add them from the spare pool quickly. If they are not, Data ONTAP will zero them first, which may take a significant amount of time. Spares as shipped by NetApp are pre-zeroed, but drives that join the spare pool after you destroy an aggregate are not.

The inserted disks are protected by the same parity calculation that existed on the parity drives before they were inserted. This works because the new WAFL blocks that align with the previous WAFL blocks in a parity stripe contain only zeroes. The new (zeroed) disks have no effect on the parity drives.
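A small Python sketch shows why zeroed disks leave parity untouched. It models only simple row parity as a byte-wise XOR across a stripe, and does not model the second (diagonal) parity of RAID-DP. The block contents are arbitrary bytes I made up.

```python
from functools import reduce
from operator import xor


def row_parity(blocks):
    """XOR the blocks of one stripe together, byte by byte."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


stripe = [b"\x12\x34", b"\xab\xcd", b"\x0f\xf0"]  # three data disks, one stripe
before = row_parity(stripe)
after = row_parity(stripe + [bytes(2)] * 3)       # add three zeroed disks

print(before == after)  # XOR with zero changes nothing
```

Because x XOR 0 is x, the existing parity block is already correct for the wider stripe, so nothing needs to be recalculated when the zeroed disks join.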

Once the drives are part of the RAID groups within the aggregate, that space can be made available to volumes and used by applications.

An aggregate can contain multiple RAID groups. If I had created an aggregate with 24 disks, then Data ONTAP would have created two RAID groups. The first RAID group would be fully populated with 16 disks (14 data disks and two parity disks) and the second RAID group would contain 8 disks (6 data disks and two parity disks). This is a perfectly normal situation.

For the most part, it is safe to ignore RAID groups and simply let Data ONTAP take care of things. The one situation you should avoid, however, is creating a partial RAID group with only one or two data disks. (Using a dedicated aggregate to support the root volume would be an exception to this rule.) Try to have at least three data disks in a RAID group for better performance.
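The way disks fall into RAID groups, and the rule of thumb about thin groups, can be sketched in Python. This is a simple model that fills each group before starting the next, assuming RAID-DP with two parity disks per group. The function names are hypothetical and this is not Data ONTAP's actual allocation code.

```python
def raid_group_layout(total_disks, raid_group_size=16, parity_per_group=2):
    """Split an aggregate's disks into RAID groups, filling each group in turn."""
    groups = []
    remaining = total_disks
    while remaining > 0:
        disks = min(remaining, raid_group_size)
        if disks <= parity_per_group:
            raise ValueError("a RAID group would be left without a data disk")
        groups.append({"disks": disks,
                       "data": disks - parity_per_group,
                       "parity": parity_per_group})
        remaining -= disks
    return groups


def thin_groups(groups, minimum_data=3):
    """Indexes of RAID groups with fewer data disks than the rule of thumb."""
    return [i for i, group in enumerate(groups) if group["data"] < minimum_data]


print(raid_group_layout(24))               # the 24-disk example: 16 disks, then 8
print(thin_groups(raid_group_layout(19)))  # second group has one data disk
```

Running a planned disk count through a check like this before typing aggr create is a cheap way to spot a lopsided last RAID group.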

There is a hierarchy to the way storage is implemented with Data ONTAP. At the base of the hierarchy is the aggregate, which is made up of RAID groups. The aggregate provides the physical space for the flexible volumes (flexvols) that applications see. Applications, whether SAN or NAS, pull space that has been assigned to the volume from the aggregate and are not aware of the underlying physical structure provided by the aggregate.

This is why we say that the aggregate represents the physical storage and the volumes provide the logical storage.


FAS2000

For a change of pace, I would like some feedback from users of the new FAS2000 series storage controllers.

For those who are unfamiliar with the FAS2000 series, this is NetApp’s new low end solution, filling a similar role to the FAS200 “shrunken head” series systems.  These are systems whose electronics are actually housed in the back of a disk tray.  The 2020 is most similar to the existing 200 series machines, as it has no expansion slot.  The 2050 has space for more drives and also has one expansion slot, so this system is significantly more flexible than the old 200 series.

One aspect of the FAS2000 series that I find particularly interesting is that they use 2.5-inch SAS disk drives.  This is the first use of this technology by NetApp.  Also, I understand that these drives rotate at 15,000 rpm, so they should be quite fast.  Looking at Seagate’s website, Seagate claims these are the fastest drives they manufacture.  Coming from a DBA perspective I find these drives very attractive.  I am uncomfortable with the trend of ever larger drives placing more and more storage behind a single head.  This is fine for backup products like SnapVault, perhaps even for home directories, but not good for database performance.

So for any FAS2020 and FAS2050 users out there:  what are you using these systems for and, if you are using the new SAS drives, what are your impressions?  Are you pleased with the performance?  Any reliability issues?

Thanks for responding.


Moving Root

Your simulator is now usable, but you may have noticed that aggr0, which contains the root volume (/vol/vol0), has a RAID type of RAID 0. This is a characteristic of the simulator as it comes from NetApp. However, just like a real Network Appliance storage system, it is possible to move the root volume. I would like to do this in the interest of making the simulator configuration closer to the real hardware.

If you go to the front page of filerview there is a selection “Documentation specific to the filer”. You should see something like this:

Select Frequently Asked Questions and notice the first item on the list:

We’re basically going to go through the steps listed in answer to the question, “What is the procedure to convert vol0 from RAID 0 to RAID 4?”

Do a “sysconfig -r” from your storage system’s command line. You should see this:

Look at aggr0. Notice it contains 3 data disks, v4.16, v4.17, and v4.18. There are no parity drives and the raid type is raid0. You can see the attributes for vol0 by entering the command “vol status vol0”.

You should see this:

So the first thing we need to do is create an aggregate where we are going to place our new root volume. Use the following command:

aggr create aggr1 3

This will create an aggregate with three disks. Once it is complete, enter the following command:

sysconfig -r

You should see the following:

Now we have an aggregate protected by parity. By default, when creating an aggregate the system will use RAID DP. Notice the two parity disks. (Of course, these are virtual disks which exist as files on the host OS.)

Create the volume which will become our new root volume by entering the following command:

vol create newvol aggr1 350m

If you type “df -m” you can compare the size of the existing root volume and the new volume. The new volume must be at least as large as the existing root volume.

Next, place the new volume in a restricted state with the following command:

vol restrict newvol

Before we can use the vol copy command, the target volume must be in a restricted state. This is because vol copy does a sequential copy of all the blocks from the source volume to the target. The target volume must be inaccessible while the copy is in progress. Enter the following command:

vol copy start vol0 newvol

You should see the following output:

The copy process will not take very long. Once it is complete we can bring the volume back online with the following command:

vol online newvol

Then use the following command to make this the new root volume.

vol options newvol root

You should see something like the following output:

Notice the messages regarding the mailbox disk. Data ONTAP uses the mailbox disk as an alternate heartbeat path for the cluster software. If the heartbeat from the cluster interconnect should cease, each head will check the mailbox disk to see if the other head is still leaving messages. If it is, then the head may be operating properly and a problem may exist in the cluster interconnect.
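The heartbeat reasoning can be written as a small decision function in Python. This is only my sketch of the logic described above, not Data ONTAP's cluster code. The function name and the returned strings are hypothetical, and the last branch (both paths silent) goes beyond what I have described here.

```python
def assess_partner(interconnect_heartbeat_ok, mailbox_updated_recently):
    """Toy model of how one head might judge its partner's health."""
    if interconnect_heartbeat_ok:
        return "partner healthy"
    if mailbox_updated_recently:
        # The partner is still leaving messages on the mailbox disk, so the
        # fault may lie in the cluster interconnect rather than in the partner.
        return "partner may be healthy; suspect the cluster interconnect"
    # Assumption: silence on both paths points at the partner itself.
    return "partner not responding on either path"


print(assess_partner(False, True))
```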

In a future column we will remove the old aggr0 aggregate and the raid0 disks.


Ready To Go

Your Netapp simulator is now operational. It has a name and an IP address and you should now be able to access it through Filerview.

If you go to your web browser and enter the IP address of your filer you should see something like this:

At this point, we’re interested in the entry “Documentation specific to the simulator”. Select that option. You will see the following options:

The one we’re going to look at first is license keys. Take a look.

Here are simulator-specific license keys for most of the NetApp product line. Notice there are keys for iSCSI but not FCP. FCP is one of the few products the simulator does not support, but plenty of others are. Clustering is supported, as are snapmirror, multistore, flexclone, snaprestore, snapvault … pretty much everything is here.

Go to your console and type license:

Notice that licenses are already installed for CIFS, NFS, iSCSI and snaprestore.

Next let’s go back to “documentation specific to the simulator” options and then select “frequently asked questions.”

There is a lot of good information here as well as on the installation instructions screen.

We’re going to take a look at the first one, converting vol0 from RAID 0 to RAID 4, next time. If you go to your console, or better yet, a telnet session into your simulator and type “sysconfig -r” you should see something like this:

Notice the first group of disks in aggr0. The RAID type is RAID0. Ordinarily (unless you are using a vfiler) you would not see aggregates with a RAID type of RAID 0. They should be RAID 4 or RAID DP. To help keep the image size of the simulator smaller (there is no parity information to store), NetApp has made the aggregate that supports /vol/vol0 RAID type 0.

In the interests of making the simulator behave closer to a real hardware storage controller our next project will be making a boot volume on an aggregate with normal RAID support. Also, we will be replacing the 126 MB drives with 526 MB drives.


NetApp Data ONTAP - Final Steps Simulator Disks

Your Netapp simulator is now operational, but there are still a couple of things to do.

As it sits, you have not configured the network. The Linux window where you have been working is effectively the console. This is the window where you ran the command:

/sim/runsim.sh

This may be a window in your Linux GUI, or it may be a telnet window. Remember, if you close this window the process associated with your simulator will be killed, so you must keep this window open. If you are ever in doubt about the status of your simulator process you can use the following command:

ps -ef | grep maytag.L

You should see a line which includes the word “maytag.L” plus the command line options which are passed to the maytag command through the runsim.sh script. This is the simulator process running within Linux.
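If you would rather script this check, here is a minimal Python sketch that filters ps output for the simulator process. The function name is mine, and the sample text is invented purely to exercise the filter. It is not real simulator output, so expect different paths and options on your system.

```python
def simulator_process_lines(ps_output, marker="maytag.L"):
    """Return the ps lines that mention the simulator, skipping the grep itself."""
    return [line for line in ps_output.splitlines()
            if marker in line and "grep" not in line]


# Invented sample, for illustration only.
SAMPLE = """root  4321     1  0 10:02 pts/0  00:00:07 ./maytag.L -example-option
root  4400  4300  0 10:05 pts/1  00:00:00 grep maytag.L"""

print(len(simulator_process_lines(SAMPLE)))  # how many simulator processes matched
```

On a live host, the text could come from subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout instead of the sample string.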

From your simulator’s command line enter the following command:

ifconfig -a

You should see something like this, except you may not have addresses in place.

Notice the names of the network devices. Normally you would see device names like e0a. The simulated Ethernet ports are named ns0 and ns1. Although the names are different, they are configured just like normal Data ONTAP network devices.

At this point we are ready to run the setup command. Just like a real NetApp storage system, the setup script will start automatically if there is no /etc/rc file on your storage controller. It will walk you through setting up your network configuration.

Initially you will be prompted to name your storage system. Then you will be asked if you want to setup virtual interfaces. I usually answer no to this question. You will then be asked for the address of ns0 and its net mask, followed by the same information for ns1. You will need to provide the appropriate information for your network environment. It is not necessary to configure ns1 at this point if you don’t want to.

Next you are asked if you want to continue configuration through the web interface. I usually answer no and continue with the command line. You will be asked for a default gateway address. Enter the correct gateway, or default router, for your environment.

Then you will be asked for the address of your administrative host. This is the computer from which you intend to administer your storage system. It can be either a Windows or a UNIX host machine. The script will add this machine as a trusted host in the /etc/hosts.equiv file.

You will be asked to enter your time zone and language. You can make selections appropriate for your environment.

We are almost at the end. You will be asked if you want to run the DNS resolver. I usually answer yes. You will then be asked for your DNS domain name and be given the opportunity to enter up to three name servers. Last, you will be asked if you want to run NIS. At this time, enter no.

Your setup is complete.


NetApp Data ONTAP - Preparing the simulator disks

This is where we left off. The install script prompted us through creating virtual disks that will be used by the simulator.

The install script leaves us with the instructions for starting the simulator, so let’s start the simulator with the command:

/sim/runsim.sh

(This may vary if you installed the simulator in a different directory.) You will see a great many error messages as the simulator starts. Most of these are related to the disks that we created. You should see raid.config.disk.bad.label:error messages for each disk. Finally you will end up at the login prompt. Go ahead and log in as root.

Once you are logged in, type the command “sysconfig -r” and you will see something like this:

This shows the new disks that we created during the install. Notice they all have “bad label” under the RAID disk column.

To repair the disk we will need to use an advanced mode command, so enter the command “priv set advanced”.

You will get a warning message after entering the command and then the command line prompt will change to indicate that you are no longer in administrative mode. Also notice the “Device” column from the previous output. This indicates the device names for the new drives we created with the install script. These names are different from what you would see on real hardware, reflecting the fact that you are using the simulator.

Type the command “disk unfail -s v4.19” to repair the disk label for device v4.19. You should see the following response:

Disk v4.19 will be placed in the spare pool. You will need to run this command for each disk. Once you have run the disk unfail command for each drive, run “sysconfig -r” again. You should see something like the following output:
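Typing the same command for every disk gets tedious, so here is a minimal Python sketch that prints the commands to paste into the console. Only v4.19 comes from the example above. The other device names are placeholders I made up, so substitute the names shown in the “Device” column on your own simulator.

```python
def unfail_commands(disk_names):
    """Build one 'disk unfail -s' command line per disk name."""
    return [f"disk unfail -s {name}" for name in disk_names]


# v4.19 is from the walkthrough; v4.20 and v4.21 are placeholder names.
for command in unfail_commands(["v4.19", "v4.20", "v4.21"]):
    print(command)
```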

The first column output should now indicate that the disks are spares. Now type the command “disk zero spares”.

If you wait a few minutes and type “sysconfig -r” you should see output similar to the following:

The drives are now being zeroed. Once this is complete, the drives will be ready to include in aggregates or traditional volumes. (You don’t have to pre-zero the drives, but if you try to use a non-zeroed drive, Data ONTAP will zero it before inserting it into the RAID groups for your aggregates.)

It is possible you may get an error while this process is running. Until now the virtual disks did not take much physical space. When the zeroing process is complete, the virtual disks will exist physically as files within your Linux system, taking the same amount of space as the virtual disk size you selected. Make sure you have enough free space to support the combination of drive size and number of drives you selected during the install.
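A quick way to sanity-check the space requirement is the Python sketch below. It assumes each virtual disk file ends up at exactly its nominal size, which ignores any file system overhead. The function names and the example figures (11 disks of 526 MB) are illustrative only.

```python
import shutil


def host_space_needed_mb(disk_count, disk_size_mb):
    """Space the zeroed virtual disks will occupy on the Linux host, in MB."""
    return disk_count * disk_size_mb


def has_room(path, disk_count, disk_size_mb):
    """True if the file system holding `path` has enough free space."""
    free_mb = shutil.disk_usage(path).free // (1024 * 1024)
    return free_mb >= host_space_needed_mb(disk_count, disk_size_mb)


print(host_space_needed_mb(11, 526))  # MB needed for eleven 526 MB disks
```

Calling has_room with the directory where you installed the simulator, before you run “disk zero spares”, would tell you whether the zeroing is likely to fill the file system.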

When the drive zeroing process completes, NetApp Data ONTAP will indicate it is done with the following messages:

Finally, if you type “sysconfig -r” you should see the following output:

The drives are now in the spare pool and ready to use.


NetApp Data ONTAP - Installing Your Simulator

Once you have downloaded the simulator you are ready to install. First you need a Linux host. Earlier versions of the sim (before 7.0) required Red Hat 7.1. They would not run on anything else. Since Data ONTAP 7.0 the simulator has run on every version of Linux that I have tried. This has included Red Hat, various Red Hat clones, SUSE Linux, and Debian.

In addition, the system running Linux should have two Ethernet interfaces. They can be real physical interfaces or, if you are running in a vm, they can be virtual interfaces configured as part of the virtual machine.

The reason for this is that Data ONTAP will take control of an interface that will be used for the simulated storage system. At install time it will ask you which interface to use. The default is eth0. Therefore you will need another interface to communicate with the Linux system. (Usually eth1, if you accept the default.)

The install:

Once you have downloaded the simulator tgz file from Netapp and copied it to your Linux box, you are ready to begin. The simulator must run with root privileges, so I usually do the entire install with the root account.

Go to the directory where you loaded the simulator file and run the following commands:

gunzip 7.2-tarfile-v22.tgz
tar xvf 7.2-tarfile-v22.tar

The file you downloaded may have a different name. If so, substitute the file names that correspond to the file you downloaded.

Usually there is a script called setup.sh which will create your simulator. Enter the following command:

./setup.sh

This will walk you through the setup.

As you notice from the script output above, you have the option of installing the simulator as a cluster. Later we will make use of this feature, but for now I did not install as a cluster.

Notice we were given an option of which host interface to use. I took the default: eth0. Generally the default memory size of 128 is adequate but you can assign more if you’d like.

Finally we are given the option of installing more disks. The simulator has 3 disks already, but they are only 100 MB virtual disks. To do anything useful, we will need more space. Although the disks are virtual, the space they take within your Linux host is real, so you may need to adjust this to fit the storage situation on your Linux host.

I added 11 drives, filling out my virtual shelf. I also chose option e for drive size. This is a fairly useful size, though you could certainly go larger if you wished. As you see, the virtual drives are then created and the script ends.

There is a catch here. The drives created have bad headers. Next time we’ll cover how to repair the drives and your simulator will be ready to do some work.
