With the preview of Windows Azure Virtual Machines, we have two new special types of blobs stored in Windows Azure Storage: Windows Azure Virtual Machine Disks and Windows Azure Virtual Machine Images. And of course we also have the existing preview of Windows Azure Drives. In the rest of this post, we will refer to these as storage, disks, images, and drives. This post explores what drives, disks, and images are and how they interact with storage.
Virtual Hard Drives (VHDs)
Drives, disks, and images are all VHDs stored as page blobs within your storage account. There are actually several slightly different VHD formats: fixed, dynamic, and differencing. Currently, Windows Azure supports only the format named ‘fixed’. This format lays the logical disk out linearly within the file, so that disk offset X is stored at blob offset X. At the end of the blob, a small footer describes the properties of the VHD. Everything stored in the page blob adheres to the standard VHD format, so you can take this VHD and mount it on your server on-premises if you choose to. The fixed format often wastes space because most disks have large unused ranges in them. However, we store our ‘fixed’ VHDs as page blobs, which are sparse, so we get the benefits of both the ‘fixed’ and ‘dynamic’ (expandable) formats at the same time.
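To make the fixed layout concrete, here is a minimal Python sketch that decodes a few footer fields and shows the identity mapping from disk offset to blob offset. The field offsets, the 512-byte footer size, and the `conectix` cookie come from the published VHD format specification rather than from this post, and the function names are hypothetical, so treat this as an illustration, not as the code Windows Azure runs.

```python
import struct

FOOTER_SIZE = 512  # footer length per the published VHD specification
DISK_TYPES = {2: "fixed", 3: "dynamic", 4: "differencing"}


def parse_vhd_footer(footer: bytes) -> dict:
    """Decode the few footer fields needed to identify a VHD."""
    if len(footer) != FOOTER_SIZE:
        raise ValueError("expected a 512-byte footer")
    if footer[0:8] != b"conectix":
        raise ValueError("missing VHD cookie")
    # Multi-byte footer fields are stored big-endian.
    (current_size,) = struct.unpack(">Q", footer[48:56])
    (disk_type,) = struct.unpack(">I", footer[60:64])
    return {
        "current_size": current_size,
        "disk_type": DISK_TYPES.get(disk_type, "unknown"),
    }


def blob_offset_for(disk_offset: int) -> int:
    """In the fixed format, disk offset X lives at blob offset X."""
    return disk_offset
```

For a fixed VHD, the footer is simply the last 512 bytes of the blob, so reading that range is enough to check the disk type before mounting.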
Uploading VHDs to Windows Azure Storage
You can upload your VHD into your storage account to use it for either PaaS or IaaS. When you are uploading your VHD into storage, you will want to use a tool that understands that page blobs are sparse, and only uploads the portions of the VHD that have actual data in them. Also, if you have dynamic VHDs, you want to use a tool that will convert your dynamic VHD into a fixed VHD as it is doing the upload. CSUpload will do both of these things for you, and it is included as part of the Windows Azure SDK.
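The idea of uploading only the portions that hold data can be sketched as a scan for non-zero pages. The Python below is an illustration under stated assumptions, not how CSUpload is implemented: it assumes 512-byte pages (the page blob alignment unit), works on an in-memory buffer for brevity, and `nonzero_ranges` is a hypothetical helper name.

```python
PAGE_SIZE = 512  # assumed page blob alignment unit


def nonzero_ranges(data: bytes, page_size: int = PAGE_SIZE):
    """Return (offset, length) pairs covering every page that holds data.

    Adjacent non-empty pages are merged into a single range.
    """
    ranges = []
    start = None
    zero_page = bytes(page_size)
    for offset in range(0, len(data), page_size):
        page = data[offset:offset + page_size]
        is_empty = page == zero_page[:len(page)]
        if not is_empty and start is None:
            start = offset                      # a run of data begins here
        elif is_empty and start is not None:
            ranges.append((start, offset - start))
            start = None                        # the run ended on the previous page
    if start is not None:
        ranges.append((start, len(data) - start))
    return ranges
```

An upload tool built on this idea would write only the returned ranges to the page blob and leave every other page unwritten.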
Persistence and Durability
Since drives, disks, and images are all stored in storage, your data will be persisted even when your virtual machine has to be moved to another physical machine. This means your data gets to take advantage of the durability offered by the Windows Azure Storage architecture, where all of your non-buffered and flushed writes to the disk/drive are replicated 3 times in storage to make it durable before returning success back to your application.
Drives (PaaS)
Drives are used by the PaaS roles (Worker Role, Web Role, and VM Role) to mount a VHD and assign a drive letter. There are many details about how you use these drives here. Drives are implemented with a kernel mode driver that runs within your VM, so your disk IO to and from the drive in the VM will cause network IO to and from the VM to your page blob in Windows Azure Storage. The following diagram shows the driver running inside the VM, communicating with storage through the VM’s virtual network adapter.
PaaS roles are allowed to mount up to 16 drives per role.
Disks (IaaS)
When you create a Windows Azure Virtual Machine, the platform will attach at least one disk to the VM for your operating system disk. This disk will also be a VHD stored as a page blob in storage. As you write to the disk in the VM, the changes to the disk will be made to the page blob inside storage. You can also attach additional disks to your VM as data disks, and these will be stored in storage as page blobs as well.
Unlike for drives, the code that communicates with storage on behalf of your disk is not within your VM, so doing IO to the disk will not cause network activity in the VM, although it will cause network activity on the physical node. The following diagram shows how the driver runs in the host operating system, and the VM communicates through the disk interface to the driver, which then communicates through the host network adapter to storage.
There are limits to the number of disks a virtual machine can mount, varying from 16 data disks for an extra-large virtual machine to one data disk for an extra-small virtual machine. Details can be found here.
IMPORTANT: The Windows Azure platform holds an infinite lease on all the page blobs that it considers disks in your storage account so that you don’t accidentally delete the underlying page blob, its container, or the storage account while the VM is using the VHD. If you want to delete the underlying page blob, the container it is within, or the storage account, you will need to detach the disk from the VM first as shown here:
And then select the disk you want to detach and then delete:
Then you need to remove the disk from the portal:
and then you can select ‘delete disk’ from the bottom of the window:
Note: when you delete the disk in the portal, you are not deleting the VHD page blob in your storage account. You are only disassociating it from the set of disks that can be used for Windows Azure Virtual Machines. After you have done all of the above, you will be able to delete the page blob from your storage account, using Windows Azure Storage REST APIs or storage explorers.
Images (IaaS)
Windows Azure uses the concept of an “Image” to describe a template VHD that can be used to create one or more Virtual Machines. Windows Azure and some partners provide images that can be used to create Virtual Machines. You can also create images for yourself by capturing an image of an existing Windows Azure Virtual Machine, or you can upload a sysprep’d image to your storage account. An image is also in the VHD format, but the platform will not write to the image. Instead, when you create a Virtual Machine from an image, the system will create a copy of that image’s page blob in your storage account, and that copy will be used for the Virtual Machine’s operating system disk.
IMPORTANT: Windows Azure holds an infinite lease on every page blob in your storage account that it considers an image, along with the blob container and the storage account that contain it. Therefore, to delete the underlying page blob, you need to delete the image from the portal by going to the “Virtual Machines” section and clicking on “Images”:
Then you select your image and press “Delete Image” at the bottom of the screen. This will disassociate the VHD from your set of registered images, but it does not delete the page blob from your storage account. At that point, you will be able to delete the image from your storage account.
Temporary Disk
There is another disk present in all web roles, worker roles, VM Roles, and Windows Azure Virtual Machines, called the temporary disk. This is a physical disk on the node that can be used for scratch space. Data on this disk will be lost when the VM is moved to another physical machine, which can happen during upgrades, patches, and when Windows Azure detects something is wrong with the node you are running on. The sizes offered for the temporary disk are defined here.
The temporary disk is the ideal place to store your operating system’s pagefile.
IMPORTANT: The temporary disk is not persistent. You should only write data onto this disk that you are willing to lose at any time.
Billing
Windows Azure Storage charges for Bandwidth, Transactions, and Storage Capacity. The per-unit costs of each can be found here.
Bandwidth
We recommend mounting drives from within the same location (e.g., US East) as the storage account they are stored in, as this offers the best performance, and also will not incur bandwidth charges. With disks, you are required to use them within the same location the disk is stored.
Transactions
When connected to a VM, disk IOs from both drives and disks will be satisfied from storage (unless one of the layers of cache described below can satisfy the request first). Small disk IOs will incur one Windows Azure Storage transaction per IO. Larger disk IOs will be split into smaller IOs, so they will incur more transaction charges. The breakdown for this is:
- Drives
  - IO < 2 megabytes will be 1 transaction
  - IO >= 2 megabytes will be broken into transactions of 2 MB or smaller
- Disks
  - IO < 128 kilobytes will be 1 transaction
  - IO >= 128 kilobytes will be broken into transactions of 128 KB or smaller
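The breakdown above can be turned into a rough estimate of transactions per IO. The following Python sketch assumes that a large IO is split into as few pieces as the size limit allows, which this post does not state, so the result is best read as a lower bound. It also assumes 1 KB = 1024 bytes, and `estimate_transactions` is a hypothetical helper name.

```python
KB = 1024          # assumption: binary kilobytes
MB = 1024 * KB

# Largest single storage transaction, per the breakdown above.
MAX_TRANSACTION_BYTES = {"drive": 2 * MB, "disk": 128 * KB}


def estimate_transactions(io_bytes: int, kind: str) -> int:
    """Lower-bound estimate of storage transactions for one disk IO."""
    if io_bytes <= 0:
        raise ValueError("io_bytes must be positive")
    limit = MAX_TRANSACTION_BYTES[kind]
    # Ceiling division: any IO up to the limit costs one transaction.
    return -(-io_bytes // limit)
```

Under these assumptions a 1 MB write costs one transaction on a drive but eight on a disk, which is worth keeping in mind when choosing application IO sizes.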
In addition, operating systems often perform a little read-ahead for small sequential IOs (typically less than 64 kilobytes), which may result in larger sized IOs to drives/disks than the IO size being issued by the application. If the prefetched data is used, then this can result in fewer transactions to your storage account than the number of IOs issued by your application.
Storage Capacity
Windows Azure Storage stores page blobs, and thus VHDs, in sparse format, and therefore only charges for data within the VHD that has actually been written to during the life of the VHD. Therefore, we recommend using ‘quick format’ because this will avoid storing large ranges of zeros within the page blob. When creating a VHD you can choose the quick format option by specifying the below:
It is also important to note that when you delete files within the file system used by the VHD, most operating systems do not clear or zero these ranges, so you can still be paying capacity charges within a blob for the data that you deleted via a disk/drive.
Caches, Caches, and more Caches
Drives and disks both support on-disk caching and some limited in-memory caching. Many layers of the operating system as well as application libraries do in-memory caching as well. This section highlights some of the caching choices you have as an application developer.
Caching can be used to improve performance, as well as to reduce transaction costs. The following table outlines some of the caches that are available for use with disks and drives. Each is described in more detail below the table.
| Cache | Type | Purpose | Data Persistence |
| --- | --- | --- | --- |
| FileStream | Memory | Improves performance and reduces IOs for sequential reads and writes. | Writes are not automatically persisted. Call “Flush” to persist writes. |
| Operating System Caching | Memory | Improves performance and reduces IOs for random and sequential reads and writes. | Writes are not automatically persisted. Use “Write through” file handles, or “Flush” to persist writes. |
| Windows Azure Drive Caches | Memory and Disk | Reduces read transactions to storage. Can improve performance for sequential IO, depending on workload. | Writes are automatically persisted. Use “Write through” file handles, or “Flush” to know writes are persisted. |
| Windows Azure Virtual Machine Disk Caches | Memory and Disk | Reduces transactions to storage. Can improve performance for sequential IO, depending on workload. Improves boot time. | Writes are automatically persisted. Use “Write through” file handles, or “Flush” to know writes are persisted. |
| No Disk or Drive Cache | N/A | Can improve performance for random and sequential IO, depending on workload. | Writes are automatically persisted. Use “Write through” file handles, or “Flush” to know writes are persisted. |
FileStream (applies to both disks and drives)
The .NET Framework’s FileStream class caches reads and writes in memory to reduce IOs to the disk. Some of the FileStream constructors take a buffer size, and the others use the default buffer size for you (4096 bytes, according to the .NET Framework documentation). You cannot specify that the class use no memory cache, as the minimum cache size is 8 bytes. You can force the buffer to be written to disk by calling the FileStream.Flush(bool) API with flushToDisk set to true.
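The same two-step pattern, an in-process buffer that must be flushed and an operating system cache that must then be flushed, exists outside .NET. As an analog only (this is not the FileStream API), here is a minimal Python sketch. The helper name `write_durably` and the 4096-byte buffer are illustrative choices.

```python
import os


def write_durably(path: str, payload: bytes, buffer_size: int = 4096) -> None:
    """Write payload through an in-process buffer, then force it to disk."""
    with open(path, "wb", buffering=buffer_size) as f:
        f.write(payload)       # may still sit in the in-process buffer
        f.flush()              # hand the buffered bytes to the operating system
        os.fsync(f.fileno())   # ask the OS to write its cached copy to the device
```

Skipping the last two calls leaves it up to the runtime and the operating system to decide when the bytes actually reach the disk or drive.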
Operating System Caching (applies to both disks and drives)
The operating system itself will do in-memory buffering for both reads and writes, unless you explicitly turn it off when you open a file using FILE_FLAG_WRITE_THROUGH and/or FILE_FLAG_NO_BUFFERING. An in-depth discussion of the in-memory caching behavior of Windows is available here.
Windows Azure Drive Caches
Drives allow you to choose whether to use the node’s local temporary disk as a read cache, or to use no cache at all. The space for a drive’s cache is allocated from your web role or worker role’s temporary disk. This cache is write-through, so writes are always committed immediately to storage. Reads will be satisfied either from the local disk, or from storage.
Using the drive local cache can improve sequential IO read performance when the reads ‘hit’ the cache. Sequential reads will hit the cache if:
- The data has been read before. The data is cached on the first time it is read, not on first write.
- The cache is large enough to hold all of the data.
Access to the blob can often deliver a higher rate of random IOs than the local disk. However, these random IOs will incur storage transaction costs. To reduce the number of transactions to storage, you can use the local disk cache for random IOs as well. For best results, ensure that your random writes to the disk are 8KB aligned, and the IO sizes are in multiples of 8KB.
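A quick way to check or enforce the 8 KB guidance is to test each IO's offset and length, or to widen a request to the enclosing aligned range. This Python sketch is illustrative; the helper names are hypothetical, and it says nothing about how the drive cache itself handles unaligned IO.

```python
ALIGNMENT = 8 * 1024  # 8 KB, as recommended above


def is_aligned(offset: int, length: int, alignment: int = ALIGNMENT) -> bool:
    """True when the IO starts on an 8 KB boundary and spans whole 8 KB units."""
    return offset % alignment == 0 and length > 0 and length % alignment == 0


def align_range(offset: int, length: int, alignment: int = ALIGNMENT):
    """Expand [offset, offset + length) to the enclosing aligned range."""
    start = (offset // alignment) * alignment
    end = -(-(offset + length) // alignment) * alignment  # round up
    return start, end - start
```

Note that widening a write means the application must first read the surrounding bytes so the padding carries real data, so it is usually simpler to lay out records on 8 KB boundaries to begin with.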
Windows Azure Virtual Machine Disk Caches
When deploying a Virtual Machine, the OS disk has two host caching choices:
- Read/Write (Default) – write back cache
- Read – write through cache
When you set up a data disk on a virtual machine, you get three host caching choices:
- Read/Write – write back cache
- Read – write through cache
- None (Default)
The type of cache to use for data disks and the OS disk is not currently exposed through the portal. To set the type of host caching, you must use either the Service Management APIs (Add Data Disk or Update Data Disk) or the PowerShell commands (Add-AzureDataDisk or Set-AzureDataDisk).
The read cache is stored both on disk and in memory in the host OS. The write cache is stored in memory in the host OS.
WARNING: If your application does not use FILE_FLAG_WRITE_THROUGH, the write cache could result in data loss because the data could be sitting in the host OS memory waiting to be written when the physical machine crashes unexpectedly.
Using the read cache will improve sequential IO read performance when the reads ‘hit’ the cache. Sequential reads will hit the cache if:
- The data has been read before.
- The cache is large enough to hold all of the data.
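The two conditions above can be seen in a toy model. The Python sketch below assumes a least-recently-used eviction policy and fixed-size blocks, neither of which this post specifies for the real host cache, and the class and function names are hypothetical. It reads a file sequentially twice and counts cache hits on the second pass.

```python
from collections import OrderedDict


class ReadCache:
    """Toy read cache: blocks are added on read and evicted LRU-first."""

    def __init__(self, capacity_blocks: int):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, block: int) -> None:
        if block in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block)   # mark as most recently used
            return
        self.misses += 1                     # would go to storage
        self.blocks[block] = True
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used


def second_pass_hits(capacity_blocks: int, file_blocks: int) -> int:
    """Read a file sequentially twice; return cache hits on the second pass."""
    cache = ReadCache(capacity_blocks)
    for block in range(file_blocks):
        cache.read(block)
    cache.hits = 0
    for block in range(file_blocks):
        cache.read(block)
    return cache.hits
```

In this model a cache that holds the whole file serves every block of the second pass, while a cache even slightly smaller than the file serves none of them, because each block is evicted just before it is needed again.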
The cache’s size for a disk varies based on instance size and the number of disks mounted. Caching can only be enabled for up to four data disks.
No Caching for Windows Azure Drives and VM Disks
Windows Azure Storage can provide a higher rate of random IOs than the local disk on your node that is used for caching. If your application needs to do lots of random IOs, and throughput is important to you, then you may want to consider not using the above caches. Keep in mind, however, that IOs to Windows Azure Storage do incur transaction costs, while IOs to the local cache do not.
To disable your Windows Azure Drive cache, pass ‘0’ for the cache size when you call the Mount() API.
For a Virtual Machine data disk the default behavior is to not use the cache. If you have enabled the cache on a data disk, you can disable it using the Update Data Disk service management API, or the Set-AzureDataDisk PowerShell command.
For a Virtual Machine operating system disk the default behavior is to use the cache. If your application will do lots of random IOs to data files, you may want to consider moving those files to a data disk which has the caching turned off.
Andrew Edwards and Brad Calder