Microsoft Azure Storage Team Blog

Introducing Asynchronous Cross-Account Copy Blob

We are excited to introduce some changes to the Copy Blob API in the 2012-02-12 version that allow you to copy blobs between storage accounts. This enables some interesting scenarios, such as:

  • Back up your blobs to another storage account without having to retrieve the content and save it yourself
  • Migrate your blobs from one account to another efficiently with respect to cost and time

NOTE: To allow cross-account copy, the destination storage account needs to have been created on or after June 7th, 2012. This limitation applies only to cross-account copy; accounts created earlier can still copy within the same account. If the destination account was created before June 7th, 2012, a cross-account copy operation will fail with HTTP status code 400 (Bad Request) and the storage error code will be “CopyAcrossAccountsNotSupported.”

In this blog, we will go over the changes that were made, along with some best practices for using this API. We will also show sample code that uses the new Copy Blob APIs in SDK 1.7.1, which is available on GitHub.

Changes to Copy Blob API

To enable copying between accounts, we have made the following changes:

Copy Source is now a URL

In versions prior to 2012-02-12, the source request header was specified as “/<account name>/<fully qualified blob name with container name and snapshot time if applicable>”. With the 2012-02-12 version, we now require x-ms-copy-source to be specified as a URL. This is a versioned change: specifying the old format with the new version will fail with 400 (Bad Request). The new format allows users to specify a shared access signature or use a custom storage domain name. When specifying a source blob from a different account than the destination, the source blob must be either:

  • A publicly accessible blob (i.e. the container ACL is set to be public)
  • A private blob, only if the source URL is pre-authenticated with a Shared Access Signature (i.e. pre-signed URL), allowing read permissions on the source blob
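For illustration, here is roughly how the x-ms-copy-source header differs between versions; the account, container, and blob names, and the abbreviated SAS query string, are hypothetical:

x-ms-copy-source: /sourceaccount/sourcecontainer/myblob.txt
    (old format, versions prior to 2012-02-12)

x-ms-copy-source: https://sourceaccount.blob.core.windows.net/sourcecontainer/myblob.txt?sv=2012-02-12&sr=b&sp=r&se=<expiry>&sig=<signature>
    (new format, 2012-02-12, shown here carrying a Shared Access Signature that grants read access)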

A copy operation preserves the type of the blob: a block blob will be copied as a block blob and a page blob will be copied to the destination as a page blob. If the destination blob already exists, it will be overwritten. However, if the destination type (for an existing blob) does not match the source type, the operation fails with HTTP status code 400 (Bad Request).

Note: The source blob can even be a blob outside of Windows Azure, as long as it is publicly accessible or accessible via some form of signed URL. Source blobs outside of Windows Azure will be copied as block blobs.

Copy is now asynchronous

Making copy asynchronous is a major change that greatly differs from previous versions. Previously, the Blob service returned a successful response to the user only when the copy operation had completed. With version 2012-02-12, the Blob service will instead schedule the copy operation to be completed asynchronously: a success response only indicates that the copy operation has been successfully scheduled. As a consequence, a successful Copy Blob request will now return HTTP status code 202 (Accepted) instead of 201 (Created).

A few important points:

  1. There can be only one pending copy operation to a given destination blob URL at a time. A single source blob, however, can be the source for many outstanding copies at once.
  2. The asynchronous copy blob runs in the background using spare bandwidth capacity, so there is no SLA in terms of how fast a blob will be copied.
  3. Currently there is no limit on the number of pending copy blobs that can be queued up for a storage account, but a pending copy blob operation can live in the system for at most 2 weeks. Any copy still pending after that will be terminated.
  4. If the source storage account is in a different location from the destination storage account, then the source storage account will be charged egress for the copy using the bandwidth rates as shown here.
  5. When a copy is pending, any attempt to modify, snapshot, or lease the destination blob will fail.

Below we break down the key concepts of the new Copy Blob API.

Copy Blob Scheduling: when the Blob service receives a Copy Blob request, it will first ensure that the source exists and can be accessed. If the source does not exist or cannot be accessed, an HTTP status code 400 (Bad Request) is returned. If any source access conditions are provided, they will be validated too; if the conditions do not match, an HTTP status code 412 (Precondition Failed) is returned. Once the source is validated, the service then validates any conditions provided for the destination blob (if it exists). If condition checks fail on the destination blob, an HTTP status code 412 (Precondition Failed) is returned. If there is already a pending copy operation, the service returns an HTTP status code 409 (Conflict). Once the validations are completed, the service initializes the destination blob before scheduling the copy and then returns a success response to the user. If the source is a page blob, the service will create a page blob with the same length as the source blob but with all bytes zeroed out. If the source blob is a block blob, the service will commit a zero-length block blob for the pending copy operation. The service maintains a few copy-specific properties during the copy operation to allow clients to poll the status and progress of their copy operations.

Copy Blob Response: when a copy blob operation returns success to the client, this indicates the Blob service has successfully scheduled the copy operation to be completed. Two new response headers are introduced (an example response follows the list):

  1. x-ms-copy-status: The status of the copy operation at the time the response was sent. It can be one of the following:
    • success: Copy operation has completed. This is analogous to the scenario in previous versions where the copy operation has completed synchronously.
    • pending: Copy operation is still pending and the user is expected to poll the status of the copy. (See “Polling for Copy Blob properties” below.)
  2. x-ms-copy-id: The string token that is associated with the copy operation. This can be used when polling the copy status, or if the user wishes to abort a “pending” copy operation.
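For illustration, a successfully scheduled cross-account copy might return a response along these lines (the copy id value shown here is hypothetical):

HTTP/1.1 202 Accepted
x-ms-copy-id: 2f9e6b2d-8c3a-4d0e-9f1b-1c5e7a8d4b21
x-ms-copy-status: pending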

Polling for Copy Blob properties: we now provide the following additional properties that allow users to track the progress of the copy using Get Blob Properties, Get Blob, or List Blobs (a minimal polling sketch follows the list):

  1. x-ms-copy-status (or CopyStatus): The current status of the copy operation. It can be one of the following:
    • pending: Copy operation is pending.
    • success: Copy operation completed successfully.
    • aborted: Copy operation was aborted by a client.
    • failed: Copy operation failed to complete due to an error.
  2. x-ms-copy-id (CopyId): The id returned by the copy operation which can be used to monitor the progress or abort a copy.
  3. x-ms-copy-status-description (CopyStatusDescription): Additional error information that can be used for diagnostics.
  4. x-ms-copy-progress (CopyProgress): The amount of the blob copied so far. This has the format X/Y, where X is the number of bytes copied and Y is the total number of bytes.
  5. x-ms-copy-completion-time (CopyCompletionTime): The completion time of the last copy.
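Here is a minimal polling sketch (using the SDK 1.7.1 client types from the samples later in this post) that waits on a single destination blob until its pending copy finishes; error handling and logging are omitted:

public static void WaitForCopy(CloudBlob destBlob)
{
    // FetchAttributes refreshes the blob's properties, including the x-ms-copy-* values.
    destBlob.FetchAttributes();

    while (destBlob.CopyState != null &&
           destBlob.CopyState.Status == CopyStatus.Pending)
    {
        Thread.Sleep(TimeSpan.FromSeconds(10));
        destBlob.FetchAttributes();
    }
}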

These properties can be monitored to track the progress of a copy operation that returns “pending” status. However, it is important to note that except for Put Page, Put Block and Lease Blob operations, any other write operation (i.e., Put Blob, Put Block List, Set Blob Metadata, Set Blob Properties) on the destination blob will remove the properties pertaining to the copy operation.

Asynchronous Copy Blob: for the cases where the Copy Blob response returns with x-ms-copy-status set to “pending”, the copy operation will complete asynchronously.

  1. Block blobs: The source block blob will be retrieved using 4 MB chunks and copied to the destination.
  2. Page blobs: The source page blob’s valid ranges are retrieved and copied to the destination.

Copy Blob operations are retried on intermittent failures (network failures, server busy, etc.), and any failures are recorded in x-ms-copy-status-description, which lets users know why the copy is still pending.

When the copy operation is pending, any write to the destination blob is disallowed and will fail with HTTP status code 409 (Conflict). You would need to abort the copy before writing to the destination.

Data integrity during asynchronous copy: The Blob service will lock onto a version of the source blob by storing the source blob ETag at the time of copy. This is done to ensure that any source blob changes can be detected during the course of the copy operation. If the source blob changes during the copy, the ETag will no longer match its value at the start of the copy, causing the copy operation to fail.

Aborting the Copy Blob operation: To allow canceling a pending copy, we have introduced the Abort Copy Blob operation in the 2012-02-12 version of REST API. The Abort operation takes the copy-id returned by the Copy operation and will cancel the operation if it is in the “pending” state. An HTTP status code 409 (Conflict) is returned if the state is not pending or the copy-id does not match the pending copy. The blob’s metadata is retained but the content is zeroed out on a successful abort.
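Here is a hedged sketch of aborting a pending copy with the SDK 1.7.1 client (assuming the AbortCopy method that takes the copy id; srcBlobUri is a placeholder for an accessible source blob URL):

// Start the copy and remember the copy id it returns.
string copyId = destBlob.StartCopyFromBlob(srcBlobUri);

// Later, if the copy is still pending and we need to cancel it:
destBlob.FetchAttributes();
if (destBlob.CopyState.Status == CopyStatus.Pending)
{
    // On a successful abort the destination's metadata is retained but its content is zeroed out.
    destBlob.AbortCopy(copyId);
}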

Best Practices

How to migrate blobs from a source account’s container to a destination container in another account?

With asynchronous copy, copying blobs from one account to another is as simple as the following:

  1. List blobs in the source container.
  2. For each blob in the source container, copy the blob to a destination container.

Once all the blobs are queued for copy, the monitoring component can do the following:

  1. List all blobs in the destination container.
  2. Check the copy status; if it has failed or has been aborted, start a new copy operation.

Example: Here is a sample that queues asynchronous copies. It ignores snapshots and copies only base blobs. Error handling is excluded for brevity.

public static void CopyBlobs(
                CloudBlobContainer srcContainer,  
                string policyId, 
                CloudBlobContainer destContainer)
{
    // Get a SAS token to use for all source blobs; policyId identifies a stored
    // access policy on the source container that grants read access.
    string blobToken = srcContainer.GetSharedAccessSignature(
                       new SharedAccessBlobPolicy(), policyId);
 
     
    var srcBlobList = srcContainer.ListBlobs(true, BlobListingDetails.None);
    foreach (var src in srcBlobList)
    {
        var srcBlob = src as CloudBlob;
 
        // Create appropriate destination blob type to match the source blob
        CloudBlob destBlob;
        if (srcBlob.Properties.BlobType == BlobType.BlockBlob)
        {
            destBlob = destContainer.GetBlockBlobReference(srcBlob.Name);
        }
        else
        {
            destBlob = destContainer.GetPageBlobReference(srcBlob.Name);
        }
 
        // copy using src blob as SAS
        destBlob.StartCopyFromBlob(new Uri(srcBlob.Uri.AbsoluteUri + blobToken));
    }
}

Example: Monitoring code without error handling for brevity. NOTE: This sample assumes that no one else would start a different copy operation on the same destination blob. If such assumption is not valid for your scenario, please see “How do I prevent someone else from starting a new copy operation to overwrite my successful copy?” below.

public static void MonitorCopy(CloudBlobContainer destContainer)
{
    bool pendingCopy = true;
 
    while (pendingCopy)
    {
        pendingCopy = false;
        var destBlobList = destContainer.ListBlobs(
                true, BlobListingDetails.Copy);

        foreach (var dest in destBlobList)
        {
            var destBlob = dest as CloudBlob;
 
            if (destBlob.CopyState.Status == CopyStatus.Aborted ||
                destBlob.CopyState.Status == CopyStatus.Failed)
            {
                // Log the copy status description for diagnostics 
                // and restart copy
                Log(destBlob.CopyState);
                pendingCopy = true;
                destBlob.StartCopyFromBlob(destBlob.CopyState.Source);
            }
            else if (destBlob.CopyState.Status == CopyStatus.Pending)
            {
                // We need to continue waiting for this pending copy
                // However, let us log copy state for diagnostics
                Log(destBlob.CopyState);

                pendingCopy = true;
            }
            // else we completed this pending copy
        }
 
        // waitTime (e.g., TimeSpan.FromSeconds(30)) is assumed to be defined elsewhere.
        Thread.Sleep(waitTime);
    }
}
 
How do I prevent the source from changing until the copy completes?

In an asynchronous copy, once authorization is verified on source, the service locks to that version of the source by using the ETag value. If the source blob is modified when the copy operation is pending, the service will fail the copy operation with HTTP status code 412 (Precondition Failed). To ensure that source blob is not modified, the client can acquire and maintain a lease on the source blob. (See the Lease Blob REST API.)

With 2012-02-12 version, we have introduced the concept of lock (i.e. infinite lease) which makes it easy for a client to hold on to the lease. A good option is for the copy job to acquire an infinite lease on the source blob before issuing the copy operation. The monitor job can then break the lease when the copy completes.

Example: Sample code that acquires a lock (i.e. infinite lease) on source.

// Acquire an infinite lease on the source blob (passing null for the lease time
// requests an infinite lease; leaseId is the proposed lease id)
srcBlob.AcquireLease(null, leaseId);

// Copy using the source blob SAS; the source is protected by the infinite lease acquired above
string cid = destBlob.StartCopyFromBlob(
    new Uri(srcBlob.Uri.AbsoluteUri + blobToken),                    
    null /* source access condition */,
    null /* destination access condition */,
    null /* request options */);
 
How do I prevent someone else from starting a new copy operation to overwrite my successful copy?

During a pending copy, the Blob service ensures that no client requests can write to the destination blob. The copy blob properties are maintained on the blob after a copy is completed (failed/aborted/successful). However, these copy properties are removed when any write command like Put Blob, Put Block List, Set Blob Metadata, or Set Blob Properties is issued on the destination blob. The following operations will, however, retain the copy properties: Lease Blob, Put Page, and Put Block. Hence, a monitoring component that needs to confirm that a copy completed will need these properties to be retained until it verifies the copy. To prevent any writes on the destination blob once the copy is completed, the copy job should acquire an infinite lease on the destination blob and provide that as the destination access condition when starting the copy blob operation. The copy operation only allows infinite leases on the destination blob. This is because the service prevents any writes to the destination blob, and any shorter lease would require the client to issue Renew Lease on the destination blob. Acquiring a lease on the destination blob requires the blob to exist, so the client needs to create an empty blob before the copy operation is issued. To terminate an infinite lease on a destination blob with a pending copy operation, you would have to abort the copy operation before issuing the break request on the lease.
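The following is only an outline of that flow using the 1.7.1 client. destContainer, srcBlob, and blobToken are assumed to come from the earlier samples, and the exact AccessCondition helper used to express the destination lease is an assumption (check the library version you are using for the precise name):

// Create an empty destination blob so that a lease can be acquired on it.
CloudBlockBlob destBlob = destContainer.GetBlockBlobReference(srcBlob.Name);
destBlob.UploadText(string.Empty);

// Acquire an infinite lease on the destination (null lease time = infinite;
// null proposed lease id lets the service generate one).
string destLeaseId = destBlob.AcquireLease(null, null);

// Start the copy, passing the destination lease as the destination access condition.
// NOTE: the AccessCondition helper below is assumed for illustration only.
destBlob.StartCopyFromBlob(
    new Uri(srcBlob.Uri.AbsoluteUri + blobToken),
    null /* source access condition */,
    AccessCondition.GenerateLeaseCondition(destLeaseId) /* destination access condition */,
    null /* request options */);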

Weiping Zhang, Michael Roberson, Jai Haridas, Brad Calder


TechEd 2012: New Windows Azure Storage Features, Improved Manageability, and Lower Prices

We are very excited to release multiple improvements to Windows Azure Storage. These include price reductions, new manageability features, and new service features for Windows Azure Blobs, Tables, and Queues.

Jai Haridas will be presenting these features and more on Windows Azure Storage at Tech Ed 2012, so for more details please attend his talk today or view his talk online in a few days.

New Service Features

We’ve released a new version of the REST API, “2012-02-12”. We’ve updated the Java Storage Client library to reflect the new features. We’ve also released source code for a CTP of our .NET storage client library. This version contains the following new features:

  • Shared Access Signatures (Signed URLs) for Tables and Queues – similar to the Shared Access Signature feature previously available for Blobs, this allows account owners to issue URL access to specific resources such as tables, table ranges, queues, blobs and containers while specifying granular sets of permissions. In addition, there are some smaller improvements to Shared Access Signatures for Blobs. Learn more: Introducing Table SAS (Shared Access Signature), Queue SAS and update to Blob SAS
  • Expanded Blob Copy – For Blobs, we now support copying blobs between storage accounts and copy blob (even within accounts) is performed as an asynchronous operation. This is available in the new version, but will only work if the destination storage account was created on or after June 7, 2012. Of course, Copy Blob operations within the same account will continue to work for all accounts. Learn more: Introducing Asynchronous Cross-Account Copy Blob
  • Improved Blob Leasing – Leasing is now available for blob containers, and allows infinite lease duration. In addition, lease durations between 15-60 seconds are also supported. Changing the lease id (in order to rotate the lease-id across your components) is now supported. Learn more: New Blob Lease Features: Infinite Leases, Smaller Lease Times, and More

Improved Manageability

Users of the Windows Azure Management Portal will benefit from the following improvements in managing their storage accounts. These portal improvements are detailed further in New Storage Features on the Windows Azure Portal post.

  • Introducing Locally Redundant Storage - Storage users are now able to turn off geo-replication by choosing Locally Redundant Storage (LRS). LRS provides highly durable and available storage within a single location (sub region).
  • Choosing Geo Redundant Storage or Locally Redundant Storage – By default storage accounts are configured for Geo Redundant Storage (GRS), meaning that Table and Blob data is replicated both within the primary location and also to a location hundreds of miles away (geo-replication). As detailed in this blog post, using LRS may be preferable in certain scenarios, and is available at a 23-34% discount compared to GRS. The price of GRS remains unchanged. Please note that a one-time bandwidth charge will apply if you choose to re-enable GRS after switching to LRS. You can also learn more about geo-replication in Introducing Geo-replication for Windows Azure Storage.
  • Configuration of Storage Analytics – While our analytics features (metrics and logging) have been available since last summer, configuring them required the user to call the REST API. In the new management portal, users can easily configure these features. To learn more about metrics and logging, see Windows Azure Storage Analytics.
  • Monitoring Storage Metrics – Storage users can now also monitor any desired set of metrics tracked in your account via the management portal.

Pricing

As mentioned above, users can reduce costs by choosing to use Locally Redundant Storage. Furthermore, we are excited to announce that we are reducing the pricing for storage transactions from $0.01 per 10,000 transactions to $0.01 per 100,000, reducing transaction costs by 90%! Learn more: 10x Price Reduction for Windows Azure Storage Transactions.

Summary

We’ve introduced a number of improvements to Windows Azure Storage and we invite you to read about each of them in the referenced blog posts. As always, we welcome your feedback and hope you’ll enjoy these new features!

Jeffrey Irwin and Brad Calder

USENIX Best Paper Award: Erasure Coding in Windows Azure Storage

We just published a paper describing how we Erasure Code data in Windows Azure Storage that won a Best Paper Award at the June 2012 USENIX Annual Technical Conference. This was joint work between Microsoft Research and the Windows Azure Storage team.

The paper can be found here.

Windows Azure Storage is a cloud storage system that provides customers the ability to store seemingly limitless amounts of data, for any duration of time, with high availability and durability. When using Windows Azure Storage, you have access to your data from anywhere, at any time, and only pay for what you use and store.

The internal details of how Windows Azure Storage works are described in our SOSP paper here. One of the areas only briefly touched on in the SOSP paper was the fact that in the background we lazily erasure code data to reduce its storage overhead while keeping your data durable and highly available.

In our USENIX paper we describe how we do erasure coding in Windows Azure Storage. For erasure coding, we introduce a new set of codes we call Local Reconstruction Codes (LRC). LRC reduces the number of erasure coding fragments that need to be read when reconstructing data fragments that are offline, while still keeping the storage overhead low. The important benefits of LRC are that it reduces the bandwidth and I/Os required for reconstruction reads over prior codes, while still allowing a significant reduction in storage overhead. It is optimized to efficiently reconstruct fragments in the face of (a) single fragment failures (e.g., failed disk, node, or rack), (b) when fragments are offline due to an upgrade, or (c) when access to a fragment is slow. In the paper we describe how LRC is used in Windows Azure Storage to provide low overhead durable storage with consistently low read latencies. In addition, we describe our erasure coding implementation and important design decisions that went into it.
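To give a concrete sense of the trade-off: with the LRC (12, 2, 2) configuration discussed in the paper, an extent is divided into 12 data fragments plus 2 local and 2 global parity fragments, a storage overhead of 16/12 ≈ 1.33x. Because each local parity covers only half of the data fragments, a single offline data fragment can be reconstructed by reading 6 fragments instead of the 12 that a comparable Reed-Solomon code would need.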

Brad Calder

Exploring Windows Azure Drives, Disks, and Images

With the preview of Windows Azure Virtual Machines, we have two new special types of blobs stored in Windows Azure Storage: Windows Azure Virtual Machine Disks and Windows Azure Virtual Machine Images. And of course we also have the existing preview of Windows Azure Drives. In the rest of this post, we will refer to these as storage, disks, images, and drives. This post explores what drives, disks, and images are and how they interact with storage.

Virtual Hard Drives (VHDs)

Drives, disks, and images are all VHDs stored as page blobs within your storage account. There are actually several slightly different VHD formats: fixed, dynamic, and differencing. Currently, Windows Azure only supports the format named ‘fixed’. This format lays the logical disk out linearly within the file format, such that disk offset X is stored at blob offset X. At the end of the blob, there is a small footer that describes the properties of the VHD. Everything stored in the page blob adheres to the standard VHD format, so you can take this VHD and mount it on your server on-premises if you choose to. Often, the fixed format wastes space because most disks have large unused ranges in them. However, we store our ‘fixed’ VHDs as page blobs, which are sparse, so we get the benefits of both the ‘fixed’ and ‘dynamic’ formats at the same time.

Uploading VHDs to Windows Azure Storage

You can upload your VHD into your storage account to use it for either PaaS or IaaS. When you are uploading your VHD into storage, you will want to use a tool that understands that page blobs are sparse, and only uploads the portions of the VHD that have actual data in them. Also, if you have dynamic VHDs, you want to use a tool that will convert your dynamic VHD into a fixed VHD as it is doing the upload. CSUpload will do both of these things for you, and it is included as part of the Windows Azure SDK.

Persistence and Durability

Since drives, disks, and images are all stored in storage, your data will be persisted even when your virtual machine has to be moved to another physical machine. This means your data gets to take advantage of the durability offered by the Windows Azure Storage architecture, where all of your non-buffered and flushed writes to the disk/drive are replicated 3 times in storage to make it durable before returning success back to your application.

Drives (PaaS)

Drives are used by the PaaS roles (Worker Role, Web Role, and VM Role) to mount a VHD and assign a drive letter. There are many details about how you use these drives here. Drives are implemented with a kernel-mode driver that runs within your VM, so your disk IO to and from the drive in the VM will cause network IO between the VM and your page blob in Windows Azure Storage. The following diagram shows the driver running inside the VM, communicating with storage through the VM’s virtual network adapter.

[Diagram: Windows Azure Drive driver running inside the VM]

PaaS roles are allowed to mount up to 16 drives per role.

Disks (IaaS)

When you create a Windows Azure Virtual Machine, the platform will attach at least one disk to the VM for your operating system disk. This disk will also be a VHD stored as a page blob in storage. As you write to the disk in the VM, the changes to the disk will be made to the page blob inside storage. You can also attach additional disks to your VM as data disks, and these will be stored in storage as page blobs as well.

Unlike for drives, the code that communicates with storage on behalf of your disk is not within your VM, so doing IO to the disk will not cause network activity in the VM, although it will cause network activity on the physical node. The following diagram shows how the driver runs in the host operating system, and the VM communicates through the disk interface to the driver, which then communicates through the host network adapter to storage.

[Diagram: disk driver running in the host OS, outside the VM]

There are limits to the number of disks a virtual machine can mount, varying from 16 data disks for an extra-large virtual machine, to one data disk for an extra small virtual machine. Details can be found here.

IMPORTANT: The Windows Azure platform holds an infinite lease on all the page blobs that it considers disks in your storage account, so that you don’t accidentally delete the underlying page blob, its container, or the storage account while the VM is using the VHD. If you want to delete the underlying page blob, the container it is within, or the storage account, you will need to detach the disk from the VM first, as shown here:

[Screenshot: detaching a disk from the virtual machine]

And then select the disk you want to detach and then delete:

[Screenshot: selecting the disk to detach and delete]

Then you need to remove the disk from the portal:

[Screenshot: the Disks list in the Virtual Machines section of the portal]

and then you can select ‘delete disk’ from the bottom of the window:

[Screenshot: the ‘delete disk’ command at the bottom of the window]

Note: when you delete the disk in the portal, you are not deleting the underlying VHD page blob in your storage account. You are only disassociating it from the set of disks that can be used by Windows Azure Virtual Machines. After you have done all of the above, you will be able to delete the disk from your storage account using the Windows Azure Storage REST APIs or storage explorers.

Images (IaaS)

Windows Azure uses the concept of an “Image” to describe a template VHD that can be used to create one or more Virtual Machines. Windows Azure and some partners provide images that can be used to create Virtual Machines. You can also create images for yourself by capturing an image of an existing Windows Azure Virtual Machine, or you can upload a sysprep’d image to your storage account. An image is also in the VHD format, but the platform will not write to the image. Instead, when you create a Virtual Machine from an image, the system will create a copy of that image’s page blob in your storage account, and that copy will be used for the Virtual Machine’s operating system disk.

IMPORTANT: Windows Azure holds an infinite lease on any page blob that it considers an image, as well as on its blob container and storage account. Therefore, to delete the underlying page blob, you need to delete the image from the portal by going to the “Virtual Machines” section and clicking on “Images”:

[Screenshot: the Images list in the Virtual Machines section of the portal]

Then you select your image and press “Delete Image” at the bottom of the screen. This will disassociate the VHD from your set of registered images, but it does not delete the page blob from your storage account. At that point, you will be able to delete the image from your storage account.

Temporary Disk

There is another disk present in all web roles, worker roles, VM Roles, and Windows Azure Virtual Machines, called the temporary disk. This is a physical disk on the node that can be used for scratch space. Data on this disk will be lost when the VM is moved to another physical machine, which can happen during upgrades, patches, and when Windows Azure detects something is wrong with the node you are running on. The sizes offered for the temporary disk are defined here.

The temporary disk is the ideal place to store your operating system’s pagefile.

IMPORTANT: The temporary disk is not persistent. You should only write data onto this disk that you are willing to lose at any time.

Billing

Windows Azure Storage charges for Bandwidth, Transactions, and Storage Capacity. The per-unit costs of each can be found here.

Bandwidth

We recommend mounting drives from within the same location (e.g., US East) as the storage account they are stored in, as this offers the best performance, and also will not incur bandwidth charges. With disks, you are required to use them within the same location the disk is stored.

Transactions

When connected to a VM, disk IOs from both drives and disks will be satisfied from storage (unless one of the layers of cache described below can satisfy the request first). Small disk IOs will incur one Windows Azure Storage transaction per IO. Larger disk IOs will be split into smaller IOs, so they will incur more transaction charges. The breakdown is as follows (a small worked example appears after the list):

  • Drives
    • IO < 2 megabytes will be 1 transaction
    • IO >= 2 megabytes will be broken into transactions of 2MBs or smaller
  • Disks
    • IO < 128 kilobytes will be 1 transaction
    • IO >= 128 kilobytes will be broken into transactions of 128KBs or smaller
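As a small worked example of the arithmetic above (the IO sizes are hypothetical):

1 MB write to a drive:   1 MB < 2 MB, so it counts as 1 transaction
4 MB write to a drive:   split into 2 transactions of 2 MB each
1 MB write to a disk:    split into 8 transactions of 128 KB each (1024 KB / 128 KB)
64 KB write to a disk:   64 KB < 128 KB, so it counts as 1 transaction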

In addition, operating systems often perform a little read-ahead for small sequential IOs (typically less than 64 kilobytes), which may result in larger sized IOs to drives/disks than the IO size being issued by the application. If the prefetched data is used, then this can result in fewer transactions to your storage account than the number of IOs issued by your application.

Storage Capacity

Windows Azure Storage stores page blobs, and thus VHDs, in sparse format, and therefore only charges for data within the VHD that has actually been written to during the life of the VHD. Therefore, we recommend using ‘quick format’ because this avoids storing large ranges of zeros within the page blob. When creating a VHD you can choose the quick format option as shown below:

[Screenshot: the quick format option when creating a VHD]

It is also important to note that when you delete files within the file system used by the VHD, most operating systems do not clear or zero these ranges, so you can still be paying capacity charges within a blob for the data that you deleted via a disk/drive.

Caches, Caches, and more Caches

Drives and disks both support on-disk caching and some limited in-memory caching. Many layers of the operating system as well as application libraries do in-memory caching as well. This section highlights some of the caching choices you have as an application developer.

Caching can be used to improve performance, as well as to reduce transaction costs. The following table outlines some of the caches that are available for use with disks and drives. Each is described in more detail below the table.

 

| Cache | Type | Purpose | Data Persistence |
| --- | --- | --- | --- |
| FileStream | Memory | Improves performance and reduces IOs for sequential reads and writes. | Writes are not automatically persisted. Call “Flush” to persist writes. |
| Operating System Caching | Memory | Improves performance and reduces IOs for random and sequential reads and writes. | Writes are not automatically persisted. Use “write through” file handles, or “Flush”, to persist writes. |
| Windows Azure Drive Caches | Memory and Disk | Reduces read transactions to storage. Can improve performance for sequential IO, depending on workload. | Writes are automatically persisted. Use “write through” file handles, or “Flush”, to know writes are persisted. |
| Windows Azure Virtual Machine Disk Caches | Memory and Disk | Reduces transactions to storage. Can improve performance for sequential IO, depending on workload. Improves boot time. | Writes are automatically persisted. Use “write through” file handles, or “Flush”, to know writes are persisted. |
| No Disk or Drive Cache | N/A | Can improve performance for random and sequential IO, depending on workload. | Writes are automatically persisted. Use “write through” file handles, or “Flush”, to know writes are persisted. |

FileStream (applies to both disks and drives)

The .NET Framework’s FileStream class will cache reads and writes in memory to reduce IOs to the disk. Some of the FileStream constructors take a cache size, and others will choose the default 8k cache size for you. You cannot specify that the class use no memory cache, as the minimum cache size is 8 bytes. You can force the buffer to be written to disk by calling the FileStream.Flush(bool) API.
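As a small sketch of this behavior, the Flush(true) overload pushes FileStream’s buffered bytes down to the drive/disk and also asks the operating system to flush its own file buffers (the path below is a hypothetical mounted drive letter):

byte[] record = System.Text.Encoding.UTF8.GetBytes("example record");
using (var fs = new System.IO.FileStream(@"F:\logs\app.log",
    System.IO.FileMode.Append, System.IO.FileAccess.Write))
{
    fs.Write(record, 0, record.Length);
    fs.Flush(true);   // true = also flush intermediate OS file buffers to the device
}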

Operating System Caching (applies to both disks and drives)

The operating system itself will do in-memory buffering for both reads and writes, unless you explicitly turn it off when you open a file using FILE_FLAG_WRITE_THROUGH and/or FILE_FLAG_NO_BUFFERING. An in-depth discussion of the in-memory caching behavior of Windows is available here.
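For example, here is one hedged way to open a write-through handle from .NET; FileOptions.WriteThrough corresponds to FILE_FLAG_WRITE_THROUGH (FileOptions has no named value for FILE_FLAG_NO_BUFFERING), and the path is again a hypothetical drive letter:

byte[] data = System.Text.Encoding.UTF8.GetBytes("durable record");
using (var fs = new System.IO.FileStream(@"F:\logs\app.log",
    System.IO.FileMode.Create, System.IO.FileAccess.Write, System.IO.FileShare.None,
    4096 /* buffer size */, System.IO.FileOptions.WriteThrough))
{
    fs.Write(data, 0, data.Length);
}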

Windows Azure Drive Caches

Drives allow you to choose whether to use the node’s local temporary disk as a read cache, or to use no cache at all. The space for a drive’s cache is allocated from your web role or worker role’s temporary disk. This cache is write-through, so writes are always committed immediately to storage. Reads will be satisfied either from the local disk, or from storage.

Using the drive local cache can improve sequential IO read performance when the reads ‘hit’ the cache. Sequential reads will hit the cache if:

  1. The data has been read before. The data is cached on the first time it is read, not on first write.
  2. The cache is large enough to hold all of the data.

Access to the blob can often deliver a higher rate of random IOs than the local disk. However, these random IOs will incur storage transaction costs. To reduce the number of transactions to storage, you can use the local disk cache for random IOs as well. For best results, ensure that your random writes to the disk are 8KB aligned, and the IO sizes are in multiples of 8KB.

Windows Azure Virtual Machine Disk Caches

When deploying a Virtual Machine, the OS disk has two host caching choices:

  1. Read/Write (Default) – write back cache
  2. Read - write through cache

When you setup a data disk on a virtual machine, you get three host caching choices:

  1. Read/Write – write back cache
  2. Read – write through cache
  3. None (Default)

The type of cache to use for data disks and the OS disk is not currently exposed through the portal. To set the type of host caching, you must either use the Service Management APIs (either Add Data Disk or Update Data Disk) or the Powershell commands (Add-AzureDataDisk or Set-AzureDataDisk).

The read cache is stored both on disk and in memory in the host OS. The write cache is stored in memory in the host OS.

WARNING: If your application does not use FILE_FLAG_WRITE_THROUGH, the write cache could result in data loss because the data could be sitting in the host OS memory waiting to be written when the physical machine crashes unexpectedly.

Using the read cache will improve sequential IO read performance when the reads ‘hit’ the cache. Sequential reads will hit the cache if:

  1. The data has been read before.
  2. The cache is large enough to hold all of the data.

The cache’s size for a disk varies based on instance size and the number of disks mounted. Caching can only be enabled for up to four data disks.

No Caching for Windows Azure Drives and VM Disks

Windows Azure Storage can provide a higher rate of random IOs than the local disk on your node that is used for caching. If your application needs to do lots of random IOs, and throughput is important to you, then you may want to consider not using the above caches. Keep in mind, however, that IOs to Windows Azure Storage do incur transaction costs, while IOs to the local cache do not.

To disable your Windows Azure Drive cache, pass ‘0’ for the cache size when you call the Mount() API.
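For example, a hedged sketch using the CloudDrive API from the Microsoft.WindowsAzure.CloudDrive assembly (the connection string and blob path are placeholders):

// Mount a drive with a 0 MB cache, which disables the local read cache.
CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
CloudDrive drive = account.CreateCloudDrive("drives/data.vhd");
string drivePath = drive.Mount(0 /* cache size in MB */, DriveMountOptions.None);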

For a Virtual Machine data disk the default behavior is to not use the cache. If you have enabled the cache on a data disk, you can disable it using the Update Data Disk service management API, or the Set-AzureDataDisk powershell command.

For a Virtual Machine operating system disk the default behavior is to use the cache. If your application will do lots of random IOs to data files, you may want to consider moving those files to a data disk which has the caching turned off.

 

Andrew Edwards and Brad Calder

(Cross-Post) Windows Azure Storage – 4 Trillion Objects and Counting

Windows Azure Storage has had an amazing year of growth. We have over 4 trillion objects stored, process an average of 270,000 requests per second, and reach peaks of 880,000 requests per second.

About a year ago we hit the 1 trillion object mark. Over the past 12 months, we saw an impressive 4x increase in the number of objects stored, and a 2.7x increase in average requests per second.

The following graph shows the number of stored objects in Windows Azure Storage over the past year. The number of stored objects is counted on the last day of the month shown. The object count is the number of unique user objects stored in Windows Azure Storage, so the counts do not include replicas.

[Graph: number of objects stored in Windows Azure Storage over the past year]

The following graph shows the average and peak requests per second. The average requests per second is the average over the whole month shown, and the peak requests per second is the peak for the month shown.

[Graph: average and peak requests per second, by month, over the past year]

We expect this growth rate to continue, especially since we just lowered the cost of requests to storage by 10x. It now costs $0.01 per 100,000 requests regardless of request type (same cost for puts and gets). This makes object puts and gets to Windows Azure Storage 10x to 100x cheaper than other cloud providers.

In addition, we now offer two types of durability for your storage – Locally Redundant Storage (LRS) and Geo Redundant Storage (GRS). GRS is the default storage that we have always provided, and now we are offering a new type of storage called LRS. LRS is offered at a discount and provides locally redundant storage, where we maintain an equivalent 3 copies of your data within a given location. GRS provides geo-redundant storage, where we maintain an equivalent 6 copies of your data spread across 2 locations at least 400 miles apart from each other (3 copies are kept in each location). This allows you to choose the desired level of durability for your data. And of course, if your data does not require the additional durability of GRS you can use LRS at a 23% to 34% discounted price (depending on how much data is stored). In addition, we also employ a sophisticated erasure coding scheme for storing data that provides higher durability than just storing 3 (for LRS) or 6 (for GRS) copies of your data, while at the same time keeping the storage overhead low, as described in our USENIX paper.

We are also excited about our recent release of Windows Azure Virtual Machines, where the persistent disks are stored as objects (blobs) in Windows Azure Storage. This allows the OS and data disks used by your VMs to leverage the same LRS and GRS durability provided by Windows Azure Storage. With that release we also provided access to Windows Azure Storage via easy to use client libraries for many popular languages (.net, java, node.js, php, and python), as well as REST.

Windows Azure Storage uses a unique approach of storing different object types (Blobs, Disks/Drives, Tables, Queues) in the same store, as described in our SOSP paper. The total number of blobs (disk/drives are stored as blobs), table entities, and queue messages stored account for the 4+ trillion objects in our unified store. By blending different types of objects across the same storage stack, we have a single stack for replicating data to keep it durable, a single stack for automatic load balancing and dealing with failures to keep data available, and we store all of the different types of objects on the same hardware, blending their workloads, to keep prices low. This allows us to have one simple pricing model for all object types (same cost in terms of GB/month, bandwidth, as well as transactions), so customers can focus on choosing the type of object that best fits their needs, instead of being forced to use one type of object over another due to price differences.

We are excited about the growth ahead and continuing to work with customers to provide a quality service. Please let us know if you have any feedback, questions or comments! If you would like to learn more about Windows Azure, click here.

Brad Calder

Introducing Windows Azure Storage Client Library 2.0 for .NET and Windows Runtime

Today we are releasing version 2.0 of the Windows Azure Storage Client Library. This is our largest update to our .NET library to date; it includes new features, broader platform compatibility, and revisions that address the great feedback you’ve given us over time. The code is available on GitHub now. The libraries are also available through NuGet and are included in the Windows Azure SDK for .NET - October 2012; for more information and links, see below. In addition to the .NET 4.0 library, we are also releasing two libraries for Windows Store apps as a Community Technology Preview (CTP). These fully support the Windows Runtime platform and can be used to build modern Windows Store apps for both Windows RT (which supports ARM-based systems) and Windows 8, in any of the languages supported by Windows Store apps (JavaScript, C++, C#, and Visual Basic). This blog post serves as an overview of these libraries and covers some of the implementation details that will be helpful to understand when developing cloud applications in .NET, regardless of platform.

What’s New

We have introduced a number of new features in this release of the Storage Client Library including:

  • Simplicity and Usability - A greatly simplified API surface which will allow developers new to storage to get up and running faster while still providing the extensibility for developers who wish to customize the behavior of their applications beyond the default implementation.
  • New Table Implementation - An entirely new Table Service implementation which provides a simple interface that is optimized for low latency/high performance workloads, as well as providing a more extensible serialization model to allow developers more control over their data.
  • Rich debugging and configuration capabilities – One common piece of feedback we receive is that it’s too difficult to know what happened “under the covers” when making a call to the storage service. How many retries were there? What were the error codes? The OperationContext object provides rich debugging information, real-time status events for parallel and complex actions, and extension points allowing users to customize requests or enable end-to-end client tracing.
  • Windows Runtime Support - A Windows Runtime component with support for developing Windows Store apps using JavaScript, C++, C#, and Visual Basic; as well as a Strong Type Tables Extension library for C++, C#, and Visual Basic
  • Complete Sync and Asynchronous Programming Model (APM) implementation - A complete synchronous API for .NET 4.0. Previous releases of the client implemented synchronous methods by simply surrounding the corresponding APM methods with a ManualResetEvent; this was not ideal, as extra threads remained blocked during execution. In this release all synchronous methods complete work on the thread in which they are called, with the notable exception of the stream implementations available via Cloud[Page|Block]Blob.Open[Read|Write], due to parallelism.
  • Simplified RetryPolicies - Easy and reusable RetryPolicies
  • .NET Client Profile– The library now supports the .NET Client Profile. For more on the .Net Client Profile see here.
  • Streamlined Authentication Model - There is now a single StorageCredentials type that supports Anonymous, Shared Access Signature, and Account and Key authentication schemes
  • Consistent Exception Handling - The library immediately will throw any exception encountered prior to making the request to the server. Any exception that occurs during the execution of the request will subsequently be wrapped inside a single StorageException type that wraps all other exceptions as well as providing rich information regarding the execution of the request.
  • API Clarity - All methods that make requests to the server are clearly marked with the [DoesServiceRequest] attribute
  • Expanded Blob API - Blob DownloadRange allows user to specify a given range of bytes to download rather than rely on a stream implementation
  • Blob download resume - A feature that will issue a subsequent range request(s) to download only the bytes not received in the event of a loss of connectivity
  • Improved MD5 - Simplified MD5 behavior that is consistent across all client APIs
  • Updated Page Blob Implementation - Full Page Blob implementation including read and write streams
  • Cancellation - Support for Asynchronous Cancellation via the ICancellableAsyncResult. Note, this can be used with .NET CancellationTokens via the CancellationToken.Register() method.
  • Timeouts - Separate client and server timeouts which support end to end timeout scenarios
  • Expanded Azure Storage Feature Support – It supports the 2012-02-12 REST API version with implementations for Blob & Container Leases; Blob, Table, and Queue Shared Access Signatures; and Asynchronous Cross-Account Copy Blob (a short usage sketch follows this list)
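To give a feel for the simplified surface area, here is a hedged sketch that puts a few of these pieces together; the account name, key, and blob names are placeholders, and exact member names may differ slightly from this outline:

// Build a client from credentials, attach an OperationContext for end-to-end
// tracing, and download a blob synchronously.
StorageCredentials credentials = new StorageCredentials("myaccount", "<account-key>");
CloudStorageAccount account = new CloudStorageAccount(credentials, useHttps: true);
CloudBlobClient blobClient = account.CreateCloudBlobClient();

CloudBlockBlob blob = blobClient.GetContainerReference("photos")
                                .GetBlockBlobReference("kitten.jpg");

// The client trace id below is logged by Storage Analytics for E2E correlation.
OperationContext context = new OperationContext { ClientRequestID = "trace-42" };

using (var stream = System.IO.File.OpenWrite("kitten.jpg"))
{
    blob.DownloadToStream(stream, null /* access condition */,
        null /* request options */, context);
}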

Design

When designing the new Storage Client for .NET and Windows Runtime, we set up a series of design guidelines to follow throughout the development process. In addition to these guidelines, there are some unique requirements when developing for Windows Runtime, and specifically when projecting into JavaScript, that have driven some key architectural decisions.

For example, our previous RetryPolicy was based on a delegate that the user could configure; however as this cannot be supported on all platforms we have redesigned the RetryPolicy to be a simple and consistent implementation everywhere. This change has also allowed us to simplify the interface in order to address user feedback regarding the complexity of the previous implementation. Now a user who constructs a custom RetryPolicy can re-use that same implementation across platforms.

Windows Runtime

A key driver in this release was expanding platform support, specifically targeting the upcoming releases of Windows 8, Windows RT, and Windows Server 2012. As such, we are releasing the following two Windows Runtime components to support Windows Runtime as Community Technology Preview (CTP):

  • Microsoft.WindowsAzure.Storage.winmd - A fully projectable storage client that supports JavaScript, C++, C#, and VB. This library contains all core objects as well as support for Blobs, Queues, and a base Tables Implementation consumable by JavaScript
  • Microsoft.WindowsAzure.Storage.Table.dll – A table extension library that provides generic query support and strong type entities. This is used by non-JavaScript applications to provide strong type entities as well as reflection based serialization of POCO objects

Breaking Changes

With the introduction of Windows 8, Windows RT, and Windows Server 2012 we needed to broaden the platform support of our current libraries. To meet this requirement we have invested significant effort in reworking the existing Storage Client codebase to broaden platform support, while also delivering new features and significant performance improvements (more details below). One of the primary goals in this version of the client libraries was to maintain a consistent API across platforms so that developer’s knowledge and code could transfer naturally from one platform to another. As such, we have introduced some breaking changes from the previous version of the library to support this common interface. We have also used this opportunity to act on user feedback we have received via the forums and elsewhere regarding both the .Net library as well as the recently released Windows Azure Storage Client Library for Java. For existing users we will be posting an upgrade guide for breaking changes to this blog that describes each change in more detail.

Please note the new client is published under the same NuGet package as previous 1.x releases. As such, please check any existing projects as an automatic upgrade will introduce breaking changes.

Additional Dependencies

The new table implementation depends on three libraries (collectively referred to as ODataLib), which are resolved through the ODataLib (version 5.0.2) packages available through NuGet, not the WCF Data Services installer, which currently contains the 5.0.0 versions. The ODataLib libraries can be downloaded directly or referenced by your code project through NuGet. The specific ODataLib packages are:

http://nuget.org/packages/Microsoft.Data.OData/5.0.2

http://nuget.org/packages/Microsoft.Data.Edm/5.0.2

http://nuget.org/packages/System.Spatial/5.0.2

Namespaces

One particular breaking change of note is that the name of the assembly and root namespace has moved to Microsoft.WindowsAzure.Storage instead of Microsoft.WindowsAzure.StorageClient. In addition to aligning better with other Windows Azure service libraries, this change allows developers to use the legacy 1.x versions of the library and the 2.0 release side by side as they migrate their applications. Additionally, each storage abstraction (Blob, Table, and Queue) has now been moved to its own sub-namespace to provide a more targeted developer experience and cleaner IntelliSense experience. For example, the Blob implementation is located in Microsoft.WindowsAzure.Storage.Blob, and all relevant protocol constructs are located in Microsoft.WindowsAzure.Storage.Blob.Protocol.
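In practice, a blob-centric application built against 2.0 would typically pull in just these namespaces (a small illustrative set, not an exhaustive list):

using Microsoft.WindowsAzure.Storage;        // CloudStorageAccount, StorageException
using Microsoft.WindowsAzure.Storage.Auth;   // StorageCredentials
using Microsoft.WindowsAzure.Storage.Blob;   // CloudBlobClient, CloudBlobContainer, CloudBlockBlob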

Testing, stability, and engaging the open source community

We are committed to providing a rock solid API that is consistent, stable, and reliable. In this release we have made significant progress in increasing test coverage as well as breaking apart large test scenarios into more targeted ones that are more consumable by the public.

Microsoft and Windows Azure are making great efforts to be as open and transparent as possible regarding the client libraries for our services. The source code for all the libraries can be downloaded via GitHub under the Apache 2.0 license. In addition we have provided over 450 new unit tests for the .NET 4.0 library alone. Now users who wish to modify the codebase have a simple and lightweight way to validate their changes. It is also important to note that most of these tests run against the Storage Emulator that ships with the Windows Azure SDK for .NET, allowing users to execute tests without incurring any usage on their storage accounts. We will also be providing a series of higher-level scenarios and how-tos to get users up and running with both simple and advanced topics relating to using Windows Azure Storage.

Summary

We have put a lot of work into providing a truly first class development experience for the .NET community to work with Windows Azure Storage. In addition to the content provided in these blog posts we will continue to release a series of additional blog posts which will target various features and scenarios in more detail, so check back soon. Hopefully you can see your past feedback reflected in this new library. We really do appreciate the feedback we have gotten from the community, so please keep it coming by leaving a comment below or participating on our forums.

Joe Giardino
Serdar Ozler
Justin Yu
Veena Udayabhanu

Windows Azure Storage

Resources

Get the Windows Azure SDK for .Net

Windows Azure Storage Client Library 2.0 Breaking Changes & Migration Guide

The recently released Windows Azure Storage Client Library for .Net includes many new features, expanded platform support, extensibility points, and performance improvements. In developing this version of the library we made some distinct breaks with Storage Client 1.7 and prior in order to support common paradigms across .NET and Windows Runtime applications. Additionally, we have addressed distinct pieces of user feedback from the forums and users we’ve spoken with. We have made great effort to provide a stable platform for clients to develop their applications on and will continue to do so. This blog post serves as a reference point for these changes as well as a migration guide to assist clients in migrating existing applications to the 2.0 release. If you are new to developing applications using the Storage Client in .Net you may want to refer to the overview here to get acquainted with the basic concepts. This blog post will focus on changes and future posts will be introducing the concepts that the Storage Client supports.

Namespaces

The core namespaces of the library have been reworked to provide a more targeted IntelliSense experience, as well as to more closely align with the programming experience provided by other Windows Azure services. The root namespace as well as the assembly name itself have been changed from Microsoft.WindowsAzure.StorageClient to Microsoft.WindowsAzure.Storage. Additionally, each service has been broken out into its own sub-namespace. For example, the blob implementation is located in Microsoft.WindowsAzure.Storage.Blob, and all protocol-relevant constructs are in Microsoft.WindowsAzure.Storage.Blob.Protocol. Note: the Windows Runtime component will not expose the Microsoft.WindowsAzure.Storage.[Blob|Table|Queue].Protocol namespaces, as they contain dependencies on .NET-specific types and are therefore not projectable.

The following is a detailed listing of client accessible namespaces in the assembly.

  • Microsoft.WindowsAzure.Storage – Common types such as CloudStorageAccount and StorageException. Most applications should include this namespace in their using statements.
  • Microsoft.WindowsAzure.Storage.Auth – The StorageCredentials object that is used to encapsulate multiple forms of access (Account & Key, Shared Access Signature, and Anonymous).
  • Microsoft.WindowsAzure.Storage.Auth.Protocol – Authentication handlers that support SharedKey and SharedKeyLite for manual signing of requests
  • Microsoft.WindowsAzure.Storage.Blob – Blob convenience implementation, applications utilizing Windows Azure Blobs should include this namespace in their using statements
    • Microsoft.WindowsAzure.Storage.Blob.Protocol – Blob Protocol layer
  • Microsoft.WindowsAzure.Storage.Queue – Queue convenience implementation, applications utilizing Windows Azure Queues should include this namespace in their using statements
    • Microsoft.WindowsAzure.Storage.Queue.Protocol – Queue Protocol layer
  • Microsoft.WindowsAzure.Storage.Table – New lightweight Table Service implementation based on ODataLib. We will be posting an additional blog that dives into this new Table implementation in greater detail.
    • Microsoft.WindowsAzure.Storage.Table.DataServices – The legacy Table Service implementation based on System.Data.Services.Client. This includes TableServiceContext, CloudTableQuery, etc.
    • Microsoft.WindowsAzure.Storage.Table.Protocol – Table Protocol layer implementation
  • Microsoft.WindowsAzure.Storage.RetryPolicies - Default RetryPolicy implementations (NoRetry, LinearRetry, and ExponentialRetry) as well as the IRetryPolicy interface
  • Microsoft.WindowsAzure.Storage.Shared.Protocol – Analytics objects and core HttpWebRequestFactory

What’s New

  • Added support for the .NET Client Profile, allowing for easier installation of your application on machines where the full .NET Framework has not been installed.
  • There is a new dependency on the three libraries released as OdataLib, which are available via NuGet and CodePlex.
  • A reworked and simplified codebase that shares a large amount of code between platforms
  • Over 450 new unit tests published to GitHub
  • All APIs that execute a request against the storage service are marked with the DoesServiceRequest attribute
  • Support for custom user headers
  • OperationContext – Provides an optional source of diagnostic information about how a given operation is executing. Provides a mechanism for end-to-end tracing by allowing clients to specify a client trace id per logical operation to be logged by the Windows Azure Storage Analytics service.
  • True “synchronous” method support. SDK 1.7 implemented synchronous methods by simply wrapping a corresponding Asynchronous Programming Model (APM) method with a ManualResetEvent. In this release all work is done on the calling thread. This excludes stream implementations available via Cloud[Page|Block]Blob.OpenRead and OpenWrite and parallel uploads.
  • Support for Asynchronous cancellation via ICancellableAsyncResult. Note this can be hooked up to .NET cancellation tokens via the Register() method as illustrated below:

ICancellableAsyncResult result = container.BeginExists(callback, state);

token.Register((o) => result.Cancel(), null /* state */);

  • Timeouts – The library now allows two separate timeouts to be specified. These timeouts can be specified directly on the service client (i.e. CloudBlobClient) or overridden via the RequestOptions. These timeouts are nullable and therefore can be disabled. (A short sketch follows this list.)
    • The ServerTimeout is the timeout given to the server for each request executed in a given logical operation. An operation may make more than one request in the case of a retry, parallel upload, etc.; the ServerTimeout is sent for each of these requests. This is set to 90 seconds by default.
    • The MaximumExecutionTime provides a true end-to-end timeout. This timeout is a client side timeout that spans all requests, including any potential retries, a given operation may execute. This is disabled by default.
  • Full PageBlob support including lease, cross account copy, and read/write streams
  • Cloud[Block|Page]Blob DownloadRange support
  • Blobs support download resume; in the event of an error, the subsequent request will be truncated to specify a range starting at the correct byte offset.
  • The default MD5 behavior has been updated to utilize a FIPS compliant implementation. To use the default .NET MD5 please set CloudStorageAccount.UseV1MD5 = true;
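
As a quick illustration of the timeout settings mentioned in the list above, the hedged sketch below sets both timeouts on the client and then overrides them for a single download. It assumes an existing storageAccount, blob, and outputStream, and the values shown are illustrative only.

// You will need the following using statements
using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

// Client-wide defaults
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
blobClient.ServerTimeout = TimeSpan.FromSeconds(90);          // per-request timeout sent to the server
blobClient.MaximumExecutionTime = TimeSpan.FromSeconds(300);  // end-to-end, spans all requests and retries

// Per-operation override via BlobRequestOptions
BlobRequestOptions options = new BlobRequestOptions()
{
    ServerTimeout = TimeSpan.FromSeconds(30),
    MaximumExecutionTime = TimeSpan.FromSeconds(120)
};
blob.DownloadToStream(outputStream, null /* AccessCondition */, options, null /* OperationContext */);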

Breaking Changes

General

  • Dropped support for .NET Framework 3.5, Clients must use .Net 4.0 or above
  • Cloud[Blob|Table|Queue]Client.ResponseReceived event has been removed, instead there are SendingRequest and ResponseReceived events on the OperationContext which can be passed into each logical operation
  • All classes are sealed by default to maintain consistency with the Windows Runtime library
  • ResultSegments are no longer generic. For example, in Storage Client 1.7 there is a ResultSegment<CloudTable>, while in 2.0 there is a TableResultSegment, to maintain consistency with the Windows Runtime library.
  • RetryPolicies
    • The Storage Client will no longer prefilter certain types of exceptions or HTTP status codes prior to evaluating the user’s RetryPolicy. The RetryPolicies contained in the library will by default not retry 400-class errors, but this can be overridden by implementing your own policy
    • A retry policy is now a class that implements the IRetryPolicy interface. This is to simplify the syntax as well as provide commonality with the Windows Runtime library
  • StorageCredentials
    • CloudStorageAccount.SetConfigurationSettingPublisher has been removed. Instead the members of StorageCredentials are now mutable, allowing users to accomplish similar scenarios in a more streamlined manner by simply mutating the StorageCredentials instance associated with a given client (or clients) via the provided UpdateKey methods.
    • All credentials types have been simplified into a single StorageCredentials object that supports Anonymous requests, Shared Access Signature, and Account and Key authentication.
  • Exceptions
    • StorageClientException and StorageServerException are now simplified into a single Exception type: StorageException. All APIs will throw argument exceptions immediately; once a request is initiated all other exceptions will be wrapped.
    • StorageException no longer directly contains ExtendedErrorInformation. This has been moved inside the RequestResult object which tracks the current state of a given request
  • Pagination has been simplified. A segmented result will simply return up to the maximum number of results specified. If a continuation token is received, it is left to the user to make any subsequent requests to complete a given page size (a sketch of this pattern follows this list).
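
The hedged sketch below shows one way to drive this pagination yourself using the segmented listing APIs; it assumes an existing CloudBlobContainer named container and uses the simple single-token ListBlobsSegmented overload.

// You will need the following using statements
using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

BlobContinuationToken token = null;
do
{
    // Each segment returns up to the service maximum; keep requesting until the token is null
    BlobResultSegment segment = container.ListBlobsSegmented(token);
    foreach (IListBlobItem item in segment.Results)
    {
        Console.WriteLine(item.Uri);
    }
    token = segment.ContinuationToken;
}
while (token != null);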

Blobs

  • All blobs must be accessed via CloudPageBlob or CloudBlockBlob; the CloudBlob base class has been removed. To get a reference to the concrete blob class when the client does not know the type, please see the GetBlobReferenceFromServer methods on CloudBlobClient and CloudBlobContainer
  • In an effort to be more transparent to the application layer, the default parallelism is now set to 1 for blob clients (this can be configured via CloudBlobClient.ParallelOperationThreadCount). In previous releases of the SDK, we observed many users scheduling multiple concurrent blob uploads to more fully exploit the parallelism of the system. However, when each of these operations was internally processing up to 8 simultaneous operations itself, there were some adverse side effects on the system. By setting parallelism to 1 by default, it is now up to the user to opt in to this concurrent behavior.
  • CloudBlobClient.SingleBlobUploadThresholdInBytes can now be set as low as 1 MB.
  • StreamWriteSizeInBytes has been moved to CloudBlockBlob and can now be set as low as 16KB. Please note that the maximum number of blocks a blob can contain is 50,000 meaning that with a block size of 16KB, the maximum blob size that can be stored is 800,000KB or ~ 781 MB.
  • All upload and download methods are now stream based, the FromFile, ByteArray, Text overloads have been removed.
  • The stream implementation available via CloudBlockBlob.OpenWrite will no longer encode MD5 into the block id. Instead the block id is now a sequential block counter appended to a fixed random integer in the format of [Random:8]-[Seq:6].
  • For uploads, if a given stream is not seekable it will be uploaded via the stream implementation, which will result in multiple operations regardless of length. As such, when available, it is considered best practice to pass in seekable streams.
  • MD5 has been simplified, all methods will honor the three MD5 related flags exposed via BlobRequestOptions
    • StoreBlobContentMD5 – Stores the Content MD5 on the Blob on the server (defaults to true for Block Blobs and false for Page Blobs)
    • UseTransactionalMD5 – Will ensure each upload and download provides transactional security via the HTTP Content-MD5 header. Note: When enabled, all Download Range requests must be 4MB or less. (default is disabled, however any time a Content-MD5 is sent by the server the client will validate it unless DisableContentMD5Validation is set)
    • DisableContentMD5Validation – Disables any Content-MD5 validation on downloads. This is needed to download any blobs that may have had their Content-MD5 set incorrectly
  • Cloud[Page|Block]Blob no longer exposes BlobAttributes. Instead the BlobProperties, Metadata, Uri, etc. are exposed on the Cloud[Page|Block]Blob object itself
  • The stream available via Cloud[Page|Block]Blob.OpenRead() does not support multiple Asynchronous reads prior to the first call completing. You must first call EndRead prior to a subsequent call to BeginRead.
  • Protocol
    • All blob Protocol constructs have been moved to the Microsoft.WindowsAzure.Storage.Blob.Protocol namespace. BlobRequest and BlobResponse have been renamed to BlobHttpWebRequestFactory and BlobHttpResponseParsers respectively.
    • Signing Methods have been removed from BlobHttpWebRequestFactory, alternatively use the SharedKeyAuthenticationHandler in the Microsoft.WindowsAzure.Storage.Auth.Protocol namespace

Tables

  • New Table Service Implementation - A new lightweight table implementation is provided in the Microsoft.WindowsAzure.Storage.Table namespace. Note: For backwards compatibility the Microsoft.WindowsAzure.Storage.Table.DataServices.TableServiceEntity was not renamed; however, this entity type is not compatible with Microsoft.WindowsAzure.Storage.Table.TableEntity as it does not implement the ITableEntity interface.
  • DataServices
    • The legacy System.Data.Services.Client based implementation has been migrated to the Microsoft.WindowsAzure.Storage.Table.DataServices namespace.
    • The CloudTableClient.Attach method has been removed. Alternatively, use a new TableServiceContext
    • TableServiceContext will now protect concurrent requests against the same context. To execute concurrent requests please use a separate TableServiceContext per logical operation.
    • TableServiceQueries will no longer rewrite the take count in the URI query string to take smaller amounts of entities based on the legacy pagination construct. Instead, the client side Lazy Enumerable will stop yielding results when the specified take count is reached. This could potentially result in retrieving a larger number of entities from the service for the last page of results. Developers who need a finer grained control over the pagination of their queries should leverage the segmented execution methods provided.
  • Protocol
    • All Table protocol constructs have been moved to the Microsoft.WindowsAzure.Storage.Table.Protocol namespace. TableRequest and TableResponse have been renamed to TableHttpWebRequestFactory and TableHttpResponseParsers respectively.
    • Signing Methods have been removed from TableHttpWebRequestFactory, alternatively use the SharedKeyLiteAuthenticationHandler in the Microsoft.WindowsAzure.Storage.Auth.Protocol namespace

Queues

  • Protocol
    • All Queue protocol constructs have been moved to the Microsoft.WindowsAzure.Storage.Queue.Protocol namespace. QueueRequest and QueueResponse have been renamed to QueueHttpWebRequestFactory and QueueHttpResponseParsers respectively.
    • Signing Methods have been removed from QueueHttpWebRequestFactory, alternatively use the SharedKeyAuthenticationHandler in the Microsoft.WindowsAzure.Storage.Auth.Protocol namespace

Migration Guide

In addition to the detailed steps above, below is a simple migration guide to help clients begin migrating existing applications.

Namespaces

A legacy application will need to update its “using” statements to include:

  • Microsoft.WindowsAzure.Storage
  • If using credentials types directly add a using statement to Microsoft.WindowsAzure.Storage.Auth
  • If you are using a non-default RetryPolicy add a using statement to Microsoft.WindowsAzure.Storage.RetryPolicies
  • For each Storage abstraction add the relevant using statement Microsoft.WindowsAzure.Storage.[Blob|Table|Queue]

Blobs

  • Any code that accesses a blob via CloudBlob will have to be updated to use the concrete types CloudPageBlob and CloudBlockBlob. The listing methods will return the correct object type; alternatively, you may discern the type via FetchAttributes(). To get a reference to the concrete blob class when the client does not know the type, please see the GetBlobReferenceFromServer methods on the CloudBlobClient and CloudBlobContainer objects
  • Be sure to set the desired Parallelism via CloudBlobClient.ParallelOperationThreadCount
  • Any code that may rely on the internal MD5 semantics detailed here should be updated to set the correct MD5 flags via BlobRequestOptions (a short sketch follows this list)
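
The hedged sketch below pulls these migration steps together; it assumes an existing CloudBlobContainer named container, a blob named "myblob", and a source stream named sourceStream, and the flag values chosen are illustrative rather than prescriptive.

// You will need the following using statements
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

// Resolve the concrete blob type when it is not known client side
ICloudBlob blob = container.GetBlobReferenceFromServer("myblob");
if (blob is CloudBlockBlob)
{
    // Block blob specific logic could go here
}

// Opt in to the desired MD5 behavior explicitly
BlobRequestOptions options = new BlobRequestOptions()
{
    StoreBlobContentMD5 = true,
    UseTransactionalMD5 = false,
    DisableContentMD5Validation = false
};
blob.UploadFromStream(sourceStream, null /* AccessCondition */, options, null /* OperationContext */);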

Tables

  • If you are migrating an existing Table application you can choose to re-implement it via the new simplified Table Service implementation, otherwise add a using to the Microsoft.WindowsAzure.Storage.Table.DataServices namespace

DataServiceContext (the base implementation of the TableServiceContext) is not threadsafe; consequently, it has been considered best practice to avoid concurrent requests against a single context, though this was not explicitly prevented. The 2.0 release will now protect against simultaneous operations on a given context. Any code that relies on concurrent requests against the same TableServiceContext should be updated to execute serially, or to utilize multiple contexts.

Summary

This blog post serves as a guide to the changes introduced by the 2.0 release of the Windows Azure Storage Client libraries.

We very much appreciate all the feedback we have gotten from customers and through the forums, please keep it coming. Feel free to leave comments below,

Joe Giardino
Serdar Ozler
Veena Udayabhanu
Justin Yu

Windows Azure Storage

Resources

Get the Windows Azure SDK for .Net

Windows Azure Storage Emulator 1.8

In our continuous endeavor to enrich the development experience, we are extremely pleased to announce the new Storage Emulator, which has much improved parity with the Windows Azure Storage cloud service.

What is Storage Emulator?

Storage Emulator emulates the Windows Azure Storage blob, table and queue cloud services on the local machine, which helps developers get started and do basic testing of their storage applications locally without incurring the cost associated with the cloud service. This version of the Windows Azure Storage emulator supports Blobs, Tables and Queues up to REST version 2012-02-12.

How does it work?

(Diagram: Storage Emulator architecture)

Storage Emulator exposes different http endpoints (port numbers: 10000 for the blob, 10001 for the queue and 10002 for the table service) on localhost to receive and serve storage requests. Upon receiving a request, the emulator validates the request for correctness, authenticates it, authorizes it (if necessary), works with the data in SQL tables and the file system, and finally sends a response to the client.
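
As a quick, hedged illustration of how these endpoints surface through the .NET client library, the well-known development storage account maps to the ports listed above, with the account name carried in the URI path rather than the host name:

// You will need the following using statements
using System;
using Microsoft.WindowsAzure.Storage;

CloudStorageAccount devAccount = CloudStorageAccount.DevelopmentStorageAccount;
Console.WriteLine(devAccount.BlobEndpoint);   // http://127.0.0.1:10000/devstoreaccount1
Console.WriteLine(devAccount.QueueEndpoint);  // http://127.0.0.1:10001/devstoreaccount1
Console.WriteLine(devAccount.TableEndpoint);  // http://127.0.0.1:10002/devstoreaccount1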

Delving deeper into the internals, the Storage Emulator stores the data associated with queues and tables in SQL tables. For blobs, however, it stores the metadata in SQL tables and the actual data on the local disk, one file for each blob, for better performance. When deleting blobs, the Storage Emulator does not synchronously clean up unreferenced blob data on disk while performing blob operations; instead it compacts and garbage collects such data in the background for better scalability and concurrency.

Storage Emulator Dependencies:

  • SQL Express or LocalDB
  • .NET 4.0 or later with SQL Express or .NET 4.0.2 or later with LocalDB

Installing Storage Emulator

Storage Emulator can work with LocalDB, SQL Express, or even a full-blown SQL Server as its SQL store.

The following steps will help you get started with the emulator using LocalDB.

  1. Install .NET framework 4.5 from here.
  2. Install X64 or X86 LocalDB from here.
  3. Install the Windows Azure Emulator from here.

Alternatively, if you have storage emulator 1.7 installed, you can do an in-place update to the existing emulator. Please note that storage emulator 1.8 uses a new SQL schema, and hence a DB reset is required when doing an in-place update, which will result in the loss of your existing data.

The following steps describe how to perform an in-place update.

  1. Shutdown the storage emulator, if running
  2. Replace the binaries ‘Microsoft.WindowsAzure.DevelopmentStorage.Services.dll’, ‘Microsoft.WindowsAzure.DevelopmentStorage.Store.dll’ and ‘Microsoft.WindowsAzure.DevelopmentStorage.Storev4.0.2.dll’, located at storage emulator installation path (Default path is "%systemdrive%\Program Files\Microsoft SDKs\Windows Azure\Emulator\devstore") with those available here.
  3. Open up the command prompt in admin mode and run ‘dsinit /forceCreate’ to recreate the DB. You can find the ‘dsinit’ tool at the storage emulator installation path.
  4. Start the storage emulator

What’s new in 1.8?

Storage emulator 1.8 supports the REST version 2012-02-12, along with earlier versions. Below are the service specific enhancements.

Blob Service Enhancements:

In the 2012-02-12 REST version, the Windows Azure Storage cloud service introduced support for container leases, improved blob leases, and asynchronous copy blob across different storage accounts, along with enhancements to blob shared access signatures. All of those new features are supported in Storage Emulator 1.8.

Since the emulator has just one built-in account, one can initiate a cross-account copy blob by providing a valid cloud-based URL. The emulator serves such cross-account copy blob requests asynchronously, by downloading the blob data in chunks of 4MB and updating the copy status.

To know more about the new features in general, the following links would be helpful:

Storage Emulator 1.8 also garbage collects the unreferenced page blob files which may be produced as a result of delete blob requests, failed copy blob requests etc.

Queue Service Enhancements:

In 2012-02-12 REST version, Windows Azure Storage cloud service introduced support for Queue shared access signatures (SAS). Storage Emulator 1.8 supports Queue SAS.

Table Service Enhancements:

In 2012-02-12 REST version, Windows Azure Storage cloud service introduced support for table shared access signatures (SAS). Storage Emulator 1.8 supports Table SAS.

In order to achieve full parity with the Windows Azure Storage table service APIs, the table service in the emulator has been completely rewritten from scratch to support truly schema-less tables and to expose data for querying and updating using the OData protocol. As a result, Storage Emulator 1.8 fully supports the table operations below, which were not supported in Emulator 1.7.

  • Query Projection: You can read more about it here.
  • Upsert operations: You can read more about it here.

Known Issues/Limitations

  • The storage emulator supports only a single fixed account and a well-known authentication key. They are: Account name: devstoreaccount1, Account key: Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==
  • The URI scheme supported by the storage emulator differs from the URI scheme supported by the cloud storage services. The development URI scheme specifies the account name as part of the hierarchical path of the URI, rather than as part of the domain name. This difference is due to the fact that domain name resolution is available in the cloud but not on the local computer. For more information about URI differences in the development and production environments, see “Using storage service URIs” section in Overview of Running a Windows Azure Application with the Storage Emulator.
  • The storage emulator does not support Set Blob Service Properties or SetServiceProperties for blob, queue and table services.
  • Date properties in the Table service in the storage emulator support only the range supported by SQL Server 2005 (For example, they are required to be later than January 1, 1753). All dates before January 1, 1753 are changed to this value. The precision of dates is limited to the precision of SQL Server 2005, meaning that dates are precise to 1/300th of a second.
  • The storage emulator supports partition key and row key property values of less than 900 bytes. The total size of the account name, table name, and key property names together cannot exceed 900 bytes.
  • The storage emulator does not validate that the size of a batch in an entity group transaction is less than 4 MB. Batches are limited to 4 MB in Windows Azure, so you must ensure that a batch does not exceed this size before transitioning to the Windows Azure storage services.
  • Avoid using a PartitionKey or RowKey that contains the ‘%’ character, due to a double decoding bug
  • Get Messages from a queue might not return messages in strictly increasing order of the message’s ‘Insertion TimeStamp’ + ‘visibilitytimeout’

Summary

Storage Emulator 1.8 has achieved a great degree of parity with the Windows Azure Storage cloud service in terms of API support and usability, and we will continue to improve it. We hope you all like it; please share your feedback with us to make it better.

Nagarjun Guraja

Windows Azure Storage


Known Issues for Windows Azure Storage Client Library 2.0 for .NET and Windows Runtime

We recently released the 2.0 version of the Windows Azure Storage Client Library. This is our largest update to our .NET library to date which includes new features, broader platform compatibility, and revisions to address the great feedback you’ve given us over time. For more about this release see here. For information regarding breaking changes see here.

This SDK 2.0 release contains a few known issues that will be addressed in the next release of the libraries and are detailed below.

Known Issues

Service Client Retry Policy does not support null

The Cloud[Blob|Queue|Table]Client.RetryPolicy does not support null; if you wish to disable retries, simply use RetryPolicies.NoRetry (client.RetryPolicy = new NoRetry()).

CloudStorageAccount.Parse cannot parse DevelopmentStorageAccount strings if a proxy is not specified.

CloudStorageAccount.Parse() and TryParse() do not support DevelopmentStorageAccount strings if a proxy is not specified. CloudStorageAccount.DevelopmentStorageAccount.ToString() will serialize to the string “UseDevelopmentStorage=true”, which illustrates this particular issue: passing this string into CloudStorageAccount.Parse() or TryParse() will throw a KeyNotFoundException.

The example below illustrates this issue:

// Will serialize to "UseDevelopmentStorage=true"CloudStorageAccount myAccount = CloudStorageAccount.DevelopmentStorageAccount;
CloudStorageAccount.Parse(myAccount.ToString());  // Will Throw KeyNotFoundException

To work around this issue you may specify a proxy Uri as below:

// Will serialize to "UseDevelopmentStorage=true;DevelopmentStorageProxyUri=http://ipv4.fiddler"CloudStorageAccount myAccount = CloudStorageAccount.GetDevelopmentStorageAccount(new Uri("http://ipv4.fiddler"));
CloudStorageAccount.Parse(myAccount.ToString());

 

Summary

We continue to work hard on delivering a first-class development experience for the .Net community working with Windows Azure Storage. We will address these issues in upcoming releases of the SDK and on GitHub.

Joe Giardino
Windows Azure Storage

Resources

Get the Windows Azure SDK for .Net

(Cross-Post) Windows Azure’s Flat Network Storage and 2012 Scalability Targets

Earlier this year, we deployed a flat network for Windows Azure across all of our datacenters to create Flat Network Storage (FNS) for Windows Azure Storage. We used a flat network design in order to provide very high bandwidth network connectivity for storage clients. This new network design and the resulting bandwidth improvements allow us to support Windows Azure Virtual Machines, where we store VM persistent disks as durable network attached blobs in Windows Azure Storage. Additionally, the new network design enables scenarios such as MapReduce and HPC that can require significant bandwidth between compute and storage.

From the start of Windows Azure, we decided to separate customer VM-based computation from storage, allowing each of them to scale independently, making it easier to provide multi-tenancy, and making it easier to provide isolation. To make this work for the scenarios we need to address, a quantum leap in network scale and throughput was required. This resulted in FNS, where the Windows Azure Networking team (under Albert Greenberg) along with the Windows Azure Storage, Fabric and OS teams made and deployed several hardware and software networking improvements.

The changes to new storage hardware and to a high bandwidth network comprise the significant improvements in our second generation storage (Gen 2), when compared to our first generation (Gen 1) hardware, as outlined below:

Storage SKU | Storage Node Network Speed | Networking Between Compute and Storage | Load Balancer          | Storage Device Used for Journaling
Gen 1       | 1 Gbps                     | Hierarchical Network                   | Hardware Load Balancer | Hard Drives
Gen 2       | 10 Gbps                    | Flat Network                           | Software Load Balancer | SSDs

 

The deployment of our Gen 2 SKU, along with software improvements, provides significant bandwidth between compute and storage using a flat network topology. The specific implementation of our flat network for Windows Azure is referred to as the “Quantum 10” (Q10) network architecture. Q10 provides a fully non-blocking, fully meshed 10 Gbps network, providing an aggregate backplane in excess of 50 Tbps of bandwidth for each Windows Azure datacenter. Another major improvement in reliability and throughput is the move from a hardware load balancer to a software load balancer. Finally, the storage architecture and design described here has been tuned to fully leverage the new Q10 network to provide flat network storage for Windows Azure Storage.

With these improvements, we are pleased to announce an increase in the scalability targets for Windows Azure Storage, where all new storage accounts are created on the Gen 2 hardware SKU. These new scalability targets apply to all storage accounts created after June 7th, 2012. Storage accounts created before this date have the prior scalability targets described here. Unfortunately, we do not offer the ability to migrate storage accounts, so only storage accounts created after June 7th, 2012 have these new scalability targets.

To find out the creation date of your storage account, you can go to the new portal, click on the storage account, and see the creation date on the right in the quick glance section as shown below:

(Screenshot: storage account creation date in the portal’s quick glance section)

Storage Account Scalability Targets

By the end of 2012, we will have finished rolling out the software improvements for our flat network design. This will provide the following scalability targets for a single storage account created after June 7th 2012.

  • Capacity – Up to 200 TB
  • Transactions – Up to 20,000 entities/messages/blobs per second
  • Bandwidth for a Geo Redundant storage account
    • Ingress - up to 5 gigabits per second
    • Egress - up to 10 gigabits per second
  • Bandwidth for a Locally Redundant storage account
    • Ingress - up to 10 gigabits per second
    • Egress - up to 15 gigabits per second

Storage accounts have geo-replication on by default to provide what we call Geo Redundant Storage. Customers can turn geo-replication off to use what we call Locally Redundant Storage, which results in a discounted price relative to Geo Redundant Storage and higher ingress and egress targets (by end of 2012) as described above. For more information on Geo Redundant Storage and Locally Redundant Storage, please see here.

Note, the actual transaction and bandwidth targets achieved by your storage account will very much depend upon the size of objects, access patterns, and the type of workload your application exhibits. To go above these targets, a service should be built to use multiple storage accounts, and partition the blob containers, tables, queues and objects across those storage accounts. By default, a single Windows Azure subscription gets 20 storage accounts. However, you can contact customer support to get more storage accounts if you need to store more data than that (e.g., petabytes).

Partition Scalability Targets

Within a storage account, all of the objects are grouped into partitions as described here. Therefore, it is important to understand the performance targets of a single partition for our storage abstractions, which are (the below Queue and Table throughputs were achieved using an object size of 1KB):

  • Single Queue – all of the messages in a queue are accessed via a single queue partition. A single queue is targeted to be able to process:
    • Up to 2,000 messages per second
  • Single Table Partition – a table partition consists of all of the entities in a table with the same partition key value, and usually tables have many partitions. The throughput target for a single table partition is:
    • Up to 2,000 entities per second
    • Note, this is for a single partition, and not a single table. Therefore, a table with good partitioning can process up to 20,000 entities/second, which is the overall account target described above.
  • Single Blob – the partition key for blobs is the “container name + blob name”, therefore we can partition blobs down to a single blob per partition to spread out blob access across our servers. The target throughput of a single blob is:
    • Up to 60 MBytes/sec

The above throughputs are the high end targets. What can be achieved by your application very much depends upon the size of the objects being accessed, the operation types (workload) and the access patterns. We encourage all services to test the performance at the partition level for their workload.

When your application reaches the limit of what a partition can handle for your workload, it will start to receive “503 Server Busy” or “500 Operation Timeout” responses. When this occurs, the application should use exponential backoff for retries. The exponential backoff allows the load on the partition to decrease, and eases out spikes in traffic to that partition.
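
As a hedged sketch of this practice using the 2.0 .NET Storage Client Library, an exponential backoff retry policy can be configured on the service client; the 4 second delta and 5 attempts below are illustrative values, not a recommendation, and storageAccount is assumed to exist.

// You will need the following using statements
using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.RetryPolicies;

CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();

// Retries back off exponentially, easing the pressure on a busy partition
blobClient.RetryPolicy = new ExponentialRetry(TimeSpan.FromSeconds(4), 5);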

In summary, we are excited to announce our first step towards providing flat network storage. We plan to continue to invest in improving bandwidth between compute and storage as well as increase the scalability targets of storage accounts and partitions over time.

Brad Calder and Aaron Ogus
Windows Azure Storage

Windows Azure Storage Client Library for Windows Runtime

We are excited to be releasing two libraries for Windows 8 and Windows Server 2012 as a Community Technology Preview (CTP) that fully support the Windows Runtime platform. These libraries can be used to build Windows Store applications on both ARM and x86 in any of the supported languages (JavaScript, C++, C#, and VB). We have invested a large amount of effort into ensuring Windows developers can leverage Windows Azure Storage in an easy and consistent manner via the language of their choice. This blog post serves as an overview of these libraries and highlights a few key differences when developing for the Windows Runtime.

To download the Storage Client libraries for Windows 8 click here.

Tables

The Windows Runtime component contains a new lightweight table service implementation based on OdataLib. One key departure from the desktop client library is that this release does not contain a table implementation based on DataServiceContext (System.Data.Services.Client.dll), as this is not projectable to the various languages Windows Runtime supports. To support JavaScript, the table implementation that ships in the core component is limited to a single IDictionary based entity type (DynamicTableEntity) for persisting and querying entities. For languages other than JavaScript we have released a table extension library (Microsoft.WindowsAzure.Storage.Table.dll) that provides support for strong entity types (POCOs), generic queries, reflection based serialization, and the EntityResolver. This assembly is consumable by C++, C#, and VB applications. (Note: For C# developers, the current release does not expose an IQueryable implementation to allow users to create queries via LINQ similar to the legacy 1.x storage clients. As an alternative, lightweight helpers are provided via static methods on the TableQuery class to assist in the construction of query strings.)

Key differences between Windows Runtime and the .NET 4.0 library

The Windows Runtime library shares some common code with the .NET equivalent; however there are some key differences necessary to fully support the Windows Runtime platform. These differences are discussed below.

Windows Runtime Types

The Windows Runtime public API only exposes Windows Runtime types which are consumable by all supported languages including JavaScript. If you are migrating an application from the .NET client you will need to use the Windows Runtime equivalent types. .NET 4.5 exposes some helpful extension methods to convert between .NET types and Windows Runtime types, for example System.IO.Stream.As[Input|Output]Stream() and Windows.Foundation.IAsync[Action|Operation].AsTask().

Additionally, due to the restrictions for Windows Runtime components, all protocol layer constructs can no longer be exposed. As such, the .Net protocol layer implementation is not available when developing Windows Runtime applications.

Async / Await Pattern

The Windows Runtime library utilizes the Async / Await pattern and does not expose any synchronous methods or APM equivalents. This is in keeping with best practice for developing modern Windows Runtime applications. The APIs exposed in the library are bound by the network and connectivity, as such leveraging this pattern is essential in keeping your applications fast and fluid. For .NET developers that may wish to use the Task Parallel Library (TPL), you may use the extension methods mentioned above to convert the exposed IAsyncActions and IAsyncOperations to Task objects.
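
As a brief, hedged sketch of this conversion (assuming an existing CloudTable named myTable, as in the cancellation sample below):

// You will need the following using statement
using System.Threading.Tasks;

// Convert the IAsyncOperation returned by CreateAsync into a Task so it can be awaited
// alongside other TPL work
Task<bool> createTask = myTable.CreateAsync().AsTask();
bool created = await createTask;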

Cancellation

All APIs support cancellation via the IAsyncInfo.Cancel() method. To leverage cancellation tokens in .NET you may either create a Task object via the extension methods provided as shown below:

CancellationToken myToken = new CancellationToken();
Task<bool> createTask = myTable.CreateAsync().AsTask(myToken);

Or you may hook up your CancellationToken via CancellationToken.Register():

CancellationToken myToken = new CancellationToken();
IAsyncOperation<bool> createOperation = myTable.CreateAsync();
myToken.Register(createOperation.Cancel);

Exceptions

Windows Runtime components cannot expose strongly typed exception objects; instead a Windows Runtime application must rely on the HResult of the exception that it receives. (Note: There are some instances where the exception message is populated; however, developers are encouraged not to depend on this as it is not available in all situations.) To simplify this paradigm for our users we have provided an optional feature available via OperationContext that allows users to still retrieve rich exception information in a plain POCO object. By calling an overload and passing in an OperationContext object, clients can analyze status and exception information in a language agnostic way.

Here is an example of how to retrieve additional exception information via the OperationContext:

// You will need the following usings
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob
…

// Create the BlobClient from the CloudStorageAccount
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();

// Get a reference to a container that does not exist
CloudBlobContainer container = blobClient.GetContainerReference("containerthatdoesntexist");

// Create an OperationContext to track the request
OperationContext deleteContainerContext = new OperationContext();
int? exceptionResult = null;

try
{
    // This will generate a resource not found exception
    await container.DeleteAsync(null /* AccessCondition */, null /* BlobRequestOptions */, deleteContainerContext);
}
catch (Exception e)
{
    exceptionResult = e.HResult;
}

if (exceptionResult.HasValue)
{
    // Display a detailed error to the user. Alternatively, you could inspect the extended error information returned by the server to provide some recovery logic here.
    MessageDialog errorDialog = new MessageDialog(string.Format("{0} : {1}", deleteContainerContext.LastResult.HttpStatusMessage, deleteContainerContext.LastResult.ServiceRequestID));
    await errorDialog.ShowAsync();
}

This sample will result in the following error being displayed to the user:

(Screenshot: error dialog showing the HTTP status message and service request ID)

Authentication

A key feature that Windows Azure Storage provides is fine grained Shared Access Signature permissions that allow users to write mobile and distributed applications that directly interact with their storage in a controlled manner without exposing their private key information. For example, as a mobile game developer I can design my client application to directly query a user’s achievements from a given table, but limit the user to read access over a strict PartitionKey / RowKey range. Additionally, I can enable Logging on my storage account to monitor usage and behavior of my applications. By alleviating the need to provide a middle tier service I can not only provide lower latencies for the client application, but also dramatically reduce cost. For more information regarding Shared Access Signatures, please see this blog post.

The StorageCredentials object supports Anonymous requests, Shared Access Signature, and Account and Key authentication via the SharedKey and SharedKeyLite schemes. When developing mobile applications that leverage Windows Azure Storage there are a few best practices that should be followed, which are detailed below.

  • Support for Account and Key is included in the library for completeness and to support scenarios where an application user may provide their own credentials (for example a storage explorer application). However, users should never distribute account and key information as part of a mobile application as it is no longer under their control. If a key is inadvertently leaked you can rotate your storage keys via the Windows Azure Portal.
  • It is considered best practice to utilize Shared Access Signature authentication in a mobile application. Shared Access Signatures should be tied to an Access Policy on the Container, Table, or Queue, which allows them to be revoked in case they are abused or leaked (a short sketch follows this list).
  • The Windows Push Notification service can be used to distribute Shared Access Signature tokens to all application users or a given subset. This provides a convenient mechanism to update tokens in the event of a compromised token.
  • When using Shared Access Signatures outside a secured network HTTPS should be used to avoid leaking the Shared Access Signature token.
  • When designing your application keep in mind that a given Container, Table, or Queue can support 5 Shared Access Policies (Signed Identifiers) which may dictate the granularity of permissions that you can issue via SAS.
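
The hedged sketch below shows one way to tie a Shared Access Signature to a stored access policy. It is shown with the desktop .NET client for brevity (the Windows Runtime library exposes asynchronous equivalents of these calls), and the policy name, permissions, and expiry are illustrative assumptions.

// You will need the following using statements
using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

// Define a stored access policy named "gamer-read" on the container
BlobContainerPermissions permissions = new BlobContainerPermissions();
permissions.SharedAccessPolicies.Add("gamer-read", new SharedAccessBlobPolicy()
{
    Permissions = SharedAccessBlobPermissions.Read,
    SharedAccessExpiryTime = DateTimeOffset.UtcNow.AddHours(24)
});
container.SetPermissions(permissions);

// Generate a SAS token tied to that policy; revoking or editing the policy revokes the token
string sasToken = container.GetSharedAccessSignature(null /* policy */, "gamer-read");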

Known Issues

Because of the differences in escaped character handling in Uris, using some special characters in blob names might cause authentication failures. These characters are listed below:

[ ] \

In the previous versions of Storage Client Library, backslash (‘\’) was converted to forward-slash (‘/’) automatically. Hence, if you were using backslash, please replace it with a forward-slash to work around the issue, since the current client library no longer automatically does that conversion.

Similarly, the [ and ] characters cause authentication failures when used in partition and/or row keys in tables. However, batch operations can be used if your partition and/or row keys need to use those characters, so use batch operations to work around this issue.

Summary

This blog post has served as an overview of the Windows Azure Storage Client libraries for Windows Runtime and has highlighted some significant points that differ from the desktop implementation. We are committed to providing a great development experience for Windows Runtime developers that fully exploits the Windows Runtime platform. We welcome any feedback that you may have, feel free to leave comments below,

Joe Giardino
Serdar Ozler
Veena Udayabhanu
Justin Yu

Windows Azure Storage

Resources

Get the Windows Azure SDK for .Net

 

    Windows Azure Storage Client Library 2.0 Tables Deep Dive

    This blog post serves as an overview to the recently released Windows Azure Storage Client for .Net and the Windows Runtime. In addition to the legacy implementation shipped in versions 1.x that is based on DataServiceContext, we have also provided a more streamlined implementation that is optimized for common NoSQL scenarios.

    Note, if you are migrating an existing application from a previous release of the SDK please see the overview and migration guide posts.

    New Table implementation

    The new table implementation is provided in the Microsoft.WindowsAzure.Storage.Table namespace. There are three key areas we emphasized in the design of the new table implementation: usability, extensibility, and performance. The basic scenarios are simple and “just work”; in addition, we have also provided two distinct extension points to allow developers to customize the client behaviors to their specific scenario. We have also maintained a degree of consistency with the other storage clients (Blob and Queue) so that moving between them feels seamless.

    With the addition of this new implementation users have effectively three different patterns to choose from when programing against table storage. A high level summary of each and a brief description of the benefits they offer are provided below.

    • Table Service Layer via TableEntity – This approach offers significant performance and latency improvements over WCF Data Services, but still offers the ability to define POCO objects in a similar fashion without having to write serialization / deserialization logic. The optional EntityResolver provides the ability to easily work with heterogeneous entity types returned via queries without any additional client objects or overhead, and users can optionally customize the serialization behavior of their entities by overriding the ReadEntity or WriteEntity methods. The Table Service layer does not currently expose an IQueryable, meaning that queries need to be manually constructed (helper functions are exposed via static methods on the TableQuery class, see below for more). For an example see the NoSQL scenario below.
    • Table Service Layer via DynamicTableEntity – This approach is provided to allow users direct access to a dictionary of key-value pairs. This is particularly useful for more advanced scenarios such as defining entities whose property names are dictated at runtime, entities with a large number of properties, server side projections, and bulk updates of heterogeneous data. Since DynamicTableEntity implements the ITableEntity interface, all results, including projections, can be persisted back to the server. For an example see the Heterogeneous Update scenario below.
    • WCF Data Services – Similar to the legacy 1.7x implementation, this approach exposes an IQueryable allowing users to construct complex queries via LINQ. This approach is recommended for users with existing code assets as well as non-latency-sensitive queries, as it utilizes greater system resources. The WCF Data Services based implementation has been migrated to the Microsoft.WindowsAzure.Storage.Table.DataServices namespace. For additional details see the DataServices section below.
      Note, a similar table implementation using WCF Data Services is not provided in the recently released Windows 8 library due to limitations when projecting to various supported languages.
               

    Dependencies

    The new table implementation utilizes the OdataLib components to provide the over-the-wire protocol implementation. These libraries are available via NuGet (see the Resources section below). Additionally, to maintain compatibility with previous versions of the SDK, the client library has a dependency on System.Data.Services.Client.dll, which is part of the .Net platform. Please note that the current WCF Data Services standalone installer contains version 5.0.0 assemblies; referencing these assemblies will result in a runtime failure.

    You can resolve these dependencies as shown below

    NuGet

    To install Windows Azure Storage, run the following command in the Package Manager Console.

    PM> Install-Package WindowsAzure.Storage

    This will automatically resolve any needed dependencies and add them to your project.

    Windows Azure SDK for .NET - October 2012 release

    • Install the SDK (http://www.windowsazure.com/en-us/develop/net/ click on the “install the SDK” button)
    • Create a project and add a reference to %Program Files%\Microsoft SDKs\Windows Azure\.NET SDK\2012-10\ref\Microsoft.WindowsAzure.Storage.dll
    • In Visual Studio go to Tools > Library Package Manager > Package Manager Console and execute the following command.
               

    PM> Install-Package Microsoft.Data.OData -Version 5.0.2

     

    Performance

    The new table implementation has shown significant performance improvements over the updated DataServices implementation and the previous versions of the SDK. Depending on the operation, latencies have improved by 25% to 75%, while system resource utilization has also decreased significantly. Queries alone are over twice as fast and consume far less memory. We will have more details in a future performance blog.

    Object Model

    A diagram of the table object model is provided below. The core flow of the client is that a user defines an action (TableOperation, TableBatchOperation, or TableQuery) over entities in the Table service and executes these actions via the CloudTableClient. For usability, these classes provide static factory methods to assist in the definition of actions.

    For example, the code below inserts a single entity:

    CloudTable table = tableClient.GetTableReference([TableName]);
    table.Execute(TableOperation.Insert(entity));

    (Diagram: Table client object model)

    Execution
    CloudTableClient

    Similar to the other Azure Storage clients, the table client provides a logical service client, CloudTableClient, which is responsible for service wide operations and enables execution of other operations. The CloudTableClient class can update the Storage Analytics settings for the Table service, list all tables in the account, and can create references to client side CloudTable objects, among other operations.

    CloudTable

    A CloudTable object is used to perform operations directly on a given table (Create, Delete, SetPermissions, etc.), and is also used to execute entity operations against the given table.

    TableRequestOptions

    The TableRequestOptions class defines additional parameters which govern how a given operation is executed, specifically the timeouts and RetryPolicy that are applied to each request. The CloudTableClient provides default timeouts and RetryPolicy settings; TableRequestOptions can override them for a particular operation.
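
    As a hedged sketch of such an override (assuming an existing CloudTable named table and a TableOperation named insertOperation; the timeout and retry values are illustrative only):

    // You will need the following using statements
    using System;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.RetryPolicies;
    using Microsoft.WindowsAzure.Storage.Table;

    // Override the client-wide defaults for this one operation
    TableRequestOptions options = new TableRequestOptions()
    {
        ServerTimeout = TimeSpan.FromSeconds(30),
        RetryPolicy = new LinearRetry(TimeSpan.FromSeconds(2), 3)
    };

    table.Execute(insertOperation, options, null /* OperationContext */);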

    TableResult

    The TableResult class encapsulates the result of a single TableOperation. This object includes the HTTP status code, the ETag and a weak typed reference to the associated entity. For TableBatchOperations, the CloudTable.ExecuteBatch method will return a collection of TableResults whose order corresponds with the order of the TableBatchOperation. For example, the first element returned in the resulting collection will correspond to the first operation defined in the TableBatchOperation.

    Actions
    TableOperation

    The TableOperation class encapsulates a single operation to be performed against a table. Static factory methods are provided to create a TableOperation that will perform an Insert, Delete, Merge, Replace, Retrieve, InsertOrReplace, or InsertOrMerge operation on the given entity. TableOperations can be reused so long as the associated entity is updated. As an example, a client wishing to use table storage as a heartbeat mechanism could define a merge operation on an entity and execute it periodically to update the entity’s state on the server.

    Sample – Inserting an Entity into a Table

    // You will need the following using statements
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Table;
    
    // Create the table client.
    CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
    CloudTable peopleTable = tableClient.GetTableReference("people");
    peopleTable.CreateIfNotExists();
    
    // Create a new customer entity.
    CustomerEntity customer1 = new CustomerEntity("Harp", "Walter");
    customer1.Email = "Walter@contoso.com";
    customer1.PhoneNumber = "425-555-0101";
    
    // Create an operation to add the new customer to the people table.
    TableOperation insertCustomer1 = TableOperation.Insert(customer1);
    
    // Submit the operation to the table service.
    peopleTable.Execute(insertCustomer1);

    TableBatchOperation

    The TableBatchOperation class represents multiple TableOperation objects which are executed as a single atomic action within the table service. There are a few restrictions on batch operations that should be noted:

    • You can perform batch updates, deletes, inserts, merge and replace operations.
    • A batch operation can have a retrieve operation, if it is the only operation in the batch.
    • A single batch operation can include up to 100 table operations.
    • All entities in a single batch operation must have the same partition key.
    • A batch operation is limited to a 4MB data payload.

    The CloudTable.ExecuteBatch method, which takes a TableBatchOperation as input, will return an IList of TableResults that corresponds in order to the entries in the batch itself. For example, the result of a merge operation that is first in the batch will be the first entry in the returned IList of TableResults. In the case of an error, the server may return a numerical id as part of the error message that corresponds to the sequence number of the failed operation in the batch, unless the failure is not associated with a specific command (such as ServerBusy), in which case -1 is returned. TableBatchOperations, or Entity Group Transactions, are executed atomically, meaning that either all operations will succeed or, if there is an error caused by one of the individual operations, the entire batch will fail.

    Sample – Insert two entities in a single atomic Batch Operation

    // You will need the following using statements
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Table;
    
    // Create the table client.
    CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
    CloudTable peopleTable = tableClient.GetTableReference("people");
    peopleTable.CreateIfNotExists();
    
    // Define a batch operation.
    TableBatchOperation batchOperation = new TableBatchOperation();
    
    // Create a customer entity and add to the table
    CustomerEntity customer = new CustomerEntity("Smith", "Jeff");
    customer.Email = "Jeff@contoso.com";
    customer.PhoneNumber = "425-555-0104";
    batchOperation.Insert(customer);
    
    // Create another customer entity and add to the table
    CustomerEntity customer2 = new CustomerEntity("Smith", "Ben");
    customer2.Email = "Ben@contoso.com";
    customer2.PhoneNumber = "425-555-0102";
    batchOperation.Insert(customer2);
    
    // Submit the operation to the table service.
    peopleTable.ExecuteBatch(batchOperation);

    TableQuery

    The TableQuery class is a lightweight query mechanism used to define queries to be executed against the table service. See “Querying” below.

    Entities
    ITableEntity interface

    The ITableEntity interface is used to define an object that can be serialized and deserialized with the table client. It contains the PartitionKey, RowKey, Timestamp, and Etag properties, as well as methods to read and write the entity. This interface is implemented by the TableEntity and DynamicTableEntity entity types that are included in the library; a client may implement this interface directly to persist different types of objects or objects from 3rd-party libraries. By overriding the ITableEntity.ReadEntity or ITableEntity.WriteEntity methods a client may customize the serialization logic for a given entity type.

    TableEntity

    The TableEntity class is an implementation of the ITableEntity interface and contains the RowKey, PartitionKey, and Timestamp properties. The default serialization logic TableEntity uses is based on reflection, where all public properties of a supported type that define both get and set are serialized. This will be discussed in greater detail in the extension points section below. This class is not sealed and may be extended to add additional properties to an entity type.

    Sample – Define a POCO that extends TableEntity

    // This class defines one additional property of integer type, since it derives from
    // TableEntity it will be automatically serialized and deserialized.    
    public class SampleEntity : TableEntity
    {
        public int SampleProperty { get; set; }  
    } 

    DynamicTableEntity

    The DynamicTableEntity class allows clients to update heterogeneous entity types without the need to define base classes or special types. The DynamicTableEntity class defines the required properties for RowKey, PartitionKey, Timestamp, and ETag; all other properties are stored in an IDictionary. Aside from the convenience of not having to define concrete POCO types, this can also provide increased performance by not having to perform serialization or deserialization tasks.

    Sample – Retrieve a single property on a collection of heterogeneous entities

    // You will need the following using statements
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Table;
    
    // Define the query to retrieve the entities, notice in this case we
    // only need to retrieve the Count property.
    TableQuery query = new TableQuery().Select(new string[] { "Count" });
    
    // Note the TableQuery is actually executed when we iterate over the
    // results. Also, this sample uses the DynamicTableEntity to avoid
    // having to worry about various types, as well as avoiding any
    // serialization processing.
    foreach (DynamicTableEntity entity in myTable.ExecuteQuery(query))
    { 
          // Users should always assume property is not there in case another client removed it.
          EntityProperty countProp;
    
        if (!entity.Properties.TryGetValue("Count", out countProp))    
        {
            throw new ArgumentNullException("Invalid entity, Count property not found!");
        }
        
        // Display Count property, however you could modify it here and persist it back to the service.
        Console.WriteLine(countProp.Int32Value);
    }

    Note: an ExecuteQuery equivalent is not provided in the Windows Runtime library in keeping with best practice for the platform. Instead use the ExecuteQuerySegmentedAsync method to execute the query in a segmented fashion.
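
    As a hedged sketch of segmented execution in the desktop library (assuming the CloudTable myTable and the TableQuery defined above; the Windows Runtime library follows the same pattern via ExecuteQuerySegmentedAsync):

    // You will need the following using statements
    using System;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Table;

    TableContinuationToken token = null;
    do
    {
        // Each segment returns a page of results plus a token for the next page
        TableQuerySegment<DynamicTableEntity> segment = myTable.ExecuteQuerySegmented(query, token);
        foreach (DynamicTableEntity entity in segment.Results)
        {
            Console.WriteLine(entity.RowKey);
        }
        token = segment.ContinuationToken;
    }
    while (token != null);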

    EntityProperty

    The EntityProperty class encapsulates a single property of an entity for the purposes of serialization and deserialization. The only time the client has to work directly with EntityProperties is when using DynamicTableEntity or implementing the TableEntity.ReadEntity and TableEntity.WriteEntity methods.

    The samples below show two approaches that can be used to update a player’s score property. The first approach uses DynamicTableEntity to avoid having to declare a client side object and updates the property directly, whereas the second will deserialize the entity into a POCO and update that object directly.

    Sample – Update of entity property using EntityProperty

    // You will need the following using statements
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Table;
    
    // Retrieve entity
    TableResult res = gamerTable.Execute(TableOperation.Retrieve("Smith", "Jeff"));
    DynamicTableEntity player = (DynamicTableEntity)res.Result;
    
    // Retrieve Score property
    EntityProperty scoreProp;
    
    // Users should always assume property is not there in case another client removed it.
    if (!entity.Properties.TryGetValue("Score ", out scoreProp))    
    { 
        throw new ArgumentNullException("Invalid entity, Score property not found!");
    }
    
    scoreProp.Int32Value += 1;
    
    // Store the updated score
    gamerTable.Execute(TableOperation.Merge(player));

    Sample – Update of entity property using POCO

    // You will need the following using statements
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Table;
    
    public class GamerEntity : TableEntity
    {
        public int Score { get; set; }
    }
    // Retrieve entity
    TableResult res = gamerTable.Execute(TableOperation.Retrieve<GamerEntity>("Smith", "Jeff"));
    GamerEntity player = (GamerEntity)res.Result;
    
    // Update Score
    player.Score += 1;
    
    // Store the updated score
    gamerTable.Execute(TableOperation.Merge(player));
    
    EntityResolver

    The EntityResolver delegate allows client-side projection and processing for each entity during serialization and deserialization. This is designed to provide custom client side projections, query-specific filtering, and so forth. This enables key scenarios such as deserializing a collection of heterogeneous entities from a single query.

    Sample – Use EntityResolver to perform client side projection

    // You will need the following using statements
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Table;
    
    // Define the query to retrieve the entities, notice in this case we only need
    // to retrieve the Email property.
    TableQuery<TableEntity> query = new TableQuery<TableEntity>().Select(new string[] { "Email" });
    
    // Define an EntityResolver to mutate the entity payload upon retrieval.
    // In this case we will simply return a string representing the customer's Email address.
    EntityResolver<string> resolver = (pk, rk, ts, props, etag) => props.ContainsKey("Email") ? props["Email"].StringValue : null;
    
    
    // Display the results of the query, note that the query now returns
    // strings instead of entity types since this is the type of EntityResolver we created.
    foreach (string projectedString in gamerTable.ExecuteQuery(query, resolver, null /* RequestOptions */, null /* OperationContext */))
    {
        Console.WriteLine(projectedString);
    }

    Querying

    There are two query constructs in the table client: a retrieve TableOperation, which addresses a single unique entity, and a TableQuery, which is a standard query mechanism used against multiple entities in a table. Both querying constructs need to be used in conjunction with either a class type that implements the ITableEntity interface or with an EntityResolver which will provide custom deserialization logic.

    Retrieve

    A retrieve operation is a query which addresses a single entity in the table by specifying both its PartitionKey and RowKey. This is exposed via TableOperation.Retrieve and TableBatchOperation.Retrieve and executed like a typical operation via the CloudTable.

    Sample – Retrieve a single entity

    // You will need the following using statements
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Table;
    
    // Create the table client.
    CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
    CloudTable peopleTable = tableClient.GetTableReference("people");
    
    // Retrieve the entity with partition key of "Smith" and row key of "Jeff"
    TableOperation retrieveJeffSmith = TableOperation.Retrieve<CustomerEntity>("Smith", "Jeff");
    
    // Retrieve entity
    CustomerEntity specificEntity = (CustomerEntity)peopleTable.Execute(retrieveJeffSmith).Result;

    TableQuery

    TableQuery is a lightweight object that represents a query for a given set of entities and encapsulates all query operators currently supported by the Windows Azure Table service. Note, for this release we have not provided an IQueryable implementation, so developers who are migrating applications to the 2.0 release and wish to leverage the new table implementation will need to reconstruct their queries using the provided syntax. The code below produces a query to take the top 5 results from the customers table which have a RowKey greater than or equal to 5.

    Sample – Query top 5 entities with RowKey greater than or equal to 5

    // You will need the following using statements
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Table;
    
    TableQuery<TableEntity> query = new TableQuery<TableEntity>().Where(TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.GreaterThanOrEqual, "5")).Take(5);

    In order to provide support for JavaScript in the Windows Runtime, the TableQuery can be used via the concrete type TableQuery, or in its generic form TableQuery<EntityType> where the results will be deserialized to the given type. When specifying an entity type to deserialize entities to, the EntityType must implement the ITableEntity interface and provide a parameterless constructor.

    The TableQuery object provides methods for take, select, and where. Static methods such as GenerateFilterCondition, the GenerateFilterConditionFor* family, and CombineFilters are also provided to construct filter strings. Some examples of constructing filters over various property types are shown below:

    // 1. Filter on String
    TableQuery.GenerateFilterCondition("Prop", QueryComparisons.GreaterThan, "foo");
    
    // 2. Filter on GUID
    TableQuery.GenerateFilterConditionForGuid("Prop", QueryComparisons.Equal, new Guid());
    
    // 3. Filter on Long
    TableQuery.GenerateFilterConditionForLong("Prop", QueryComparisons.GreaterThan, 50L);
    
    // 4. Filter on Double
    TableQuery.GenerateFilterConditionForDouble("Prop", QueryComparisons.GreaterThan, 50.50);
    
    // 5. Filter on Integer
    TableQuery.GenerateFilterConditionForInt("Prop", QueryComparisons.GreaterThan, 50);
    
    // 6. Filter on Date
    TableQuery.GenerateFilterConditionForDate("Prop", QueryComparisons.LessThan, DateTime.Now);
    
    // 7. Filter on Boolean
    TableQuery.GenerateFilterConditionForBool("Prop", QueryComparisons.Equal, true);
    
    // 8. Filter on Binary
    TableQuery.GenerateFilterConditionForBinary("Prop", QueryComparisons.Equal, new byte[] { 0x01, 0x02, 0x03 });

    Sample – Query all entities with a PartitionKey=”SamplePK” and RowKey greater than or equal to “5”

    string pkFilter = TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, "samplePK");
    
    string rkLowerFilter = TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.GreaterThanOrEqual, "5");
    
    string rkUpperFilter = TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.LessThan, "10");
    
    // Note CombineFilters has the effect of “([Expression1]) Operator ([Expression2])”, as such passing in a complex expression will result in a logical grouping.
    string combinedRowKeyFilter = TableQuery.CombineFilters(rkLowerFilter, TableOperators.And, rkUpperFilter);
    
    string combinedFilter = TableQuery.CombineFilters(pkFilter, TableOperators.And, combinedRowKeyFilter);
    
    // OR, equivalently, build the filter string directly:
    string combinedFilter = string.Format("({0}) {1} ({2}) {3} ({4})", pkFilter, TableOperators.And, rkLowerFilter, TableOperators.And, rkUpperFilter);
    TableQuery<SampleEntity> query = new TableQuery<SampleEntity>().Where(combinedFilter);

    Note: There is no logical expression tree provided in the current release; as a result, repeated calls to the fluent methods on TableQuery overwrite the relevant aspect of the query.

    Note the TableOperators and QueryComparisons classes define string constants for all supported operators and comparisons:

    TableOperators

    • And
    • Or
    • Not

    QueryComparisons

    • Equal
    • NotEqual
    • GreaterThan
    • GreaterThanOrEqual
    • LessThan
    • LessThanOrEqual

    Scenarios

    NoSQL

    A common pattern in a NoSQL datastore is to work with storing related entities with different schema in the same table. In this sample below, we will persist a group of heterogeneous shapes that make up a given drawing. In our case, the PartitionKey for our entities will be a drawing name that will allow us to retrieve and alter a set of shapes together in an atomic manner. The challenge becomes how to work with these heterogeneous entities on the client side in an efficient and usable manner.

    The table client provides an EntityResolver delegate which allows client-side logic to execute during deserialization. In the scenario detailed above, let’s use a base entity class named ShapeEntity which extends TableEntity. This base shape type defines all of the properties common to a given shape, such as its color fields and its X and Y coordinates in the drawing.

    public class ShapeEntity : TableEntity
    {
        public virtual string ShapeType { get; set; }
        public double PosX { get; set; }
        public double PosY { get; set; }
        public int ColorA { get; set; }
        public int ColorR { get; set; }
        public int ColorG { get; set; }
        public int ColorB { get; set; }
    }

    Now we can define some shape types that derive from the base ShapeEntity class. In the sample below we define a rectangle, which will have Width and Height properties. Note that this child class also overrides ShapeType for serialization purposes. For brevity’s sake the Line and Ellipse entities are omitted here; however, you can imagine representing other types of shapes in different child entity types such as triangles, trapezoids, etc.

    public class RectangleEntity : ShapeEntity
    {
        public double Width { get; set; }
        public double Height { get; set; }
    
        public override string ShapeType
        {
            get { return "Rectangle"; }
            set {/* no op */}
        }
    }

    Now we can define a query to load all of the shapes associated with our drawing and an EntityResolver that will resolve each entity to the correct child class. Note, that in this example aside from setting the core properties PartitionKey, RowKey, Timestamp, and ETag, we did not have to write any custom deserialization logic and instead rely on the built in deserialization logic provided by TableEntity.ReadEntity.

    // You will need the following using statements
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Table;
    
    TableQuery<ShapeEntity> drawingQuery = new TableQuery<ShapeEntity>().Where(TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, "DrawingName"));
    
    EntityResolver<ShapeEntity> shapeResolver = (pk, rk, ts, props, etag) =>
    {
        ShapeEntity resolvedEntity = null;
        string shapeType = props["ShapeType"].StringValue;
    
        if (shapeType == "Rectangle") { resolvedEntity = new RectangleEntity(); }
        else if (shapeType == "Ellipse") { resolvedEntity = new EllipseEntity(); }
        else if (shapeType == "Line") { resolvedEntity = new LineEntity(); }    
        // Potentially throw here if an unknown shape is detected
    
        resolvedEntity.PartitionKey = pk;
        resolvedEntity.RowKey = rk;
        resolvedEntity.Timestamp = ts;
        resolvedEntity.ETag = etag;
        resolvedEntity.ReadEntity(props, null);
    
        return resolvedEntity;
    };

    Now we can execute this query in a segmented, asynchronous manner in order to keep our UI fast and fluid. The code below is written using the async methods exposed by the client library for Windows Runtime.

    List<ShapeEntity> shapeList = new List<ShapeEntity>();
    TableQuerySegment<ShapeEntity> currentSegment = null;
    while (currentSegment == null || currentSegment.ContinuationToken != null)
    {
        currentSegment = await drawingTable.ExecuteQuerySegmentedAsync(
            drawingQuery,
            shapeResolver,
            currentSegment != null ? currentSegment.ContinuationToken : null);
    
        shapeList.AddRange(currentSegment.Results);
    }

    Once we execute this, we can see that the resulting collection of ShapeEntity objects contains shapes of various entity types.

    [Image: shapelist]

    Heterogeneous update

    In some cases it may be required to update entities regardless of their type or other properties. Let’s say we have a table named “employees”. This table contains entity types for developers, secretaries, contractors, and so forth. The example below shows how to query all entities in a given partition (in our example the state the employee works in is used as the PartitionKey) and update their salaries regardless of job position. Since we are using merge, the only property that is going to be updated is the Salary property, and all other information regarding the employee will remain unchanged.

    // You will need the following using statements
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Table;
    
    TableQuery query = new TableQuery().Where("PartitionKey eq 'Washington'").Select(new string[] { "Salary" });
    
    // Note: for brevity's sake this sample assumes there are 100 or fewer employees; however, the client should ensure batches are kept to 100 operations or fewer.
    TableBatchOperation mergeBatch = new TableBatchOperation();
    foreach (DynamicTableEntity ent in employeeTable.ExecuteQuery(query))
    {
        EntityProperty salaryProp;
    
        // Check to see if salary property is present
        if (ent.Properties.TryGetValue("Salary", out salaryProp))
        {
            if (salaryProp.DoubleValue < 50000)
            {
                // Give a 10% raise
                salaryProp.DoubleValue *= 1.1;
            }
            else if (salaryProp.DoubleValue < 100000)
            {
                // Give a 5% raise
                salaryProp.DoubleValue *= 1.05;
            }
    
            mergeBatch.Merge(ent);
        }
        else
        {
            throw new ArgumentNullException("Entity does not contain Salary!");
        }
    }
    // Execute batch to save changes back to the table service
    employeeTable.ExecuteBatch(mergeBatch);
    
    Persisting 3rd party objects

    In some cases we may need to persist objects exposed by 3rd party libraries, or those which do not fit the requirements of a TableEntity and cannot be modified to do so. In such cases, the recommended best practice is to encapsulate the 3rd party object in a new client object that implements the ITableEntity interface, and provide the custom serialization logic needed to persist the object to the table service via ITableEntity.ReadEntity and ITableEntity.WriteEntity.
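
    For illustration, here is a minimal sketch of such a wrapper; the LegacyContact type and its property names are hypothetical stand-ins for a third-party object, and the wrapper simply flattens and rebuilds that object's state through EntityProperty values.

    // You will need the following using statements
    using System;
    using System.Collections.Generic;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Table;
    
    // Hypothetical third-party type that cannot be modified to derive from TableEntity.
    public class LegacyContact
    {
        public string Name { get; set; }
        public string Phone { get; set; }
    }
    
    // Wrapper that implements ITableEntity and supplies the custom serialization logic.
    public class LegacyContactEntity : ITableEntity
    {
        public LegacyContact Contact { get; set; }
    
        public string PartitionKey { get; set; }
        public string RowKey { get; set; }
        public DateTimeOffset Timestamp { get; set; }
        public string ETag { get; set; }
    
        public void ReadEntity(IDictionary<string, EntityProperty> properties, OperationContext operationContext)
        {
            // Rebuild the encapsulated object from the stored properties.
            this.Contact = new LegacyContact
            {
                Name = properties["Name"].StringValue,
                Phone = properties["Phone"].StringValue
            };
        }
    
        public IDictionary<string, EntityProperty> WriteEntity(OperationContext operationContext)
        {
            // Flatten the encapsulated object into EntityProperty values for persistence.
            return new Dictionary<string, EntityProperty>
            {
                { "Name", new EntityProperty(this.Contact.Name) },
                { "Phone", new EntityProperty(this.Contact.Phone) }
            };
        }
    }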

    Continuation Tokens

    Continuation tokens can be returned during the segmented execution of a query. One key improvement to the [Blob|Table|Queue]ContinuationTokens in this release is that all properties are now publicly settable and a public default constructor is provided. This, in addition to the IXmlSerializable implementation, allows clients to easily persist a continuation token.
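
    For example, a minimal sketch of persisting a TableContinuationToken as a string via XmlSerializer and restoring it later (the helper method names below are our own):

    // You will need the following using statements
    using System.IO;
    using System.Xml.Serialization;
    using Microsoft.WindowsAzure.Storage.Table;
    
    public static string SaveContinuationToken(TableContinuationToken token)
    {
        // Serialize the token so it can be stored (e.g. in a cache, cookie, or table).
        XmlSerializer serializer = new XmlSerializer(typeof(TableContinuationToken));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, token);
            return writer.ToString();
        }
    }
    
    public static TableContinuationToken LoadContinuationToken(string serializedToken)
    {
        // Rehydrate the token so a query can be resumed from where it left off.
        XmlSerializer serializer = new XmlSerializer(typeof(TableContinuationToken));
        using (StringReader reader = new StringReader(serializedToken))
        {
            return (TableContinuationToken)serializer.Deserialize(reader);
        }
    }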

    DataServices

    The legacy table service implementation has been migrated to the Microsoft.WindowsAzure.Storage.Table.DataServices namespace and updated to support new features in the 2.0 release such as OperationContext, end-to-end timeouts, and asynchronous cancellation. In addition to the new features, there have been some breaking changes introduced in this release; for a full list please reference the tables section of the Breaking Changes blog post.

    Developing in Windows Runtime

    A key driver in this release was expanding platform support, specifically targeting the upcoming releases of Windows 8, Windows RT, and Windows Server 2012. As such, we are releasing the following two Windows Runtime components to support Windows Runtime as Community Technology Preview (CTP):

    • Microsoft.WindowsAzure.Storage.winmd - A fully projectable storage client that supports JavaScript, C++, C#, and VB. This library contains all core objects as well as support for Blobs, Queues, and a base Tables implementation consumable by JavaScript
    • Microsoft.WindowsAzure.Storage.Table.dll – A table extension library that provides generic query support and strongly typed entities. This is used by non-JavaScript applications to provide strongly typed entities as well as reflection-based serialization of POCO objects

    The images below illustrate the IntelliSense experience when defining a TableQuery in an application that references only the core storage component versus one that also uses the table extension library. The EntityResolver delegate and the TableEntity object are also absent from the core storage component; instead, all operations are based on the DynamicTableEntity type.

    Intellisense when defining a TableQuery referencing only the core storage component

    [Image: intellisense-core]

    Intellisense when defining a TableQuery with the table extension library

    [Image: intellisense-extensionlib]

    While most table constructs are the same, you will notice that when developing for the Windows Runtime all synchronous methods are absent, which is in keeping with the specified best practice. As such, the equivalent of the desktop method CloudTable.ExecuteQuery, which would handle continuation for the user, has been removed. Instead, developers should handle this segmented execution in their application and utilize the provided ExecuteQuerySegmentedAsync methods in order to keep their apps fast and fluid.

    Summary

    This blog post has provided an in-depth overview of developing applications that leverage the Windows Azure Table service via the new Storage Client libraries for .NET and the Windows Runtime. Additionally, we have discussed some specific differences when migrating existing applications from a previous 1.x release of the SDK.

    Joe Giardino
    Serdar Ozler
    Veena Udayabhanu
    Justin Yu

    Windows Azure Storage

    Resources

    Get the Windows Azure SDK for .Net

    AzCopy – Uploading/Downloading files for Windows Azure Blobs


    One of the frequent requests we receive is for a simple way to upload or download files between Windows Azure Blob Storage and the local file system. We’re pleased to release AzCopy (Beta Version), a command line utility which allows Windows Azure Storage customers to do just that. The utility is designed to simplify the task of transferring data into and out of a Windows Azure Storage account. Customers can use this as a standalone tool or incorporate this utility in an existing application. The utility can be downloaded from GitHub.

    The command is analogous to other Microsoft file copy utilities like robocopy that you may already be familiar with. Below is the syntax:

    AzCopy <Source> <Destination> [filepattern [filepattern…]] [Options]

    In this post we highlight AzCopy’s features without going into implementation details. The help command (AzCopy /?) lists and briefly describes all available system commands and parameters.

    Key Features:

    • Efficient and Flexible: AzCopy allows users to selectively copy the data. By using recursive mode, customers can copy nested directories of files. Users can specify a file pattern – wildcards in the Windows file system or a prefix in blob storage – to identify the source files that are candidates for copying. In addition, users can set an option to copy files which have the “Archive” attribute set. When copying a large number of files, if there is a copy failure due to network or other issues, the re-startable option can resume the copy process from where it left off (the files which have not been copied). Using re-startable mode, you will not need to re-copy files which were copied successfully in the previous attempt.
    • Support for Windows Azure Storage Data Types: AzCopy provides options for customers to specify the destination data in a storage account as a block blob or a page blob. Default is set to block blobs, as this is best suited for nearly all files. When using the page blob option, the blob will be zero-padded to a 512-byte boundary.
    • Naming Rules: URI format (http or https) is used to specify a blob storage path, and an NTFS Windows file folder path is used for the Windows file system. Since some blob names cannot be directly translated into Windows file system names, AzCopy translates them using its own rules. Here are some of the rules that we follow in naming translations:
      • Blobs that conflict with Windows special file names will be renamed using the following rules: “.” => “dot”; “..” => “dotdot”; “/” => “slash”; etc. As with other conflict resolution rules, if a conflict occurs on any of these names the string “(n)” will be added to the conflicting file or blob name such that the conflict is resolved.
      • Because the Windows file system namespace is case insensitive (but case retentive) and the Windows Azure blob namespace is case sensitive, the following rules apply:
        • Blobs in the blob namespace are created with the default case for the file name.
        • Files are created with the default case of the blob namespace.
        • If a case-conflict occurs while copying from the blob-namespace to the file-namespace, we will append the string “(n)” to the conflicting file or blob names.
    • Logging: Users can run AzCopy in verbose mode, which displays the list of files and directories processed and reports the list of files that the utility failed to copy. AzCopy also displays a progress indication for each file while running in verbose mode.

    Examples:

    Example 1: Copy a directory of locally accessible files to blob storage container in a recursive mode.

    AzCopy C:\blob-data https://myaccount.blob.core.windows.net/mycontainer/ /destkey:key /S

    The above command will copy all files from the “c:\blob-data” directory and all of its subdirectories as block blobs to the container named “mycontainer” in the storage account “myaccount”. The “blob-data” folder contains the following files and one subdirectory named “subfolder1”:

    C:\blob-data\car1.docx

    C:\blob-data\car2.docx

    C:\blob-data\car3.docx

    C:\blob-data\train1.docx

    C:\blob-data\subfolder1\car_sub1.docx

    C:\blob-data\subfolder1\car_sub2.docx

    After the copy operation, “mycontainer” blob container will contain the following blobs:

    car1.docx

    car2.docx

    car3.docx

    train1.docx

    subfolder1/car_sub1.docx

    subfolder1/car_sub2.docx

     

    If we do not use recursive mode (copying without the “/S” option), the “mycontainer” blob container would only contain the following files from the “blob-data” folder, and the files under the “subfolder1” folder would be ignored.

    car1.docx

    car2.docx

    car3.docx

    train1.docx

    Example 2: Recursively copy a set of blobs from a blob storage to a locally accessible directory in both verbose and recursive modes.

    AzCopy https://myaccount.blob.core.windows.net/mycontainer c:\blob-data /sourceKey:key /S /V

    The command will copy all blobs under the “mycontainer” blob container in account “myaccount” to the “c:\blob-data” directory in both verbose and recursive modes.

    “mycontainer” blob container contains the following files:

    car1.docx

    car2.docx

    car3.docx

    train1.docx

    subfolder1/car_sub1.docx

    subfolder1/car_sub2.docx

     

    Since we are using verbose mode, the tool will display the following output, which contains the file transfer status of each file in addition to the transfer summary. By default, the tool will only display the transfer summary:

    Finished Transfer: car1.docx

    Finished Transfer: car2.docx

    Finished Transfer: car3.docx

    Finished Transfer: train1.docx

    Finished Transfer: subfolder1/car_sub1.docx

    Finished Transfer: subfolder1/car_sub2.docx

    Transfer summary:

    -----------------

    Total files transferred: 6

    Transfer successfully: 6

    Transfer failed: 0

    After the copy operation, c:\blob-data folder will contain the files listed below:

    C:\blob-data\car1.docx

    C:\blob-data\car2.docx

    C:\blob-data\car3.docx

    C:\blob-data\train1.docx

    C:\blob-data\subfolder1\car_sub1.docx

    C:\blob-data\subfolder1\car_sub2.docx

     

    Let’s try a slightly different scenario by copying only the blobs whose names start with “subfolder1/”, using the following command:

    AzCopy https://myaccount.blob.core.windows.net/mycontainer/subfolder1 c:\blob-data /sourceKey:key /S /V

    The above command will only copy blobs which begin with “subfolder1/”, and thus the tool will only copy “subfolder1/car_sub1.docx” and “subfolder1/car_sub2.docx” blobs to “c:\blob-data\” folder. After the copy operation, “C:\blob-data” will contain the following files:

    C:\blob-data\car_sub1.docx

    C:\blob-data\car_sub2.docx

    Example 3: Copy a directory of locally accessible files to a blob account in re-startable mode

    AzCopy c:\blob-data https://myaccount.blob.core.windows.net/mycontainer /destkey:key /Z:restart.log /S

    Restart.log, a journal file, will be used to maintain a record of the status of the copy operation so that the operation can restart if interrupted. If no journal file is specified along with the re-startable mode parameter, the journal file will default to “azcopy.log” in the current working directory.

    For instance, suppose the “C:\blob-data” folder contains five large files, each greater than 100 MB in size.

    C:\blob-data\car.docx

    C:\blob-data\car1.docx

    C:\blob-data\car2.docx

    C:\blob-data\car3.docx

    C:\blob-data\car4.docx

     

    When running with the restart option, AzCopy allows you to restart the process in the case of failure. If the failure occurred while copying “car.docx”, AzCopy will resume the copy from the part of “car.docx” which has not been copied. If the failure occurred after “car.docx” was successfully copied, AzCopy will resume the copy operation from one of the remaining four files which have yet to be copied.

    Example 4: Select a number of files in a blob storage container using a file pattern and copy them to a locally accessible directory.

    AzCopy https://myaccount.blob.core.windows.net/mycontainer c:\blob-data car /sourceKey:key /Z /S

    “mycontainer” contains the following files:

    car1.docx

    car2.docx

    car3.docx

    train.docx

    carfolder/car_sub1.docx

    carfolder/train_sub2.docx

    subfolder1/car_sub1.docx

    subfolder1/car_sub2.docx

     

    After the copy operation, “c:\blob-data” will contain the files listed below. Since a file pattern with the prefix “car” was specified, the copy operation copies only the blobs with the prefix “car”. Note that this prefix is applied to the blob name, if the blob is directly in the “mycontainer” container, or to the subdirectory name.

    C:\blob-data\car1.docx

    C:\blob-data\car2.docx

    C:\blob-data\car3.docx

    C:\blob-data\carfolder\car_sub1.docx

    C:\blob-data\carfolder\train_sub2.docx

     

    Performance

    Within a Windows Azure datacenter (i.e., between a compute instance and a storage account within the same DC), users should be able to achieve 50 MB/s upload and download speeds when transferring large amounts of data using an extra-large compute instance. Transfers to and from a Windows Azure datacenter will be constrained by the bandwidth available to AzCopy.

    Known Issues

    • When copying without /XN (Exclude Newer) and /XO (Exclude Older), the tool only compares the names of source and target files before copying. Therefore, users will be prompted whether to overwrite the target files even if the source and target files are identical.
    • When using /XN and /XO, note that your local system time and the time stored in the storage service will vary slightly. So if the blob and local file were modified at nearly the same time, this comparison may not filter correctly.
    • When copying a file to a page blob with the re-startable option and if there is a copy operation failure in the middle of copying, the tool will restart the copy process from the beginning of the file. This issue does not apply to copying a file to a block blob.
    • When copying blobs to the local %systemdrive%, the tool will not prompt for confirmation before overwriting existing files with the same name.
    • If there are two blobs named “a” and “a/b” under a storage container, copying the blobs under that container with /S will fail, because Windows will not allow the creation of a folder named “a” and a file named “a” in the same directory.

     

    Aung Oo
    Matthew Hendel

    Windows Azure Storage Team

    Updated: Known Issues for Windows Azure Storage Client Library 2.0 for .NET and Windows Runtime


    We recently released the 2.0 version of the Windows Azure Storage Client Library. This is our largest update to our .NET library to date which includes new features, broader platform compatibility, and revisions to address the great feedback you’ve given us over time. For more about this release see here. For information regarding breaking changes see here.

    This Storage Client 2.0 release contains some known issues that are being addressed in the current and upcoming releases of the libraries and are detailed below. Some of these were reported by you, and we appreciate you bringing them to our attention!

    Pending Issues

    The following are the set of known issues that are pending fixes.

    Some Azure SDK assemblies still reference Storage Client Library 1.7
    • Description: SDK assemblies such as Microsoft.WindowsAzure.Diagnostics.dll and Microsoft.WindowsAzure.CloudDrive.dll still reference Storage Client Library version 1.7.
    • Status: Will be resolved in a future release of the Windows Azure SDK
    • Impacted Platforms: .NET
    • Workaround: Please add references to both the old version (Microsoft.WindowsAzure.StorageClient.dll; version 1.7) and the new version (Microsoft.WindowsAzure.Storage.dll; version 2.0 or greater) of Storage Client Library in your project.
    TableEntity does not support deserializing nullable values
    • Description: TableEntity does not support nullable values such as int? or long? during deserialization. The example below illustrates this issue:
    class AgeEntity : TableEntity
    {
        public int? Age { get; set; }
    }
    
    …
    
    AgeEntity entity = new AgeEntity()
    {
        PartitionKey = "FirstName",
        RowKey = "LastName",
        Age = 25,
    };
    
    table.Execute(TableOperation.Insert(entity));
    …
    TableResult result = table.Execute(TableOperation.Retrieve<AgeEntity>("FirstName", "LastName"));
    
    entity = (AgeEntity)result.Result;  // entity.Age will be null

     

    • Status: Will be resolved in version 2.0.3
    • Resolution: TableEntity will support deserializing nullable int, long, double, and bool values.
    • Impacted Platforms: All
    • Workaround: If this impacts you, then upgrade to 2.0.3 when it is available. If you are unable to upgrade, please use the legacy Table Service implementation as described in the Migration Guide.
    Extra TableResults returned via ExecuteBatch when a retry occurs
    • Description: When a batch execution fails and is subsequently retried, it is possible to receive extraneous TableResult objects from any failed attempt(s). The resulting behavior is that upon completion the correct N TableResults are located at the end of the IList<TableResult> that is returned, with the extraneous results listed at the start.
    • Status: Will be resolved in version 2.0.3
    • Resolution: The list returned back will only contain the results from the last retry.
    • Impacted Platforms: All
    • Workaround: If this impacts you, then upgrade to 2.0.3 when it is available. If you are unable to upgrade, please only check the last N TableResults in the list returned. As an alternative, you can also disable retries:
    client.RetryPolicy = new NoRetry();
    GenerateFilterConditionForLong does not work for values larger than maximum 32-bit integer value
    • Description: When doing a table query and filtering on a 64-bit integer property (long), values larger than maximum 32-bit integer value do not work correctly. The example below illustrates this issue:
    TableQuery query = new TableQuery().Where(
        TableQuery.GenerateFilterConditionForLong("LongValue", QueryComparisons.Equal, 1234567890123));
    List<DynamicTableEntity> results = table.ExecuteQuery(query).ToList();  // Will throw StorageException

     

    • Status: Will be resolved in version 2.0.3
    • Resolution: The required ‘L’ suffix will be correctly added to 64-bit integer values.
    • Impacted Platforms: All
    • Workaround: If this impacts you, then upgrade to 2.0.3 when it is available. If you are unable to upgrade, please convert the value to a string and then append the required ‘L’ suffix:
    TableQuery query = new TableQuery().Where(
        TableQuery.GenerateFilterCondition("LongValue", QueryComparisons.Equal, "1234567890123L"));
    List<DynamicTableEntity> results = table.ExecuteQuery(query).ToList();

     

    CloudTableClient.EndListTablesSegmented method does not work correctly
    • Description: Listing tables segment by segment using APM does not work, because the CloudTableClient.EndListTablesSegmented method always throws an exception. The example below illustrates this issue:
    IAsyncResult ar = tableClient.BeginListTablesSegmented(null, null, null);
    TableResultSegment results = tableClient.EndListTablesSegmented(ar);  // Will throw InvalidCastException

     

    • Status: Will be resolved in version 2.0.3
    • Resolution: EndListTablesSegmented will correctly return the result segment.
    • Impacted Platforms: .NET
    •  Workaround: If this impacts you, then upgrade to 2.0.3 when it is available. If you are unable to upgrade, please use the synchronous method instead:
    TableResultSegment results = tableClient.ListTablesSegmented(null);

     

    CloudQueue.BeginCreateIfNotExists and CloudQueue.BeginDeleteIfExists methods expect valid options argument
    • Description: BeginCreateIfNotExists and BeginDeleteIfExists methods on a CloudQueue object do not work if the options argument is null. The example below illustrates this issue:
    queue.BeginCreateIfNotExists(null, null);  // Will throw NullReferenceException
    queue.BeginCreateIfNotExists(null, null, null, null);  // Will throw NullReferenceException
    • Status: Will be resolved in version 2.0.3
    • Resolution: Both methods will be updated to accept null as a valid argument.
    • Impacted Platforms: .NET
    • Workaround: If this impacts you, then upgrade to 2.0.3 when it is available. If you are unable to upgrade, please create a new QueueRequestOptions object and use it instead:
    queue.BeginCreateIfNotExists(new QueueRequestOptions(), null, null, null);
    Metadata Correctness tests fail when submitting apps to the Windows Store
    • Description: An application that references Storage Client Library will fail the Metadata Correctness test during Windows Store certification process.
    • Status: Will be resolved in version 2.0.3
    • Resolution: All non-sealed public classes will be either removed or sealed.
    • Impacted Platforms: Windows Store Apps
    • Workaround: Not available
    Missing Queue constants
    • Description: A few general purpose queue constants have not been exposed as public on CloudQueueMessage. Missing constants are shown below:
    public static long MaxMessageSize { get; }
    public static TimeSpan MaxTimeToLive { get; }
    public static int MaxNumberOfMessagesToPeek { get; }

     

    • Status: Will be resolved in version 2.0.3
    • Resolution: CloudQueueMessage constants will be made public.
    • Impacted Platforms: Windows Store Apps
    • Workaround: If this impacts you, then upgrade to 2.0.3 when it is available. If you are unable to upgrade, please use the values directly in your code. MaxMessageSize is 64KB, MaxTimeToLive is 7 days, and MaxNumberOfMessagesToPeek is 32.
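    For example, a minimal sketch of carrying these documented limits in your own code until the constants are made public (the QueueLimits class below is our own, not part of the library):
    // You will need the following using statement
    using System;
    
    // Local stand-in for the missing CloudQueueMessage constants (illustrative only).
    public static class QueueLimits
    {
        public const long MaxMessageSize = 64 * 1024;                          // 64 KB
        public static readonly TimeSpan MaxTimeToLive = TimeSpan.FromDays(7);  // 7 days
        public const int MaxNumberOfMessagesToPeek = 32;
    }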
    Service Client RetryPolicy does not support null
    • Description: The Cloud[Blob|Queue|Table]Client.RetryPolicy does not support null.
    • Status: Not resolved
    • Impacted Platforms: All
    • Workaround: If you wish to disable retries, please use:
    client.RetryPolicy = new NoRetry();
    Windows Store Apps developed in JavaScript cannot use Table Service layer due to missing OData dependencies
    • Description: Windows Store Apps developed in JavaScript are unable to load the dependent dlls to a referenced component (WinMD). Because the Table Storage API is dependent on Microsoft.Data.OData.WindowsStore.dll, invoking Table APIs will result in a FileNotFoundException at runtime.
    • Status: Not Resolved. We are actively exploring options to bring Table Storage support to Windows Store Apps developed in JavaScript.
    • Impacted Platforms: Windows Store Apps
    • Workaround: Not available

     

    Resolved Issues

    The following are the set of known issues that have been fixed and released.

    IAsyncResult object returned by asynchronous methods is not compatible with TaskFactory.FromAsync
    • Description: Both the System.Threading.Tasks.TaskFactory and System.Threading.Tasks.TaskFactory<TResult> classes provide several overloads of the FromAsync method that let you encapsulate an APM Begin/End method pair in one Task instance or Task<TResult> instance. Unfortunately, the IAsyncResult implementation of the Storage Client Library is not compatible with these methods, which leads to the End method being called twice. The effect of calling the End method multiple times with the same IAsyncResult is not defined. The example below illustrates this issue. The call will throw a SemaphoreFullException even if the actual operation succeeds:
    TableServiceContext context = client.GetTableServiceContext();
    
    // Your Table Service operations here
    
    await Task.Factory.FromAsync<DataServiceResponse>(
        context.BeginSaveChangesWithRetries,
        context.EndSaveChangesWithRetries,
        null);  // Will Throw SemaphoreFullException
    • Status: Resolved in version 2.0.2
    • Resolution: IAsyncResult.CompletedSynchronously flag now reports the status correctly and thus FromAsync methods can work with the Storage Client Library APM methods.
    • Impacted Platforms: .NET
    • Workaround: If you are unable to upgrade to 2.0.2, please use APM methods directly without passing them to FromAsync methods.
    Public DynamicTableEntity constructors use DateTime even though the Timestamp property is of type DateTimeOffset

    • Description: The DynamicTableEntity class defines the Timestamp property as a DateTimeOffset value. However, its public constructors use DateTime.MinValue to initialize the property. Therefore, if the machine’s time zone is ahead of UTC (such as UTC+01), DynamicTableEntity instances cannot be created. The example below illustrates this issue if the time zone is ahead of UTC:

    CloudTable table = client.GetTableReference(tableName);
    TableQuery query = new TableQuery();
    IEnumerable<DynamicTableEntity> result = table.ExecuteQuery(query);  // Will Throw StorageException
    • Status: Resolved in version 2.0.2
    • Resolution: Public DynamicTableEntity constructors now initialize the Timestamp property with a DateTimeOffset value, so instances can be created regardless of the machine’s time zone.
    • Impacted Platforms: All
    • Workaround: Not available
    BeginSaveChangesWithRetries ignores SaveChangesOptions argument
    • Description: TableServiceContext class provides several overloads of the BeginSaveChangesWithRetries method that let you begin an asynchronous operation to save changes. One of the overloads ignores the “options” argument. The example below illustrates this issue:
    IAsyncResult ar = tableContext.BeginSaveChangesWithRetries(
        SaveChangesOptions.Batch, null, null);
    …
    DataServiceResponse response = tableContext.EndSaveChangesWithRetries(ar);
    int statusCode = response.BatchStatusCode;  // Will return -1

     

    • Status: Resolved in version 2.0.2
    • Resolution: All overloads of BeginSaveChangesWithRetries now make use of the “options” argument correctly.
    • Impacted Platforms: .NET
    • Workaround: If you are unable to upgrade to 2.0.2, please use a different overload of BeginSaveChangesWithRetries method:
    IAsyncResult ar = tableContext.BeginSaveChangesWithRetries(
        SaveChangesOptions.Batch, null, null, null, null);
    …
    DataServiceResponse response = tableContext.EndSaveChangesWithRetries(ar);
    int statusCode = response.BatchStatusCode;
    CloudStorageAccount.Parse cannot parse DevelopmentStorageAccount strings if a proxy is not specified
    • Description: CloudStorageAccount.Parse() and TryParse() do not support DevelopmentStorageAccount strings if a proxy is not specified. CloudStorageAccount.DevelopmentStorageAccount.ToString() will serialize to the string: “UseDevelopmentStorage=true” which illustrates this particular issue. Passing this string into CloudStorageAccount.Parse() or TryParse() will throw a KeyNotFoundException.
    CloudStorageAccount myAccount = CloudStorageAccount.Parse("UseDevelopmentStorage=true");  // Will Throw KeyNotFoundException

     

    • Status: Resolved in version 2.0.1
    • Resolution: CloudStorageAccount can now parse this string correctly.
    • Impacted Platforms: All
    • Workaround: If you are unable to upgrade to 2.0.1, you can use:
    CloudStorageAccount myAccount = CloudStorageAccount.DevelopmentStorageAccount;
    StorageErrorCodeStrings class is missing
    • Description: Static StorageErrorCodeStrings class that contains common error codes across Blob, Table, and Queue services is missing. The example below shows some of the missing error codes:
    public const string OperationTimedOut = "OperationTimedOut";
    public const string ResourceNotFound = "ResourceNotFound";
    • Status: Resolved in version 2.0.1
    • Resolution: StorageErrorCodeStrings class is added.
    • Impacted Platforms: All
    • Workaround: If you are unable to upgrade to 2.0.1, you can directly use the error strings listed in the Status and Error Codes article.
    ICloudBlob interface does not contain Shared Access Signature creation methods
    • Description: Even though both CloudBlockBlob and CloudPageBlob have methods to create a shared access signature, the common ICloudBlob interface does not contain them which prevents a generic method to create a SAS token for any blob irrespective of its type. Missing methods are shown below:
    string GetSharedAccessSignature(SharedAccessBlobPolicy policy);
    string GetSharedAccessSignature(SharedAccessBlobPolicy policy, string groupPolicyIdentifier);
    • Status: Resolved in version 2.0.1
    • Resolution: GetSharedAccessSignature methods are added to ICloudBlob interface.
    • Impacted Platforms: All
    • Workaround: If you are unable to upgrade to 2.0.1, please cast your object to CloudBlockBlob or CloudPageBlob depending on the blob’s type and then call GetSharedAccessSignature:
    CloudBlockBlob blockBlob = (CloudBlockBlob)blob;
    string signature = blockBlob.GetSharedAccessSignature(policy);

    Summary

    We continue to work hard on delivering a first class development experience for the .NET and Windows 8 developer communities to work with Windows Azure Storage. We have actively embraced both NuGet and GitHub as release mechanisms for our Windows Azure Storage Client libraries. As such, we will continue to release updates to address any issues as they arise in a timely fashion. As always, your feedback is welcome and appreciated!

    Joe Giardino
    Serdar Ozler

    Windows Azure Storage

    AzCopy – Using Cross Account Copy Blob


    Please download AzCopy CTP2 here; the previous blog post covering AzCopy CTP1 is available here as a reference.

    New features have been added in this release:

    • Support of Cross-Account Copy Blob: AzCopy allows you to copy blobs within the same storage account or between different storage accounts (visit this blog post for more details on cross-account blob copy). This enables you to move blobs from one account to another efficiently with respect to cost and time. The data transfer is done by the storage service, thus eliminating the need for you to download each blob from the source and then upload it to the destination. You can also use /Z to execute the blob copy in re-startable mode.
    • Added /MOV: This option allows you to move files, deleting them from the source after copying. Assuming you have delete permissions on the source, this option applies regardless of whether the source is Windows Azure Storage or the local file system.
    • Added /NC: This option allows you to specify the number of concurrent network calls. By default, when you upload files from the local computer to Windows Azure Storage, AzCopy will initiate concurrent network calls up to eight times the number of cores on the local computer. For example, if your local computer has four cores, then AzCopy will initiate up to 32 (eight times 4) network calls at one time. However, if you want to limit the concurrency to throttle local CPU and bandwidth usage, you can specify the maximum number of concurrent network calls by using /NC. The value specified here is the absolute count and will not be multiplied by the core count. So in the above example, to reduce the concurrent network calls by half, you would specify /NC:16
    • Added /SNAPSHOT: This option allows you to transfer blobs with their snapshots. This is a semantic change: AzCopy CTP 1 (released in October 2012) would transfer a blob’s snapshots by default. Starting from this version, by default AzCopy won’t transfer any snapshots while copying a blob. Only with /SNAPSHOT specified will AzCopy transfer all the snapshots of a blob to the destination; however, these snapshots will become separate blobs instead of snapshots of the original base blob in the destination, so each of the blobs will be fully charged for (no block reuse between them). The transferred blob snapshots will be renamed in this format: [blob-name] (snapshot-time)[extension].

      For example if readme.txt is a source blob and there are 3 snapshots created for it in source container, then after using /SNAPSHOT there will be 3 more separate blobs created in destination container where their names will looks like

      readme (2013-02-25 080757).txt
      readme (2012-12-23 120657).txt
      readme (2012-09-12 090521).txt

      For the billing impact of blob snapshots compared to separate blobs, please refer to this blog post: Understanding Windows Azure Storage Billing
    • Added /@:response-file: This allows you to store parameters in a file, and they will be processed by AzCopy just as if they had been specified on the command line. Parameters in the response file can be divided across several lines, but each single parameter must be on one line (breaking one parameter into two lines is not supported). AzCopy parses each line as if it were a single command line into a list of parameters and concatenates all the parameter lists into one list, which is treated by AzCopy as coming from a single command line. Multiple response files can be specified, but nested response files are not supported; a nested response file will instead be parsed as a location parameter or file pattern. Escape characters are not supported, except that "" in a quoted string will be parsed as a single quotation mark. Note that /@: after /- does not denote a response file; it is treated as a location parameter or file pattern like other parameters.

    Examples

    Here are some examples that illustrate the new features in this release.

    Copy all blobs from one container to another container under different storage account

    AzCopy https://<sourceaccount>.blob.core.windows.net/<sourcecontainer>/ https://<destaccount>.blob.core.windows.net/<destcontainer>/  /sourcekey:<key> /destkey:<key> /S

    The above command will copy all blobs from the container named “sourcecontainer” in storage account “sourceaccount” to another container named “destcontainer” in storage account “destaccount”

    If you have a base blob with snapshots, add /SNAPSHOT to copy all of its snapshots along with the base blob to the destination. Please note that the blob snapshots will be renamed to this format in the destination: [blob-name] (snapshot-time)[extension]

    AzCopy https://<sourceaccount>.blob.core.windows.net/<sourcecontainer>/ https://<destaccount>.blob.core.windows.net/<destcontainer>/  /sourcekey:<key> /destkey:<key> /S /SNAPSHOT


    For example, if you have readme.txt with 3 snapshots in the source container, then the destination container will contain the blobs below:
    readme.txt

    readme (2013-02-25 080757).txt
    readme (2012-12-23 120657).txt
    readme (2012-09-12 090521).txt

    If you’d like to delete those blobs from the source container when the copy is complete, then just add /MOV as shown below

    AzCopy https://<sourceaccount>.blob.core.windows.net/<sourcecontainer>/ https://<destaccount>.blob.core.windows.net/<destcontainer>/  /sourcekey:<key> /destkey:<key> /MOV /S

    You can also create a response file to make it easier to run the same command again and again. Create a text file called “myAzCopy.txt” with the content below

    #URI of Source Container
    https://<sourceaccount>.blob.core.windows.net/<sourcecontainer>/
    #URI of Destination Container
    https://<destaccount>.blob.core.windows.net/<destcontainer>/

    Then you can run the command below to transfer files from source container to destination container

    AzCopy /@:C:\myAzCopy.txt /sourcekey:<key> /destkey:<key> /MOV /S


    .NET Clients encountering Port Exhaustion after installing KB2750149


    A recent update for .NET 4.5 introduced a regression in HttpWebRequest that may affect high scale applications. This blog post will cover the details of this change, how it impacts clients, and the mitigations clients may take to avoid this issue altogether.

    What is the effect?

    Clients would observe long latencies for their storage requests and may find either that their requests to storage are dispatched after a delay, or that requests to storage are not dispatched at all and a System.Net.WebException is instead thrown from their application when trying to access storage. The details of the exception are explained below. Running netstat as described in the next section would show that the process has consumed many ports, causing port exhaustion.

    Who is affected?

    Any client that is accessing Windows Azure Storage from a .NET platform with KB2750149 installed and does not consume the entire response stream will be affected. This includes clients that access the REST API directly via HttpWebRequest and HttpClient, the Storage Client for Windows RT, as well as the .NET Storage Client Library (2.0.5.0 and below, provided via NuGet, GitHub, and the SDK). You can read more about the specifics of this update here.

    In many cases the Storage Client Libraries do not expect a body to be returned from the server based on the REST API and subsequently do not attempt to read the response stream. Under previous behavior this “empty” response consisting of a single 0 length chunk would have been automatically consumed by the .NET networking layer allowing the socket to be reused. To address this change proactively we have added a fix to the .Net Client library in version 2.0.5.1 to explicitly drain the response stream.

    A client can use the netstat utility to check for processes that are holding many ports open in the TIME_WAIT or ESTABLISHED states by issuing netstat -a -o (the -a option will show all connections, and the -o option will display the owner process ID).

    [Image: netstat]

    Running this command on an affected machine shows the following:

    [Image: netstat-result]

    You can see above that a single process with ID 3024 is holding numerous connections open to the server.

    Description

    Users installing the recent update (KB2750149) will observe slightly different behavior when leveraging the HttpWebRequest to communicate with a server that returns a chunked encoded response. (For more on Chunked encoded data see here).

    When a server responds to an HTTP request with a chunked encoded response the client may be unaware of the entire length of the body, and therefore will read the body in a series of chunks from the response stream. The response stream is terminated when the server sends a zero length “chunk” followed by a CRLF sequence (see the article above for more details). When the server responds with an empty body this entire payload will consists of a single zero-length chunk to terminate the stream.

    Prior to this update, the default behavior of HttpWebRequest was to attempt to “drain” the response stream whenever the user closes the HttpWebResponse. If the rest of the response can be read successfully, the socket may be reused by another request in the application and is subsequently returned back to the shared pool. However, if a request still contains unread data then the underlying socket will remain open for some period of time before being explicitly disposed. This behavior will not allow the socket to be reused by the shared pool, causing additional performance degradation as each request will be required to establish a new socket connection with the service.

    Client Observed Behavior

    In some cases the Storage Client libraries provided will not retrieve the response stream from the HttpWebRequest (i.e. PUT operations), and therefore will not drain it, even though data is not sent by the server. Clients with KB2750149 installed that leverage these libraries may begin to encounter TCP/IP port exhaustion. When TCP/IP port exhaustion does occur a client will encounter the following Web and Socket Exceptions:

    System.Net.WebException: The underlying connection was closed: An unexpected error occurred on a send.

    - or -

    System.Net.WebException: Unable to connect to the remote server
    System.Net.Sockets.SocketException: Only one usage of each socket address (protocol/network address/port) is normally permitted.

    Note, if you are accessing storage via the Storage Client library these exceptions will be wrapped in a StorageException:

    Microsoft.WindowsAzure.Storage.StorageException: Unable to connect to the remote server

    System.Net.WebException: Unable to connect to the remote server 
    System.Net.Sockets.SocketException: Only one usage of each socket address (protocol/network address/port) is normally permitted
     

    Mitigations

    We have been working with the .NET team to address this issue. A permanent fix will be made available in the coming months which will reinstate this read ahead semantic in a time bounded manner. Until then clients can take the following actions to mitigate this issue in their applications:

    Upgrade to latest version of the Storage Client (2.0.5.1)

    An update was made for the 2.0.5.1 (nuget, github) version of the Storage Client library to address this issue, if possible please upgrade your application to use the latest assembly.

    Uninstall KB2750149

    We also recognize that some clients may be running applications that still utilize the 1.7 version of the storage client and may not be able to easily upgrade to the latest version without additional effort. For such users, consider uninstalling the update until the .NET team releases a permanent fix for this issue. We will update this blog, once such fix is available.

    Update applications that leverage the REST API directly to explicitly drain the response stream

    Any client application that directly references the Windows Azure REST API can be updated to explicitly retrieve the response stream of the request via [Begin/End]GetResponseStream() and drain it manually, i.e. by calling the Read or BeginRead methods until the end of the stream is reached.
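
    For illustration, a minimal sketch of draining a response synchronously, assuming an HttpWebRequest named request that has already been constructed:

    // You will need the following using statements
    using System.IO;
    using System.Net;
    
    // Read the response body to its end so the underlying connection can be reused.
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    using (Stream responseStream = response.GetResponseStream())
    {
        byte[] buffer = new byte[4096];
    
        // Keep reading until the server terminates the (possibly chunked) body.
        while (responseStream.Read(buffer, 0, buffer.Length) > 0)
        {
        }
    }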

    Summary

    We apologize for any inconvenience this may have caused. We are actively working with the .NET team to provide a permanent resolution to this issue that will not require any modification of client source code. Please feel free to leave questions or comments below.

    Joe Giardino, Serdar Ozler, and Jean Ghanem

    Resources

    Windows Azure Storage Client library 2.0.5.1 (nuget, github)

    Original KB article: http://support.microsoft.com/kb/2750149

    PartitionKey or RowKey containing the percent ‘%’ character causes some Windows Azure Tables APIs to fail


    Description and Symptoms

    We have identified an issue that would affect services using Windows Azure Tables whenever the percent character ‘%’ appears as part of the PartitionKey or RowKey.

    The affected APIs are GET entity, Merge Entity, Update Entity, Delete Entity, Insert Or Merge Entity and Insert Or Replace Entity APIs. If any of these APIs are invoked with a PartitionKey or RowKey that contains the ‘%’ character, the user could erroneously receive a 404 Not Found or 400 Bad Request error code. In addition, in the case of upsert (Insert Or Merge Entity and Insert Or Replace APIs), the request might succeed but the stored string might not be what the user intended it to be.

    Note that the Insert Entity, Entity Group Transaction and Query Entities APIs are not affected, since the PartitionKey and RowKey are not part of the URL path segment.

    Root Cause

The Windows Azure Table Service is double decoding the URL path segment when processing a request, which results in an erroneous interpretation of the string whenever the ‘%’ character appears. Note that the query string portion of the URL is not affected by this issue, nor is any URL that appears as part of the HTTP body. Therefore, any other property filters used in a query will be unaffected by this issue – only PartitionKey and RowKey are affected.

    Here is an example of how this issue occurs: Inserting an entity with PartitionKey = “Metric%25” and RowKey = “Count” would succeed, since PartitionKey, RowKey and custom values are part of the request payload and not the URL path segment. Now, when you intend to retrieve this existing entity, the Get Entity HTTP URL will look like:

    http://foo.table.core.windows.net/Metrics(PartitionKey='Metric%2525',RowKey='Count')

However, due to the double decoding bug, the PartitionKey is interpreted as “Metric%” on the server side, which is not what the user intended. In this case, a 404 Not Found is returned.
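
To make the double decoding concrete, here is a small illustrative sketch (not a workaround) showing how one round of encoding and two rounds of decoding interact for this key:

string partitionKey = "Metric%25";

// The client encodes the key once when building the URL path segment:
string encodedOnce = Uri.EscapeDataString(partitionKey);     // "Metric%2525"

// Decoding once, as intended, recovers the original key:
string decodedOnce = Uri.UnescapeDataString(encodedOnce);    // "Metric%25"

// Decoding a second time, as the service currently does, yields a different key,
// which does not match the stored entity and results in 404 Not Found:
string decodedTwice = Uri.UnescapeDataString(decodedOnce);   // "Metric%"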

    Workarounds

If you have not yet committed any entities where ‘%’ is used as part of the PartitionKey or RowKey, we suggest that you consider the following:

1. Avoid using ‘%’ as part of your PartitionKey and RowKey and consider replacing it with another character, for example ‘-‘.
2. Consider using URL-safe Base64 encoding for your PartitionKey and RowKey values (see the sketch following the note below).

    Note: Do not double encode your PartitionKey and RowKey values as a workaround, since this would not be compatible with future Windows Azure Tables releases when a fix is applied on the server side.
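
Here is a minimal sketch of the Base64 workaround mentioned in item 2 above. The helper class below is illustrative and not part of the Windows Azure Storage Client Library; it replaces the standard Base64 characters ‘+’ and ‘/’ because ‘/’ is not allowed in PartitionKey and RowKey values and ‘+’ is problematic in URLs (requires System and System.Text):

public static class TableKeyEncoder
{
    // Encode an arbitrary string into a URL-safe Base64 form suitable for use as a PartitionKey or RowKey.
    public static string EncodeKey(string rawKey)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(rawKey);
        return Convert.ToBase64String(bytes).Replace('+', '-').Replace('/', '_');
    }

    // Reverse of EncodeKey: recover the original key string.
    public static string DecodeKey(string encodedKey)
    {
        string base64 = encodedKey.Replace('-', '+').Replace('_', '/');
        return Encoding.UTF8.GetString(Convert.FromBase64String(base64));
    }
}

For example, an entity whose logical key is “Metric%25” would be stored with PartitionKey = TableKeyEncoder.EncodeKey("Metric%25") and decoded with TableKeyEncoder.DecodeKey when read back.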

    In case you already have inserted entities where ‘%’ appears as part of the PartitionKey or RowKey, we suggest the following workarounds:

    1. For Get Entity:
      • Use the Entity Group Transaction with an inner GET Entity command (refer to the example in the subsequent section)
  • Use the Query Entities API by relying on the $filter query option when retrieving a single entity. While this is not possible for users of the Windows Azure Storage Client library or the WCF Data Services Client library, this workaround is available to users who have control over the wire protocol. As an example, consider the following URL syntax when querying for the same entity mentioned in the “Root Cause” section above:
        http://foo.table.core.windows.net/Metrics()?$filter=(PartitionKey%20eq%20'Metric%2525')%20and%20(RowKey%20eq%20'Count')
2. For the Update Entity, Merge Entity, Delete Entity, Insert Or Merge Entity and Insert Or Replace Entity APIs, use the Entity Group Transaction with the inner operation that you wish to perform (refer to the example in the subsequent section).

Windows Azure Storage Client Library Workaround Code Example

    Consider the case where the user has already inserted an entity with PartitionKey = “Metric%25” and RowKey = “Count”. The following code shows how to use the Windows Azure Storage Client Library in order to retrieve and update that entity. The code uses the Entity Group Transaction workaround mentioned in the previous section. Note that both the Get Entity and Update Entity operations are performed as a batch operation.

// Creating a Table Service context
TableServiceContext tableServiceContext = new TableServiceContext(tableClient.BaseUri.ToString(), tableClient.Credentials);

// Create a single point query
DataServiceQuery<MetricEntity> getEntityQuery = (DataServiceQuery<MetricEntity>)
    (from entity in tableServiceContext.CreateQuery<MetricEntity>(customersTableName)
     where entity.PartitionKey == "Metric%25" && entity.RowKey == "Count"
     select entity);

// Create an entity group transaction with an inner Get Entity request
DataServiceResponse batchResponse = tableServiceContext.ExecuteBatch(getEntityQuery);

// There is only one response as part of this batch
QueryOperationResponse response = (QueryOperationResponse)batchResponse.First();

if (response.StatusCode == (int)HttpStatusCode.OK)
{
    IEnumerator queryResponse = response.GetEnumerator();
    queryResponse.MoveNext();

    // Read this single entity
    MetricEntity singleEntity = (MetricEntity)queryResponse.Current;

    // Update the entity
    singleEntity.MetricValue = 100;
    tableServiceContext.UpdateObject(singleEntity);

    // Make sure to save with the Batch option
    tableServiceContext.SaveChanges(SaveChangesOptions.Batch);
}

    Java Storage Client Workaround Code Example

As the issue discussed above is related to the service, the same behavior will be exhibited when performing single-entity operations using the Storage Client Library for Java. However, it is also possible to use an Entity Group Transaction to work around this issue. The latest version of the library, which can be used to implement the proposed workaround, can be found here.

// Define a batch operation.
TableBatchOperation batchOperation = new TableBatchOperation();

// Retrieve the entity.
batchOperation.retrieve("Metric%25", "Count", MetricEntity.class);

// Submit the operation to the table service.
tableClient.execute("foo", batchOperation);

    For more on working with Tables via the Java Storage Client see: http://blogs.msdn.com/b/windowsazurestorage/archive/2012/03/05/windows-azure-storage-client-for-java-tables-deep-dive.aspx

    Long Term Fix

    We will be fixing this issue as part of a version change in a future release. We will update this post with the storage version that contains the fix.

    We apologize for any inconvenience this may have caused.

    Jean Ghanem

    Character Encoding Issues Related to Copy Blob API

    This blog applies to the 2011-08-18 storage version or earlier of the Copy Blob API and the Windows Azure Storage Client Library version 1.6.

    Two separate problems are discussed in this blog:

1. Over REST, the service expects the ‘+’ character appearing as part of the x-ms-copy-source header to be percent encoded. When the ‘+’ is not URL encoded, the service will interpret it as a space ‘ ’ character.
    2. The Windows Azure Storage Client Library is not URL percent encoding the x-ms-copy-source header value. This leads to a misinterpretation of x-ms-copy-source blob names that include the percent ‘%’ character.

    When using Copy Blob, character ‘+’ appearing as part of the x-ms-copy-source header must be URL percent encoded

When using the Copy Blob API, the x-ms-copy-source header value must be URL percent encoded. However, when the server decodes the string, it converts the ‘+’ character to a space, which might not be compatible with the encoding rule applied by the client, and in particular by the Windows Azure Storage Client Library.

Example: Assume that an application wants to copy from a source blob with the following key information: AccountName = “foo”, ContainerName = “container”, BlobName = “audio+video.mp4”

    Using the Windows Azure Storage Client Library, the following value for the x-ms-copy-source header is generated and transmitted over the wire:

    x-ms-copy-source: /foo/container/audio+video.mp4

When the data is received by the server, the blob name would be interpreted as “audio video.mp4”, which is not what the user intended. A compatible header would be:

    x-ms-copy-source: /foo/container/audio%2bvideo.mp4

In that case, when decoding this header, the server would interpret the blob name correctly as “audio+video.mp4”.

    NOTE: The described server behavior in this blog does not apply to the request URL but only applies to the x-ms-copy-source header that is used as part of the Copy Blob API with version 2011-08-18 or earlier.

    To get correct Copy Blob behavior, please consider applying the following encoding rules for the x-ms-copy-source header:

1. URL percent-encode the ‘+’ character as “%2b”.
2. URL percent-encode the space character ‘ ‘ as “%20”. Note that if you currently encode the space character as ‘+’, the current server behavior will interpret it as a space when decoding; however, this behavior is not compatible with the rule for decoding request URLs, where ‘+’ remains ‘+’ after decoding. (See the sketch after this list.)
3. In case you are using the Windows Azure Storage Client Library, please apply the workaround at the end of this post.
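
For applications that call the REST API directly, here is a minimal sketch of applying rules 1 and 2 above when building the header value. The account and container names below are placeholders and are assumed to need no escaping themselves; Uri.EscapeDataString percent-encodes ‘+’ as “%2B” and space as “%20”:

string accountName = "foo";
string containerName = "container";
string blobName = "audio+video.mp4";

// Escape only the blob name segment so that '+' and ' ' survive the service-side decoding.
string copySource = string.Format(
    "/{0}/{1}/{2}",
    accountName,
    containerName,
    Uri.EscapeDataString(blobName));   // "/foo/container/audio%2Bvideo.mp4"

// copyRequest is the HttpWebRequest for the destination blob, built elsewhere:
// copyRequest.Headers.Add("x-ms-copy-source", copySource);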

    Windows Azure Storage Client Library is not URL encoding the x-ms-copy-source header

As described in the previous section, the x-ms-copy-source header must be URL percent encoded. However, the Windows Azure Storage Client Library transmits the blob name in an un-encoded manner. Therefore, any blob name that contains the percent character ‘%’ followed by a hex number will be misinterpreted on the server side.

Example: Assume that an application wants to copy from a source blob with the following key information: AccountName = “foo”, ContainerName = “container”, BlobName = “data%25.txt”

    Using the Windows Azure Storage Client Library, the following un-encoded value for the x-ms-copy-source header is generated and transmitted over the wire:

    x-ms-copy-source: /foo/container/data%25.txt

Data received by the server will be URL decoded, and therefore the blob name would be interpreted as “data%.txt”, which is not what the user intended. A compatible header would be:

    x-ms-copy-source: /foo/container/data%2525.txt

In that case, when decoding this header, the server would interpret the blob name correctly as “data%25.txt”.

    Note that this bug exists in Version 1.6 of the client library and will be fixed in future releases.

    Windows Azure Storage Client Library Code Workaround

As described in the previous sections, the current behavior of the Copy Blob APIs exposed by the client library will not work properly if the characters ‘+’ or ‘%’ appear as part of the source blob name. The affected APIs are CloudBlob.CopyFromBlob and CloudBlob.BeginCopyFromBlob.

To get around this issue, we have provided the following extension method, which creates a safe CloudBlob object that can be used as the sourceBlob with any of the copy blob APIs. Please note that the returned object should not be used to access the blob or to perform any action on it.

    Note: This workaround is needed for Windows Azure Storage Library version 1.6.

public static class CloudBlobCopyExtensions
{
    /// <summary>
    /// This method converts a CloudBlob to a version that can be safely used as a source for the CopyFromBlob or BeginCopyFromBlob APIs only.
    /// The returned object must not be used to access the blob, nor should any of its APIs be invoked.
    /// This method should only be used against storage version 2011-08-18 or earlier
    /// and with Windows Azure Storage Client version 1.6.
    /// </summary>
    /// <param name="originBlob">The source blob that is being copied</param>
    /// <returns>CloudBlob that can be safely used as a source for the CopyFromBlob or BeginCopyFromBlob APIs only.</returns>
    public static CloudBlob GetCloudBlobReferenceAsSourceBlobForCopy(this CloudBlob originBlob)
    {
        Uri srcUri = originBlob.Uri;

        // Encode the blob name segment twice using UrlEncode
        string encodedBlobName = HttpUtility.UrlEncode(
                                    HttpUtility.UrlEncode(
                                        originBlob.Name));

        // Keep everything up to (but not including) the encoded blob name, then append the double-encoded name
        string firstPart = srcUri.OriginalString.Substring(
            0, srcUri.OriginalString.Length - Uri.EscapeUriString(originBlob.Name).Length);
        string encodedUrl = firstPart + encodedBlobName;

        return new CloudBlob(
            encodedUrl,
            originBlob.SnapshotTime,
            originBlob.ServiceClient);
    }
}

    Here is how the above method can be used:

// Create a blob by uploading data to it
CloudBlob someBlob = container.GetBlobReference("a+b.txt");
someBlob.UploadText("test");

CloudBlob destinationBlob = container.GetBlobReference("a+b(copy).txt");

// The object below should only be used when issuing a copy. Do not use sourceBlobForCopy to access the blob.
CloudBlob sourceBlobForCopy = someBlob.GetCloudBlobReferenceAsSourceBlobForCopy();
destinationBlob.CopyFromBlob(sourceBlobForCopy);

    We will update this blog once we have fixed the service. We apologize for any inconvenience that this may have caused.

    Jean Ghanem

    10x Price Reduction for Windows Azure Storage Transactions

We heard you loud and clear that you want cheaper transaction costs for Windows Azure Blobs, Tables, Queues, and Drives. We are therefore very pleased today to slash transaction prices 10-fold for Windows Azure Storage and CDN. This means that it now costs $0.01 for 100,000 transactions ($1 per 10 million). This applies to all transactions for Windows Azure Storage Blobs (PutBlob, GetBlob, DeleteBlob, etc.), Tables (PutEntity, GetEntity, Table Queries, Batch Transactions, etc.), Queues (PutMessage, GetMessage, DeleteMessage, UpdateMessage, etc.), as well as transactions to VHDs stored in Windows Azure Storage from Drives and the new IaaS Data Disks that were just released. Pricing details can be found here.
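
As a quick back-of-the-envelope illustration of the new rate (the monthly transaction count below is purely hypothetical):

// $0.01 per 100,000 transactions, i.e. $1 per 10 million.
decimal pricePerTransaction = 0.01m / 100000m;                    // $0.0000001 per transaction

long monthlyTransactions = 3000000000;                            // e.g. 3 billion transactions in a month
decimal monthlyCost = monthlyTransactions * pricePerTransaction;  // $300 per month (vs. $3,000 at the old rate)

Console.WriteLine(monthlyCost);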

The Windows Azure Storage service was built from the ground up to provide storage at massive scale that is highly available and durable. We have provided a storage solution that scales out and load balances automatically, so it does not require manual sharding techniques to be applied. Our storage stack is layered to provide different types of storage abstractions, as described in our SOSP paper. It provides the following four data abstractions:

    Windows Azure Blob Service: supports storing large scale unstructured data. Think of it as your file store in the cloud. It empowers developers to build internet scale applications like a document store, media sharing for social networking sites, device backups, etc. In addition, our Windows Azure CDN can be utilized to ensure that the blobs stored are delivered to end users efficiently by making use of the 24+ worldwide caching locations.

Windows Azure Table Service: is a NoSQL structured store system that auto-scales, enabling users to build applications requiring massive-scale structured storage. It provides an OData interface to access the structured store system. Distributed systems that require massive scale can benefit from storing their structured data in this NoSQL store – example scenarios include: keeping track of users for social sites that can grow to support millions of users, CRM data, queryable metadata for a massive number of items/objects, etc.

Windows Azure Queue Service: is an asynchronous messaging system that enables reliable inter-role or component communication for large-scale distributed systems. It provides a lease-based message processing system to effectively deal with failures during message processing. It also allows updating of messages, which enables more efficient continuation after a failure. Example scenarios – a web role enqueues work for worker roles to process asynchronously (image processing, virus scanning, report building, etc.), queues are used for workflows like order processing, etc.

Windows Azure Disks, Images and Drives: A Windows Azure Virtual Machine allows you to easily deploy an IaaS Windows Server or Linux virtual machine and hence migrate your legacy applications to the cloud without having to change them. With a Windows Azure Virtual Machine, you need to associate at least one disk with the VM for your operating system. This disk is a VHD stored as a page blob in Windows Azure Storage. In addition, you can attach multiple data disks to the virtual machine, and these data disks are VHDs stored as page blobs. All VHDs are fixed-format, and all writes to the disk are converted to PutPage transactions that are sent to your storage account in the Windows Azure Blob Service, which provides durability for all writes to the IaaS disks. In addition, if you take an image of your virtual machine, it is also stored as a VHD-formatted page blob in the Windows Azure Blob Service. These images can then be used to load virtual machines. Then for PaaS, we also have Windows Azure Drives, which allow Windows Azure PaaS roles to dynamically network-mount a page blob formatted as a single-volume VHD. Both Disks (used for IaaS) and Drives (used for PaaS) are network-mounted durable VHDs stored in page blobs, and all transactions to those blobs count towards the billable transactions for the storage account in which they are contained.

To get started, please visit the Windows Azure website and register your Windows Azure Storage account. We provide easy-to-use and open REST APIs, in addition to client libraries in various languages such as .NET, Java, and Node.js, making the storage service available to a large number of developers. You can download easy-to-use storage client libraries for your favorite language here and start building applications that require large-scale storage.

    The following resources provide additional information:

    Brad Calder

    New Storage Features on the Windows Azure Portal

    We are excited to announce several new storage features available on the Windows Azure Portal. With the updated Portal you have the ability to choose the level of redundancy for your storage, enable/disable and configure both Metrics and Logging, and view metrics in tabular and graphic format.

    When you create your storage account, you can now configure the type of redundant storage that meets your business needs – Geo Redundant Storage (geo-replication is enabled) or Locally Redundant Storage (geo-replication is disabled). To learn more about the different types of redundant storage, please read this blog post. You can also update your storage selection (enable/disable geo replication) after your account has been created in the ‘Storage/Configure’ section of the portal.

In the portal you can now also configure Windows Azure Storage Analytics. You can use the portal to enable/disable Metrics and Logging as well as to configure all settings; for full details on analytics, please read this blog post. After you configure metrics, you can choose which of the available metrics you want to monitor in the portal. You can also select which metrics to plot on metrics charts. Note that all available metrics (based on what you have configured) will be captured, as described in this blog post.

    As always, the portal also provides the ability to update your storage keys and delete your storage account. Please note that if you delete your storage account, there is no way to restore your data and the account name will no longer be reserved.

    The following resources provide additional information:

    • For details on account creation in the portal, please review this How To documentation.
    • For details on selecting your redundant storage choice for existing accounts, and related pricing information, please review this How To documentation.
    • For details on how to configure analytics in the Portal, please review this How To documentation.
    • For more details on configuring analytics via APIs, please review this blog post.

    We hope you enjoy the new storage features in the portal, and welcome your feedback and suggestions for further improvements!

    Monilee Atkinson
