Comparing AWS SimpleDB and Windows Azure Table Storage – Part II

In my last post, I took an initial look at the Amazon Web Services (AWS) SimpleDB product and compared it to the Microsoft Windows Azure Table storage.  I showed that both solutions are relatively similar in that they embrace a loosely typed, flexible storage strategy and both provide a bit of developer tooling.  In that post, I walked through a demonstration of SimpleDB using the AWS SDK for .NET.

In this post, I’ll perform a quick demonstration of the Windows Azure Table storage product and then conclude with a few thoughts on the two solution offerings.  Let’s get started.

Windows Azure Table Storage

First, I’m going to define a .NET object that represents the entity being stored in Azure Table storage.  Remember that, as pointed out in the previous post, Azure Table storage is schema-less, so this new .NET object is just a representation used for creating and querying the Azure Table.  It has no bearing on the underlying Azure Table structure. However, accessing the Table through a typed object does differ from AWS SimpleDB, whose .NET API model is fully type-less.

I’ve built a new WinForm .NET project that will interact with the Azure Table.  My Azure Table will hold details about different conferences that are available for attendance.  My “conference record” object inherits from TableServiceEntity.

public class ConferenceRecord : TableServiceEntity
{
    public ConferenceRecord()
    {
        PartitionKey = "SeroterPartition1";
        RowKey = System.Guid.NewGuid().ToString();
    }

    public string ConferenceName { get; set; }
    public DateTime ConferenceStartDate { get; set; }
    public string ConferenceCategory { get; set; }
}

Notice that I have both a partition key and a row key value.  The PartitionKey attribute is used to identify and organize data entities.  Entities with the same PartitionKey are physically co-located, which in turn helps performance.  The RowKey attribute uniquely identifies a row within a given partition.  The PartitionKey + RowKey combination must be unique.
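To illustrate, here's a quick sketch of two records built with the class above (the conference names are just illustrative).  Because the constructor assigns the same PartitionKey and a fresh GUID-based RowKey to each instance, the two records land in the same partition but remain distinct rows:

```csharp
// Both records get "SeroterPartition1" from the constructor, so Azure
// stores them in the same partition; each gets its own GUID RowKey,
// keeping every PartitionKey + RowKey combination unique.
ConferenceRecord first = new ConferenceRecord { ConferenceName = "TechEd" };
ConferenceRecord second = new ConferenceRecord { ConferenceName = "PDC" };

bool samePartition = first.PartitionKey == second.PartitionKey;  // true
bool distinctRows = first.RowKey != second.RowKey;               // true
```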

Next up, I built a table context class which is used to perform operations on the Azure Table.  This class inherits from TableServiceContext and has operations to get, add, and update ConferenceRecord objects in the Azure Table.

public class ConferenceRecordDataContext : TableServiceContext
{
    public ConferenceRecordDataContext(string baseAddress, StorageCredentials credentials)
        : base(baseAddress, credentials)
    { }

    public IQueryable<ConferenceRecord> ConferenceRecords
    {
        get
        {
            return this.CreateQuery<ConferenceRecord>("ConferenceRecords");
        }
    }

    public void AddConferenceRecord(ConferenceRecord confRecord)
    {
        this.AddObject("ConferenceRecords", confRecord);
        this.SaveChanges();
    }

    public void UpdateConferenceRecord(ConferenceRecord confRecord)
    {
        this.UpdateObject(confRecord);
        this.SaveChanges();
    }
}

In my WinForm code, I have a class variable of type CloudStorageAccount which is used to interact with the Azure account.  When the “connect” button is clicked on my WinForm, I establish a connection to the Azure cloud.  This is where Microsoft’s tooling is pretty cool.  I have a local “fabric” that represents the various Azure storage options (table, blob, queue) and can leverage this fabric without ever provisioning a live cloud account.

[Screenshot: the local development storage fabric, showing the table, blob, and queue services]

Connecting to my development storage through the CloudStorageAccount looks like this:

string connString = "UseDevelopmentStorage=true";

storageAcct = CloudStorageAccount.Parse(connString);

After connecting to the local (or cloud) storage, I can create a new table using the ConferenceRecordDataContext type definition, the table endpoint URI, and my cloud credentials.

CloudTableClient.CreateTablesFromModel(
    typeof(ConferenceRecordDataContext),
    storageAcct.TableEndpoint.AbsoluteUri,
    storageAcct.Credentials);

Now I instantiate my table context object, which will add new entities to my table.

string confName = txtConfName.Text;
string confType = cbConfType.Text;
DateTime confDate = dtStartDate.Value;

var context = new ConferenceRecordDataContext(
    storageAcct.TableEndpoint.ToString(),
    storageAcct.Credentials);

ConferenceRecord rec = new ConferenceRecord
{
    ConferenceName = confName,
    ConferenceCategory = confType,
    ConferenceStartDate = confDate
};

context.AddConferenceRecord(rec);

Another nice tool built into Visual Studio 2010 (with the Azure extensions) is the Azure viewer in the Server Explorer window.  Here I can connect to either the local fabric or the cloud account.  Before I run my application for the first time, I can see that my Table list is empty.

[Screenshot: Server Explorer with an empty Table list]

If I start up my application and add a few rows, I can see my new Table.

[Screenshot: Server Explorer showing the new ConferenceRecords table]

I can do more than just see that my table exists.  I can right-click that table and choose to View Table, which pulls up all the entities within the table.

[Screenshot: View Table, listing the entities within the table]

Performing a lookup from my Azure Table via code is fairly simple: I can either loop through all the entities with a “foreach” and a conditional, or I can use LINQ.  Here I grab all conference records whose ConferenceCategory is equal to “Technology”.

var val = from c in context.ConferenceRecords
            where c.ConferenceCategory == "Technology"
            select c;

Now, let’s prove that the underlying storage is indeed schema-less.  I’ll go ahead and add a new attribute to the ConferenceRecord object type and populate its value in the WinForm UI.  A ConferenceAttendeeLimit of type int was added to the class and then assigned a random value in the UI.  Sure enough, my underlying table was updated with the new “column” and data value.
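For reference, the updated entity class would look something like this sketch — the original properties plus the new ConferenceAttendeeLimit:

```csharp
public class ConferenceRecord : TableServiceEntity
{
    public ConferenceRecord()
    {
        PartitionKey = "SeroterPartition1";
        RowKey = System.Guid.NewGuid().ToString();
    }

    public string ConferenceName { get; set; }
    public DateTime ConferenceStartDate { get; set; }
    public string ConferenceCategory { get; set; }

    // New attribute; previously stored entities simply lack this "column"
    public int ConferenceAttendeeLimit { get; set; }
}
```

No table migration or schema change is needed — only entities saved after this change carry the new property.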

[Screenshot: table entities showing the new ConferenceAttendeeLimit column and value]

I can also update my LINQ query to look for all conferences where the attendee limit is greater than 100, and only the record with the newly added column is returned.
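That revised query might look like the sketch below, assuming the table context object created earlier and the ConferenceAttendeeLimit property added above:

```csharp
// Only entities whose ConferenceAttendeeLimit exceeds 100 come back;
// older rows that never stored the property won't match.
var bigConferences = from c in context.ConferenceRecords
                     where c.ConferenceAttendeeLimit > 100
                     select c;

foreach (ConferenceRecord conf in bigConferences)
{
    Console.WriteLine("{0} (limit: {1})",
        conf.ConferenceName, conf.ConferenceAttendeeLimit);
}
```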

Summary of Part II

In this second post of the series, we’ve seen that the Windows Azure Table storage product is relatively straightforward to work with.  I find the AWS SimpleDB documentation to be better (and more current) than the Windows Azure storage documentation, but the Visual Studio-integrated tooling for Azure storage is really handy.  AWS has a lower cost of entry, as many AWS products don’t charge you a dime until you reach certain usage thresholds.  This differs from Windows Azure, where you pretty much pay from day one for any type of usage.

All in all, both of these products are useful for high-performing, flexible data repositories.  I’d definitely recommend getting more familiar with both solutions.



Categories: Cloud, Windows Azure
