David Chappell: Azure Services Platform, Part 2 (translation)
The first part of the translation
A detailed look at the technologies
Getting an overview of the Azure Services Platform is an important first step, but a deeper understanding of each technology is also needed. In this section, each member of the Azure family is examined in more detail.
Windows Azure
Windows Azure does two main things: it runs applications and stores their data. Accordingly, this section is divided into two parts, one devoted to each. How the two interact is also important, so that interaction is described here as well.
Running applications
On Windows Azure, an application typically has multiple instances, each running part or all of the application's code. Each instance runs in its own virtual machine. All of the virtual machines run Windows Server 2008 and are managed by a hypervisor designed specifically for use in the cloud.
At the same time, a Windows Azure application cannot see the virtual machine it runs on. A developer cannot supply his own VM image for Windows Azure to run, but neither does he need to worry about maintaining this copy of Windows. Instead, the CTP version lets developers build .NET 3.5 applications out of two types of instances: one runs a Web Role, the other a Worker Role. Figure 6 shows how it all works.
Fig. 6 In the CTP version, the Windows Azure application consists of Web Role and Worker Role instances, each of which runs on its own virtual machine.
As the name implies, a Web Role instance accepts incoming HTTP (or HTTPS) requests through Internet Information Services (IIS) 7. A Web Role can be implemented using ASP.NET, WCF, or any other .NET technology that works with IIS. As Figure 6 shows, Windows Azure provides a built-in load balancer that spreads requests across the Web Role instances that are part of the same application.
A Worker Role, by contrast, does not accept requests directly from the outside world: it cannot have inbound network connections, and IIS is not running in its virtual machines. Instead, it gets its input from Web Role instances, typically via a queue in Windows Azure storage. A Worker Role instance can write its results to Windows Azure storage or send them to the outside world, since outbound network connections are allowed. Unlike a Web Role instance, which is created to handle an incoming HTTP request and shuts down once that request has been processed, a Worker Role can run indefinitely: it is a background job. Reflecting this generality, a Worker Role can be implemented with any .NET technology that has a main() method (subject to some limitations of Windows Azure trust, described below).
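To make this division of labor concrete, here is a minimal sketch of the Worker Role pattern. The ITaskQueue interface is a hypothetical stand-in for whatever wrapper an application puts over a Windows Azure storage queue; its names are invented for illustration, and only the polling loop itself is the point.

using System;
using System.Threading;

// Hypothetical abstraction over a Windows Azure storage queue; these
// names are invented for illustration, not taken from the SDK.
public interface ITaskQueue
{
    string TryDequeue();        // returns null when the queue is empty
    void Complete(string msg);  // deletes a message once its work is done
}

public static class WorkerRoleEntry
{
    // A Worker Role has no IIS and no inbound connections: it simply loops,
    // pulling work items that Web Role instances have enqueued.
    public static void Run(ITaskQueue tasks)
    {
        while (true)  // a background job that runs as long as the instance lives
        {
            string msg = tasks.TryDequeue();
            if (msg == null) { Thread.Sleep(1000); continue; }  // back off when idle
            Console.WriteLine("processing: " + msg);            // real work goes here
            tasks.Complete(msg);  // delete only after success, so a crash retries it
        }
    }
}

Deleting the message only after the work succeeds means a crashed instance leaves the message behind for another instance to pick up.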
Each virtual machine that hosts a Web Role or Worker Role instance also contains a Windows Azure agent, through which the application interacts with the Windows Azure fabric, as Figure 6 shows. The agent exposes a Windows Azure-defined API that lets an application write to a log maintained by Windows Azure, send alerts to its owner through the fabric, and more.
Although this may change in the future, each virtual machine in the CTP has its own physical processor core. Because of this, the performance of every application can be guaranteed: each Web Role and Worker Role instance gets its own dedicated core. To increase an application's throughput, its owner can raise the number of running instances specified in the application's configuration file. The Windows Azure fabric will then spin up new virtual machines, assign them cores, and start more instances of the application. The fabric also detects when a Web Role or Worker Role instance has failed and starts a new one.
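A rough idea of what "specified in the configuration file" looks like is sketched below. This only approximates the CTP's service configuration format, and the service and role names are invented.

<?xml version="1.0"?>
<!-- Approximate sketch of a service configuration file; the real schema
     may differ. Raising a count and redeploying tells the fabric to spin
     up more instances of that role. -->
<ServiceConfiguration serviceName="ExampleService">
  <Role name="WebRole">
    <Instances count="4" />
  </Role>
  <Role name="WorkerRole">
    <Instances count="2" />
  </Role>
</ServiceConfiguration>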
Note a corollary of this: to be scalable, Web Role instances must be stateless. Any client-specific state should be written to Windows Azure storage or passed back to the client in a cookie. Statelessness is also required by Windows Azure's built-in load balancer: because it does not allow creating an affinity with a particular Web Role instance, there is no way to guarantee that multiple requests from the same user will be handled by the same instance.
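As a minimal illustration of the cookie option, an ASP.NET Web Role might round-trip a user's state like this (the cookie name and the cart identifier are invented for the example):

using System.Web;

// Because any instance may serve the next request, per-user state lives on
// the client (or in Windows Azure storage), never in instance memory.
public static class StatelessStateHelper
{
    public static void Save(HttpResponse response, string cartId)
    {
        response.Cookies.Add(new HttpCookie("cartId", cartId));
    }

    public static string Load(HttpRequest request)
    {
        HttpCookie cookie = request.Cookies["cartId"];
        return cookie == null ? null : cookie.Value;
    }
}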
Both Web Roles and Worker Roles are built with standard .NET technologies, but migrating existing applications to Windows Azure unchanged will usually not work. For one thing, data access is different. Windows Azure storage is accessed through ADO.NET Data Services, a relatively new technology that is not yet universally used in on-premises applications. Similarly, a Worker Role typically relies on a queue in Windows Azure storage for its input, an abstraction that does not exist for on-premises applications. Another restriction is that Windows Azure applications do not run in a fully trusted environment; they are limited to what Microsoft calls Windows Azure trust, which is close to the medium trust most ASP.NET hosters enforce.
For developers, building a Windows Azure application with the CTP version looks much like building a traditional .NET application. Microsoft provides Visual Studio 2008 project templates for creating Windows Azure Web Roles, Worker Roles, and combinations of the two. Developers can use any .NET language (although it is fair to say that Microsoft's initial focus has been on C#). The Windows Azure SDK also includes a version of the Windows Azure environment that runs on the developer's machine. It contains storage, a Windows Azure agent, and everything else an application would see running in the cloud. A developer can build and debug an application in this local facsimile of the cloud, then deploy it to the real cloud when it is ready. Still, some things really are different in the cloud.
Windows Azure also gives developers other services. For example, a Windows Azure application can raise an alert through the Windows Azure agent, which the agent then delivers by email, instant message, or some other means to the person it specifies. If desired, Windows Azure can itself detect an application failure and send an alert. The platform also provides detailed information about the application's resource consumption, including processor time, inbound and outbound traffic, and storage.
Data access
Applications work with data in many different ways. Sometimes all that is needed is simple blobs, while other situations call for a more structured way to store information. And in some cases, all that is really needed is a way to exchange data between different parts of an application. Windows Azure storage addresses all three of these requirements, as Figure 7 shows.
Fig. 7 Windows Azure allows you to store data in blobs, tables and queues, which are accessed through REST over HTTP.
The simplest way to store data in Windows Azure is in blobs. As Figure 7 suggests, the hierarchy is simple: a storage account can have one or more containers, and each container holds one or more blobs. Blobs can be big, up to 50 GB each, and to make transferring large blobs more efficient, each one can be subdivided into blocks. If a transmission fails, retransmission can resume from the most recently transferred block rather than starting over. Blobs can also carry metadata, such as where a JPEG photograph was taken or the composer of an MP3 song.
For some kinds of data, blobs are exactly right; in many situations, though, they are too unstructured. To let applications work with data in a finer-grained way, Windows Azure storage provides tables. Despite the name, these should not be confused with relational tables. In fact, even though they are called tables, the data in them is stored as a simple hierarchy of entities with properties. A table has no fixed schema; instead, properties can be of various types, such as Int, String, Bool, or DateTime. And rather than SQL, an application works with a table's data through a query language with LINQ syntax. A single table can be quite large, with millions of entities holding terabytes of data.
Both blobs and tables exist to store data. The third option in Windows Azure storage, queues, has a rather different purpose. The primary role of queues is to let Web Role instances communicate with Worker Role instances. For example, a user might submit a request to perform some compute-intensive task through a Web page implemented by a Web Role. The Web Role instance that receives the request writes a message describing the work into a queue. A Worker Role instance waiting on that queue then reads the message and carries out the task. It can return its results through another queue or in some other way.
Whatever the storage style, blobs, tables, or queues, all data in Windows Azure storage is kept in three copies. This provides fault tolerance, since losing a single copy is not fatal. The system also guarantees consistency, so an application that immediately reads data it has just written gets exactly what it expects.
Data in Windows Azure storage is accessible both to Windows Azure applications and to applications running elsewhere. In both cases, all three storage styles use REST conventions to identify and expose data. Everything is named with a URI and accessed through ordinary HTTP operations. A .NET client can use ADO.NET Data Services and LINQ, while, say, a Java application can reach Windows Azure data through plain REST. For example, a blob can be read with an HTTP GET against a URI formatted like this:
http://<StorageAccount>.blob.core.windows.net/<Container>/<BlobName>
Here <StorageAccount> is an identifier assigned when a new storage account is created; it uniquely identifies the blobs, tables, and queues created under that account. <Container> and <BlobName> are simply the names of the container and blob this request refers to.
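Since a blob read really is just an HTTP GET, a couple of lines of .NET are enough. The account, container, and blob names below are invented, and the sketch assumes the blob allows public read access; authenticated requests add a signed header not shown here.

using System;
using System.Net;

class BlobReadSketch
{
    static void Main()
    {
        var client = new WebClient();
        // Plain HTTP GET against the blob's URI; any platform can do this.
        byte[] data = client.DownloadData(
            "http://myaccount.blob.core.windows.net/photos/vacation.jpg");
        Console.WriteLine(data.Length + " bytes downloaded");
    }
}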
Similarly, a query against a particular table is expressed as an HTTP GET against a URI of the form:
http://<StorageAccount>.table.core.windows.net/<TableName>?$filter=<Query>
Here <TableName> identifies the table being queried, and <Query> is the query to run against that table.
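A minimal sketch of such a query from .NET follows. The account, table, and property names are invented, the filter expression follows ADO.NET Data Services query syntax, and the storage account's authentication header is omitted.

using System;
using System.Net;

class TableQuerySketch
{
    static void Main()
    {
        // $filter carries the query; the response comes back as a feed of
        // entities. Authentication for the account is omitted here.
        string uri = "http://myaccount.table.core.windows.net/Customers"
                   + "?$filter=" + Uri.EscapeDataString("Name eq 'Contoso'");
        var client = new WebClient();
        Console.WriteLine(client.DownloadString(uri));
    }
}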
Queues, too, are visible to both Windows Azure applications and external applications, through an HTTP GET against a URI of the form:
http://<StorageAccount>.queue.core.windows.net/<QueueName>
Windows Azure charges independently for compute resources and for storage. This means an on-premises application could use Windows Azure storage alone, accessing its data through REST as just described. Still, it is fair to say that the main purpose of Windows Azure storage is to hold data for Windows Azure applications. And because the storage is separate from the applications that use it, the data remains available even when the Windows Azure application using it is not running.
The main goal of an application platform, whether in the cloud or on-premises, is to support applications and data. Windows Azure provides an environment for both. Looking ahead, expect that applications that would once have been built as on-premises Windows applications will instead be built on this new cloud platform.
.NET Services
Running applications in the cloud is useful in itself, but cloud-based infrastructure services are just as valuable. Such services can be used by both on-premises and cloud applications, and they address problems that would otherwise be much harder to solve. This section looks at Microsoft's offerings in this area: the .NET Access Control Service, the .NET Service Bus, and the .NET Workflow Service, known collectively as .NET Services.
Access Control Service
Identity management is a fundamental part of most distributed applications. Based on a user's identity, an application decides what that user is allowed to do. To convey this information, applications can use tokens defined with the Security Assertion Markup Language (SAML). A SAML token contains claims, each of which carries some piece of information about a user. One claim might contain the user's name, another might indicate a role such as manager or executive, and a third might carry an email address. Tokens are created by software known as a security token service (STS), which digitally signs each token to attest to its source.
Once a client (a Web browser, say) has acquired a token for a user, it can present the token to an application. The application then uses the token's claims to decide what this user is allowed to do. Two problems arise immediately, though:
- What if the token does not contain the claims the application needs? With claims-based identity, each application is free to define the set of claims its users must present, and there is no guarantee that the STS that created a token put into it exactly what a given application requires.
- What if the application does not trust the STS that issued the token? An application cannot accept any token issued by just any STS. Instead, it typically has a list of certificates for the STSs it trusts, which lets it validate token signatures. Only tokens from those trusted STSs will be accepted.
Inserting another STS into the process can solve both problems. To make sure a token contains the right claims, this extra STS performs claims transformation: it can hold rules that define how input claims map to output claims, and it uses those rules to issue a token containing exactly the claims the application needs. Addressing the second problem, commonly called identity federation, requires the application to trust the new STS; a trust relationship must also be established between that STS and the one that issued the original token.
Adding an extra STS enables claims transformation and identity federation, both of which are very useful. But where should that STS run? It could run inside an organization; several vendors sell software for this. Why not run an STS in the cloud instead? That would make it accessible to users and applications in any organization, and it would shift the burden of operating and maintaining the STS to a service provider.
This is exactly what the Access Control Service offers: an STS in the cloud. To see how such an STS might be used, suppose an ISV provides an Internet-accessible application that users in many different companies can work with. Even if all of those companies can issue SAML tokens for their users, those tokens are unlikely to contain exactly the set of claims the application needs. Figure 8 shows how the Access Control Service addresses this problem.
Fig. 8 Access Control Service provides rule-based claims transformation and identity federation.
First, the user's application (a browser in this example, but it could equally be a WCF client or something else) sends the user's token to the Access Control Service (step 1). The service verifies the token's signature, confirming that the token was issued by an STS it trusts. The service then creates and signs a new SAML token containing the claims the target application requires (step 2).
To produce the new token, the STS in the Access Control Service applies rules defined by the owner of the target application. For example, imagine an application that grants certain rights to any user who is a manager in her own company. Although each company might include a claim in its tokens indicating that a user is a manager, those claims will probably all differ. One company might use the string "Manager", another "Supervisor", and a third a special code expressed as an integer. To help the application cope with this diversity, its owner could define rules that convert all three of these claims into a single claim containing the string "Decision Maker". The application's life then becomes much simpler.
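The service's own rule format is not shown in this paper, so the sketch below only illustrates the idea of such a mapping in ordinary C#; the claim values come from the example above, and the integer code is invented.

using System.Collections.Generic;

// Illustration only: three different input claims, one from each company,
// all map to the single claim the application actually understands.
static class ClaimMappingSketch
{
    static readonly Dictionary<string, string> Rules =
        new Dictionary<string, string>
        {
            { "Manager",    "Decision Maker" },  // company A's role string
            { "Supervisor", "Decision Maker" },  // company B's role string
            { "17",         "Decision Maker" },  // company C's numeric code
        };

    public static string Transform(string inputClaim)
    {
        string output;
        return Rules.TryGetValue(inputClaim, out output) ? output : null;
    }
}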
Once the new token has been created, the Access Control Service returns it to the client (step 3), which passes it on to the application (step 4). The application verifies the token's signature, making sure it really was issued by the Access Control Service STS. Note that while the Access Control Service STS must establish trust with the STS of each customer organization, the application itself needs to trust only the Access Control Service STS. Once the token is verified, the application uses the claims it contains to decide what the user is allowed to do.
Another way to use the Access Control Service follows from its name: an application can offload to the service the decision about what a user is allowed to access. For example, suppose access to a certain function of an application requires the user to present a particular claim. The application's rules in the Access Control Service can state that this claim is issued only to users who present some other claim, such as the Decision Maker claim described earlier. When the application receives a user's token, it can grant or deny access simply based on whether that claim is present: the decision has effectively been made for it by the Access Control Service. This approach lets administrators define access control rules in one common place, where they can be applied consistently across applications.
All interaction with the Access Control Service happens through the standard protocols WS-Trust and WS-Federation, which makes the service usable by any application on any platform. To define rules, the service provides both a browser-based GUI and an API for programmatic access.
Claims-based identity is well on its way to becoming the standard approach in distributed applications. By providing an STS in the cloud, complete with claims transformation, the Access Control Service makes this new approach all the more attractive.
Service Bus
Suppose you have an application running inside your organization that you want to expose to software in other companies over the Internet. At first glance, this looks simple. Assuming the application's functionality is exposed as Web services (REST or SOAP), you could just make those Web services visible to the outside world. When you actually try, though, problems appear immediately.
First, how can applications in other companies (or even other parts of your own application) find the endpoints through which to reach your services? Something like a registry, through which your application can be discovered, would help. And once the application has been found, how do requests from software in other organizations actually reach it? Network Address Translation (NAT) is common, so an application often has no fixed external IP address. And even where NAT is not used, how does the request get through the firewall? Opening specific ports for an application is possible, but most network administrators frown on it.
Service Bus addresses these problems, as Figure 9 shows.
Fig. 9 Service Bus allows applications to register their access points, then other applications can discover and use them to access application services.
To begin, your application registers one or more endpoints with Service Bus (step 1), which then exposes them on your behalf. Service Bus assigns your organization a URI root, beneath which you are free to create any naming hierarchy you like, so your endpoints can be given specific, discoverable URIs. Your application must also open a connection with Service Bus for each endpoint it exposes. Service Bus holds this connection open, which solves two problems. First, NAT is no longer an issue, since traffic from Service Bus always flows to the application over the connection the application itself opened. Second, because the connection was initiated from inside the organization's network, the firewall will let responses flow back through it without any new inbound ports being opened.
When an application in some other company (or even another part of your own application) needs to reach your application, it asks the Service Bus registry (step 2). Requests use the Atom Publishing Protocol, and the answer comes back as an AtomPub document with references to your application's endpoints. Once it has these, the caller can invoke the services exposed through those endpoints (step 3). Each request is received by Service Bus, handed to your application, and the response travels back the same way. And although it is not shown in the figure, Service Bus can establish a direct connection between an application and its client, which makes their communication more efficient.
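As a sketch of step 2, a .NET client could read the registry's Atom feed with the standard syndication classes. The registry URI below is invented, since the real naming scheme under an organization's URI root is not spelled out here.

using System;
using System.ServiceModel.Syndication;
using System.Xml;

class RegistryLookupSketch
{
    static void Main()
    {
        // Fetch the AtomPub document describing the application's endpoints.
        using (XmlReader reader = XmlReader.Create(
            "http://servicebus.example.net/mysolution/"))  // invented URI
        {
            SyndicationFeed feed = SyndicationFeed.Load(reader);
            foreach (SyndicationItem item in feed.Items)
                foreach (SyndicationLink link in item.Links)
                    Console.WriteLine(link.Uri);  // candidate endpoints for step 3
        }
    }
}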
Along with simpler communication, Service Bus can also improve security. Because clients see only an IP address belonging to Service Bus, there is no need to expose any of your organization's own addresses. This effectively makes your application anonymous: the outside world never sees its IP address. Service Bus acts as an external DMZ, providing a layer of address indirection that helps deflect attackers. Finally, Service Bus is designed to be used together with the Access Control Service; in fact, Service Bus accepts tokens only from the Access Control Service STS.
An application that exposes its services through Service Bus is usually implemented with WCF. Clients can be built with WCF or with other technologies, such as Java, and they can make requests over SOAP or HTTP. Applications and their clients are also free to use their own security mechanisms, protecting their communication from attackers and even from Service Bus itself.
Exposing an application to the outside world is not as simple as it might seem. The goal of Service Bus is to make implementing this useful behavior as straightforward as possible.
Workflow Service
Windows Workflow Foundation (WF) is Microsoft's foundation technology for building workflow applications. One classic use of workflow is controlling a long-running process, as is common in enterprise application integration. More generally, WF applications can be a good choice for coordinating many kinds of work. And when the work being coordinated spans multiple organizations, running the controlling logic in the cloud can make especially good sense.
This is exactly what the Workflow Service provides. By supplying a host process for applications built on WF 3.5, it lets developers create workflows that run in the cloud. Figure 10 shows how this looks.
Fig. 10 Workflow Service allows you to create WF applications that can communicate via HTTP or Service Bus
Every WF workflow is implemented as a set of activities, shown in red in the figure. Each activity performs a defined action, such as sending or receiving a message, implementing an if/else decision, or controlling a loop. WF provides a standard group of activities known as the Base Activity Library (BAL), and the Workflow Service allows applications to use a subset of the BAL. The service also provides some activities of its own. For example, an application running in the service can communicate with other applications via HTTP or via Service Bus, as Figure 10 shows, so the Workflow Service includes built-in activities for both.
Running in the cloud brings some restrictions, though. WF-based applications running in the Workflow Service can use only the sequential workflow model, for example. Running arbitrary code is also prohibited, so neither the BAL's Code activity nor custom activities can be used.
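For readers who have not seen WF 3.5, here is a minimal sequential workflow built from standard BAL activities. It is a local sketch only: the Code activity used below is exactly the kind of arbitrary-code activity the Workflow Service disallows, so a cloud-hosted workflow would use the service's own activities instead.

using System;
using System.Workflow.Activities;

// A minimal WF 3.5 sequential workflow: one code step, then a delay.
public class SampleWorkflow : SequentialWorkflowActivity
{
    public SampleWorkflow()
    {
        var step = new CodeActivity { Name = "logStep" };
        step.ExecuteCode += (sender, e) =>
            Console.WriteLine("workflow step ran");   // arbitrary code: local only
        this.Activities.Add(step);

        this.Activities.Add(new DelayActivity
        {
            Name = "pause",
            TimeoutDuration = TimeSpan.FromSeconds(5)  // workflows often wait
        });
    }
}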
To build applications for the Workflow Service, developers can use Visual Studio's standard workflow designer. Once written, a workflow can be deployed to the cloud through a browser-accessible portal or programmatically through the service's API. Running workflows can likewise be managed through the portal or through that API. And as with Service Bus, an application that talks to the Workflow Service must first get a token from the Access Control Service: it is the only STS the Workflow Service trusts.
WF applications are not the right approach for every problem. When the problem at hand does fit, though, using a workflow can make a developer's life significantly easier. By providing a manageable, scalable way to host WF applications in the cloud, the Workflow Service extends the reach of this useful technology.
SQL Services
SQL Services is an umbrella name for what will become a group of cloud-based technologies. All of them focus on working with data: storing it, analyzing it, creating reports from it, and more. Fittingly, the first member of the family to appear, called SQL Data Services, provides perhaps the most fundamental of these functions: a database.
A database in the cloud is attractive for many reasons. For some organizations, letting a service provider worry about reliability, backup, and other operational functions is very appealing. Data in the cloud is also easy to make available to applications running anywhere, including on mobile devices. And because a cloud provider can exploit economies of scale, storing data in the cloud can ultimately be cheaper for organizations. The goal of SQL Data Services is to deliver all of these benefits.
At the same time, building a reliable, high-performance database with Internet-level scalability is no simple matter, and some trade-offs are required. As noted earlier, for example, SQL Data Services does not provide a standard relational database, and it is not queried with ordinary SQL. Instead, its data is organized using the structure shown in Figure 11.
Fig. 11 SQL Data Services data centers are divided into authorities, each of which contains containers, which in turn contain entities containing properties.
Information in SQL Data Services is stored across multiple data centers. Each data center holds some number of authorities, as Figure 11 shows. An authority is the unit of geo-location: each one lives in a specific Microsoft data center and has a unique DNS name. An authority holds containers, each of which is replicated within its data center. Containers are the unit of load balancing and availability: if a failure occurs, SQL Data Services automatically switches to another replica of the container. Every query is scoped to one specific container; queries across an entire authority are not allowed. Each container holds some number of entities, which in turn contain properties. Every property has a name, a type, and a value of that type. The types SQL Data Services supports include String, DateTime, Base64Binary, Boolean, and Decimal. Applications can also store blobs with MIME types.
Applications have several options for accessing this data. One is a query language whose syntax resembles C#'s LINQ, with queries sent via SOAP or REST. Another is ADO.NET Data Services, which offers a REST-style alternative for reaching the data. Either way, an application issues queries against containers using operators such as ==, !=, >, <, AND, OR, and NOT. Queries can also use some SQL-like operations, such as ORDER BY and JOIN.
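A sketch of what issuing such a query over REST might look like follows. The authority and container names are invented, the query parameter name and exact wire format are assumptions based on the LINQ-like syntax just described, and authentication is omitted.

using System;
using System.Net;

class SdsQuerySketch
{
    static void Main()
    {
        // LINQ-like query text shipped to the container over plain HTTP.
        string query = Uri.EscapeDataString(
            "from e in entities where e[\"City\"] == \"Seattle\" select e");
        var client = new WebClient();
        string response = client.DownloadString(
            "http://myauthority.data.database.windows.net/v1/mycontainer?q=" + query);
        Console.WriteLine(response);  // entities come back, properties and all
    }
}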
For queries, though, the unit of retrieval and update is the entity, not its individual properties. A query returns some set of entities, including all of the properties each one contains. Similarly, it is not possible to change just one property of an entity: the entire entity must be replaced. And because entities have no predefined schema, the properties of a single entity can be of different types, and the entities in one container can differ from one another, each containing its own set of properties.
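To make the schema-free model concrete, a property bag captures it well. The entities and property names below are invented, and this models only the shape of the data, not the service's API.

using System;
using System.Collections.Generic;

class FlexibleEntitiesSketch
{
    static void Main()
    {
        // Two entities in the same container with different property sets
        // and types: nothing requires them to share a schema.
        var book = new Dictionary<string, object>
        {
            { "Kind", "book" }, { "Title", "Ulysses" }, { "Pages", 730 }
        };
        var song = new Dictionary<string, object>
        {
            { "Kind", "song" }, { "Composer", "Ellington" }, { "Rating", 4.5m }
        };

        // Updates are entity-granular: change locally, then replace the
        // whole entity in the store rather than a single property.
        book["Pages"] = 732;
        Console.WriteLine(book.Count + " / " + song.Count);
    }
}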
Data in SQL Data Services is named with URIs, much as in the Windows Azure storage service. The general format of a URI identifying a particular entity looks like this:
http://<Authority>.data.database.windows.net/v1/<Container>/<EntityId>
It is worth emphasizing once more that SQL Data Services does not require .NET on the client. The data it holds is reachable via REST, that is, plain HTTP, or through SOAP from any application on any platform. Whatever platform they run on, applications accessing the data must identify their users with a SQL Data Services username and password or with a token created by the Access Control Service STS.
Microsoft has announced plans to evolve SQL Data Services into a more relational technology. Recalling that, unlike Windows Azure storage, SQL Data Services is built on SQL Server, this evolution seems natural. Whatever model it exposes, though, the technology's goal remains the same: providing a scalable, reliable, and inexpensive cloud database for all kinds of applications. As SQL Services expands to include more cloud-based data services, expect them all to build on this first member of the family.