Friday 27 December 2013

Cloud Architectural Challenges

Building applications for cloud.


How is creating software for the cloud different from earlier architectures? Not that much, really. There is a great deal of hype which makes it seem that cloud software architecture is a huge improvement over "traditional" software architecture - whatever that is: mainframe architecture, thin/fat-client architecture, server farm architecture, any other architecture...

The differences between other architectural styles and cloud architecture arise not from the offerings of the cloud but rather from the unique challenges posed by operating in the cloud. There are two challenges, and both are mainly a question of uncertainty...

Challenge: Out of Sync


The cloud is - by its very nature - a distributed environment. A distributed application consists of multiple parts which communicate with each other but don't necessarily run in the same system. When an application has functionality which is used at different times or at different frequencies, or which can run in parallel with other parts, it often makes sense to move that functionality outside the normal execution process, and perhaps even into a different system which is connected only when required.

This kind of functionality might be, for example, the billing or archiving functions of a website. After an order is placed in a web-shop, the website will not wait until the customer's credit card is actually billed; the web-shop returns control to the user immediately and a different subsystem of the application handles the credit card billing. When the billing is finished, the user gets a confirmation email and the user's profile in the web-shop is updated.


Workers


The example above is a simple one. The subsystem handling the billing is a "worker", a unit which is activated only when the application requires its service. The unit might exist outside the application's system, maybe even on a different server. The important thing is that the application does not wait for it to return anything. It runs alone according to the parameters the application provides to it, and after running it closes itself down automatically without needing any interaction. Therefore, it runs out of sync with the main application.
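The worker idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the order fields, and the use of a local thread as a stand-in for a remote worker platform are all hypothetical. The point is that the web-shop starts the worker with its parameters and returns at once, and the worker ends by itself when the job is done.

```python
import threading

def run_billing_worker(order: dict, log: list) -> None:
    """Hypothetical worker: bills one order, records the result, then ends."""
    # A real worker would call a payment service here; this one only logs.
    log.append(f"billed order {order['id']} for {order['amount']}")

def place_order(order: dict, log: list):
    """The web-shop starts the worker and returns without waiting for it."""
    worker = threading.Thread(target=run_billing_worker, args=(order, log))
    worker.start()
    return "order accepted", worker

log = []
reply, worker = place_order({"id": 1, "amount": 25}, log)
# The join below exists only so this demo can inspect the log afterwards;
# the web-shop itself would never wait for the worker.
worker.join()
print(reply, log)
```

In a real deployment the thread would be replaced by a job submitted to a worker service, but the shape of the call stays the same: hand over the parameters, get control back immediately.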


Being Out of Sync


When programs run inside one server and one operating system, the communication is instantaneous or - in practice - real time. But communication in the cloud is not in real time, and sometimes not even stable. The connection might take a long time to establish, or it might break, or simply be slow. The vendor might have an unscheduled maintenance break, or the service might have been physically relocated to a different server. In general, subsystems connect to each other over IP networking. The two main architectural choices are an RPC (remote procedure call) interface or a REST (representational state transfer) interface. The main difference between these is less in the implementation and more in their philosophy.

RPC is typically an interface for a tightly coupled application, where the subsystems are in effect subroutines and the main program waits for the subroutine to complete before continuing. REST, on the other hand, is an API style which can function both synchronously and asynchronously. REST is best used when the subsystems of the application are autonomous services which can - in principle - be offered to any application. REST encourages designing the API in the form of a (public) service. A REST API is stateless, so no client context is stored on the server between requests.

Because of the possible or (pessimistically) likely problems in the connection between the application and its subsystems in the cloud, it is generally better to go for a REST style architectural design. Properly implemented, it provides a robust, loosely coupled system fit for the cloud.
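One practical consequence of statelessness is that a request can simply be repeated when the connection fails, as long as repeating it is harmless. The sketch below shows that idea as a retry with growing delays. Everything in it is an assumption made for illustration: the exception class, the simulated flaky service, and the delay values are hypothetical and stand in for a real HTTP call.

```python
import time

class ServiceUnavailable(Exception):
    """Hypothetical error for a dropped or timed-out connection."""

def call_with_retry(request, attempts=4, base_delay=0.01, sleep=time.sleep):
    """Repeat a self-contained request, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return request()
        except ServiceUnavailable:
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            sleep(base_delay * 2 ** attempt)

def make_flaky_service(failures: int):
    """Simulated remote service that fails a given number of times first."""
    state = {"left": failures}
    def request():
        if state["left"] > 0:
            state["left"] -= 1
            raise ServiceUnavailable("connection dropped")
        return {"status": 200, "body": "ok"}
    return request

print(call_with_retry(make_flaky_service(2)))
```

Retrying like this is only safe for requests that can be repeated without side effects; a request that bills a credit card needs more care than one that reads a currency rate.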


Messaging


Of course, the distributed parts (workers and others) need to communicate with each other. There are many ways to do it, but generally the best is an out-of-sync way: a message queue. A message queue is an external application, "messaging middleware", to whose care the application gives a message and then "forgets it". Another part of the application polls the message queue at preset intervals and reads the message when it is available. The message queue guarantees that an accepted message will not get lost before it is read, but it doesn't know how quickly the other subsystem will read it or act upon it. The queue does not wait for a return message. It will wait, however, for a receipt from the reader so it knows the message was handled. When using a message queue to link subsystems together, a direct API (REST or other) between them is not necessary.
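The "receipt" behaviour can be sketched with a small in-memory queue. This is a toy model under assumptions, not the API of any real messaging product: the class name, the method names, and the timeout value are all hypothetical. A message that is read but never acknowledged becomes available again after a timeout, which is one common way for a queue to avoid losing messages when a reader crashes.

```python
import itertools
import time

class ReceiptQueue:
    """Toy queue: a read message must be acknowledged, or it comes back."""

    def __init__(self, visibility_timeout=30.0, clock=time.monotonic):
        self._ids = itertools.count(1)
        self._ready = []       # (message id, body) waiting to be read
        self._reserved = {}    # message id -> (body, deadline for the receipt)
        self._timeout = visibility_timeout
        self._clock = clock

    def post(self, body):
        """Sender hands over a message and can then forget about it."""
        msg_id = next(self._ids)
        self._ready.append((msg_id, body))
        return msg_id

    def reserve(self):
        """Reader polls: returns (id, body), or None if nothing is waiting."""
        self._requeue_expired()
        if not self._ready:
            return None
        msg_id, body = self._ready.pop(0)
        self._reserved[msg_id] = (body, self._clock() + self._timeout)
        return msg_id, body

    def acknowledge(self, msg_id):
        """Reader's receipt: the message was handled and can be dropped."""
        return self._reserved.pop(msg_id, None) is not None

    def _requeue_expired(self):
        now = self._clock()
        for msg_id, (body, deadline) in list(self._reserved.items()):
            if deadline <= now:
                del self._reserved[msg_id]
                self._ready.append((msg_id, body))

queue = ReceiptQueue()
queue.post("bill order 1")
msg_id, body = queue.reserve()
queue.acknowledge(msg_id)
```

Note that the sender never learns when, or how fast, the message was handled; only the queue sees the receipt.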


Challenge: Unreliability


As mentioned above, the cloud is a volatile environment, also in the sense that vendor companies may come and go, new services are promoted and old ones canceled. Cloud architecture is also about preparing for the eventuality of migration to new services or platforms. It is the natural additional price to pay when seeking "affordable" cloud services, as most companies do.


Design and prepare for eventual platform or vendor change.


The platforms that cloud vendors and service providers offer include not only real or virtual servers but also "platforms" that are more like services, such as databases, messaging middleware, worker platforms, and of course various special services like log collectors (IT operations oriented) or daily currency rate providers (business oriented). All purchased services, not to mention free services, have a tendency to change their APIs or even disappear as time goes by.

Cloud architecture has ways to prepare for this eventuality. Most of these are coding practices that can be enforced, for instance, by doing the implementation with a framework. Connections to external APIs can be isolated, the database connection abstracted behind an ORM (object relational mapper), and the message queue connection abstracted as well. Unfortunately, this also raises the risk of complicating the implementation.
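Isolating a vendor connection usually means that the application talks to a small interface of its own, and one adapter per vendor translates to the vendor's calls. The sketch below assumes this adapter approach. All names are hypothetical, and the "vendor client" is a made-up class that exists only to show a vendor whose method names differ from the application's own.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Optional

class MessageQueue(ABC):
    """The only queue interface the application code is allowed to see."""
    @abstractmethod
    def send(self, body: str) -> None: ...
    @abstractmethod
    def receive(self) -> Optional[str]: ...

class InMemoryQueue(MessageQueue):
    """Simple implementation, useful for tests and local runs."""
    def __init__(self) -> None:
        self._items = deque()
    def send(self, body: str) -> None:
        self._items.append(body)
    def receive(self) -> Optional[str]:
        return self._items.popleft() if self._items else None

class HypotheticalVendorClient:
    """Made-up vendor SDK with its own method names and payload format."""
    def __init__(self) -> None:
        self._payloads = []
    def publish(self, payload: dict) -> None:
        self._payloads.append(payload)
    def pull(self) -> Optional[dict]:
        return self._payloads.pop(0) if self._payloads else None

class VendorQueueAdapter(MessageQueue):
    """Translates the application's interface to the vendor's calls."""
    def __init__(self, client: HypotheticalVendorClient) -> None:
        self._client = client
    def send(self, body: str) -> None:
        self._client.publish({"data": body})
    def receive(self) -> Optional[str]:
        payload = self._client.pull()
        return None if payload is None else payload["data"]

def archive_order(mq: MessageQueue, order_id: int) -> None:
    """Application code: it never mentions a vendor."""
    mq.send(f"archive order {order_id}")

archive_order(VendorQueueAdapter(HypotheticalVendorClient()), 7)
```

If the vendor disappears, only the adapter has to be rewritten. The cost is one more layer to maintain, which is the complication mentioned above.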

Another isolation layer could be a proxy server for REST or RPC calls. A proxy server can provide additional security as it could also keep the remote services' passwords and other connection details hidden from the service users.


Change is always pending


With cloud architecture, preparing for trouble and change is always paramount because in the cloud an application can have very little control over its environment. The cloud creates a new kind of approach to dealing with vendors: pay-as-you-go. If the costs of a vendor service are billed accurately according to the actual usage of resources (memory, CPU cycles, bandwidth, tech support requests, ...), this will prompt the design of applications better optimized for the cloud environment.

Monday 4 November 2013

IronMQ - Message Queue in the cloud

Why didn't anybody think of it before!

The advent of cloud services is breaking apart server-centered thinking: with the cloud - or in the cloud - all Internet services are close to each other. The trunk line connections even between separate clouds provide fast enough access to actually start "picking" the services. PaaS (Platform as a Service) will give way to SaaS (Software as a Service), or maybe even "Service as a Service". Selecting any Internet service will be possible if services are compatible enough and "close" enough.

The cloud makes the services close enough, but that is not enough by itself. Distributed computing and application integration require a reliable way for the applications to talk to each other, preferably without synchronization, because in the real world (cloud/Internet) services and applications don't necessarily go at the same speed. One of the best ways to balance the sending and receiving is to use a message queue.

Until now message queues have been confined to one server, with a few exceptions, such as IBM's WebSphere MQ. And even then, the interface to the message queues has been via linkable system libraries, which binds them to platforms or even specific programming languages. And of course the message queue must have an available node, port or other connection point accessible from within the server.

Iron.io has changed that! If the cloud makes services available to all applications, then there should be a message queue inside the cloud - but outside the servers. IronMQ is that message queue; and its API is in line with most cloud services because it is a REST compatible API.

IronMQ is "Message queue as a service", the first of its kind. Customers may pay on a per-message basis, which goes perfectly with the idea of SaaS. For a hobbyist it is heaven, since the payments only start running after the first 10 million requests (REST calls).

Iron.io uses OAuth for user authentication, and the access protocol is of course HTTPS. A REST interface for a message queue is not the big innovation here; there are other message queues which also provide a REST interface to supplement their normal socket interface or linkable library interface. What is an innovation is how well IronMQ interacts in a cloud/Internet environment: from a passive party (which a message queue by nature is) it turns into an active party via its "push queues". A push queue is a queue which "knows" who is going to read the messages. It simply means that the message is relayed to another HTTP (or HTTPS) endpoint. The subscriber does not need to keep polling the queue for new messages; it simply sets up an HTTP(S) server/reader and waits for the messages. Besides remote HTTP endpoints, messages can also be pushed to different queues or to IronWorker, Iron.io's worker system.
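The difference between polling and pushing can be shown with a toy model. The sketch below is not IronMQ's API; the class and method names are hypothetical, and plain Python callables stand in for the subscribers' HTTP(S) endpoints. The queue knows its subscribers, relays each message to them as soon as it arrives, and keeps any message it could not deliver so that it can try again later.

```python
from typing import Callable, Dict, List, Tuple

# Stand-in for an HTTP(S) endpoint: a callable that raises when delivery fails.
Endpoint = Callable[[str], None]

class PushQueue:
    """Toy push queue: relays messages to subscribers instead of being polled."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, Endpoint] = {}
        self._pending: List[Tuple[str, str]] = []  # (subscriber name, body)

    def subscribe(self, name: str, endpoint: Endpoint) -> None:
        self._subscribers[name] = endpoint

    def post(self, body: str) -> None:
        """Accept a message and push it to every subscriber right away."""
        for name in self._subscribers:
            self._pending.append((name, body))
        self.deliver_pending()

    def deliver_pending(self) -> int:
        """Try all undelivered messages; return how many are still waiting."""
        still_pending = []
        for name, body in self._pending:
            try:
                self._subscribers[name](body)
            except Exception:
                still_pending.append((name, body))  # keep it for a later retry
        self._pending = still_pending
        return len(still_pending)

inbox = []
queue = PushQueue()
queue.subscribe("mailer", inbox.append)
queue.post("order 1 billed")
print(inbox)
```

The subscriber here does nothing until a message arrives, which is the whole point: the waiting has moved from the reader to the queue.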

IronMQ pushes the concept of push queues even further: it accepts messages pushed to it by the REST compatible method of Webhooks, user-defined HTTP callbacks. They are usually triggered by some event, such as pushing code to a repository or a comment being posted to a blog. When that event occurs the source site makes an HTTP request to the URI configured for the webhook.
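The receiving end of a webhook is just an HTTP handler that accepts a POST and turns the body into a message. The following is a minimal sketch using only Python's standard library. It is an assumption about how such a receiver could look, not a description of IronMQ's implementation: the JSON payload with an "event" field, the list used as a queue, and the port number are all hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

RECEIVED = []  # stand-in for the message queue

def enqueue_webhook(raw_body: bytes) -> dict:
    """Turn a webhook body (assumed to be a JSON object) into a queued message."""
    event = json.loads(raw_body.decode("utf-8"))
    if not isinstance(event, dict):
        raise ValueError("webhook body must be a JSON object")
    message = {"event": event.get("event", "unknown"), "payload": event}
    RECEIVED.append(message)
    return message

class WebhookHandler(BaseHTTPRequestHandler):
    """Accepts the HTTP callback that the source site sends when an event occurs."""
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            enqueue_webhook(self.rfile.read(length))
            self.send_response(200)
        except ValueError:  # covers invalid JSON and invalid UTF-8 as well
            self.send_response(400)
        self.end_headers()

def serve(port: int = 8080) -> None:
    """Run the receiver until interrupted; the port is an arbitrary choice."""
    HTTPServer(("127.0.0.1", port), WebhookHandler).serve_forever()
```

The source site only needs the URI of this handler; it does not need to know that a queue sits behind it.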

Iron.io has three cloud services: IronMQ, IronWorker and IronCache (a key-value storage). All of them have a REST interface and excellent cloud-usability. Cloud applications are often parts of integration systems. But the integration itself has been difficult because most integration tools, such as message queues, are running "inside" servers and are good at providing "internal" services. Iron.io's services are "between" servers and they are accessible by the most widely used REST protocol, HTTP.

Saturday 26 October 2013

Exercises in Restful Integration and Continuous Delivery

Having participated in several application integration projects, and seen both great success and horrible blunders, I keep this blog as a web diary and collection of notes on things that I've seen work or fail, either tested by myself or simply witnessed working or failing.

RestChess is an attempt to integrate cloud/Internet services some of which are of old technology and some very new. My modest attempt is to "do things right", both in actual application integration and in delivery. Continuous Delivery may be a buzzword but also a good goal.