Jul 13
Holidays in beautiful Umbria (Italy) give the opportunity to do some reading. With a strong interest in cloud computing, I read Cloud Application Architectures by George Reese this summer. Around the same time last year (2008), I read Programming Amazon Web Services by James Murty.
The book “Programming Amazon Web Services” was really good in 2008. It describes the different Amazon offerings and how to invoke the APIs using Ruby. But Amazon is extending its offering at a rapid pace, e.g. with fixed IP addresses and block storage (like NAS). So James Murty’s book is in need of a second edition.
“Cloud Application Architectures” goes up the stack to a higher abstraction level and explains how to deploy (“architect”) applications on the Amazon cloud. George Reese gained practical experience while deploying the Valtira (Web Marketing) application on Amazon.
Reese covers some very interesting topics:
- Load balancing with a software load balancer in the cloud vs. a hardware load balancer on premises
- Cost comparison with a sample calculation: operating an application on your own hardware vs. in the cloud
- (High) Availability with some sample calculations
- Use of stateless application servers
- (Virtual) Machine images: weighing generic vs. specific machine images; the use of startup scripts with user-data
- Privacy: an example of how to separate private information and encrypt it with a key generated for each customer/partner/…
- Database management: weighing clustering vs. replication, whereby replication is usually considered the better option; the slave(s) can be used for read operations and backups; solutions for primary key generation and optimistic locking
- Data Security: e.g. through file system encryption
- Network security: security groups as an alternative to firewalls, the fact that network intrusion detection cannot be used in the Amazon context, why network-level encryption still makes sense even if machines cannot see each other’s traffic at Amazon, system hardening (Bastille), host intrusion detection (OSSEC), anti-virus
- Disaster recovery, backups, recovery, redundancy
- Scaling & capacity planning, the nonsense of auto-scaling
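To give a feel for the kind of availability arithmetic referred to in the list above, here is a minimal sketch. It is not taken from the book: it only applies the standard rules that components in series multiply their availabilities and that redundant, independent nodes fail only when all of them fail. The component names and percentages are hypothetical.

```python
from math import prod

HOURS_PER_YEAR = 24 * 365


def serial_availability(components):
    """Every component is required, so the availabilities multiply."""
    return prod(components)


def redundant_availability(availability, copies):
    """The tier is up as long as at least one of `copies` independent nodes is up."""
    return 1 - (1 - availability) ** copies


def expected_downtime_hours(availability):
    """Expected downtime per year for a given availability."""
    return (1 - availability) * HOURS_PER_YEAR


# Hypothetical figures, for illustration only.
load_balancer = 0.999
app_server = 0.99
database = 0.995

single = serial_availability([load_balancer, app_server, database])
doubled = serial_availability(
    [load_balancer, redundant_availability(app_server, 2), database]
)

print(f"one app server:  {single:.4%}, {expected_downtime_hours(single):.1f} h/year down")
print(f"two app servers: {doubled:.4%}, {expected_downtime_hours(doubled):.1f} h/year down")
```

The independence assumption is the weak spot: two virtual machines in the same data center can easily fail together.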
A real joy to read, but sometimes I would have liked the author to go into more depth. One thing definitely became clear to me: deploying applications on the (Amazon) cloud requires specific approaches and skills and, obviously, a sound and well-thought-out architecture. Specific tools will also be helpful and needed: RightScale and enStratus are mentioned in the book. That’s probably also the reason why Reese is the CTO of enStratus.
We may expect many more cloud books in the coming months but “Cloud Application Architectures” brings quality content well ahead of the pack.
PS: podcast with interview of George Reese available here, same quality and content
Apr 11
Big news this week, the rumor finally became true: Google App Engine now supports Java alongside Python. So Google App Engine is now a big servlet engine in the cloud.
Along with the Java on Google App Engine announcement, I noticed another component: the Secure Data Connector (SDC). The SDC allows applications running in the Google cloud to interoperate with Intranet applications: through the Secure Data Connector, they can access applications on the Intranet.


Scenario:
- The Secure Data Connector is installed on a Linux server within the Enterprise.
- An administrator configures the SDC to access certain resources within the Intranet.
- The SDC is started and runs continuously as a background process.
- The SDC connects to Google (https://apps-secure-data-connector.google.com) on port 443 (HTTPS). The connection is made from the enterprise to Google, so there is no need to configure the firewall on the Enterprise side to allow inbound connections (from Google into the Enterprise).
- The SDC authenticates itself using username and password.
- Once the SSL connection is established, the connection remains open.
- An application running in the Google cloud (AppEngine, Google Spreadsheet, …) needs to access data from the Intranet or send data to the Intranet.
- In AppEngine, this is done using the URLFetchService. To specify that the target is an Intranet resource, the HTTP header use_intranet=true is set.
- From the Google AppEngine, a call is made to the SDC deployed in the Enterprise. Remember, TCP connections are bidirectional.
- The SDC verifies whether it may access the requested local resource and locates it, e.g. using the local DNS from within the Enterprise.
- The SDC accesses the local resource or web service and returns the data to the application running in the Google cloud. The size of request and response messages is limited to 1 MB.
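The application side of this scenario can be sketched as follows. This is only an illustration: it uses Python's standard urllib instead of the App Engine URLFetchService, it builds the request without sending it, and the header value follows the description above rather than the official docs. The function name, the intranet URL and the reading of 1 MB as 1024 × 1024 bytes are assumptions.

```python
from urllib.request import Request

# Assumption: the 1 MB message limit is read here as 1024 * 1024 bytes.
MAX_MESSAGE_BYTES = 1024 * 1024


def build_intranet_request(url, body=None):
    """Build (but do not send) a request flagged as an Intranet call.

    Hypothetical helper: on App Engine the same header would be passed
    to the URLFetchService instead of urllib.
    """
    if body is not None and len(body) > MAX_MESSAGE_BYTES:
        raise ValueError("request body exceeds the 1 MB SDC limit")
    # Header name and value as described in the scenario above.
    headers = {"use_intranet": "true"}
    return Request(url, data=body, headers=headers)


# Hypothetical Intranet URL, resolved by the SDC inside the Enterprise.
request = build_intranet_request("http://hr.intranet.example/api/employees")
print(request.full_url, request.header_items())
```

Checking the size before the call fails fast on oversized payloads, instead of waiting for the connector to reject them.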
Access to protected data within the enterprise is somewhat of a challenge. The only mechanisms by which the SDC can provide credentials to an Intranet application, service or resource are OpenSocial and OAuth signatures.
One of the evolutions that I envision is that ESBs or B2B services will embed the SDC logic as an adapter. The ESB is able to transform the requests coming from Google into other protocols or formats and add the necessary credentials.
Some more thoughts and remarks I made while going through the Secure Data Connector docs:
- How is the configuration file of the SDC protected? In particular the username and password contained therein.
- Support is limited to Linux. What prevents this open source code from being ported to other platforms?
- How about load balancing or failover?
- How about interoperability between clouds: anyone already tried to deploy the SDC on EC2?
- Where is the SOAP support? How to invoke SOAP web services using the URLFetchService?
- How about identity services and mapping the identity of a Google user to an internal Enterprise user account?
- The SDC is not comparable to the .Net Services Bus of Windows Azure.
- Can the SDC access the Internet through a proxy?
- To deploy the Secure Data Connector in a large enterprise, you might have a hard time convincing the security department.

Mar 03
EDIINT AS2 is a very popular B2B protocol. Apart from file transfer, I think this is currently the most popular B2B protocol. Although EDIINT was initially meant to replace EDI over VAN connections by EDI over the INTernet (EDIINT), the AS2 protocol can transport any data in an asynchronous manner.
AS2 provides:
- Firewall friendly as it uses HTTP(S) underneath
- Straightforward: adds a number of HTTP header fields and a MIME-based message structure
- Reliable: send the message until you get a 200; duplicates are filtered based on the message-id
- Signing: based on the S/MIME message structure and PKCS7 signing (self-signed certificates are often used in practice)
- Encryption: also leveraging the S/MIME message structure
- Non-repudiation: signed acknowledgement or message receipt, called the Message Disposition Notification
- Any payload: EDI, XML, binary, text
- Many implementations: free and low-priced commercial versions are available, as well as one or two open source implementations
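The "Reliable" item in the list above can be sketched as follows: the sender repeats the message until it sees a 200, and the receiver filters duplicates on the message-id. This is a simplified in-memory illustration, not an AS2 implementation: there is no HTTP, S/MIME or MDN handling, and all names (`send_until_ok`, `Receiver`, `flaky_post`) are hypothetical.

```python
import time


def send_until_ok(post, message_id, payload, max_attempts=5, delay_seconds=0.0):
    """Resend until `post` returns 200; return the number of attempts used.

    `post` is a hypothetical callable (message_id, payload) -> HTTP status.
    """
    for attempt in range(1, max_attempts + 1):
        if post(message_id, payload) == 200:
            return attempt
        time.sleep(delay_seconds)
    raise RuntimeError(f"no 200 received after {max_attempts} attempts")


class Receiver:
    """Accepts every message but delivers each message-id only once."""

    def __init__(self):
        self.seen_ids = set()
        self.delivered = []

    def receive(self, message_id, payload):
        if message_id not in self.seen_ids:
            self.seen_ids.add(message_id)
            self.delivered.append(payload)
        return 200  # duplicates are acknowledged too, so the sender stops


receiver = Receiver()
lost_responses = [True, False]  # simulate: the first 200 never reaches the sender


def flaky_post(message_id, payload):
    status = receiver.receive(message_id, payload)
    return 503 if lost_responses.pop(0) else status


attempts = send_until_ok(flaky_post, "<msg-1@example>", b"ORDER")
print(attempts, receiver.delivered)
```

The retry happens after the receiver has already processed the first copy, which is exactly the case where filtering on message-id is needed.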

As is typical in B2B scenarios, AS2 servers are mostly located behind firewalls that only allow inbound connections from well-known IP addresses.
AS2 and its family
AS2 has some family: AS1, which goes over SMTP, and a younger brother, AS3, which uses FTP as its transport. And now a fourth child joins the family: AS4! AS4 uses ebXML Messaging v3 as its transport. But what the heck is ebXML?
ebXML
ebXML was an initiative to define *the* new B2B standard for the 21st century: new transport layer, new message definitions, XML schema building blocks, process layer, protocol profiles and so on. ebXML didn’t really take off, mostly because it was considered some sort of threat to the Web Services story. A pity that the WS-* world and the ebXML world weren’t able to come together in 2001. ebXML gained a bit of popularity in some European countries like the Netherlands and Denmark, where it is used on a limited scale.
Work continued very quietly and a new version of the transport layer was released in 2007: ebXML Messaging v3. ebXML Messaging v2 and v3 are actually SOAP over HTTP(S) with some extra bells and whistles.
Conclusion
The key feature of AS4 is pulling, or polling. Polling is one of the reasons why file transfer is so popular: it is “asymmetric” and allows one side to stay behind a closed firewall. ebMS 3.0 supports polling and AS4 does as well. Well done! WS-Polling was a similar initiative in the WS-* world to introduce polling.
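The asymmetry of polling can be illustrated with a minimal in-memory sketch. The sending party keeps messages in an outbox, and the receiving party, which stays behind its closed firewall, fetches them over a connection that it opens itself. This is not the ebMS 3.0 or AS4 wire protocol, and the names `Outbox`, `pull` and `poll_once` are hypothetical.

```python
from collections import deque


class Outbox:
    """Lives on the side that is reachable from the Internet."""

    def __init__(self):
        self._queues = {}

    def submit(self, partner, message):
        """Queue a message for a partner instead of pushing it to them."""
        self._queues.setdefault(partner, deque()).append(message)

    def pull(self, partner):
        """Answer a pull request from the partner; None means nothing is waiting."""
        queue = self._queues.get(partner)
        return queue.popleft() if queue else None


def poll_once(outbox, partner):
    """Run by the firewalled partner: drain everything that is waiting."""
    received = []
    while (message := outbox.pull(partner)) is not None:
        received.append(message)
    return received


outbox = Outbox()
outbox.submit("acme", "invoice-1")
outbox.submit("acme", "invoice-2")
print(poll_once(outbox, "acme"))
print(poll_once(outbox, "acme"))
```

Only the poller initiates connections, so its firewall needs no inbound rule. The price is latency: a message waits in the outbox until the next poll.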

But AS4 will not become that popular in my opinion. The spec is rather “heavy” and ebMS 3.0 has very little traction. I’m not aware of any implementation. I would have preferred an extension of AS2 with support for polling, completely independent of the ebMS and WS-* specs. And so we’ll simply continue using file transfer (SFTP, FTPS) as our most popular polling mechanism.
Notes:
- Brik, thanks for pointing me to AS4
- Pictures are from a talk I gave on AS2
- Interesting link for those speaking Dutch: ebXML en ebMS: veilig en betrouwbaar berichten uitwisselen