23 February 2011

Why you *can't* run your website from Amazon S3 (at least, not entirely)

A few days ago, I saw a blog post by Amazon CTO Werner Vogels: "New AWS feature: Run your website from Amazon S3." I wanted to set up a static website at barillari.org and thought this would be a great way to do it.

After uploading my files to S3, I edited my DNS settings. I tried adding a CNAME record to point barillari.org to s3-website-us-east-1.amazonaws.com. For some reason, GoDaddy refused to let me do this. Googling eventually led me to this page, which explained that RFC 1034 prohibits creating a CNAME for the root of a domain (a name with a CNAME record may not carry any other records, and the root of a domain always carries SOA and NS records). In other words, while I could point www.barillari.org to S3, I couldn't point barillari.org to S3. I would have to set up a server somewhere to serve redirects from barillari.org to www.barillari.org.
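
To make the restriction concrete, here is a minimal Python sketch of the kind of check a DNS provider might run before accepting a record. The `(name, type)` zone model and the function name `cname_conflicts` are assumptions made for illustration, not anything GoDaddy documents, and the check ignores DNSSEC record types.

```python
from collections import defaultdict


def cname_conflicts(records):
    """Return the names that hold a CNAME alongside any other record type.

    `records` is an iterable of (name, record_type) pairs. This is a
    hypothetical, simplified zone model used only for illustration.
    """
    types_by_name = defaultdict(set)
    for name, rtype in records:
        # DNS names are case-insensitive; a trailing dot is optional.
        types_by_name[name.lower().rstrip(".")].add(rtype.upper())
    return sorted(
        name
        for name, types in types_by_name.items()
        if "CNAME" in types and len(types) > 1
    )


zone = [
    ("barillari.org", "SOA"),        # every zone root has an SOA record
    ("barillari.org", "NS"),         # ...and NS records
    ("barillari.org", "CNAME"),      # the record the registrar rejected
    ("www.barillari.org", "CNAME"),  # fine: nothing else lives at this name
]
print(cname_conflicts(zone))  # prints ['barillari.org']
```

The www name passes because the CNAME is the only record there; the root fails because the SOA and NS records cannot be removed.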

I'm not the first to post about the root CNAME issue: there are some comments about it on Dr. Vogels's initial post, and even a reply from Dr. Vogels where he tells a reader to "redirect {mysite}.com to www.{mysite}.com." (How is the user supposed to create a redirect using only S3? Read on.)

Less than 12 hours ago, Dr. Vogels followed up with a new post, "Free at Last - A Fully Self-Sustained Blog Running in Amazon S3." He explains that by switching to Disqus for comments and Bing for search, he was able to move his blog entirely to S3.

Entirely? How did he solve the redirection problem? Let's see:

$ host allthingsdistributed.com
allthingsdistributed.com has address 74.208.227.55
allthingsdistributed.com mail is handled by 10 mx01.1and1.com.
allthingsdistributed.com mail is handled by 10 mx00.1and1.com.

Whom does that address belong to?

$ whois 'n 74.208.227.55'
#
# The following results may also be obtained via:
# http://whois.arin.net/rest/nets;q=74.208.227.55?showDetails=true&showARIN=false
#

NetRange: 74.208.0.0 - 74.208.255.255
CIDR: 74.208.0.0/16
OriginAS:
NetName: 1AN1-NETWORK
NetHandle: NET-74-208-0-0-1
Parent: NET-74-0-0-0-0
NetType: Direct Allocation
NameServer: NSA2.1AND1.COM
NameServer: NSA.1AND1.COM
Comment: For abuse issues, please use only abuse@1and1.com
RegDate: 2006-11-22
Updated: 2009-08-12
Ref: http://whois.arin.net/rest/net/NET-74-208-0-0-1

It doesn't look like 74.208.227.55 belongs to S3. Let's see what happens when we connect:

$ telnet allthingsdistributed.com 80
Trying 74.208.227.55...
Connected to allthingsdistributed.com.
Escape character is '^]'.
HEAD / HTTP/1.1
Host: allthingsdistributed.com

HTTP/1.1 200 OK
Content-Length: 79640
Content-Type: text/html
Content-Location: http://allthingsdistributed.com/index.html
Last-Modified: Thu, 24 Feb 2011 02:21:34 GMT
Accept-Ranges: bytes
ETag: "af7bb38ec9d3cb1:447"
Server: Microsoft-IIS/6.0 <--- Definitely not S3
X-Powered-By: ASP.NET
Date: Thu, 24 Feb 2011 06:43:43 GMT

It looks like allthingsdistributed.com still points to Dr. Vogels's old server. If you go to www.allthingsdistributed.com, however, the site is indeed served from S3:

$ telnet www.allthingsdistributed.com 80
Trying 72.21.203.159...
Connected to s3-website-us-east-1.amazonaws.com.
Escape character is '^]'.
HEAD / HTTP/1.1
Host: www.allthingsdistributed.com

HTTP/1.1 200 OK
x-amz-id-2: JycRZ0LH9NSGYyM6A+B24cSpSs5AsMUTH8wn95OoVwnOcrDQ/Q2/xbcldydB+IGQ
x-amz-request-id: 18BBB3B91F04D1B6
Date: Thu, 24 Feb 2011 06:50:53 GMT
Cache-Control: no-cache
Last-Modified: Thu, 24 Feb 2011 02:25:20 GMT
ETag: "000d8cb6f9e84d4012aaa0739c48038d"
Content-Type: text/html
Content-Length: 79640
Server: AmazonS3
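
The two telnet sessions above can be reproduced with a short script. This is a sketch using Python's standard `http.client`; the helper names `head_headers` and `served_by_s3` are made up for this example, and treating `Server: AmazonS3` or an `x-amz-request-id` header as the sign of S3 is a heuristic based on the responses shown here, not a documented guarantee.

```python
import http.client


def head_headers(host, port=80, timeout=10):
    """Send `HEAD /` to `host` and return the response headers as a dict."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", "/")  # http.client fills in the Host header
        response = conn.getresponse()
        return dict(response.getheaders())
    finally:
        conn.close()


def served_by_s3(headers):
    """Heuristic: does this set of response headers look like S3?"""
    lowered = {key.lower(): value for key, value in headers.items()}
    return lowered.get("server") == "AmazonS3" or "x-amz-request-id" in lowered


# Headers copied from the two responses above, so no network is needed here.
apex_headers = {"Server": "Microsoft-IIS/6.0", "X-Powered-By": "ASP.NET"}
www_headers = {"Server": "AmazonS3", "x-amz-request-id": "18BBB3B91F04D1B6"}
print(served_by_s3(apex_headers), served_by_s3(www_headers))  # prints: False True
```

To run it against a live site, pass the result of `head_headers("www.allthingsdistributed.com")` to `served_by_s3`; the answer will reflect whatever the site is served from at that moment.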

What does this mean? It means that there is still one final dependency if you want to serve your site from S3. Unless you want to ignore users who go to {mysite}.com instead of www.{mysite}.com, you need a web server that redirects users from {mysite}.com to www.{mysite}.com. Apache's mod_rewrite can do this. In fact, the manual for mod_rewrite even provides an example of how to do exactly that.

A generously-minded individual (or AWS) could even set up one redirection server for everyone who wanted to host their second-level domain in S3. Those who wanted to host their sites in S3 could just point the root A record for their domain to the redirector, which could be an Apache server with a mod_rewrite configuration that looks something like this:

RewriteCond %{HTTP_HOST} ^[^.]+\.[a-z]+$ [NC]
RewriteRule ^/?(.*) http://www.%{HTTP_HOST}/$1 [L,R=301,NE]

[I haven't tested this and am not an Apache expert, so please let me know if you can't perform %{variable} substitution in a RewriteRule.]
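
For readers without an Apache server handy, the same idea can be sketched with Python's standard `http.server`. This is an illustrative sketch, not a production redirector: the names `www_location`, `ApexRedirectHandler`, and `serve`, and the port 8080, are assumptions for this example. It reuses the host pattern from the RewriteCond above, so only two-label hosts such as example.com are redirected.

```python
import re
from http.server import BaseHTTPRequestHandler, HTTPServer

# Same pattern as the RewriteCond: exactly two labels, e.g. "example.com".
BARE_DOMAIN = re.compile(r"[^.]+\.[a-z]+")


def www_location(host, path):
    """Return the www redirect target for a bare domain, or None otherwise."""
    hostname = host.split(":")[0].lower()  # drop any port, normalise case
    if not BARE_DOMAIN.fullmatch(hostname):
        return None
    return f"http://www.{hostname}{path}"


class ApexRedirectHandler(BaseHTTPRequestHandler):
    def _redirect(self):
        location = www_location(self.headers.get("Host", ""), self.path)
        if location is None:
            self.send_error(404, "Not a bare second-level domain")
            return
        self.send_response(301)  # permanent redirect, like R=301
        self.send_header("Location", location)
        self.send_header("Content-Length", "0")
        self.end_headers()

    do_GET = _redirect
    do_HEAD = _redirect


def serve(port=8080):
    """Run the redirector until interrupted (port 8080 is an arbitrary choice)."""
    HTTPServer(("", port), ApexRedirectHandler).serve_forever()
```

Calling `serve()` starts the redirector; a request for http://barillari.org/blog would then be answered with a 301 pointing at http://www.barillari.org/blog, with the path and query string preserved.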

Until such a service exists, you will still need an HTTP server elsewhere in order to host your entire site in S3.


[How did I solve this issue? I avoided it. I put the content in EBS and served it from an EC2 instance. I use EC2 for other projects and have always been pleased with it.]

Update: See Dr. Vogels's response in the comments.

6 comments:

  1. Joe, you are right in that I haven't fixed that step yet. Next on the list is to have 1and1 do the redirect of the apex name to www, which is a service they provide. I could take the server down then.

    My goal with all of this is to push to see whether we can be as simple as possible and to see where we still need solutions. The apex/CNAME issue clearly is one, and there are a few other minor ones as well.

    Your solution is excellent as well, and many folks are running like that. Running completely out of S3 has limitations as well. But it is my task to push the envelope a bit...

  2. P.S. My website is currently at
    http://s3.amazonaws.com/cloud.podometic.com/index.php

  3. Not really true anymore. On GoDaddy you could redirect your www domain to your non-www site. I do it on www.ramakantyadav.com. Unless I am missing something and GoDaddy is generously hosting the site for me for free.

  4. Want to discover more information about Work Abroad?

  5. Worth noting that S3 now supports root domain access, provided you use Route 53 as well.


About Me

blog at barillari dot org Older posts at http://barillari.org/blog