Archive for April, 2009
Capistrano, Rails and Amazon CloudFront
Posted by Anthony in Rails, Ruby, amazon web services on April 14, 2009
Amazon CloudFront (ACF), the content delivery network that is now part of Amazon Web Services, provides a simple way to serve content from a fast, low-cost global network. As part of our deployment process for chi.mp we now deploy our static assets to ACF (well, technically this will be deployed as part of the next release). Here’s how we do it: http://gist.github.com/95255.
Let’s go through this piece by piece:
# get the previous timestamp
old_timestamp = File.read("config/deploy_timestamp").to_i rescue 0
The first step is to read the previous timestamp. If the file does not exist yet, the rescue falls back to 0. The value is held in memory and compared against file modification times to decide whether each file should be put (uploaded) or copied.
# generate timestamp into config/deploy_timestamp
timestamp = Time.now.to_i
File.open("config/deploy_timestamp", 'w') do |f|
  f.write(timestamp)
end
The next step is to write the new timestamp to the timestamp file and keep it in memory.
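The read and write steps above can be folded into one helper. This is only a minimal sketch: the method name `rotate_timestamp` and its arguments are hypothetical and do not appear in the original script.

```ruby
# Hypothetical helper (not part of the original deploy script).
# Returns [old_timestamp, new_timestamp] and overwrites the file
# at `path` with the new value.
def rotate_timestamp(path, now = Time.now)
  # A missing file means this is the first deploy, so fall back to 0,
  # which makes every local file look newer than the previous deploy.
  old_timestamp = File.exist?(path) ? File.read(path).to_i : 0
  new_timestamp = now.to_i
  File.open(path, 'w') { |f| f.write(new_timestamp.to_s) }
  [old_timestamp, new_timestamp]
end
```

In the deploy task this would be called as `old_timestamp, timestamp = rotate_timestamp("config/deploy_timestamp")`.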
# generate minified JS and CSS
system('rake asset:packager:build_all')
This portion uses the asset packager plugin for Rails to package up JS and CSS files into bundled assets.
# sync local public/ directory to S3 bucket
# the S3 bucket directory should be the
# timestamp generated above
require 'right_aws'
s3 = RightAws::S3.new(access_key_id, secret_access_key)
bucket = s3.bucket('a-bucket')
Here we connect to S3 using RightScale’s RightAws library. You will need to put your access key ID and secret access key into the variables provided. The bucket name should be a bucket that has already been set up as an Amazon CloudFront S3 source. Documentation on how to set up a CloudFront configuration can be found in the ACF Getting Started Guide.
put_count = 0
copy_count = 0
Dir.glob('public/**/*').each do |f|
  next if File.directory?(f)
  # strip only the leading "public/" from the path
  relative = f.sub(/\Apublic\//, '')
  key = "#{timestamp}/#{relative}"
  if File.mtime(f).to_i > old_timestamp
    # changed since the last deploy: upload the file
    puts "putting #{f} into S3 as #{key}"
    bucket.put(key, File.read(f), {}, 'public-read')
    put_count += 1
  else
    old_key = bucket.key("#{old_timestamp}/#{relative}")
    if old_key.exists?
      puts "copying #{old_key} to #{key}"
      old_key.copy(key)
      # the copy does not keep the ACL, so re-apply the old key's ACL
      acl = s3.interface.get_acl(bucket.name, old_key.name)
      s3.interface.put_acl(bucket.name, key, acl[:object])
      copy_count += 1
    else
      # unchanged locally but missing from the previous deploy: upload
      puts "putting #{f} into S3 as #{key}"
      bucket.put(key, File.read(f), {}, 'public-read')
      put_count += 1
    end
  end
end
puts "done. #{put_count} files uploaded, #{copy_count} keys copied"
This code loops through everything in the public directory, skipping directories. For each file it first checks whether the modification time is later than the previous timestamp. If it is, the file data is pushed to S3 under a key made of the file path prefixed with the new timestamp. If the modification time is not later than the previous timestamp, the file is treated as unchanged, and the script looks up the old key in S3. If the old key exists, key.copy() makes a copy of the resource inside S3 (which saves re-uploading large files) and the old key’s ACL is applied to the copy. If the old key does not exist, the file data is put into S3 as usual.
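The put-or-copy decision described above can be tried out without touching S3. The following is a minimal sketch, not the author's code: `plan_sync` and its arguments are hypothetical, and the list of existing keys is passed in by hand instead of being looked up in a bucket.

```ruby
# Hypothetical planner (an illustration only): decides, per file, whether
# a deploy would upload (:put) or copy inside S3 (:copy).
#
# files         - Hash of local path => modification time (Integer epoch)
# old_timestamp - Integer timestamp of the previous deploy
# new_timestamp - Integer timestamp of this deploy
# existing_keys - Array of keys assumed to already exist in the bucket
def plan_sync(files, old_timestamp, new_timestamp, existing_keys)
  files.map do |path, mtime|
    relative = path.sub(%r{\Apublic/}, '')
    key      = "#{new_timestamp}/#{relative}"
    old_key  = "#{old_timestamp}/#{relative}"

    if mtime > old_timestamp
      [:put, key]              # changed since the last deploy
    elsif existing_keys.include?(old_key)
      [:copy, old_key, key]    # unchanged and already in the bucket
    else
      [:put, key]              # unchanged but never uploaded
    end
  end
end

plan = plan_sync(
  { 'public/images/logo.png' => 100, 'public/javascripts/app.js' => 250 },
  200, 300, ['200/images/logo.png']
)
plan.each { |step| puts step.inspect }
```

Keeping the decision separate from the S3 calls makes it easy to dry-run a deploy and see how many files would be uploaded versus copied.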
The S3 key is always prefixed with the time stamp. This is done because Amazon CloudFront will cache files for a minimum of 24 hours. If you were to release a new version of an asset and overwrite the old asset in S3 it could be quite some time before ACF would pick up the change. By prefixing the key with the time stamp you will always get the latest version as long as the asset host is configured properly in your Rails application (more on that in a bit).
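To make the cache-busting effect concrete, here is a small sketch of how the same asset ends up at a different URL after each deploy. The helper `cdn_url` and the two timestamps are made up for illustration.

```ruby
# Hypothetical helper: builds the public URL of an asset for a given deploy.
def cdn_url(host, timestamp, path)
  "http://#{host}/#{timestamp}/#{path.sub(%r{\A/}, '')}"
end

# Two deploys one day apart (timestamps are illustrative values).
before = cdn_url('cdn.yoursite.com', 1239700000, '/stylesheets/base.css')
after  = cdn_url('cdn.yoursite.com', 1239786400, '/stylesheets/base.css')

puts before
puts after
```

Because the two URLs differ, the CDN treats them as unrelated objects, so a cached copy of the old stylesheet cannot be served for the new one.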
# add and commit the config/deploy_timestamp file
system('git add config/deploy_timestamp')
system('git commit -m "deploy_assets complete, updating timestamp"')
system('git push')
This last bit of code commits the updated deploy_timestamp file to the git repository and pushes it to the remote repo. This works for us because we use GitHub. If you use something else, adjust these lines to push the file into your own source code repository.
The last piece of the puzzle is to set the asset host in Rails. Here’s what that might look like:
ts_file = File.join(RAILS_ROOT, "config/deploy_timestamp")
ts = File.read(ts_file).to_i
config.action_controller.asset_host = Proc.new { |source, request|
  if request.ssl?
    "https://yoursite.com"
  else
    "http://cdn.yoursite.com/#{ts}"
  end
}
This code can go either in your config block in config/environment.rb or in specific environment configs. You’ll also need to set up a CNAME record pointing cdn.yoursite.com to your Amazon CDN host. If you aren’t using Rails you’d still need some way to indicate that all images should be originating from Amazon CloudFront. Note that resources requested from SSL encrypted pages must still go to your SSL-enabled servers since CloudFront does not support SSL at this time.
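The asset host logic can be exercised outside Rails to check both branches. This is a sketch under assumptions: `build_asset_host` and `fake_request` are hypothetical names, and the fake request only imitates the `ssl?` method of the real Rails request object.

```ruby
# Hypothetical stand-in for the Rails request; only #ssl? is provided.
def fake_request(ssl)
  request = Object.new
  request.define_singleton_method(:ssl?) { ssl }
  request
end

# Hypothetical factory returning a proc with the same shape as the one
# assigned to config.action_controller.asset_host.
def build_asset_host(timestamp)
  proc do |_source, request|
    if request.ssl?
      'https://yoursite.com'                  # SSL pages stay on your own servers
    else
      "http://cdn.yoursite.com/#{timestamp}"  # timestamped CDN prefix
    end
  end
end

asset_host = build_asset_host(1239700000)
puts asset_host.call('/images/logo.png', fake_request(false))
puts asset_host.call('/images/logo.png', fake_request(true))
```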
One final caveat: if you are using images that are specified in stylesheets you’ll need to ensure that you use relative paths to those images.
Update 1 It turns out that when you copy a key in S3 the original key ACL is not retained. This is unfortunate since it means copied assets will now be marked private. Furthermore it appears that RightAWS does not support URI-based group identifiers for S3 ACLs, which means there is no way to change the permissions on a copied key to public-read. It seems likely that I can switch to another S3 lib to get this functionality, but that’ll have to wait. More updates will be forthcoming when I fix this.
Update 2 Thanks to Alex’s comment I’ve re-enabled the key copy in the example code. It’s a bit slow to copy the ACL, but for large files it will still perform significantly better.
Upgrading to Rails 2.3
I spent most of today working on upgrading chi.mp to Rails 2.3. Upgrading required more than just simply freezing the new gems. Here are my notes so far:
application.rb becomes application_controller.rb
The source file application.rb becomes application_controller.rb.
uninitialized constant Rails::Plugin::OpenIdAuthentication
The OpenIdAuthentication plugin needed to be upgraded to the latest from github:
script/plugin install git://github.com/rails/open_id_authentication.git --force
undefined method `use_transactional_fixtures=' for Test::Unit::TestCase:Class
Two problems caused this. First, test/unit/test_helper.rb was reopening Test::Unit::TestCase to add extra helpers when it should have been reopening ActiveSupport::TestCase. Second, some of our older tests still extended Test::Unit::TestCase instead of ActiveSupport::TestCase.
formatted_xxx_url
Formatted URLs should now use the normal xxx_url methods and just include :format => format in the options Hash.
has_many collections do not support .destroy(id)
They did, I’m certain (and I have code from 2.2.2 that works to prove it), but it no longer works. The easiest way to fix this is to replace collection.destroy(id) with collection.find(id).destroy. There is a lighthouse ticket for this as well if you’re interested in following along at home: https://rails.lighthouseapp.com/projects/8994-ruby-on-rails/tickets/2306-associationcollections-destroy-method-is-not-compatible-with-old-version.
Enumerable.group_by now returns an OrderedHash
This one was maddening. First of all, as mentioned, Enumerable.group_by now returns an OrderedHash instead of an array of arrays. This taken by itself would have been ok, but our test expectation for this was showing the result as being a Hash where the keys were the arrays and the values were nil.
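For anyone hitting the same thing, this is roughly what working with a hash-shaped group_by result looks like. The sketch uses plain Ruby's Enumerable#group_by, which returns an ordinary Hash; it assumes the OrderedHash in Rails 2.3 iterates the same way. The method name `group_by_first_letter` is made up for the example.

```ruby
# Hypothetical example method: groups words by their first letter.
def group_by_first_letter(words)
  words.group_by { |word| word[0, 1] }
end

groups = group_by_first_letter(%w[apple avocado banana blueberry cherry])

# Iterate as key/value pairs rather than as an array of arrays.
groups.each do |letter, matches|
  puts "#{letter}: #{matches.join(', ')}"
end

# Convert explicitly when a test really needs an array of pairs.
pairs = groups.to_a
puts pairs.inspect
```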
Local cache strategy freezes memcached objects
The local cache strategy now uses MemoryStore as a local storage mechanism in front of remote caches like memcached. Unfortunately the MemoryStore#write method freezes the objects it stores, so if you try to modify them afterward an error is raised. The only way I’ve found to stop this for the moment is to change value.freeze to value in the MemoryStore#write method. This probably isn’t the best solution but it does the job.
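A possible alternative to patching the framework, offered only as a sketch and not tested against Rails 2.3, is to work on an unfrozen copy of whatever comes back from the cache. The example below uses a plain frozen Hash as a stand-in for a cached value, and `thawed` is a hypothetical helper name.

```ruby
# Hypothetical helper: returns an unfrozen shallow copy of a frozen value
# so the caller can modify it without touching the cached original.
def thawed(value)
  value.frozen? ? value.dup : value
end

# Stand-in for an object read back from the local cache (an assumption).
cached = { 'name' => 'chi.mp' }.freeze

begin
  cached['plan'] = 'free'    # modifying the frozen object raises
rescue StandardError => e
  puts "#{e.class}: #{e.message}"
end

copy = thawed(cached)
copy['plan'] = 'free'        # the copy is mutable
puts copy.inspect
```

Note that dup is shallow, so nested objects are shared with the original; changes that should persist still have to be written back to the cache.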
That’s it so far, running locally. Next step is to test in an integrated environment.



