About this blog

'Going Spatial' is my personal blog. The views on this site are entirely my own and should in no way be attributed to anyone else or taken as the opinion of any organisation.

Tuesday 11 May 2010

Security Groups in Amazon Web Services

Granting access to that lovely new AMI you have created
The concept of a security group in AWS is a nice idea as it is, in effect, a firewall. Each instance that is launched from an AMI is allocated a security group. It bears little resemblance to what one would normally call a security group, such as one with users and group permissions in a Windows Active Directory. I think the name does confuse.

Anyway, when a new instance is spun up it needs a number of ports open to allow web traffic and Remote Desktop Protocol (RDP) to pass from the internet to the instance and back. By default, every instance is put into a default security group that denies all incoming connections from the internet. Not a good place to be.

Pro Tip:
So, before you even launch your first instance, as tempting as it may be, create the necessary connection rules in the firewall first. Most will need a minimum of RDP, HTTP and HTTPS, so we shall create a group with these three for all internet access. We shall call it 'Internet'.

Go and create your new security group by navigating down the left-hand menu and selecting 'Security Groups'.

Click to create a new security group and call it 'Internet'. Under the connection methods, click on the pull-down menu and select one of a dozen well-known connection methods. Each one defaults to its well-known port number (3389 for RDP, 80 for HTTP, 443 for HTTPS), and you can change these ports if required. Make sure you hit the 'Save' button in the right-hand column, called 'Actions', to ensure that your new firewall rule (because that is what it is) has been saved. Annoyingly, you have to do this for each connection method.

Ensure that RDP is one of the choices, as you want to remote desktop to your instance, don't you? Of course, if you are running a number of secured services, it might be a good idea to remove this particular connection method later, just to improve security. I would use NetSupport as an alternative; it uses port 5405. Just make sure that your own corporate or personal firewall allows these ports out!

Once these rules are saved, they are applied almost instantly.
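For anyone who would rather script the rules than click through the console, here is a minimal sketch in Python. It only builds the rule list for the 'Internet' group and does not talk to AWS at all. The port numbers are the well-known defaults; the function name `internet_rules` and the wide-open `0.0.0.0/0` source range are my own assumptions for illustration. The dictionary layout follows the `IpPermissions` shape used by the boto3 EC2 client, so the list could be handed to it, but treat that as a starting point rather than a tested deployment script.

```python
# Well-known default ports for the three connection methods in the 'Internet' group.
DEFAULT_PORTS = {"RDP": 3389, "HTTP": 80, "HTTPS": 443}


def internet_rules(ports=DEFAULT_PORTS, source_cidr="0.0.0.0/0"):
    """Return one inbound TCP rule per connection method.

    source_cidr is an assumption: 0.0.0.0/0 means 'anyone on the internet'.
    """
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,   # a single port, so the range starts...
            "ToPort": port,     # ...and ends on the same number
            "IpRanges": [{"CidrIp": source_cidr, "Description": name}],
        }
        for name, port in sorted(ports.items())
    ]


if __name__ == "__main__":
    for rule in internet_rules():
        print(rule["IpRanges"][0]["Description"], rule["FromPort"])
```

Passing a narrower `source_cidr` (your office address range, say) for the RDP rule is the scripted equivalent of not leaving remote desktop open to the whole internet.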

I can't access my AMI!

OK, it could be due to any of the following, so check again:

1. Your security group - do you have the correct connection method selected?
2. Correct ports?
3. Did you save?
4. Check your own corporate firewall.
5. Check your own personal firewall (e.g. ZoneAlarm) - it could be blocking the connection.
6. Check the external DNS name - you might be going to the wrong instance.
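A quick way to narrow the checklist down is to test whether the ports answer at all from where you are sitting. The sketch below uses only Python's standard `socket` module; the function names `port_is_open` and `check_ports` are my own, and the default port list assumes the RDP, HTTP and HTTPS rules from the 'Internet' group.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS lookup failures alike.
        return False


def check_ports(host, ports=(3389, 80, 443)):
    """Return a {port: reachable} mapping for the given host."""
    return {port: port_is_open(host, port) for port in ports}
```

Calling `check_ports` with the external DNS name of your instance gives one True/False per port. A False does not tell you which firewall is to blame (the security group, the corporate one or your personal one), only that something between you and the instance is not letting the connection through.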

Thursday 6 May 2010

Imagery in web applications - ArcGIS Server Blog

Interesting article here - though our initial experiences with map caches weren't too successful.

CNAME and Amazon

Amazon Web Services provides some unwieldy DNS names for its instances, and I believe their IPs change as well. One obvious task after an instance has been spun up is to give it something more friendly than en-1002957-gb1-eu1a.aws.amazon.com as a DNS name!

To get round this, one needs to use a CNAME record to map a different name to it. We're going to alias it. There appears to be some confusion over the terms: strictly, the CNAME record holds the alias and the 'canonical name' is the real name it points to, but over time the two have been used interchangeably.

So, anyway - the goal is to have 'datahub.esriuk.com' as the URL that a user types into the address field. This will then seamlessly resolve to the 'proper' DNS name that is attached to each Amazon instance.

The process was surprisingly simple: you just have to contact whoever holds your domain name (our ISP in this case, for esriuk.com) and make a request. That's it.
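To make the aliasing idea concrete, here is a toy model of what a resolver does with a CNAME, written in Python. It is a sketch only: the records live in a plain dictionary rather than real DNS, the `resolve` function is my own invention, and the IP address is a made-up one from the range reserved for documentation. The two host names are the ones used in this post.

```python
# Toy zone data: name -> (record type, value). The IP is a documentation address,
# not the real address of any instance.
RECORDS = {
    "datahub.esriuk.com": ("CNAME", "en-1002957-gb1-eu1a.aws.amazon.com"),
    "en-1002957-gb1-eu1a.aws.amazon.com": ("A", "203.0.113.10"),
}


def resolve(name, records=RECORDS, max_hops=8):
    """Follow CNAME aliases until an A record is found.

    Returns (canonical_name, ip_address).
    """
    for _ in range(max_hops):
        if name not in records:
            raise LookupError(f"no record for {name}")
        record_type, value = records[name]
        if record_type == "A":
            return name, value
        name = value  # CNAME: carry on with the name the alias points to
    raise LookupError("too many CNAME hops (possible loop)")
```

The point of the indirection is that if Amazon changes the IP behind the long name, only the A record changes; the friendly alias keeps pointing at the same canonical name and nobody has to be told.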

Using CNAMEs within the Amazon cloud makes access to data and resources a lot easier. Nearly everyone will have S3 buckets in the cloud as well, and using a CNAME as an alias is dead easy, especially if you're using S3Fox - a wonderful add-on for the Firefox browser.

Here's a quick Amazon video on using CNAMEs and S3 buckets.

Or, shamelessly borrowed from the following page: http://developer.amazonwebservices.com/connect/entry.jspa?externalID=2456

When status becomes deployed, our distribution is ready and we are using Amazon CloudFront. As you can see above, our distribution gives us a new host name; we can now access our content at: http://d2oxqriwljg696.cloudfront.net/media_rdc_share_web-poster.jpg.

Obviously this is a cumbersome URL to work with; you might want to replace it with a friendlier alias. The standard way to do this is by creating an alias that maps a friendly name to our actual name - this alias is called a CNAME or canonical name.

A CNAME is simply a way to create an alias or a nickname for a DNS record. In our case, we are going to create an alias for our cumbersome d2oxqriwljg696.cloudfront.net host name.

For this example, we will create demo.learnaws.com as a CNAME that points to d2oxqriwljg696.cloudfront.net.

This is an optional step; if you are comfortable using d2oxqriwljg696.cloudfront.net in your web page or application, there is no need to create a CNAME.

The first thing to do is to let Amazon CloudFront know that you plan to create the CNAME. To do this in S3 Organizer, you’ll add the CNAME to the Manage Distribution dialog. Click the Update Distribution  

Next, you need to create a DNS entry for your CNAME. CNAMEs are managed by whoever manages your DNS entries. This is usually your web hosting provider. There is no standard interface for managing DNS entries, so an example from Dreamhost.com is shown below.

Usually, a web hosting provider will discuss how to alter your DNS entries in their support documentation. For our example, we will continue to use Dreamhost.com and create a CNAME for our new Amazon S3 bucket.

The alias, or CNAME that we will use is demo and we simply specify d2oxqriwljg696.cloudfront.net as the value.

It is common to also create a www.demo CNAME entry that maps to the d2oxqriwljg696.cloudfront.net as well. Incidentally, if you have a CNAME for an Amazon S3 bucket, you can simply change its value to your new Amazon CloudFront host.

New DNS entries usually take a few minutes to propagate. When they do, we can access our content at http://demo.learnaws.com. This is the base URL that we can use to access our content in Amazon CloudFront.

Now we have a friendly URL that will serve its content from a data center that is as close as possible to the user requesting it.

2. Use the Amazon CloudFront domain name to reference content in your web pages or applications

Once your content has been uploaded and your distribution has been set up, you can reference your content with your new Amazon CloudFront-based URL.
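If existing pages already reference the cumbersome host name, swapping in the alias is just a matter of replacing the host part of each URL. A minimal Python sketch, using only the standard library; the helper name `use_alias` is hypothetical, and the host names and file name are the ones from this example:

```python
from urllib.parse import urlsplit, urlunsplit


def use_alias(url, alias_host):
    """Return url with its host replaced by alias_host; path and query are kept."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(netloc=alias_host))


if __name__ == "__main__":
    original = "http://d2oxqriwljg696.cloudfront.net/media_rdc_share_web-poster.jpg"
    print(use_alias(original, "demo.learnaws.com"))
```

Both URLs should return the same object once the CNAME has propagated, so the rewrite can be done gradually.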

Your content can be served from any of the following edge locations, depending on where the request is being made:

United States
  • Ashburn, VA
  • Dallas/Fort Worth, TX
  • Los Angeles, CA
  • Miami, FL
  • Newark, NJ
  • Palo Alto, CA
  • Seattle, WA
  • St. Louis, MO

Europe
  • Amsterdam
  • Dublin
  • Frankfurt
  • London

Asia
  • Hong Kong
  • Tokyo

While one or several of these edge locations may serve your requests, your ‘origin’ server will always be the Amazon S3 bucket where you originally uploaded your data.

Your content will be copied to each edge server as it is requested. The first request will be processed by the origin server; then that content will be propagated to the appropriate edge server. The next time this content is requested, it will be handled by the edge server.

When you update your content, those updates are made at the Amazon S3 bucket (i.e. the origin server). Amazon CloudFront will then propagate those changes to the edge servers that have your content – this process can take up to 24 hours, but is usually completed within a few minutes.
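The origin-and-edge behaviour described above is essentially a pull-through cache, and a few lines of Python can illustrate it. This is a sketch under stated assumptions, not how CloudFront is actually implemented: the `EdgeCache` class is hypothetical, a dictionary stands in for the S3 bucket, and the time-based expiry is simply my guess at one mechanism that would produce the "up to 24 hours" delay mentioned in the text.

```python
import time


class EdgeCache:
    """Pull-through cache: fetch from the origin on a miss, serve locally on a hit."""

    def __init__(self, origin, ttl_seconds=24 * 60 * 60, clock=time.monotonic):
        self.origin = origin      # dict standing in for the S3 bucket
        self.ttl = ttl_seconds    # assumed expiry time for a cached copy
        self.clock = clock        # injectable so the behaviour can be tested
        self.store = {}           # key -> (content, time it was cached)

    def get(self, key):
        """Return (content, source), where source is 'edge' or 'origin'."""
        now = self.clock()
        cached = self.store.get(key)
        if cached is not None and now - cached[1] < self.ttl:
            return cached[0], "edge"
        # Miss, or the cached copy has expired: go back to the origin.
        content = self.origin[key]
        self.store[key] = (content, now)
        return content, "origin"
```

In this model the first `get` for a file reports 'origin' and later ones report 'edge'. If the file is updated at the origin, the edge keeps serving its old copy until that copy expires, which is why an update does not necessarily show up everywhere straight away.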

Wednesday 5 May 2010

ArcGIS and Amazon

This is my first technical blog, though I have been blogging under a number of different guises on different subjects for a while, mainly online gaming and photography. I needed a log to keep track of what I was doing, merely as a reminder to myself.

Amazon Web Services (http://aws.amazon.com/) is a relatively new offering to the market, providing fast, reliable and economical cloud computing to anyone who wants to pay. As someone who manages a hosting service, I find that AWS provides very quick access to resources that would otherwise cost me and the business thousands of pounds in new machines and licenses just to get started. Amazon Web Services gives almost instant access to pay-as-you-go infrastructure and this is a great thing.

Cloud computing has been around for a while but AWS has made it easy for individuals and companies to access it - with a relatively clear pricing structure so that one can keep track of the cost on a daily basis. Once a service is not required, you can throw the instance away and not have to worry about disposal/recycling of hardware. This evolution is a natural process for our hosting team: we started off with a few servers, which grew to a few more servers, then shrank down to a handful of big servers running virtual machines, and now we are using the cloud.

The evolution of hosting at ESRI (UK)

Year One to Four
One very big project made it a necessity to set up a dedicated team and infrastructure to host an innovative web service. Over the three to four years, the service and accompanying infrastructure grew and grew. Kit was replaced several times and a growing pile of older, possibly obsolete servers started to cause us issues in terms of reliability, storage and recycling/disposal. 

Year Four
The web application and web service were sold off to a third party - who naturally wanted new kit in their new hosting centre. We were left with the older kit to run other hosting applications. It became a balancing act to ensure that we had enough hardware to run existing applications well and enough flex room to take on more work. However, we had to be careful that we didn't grow too big in terms of kit without a hosting contract to pay for it.

This balancing act went on for a couple of years. A few big hosting contracts were won and this required even more new kit; the cycle was repeating itself and the balancing act was maintained.

Year Five to Six
Virtualisation was touted as a possible way to lower the cost of ownership. The idea is very attractive: get a big host server and replace dozens of physical servers with the same number of virtual machines. Spin them up as required and adjust their resource requirements on the fly. Total Cost of Ownership (TCO) should come down as one does not need to buy new kit and there is a saving in power requirements. However, licenses for servers and applications still need to be purchased, so the 'entry cost' was merely shifted from buying the kit to licensing the software. Still, the monetary savings on power alone through virtualisation were significant. The ease of backup and recovery was also noted - a dead virtual machine can be switched off and replaced by a backup virtual machine in a few minutes. There was no need to keep extra kit around and the clutter in the server room was reduced. We still had the headache of patching the virtual machines each month, made more complicated by the need to spin up the backup VMs to patch and maintain them too.

Year Seven
Cloud computing offers an extension of in-house virtualisation in that short-lived, high-intensity applications can easily be made available on the cloud and then switched off / thrown away when finished. The OS cost is included in the daily fee so savings are immediate. Creating new VMs in the cloud is easy, and utilising very large servers (in terms of CPU and RAM) is as easy as making a different choice when you are spinning up a new instance. One has to learn a new vocabulary when in the Amazon cloud as well. Keeping instances patched is still an ongoing issue, but the 'entry cost' to hosting has now been almost eliminated. There is no need to buy kit or to license operating systems (or even databases if you opt to use the PostgreSQL AMI) - so a bare-bones system running IIS can be up and running in a matter of minutes.