What’s it Like to Run a Cloud Email Service on AWS?
After we recently onboarded a large APAC telecommunications company (serving over half a million mailboxes) to a dedicated email service hosted in the Amazon cloud, we began fielding the question from peers and prospects, “Why Amazon?”
As OpenStack supporters, we made the decision to oblige our customer’s request and deploy on Amazon’s AWS platform only after much due diligence and careful consideration. We agreed to run on AWS because of two key benefits:
- Scalability – AWS gives us the tools to scale horizontally, vertically and dynamically. (By our current estimates, the platform can comfortably scale to several million mailboxes – more would require further analysis); and
- Cost savings – Being able to switch off unneeded resources allows us to run this installation at a price point better than most of our competition, whilst still maintaining world class service and support.
What was the key to our successful AWS deployment?
The diagram below shows how our high-level flow of mail works on AWS:
- Inbound mail arrives at the edge and connects to our Mail Exchange (MX) service. This service performs a number of security checks, including content scanning. The content scanning, however, is not done on the MX service itself but is handed off to a separate antiabuse service during the SMTP conversation.
- Outbound mail arrives at the Customer Mail Relay (CMR) service. Similar to the MX service, a number of security checks are carried out at this level, including content scanning, which is again delegated to the separate antiabuse service.
- The actual mail servers hosting the mailboxes are yet again a different service. They run centrally.
- One of our hidden services is the Mail Routing Relay (MRR) service. Every message goes through this layer, which gives us a single service that makes all of the routing decisions. This includes deciding which mailstore to deliver mail to, as well as working out whether sent emails are going to an internal mailstore or are routed out to external parties. This is also where we do work relating to lawful interception of emails (which is a regulatory requirement in certain countries).
- The edge point for third-party clients is our Client Mail Access (CMA) service. This relays POP and IMAP connections. Whilst this might not be essential in a business-as-usual scenario, it is extremely powerful during migrations where the customer wants to move users individually. It allows both mail platforms to operate in parallel, with the CMA service acting as a proxy back to the legacy platform for those users who are still hosted there.
- Outbound messages to other domains not hosted on the platform get routed to the Outbound Mail Relay (OMR) service. This ensures that mail that cannot get delivered straight away is stored on queues that do not impact other services (by causing unnecessary load or delays).
- Last but not least, our application layers each have their own cluster: webmail, admin interface and DAV.
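To make the MRR's role more concrete, the routing decision described above can be sketched roughly as follows. This is an illustration only, not our production code: the domain set, the mailbox-to-mailstore map, the `Route` type and the `route_recipient` function are all hypothetical names invented for the example, and a real deployment would look these up in a directory service rather than in in-memory tables.

```python
from dataclasses import dataclass

# Hypothetical lookup data, invented for this sketch.
HOSTED_DOMAINS = {"example.com"}
MAILSTORE_BY_USER = {
    "alice@example.com": "mailstore-1",
    "bob@example.com": "mailstore-2",
}


@dataclass(frozen=True)
class Route:
    service: str  # "mailstore" (internal), "omr" (external) or "reject"
    target: str   # mailstore name, or the recipient's domain


def route_recipient(recipient: str) -> Route:
    """Decide where a message for one recipient should be sent next."""
    address = recipient.strip().lower()
    _, _, domain = address.rpartition("@")

    if domain in HOSTED_DOMAINS:
        mailstore = MAILSTORE_BY_USER.get(address)
        if mailstore is None:
            # Hosted domain, but no such mailbox on the platform.
            return Route("reject", domain)
        return Route("mailstore", mailstore)

    # Domains not hosted on the platform leave via the outbound relay,
    # so slow remote deliveries queue there and not on other services.
    return Route("omr", domain)


if __name__ == "__main__":
    for rcpt in ("alice@example.com", "someone@other.org"):
        print(rcpt, "->", route_recipient(rcpt))
```

Keeping this decision in one place is what allows the other services to stay simple: the MX and CMR layers only need to know how to reach the routing layer, not where each mailbox lives.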
What Amazon services and features are we using?
- Auto-healing (i.e. rebuilding of instances when they become unresponsive or otherwise fail health checks); and
- Dynamic resource management.
All of our infrastructure is defined in code using AWS CloudFormation, and this has been a major contributor to our success in the environment. Having a source-controlled infrastructure with exact version history has allowed us to keep track of a dynamic environment that at times scales up to well beyond 70 servers and then back down again.
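As a rough illustration of what infrastructure as code looks like here, the sketch below builds a minimal CloudFormation template for one autoscaled tier and prints it as JSON. It is not one of our actual templates: the `mail_tier_template` function, the tier name, the parameter names and the sizes are assumptions made for the example. The resource type and property names (`AWS::AutoScaling::AutoScalingGroup`, `MinSize`, `MaxSize`, `HealthCheckType` and so on) are standard CloudFormation.

```python
import json


def mail_tier_template(tier: str, min_size: int, max_size: int) -> dict:
    """Build a minimal CloudFormation template for one autoscaled tier."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"Autoscaled {tier} tier (illustrative sketch)",
        "Parameters": {
            # Supplied at deploy time; the names are hypothetical.
            "LaunchTemplateId": {"Type": "String"},
            "LaunchTemplateVersion": {"Type": "String"},
            "SubnetIds": {"Type": "List<AWS::EC2::Subnet::Id>"},
        },
        "Resources": {
            f"{tier.capitalize()}Group": {
                "Type": "AWS::AutoScaling::AutoScalingGroup",
                "Properties": {
                    "MinSize": str(min_size),
                    "MaxSize": str(max_size),
                    "VPCZoneIdentifier": {"Ref": "SubnetIds"},
                    # Instances that fail health checks are replaced.
                    "HealthCheckType": "EC2",
                    "HealthCheckGracePeriod": 300,
                    "LaunchTemplate": {
                        "LaunchTemplateId": {"Ref": "LaunchTemplateId"},
                        "Version": {"Ref": "LaunchTemplateVersion"},
                    },
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(mail_tier_template("mx", 2, 10), indent=2))
```

Because the template is plain data, it can be committed to source control and diffed like any other file, which is where the exact version history comes from.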
Going beyond this, we have also made use of Amazon’s ElastiCache, which offers fully managed Redis and Memcached. Making use of Amazon’s managed services expertise allows us to focus on what matters most to our core business, which is building a high-quality email service, not directly running cloud databases and cache servers.
To keep the system running, we have found Amazon CloudWatch, with its monitoring and alarm capabilities, to be of great value. Alarms can be set at the autoscaling group level (which is great for scaling as well as general health monitoring), as well as on specific instances. CloudWatch also allows us to upload our own metrics, for instance the number of messages flowing through the system. Triggering alerts on these metrics has made it much easier to spot bottlenecks quickly.
If you’d like to discuss our AWS experience in person and/or share your own deployment experience, we invite you to meet us this week at AWS re:Invent. Alternatively, feel free to get in touch and arrange a more suitable time and venue. The more we all learn about AWS together, the more benefits we bring to our companies and our customers.
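For readers who have not published custom metrics before, the sketch below shows roughly what uploading a message count to CloudWatch can look like using the boto3 SDK's `put_metric_data` call. The namespace `Example/Mail`, the metric name `MessagesProcessed`, the `Service` dimension and both function names are assumptions for illustration, not the names we use in production.

```python
from datetime import datetime, timezone


def build_metric_datum(service: str, message_count: int) -> dict:
    """Build one CloudWatch datum for the number of messages handled."""
    return {
        "MetricName": "MessagesProcessed",  # hypothetical metric name
        "Dimensions": [{"Name": "Service", "Value": service}],
        "Timestamp": datetime.now(timezone.utc),
        "Value": float(message_count),
        "Unit": "Count",
    }


def publish(datum: dict, namespace: str = "Example/Mail") -> None:
    """Send the datum to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # third-party AWS SDK, imported lazily on purpose

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_data(Namespace=namespace, MetricData=[datum])


if __name__ == "__main__":
    datum = build_metric_datum("mrr", 1250)
    print(datum)
    # publish(datum)  # uncomment when AWS credentials are configured
```

Once a metric like this exists, a CloudWatch alarm can be attached to it in the same way as to the built-in instance and autoscaling group metrics.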
New to atmail?
With 20 years of global, white-label email expertise serving telecommunications and hosting providers across every continent, you can trust atmail to deliver cloud email solutions that are stable, secure and scalable. We power 170 million mailboxes and offer user-friendly, cloud-hosted email with your choice of data centre (US, GDPR-compliant EU or other locations on demand). Talk to us today.