
I am designing a web application and I am wondering how to design the architecture to manage sending automated emails.

Currently I have this feature built into my web app, and the emails are sent based on user input / interactions (like creating a new user). The problem is that connecting directly to a mail server takes a couple of seconds. As my application scales up, this will become a significant bottleneck.

What is the best way to manage sending a large amount of automated emails within my system architecture?

There won't be a huge amount of emails sent (2000 a day max). Emails don't need to be sent immediately; a lag of up to 10 minutes is fine.

Update: Message queuing has been given as an answer, but how would this be designed? Would this be handled in the app and processed during a quiet period, or do I need to create a new 'mail app' or web service just to manage the queue?

GWed
  • Can you give us a rough sense of scale? Hundreds, thousands or millions of mails? Also, should the emails be sent immediately or is a small lag acceptable? – yannis Mar 19 '13 at 10:57
  • Sending email involves handing over an SMTP message to a receiving mail host, but that doesn't mean the message has actually been delivered. So effectively, all email sending is asynchronous, and there is no point in pretending to "wait for success". – Kilian Foth Mar 19 '13 at 11:00
  • I'm not "waiting for success", but I do have to wait for the SMTP server to accept my request. @YannisRizos see update RE your comment – GWed Mar 19 '13 at 11:03
  • For 2000 mails (which is your described max) it will just work. If they are sent over, say, 10 business hours, that's about 3 mails per minute, which is very doable. Just make sure you set up your DNS records well and that the provider accepts you sending them in these amounts. Also think about: "what if the mail server is down?". The load of sending 2000 mails is not something to worry about. – Luc Franken Mar 19 '13 at 11:12
  • The answer to where is CRONTAB – Tulains Córdova Mar 19 '13 at 13:50

4 Answers


The common approach, as Ozz already mentioned, is a message queue. From a design perspective a message queue is essentially a FIFO queue, which is a rather fundamental data type:

[Figure: FIFO queue]

What makes a message queue special is that while your application is responsible for en-queueing, a different process is responsible for de-queueing. In queueing lingo, your application is the sender of the message(s), and the de-queueing process is the receiver. The obvious advantage is that the whole process is asynchronous: the receiver works independently of the sender, as long as there are messages to process. The obvious disadvantage is that you need an extra component, the receiver, for the whole thing to work.

Since your architecture now relies on two components exchanging messages, you can use the fancy term inter-process communication for it.

How does introducing a queue affect your application's design?

Certain actions in your application generate emails. Introducing a message queue would mean that those actions should now push messages to the queue instead (and nothing more). Those messages should carry the absolute minimum amount of information that's necessary to construct the emails when your receiver gets to process them.

Format and content of the messages

The format and content of your messages are completely up to you, but keep in mind that smaller is better. Your queue should be as fast to write to and process as possible; throwing bulk data at it will probably create a bottleneck.

Furthermore several cloud based queueing services have restrictions on message sizes and may split larger messages. You won't notice, the split messages will be served as one when you ask for them, but you will be charged for multiple messages (assuming of course you are using a service that requires a fee).

Design of the receiver

Since we're talking about a web application, a common approach for your receiver would be a simple cron script. It would run every x minutes (or seconds) and it would:

  • Pop n amount of messages from the queue,
  • Process the messages (i.e. send the emails).

Notice that I'm saying pop instead of get or fetch, that's because your receiver is not just getting the items from the queue, it's also clearing them (i.e. removing them from the queue or marking them as processed). How exactly that will happen depends on your implementation of the message queue and your application's specific needs.

Of course what I'm describing is essentially a batch operation, the simplest way of processing a queue. Depending on your needs you may want to process messages in a more complicated manner (that would also call for a more complicated queue).
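The batch receiver described above can be sketched as follows. This is only an illustration under stated assumptions: the queue is a plain in-memory `collections.deque`, and `run_batch`, `pending` and `delivered` are hypothetical names; in a real setup `send` would be whatever function hands the message to your mail server.

```python
from collections import deque
from typing import Callable, Deque, List


def run_batch(queue: Deque[str], send: Callable[[str], None], n: int) -> int:
    """Pop up to n messages (oldest first) and send each one.

    Returns the number of messages actually processed.
    """
    processed = 0
    while processed < n and queue:
        message = queue.popleft()  # pop: read AND remove in one step
        send(message)
        processed += 1
    return processed


# Usage: the web app appends messages, the cron-run receiver drains in batches.
pending: Deque[str] = deque(["welcome:1", "welcome:2", "reset:3"])
delivered: List[str] = []  # stand-in for real email delivery

first_run = run_batch(pending, delivered.append, n=2)
```

Each cron invocation would call `run_batch` once; anything left in the queue simply waits for the next run.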

Traffic

Your receiver could take into consideration traffic and adjust the number of messages it processes based on the traffic at the time it runs. A simplistic approach would be to predict your high traffic hours based on past traffic data and assuming you went with a cron script that runs every x minutes you could do something like this:

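// Assumes dequeue() returns the next message (or null when the queue is
// empty) and mail(message) sends it; both are supplied by your application.
function process(count) {
    for (let i = 0; i < count; i++) {   // i < count: exactly `count` messages
        const message = dequeue();
        if (message === null) break;    // queue drained early
        mail(message);
    }
}

const hour = new Date().getHours();     // server local time, 0-23
if (hour >= 14 && hour < 19) {          // high traffic hours: 2pm to 7pm
    process(10);
} else {
    process(100);
}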

A very naive & dirty approach, but it works. If it doesn't, well, the other approach would be to find out the current traffic of your server at each iteration and adjust the number of processed items accordingly. Please don't micro-optimize if it's not absolutely necessary though, you'd be wasting your time.

Queue storage

If your application already uses a database, then a single table on it would be the simplest solution:

CREATE TABLE message_queue (
  id int(11) NOT NULL AUTO_INCREMENT,
  timestamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  processed enum('0','1') NOT NULL DEFAULT '0',
  message varchar(255) NOT NULL,
  PRIMARY KEY (id),
  KEY timestamp (timestamp),
  KEY processed (processed)
);

It really isn't more complicated than that. You can of course make it as complicated as you need, you can, for example, add a priority field (which would mean that this is no longer a FIFO queue, but if you actually need it, who cares?). You could also make it simpler, by skipping the processed field (but then you'd have to delete rows after you processed them).
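As a rough sketch of how a sender and receiver might use such a table: the example below swaps in SQLite (through Python's standard `sqlite3` module) for MySQL so that it is self-contained, renames the `timestamp` column to `created`, and stores `processed` as an integer. The `enqueue` and `pop` helpers are hypothetical names, and `pop` assumes a single receiver process; with several concurrent receivers you would need row locking.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite instead of MySQL
conn.execute(
    """CREATE TABLE message_queue (
           id INTEGER PRIMARY KEY AUTOINCREMENT,
           created TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
           processed INTEGER NOT NULL DEFAULT 0,
           message VARCHAR(255) NOT NULL
       )"""
)


def enqueue(message: str) -> None:
    """Sender side: add one message to the queue."""
    with conn:  # commits on success
        conn.execute("INSERT INTO message_queue (message) VALUES (?)", (message,))


def pop(n: int) -> list:
    """Receiver side: return up to n unprocessed messages, oldest first,
    and mark them as processed."""
    with conn:
        rows = conn.execute(
            "SELECT id, message FROM message_queue "
            "WHERE processed = 0 ORDER BY id LIMIT ?",
            (n,),
        ).fetchall()
        conn.executemany(
            "UPDATE message_queue SET processed = 1 WHERE id = ?",
            [(row_id,) for row_id, _ in rows],
        )
    return [message for _, message in rows]


for text in ("a", "b", "c"):
    enqueue(text)

first = pop(2)
second = pop(2)
```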

A database table would be ideal for 2000 messages per day, but it would probably not scale well for millions of messages per day. There are a million factors to consider, everything in your infrastructure plays a role in the overall scalability of your application.

In any case, assuming you've already identified the database based queue as a bottleneck, the next step would be to look at a cloud based service. Amazon SQS is the one service I've used, and it did what it promises. I'm sure there are quite a few similar services out there.

Memory based queues are also something to consider, especially for short lived queues. memcached is excellent as message queue storage.

Whatever storage you decide to build your queue on, be smart and abstract it. Neither your sender nor your receiver should be tied up to a specific storage, otherwise switching to a different storage at a later time would be a complete PITA.
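One possible shape for that abstraction, as a minimal sketch: the application and the receiver only ever see a small interface, and each storage gets its own adapter. `QueueStorage`, `InMemoryStorage` and `notify_signup` are hypothetical names invented for this illustration, not part of any library.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import List


class QueueStorage(ABC):
    """The only interface the sender and the receiver depend on."""

    @abstractmethod
    def push(self, message: str) -> None:
        """Add one message to the back of the queue."""

    @abstractmethod
    def pop(self, n: int) -> List[str]:
        """Remove and return up to n messages, oldest first."""


class InMemoryStorage(QueueStorage):
    """Adapter backed by a deque, e.g. for unit tests."""

    def __init__(self) -> None:
        self._items = deque()

    def push(self, message: str) -> None:
        self._items.append(message)

    def pop(self, n: int) -> List[str]:
        count = min(n, len(self._items))
        return [self._items.popleft() for _ in range(count)]


def notify_signup(storage: QueueStorage, user_id: int) -> None:
    """Application code: knows nothing about where messages are stored."""
    storage.push(f"{user_id}:signup")
```

Switching to a database or a cloud service then means writing one more `QueueStorage` subclass, while `notify_signup` and the receiver stay untouched.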

Real life approach

I've built a message queue for emails that's very similar to what you are doing. It was on a PHP project and I built it around Zend Queue, a component of the Zend Framework that offers several adapters for different storages. My storages were:

  • PHP arrays for unit testing,
  • Amazon SQS on production,
  • MySQL on the dev and testing environments.

My messages were as simple as they can be: my application created small arrays with the essential information ([user_id, reason]). The stored message was a serialized version of that array (first it was PHP's internal serialization format, then JSON, I don't remember why I switched). The reason is a constant, and of course I have a big table somewhere that maps each reason to a fuller explanation (I did manage to send about 500 emails to clients with the cryptic reason instead of the fuller message once).
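A minimal sketch of that message format, written in Python rather than the PHP of the original project. The reason constants and their texts in `REASON_TEXT` are made up for illustration, as are the `encode_message` and `decode_message` helpers.

```python
import json
from typing import Tuple

# Hypothetical mapping from the short reason constant to the full text.
REASON_TEXT = {
    "ACCOUNT_CREATED": "Welcome! Your account has been created.",
    "PASSWORD_RESET": "A password reset was requested for your account.",
}


def encode_message(user_id: int, reason: str) -> str:
    """Sender side: serialize the minimal [user_id, reason] pair as JSON."""
    if reason not in REASON_TEXT:
        raise ValueError(f"unknown reason: {reason}")
    return json.dumps([user_id, reason])


def decode_message(raw: str) -> Tuple[int, str]:
    """Receiver side: recover the user id and expand the reason to full text."""
    user_id, reason = json.loads(raw)
    return user_id, REASON_TEXT[reason]


raw = encode_message(42, "PASSWORD_RESET")
```

Rejecting unknown reasons at enqueue time, and expanding them only through the lookup table, is one way to avoid sending the cryptic constant to clients.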


yannis
  • Wow. Just about the best answer I've ever received on here! Can't thank you enough! – GWed Mar 19 '13 at 15:14
  • I, and I'm sure millions of others, use this FIFO approach with Gmail & Google Apps Script. A Gmail filter labels any incoming mail based on some criteria, and that's all it takes to queue them. A Google Apps Script runs every X interval, gets the first y messages, sends them, and dequeues them. Rinse & repeat. – DavChana Dec 15 '19 at 18:06

You need some kind of queuing system.

One simple way could be to write to a database table and have another external application process rows in this table, but there are many other queuing technologies you could use.

You could have an importance on emails so that certain ones are actioned almost immediately (password reset for example), and ones of lesser importance could be batched up to be sent later.
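A small sketch of how such an importance value could drive the order of sending, using Python's standard `heapq` module. `PriorityMailQueue` is a hypothetical name, and the convention that lower numbers are sent first is an assumption of this example.

```python
import heapq
import itertools


class PriorityMailQueue:
    """Lower priority number = more urgent; FIFO within the same priority."""

    def __init__(self) -> None:
        self._heap = []
        # The counter breaks ties so equal priorities keep insertion order.
        self._counter = itertools.count()

    def push(self, message: str, priority: int) -> None:
        heapq.heappush(self._heap, (priority, next(self._counter), message))

    def pop(self) -> str:
        """Remove and return the most urgent message."""
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


q = PriorityMailQueue()
q.push("newsletter", priority=5)
q.push("password reset", priority=0)  # urgent: jumps ahead of the batch
q.push("weekly digest", priority=5)
order = [q.pop() for _ in range(len(q))]
```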

ozz
  • Do you have an architecture diagram or example that shows how this works? For example, does the queue sit in a different 'app', say a mail app, or does it get processed from within the web application during a quiet period? Or should I create a sort of web service to process them? – GWed Mar 19 '13 at 11:04
  • @Gaz_Edge Your application pushes items to the queue. A background process (a cron script most likely) pops x items from the queue every n seconds and processes them (in your case, sends the email). A single database table works fine as queue storage for small amounts of items, but generally speaking write operations on a database are expensive and for larger amounts you might want to look at services like [Amazon's SQS](http://aws.amazon.com/sqs/). – yannis Mar 19 '13 at 11:09
  • @Gaz_Edge I'm not sure I can diagram it any more simply than what I wrote: "...write to a database table and have another external application process rows in this table...." For "table", read "any queue", whatever technology that might be. – ozz Mar 19 '13 at 11:14
  • (cont...) You can build the background process that clears the queue in a way that takes into consideration your traffic, for example you can instruct it to process fewer items (or none at all) at times when your server is under stress. You'll either have to predict those stressful times by looking at your past traffic data (easier than it sounds, but with a large margin of error) or have your background process check the traffic status every time it runs (more accurate, but the added overhead is rarely necessary). – yannis Mar 19 '13 at 11:14
  • @YannisRizos want to combine your comments into an answer? Also, architecture diagrams and designs would be helpful (I'm determined to get them from this question this time! ;-) ) – GWed Mar 19 '13 at 11:19
  • And if the queue is now in a database you could even decide to keep the messages, just set a field 'Sent' or 'DateSent' when the mail has gone out. And if you really want to be fancy, add an 'ErrorMessage' field to store any SMTP sending errors that you can inspect to see what's going wrong ("Hey, the mail server has been unreachable since 3:28 yesterday"). Or anything in between ;-) like throwing away the sent mail records and keeping the failed ones. – Jan Doggen Mar 19 '13 at 13:44

There won't be a huge amount of emails sent (2000 a day max).

In addition to a queue, the second thing you should consider is sending emails through a specialized service, MailChimp for example (I'm not affiliated with this service). Otherwise many mail services, such as Gmail, will soon start sending your emails to the spam folder.

OZ_

I have modelled my queue system as two separate tables:

CREATE TABLE [dbo].[wMessages](
  [Id] [uniqueidentifier]  NOT NULL,
  [FromAddress] [nvarchar](255) NOT NULL,
  [FromDisplayName] [nvarchar](255) NULL,
  [ToAddress] [nvarchar](255) NOT NULL,
  [ToDisplayName] [nvarchar](255) NULL,
  [Graph] [xml] NOT NULL,
  [Priority] [int] NOT NULL,
  PRIMARY KEY CLUSTERED ( [Id] ASC ))

CREATE TABLE [dbo].[wMessageStates](
  [MessageId] [uniqueidentifier] NOT NULL,
  [Status] [int] NOT NULL,
  [LastChange] [datetimeoffset](7) NOT NULL,
  [SendAfter] [datetimeoffset](7) NULL,
  [SendBefore] [datetimeoffset](7) NULL,
  [DeleteAfter] [datetimeoffset](7) NULL,
  [SendDate] [datetimeoffset](7) NULL,
  PRIMARY KEY CLUSTERED ( [MessageId] ASC )) ON [PRIMARY]

There is a 1-1 relation between these tables.

The wMessages table stores the message content. The actual content (To, CC, BCC, Subject, Body, etc.) is serialized to the Graph field in XML format. The other From and To columns are only used for reporting issues without deserializing the graph. Separating this table allows its content to be partitioned onto different disk storage. Once you are ready to send a message you need to read all of its information anyway, so there is nothing wrong with serializing all the content into one column behind a primary key index.

The wMessageStates table stores the state of each message along with additional date-based information. Separating this table allows a fast access mechanism, with additional indexes, on fast I/O storage. The other columns are self-explanatory.

You could use a separate thread pool that scans these tables. If the application and the pool live on the same machine, you could use an EventWaitHandle class to signal the pool from the application that something was inserted into these tables; otherwise periodically scanning with a timeout is the best option.
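The signal-or-timeout idea can be sketched in Python, where `threading.Event` plays a role similar to .NET's EventWaitHandle. This is an analogy under stated assumptions, not the original implementation: an in-memory `queue.Queue` stands in for the two tables, a single worker thread stands in for the pool, and all names are hypothetical.

```python
import queue
import threading

pending = queue.Queue()        # stands in for the queue tables
new_mail = threading.Event()   # set by the application after each insert
stop = threading.Event()
sent = []                      # stand-in for actually sending emails


def scan_and_send() -> None:
    """Drain everything currently pending (a stand-in for scanning the tables)."""
    while True:
        try:
            sent.append(pending.get_nowait())
        except queue.Empty:
            return


def worker(poll_seconds: float) -> None:
    while True:
        # Wake up when signalled, or after the timeout as a periodic scan.
        new_mail.wait(timeout=poll_seconds)
        new_mail.clear()
        scan_and_send()
        if stop.is_set():
            return


thread = threading.Thread(target=worker, args=(0.5,))
thread.start()

pending.put("message-1")   # the application inserts a row...
new_mail.set()             # ...and signals the worker

stop.set()                 # shut down: the worker does one final scan first
new_mail.set()
thread.join()
```

If the application runs on a different machine and cannot signal, the same loop still works: the worker simply wakes on every timeout.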

ertan