Delivering E-mail: The Difference Between Piping and Polling

Community September 9th, 2008

posted by Jeff Standen

As a community we often talk about the highly visible features used to share and manage tickets, but we have far fewer conversations about how Cerberus Helpdesk handles e-mail deliveries before those messages are turned into tickets. This is another place where Cerb4 has subtle but significant improvements over past versions.

There are usually two choices for delivering mail to an application: piping and polling.

Piping (a.k.a. Pushing)

With the piping method, e-mail is delivered to an application in real-time at the cost of server resources and redundant application overhead.

Pros:

  • It’s fast. E-mail is delivered to the helpdesk proactively and in real-time.
  • It’s modular. The relatively expensive operations required to convert a raw MIME e-mail message into plaintext with usable file attachments can be offloaded to a speedy, specialized application.  In previous versions of Cerberus Helpdesk, e-mail processing was handled by a parser written in the “C” programming language.

Cons:

  • It doesn’t scale well.  Each new message starts its own copy of the application, so every message pays the full overhead of starting up, initializing resources, processing, and shutting down.  During a burst of mail, many of these copies run concurrently.
  • It can bounce/blackhole messages.  Since mail is delivered in real time, the mail server generally expects an immediate, clear-cut success or failure response code from the application.  That means if you’re upgrading, tweaking, rebooting, or otherwise taking your helpdesk offline for a few minutes, new mail bounces back to the mail server for a later retry.  That’s normal behavior.  However, if your helpdesk is accessible but in a broken state (e.g. halfway through an upgrade, having database issues) then it’s possible for mail to be accepted but never processed.
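
To make the success-or-failure handshake concrete, here is a minimal Python sketch of a pipe handler. It is not Cerberus Helpdesk code: the function name and the `healthy` flag are hypothetical, and the exit code 75 follows the common `sysexits.h` convention for "temporary failure, retry later", which your mail server may or may not honor.

```python
from email import message_from_bytes

EX_OK = 0          # tell the mail server the message was accepted
EX_TEMPFAIL = 75   # sysexits.h convention: ask the mail server to retry later


def handle_piped_message(raw: bytes, healthy: bool) -> int:
    """Handle ONE message piped in by the mail server and return an exit code.

    `healthy` stands in for a hypothetical check that the helpdesk and its
    database are in working order.
    """
    if not healthy:
        # Refuse the message so the mail server keeps it in its own queue.
        return EX_TEMPFAIL

    # The expensive part: this parse (plus application startup) is repeated
    # for every single message.
    msg = message_from_bytes(raw)
    subject = msg.get("Subject", "(no subject)")

    # Ticket creation would happen here (omitted in this sketch).
    _ = subject
    return EX_OK
```

A real handler would read the raw message from standard input (`sys.stdin.buffer.read()`) and pass the return value to `sys.exit()`. The blackhole case from the list above is what happens when a broken helpdesk still returns `EX_OK`.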

[Image: cerb4_piping.png]

Polling (a.k.a. Pulling)

With polling, an application makes intermittent checks for new mail.

Pros:

  • It’s highly efficient.  The application only needs to start and stop once to process hundreds of messages at a time.  Information can be cached in memory from the database and shared between parses to reduce database query round trips and redundancy.
  • You get application integrity.  Cerb4 won’t check for mail unless everything is currently in proper working order.  If you need to reboot the application server then it simply won’t check for new mail until you’re done.
  • It’s easier to add more mail.  Since a POP3 or IMAP mailbox is just a folder of mail on the server, you can simply copy raw e-mail messages to that folder (rather than forwarding them) to deliver mail to the application.
  • It’s easier to move junk out of the way under high loads.  If your mail server is overloaded by a burst of junk mail, a virus, or worms, you can simply instruct Cerb4 to not check for new mail.  You can then use server-side tools to filter the junk out of the mailbox (using grep, SpamAssassin, etc.) before checking for mail again.  With piping you’re under constant machine gun fire.
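
The efficiency and integrity points can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Cerb4 is implemented: the mailbox is a plain in-memory list standing in for a POP3 or IMAP folder, and `lookup_sender` is a hypothetical stand-in for a database query.

```python
from email import message_from_bytes
from email.utils import parseaddr


def poll_once(mailbox, lookup_sender, batch_size=500, healthy=True):
    """Process up to `batch_size` raw messages in a single run.

    mailbox       -- list of raw messages (bytes); handled ones are removed
    lookup_sender -- hypothetical, expensive database lookup by address
    Returns (messages_processed, database_queries).
    """
    if not healthy:
        # Application integrity: skip the poll; mail waits safely in the mailbox.
        return 0, 0

    cache = {}        # shared between every parse in this run
    db_queries = 0
    batch = mailbox[:batch_size]

    for raw in batch:
        msg = message_from_bytes(raw)
        _, address = parseaddr(msg.get("From", ""))
        if address not in cache:
            cache[address] = lookup_sender(address)
            db_queries += 1
        # Ticket creation would happen here (omitted in this sketch).

    del mailbox[:len(batch)]  # remove only what was handled
    return len(batch), db_queries
```

With four messages from two distinct senders, this run makes two lookups instead of four. The piping model cannot share that cache, because each message gets a fresh process.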

Cons:

  • There’s a slight delay between new mail deliveries.  However, a 1-5 minute delay is likely going to be unnoticeable unless your workers measure response times in seconds.
  • Low-volume mailboxes cause a lot of needless checks for new mail.  This is generally nothing to worry about.
  • High-volume mailboxes may backlog.  If you receive more mail than your “poll” downloads every few minutes, the backlog grows with every poll and your helpdesk never catches up.
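
The backlog risk is simple arithmetic, sketched below in Python. It assumes a perfectly steady arrival rate, which real mail never has, and the function and parameter names are ours for illustration. The default batch size of 500 is the per-poll figure we mention later in this post.

```python
def backlog_after(polls, arrivals_per_minute, poll_interval_minutes, batch_size=500):
    """Return how many messages are still waiting after `polls` polls.

    Assumes mail arrives at a constant rate and each poll handles at most
    `batch_size` messages.
    """
    backlog = 0
    for _ in range(polls):
        # Mail that arrived since the previous poll.
        backlog += arrivals_per_minute * poll_interval_minutes
        # One poll removes at most one batch.
        backlog -= min(backlog, batch_size)
    return backlog
```

At 80 messages per minute and a 5-minute interval, each poll sees 400 new messages and clears them all. At 120 per minute it sees 600, clears 500, and falls 100 further behind every poll until you poll more often or raise the batch size.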

[Image: cerb4_polling.png]

The Present

In versions prior to 4.0 we offered both piping and polling choices. As of 4.0, we’ve adopted polling as the recommended mail delivery method.

We made this decision because we found the benefits of piping weren’t compelling enough.  If you have a low-volume mailbox, you likely don’t need absolute real-time deliveries because your staff probably isn’t checking for new mail every second.  If you receive a huge volume of e-mail, you’ll actually get faster deliveries from polling, because it downloads and processes 500 messages per poll and shares a cache between them.

The Future

There’s still room in Cerb4 for a huge-volume delivery option that blends the benefits of piping and polling. Ultimately, the mail server’s job is easier than the application server’s.  A delivery system that scales better will depend on distributed copies of the helpdesk running in tandem.  From there, you could round-robin a busy e-mail address to multiple POP3 mailboxes.  Each of those mailboxes could be checked concurrently by different instances of the helpdesk, each with its own server resources.
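
As a rough illustration of the round-robin idea (a thought experiment in Python, not a feature that exists in Cerb4), incoming messages for one busy address could be dealt out to several mailboxes in turn. The mailbox names here are made up.

```python
from itertools import cycle


def round_robin(messages, mailbox_names):
    """Deal messages out to the named mailboxes in turn.

    Returns a dict mapping each mailbox name to the messages it received.
    """
    if not mailbox_names:
        raise ValueError("at least one mailbox is required")

    mailboxes = {name: [] for name in mailbox_names}
    targets = cycle(mailbox_names)  # a, b, c, a, b, c, ...
    for message in messages:
        mailboxes[next(targets)].append(message)
    return mailboxes
```

Each helpdesk instance would then poll only its own mailbox, so the per-poll load is split roughly evenly across instances.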

While that sounds like a fun challenge to us, the reality is that we’ve rarely seen a helpdesk receive enough mail to justify it.  In nearly every case, a decent spam filtering solution would pare the volume down to a trickle of legitimate e-mail far more easily.

If I’ve done my job right, the next time you hear the words “piping” and “polling” thrown around you’ll know what’s being discussed.

Thanks for reading!

-Jeff@WGM
