Major Performance Improvements – 4.0 (Build 603)

Community May 20th, 2008

posted by Jeff Standen

We’ve implemented a short-term/long-term caching strategy in 4.0 (Build 603) which has significantly boosted performance.

I’ve included a bit of technical information for anyone interested. If this kind of thing puts you to sleep, the above sentence pretty much says it all: with build 603 you should be finding the system even faster than it was already. During the maintenance window last weekend we updated our hosted Cerb4 helpdesks to this version.

Now for the technical details:

Long-term vs. Short-Term Caching

Long-term caching stores frequently-read and infrequently-changed information from the database. The cache is invalidated and re-cached when the source is updated in the database. This long-term cache can be shared between all the helpdesk workers and community tool visitors, and it drastically reduces the load on the database. This caching strategy is especially useful for lists of workers, groups, plugins, mail routing rules or global settings. Any time we have an ID of one of these objects we can simply refer to the cache (to display names, labels and other information) rather than doing expensive joins from the database.

Short-term caching exists for the life of a single page request by a single worker. Often a single request will need the same information (like a list of workers) multiple times when rendering a page. After the first lookup is served from the long-term cache, the result is kept in the short-term cache so any subsequent lookups during that request are lightning fast. Even though the long-term cache is almost always faster than talking to the database, there’s still some overhead involved in reading from it rather than from memory. How much depends on the caching options in framework.config.php: by default the long-term cache is on the filesystem, though more advanced options like Memcached can be easily enabled.
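To make the two layers concrete, here is a minimal Python sketch of a read-through cache with a per-request layer in front of a shared one. It is an illustration only, not the Devblocks implementation (which is PHP); the names `TwoTierCache`, `load` and `invalidate`, and the dict standing in for the filesystem or Memcached backend, are assumptions made for the example.

```python
class TwoTierCache:
    """Per-request (short-term) cache in front of a shared (long-term) cache."""

    def __init__(self, long_term, source):
        self.long_term = long_term   # shared store; a dict stands in for disk/Memcached
        self.source = source         # callable that loads a key from the database
        self.short_term = {}         # lives only as long as this request object
        self.long_term_reads = 0     # counters, just to make the behaviour visible
        self.source_reads = 0

    def load(self, key):
        # 1. Cheapest: already seen during this request.
        if key in self.short_term:
            return self.short_term[key]
        # 2. Next: the shared long-term cache.
        self.long_term_reads += 1
        if key in self.long_term:
            value = self.long_term[key]
        else:
            # 3. Last resort: the database, then populate the long-term cache.
            self.source_reads += 1
            value = self.source(key)
            self.long_term[key] = value
        self.short_term[key] = value
        return value

    def invalidate(self, key):
        # Called when the source row changes; the next load() re-caches it.
        self.long_term.pop(key, None)
        self.short_term.pop(key, None)


shared = {}
request = TwoTierCache(shared, source=lambda key: f"rows for {key}")
for _ in range(5):
    request.load("workers")
print(request.source_reads, request.long_term_reads)  # 1 1
```

Five lookups in one request cost one database read and one long-term read; a second request sharing the same long-term store would skip the database entirely.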

The Impact on Cerb4

The long-term cache strategy has existed in Cerb4 since the beginning, but we recently realized it could become pretty inefficient on high-load systems as the filesystem became a bottleneck for frequently-requested caches.

This was happening because the long-term cache was retrieved from the disk every time it was requested. For a process like the parser, which is loosely-coupled and modular, some caches are requested from events each time an e-mail message is parsed. If you download 100 messages you have 300+ cache requests. It obviously doesn’t make a lot of sense to be accessing files on the disk 300 times during the same page load. You can imagine the performance-drain effect this has if you have a helpdesk with a lot of concurrent workers all causing frequent hits on the disk cache as well.
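The arithmetic is easy to reproduce. The sketch below simulates a parser run in Python under the assumption that each message triggers exactly three cache lookups (the post only says 100 messages lead to 300+ requests); the cache names and the `read_from_disk` helper are hypothetical.

```python
CACHES_PER_MESSAGE = ("workers", "groups", "routing_rules")  # hypothetical names

def count_disk_reads(message_count, memoize):
    disk_reads = 0
    request_cache = {}  # short-term cache, discarded after the run

    def read_from_disk(name):
        nonlocal disk_reads
        disk_reads += 1
        return f"contents of {name}"

    for _ in range(message_count):
        for name in CACHES_PER_MESSAGE:
            if memoize:
                if name not in request_cache:
                    request_cache[name] = read_from_disk(name)
            else:
                read_from_disk(name)
    return disk_reads

print(count_disk_reads(100, memoize=False))  # 300
print(count_disk_reads(100, memoize=True))   # 3
```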

The first impulse many programmers have to address this is to “cache the cache” in a local variable. This isn’t as simple as it sounds when you’re dealing with loosely-coupled, object-oriented code. You can’t simply pass references around to methods; and it doesn’t make a lot of sense to implement static patterns like Singleton for every object, or to bloat the API with another specialized global registry just to store cache output.

In prior builds, we were directly exposing the Zend_Cache class (of the Zend Framework) from Devblocks to handle caching. This left short-term caching up to the programmers.

With this build we’ve abstracted the cache manager (DevblocksPlatform::getCacheService()) to automatically handle short-term and long-term caching without requiring any special code. Existing cache code should continue to work exactly as it was written, just much faster under higher loads.

Enjoy!


Some Practical Security Considerations for Your Helpdesk

Community, Tips & Tricks May 15th, 2008

posted by Jeff Standen

Alright! Earlier today I promised to share some helpdesk security tips.

If you look at the .htaccess (or .htaccess-dist) file in your /cerb4 directory, you’ll notice the following lines:

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . index.php [L]
RewriteRule ^(.*/)?\.svn(/|$) - [F,L]
RewriteRule ^(.*/)?api(/|$) - [F,L]
RewriteRule ^(.*/)?libs(/|$) - [F,L]
RewriteRule ^(.*/)?plugins(/|$) - [F,L]
RewriteRule ^(.*/)?storage(/|$) - [F,L]
RewriteRule ^(.*/)?templates(/|$) - [F,L]

</IfModule>

This file sets up some basic rules, based on URLs and paths, to tell the web server that we don’t want anyone snooping around our private directories. You don’t want shady visitors poking around your /storage directory (which holds incoming e-mail sources and attachments), or your /libs/devblocks/tmp directory (which holds caches that could be used to peek at your helpdesk data or crack your worker logins).
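If you want to see which paths those patterns catch, the regular expressions can be tried outside the web server. This is a rough Python sketch that models only the six forbid patterns (not the index.php rule or anything else Apache does), and it assumes paths are written relative to the /cerb4 directory without a leading slash, which is how mod_rewrite presents them in a .htaccess file. The function name `is_forbidden` is made up for the example.

```python
import re

# The protected names from the .htaccess rules above.
PROTECTED = [r"\.svn", "api", "libs", "plugins", "storage", "templates"]
RULES = [re.compile(rf"^(.*/)?{name}(/|$)") for name in PROTECTED]

def is_forbidden(path):
    """Return True if a request path (relative to /cerb4/) hits a forbid rule."""
    return any(rule.search(path) for rule in RULES)

for path in ["storage/", "libs/devblocks/tmp", ".svn/entries",
             "index.php", "tickets/storage_report"]:
    print(path, is_forbidden(path))
```

The first three paths match a rule; the last two do not, because each pattern requires the protected name to be a complete path segment.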

In a perfect world, we’d be creating these directories completely outside the web-accessible path. It’s a fairly easy tweak for those so inclined (who have proper server access); but for the average shared-hosting installation such a setup is often not possible. Our default installation needs to remove as many obstacles as we can without compromising too much. The beauty of Cerb4’s new design is that you can start customizing your installation from our simple, working baseline without losing your ability to easily upgrade.

If your web server isn’t capable of parsing .htaccess files (e.g. they’re disabled, you use IIS, etc.) then you need to block the list of paths above using something like “directory security”. Practically every web server will give you this ability, and most control panels in shared hosting environments will as well (e.g. cPanel, Plesk).

Go ahead and test your helpdesk URL to make sure you have something in place to protect these directories. For example, here’s what our online demo shows:
http://demo.cerb4.com/admin/storage

http://demo.cerb4.com/admin/plugins

The absolute worst result you can get is a directory listing that shows all the folders in these directories. If you see that, make sure you get some administrator help immediately (you can send them to this URL).
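If you’d rather script the check than click through each URL, something like the following Python sketch could work. Treat it as a starting point: the host name is a placeholder, the status codes treated as “protected” are a judgment call, and the “Index of /” test assumes Apache’s default auto-generated listing page (other servers word theirs differently).

```python
import urllib.error
import urllib.request

def classify(status, body):
    """Interpret the response for a directory that should be private."""
    if status in (401, 403, 404):
        return "protected"
    if status == 200 and "Index of /" in body:
        return "EXPOSED: directory listing"
    return "check manually"

def probe(url):
    """Fetch a URL and classify the result (performs a real HTTP request)."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            body = response.read(4096).decode("utf-8", errors="replace")
            return classify(response.status, body)
    except urllib.error.HTTPError as error:
        return classify(error.code, "")

# Usage, with a placeholder host:
# for name in ("storage", "plugins", "libs", "api", "templates"):
#     print(name, probe(f"https://helpdesk.example.com/cerb4/{name}/"))
```

Anything reported as “check manually” deserves a look in a real browser, since a 200 response could be either the helpdesk login page or leaked content.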

It’s possible to set up directory security on your entire helpdesk URL (/cerb4/*), but there are a few important things you need to keep in mind if you want to do this:

  • You’ll need to allow your community tools to connect to your helpdesk URL. At the moment our reverse proxy script doesn’t have the ability to do HTTP authentication. In an advanced configuration you could allow the IP of your community tool web server to bypass the authentication.
    I’ve gone ahead and opened up a development task to add HTTP authentication awareness to community tools: http://wgmdev.com/jira/browse/CHD-679

  • Your cronjobs or scheduled tasks will need to do HTTP authentication to talk to /cerb4/cron or /cerb4/update. This is easier because command-line tools like wget support this already.

  • If you use our Web-API plugin, you’ll need to add some extra code to your application to deal with HTTP authentication.
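For scheduled tasks and Web-API clients, the “extra code” mostly amounts to sending an HTTP Basic Authorization header with each request. Here is a small Python sketch of that idea using only the standard library; the URL and credentials are placeholders, and `authenticated_request` is a helper invented for this example rather than anything shipped with Cerb4.

```python
import base64
import urllib.request

def authenticated_request(url, username, password):
    """Build a request carrying an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request

# Placeholder URL and credentials; substitute your own.
cron = authenticated_request("https://helpdesk.example.com/cerb4/cron", "user", "pass")
print(cron.get_header("Authorization"))  # Basic dXNlcjpwYXNz
# urllib.request.urlopen(cron, timeout=60)  # uncomment to actually trigger the job
```

Basic authentication only encodes the credentials, so this should go over SSL.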

You should be fine with directory security on specific subdirectories without locking down your entire site.

A more practical approach to securing your entire helpdesk path or domain would be to lock it down by IPs rather than passwords. However, this can be very cumbersome — to the point of not being worth doing — if you have a lot of helpdesk staff. You can do this from the web server or from your firewall. You’ll still need to permit your scheduled tasks, Web-API users and community tools (from your public web servers) to connect to the helpdesk.

And while “security through obscurity” is nothing you should critically depend on, some simple common sense can go a long way too:

  • Use SSL.

  • Rename your /cerb4 subdirectory to something less predictable. If you’re using a subdomain for your private helpdesk (not your public tools), you should try to make it something harder to guess than “helpdesk”.

  • Don’t make public links to your helpdesk URL. As far as you can avoid it, that includes links in e-mail, bug reports and forum posts. If Google has indexed your helpdesk URL, then you’re already more immediately vulnerable to every published exploit.

  • You could use a non-standard port for your helpdesk web server, if possible. If you control your own server it’s pretty straightforward to set up a separate Apache instance on a port like 9999. The downside is being slightly inconvenient to your helpdesk staff if they have an aversion to browser bookmarks or shortcuts. Get used to people going “the helpdesk disappeared!”.

  • You could also host your helpdesk on your office intranet, using a domain like “http://helpdesk/” instead of a web-accessible IP. This won’t impact scheduled tasks that access your URL as long as they run from inside your network. You could permit your public community tools, or Web-API users, to access specific paths of your intranet helpdesk by opening up a non-standard port on your router and filtering traffic for specific URLs from specific IPs.

I wouldn’t really go as far as even calling most of these notes ‘recommendations’. These are simply options available to you.

However, make sure that you can’t access the URLs listed at the top of this post on your helpdesk from your web browser. That should be the very least that you do to lock things down.

 


Important Security Patch – 4.0 (Build 600)

Community May 15th, 2008

posted by Jeff Standen

A security exploit has been reported that demonstrates the viewing of some helpdesk content without an active session. Please upgrade to 4.0 (Build 600) as soon as possible to correct this important issue.

The flaw has to do with how we authenticate controllers (the first ‘/slash’ commands in the application, like ‘/display’ or ‘/kb’) that aren’t standard helpdesk pages. Pages that you can access from the menu handle security inherently, but functionality that doesn’t tie into the web interface (like ‘/cron’, which is secured by IP addresses) has to handle its own authentication based on the use case.

This preliminary patch addresses the reported issue directly. We’re going to run a full audit through the code to ensure other functionality isn’t susceptible to this same flaw.

If you use Subversion to update Cerberus Helpdesk 4.0 (and you really should be), you can simply issue the console command “svn update -r 600” from your /cerb4 directory on a Unix-based server. If you use Windows, you can use a graphical client like TortoiseSVN to ‘Update to Revision’ 600 from the right-click menu.

After the code audit, I’ll do a follow-up post this afternoon on the blog with other security considerations. For example, if you wrap your helpdesk URL in HTTP Authentication from the webserver you can add another layer of protection. You’d just need to make sure your cronjob/task (using something like wget) is using the HTTP Authentication. I’ll explain how to do that in my next post.

(We’ll be upgrading all our hosted helpdesks immediately.)

 


Hosted Helpdesk (*.cerb4.com) Upgrade Notice: Saturday May 17th

Community May 14th, 2008

posted by Jeff Standen

This weekend we’ll be performing some upgrades and maintenance on our Cerb4 hosted helpdesk infrastructure.

We’ll be:

  • Adding more RAM to the database machines.

  • Adding non-RAID storage to the main hosted machines. This will take a lot of burden off the local disks as we start doing even more frequent backups.

  • Migrating some helpdesks to balance load inside the network.

  • Upgrading everyone to the latest stable version (currently 4.0 Build 602).

The hardware upgrades should be rather quick (minutes if all goes well), and that is the only planned major downtime during this maintenance window.

Moving helpdesks around our network and upgrading them to the latest version will take several hours cumulatively, but each helpdesk should only be impacted for a few minutes. Your hosted helpdesk IP may change but your URL won’t (so ping it and update any firewalls, etc).

Thanks to those of you who are hosting your helpdesk with us! We have a lot of exciting improvements on the way (distributed hosting for large scale helpdesks, cloud-computing/virtualization-ready Cerb4, off-site backup deliveries, and more).

Update 15-May-2008: Our new expected upgrade window is Friday, May 16th from 9:00PM to 10:00PM Pacific Time. This maintenance period should have even less of an impact on Cerb4 hosted helpdesks than planned since we already did the helpdesk upgrades this afternoon with the release of our latest security patch.

 


Hosted Helpdesk (*.cerb4.com) Backup Policy Changes

Community May 14th, 2008

posted by Jeff Standen

Up until a few weeks ago we’ve been doing weekly off-site backups for our hosted helpdesk infrastructure. Local backups were done more frequently, but usually not as frequently as nightly due to performance considerations (and the fact there’s really no such thing as “nightly” to our international customer base — someone is always impacted).

Backups on our scale are a bit more complicated than a single helpdesk, where a simple mysqldump is usually enough to get the job done. We’re hosting helpdesk customers, especially with legacy Cerb2+Cerb3, who have been with us for years. Between those years of e-mail and attachments it’s not uncommon to see 10GB+ databases. That’s a lot of redundant data to be dumping out to *.sql files on the disk — not to mention the redundant cycles wasted on re-compressing it, and wasted bandwidth on transmitting it off-site.

We’ve done a few things to optimize our process over the years:

  • We use mysqlhotcopy to copy database content directly (*.frm, *.MYD) and we don’t back up indexes (*.MYI), as they can be fully regenerated during an eventual backup recovery.

  • With Cerb2+Cerb3 databases, we also don’t back up the search_index or trigram tables since they can also be regenerated with a ‘re-index’. These inefficient tables (thankfully!) don’t exist in Cerb4.

  • When we designed Cerb4 from scratch, we took what we learned about these accumulative Cerb2+Cerb3 inefficiencies and engineered with them in mind. That’s why our attachments are no longer in the database, and why we rely on database indexes for searching again instead of rolling our own indexes inside the database (among hundreds of other design improvements).

  • Having these Cerb4 attachments on disk, and named by their unique, incremental database ID, makes it trivial for us to do incremental backups (i.e. only backing up this week’s new attachments and not the entire directory or database table, then aggregating them with the off-site backups).

  • We’ve started using Amazon’s Web Services (EC2 + S3) to compress backups off-site. This saves a lot of processing power that otherwise goes to waste compressing backup data, just to delete it the next day (or week) and replace it with fresher data.
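The incremental attachment backup mentioned above boils down to remembering the highest ID already backed up and copying only what is newer. A minimal Python sketch of that selection step follows; it assumes the files are named with nothing but their numeric ID, and both the function name and the sample listing are hypothetical.

```python
def new_attachment_ids(file_names, last_backed_up_id):
    """Return attachment IDs newer than the last one already backed up."""
    ids = (int(name) for name in file_names if name.isdigit())
    return sorted(i for i in ids if i > last_backed_up_id)

# Hypothetical directory listing: each file is named by its database ID.
on_disk = ["1001", "1002", "1003", "1004", "1005", "README"]
print(new_attachment_ids(on_disk, last_backed_up_id=1003))  # [1004, 1005]
```

Because the IDs only ever increase, the backup job never has to compare file contents or timestamps to know what changed.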

We’re still a bit handicapped by these cumbersome databases on the Cerb2+Cerb3 hosting end, but we’re going to stop letting that affect the Cerb4 backup policy. We’re going to start doing daily backups for all Cerb4 helpdesks.

Running backups currently impacts our performance, in large part, because we’re copying dozens or hundreds of gigabytes from the local RAID to another location on the local RAID (so all disks are involved in jumping back and forth). It would be a lot more logical to do these backups to drives that aren’t involved in serving real-time content to users. This isn’t a revelation; it’s just something we’d sacrificed to economy pricing. Having our new hosting prices based on storage helps us factor these realistic costs in.

Through our optimizations and audits over the past couple weeks (to figure out what’s holding us back from keeping ever-more-frequent backups), we found a major inefficiency on our scale is “original_message.html” attachments. These are the HTML versions of incoming e-mail that usually have a plaintext alternative (which is what we display in the GUI). If an e-mail is HTML-only we end up generating the plaintext part from the HTML.

We host several helpdesks that have 500,000+ tickets, that never delete a thing (spam or otherwise), and that collect customer messages through website forms that always generate an unnecessary HTML part. One helpdesk in particular has 530,000 attachments, and ~527,000 of them are a redundant HTML copy of the messages already in the helpdesk.

From now on we’re not going to guarantee that we’ll back up attachments named “original_message.html”; that’s going to be at our discretion. This will save a lot of resources wasted on copying, compressing and transmitting redundant data. It will also shorten the backup windows.
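As a rough illustration of what that policy means for a backup job, here is a small Python sketch. It assumes each attachment is available as an (ID, display name) pair, which is a simplification invented for the example, and it reuses the approximate counts quoted above.

```python
def backup_candidates(attachments):
    """Drop redundant HTML copies; attachments are (id, display_name) pairs."""
    return [a for a in attachments if a[1] != "original_message.html"]

sample = [(1, "original_message.html"), (2, "invoice.pdf"), (3, "original_message.html")]
print(backup_candidates(sample))  # [(2, 'invoice.pdf')]

# Share of redundant attachments for the helpdesk mentioned above.
print(f"{527_000 / 530_000:.1%}")  # 99.4%
```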

Thanks!


More Pricing Tweaks (What a Bunch of Incrementalists!)

Community, Debate May 6th, 2008

posted by Jeff Standen

Putting a Price Tag on Bits and Bytes

Software is a tough business, but pricing software is an even tougher business. As obsessive full-time developers we often want as many people as possible to be using our projects, but we also have to strike a balance with the fact there are bills to pay and families to feed.

The price you’re paying for our software has to factor in the costs of ongoing development and support. In our case, the Cerb4 licensing cost is basically just passing around a (slightly-coerced) tip jar for people who like what they see so far so development can continue. As developers we like to constantly be stuck thinking about the future and what new things are possible from the pieces we’ve just built.

Customers tend to approach things differently. Very few prospective users are genuinely thinking about future potential when they see a price tag on software. They take a look through all the corners of a demo and website to get a feel for what is currently possible with the software – probably because they’ve learned over the years that developers can be incredibly undependable in their estimates about new development (and they’re right; but hey, we’re working with ethereal “thought stuff” here).

You’re probably thinking this lead-in sounds like a typical justification for raising prices, right? Well here’s the curve ball… we think we’ve been pricing Cerb4 too high based on its exciting future potential that (for the most part) only we can see right now.

A few months ago we switched over to per-user pricing because it seemed like a great way to offer a lower entry price for companies with few users, while companies with more users could contribute more (based on the theory software seats scale like office chairs). But that’s proven pretty artificial, and for these few months since that decision we’ve been dealing with several tough questions about per-user pricing, like: What if I want to deactivate workers who have left the company without deleting their history? What if we’re a small company that has dozens of volunteer helpdesk workers?

It has recently dawned on us that we’re just setting up an artificial obstacle for people with these worker limits. If you buy into our project, we want you to feel like you own the software. It’s not our style to litter the project with hooks and throttles just for the sake of cash flow. Just like overly-aggressive license validation procedures, it ends up punishing all the paying users out of the paranoia of going broke. The reality is that if we’re doing something genuinely useful we’ll find a way to keep the rent paid, the lights on and the fridge stocked. We may not become the next Microsoft or Salesforce.com with that mindset, but at least we won’t be holding you guys back when you decide to spend a few hundred dollars to use Cerb4 for your mission critical e-mail.

So here’s what we’re going to do: We’re going to remove the per-worker limitations for everyone. Those of you with 5 worker licenses are likely going “Woo hoo!” while those with 25+ worker licenses may be going “What the hell, Jeff?”. Please keep in mind what we’re trying to do here is for the project and community before ourselves, and the fact we may come off as manic-depressive in our search for ideal pricing from time to time has to do with the fact we’re developers, not prescient economists. We’re not fickle, we’re just incrementalists. Each tweak is based on a lot of ongoing observation about the impact of each decision in the real world (which is pretty hard to predict). We’ll continue to find ways to reward people whose early contributions have helped us grow over the years.

The Impact on Owned Licenses

We’re setting the “owned license” base price for the project to $499 with no artificial limitations. We’ve brought back the small business discount for companies making less than $250,000 USD gross revenues per year which knocks 20% off the price (to $399). Educational institutions get a similar 20% discount (also to $399). Users upgrading from Cerberus Helpdesk 3.x get a 30% discount (to $349, the discount used to be 50% but it was against $995 for unlimited, so you’re still saving about $150 more from this tweak).
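For anyone who wants to check the numbers, the discounts work out as follows in a few lines of Python. Truncating to whole dollars is an assumption on my part that happens to reproduce the quoted prices; the `discounted` helper is just for illustration.

```python
def discounted(price, percent_off):
    """Apply a percentage discount, truncated to whole dollars."""
    return price * (100 - percent_off) // 100

BASE = 499
print(discounted(BASE, 20))  # 399 (small business or educational)
print(discounted(BASE, 30))  # 349 (upgrade from 3.x)

old_upgrade = discounted(995, 50)          # previous 50% off $995
print(old_upgrade - discounted(BASE, 30))  # 148, i.e. "about $150"
```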

The Impact on “As-a-Service” Hosted Licenses

On the hosting “software as a service” end, we’ve stopped tiering pricing by users as well. The scaling issues we face in managing servers have very little correlation to total users, and more to do with how companies manage their e-mail behaviorally (e.g. never deleting spam, receiving lots of work-in-progress attachments). We’ve moved to a much more straightforward pricing model on hosting of just basing it on storage. All hosting plans from today will start with 5GB of storage for $49/mo. 5GB additional storage is +$25/mo and 10GB additional storage (prepaid) is +$42/mo. If this was purely dollars for hosted gigabytes it would be a bit expensive, but this covers the software, upgrades, support, phone support, backups, server monitoring at 4AM, and all those things which you’re glad we’re doing so you don’t have to.
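Here is the hosting price list as a tiny Python calculator. It assumes the two add-ons can be stacked and combined freely, which the post doesn’t spell out, so read it as a sketch of the arithmetic rather than an official quote tool.

```python
def monthly_hosting_cost(extra_5gb_blocks=0, extra_10gb_prepaid_blocks=0):
    """Return (storage in GB, price in USD per month) for a hosted plan."""
    storage = 5 + 5 * extra_5gb_blocks + 10 * extra_10gb_prepaid_blocks
    price = 49 + 25 * extra_5gb_blocks + 42 * extra_10gb_prepaid_blocks
    return storage, price

print(monthly_hosting_cost())                             # (5, 49)
print(monthly_hosting_cost(extra_5gb_blocks=1))           # (10, 74)
print(monthly_hosting_cost(extra_10gb_prepaid_blocks=1))  # (15, 91)
```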

With that Out of the Way

I can’t guarantee that several months from now I won’t be sitting here telling you something else based on future information and observations; but with all the information and history we have at our disposal right now we feel this is the best direction for funding Cerberus Helpdesk’s ongoing development (so we can continue to reach for all that exciting potential we’re always talking about).

If you have any questions about what we’re doing, how we’re doing it, or what we’re aiming for, ask away! If you’d like to rant, send me the URL to your forum thread and I’ll read through it and post my thoughts without getting defensive or cranky. Nothing is off limits.

Thanks!

-Jeff


4.0 Build 593 is the New Stable Release

Community May 6th, 2008

posted by Jeff Standen

Alright! We’ve just done another full round of QA tests against Build 593 to confidently declare it stable. We’ve been using it live on our own helpdesk for about a week.

There are a couple dozen tweaks and improvements included, but our major focus was on bringing knowledgebase management back to the worker GUI, integrating the public knowledgebase with the Support Center (to remove the need for two public tools) and performance improvements.

We’ve also switched over to requiring MySQL 4.1 to vastly simplify fulltext indexes for quicker searches, to get subquery and transaction support, and to move forward with internationalization. It was really useful to gather community feedback about this change ahead of time in the voting booth on the forums.

When we had to shuffle some of our hosted sites around during the server emergency last week, we ended up temporarily putting more sites on a single server than we normally would have liked. It was interesting to observe that about 100 concurrent copies of Cerberus Helpdesk 4.0 ran very usably on a mid-range Debian box (Dual Xeon 3GHz, 2GB RAM, 2×500GB RAID-1). In practice, for that kind of load, we’d throw more memory at the machine and probably bump up the RAID concurrency. The hard disks are almost always the bottleneck if you can’t keep databases entirely in memory.

Naturally we’re not in the habit of overselling a machine’s resources, but it gave us the opportunity to look at bottlenecks with that many copies of Cerb4 running. We used this information to add some last minute performance improvements to Build 593. One of the most effective was improved database indexing on the message_header table – which is used to thread incoming messages with the messages they’re replying to, as well as to display the original headers while reading ticket messages. There were a couple other missing indexes, and a few places in the code we could reduce unnecessary queries.
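To show why an index matters for that kind of lookup, here is a self-contained Python sketch using an in-memory SQLite database as a stand-in for MySQL. Only the table name comes from Cerb4; the column names, the index name and the query are hypothetical, so don’t read this as the actual schema or the actual index we added.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical columns; the post names the table but not its schema.
db.execute(
    "CREATE TABLE message_header ("
    " message_id INTEGER, header_name TEXT, header_value TEXT)"
)
db.execute(
    "CREATE INDEX idx_header_lookup"
    " ON message_header (header_name, header_value)"
)

# Threading lookup: find the message whose Message-ID an incoming reply cites.
query = (
    "SELECT message_id FROM message_header"
    " WHERE header_name = ? AND header_value = ?"
)
plan = db.execute(
    "EXPLAIN QUERY PLAN " + query, ("message-id", "<abc@example.com>")
).fetchall()
for row in plan:
    print(row[-1])  # the plan detail should mention idx_header_lookup
```

Without the index the planner has to scan every header row for each incoming message; with it, the lookup becomes a direct search on the two columns in the WHERE clause.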

For the knowledgebase, you’re now able to use nested categories again (as was heavily requested). The top-level categories of the knowledgebase are now called “topics”, and you use topics to decide what branches of the knowledgebase to display on your various public community tool instances. You can display more than one topic for a particular website. We’ve found it’s most effective to use your products and services as the basis of topics.

To take advantage of the new community tool changes you’ll need to go into ‘Helpdesk Setup->Community Tools’ and choose your knowledgebase topics to display for each tool.

You can find the full changelog for Build 593 in the forums.

 

 
