Working closely with our customers and community, we’ve identified an issue that could affect the integrity of MindTouch-powered sites. The MindTouch engineering team has released an update to remedy the issue, which applies to MindTouch versions 10.0.x and 9.12.x. If you have an earlier version of MindTouch, we urge you to upgrade to the latest version and apply the fix.

You can find more information on this update at the MindTouch Community Portal Blog, along with instructions on how to apply the fix.

Last week we completed the migration of MindTouch Deki Express (aka wik.is) to the Amazon Elastic Compute Cloud (EC2). We’ve been seeing great growth from our Express offering lately, and we wanted to make sure we continued to offer the great uptime and performance our users have come to expect.

A ton of work went into this migration and I’ll give an overview of some of the problems we had to solve to make the migration a success.

For those of you who don’t already know, EC2 is a cloud computing infrastructure which allows you to dynamically allocate virtual servers based on demand. While EC2 by itself is a fantastic technology, it’s missing a few key components like a comprehensive management platform. For this, we chose to use the RightScale Platform.

RightScale builds on top of EC2 and provides:

  • pre-configured server templates
  • fully scriptable (and repeatable) server configuration
  • ability to clone scripts, templates, deployments (collection of servers)
  • performance graphs, monitoring, alerts

We are also using Amazon’s Elastic Block Storage (EBS) for persistent, reliable storage volumes. EBS allows us to snapshot our volumes for quick recovery in case of a crash.

This whole system allows us to start up new instances and scale out at the click of a button! If the system comes under heavy load, more servers are automatically added to our cluster, ensuring great performance for our Express users :)

After many iterations, we came up with the following architecture:

I’ll provide a brief overview of the infrastructure behind the new deployment here, but for a more detailed view, check out the MindTouch Developer wiki.

Load Balancer:
For our load balancer, we are using HAProxy. HAProxy is a screaming fast software load balancer. It fit our requirements perfectly and RightScale had pre-built scripts for configuring and adding/removing backend web servers.
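To give a feel for why HAProxy fit so well, here’s a minimal sketch of a round-robin web farm configuration (addresses and server names are made up; in our deployment RightScale’s scripts generate this file and add/remove the server lines as frontends come and go):

```
listen webfarm 0.0.0.0:80
    mode http
    balance roundrobin
    option httpchk GET /
    server web1 10.0.0.10:80 check
    server web2 10.0.0.11:80 check
```

The `check` keyword makes HAProxy health-check each backend, so a dead frontend is pulled out of rotation automatically.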

Apache (PHP/Deki API):
To keep things simple, we chose to host the PHP and C# bits on the same server (though they could easily be separated and scaled independently). When an Apache instance boots, it registers with the load balancer and starts accepting requests. We did performance testing using apachebench and found the High-CPU medium EC2 instance to offer the best price/performance ratio.

MySQL:
Our MySQL servers are set up in a master/slave configuration. Both master and slave are EC2 large instances with the data stored on an EBS volume. We take daily snapshots of the slave database. If the master fails, the slave can easily be promoted to master. If the slave fails, a new slave can be launched and populated with the data from the latest snapshot, and replication can resume.
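A minimal sketch of the my.cnf settings behind such a master/slave pair (the server IDs and data directory here are illustrative, not our actual configuration):

```
# master
[mysqld]
server-id = 1
log-bin   = mysql-bin
datadir   = /ebs/mysql        # data lives on the EBS volume

# slave
[mysqld]
server-id = 2
relay-log = mysql-relay-bin
read-only = 1                 # remove this when promoting the slave to master
```

The binary log on the master is what replication streams to the slave, and it’s also why a slave restored from a recent EBS snapshot can simply catch up from where the snapshot left off.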


Deki Configuration:
Our Deki config is fairly complex. Obviously, we are using a multi-tenant configuration. In the multi-tenant setup, instead of fetching configuration information for each wiki instance from the mindtouch.deki.startup.xml file, we fetch the data from a web service. This web service uses its own database to manage wiki instances. We also run our extension services and Lucene index on a separate EC2 instance. Finally, we are storing our PHP sessions in memcache (with memcached running on our master and slave database instances). This allows us to launch any number of PHP/API servers and round-robin the requests.
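Pointing PHP’s session handler at memcache takes only a couple of php.ini lines (this assumes the PECL memcache extension; the hostnames are placeholders for our database instances):

```
session.save_handler = memcache
session.save_path    = "tcp://db-master:11211,tcp://db-slave:11211"
```

Because sessions no longer live on any one web server’s disk, the load balancer can send each request to whichever frontend is least busy.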

Email Configuration:
Sending emails directly from EC2 instances is problematic. There are some spam filters that reject email from EC2 hosts so we decided to use an email relay. We configured postfix to send emails through our email provider (01.com) using SMTP auth.
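The relevant postfix settings in main.cf look roughly like this (the relay host and credentials file are placeholders, not our actual values):

```
relayhost = [smtp.example.com]:587
smtp_sasl_auth_enable = yes
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
smtp_sasl_security_options = noanonymous
```

With this in place, every outbound message leaves via the authenticated relay rather than directly from an EC2 IP, so it isn’t caught by EC2-hostile spam filters.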

Auto Scaling:
RightScale provides a monitoring, alerting, and auto-scaling system. The scaling process is pretty simple and is based entirely on voting. When a frontend instance’s load stays over the defined alert threshold for some time, it votes for “growth”. If at some point a majority of the voting instances are voting for “growth”, a new frontend is launched to help them out. Scaling down works exactly the same way: instances vote for “shrink” when their load stays under the alert threshold.
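The voting idea can be sketched in a few lines of shell (this is my own illustration of the logic, not RightScale’s actual implementation):

```shell
#!/bin/sh
# Each frontend reports "grow", "shrink", or "steady".
# Act only when a strict majority agrees.
decide() {
    grow=0; shrink=0; total=0
    for vote in "$@"; do
        total=$((total + 1))
        case $vote in
            grow)   grow=$((grow + 1)) ;;
            shrink) shrink=$((shrink + 1)) ;;
        esac
    done
    if [ $((grow * 2)) -gt "$total" ]; then
        echo "launch new frontend"
    elif [ $((shrink * 2)) -gt "$total" ]; then
        echo "terminate one frontend"
    else
        echo "no change"
    fi
}

decide grow grow steady     # → launch new frontend
decide grow shrink steady   # → no change
```

Requiring a majority (rather than reacting to a single instance’s alert) keeps one busy or idle server from flapping the cluster size up and down.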

As you can see, putting together this deployment took a great deal of work. However, there are many benefits. The new site performs better, can scale automatically, and gives us better disaster recovery. Finally, since everything is fully scripted, we can “clone” our deployment in RightScale, change a few configuration inputs, configure our DNS and launch the entire deployment with a single click! In fact, we’ve already seen our system auto-scale up (added more servers automatically!) under peak usage for Deki Express!

Since the 8.05 Jay Cooke VM release, Debian has announced several security updates which affect the Deki Wiki VM. Because reading the debian-security-announce mailing list probably isn’t your idea of fun (though I think it is), we’ve started tracking the Deki Wiki-specific updates on the DekiWiki VM Security Updates page.

One of the latest vulnerabilities is particularly annoying. According to DSA-1576-1:

The recently announced vulnerability in Debian’s openssl package (DSA-1571-1, CVE-2008-0166) indirectly affects OpenSSH. As a result, all user and host keys generated using broken versions of the openssl package must be considered untrustworthy, even after the openssl update has been applied.

To apply the security fix for openssl and openssh, you’ll need to run the following commands (as root):

apt-get update
apt-get upgrade
apt-get install openssh-server openssh-client

This will regenerate a secure host key for you. The next time you log in via SSH you will most likely receive the following error message:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that the RSA host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
52:8e:93:04:64:a5:7e:ac:c8:2c:2b:9a:96:ad:66:32.
Please contact your system administrator.
Add correct host key in /root/.ssh/known_hosts to get rid of this message.
Offending key in /root/.ssh/known_hosts:58
RSA host key for 192.168.1.215 has changed and you have requested strict checking.
Host key verification failed.

For most people this is simply an annoyance. However, if you have any automated processes that use the old ssh keys to log in, you will need to update your keys. The DSA has a lot of good info, as well as instructions on how to use ssh-vulnkey to identify weak keys, so I highly recommend giving it a good read.

As always, if you have any questions please drop by the forums or IRC!

Since everyone at MindTouch has decided to jump on this newfangled blogging bandwagon (just another fad like the Internet, IMO), I figured it was a good time to do my first post.

At MindTouch, we use svk to synchronize our private SVN repository with our public svn repo over at SourceForge.net. Over the weekend, that sync broke for some reason and I was forced to actually learn how the magic worked. I put together a document describing how our svk sync works here:

Synchronizing SVN Repositories with SVK