PLEASE BOOKMARK (usually control-D) THIS PAGE NOW SO YOU CAN FIND IT AGAIN IN CASE OF AN EMERGENCY!


If you are experiencing a problem not reported here, check our web panel for more information.
(Please remember, posting in the comments here IS NOT an official way to contact DreamHost.)

Blingy File Server Issue

Posted 1 week, 6 days ago (April 28th, 2008 at 9:21 am PST) by Justin

We are currently experiencing issues with the main file server for the blingy cluster. Our file server admins are on their way to the data center to look into the issue, and we are doing our best to get it working again as soon as possible. Until it is resolved, websites and email on this cluster may be slow or unavailable. We will post an update when we have further information or when it is fixed.

– Update Mon Apr 28 10:06:16 PDT 2008 –

The file server is back up and working. We are now fixing any web servers that are still having issues or need a restart after the file server problems. The file server admins are continuing their investigation into the cause.

Update 4/28/08 12pm Pacific — All servers are back up and running (except for flamenco, see post above). Sorry again about the inconvenience this has caused you. If you’re still seeing problems with your web or email, please let support know as it’s likely unrelated.

Looney Cluster having issues!

Posted 2 weeks, 1 day ago (April 26th, 2008 at 9:43 pm PST) by Brian

We’re currently investigating an issue on looney that’s causing web and mail service to be spotty or, more likely, completely down. This affects the following servers:

apok axl blanka brazil charm curly dhalsim egon guile limbo-looney moe slappy soy tank tofu vega winston

We apologize for this, and should have things working soon!

UPDATE: As of 10:15PM PST, this should be fixed now. If your site is still down, please email support and let us know!

Snocap Downtime

Posted 2 weeks, 2 days ago (April 25th, 2008 at 3:18 pm PST) by greg

Severity: Low   Resolved: No

The snocap server has been plagued with some intermittent issues. We’re working to resolve this as fast as we can, and will hopefully have more information shortly. If you’re on the snocap server, you could be affected by slow website performance, websites showing as down, slow uploads and downloads, and trouble updating your dynamic websites.

Central Frisky Server Down

Posted 2 weeks, 2 days ago (April 25th, 2008 at 1:55 pm PST) by mir

Customers on the Frisky cluster are experiencing a range of issues:

-slow site performance due to high loads
-slow data access times
-server crashes
-slow/unavailable mail

One of the file servers in the Frisky cluster has crashed. The admin team is working on getting it back up, but due to the nature of the server in question, the crash is affecting all servers that mount data from it (which includes many of the hosting servers and all of the mail servers in that cluster). We’ll update here once we have it fixed; if we cannot bring it back up within a reasonable amount of time, we’ll see about removing the mounts so that it has less impact. While this is a terrible situation, it has allowed us to pinpoint the cause of similar performance hiccups that we have been seeing with this cluster and should allow us to resolve those for our customers. We will update this post as soon as we have more information for you.

Note: to find out whether you are on this cluster, go to your panel here:

https://panel.dreamhost.com/index.cgi?tree=domain.manage&

and click on the ‘DNS’ link for any domain hosted there, then scroll to the bottom. If you are on this cluster, the mail MX records will have names like these:

mail MX 0 mx1.balanced.frisky.mail.dreamhost.com.
mail MX 0 mx2.balanced.frisky.mail.dreamhost.com.
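
If you have several domains to check, the same test can be scripted. The following is only a rough sketch in Python, not an official DreamHost tool. The function name `cluster_from_mx` is made up for illustration, and it assumes that MX hostnames follow the `mxN.balanced.<cluster>.mail.dreamhost.com.` pattern seen in the two lines above, which may not hold for every cluster.

```python
def cluster_from_mx(record_line):
    """Return the cluster label from a panel-style MX line, or None.

    Assumes lines of the form: <name> MX <priority> <hostname>
    and hostnames of the form:  mxN.balanced.<cluster>.mail.dreamhost.com.
    Both layouts are inferred from the example above, not from any spec.
    """
    fields = record_line.split()
    if len(fields) != 4 or fields[1].upper() != "MX":
        return None
    # Drop the trailing dot of the fully qualified name, then split into labels.
    labels = fields[3].rstrip(".").lower().split(".")
    if (
        len(labels) == 6
        and labels[1] == "balanced"
        and labels[3:] == ["mail", "dreamhost", "com"]
    ):
        return labels[2]
    return None


if __name__ == "__main__":
    line = "mail MX 0 mx1.balanced.frisky.mail.dreamhost.com."
    print(cluster_from_mx(line) == "frisky")
```

Paste each MX line from the DNS page into the function; a result of `frisky` means the domain's mail is on the affected cluster, and `None` means the line did not match the assumed pattern.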

UPDATE 16:23

We are currently moving the data drives from one file server to another. We are still a few hours from resolution on this issue. As soon as we have the drives moved over and the new hardware booting up correctly we will update this status post again.

UPDATE 17:26

This issue is now resolved. We have replaced the faulty hardware and moved your data drives over to the new file server. This should be completely resolved for you.

WordPress 2.5.1 is out!

Posted 2 weeks, 2 days ago (April 25th, 2008 at 11:30 am PST) by Jason C

A major security fix is out for the latest version of WordPress, and we suggest that you upgrade your sites via our One-Click Installer as soon as possible. For more information on the upgrade, check out the WordPress Development Blog.

Moving redhot onto new hardware

Posted 2 weeks, 2 days ago (April 25th, 2008 at 10:51 am PST) by JamesH

The webserver redhot has been having some stability issues lately, so we’re going to move it to new hardware in hopes of fixing it. This process should take about 30 minutes.

Update: This was finished at about 11:30am.

Looney Mail Issues

Posted 2 weeks, 2 days ago (April 25th, 2008 at 7:33 am PST) by mir

I apologize for the late post, but I wanted to make sure that we had a resolution for you before posting here. Last week I saw complaints about slowness and timeouts. With some really excellent feedback from customers, I was able to pinpoint when the issues were occurring and keep an eye on them, and with the help of the admin team we determined that the cause was a file server that needed a bit of offloading. We had the hardware available to add 4 TB of space to the cluster and are installing it today, so we just need to move some data off the troublesome file server and all will be well. This is expected to be finished within the next 7 days, so by then mail should be working properly again (it may be sooner, depending on the point at which performance returns to normal). This should also give the cluster enough space for the next few years (this is an older cluster that grows slowly, as no new customers are being added), so you guys are in great shape for a long time to come!

Central Database Issues

Posted 2 weeks, 3 days ago (April 24th, 2008 at 1:56 pm PST) by mir

Some bad code caused our session database to back up, which is resulting in errors and sluggish account control panel performance. The admin team is fixing this and should have it resolved in the next 10 minutes or so.

Update: this issue is now resolved

Webserver persephone being moved to new hardware

Posted 2 weeks, 4 days ago (April 23rd, 2008 at 3:11 pm PST) by JamesH

The webserver persephone has experienced a hardware failure and is being moved to new hardware. This process should take about 30 minutes, and I’ll post an update when finished.

Update 15:51: I’ve finished moving persephone to its new hardware, and it appears to be working properly. If you have any sites hosted on this webserver that are not working please contact technical support.

Update 16:05: It seems the new hardware I put persephone on is having some problems of its own, so I’ll be moving it again. I apologize for the continued outages.

Update 16:54: New new persephone should be working properly now. We’re going to continue to monitor it carefully, and of course if you have any sites hosted on it that aren’t working, please let us know.

Spunky Mail Issues

Posted 2 weeks, 4 days ago (April 23rd, 2008 at 8:23 am PST) by Justin

We are investigating what was causing customers some issues with pop and imap services this morning on the spunky cluster. At this time it seems to be working normally again but we are still testing and checking into the cause. Our apologies for the problems.

We believe we have found the problem server causing the downtime and it has been corrected. We’ve tested some email addresses on this cluster and they are showing pop, imap and smtp ok now. If you are still having issues please contact support with details.

Update April 28, 2008:

This was unfortunately not resolved, as we are having problems with this cluster again today (authentication errors which, from the reports I am seeing, are affecting only pop3, as webmail is functioning). We have isolated the problem to the mail database and our database expert is correcting it now. We’ll add another update once we have more details from him.

Added information: we have a ‘master’ mail database and two clones so that we have plenty of redundancy and do not lose data. What happened here is that one of the clones was acting up and refusing logins (I believe webmail is set to use the master by default, which would explain why webmail was working). For now we’re having normal mail use the master as well until we can work out the issue with the clone, so mail is working properly again while we make sure we have this fixed for you.
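
For the curious, the general idea of routing logins around a misbehaving clone can be sketched in a few lines of Python. This is purely an illustration of the concept, not our actual mail configuration: the function `pick_auth_database`, the backend names and the `disabled` set are all hypothetical.

```python
def pick_auth_database(master, clones, disabled):
    """Return the first clone that is not marked as disabled, else the master.

    `disabled` is a set of backend names that an admin has taken out of
    rotation, for example a clone that is refusing logins.
    """
    for clone in clones:
        if clone not in disabled:
            return clone
    # No usable clone is left, so fall back to the master database.
    return master


if __name__ == "__main__":
    # Hypothetical names; taking both clones out of rotation sends all logins
    # to the master, similar in spirit to the temporary workaround described above.
    choice = pick_auth_database("master", ["clone1", "clone2"], {"clone1", "clone2"})
    print(choice == "master")
```

Marking only the faulty clone as disabled would keep the healthy clone in use; marking both sends everything to the master until the clone is repaired.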

Quick note (04/26/2008):

We had a hiccup on a specific mail server (just one out of the 9 on that cluster) which was giving errors similar to those resulting from the database server issues, but we have fixed it and it is not actually related (just to alleviate concerns that the same problem was coming back).

Update: this was resolved per the more recent post on the subject and everything should be back under control.