Copenhagen’s been chewed up

Posted (October 25th, 2009 at 4:46 pm PST) by Jason C

Due to a hardware failure on “copenhagen”, our admin team is in the process of migrating the server to new hardware.  It hopefully shouldn’t take too long to move everything over — but we’ll be posting updates here as the issue progresses.

Update – Oct. 25th, 5:24 PM (PST): Due to the state of the server, fixing everything up is going to take a little longer than normal.  So we were kind of dead wrong on our initial estimate.  Sorry about that!  We’re doing our best to get this resolved as fast as we can, however.

Update – Oct. 25th, 6:27 PM (PST): As of this moment, we’ve decided to go ahead and mount your data from our most recent backups. During this time we will be restoring your content from our backups to the new hardware.

Update – Oct. 26th, 7:10 PM (PST): Due to the filesystem flipping itself into “read only” status earlier today, the restore is taking even longer than anticipated.  It is still running and all content is expected to be restored eventually — but please note that the server will be running at reduced capacity (see: slower) until everything is back to normal.

Update – Oct. 26th, 9:25 PM (PST): One of our admins has restarted the Apache services on the server to help knock the load down.  This should stabilize the sites that are up and running for the time being.  Basically, the sites that are still missing are causing some Apache services to hang.  That ends up driving the load up.

One of our techs who has experience with recovering servers that have seen problems like this will be taking a look at things shortly.  Once we know where the recovery effort stands, we’ll make sure to update this article with more info.

Update Oct. 27th 12:32 AM PDT: I was able to get the array back into a workable state with no visible corruption. I have now started copying the data over to the new copenhagen. The ETA for all data to be restored is probably around 24 hours. Any data you upload that is newer than the original should not be overwritten.

Update Oct. 27th 4:02 PM (PST): The data is still copying and on target to be finished in the (hopefully early) AM hours of the 28th.  Once the copy has completed, we’ll drop another update here.

Update Oct. 27th 10:50 PM PDT: All of the users we didn’t originally have backup data for have been restored from the original hardware except the largest 16 users (so the vast majority have been recovered). Users have been recovered in order of disk space usage (smallest first), so all that remain are the largest users on the server. If you are on copenhagen, have less than 10GB of data in your user’s directory, and are missing files, please contact support. This will be listed as completely resolved once these final 16 users are copied.

Update Oct. 28th 10:28 PM PDT: For several hours now, all the data for which we were missing backups (because of recent moves to the server) has been restored from the original hardware. If anyone else is missing data, please contact support to have your user manually restored.

This entry was posted on Sunday, October 25th, 2009 at 4:46 pm and is filed under General Outages, Single Server Issue. You can follow any responses to this entry through the RSS 2.0 feed. Both comments and pings are currently closed.

97 Responses to “Copenhagen’s been chewed up”

Hugs for the admins working on a sunday

My stuff on scion seems to be down. Anyone else on Scion having trouble?

Scion seems to be back up, thanks!

Isn’t there a way DH admins can predict a hardware failure and prevent it from happening in the first place? I am thinking more of hard drive failures, but maybe there are monitoring programs for other hardware as well?

Error: Database connection failed.

It is possible that the database is overloaded or otherwise not running properly.

The site administrator should also check that the database details have been correctly specified in config.php

When will my web site be back up?

*sigh*

We just got moved to Copenhagen a couple weeks ago (and not at our own request, so far as I know).

Lots of clients down, at very critical times for them. :-(

We were on Arrow. Any way those could be brought back up as a backup?

Is Brazzaville down, too? My site isn’t working..

My client’s site is down, too – on brazzaville / dopey.

Can we get a status update? And what exactly does “chewed up” mean?

Yes. Rather annoying as it’s now been HOURS. No updates and no working site. Hundreds of clients on our site with no access. Especially annoying with a photo hosting site….

It’s very frustrating that this is happening now..is there any clue as to when this will be up and working? I really love Dreamhost and I think the support rocks, but this has been happening too much lately.

10:18am here in Europe, and all sites I host are down. Hopefully it will be resolved soon as my subscription runs out in a few days and I was hoping to get donations sorted!!

Really nice info.

SearchForMission: This kind of thing isn’t totally avoidable. Hard drives can break even without a warning from SMART, fans can fail, etc. And when you have, say, 200 servers, you can expect a drive to break every couple of weeks. :( At least we don’t have to fix it ourselves.

7am eastern time…..still out……all sites returning error id: “bad_httpd_conf”. Any updates from DH?

@Linto – 200 servers? DH has thousands of servers.

@Brandon -
http://wiki.dreamhost.com/Bad_httpd_conf

Still down. clients complaining – Anybody know/heard any updates from DH? Can DH please provide an update on the situation?

@brasscrest They actually have 118

Failures happen, but this is over half a day! I was moved to Copenhagen just a couple weeks ago as well….just in time for it to fail.

Over 13 hours and still down. The only thing that shows on my website is text that says “It works!”…which is ironic, because it doesn’t work…

@Linto – where did you get that information?

We’re still down. I thought it would be up by now. You guys should have at least forewarned us of the severity of this issue if it was going to take longer than 6-12 hrs. You had 2 updates an hour apart 12 hrs ago…

Still down..this is a nightmare..you guys aren’t going out of business are you?

any update please? how recent is the backup you are restoring?

Any updates?? To all others: Have you sent Dreamhost a notification? This is taking too long!!!

What the HELL!
When will this be fixed?
Is Dreamhost going to give me a couple of free months for the money I might be losing from potential clients not being able to get to my sites?
http://www.bryanmitchell.com
http://www.bryanmitchell.com/wp

Not that you can go to them right now. Still down in Detroit around 11:30 AM EST.

This is absurd. 12 hours and no resolution and no update?

And yes, we’ve sent MANY messages, no answer.

Grade: F

I think Dreamhost is about to go down in flames…it is odd to have no current updates, no response to support emails. I really do think they may be going under and I am getting really scared.

This is the only response I got when I tried to contact them last night:

Your inquiry has been moved to the queue of a specific tech support team member (this is either because they are already familiar with your case or are the best equipped to assist you with this specific issue). They will respond to you as quickly as they can but depending on the complexity of the issue it may take longer than normal for them to get back to you (even in excess of 24 hours in some cases).

Thanks!
The Dreamhost Ticket Moving Robot!

They did so well at keeping us updated at first, but now they just leave us hanging??

Well, at least you got a response

this is no good! We got no updates from them either!!….Something is not right, it should NOT take this long, and no feedback???

What is reasonable time to fix such a problem? 12 hours- 24 hours???

Guess Copenhagen is still down???
Haven’t heard a thing, site still down….

Still down for me too

Any good news?

This is turning into a nightmare for me…

Yes! It’s a nightmare. My clients are pissed off right now! Has anyone been in direct contact with any of the following employees at DH lately? http://www.dreamhost.com/aboutus-profiles.html

Can’t get any feedback from the support desk/service, so I am wondering if anyone has direct contact info for people at DH.

Another 3 hours and it’ll be one day. One WHOLE DAY. 24 feekin hours. 1/7 of a week. 1/365 of a year.
Yesiree…ONE DAY OF OUR LIVES THAT WE NEVER GET BACK.

Going to drink now.

It’s not the fact that it’s down that is upsetting, it is the fact that there have been absolutely zero updates. I understand that when people are fixing issues that they can’t break away every 5 minutes to say that they’re still working on things, but even so – there hasn’t been an update for an eternity. I have people asking me the situation, and I can’t tell them anything….

It’s been down the whole day. Can we please get some visibility on this? It’s getting ridiculous.

This is absurd. Any updates from the DH team forthcoming?

error id: “bad_httpd_conf” still on all my sites.

Another bump on this thread, just to keep it alive…

lets create the Copenhagen union ;P

I used windows 7 wizard

My clients are outside with torches… they want my head on a stick…

C’mon guys… I need an answer!!! Now!!

Anyone know of any good web hosts?

Downtime is a part of life…….not updating the status of the server or contacting clients is NOT.

I’m thinking it’s time for blue host…I am stunned…no answer to any of my emails..no updates, not even an automated response from my support emails..how scared are you guys that something is seriously wrong here

Over 24 hours down…zero replies to messages sent…no updates in quite a long time…at a minimum, we need responses, DH. A server goes down and you need to transfer data to another server…it didn’t take you this long to move us to Copenhagen, so it shouldn’t take this long to move us to a different server…

It’s not so much that they’re taking this long… it’s the complete radio silence. I don’t even need an ETA, just a sign that things are happening.

Seriously, DH needs to get their act together. Whether it’s 118 (according to Linto) or 1118 servers, only this one server is down at the moment. How hard is it, or how long does it take, to copy the backup data to the new hardware?

This reflects their poor infrastructure design and a lack of human resources to handle the hosting service.

Guys – get real. Listen – IT IS A $4.99 PER MONTH HOSTING PACKAGE. If you want to build a business then use PS not a shared hosting service. DH in my experience are as good as any others – the fabled Bluehost included. I think the server is coming back up real soon – all my files are restored but the web server still has an issue. I expect they are working on this too and things will be back to normal real soon.

It was showing in my Panel that the problem has been resolved, which it is NOT! One good thing, I guess…I can finally FTP into my server. So, perhaps we’re getting somewhere. Still very unacceptable with the lack of communication.

Finally ! Only took about what…..27 hours?
Kudos to those that had to migrate to a new server, but shame on DH for not replying to support requests or posting any info here.

Amen, Phil. 27 hours seems a bit on the long side. But an update would have certainly made it a lot more acceptable. I’m back up as well…

Yeah – the site is back up but very slow – load averages on server:-

load average: 255.57, 196.86, 116.14

never seen a LA so high in my life without a server crash…..

OK – load averages coming down rapidly now…..lets hope this is it BACK UP!!!

Seems like load is really coming down. I’m up and running at pretty good speeds…

Still down, down, down.

This is unacceptable at ANY price. I’m looking forward to a big refund.

Will all data be back in 24 hrs?

The server is still up and down quite a bit. I expect they are ironing out the last few details.

Load averages still above 8

I am still down!!!!
Ugh!

Did anyone else’s crontab disappear?

Mine is gone, and when I went to edit it with crontab -e it tried to create a new one for me. When I went to save it, it had a “Permission denied” error.

I have a number of sites back up and running – unfortunately my main site is still down. I hope it’s up soon!

I’m still down, anyone else in my boat???

Yep- still down. All our sites.

If you’re still getting “Bad httpd_conf” errors on FULLY HOSTED domains, try this:

First, use whatever FTP tool you normally use to log in to the domain in question. If it doesn’t give you some message about the Home directory not existing, then chances are your site’s data is at least partly back up. You should be able to verify this from the FTP.

If the site’s data appears to be in place but the site still gives that error, go into DH Panel, to Manage Domains, and click the “Edit” button located underneath “Fully Hosted / User …”. On the following page, don’t change anything (unless you really want to take the opportunity to do so), then click “Change settings” at the BOTTOM of the TOP section, “Fully Hosted” (should be marked with a green arrow as the currently active selection).

You should be back to the Manage Domains listing with a message telling you the change had been successfully queued and will take effect in 5–10 minutes. Also, you should see a stopwatch icon to the left of the name of the domain you just did that to. Either refresh the page every minute or so, or take the opportunity to do this to more domains.

In either case, when the stopwatch disappears from the listing of that first domain, try to browse to it again. It should come right up!

Note that doing this will NOT work on domains that have not yet been restored (for obvious reasons)!

I strongly suspect that the DH gurus will run a batch task that will in effect do this command on all your domains once they’ve all been restored, but if you have some sites that just can’t wait and you can FTP to them and see that their data is live, or at least that the Home Directory is present and that you can re-upload the site to it, doing this trick should bring the site up even BEFORE that batch task is run!

Another important note for those moved to Copenhagen from other servers, that wasn’t mentioned in the automated Email sent to you for the move:

Some server variables may have changed. In particular, if you used “$_SERVER['REDIRECT_REMOTE_USER']” in PHP to retrieve the name of the logged-in user (via Apache .htaccess control instead of your own credentials management system), say, to redirect them to their own “home” directory, or to filter database results by a column containing the login names that the records “belong” to, you’ll need to change that to simply “$_SERVER['REMOTE_USER']” (remove the “REDIRECT_” part) or your script will no longer work properly in response to user logins.
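
For anyone whose scripts are not PHP, the same fix can be sketched in Python. This is only a minimal illustration, assuming a CGI-style environment mapping (such as os.environ under CGI) in which Apache exposes the login name. The helper name authenticated_user is hypothetical and does not come from the note above. It checks REMOTE_USER first and falls back to REDIRECT_REMOTE_USER, so the same script should keep working whichever of the two names the server sets.

```python
import os


def authenticated_user(environ):
    """Return the Apache-authenticated login name, or None if neither key is set.

    `environ` is any dict-like mapping of server variables (hypothetical helper;
    the key names are the two mentioned in the note above).
    """
    for key in ("REMOTE_USER", "REDIRECT_REMOTE_USER"):
        name = environ.get(key)
        if name:  # skip missing or empty values
            return name
    return None


if __name__ == "__main__":
    # Under CGI the server variables arrive in the process environment.
    user = authenticated_user(os.environ)
    print("logged-in user:", user if user else "(not authenticated)")
```

Checking both names, rather than swapping one for the other, avoids having to edit the script again if the account is moved to yet another server.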

Still Down.

This is Day 3.

Still unacceptable at any price. They could have BUILT a server and moved the data over in much less.

Well, as soon as everything is back up I’ll be moving my hosting to a more reliable company. I’ve been with DH for 3 years now and I’ve never been so appalled by the service and updates. This is just totally unacceptable. I can’t have clients’ sites down for 3 days.

I have already lost 2 clients because of this, I can’t afford to lose any more.

It’s almost 24 hrs!! Still down!!

IMPORTANT!!

You MUST open a support ticket to get credit for this outage from Dreamhost! And even then, you’ll only get credit from the time you open the ticket!

http://dreamhost.com/hosting-100-percent-uptime-guarantee.html

That’s an AWFUL requirement.

Just went to submit another ticket, 3rd one… and tried the live chat not expecting much. I was right…
Please wait for a site operator to respond.
You are now chatting with ‘Mike S’
Mike S: Good morning! What can I do to help you?
you: Mike any update on the status of copanhagen
Mike S: Were you waiting on an update for it? Is there an issue with the server? It looks to be running fine for me.
you: Mike have you checked the copanhagen blog?? it’s been down for over 24hrs???
Mike S: Ahh, it looks like it is still having issues with some home directories.
Mike S: I see that you’ve got a ticket filed about this already. I’m going to move it over to our administrators so that they can get back to you when there’s more information about what’s going on.
you: I’ve been down for over a day now, lots of unhappy campers, anything I can do on my end to speed up the process
you: appreciate anything you can do.
Mike S: Unfortunately there’s nothing that we can do. I’m terribly sorry about that. :(

TO ALL: Please read my message above about how to refresh the httpd_conf! Try that!

In addition to JoelMMCC’s suggestion, also try emptying your browser’s cache, if it supports doing that. (For example, in Safari on the Mac go to Safari -> Empty Cache…) I noticed I was still getting the “Bad httpd_conf” message in Safari even though my home directory was no longer missing, so I emptied caches and reloaded again, and my site came up. So in my case, I was having a local browser cache issue.

hmm…… all sites have gone down again. I had 2 or 3 back, but they’ve gone under now :(

Well, right now my clients not only think that I am a terrible option. Now they are also talking about my lack of knowledge because I told them that everything was fixed… Thank you very much!!!

I’m really upset… I can understand that shit happens, and to be honest I have always received excellent support from DH.

But this is unacceptable. I’m losing money. Got it? Fix this NOW.

Still down, down, down.

It’s been three days now. Maybe it will take a week. Maybe a month.

What would it take for Dreamhost to lose 10% of its customers? 20%?

I would say that getting the word out on this really dramatic fail would do the trick.

@JoelMMCC, your suggestion worked for me once I was able to ssh into my domain. Thanks for the tip.

“…up to 10% of your next pre-paid hosting renewal fee.” That’s pretty weak. So for a $10 plan, they’ll credit me $1 at most?

I’m up now, but I don’t trust it yet – they say they won’t be finished with the server until early tomorrow!

Looking forward to an oh-so-humorous explanation in the next newsletter.

I have 2 sites whose data is all lost

We’re past the wee hours of the 28th. Sites not up. Am I to understand that if my site is not currently up, that means I’m one of the 16 users with lots of data?

WOW, it’s slow right now. My Rails (Passenger) apps won’t even start up anymore. (My PHP sites work fine)

> uptime
09:08:41 up 17:55, 11 users, load average: 417.57, 428.02, 416.70
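
To put numbers like these in context, a load average is usually read against the number of CPU cores on the machine. Here is a minimal Python sketch, assuming the uptime line format shown above. The function names and the core count of 8 are hypothetical, since nothing in this thread says how many cores copenhagen has.

```python
import re


def parse_load_averages(uptime_line):
    """Return the (1-min, 5-min, 15-min) load averages from an `uptime` line."""
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load average found in: %r" % uptime_line)
    return tuple(float(value) for value in match.groups())


def load_per_core(load, cores):
    """Load divided by core count; values well above 1.0 mean work is queuing."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load / cores


if __name__ == "__main__":
    line = "09:08:41 up 17:55, 11 users, load average: 417.57, 428.02, 416.70"
    one, five, fifteen = parse_load_averages(line)
    assumed_cores = 8  # assumption for illustration only
    print("1-minute load per core: %.1f" % load_per_core(one, assumed_cores))
    # A 1-minute figure below the 15-minute figure suggests the load is easing.
    print("load easing" if one < fifteen else "load not easing yet")
```

On a live Linux shell the same three figures are also available from Python's os.getloadavg(), without parsing any text.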

Yep…just when things seem to be going well, it’s back to being so slow that it’s not even usable…

And then DH marks a ticket as resolved and it is NOT resolved. The problem continues…

@Reuben, you’re welcome. Glad it helped somebody else. The info I posted was adapted from the DH Wiki.

HELP!! The absolutely VITAL mod_rewrite is missing from Copenhagen!! Drupal can’t go anywhere without it!

Back to being incredibly slow… This is getting frustrating.

Seems like the job is not done. My sites won’t load in the past few hours.

@Phil…mine neither. I submit trouble tickets, they get marked as resolved, and nothing changes. Furthermore, I got a response a while back that my website had too many connections and they have throttled my connections, that the problem is my website, and I need to redesign it. HA! It was just fine before they moved me to Copenhagen, it was fine after the move to Copenhagen but before it crashed over the weekend…now all of a sudden my website is a problem?! Gimme a break…

it is still chewed up – freekin’ load averages over 300 and so slow as to be unusable.

My sites are all not loading, my ftp cannot connect. Explanation please.

Why is it down again????

Everything looks OK for me at the moment, but it was pretty dicey yesterday. The frustrating thing is the lack of updates from Dreamhost. Is the migration complete? Were there complications? Can we expect more outages and excessive load today? Dreamhost is silent…

Our files have yet to be recovered. Every time I FTP into our account via the panel interface, there are still no files!!! They have been missing for A WEEK!!! WHAT IS GOING ON!

 
© 1996-2009, DreamHost.com