What's Up?

If you are experiencing a problem that has not been reported here, check our web panel for more information.

(Please remember, posting in the comments here IS NOT an official way to contact DreamHost.)

File server issues (Finished!)

Posted (February 19th, 2007 at 2:36 am PST) by kitchen

One of the fileservers that smithers, ralphie and homer mount is having some issues, and some sites are being affected. Our fileserver guru is looking into it right now and hopefully will have a resolution soon. Check back here for updates on the progress.

We sincerely apologize for the inconvenience.

UPDATE(Feb 19 @ 8:06AM Pacific): It appears that a file server lost a volume sometime last night. The file server suffered a multidrive failure, which caused the RAID 5 volume to fail. We have replaced the drives and are restoring from backups. This will take at least an hour. Affected customer sites will come back online as their data is restored. There are 36 users directly affected by this issue. Since we do not post usernames here, this is the list of servers affected:

homer otto dawber gotcha tak homer marge lisa chalmers wiggum nelson kearny cletus ralphie flanders burns mcclure seamus millhouse smithers willie annie cheeky dib

We’re sorry about the previous sparse information.

UPDATE(Feb 20, 2007 @ 10:17AM Pacific): The file server has been back online since the previous posting. The only people affected by this are the 36 people who were missing the data in their home directory. We’re restoring it as fast as we can, but 4TB of data simply takes a long time to copy. Only 18 users left to copy. The servers above should not be feeling any ill effects of not having this data there. We’re sorry about the downtime. The data is copying as fast as the hard drives will let it.

UPDATE(Feb 21, 2007 @ 10:55AM Pacific): The restore program is on to the last 2 users. We are also working internally on ways to make this go faster. This is the first time we’ve had to run a restore of this magnitude on this particular type of file server. Hopefully, if/when there is a next time, we will have a faster method in place. The backups are all there; it just takes a very long time to copy them over.

UPDATE(Feb 22, 2007 @ 10:34AM Pacific): The restore program is finished! If you are still missing files from this server, you should contact support.

This entry was posted on Monday, February 19th, 2007 at 2:36 am and is filed under General Outages, Multiple Server Issue.

54 Responses to “File server issues (Finished!)”

I submitted a server-wide outage report 40 mins ago (so that I’d get notified when the problem was solved).

I’ve just got an email saying there is no server-wide problem affecting ralphie, which contradicts this blog entry (and the fact that my site is still down!).

It’s usually safest to assume that there is a problem until you have visited your site that day…

Site still down … I’ve submitted another support request.

Why would you continue to submit support requests on the same subject? Obviously they know that the server you are on is having issues, so all you’re doing is backing up the support queue for everyone else’s issues. As for the system status tool, I don’t even use it anymore - all it EVER says is “no system-wide outage”. Do us all a favor, don’t stuff the support queue full so the rest of us can’t get responses in a decent amount of time.

Hear, hear to Matt.

Also, DH has extensively reported the fact that outages are reported on this blog, NOT the old system outages area.

The first support request was because the “system-wide outage” tool said there was no problem.

The second request was a follow-up to the first (and referenced the first one’s id) because they had reported here that the problem had been solved when it had not (and still hasn’t). How else are we supposed to draw their attention to the fact that a problem they think they have fixed is still occurring? Surely support requests are the correct mechanism to do that …

@Richard

The system-wide outage tool is generally 100% automated. It looks for other users reporting similar problems on the same server before it “verifies” an outage. There is sometimes human intervention, but not always. Any information posted on the status page will always trump the automated reply from that tool.
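As a rough illustration of the behavior described above, here is a minimal sketch of a tool that only "verifies" an outage once enough distinct users report the same server. The threshold value, the function name `verified_outages` and the report format are all hypothetical; nothing here reflects how DreamHost's actual tool is implemented.

```python
from collections import defaultdict

# Hypothetical threshold: how many distinct users must report a server
# before the outage counts as "verified". The real value is not stated.
REPORT_THRESHOLD = 3


def verified_outages(reports, threshold=REPORT_THRESHOLD):
    """Return the set of servers reported by at least `threshold` distinct users.

    `reports` is an iterable of (user, server) pairs. Repeat reports from the
    same user for the same server are counted once.
    """
    users_by_server = defaultdict(set)
    for user, server in reports:
        users_by_server[server].add(user)
    return {
        server
        for server, users in users_by_server.items()
        if len(users) >= threshold
    }


if __name__ == "__main__":
    sample_reports = [
        ("user_a", "ralphie"),
        ("user_a", "ralphie"),  # duplicate report, counted once
        ("user_b", "homer"),
    ]
    print(verified_outages(sample_reports))
```

Under these assumptions, a single user filing repeated reports never crosses the threshold, which would be consistent with getting a "no server-wide problem" reply while the site is in fact down.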

Database outages, too?

2 site forums down. It must have something to do with this.

Error connecting to IMAP server.. sigh. Not this again.. in the middle of the day?!

Website is fine. Email / SMTP is borked both from webmail and Outlook. Sending does not work.

all of my IMAP connections just died as well.

SMTP is down for me, too.

Same issue with mail here. Can’t connect to IMAP/POP servers. The MX servers for my domains seem to be down as well. :(

I had problems this past weekend with my entire home directory disappearing on the server that I’m on too, although it wasn’t one of the ones mentioned above.

Very strange indeed. Hopefully, they’ll have it back up and running soon!

all my IMAP and SMTP died about 5 minutes ago.

I was using webmail just fine until about 15 min ago when it dropped the IMAP connection. Still won’t connect, but from the looks of the volume trouble of last night and all the machines that are affected I figure it’s not something that is easily resolved.

When will this be resolved? Mail and webmail dead. This seems to be happening more and more often.

I say let’s revolt or get some sort of credit on our accounts because of lost service. If it hits them in their pocket books maybe this wouldn’t happen as often.

Our website is back up now, having been down all day (roughly 0830 - 1830, GMT), and IMAP/SMTP are currently working OK here too … [fingers crossed :) ]

my outages have been minimal AllSource, nothing for me to be overly upset about. mail is back up for me now too.

So if we’re all not getting mail (I’m not either) then does this count as reported? If so, why doesn’t the status page say so?

Or are we all waiting for someone to file a report?

According to the blog, it was supposed to take an hour or so to restore from backup. That was FIVE hours ago. What’s the latest? My entire domain/email/shell is gone.

The Simpsons servers must be pissed off because of the movie. =P

I haven’t had a problem for weeks, apart from 2 minutes of downtime yesterday (GOSH). I am still happy with Dreamhost. If anyone at Dreamhost is lonely and would like some loving, let me know.

That’s a bit too much information, Jim.

Daniel, welcome to the next 12 months.

*sigh* The reviews warned me about this, but nooo, I was assured by trustworthy friends and acquaintances that DreamHost was all fine and good. Let’s hope that this proves to be the only outage I get after all …

Am I to assume that these guys have not posted an update on this issue in 16 hours?
Oh well, I was considering moving my sites from another provider…guess I’ll rethink this choice.
If it’s really taking this long to do a restore, then they’re likely desperately looking for one, day by day, that’s not corrupt.
Poor bastards.

Please file a support ticket (or use the alternative “Contact Support” link), for issues not related or limited to the one highlighted in the parent post (that is, not a file server issue, not limited to the specific servers mentioned).

Mentioning it here (especially only here) won’t do very much (see the large bold text at the home page of DreamHost Status).

24 hrs.+… still down. An updated ETA would be nice.

I’m still down also.. I would love to know what’s going on, even if the outlook is bleak

Hey, does this influence the IMAP server as well? It keeps on timing out on me?

Good luck fixing this (always things to fix in IT, so wonderful, we’ll never be obsolete as IT profs)

Yeah. Come on. Some sort of additional information please.

An update would be nice… another 24hrs, 48hrs?

Wow, so the average Dreamhoster weighs in at 111 GB? That’s interesting.

I’m on Willie and I can’t upload files larger than 10k via FTP. Is it because of the failure?

Web site access has stopped. (Since 11:00 am/earlier) How long before the site is up and running again?

it seems homer is still having problems with its filer…
can’t ftp to homer…. argh…..need to update files on some websites…..

what’s the deal here, email has been down all day (since 8amEST) and no updates!

hmmm. how’s that for customer service?!

Alright, mine’s back online now.

According to Google Analytics, my blog has not been receiving page views since 9am (PST). I can’t access it in IE at all, and I can’t access it in IE or Firefox using fadetoplay.com; only with http://www.fadetoplay.com can I access it in Firefox. My friend can’t access my blog in any way, so it may be a server problem.

our site is not working.
Please correct and send me an email at above

we’re down still too. mail and web.

i just NEED my random insult generator page, how else am i supposed to blow off steam at my boss? im dyin here!
theres no substitute!

52 hrs.+… still down. Good thing I noticed the blog update, or I would be even more annoyed by the way my support tickets are being ignored.

Things went down last night suddenly-some parts of the site work, some don’t. Can access member pages. Weird. I hope this is resolved soon!!

that is “can’t” access member pages

Could we request an e-mail notification when the site goes down? Those of us whose sites run on autopilot may not otherwise know when service is interrupted.

111GB is a lot indeed… one thing to keep in mind is that backups are automatically made of everything (from ssh, do cd .snapshot to see them). It’s 6 backups (2x hourly, daily, weekly), so if the backups are restored immediately as well (although it’d make more sense to do everyone’s “current” snapshot, then phase in the rest of them as time permits) then it could be that each user has 111GB/6 on average. 20 GB sounds more reasonable, although it’s certainly a lot… I’m using less than 1…
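The arithmetic behind these figures can be checked with a short sketch. It takes the 4TB and 36-user numbers from the Feb 20 update and the commenter's figure of 6 snapshot copies, and assumes decimal units (1 TB = 1000 GB); the variable and function names are made up for illustration.

```python
# Figures from the Feb 20 update; decimal units are an assumption.
TOTAL_RESTORE_GB = 4 * 1000
AFFECTED_USERS = 36

# The commenter's count of snapshot copies (2x hourly, daily, weekly).
SNAPSHOT_COUNT = 6


def average_gb(total_gb, parts):
    """Split a total evenly across `parts` and return the per-part size in GB."""
    return total_gb / parts


per_user_gb = average_gb(TOTAL_RESTORE_GB, AFFECTED_USERS)
per_snapshot_gb = average_gb(per_user_gb, SNAPSHOT_COUNT)

print(f"{per_user_gb:.1f} GB per user")
print(f"{per_snapshot_gb:.1f} GB per user per snapshot copy")
```

This gives roughly 111 GB per user, or about 18.5 GB per user if the total really is spread over six equally sized snapshot copies. Both are averages only; individual accounts can differ widely.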

I’m one of the last two and I have ~100GB - not counting backups. Guess it’s karma that I’m at the end…

I suppose that makes me one of the last two also.
No-one else huh?

^ It took a long time, but everything seems to be getting back to normal. My files are present and accounted for; the dust is settling.

This may sound like spam, but please buy netapp appliances… the cheapest model (fas720?) may not be much more expensive than a server with a JBOD, and in my experience will run much more reliably than that.

Regards

Great job done! Dreamhost is the best hosting you can find!

© 1996-2007, DreamHost.com