Filer problems with blingy cluster.

We are currently having a problem with a filer which has crashed and is recovering at this time. While this is happening some customers in the blingy cluster will experience problems loading their websites/email. We apologize for the outage and service is expected to return to normal as soon as the filer recovers.

UPDATE 3:01 AM PDT

The filer has finished recovering and all services are back up and running. We are working with the filer vendor to find the source of the crash to prevent any further outages.

Update 24/03/08 10am: We’re working on the file server again to alleviate the load that’s causing problems with web, mail and MySQL services. Sorry about that.

Update 27/03/08: We are doing emergency data moves to stem the problems recently caused by your file server. During these moves, your data may be inaccessible. We are moving as much as we can off as fast as possible. Very sorry about the continued inconvenience!

Update 27/03/08: This series of moves has finished. We are going to keep an eye on things to see how much it helped and may have to do more moves tonight and tomorrow morning to get everything working smoothly again. This post will be updated with more information as soon as possible.

Update 29/03/08

We are continuing to move data off of the problematic file server, but it’s a bit of a catch-22 because customers on that machine are continuing to add data at a very high rate. It filled up for a while this morning, causing device full errors as well as mail problems and issues serving websites (when these fill up it causes problems across the board).

To explain in more detail: when we move data, it does not immediately disappear. A ‘snapshot’ of the old data is created and remains in case there was a problem with the move. That ensures we do not lose customer data, but until the admin team can check that the move went through properly we cannot delete the old data. We just did some of that and have some breathing room again, and of course more moves are still in progress, but we are asking customers on this cluster to help us by holding off on any non-essential uploads of data for the next couple of days.

As soon as we have a significant portion of the data removed, the problematic file server will begin to function properly (and additional moves will go much more quickly and smoothly), but right now we’re having trouble moving data more quickly than it’s being added. If everyone could please limit uploads to absolutely essential data until we reach the turning point where everything is working, this will be resolved much more quickly. In other words, if for example you are setting up a repository of large files, you’ll actually be better off waiting a couple of days and getting the all clear from us on this issue, because you’ll be able to access that data reliably instead of cramming it on there now and slowing the recovery process.

In the meantime we’ll be doing everything we can to safely and quickly move data off and get things back to normal.
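
As a rough illustration of why moving data does not free space right away, here is a minimal Python sketch of the snapshot behaviour described in this update. The `Volume` class, its method names, and all of the numbers are hypothetical and chosen only for illustration; it is a toy model, not how the filer itself accounts for space.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """Toy model of a volume whose moved data lingers in a snapshot."""
    capacity_gb: float
    live_gb: float             # data customers are actively using
    snapshot_gb: float = 0.0   # moved data kept until the move is verified

    def used_pct(self) -> float:
        # Snapshots count against capacity just like live data.
        return 100 * (self.live_gb + self.snapshot_gb) / self.capacity_gb

    def move_off(self, gb: float) -> None:
        # Moving data away turns it into a snapshot; nothing is freed yet.
        moved = min(gb, self.live_gb)
        self.live_gb -= moved
        self.snapshot_gb += moved

    def upload(self, gb: float) -> None:
        # New customer uploads land on the volume as live data.
        self.live_gb += gb

    def delete_verified_snapshots(self) -> float:
        # Space is reclaimed only once the old copies are deleted.
        freed = self.snapshot_gb
        self.snapshot_gb = 0.0
        return freed


vol = Volume(capacity_gb=1000, live_gb=950)  # hypothetical starting point
vol.move_off(100)
print(vol.used_pct())            # unchanged: the moved data is now a snapshot
vol.upload(20)
print(vol.used_pct())            # higher: uploads arrived before any cleanup
vol.delete_verified_snapshots()
print(vol.used_pct())            # lower: the snapshot space is finally freed
```

In this model the volume keeps filling for as long as uploads arrive faster than verified snapshots can be deleted, which is the reason for the request to hold off on non-essential uploads.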

Added information: Some of the people recently moved to the new file server are seeing errors because the data did not get set up completely (loading the site will work but just show an empty index). The admin team has been running an rsync that will fully restore all data and should hopefully finish by 9 PM PST - once that is finished all site and email data will be available for those users.

Update 30/03/08

We’re still racing to keep ahead of new data being added so any help we can get on that front is greatly appreciated (we’re still asking for customers to limit uploads as much as possible to speed up the recovery process). Some customers who are being moved are seeing blank directories still but those are due to moves in progress and the data will be fully restored when those complete.

Update April 1, 2008

We seem to be ahead of the curve right now: we are moving data off the primary volume and on to a secondary one faster than new data is being uploaded. The volume hasn’t filled up completely in a few days. We are working closely with the technical support team to see how we can speed up the process further. Thank you for your patience.

Update, April 1, 2008

I apologize for the late update but we’ve been going over our options (while moving data of course). While we’re not seeing any real relief in terms of data uploads, we do have some very large moves that are almost complete. Once those finish we can start deleting the data (for example one is around a half TB or around 500 GB which will be 4-5% of the total but it’s going to take until around Friday to delete it all, so we’re dealing with a ton of data). Tomorrow’s update should be earlier in the day and hopefully we’ll have some progress to report from the large moves being complete.

Update, April 3, 2008

The data moves to other file servers have been running constantly, but last night and this morning some complications arose with the moves, requiring admin attention. To clear up some space there had to be a short interruption in file serving; this is now finished, space is available and the moves are continuing. The admins are fixing up the last of the web servers which were having issues after file serving was restored. Our apologies again for the continued issues.

Update, April 4, 2008

Today has been a pretty good day of progress. We were able to complete even more moves and free up more space on the file server. Moves have been going quicker and stability is dramatically improving. Monitoring of the servers and email in the blingy cluster today has shown a significant decrease in problems. Issues do still exist but the problem is noticeably getting better. We are also pleased to note that we have more storage that will be coming early next week. We believe that this will go a long way in helping us fix this major problem.

Update, April 4, 2008

Things are continuing to improve today - when I got in I was pleased to see that we had held firm and even gained a percent (the affected file server was down to 95% which is as low as I have seen it in the last week and since I have been working it has dropped to 94%). Performance should improve as we gain ground (this will speed up moving data off as well). This progress, along with the added storage space we are expecting early next week, should hopefully allow us to restore service for our customers to normal.

5:45 PM PST: The moves we started just a while ago seem to be causing server problems; we’re looking into it and should have it resolved shortly (they were run just like the ones that had completed so we have to determine why these specifically caused an issue). Update: this resolved itself before we could detect the cause but we’re monitoring the situation to ensure that it’s not a recurring issue (we have no indication that it will be).

Update, April 8, 2008

Please see our other posting for details on the work we did on the affected file server:

http://www.dreamhoststatus.com/2008/04/06/30-min-blingy-downtime-tonight/

We are also continuing to offload data and are making good progress (it’s never as fast as we would like it to be of course). There’s excellent detail here in case you missed it:

http://blog.dreamhost.com/2008/04/07/another-anatomy/

which chronicles the situation and fills you in pretty much up to today. We’re seeing the data dip to 90% so we’re hoping to have it down in the 80’s by the end of the week (every percent we gain helps and as performance improves we can speed up the rate of moving but we’re still looking to hit critical mass where you get the proper level of performance).

Update, April 9, 2008

As we had hoped, progress is speeding up as we free up more space - while the file server is showing 95% usage, around 11% of that is data that has already been moved and is no longer in use. Due to a software issue we haven’t been able to remove it yet (the admin team is working on the best way to execute that), but once that is gone we should be around 85% usage which is another large step forward.
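
For readers who want to check the arithmetic, here is a small Python sketch of the estimate above. The function name and its `relative` flag are hypothetical; the update does not say whether "11%" means 11 percentage points of the whole volume or 11% of the space currently in use, so the sketch computes both readings.

```python
def usage_after_cleanup(used_pct: float, reclaimable: float,
                        relative: bool = False) -> float:
    """Return volume usage (in percent) once reclaimable data is deleted.

    relative=False treats `reclaimable` as percentage points of the volume;
    relative=True treats it as a percentage of the space currently in use.
    """
    freed = used_pct * reclaimable / 100 if relative else reclaimable
    return used_pct - freed


# Figures taken from the update: 95% used, about 11% already moved.
print(usage_after_cleanup(95, 11))                 # percentage-point reading
print(usage_after_cleanup(95, 11, relative=True))  # share-of-used reading
```

Either reading lands a little under 85%, in line with the rough figure quoted in the update.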

In terms of effect, I have already seen improvement in site function for many customers as well as greatly increased speed in moving chunks of data off as well as receiving reports that mail is functioning quite a bit better. That said this issue remains at the High severity rating and in unresolved status as we have not reached a normal level of service. I can’t stress enough how sorry I am that our customers have had to put up with this but I thank those of you who have stuck with us (check the newsletter for details on what we’re doing for Blingy customers) and look forward to providing you with the level of service we strive for at DreamHost.

Update, April 9th, 2008 21:59 PDT

Unfortunately, we need to unmount the volume again to kill these snapshots before they leave us with 0 bytes of free space. In 2 hours (midnight) I will be taking the problematic volume offline to delete the phantom snapshots. Total downtime will be between 10 and 30 minutes. Sorry for the short notice and additional outage!

Update, April 10, 2008

Well, the snapshot mentioned yesterday is gone and we’re actually at 83% used today, which is below the level where we were hoping to see marked improvement (85%). Of course we’re still moving data off (which increases the load on the file server), so that won’t fully translate to improvement for customers, but it should be quite a bit better and keep improving until we stop moving data.

Update, April 14, 2008

Okay, we’re finally getting ready to mark this as resolved. Things have seemed pretty much okay for a while now. But, just to be sure, we’re dropping the severity to Medium for now and leaving it as unresolved.

Update, April 17, 2008

We’re still hearing some reports of site slowness - we were able to resolve an issue causing high loads today which should help but we’re not going to consider this resolved until everyone is receiving good service.


Severity: Medium   Resolved: No

1505 Responses to “Filer problems with blingy cluster.”


  1. 451
    Craig Says:

    According to my account summary I’m only on “blingy” for email (“clank” for web pages and “rous” for MySQL). However, email is the only thing that seems to be working for me. Maybe the problem is more widespread as it seems clank and rous are experiencing high loads (around the 80-100 mark!)

  2. 452
    Lisa Spinelli Says:

    How long?

  3. 453
    Jason Sheroan Says:

    I’m pretty sure it’s not yet resolved for me, as my email is still not working. Hopefully soon.

    j

  4. 454
    M Says:

    Holy shit I’m finally receiving emails from 4 fucking days ago.

  5. 455
    pHv Says:

    For a short time yesterday evening and this morning email was working. Not quick but at least it worked. Now i’m getting ERROR: Connection dropped by IMAP server.
    Query: FETCH 1:* (FLAGS UID RFC822.SIZE INTERNALDATE BODY.PEEK[HEADER.FIELDS (Date To Cc From Subject X-Priority Importance Priority Content-Type)])

  6. 456
    Rick Says:

    So has anyone tried to visit “Astonishinghost.com” to leave complaints and been highly amused at what actually pops up?

    Seriously.

    Goes to a YouTube video… check out the WhoIs, good for another laugh. Looks like someone over at Astonishinghost forgot to renew their domain…

  7. 457
    diego Says:

    How do these DreamHost bastards have such nerve???

    You will burn in hell, you sons of bitches.

  8. 458
    Flo Says:

    That really suxxx! I think I’ll move my domains to another hoster…damn, three days without accessibility ;-(

  9. 459
    TC Says:

    All I can say is, check this site out. http://www.webhostingtalk.com/forumdisplay.php?f=1

    Apparently the problems DH are having are rather common amongst low-cost hosting providers, although not all low-cost providers have such catastrophic issues like the one I am dealing with now. I think I might have finally found an alternative and will leave you to do your own research on that site.

    Telling a client who has had no email for 4 1/2 days to ‘wait it out’ while I have not been given any useful info from DH is bad for everyone involved. DH will certainly not reimburse me for the week of struggling to provide workarounds and researching dozens of new hosts.

    I don’t think anyone here is impressed.

  10. 460
    senior Says:

    This server is toast — I ssh’d in and made a tarball of my site (to move to a new host!), and it took 5 minutes to compress a simple Wordpress folder! 5 minutes!!! Is my site hosted on Commodore 64? Is the cassette tape jamming up or something?

  11. 461
    Disappointed Says:

    Still getting insanely high server load, page load times of 1-5 minutes and no word at all from support. I’m about to have a look at other hosts.

    When moving between hosts, what would be the best way to minimise downtime due to DNS propagation? I was thinking of setting www2 to the new host which should propagate immediately as that subdomain has never been used, then using htaccess to redirect www to www2 while waiting for www’s new record to propagate. Would this work? Is there a better way? Any advice would be appreciated.

  12. 462
    Jason Gilmore Says:

    Having been a Dreamhost client for some years now, I’ve grown used to the occasional outages, having learned to be patient given the cheap pricing schedule. However it seems this time around I’ve reached my patience threshold, having yet again rolled into the office this morning only to learn I’m unable to check my email for the 5th day running. It’s like playing craps, press the Send/Receive button and hope you get a decent roll. Sometimes it works, and sometimes it doesn’t. This is simply ridiculous.

    I doubt I’m the first on this comment list to state as much, but next week I’ll start looking for another hosting provider, with the goal of moving my site off of Dreamhost entirely within the next two weeks.

    Jason

  13. 463
    Junior Says:

    Get a new filer from the vendor, or better yet get a new server to host our sites

  14. 464
    T. Scheisskopf Says:

    Well, we are having problems as well and we are sweating it as well, but I figure no one is sweating it as hard as the people with skin in the game at DH. The ones pulling double shifts and sleeping under desks, I mean. And you know that they are.

    I have used a few hosting providers and I remember what their customer and tech support was like. If you come from an abusive background, you will recognize it immediately. All I have to do is look at my support history messages and know that this is a problem that has their complete and undivided attention at DH.

    I used to be a roadie for some of the biggest bands in the world. I know what it is like to stand “naked” on a stage with 30,000 screaming fans, none screaming louder than Mr. Petulant Rockstar, while your gear is going up in “flames” and you are trying to keep focused and centered and fix the problem. Fixing problems takes the time it takes and not one second less.

    I think we’ll stick. Weather the storm and all that. We’ll survive.

  15. 465
    Paul Says:

    I doubt they allow embedded images in these posts so here ya go:

    http://98.130.145.104/img/sleeping_on_the_job.jpg

    This is going up on every site I own. I am transitioning to IXWebHosting ( www.ixwebhosting.com) as fast as DreamHost’s Yo-Yo FTP access will allow me to.

    I’m through with this company.

  16. 466
    Paul Says:

    http://98.130.145.104/img/sleeping_on_the_job.jpg

    bump!

  17. 467
    fuckers Says:

    this is insane,
    have not been able to update sites in over three days!
    You are a JOKE!!!!

  18. 468
    fuckers Says:

    p.s.
    shawn, show your face…….
    I got a few ideas…..

  19. 469
    bob cobb Says:

    keep an eye on things? my site is down AGAIN for the 5th day in a row. Ridiculous

  20. 470
    jcisio Says:

    @Paul (#469): don’t go with a host who provides more than 10 GB of space for less than $10. Didn’t you take any lesson from this? I’ve just posted something there http://www.webhostingtalk.com/showthread.php?t=682124

    I’ve gone. Hope that DH be back with more serious plans.

  21. 471
    poorkid Says:

    i am a fairly new customer so after reading most of the posts, i am dreading the moment when (like most of you) i will have to highly depend on my hosting to work. is there a way for a domain to point to more than one host - such as in this case?

  22. 472
    Paul Says:

    bump!

    http://98.130.145.104/img/sleeping_on_the_job.jpg

  23. 473
    Elliot Says:

    My DreamHost PS is down:

    Internal Server Error

    The server encountered an internal error or misconfiguration and was unable to complete your request.

    Please contact the server administrator, webmaster@adventuresinparenting.org and inform them of the time the error occurred, and anything you might have done that may have caused the error.

    More information about this error may be available in the server error log.

    Additionally, a 500 Internal Server Error error was encountered while trying to use an ErrorDocument to handle the request.

    Rebooting fixed it for now, but this downtime is unacceptable.

  24. 474
    Mike Anderson Says:

    Yeah, I agree the mail service in the last few days since I’ve been on has been absolutely horrendous. At least my clients’ web sites have been up.

  25. 475
    Nikki Says:

    I’m still having problems with my website. I hope this will be resolved soon =/

  26. 476
    Surrounded by Non-thinkers Says:

    This is going up on every site I own. I am transitioning to IXWebHosting ( www.ixwebhosting.com) as fast as DreamHost’s Yo-Yo FTP access will allow me to.

    It’s really cute when people publicly display that they are incapable of thinking.

    I see you really did your research, since you don’t seem to realize that they are heading into 48 hours of downtime (at least) for a data center move, like the one Dreamhost just did in 12 hours.

    Do you really think going to a crappier host makes things better?

    Just write down every host on an individual piece of paper, stick all of them up your ass, then reach in and pull out one. That’s your new host. YOU have just as much chance of picking a good one that way as you seem to using whatever method made you think IX was the way to go.

  27. 477
    Daniel Says:

    I just have 3 days in this host and it is a nightmare!

    how can I ask for my money back?

  28. 478
    Rowan Says:

    This is ridiculous.

    I haven’t been able to access email all day and it’s incredibly important for my work. We are supposed to be launching a site that is the livelihood of our business this coming week but now I’m just going to move servers.

    The least Dreamhost could do would be to update us every hour with at least some kind of update so we know they aren’t just taking a nap over there. Way to go Dreamhost. I can handle downtime, but not downtime AND silence. Also the ridiculously slow speeds I’ve been getting since joining you a little over a month ago are completely unacceptable.

    Anyone got a good server suggestion that sounds similar to Dreamhost but actually works as expected?

  29. 479
    Dave Bernier Says:

    Dreamhost sucks beyond belief - nothing but constant problems. I want a refund immediately.

    Can someone recommend a host that is reliable?

  30. 480
    Esteban Says:

    Hey, how about an update????!?!?!?!

  31. 481
    Lachlan Hunt Says:

    My website seems to be functioning fine, although it has been a bit slow sometimes, but my email keeps going down intermittently, which is incredibly annoying. However, even when it is up, it is really slow, That might be because I’m accessing it from Norway right now, but it still seems excessively slow. I hope you can fix this soon and do something about the speed.

  32. 482
    Sean Says:

    This is beyond ridiculous now and I’m also considering taking my hosting somewhere else. Communication with your customers would go a long way toward appeasing them. All week I’ve had email and webmail issues during the day, and when you go home at night and stop screwing with it it works fine.

  33. 483
    Astonishing Host Says:

    So has anyone tried to visit “Astonishinghost.com” to leave complaints and been highly amused at what actually pops up?

    Seriously.

    Goes to a YouTube video… check out the WhoIs, good for another laugh. Looks like someone over at Astonishinghost forgot to renew their domain…

    I am most definitely NOT amused, THIS IS AN OUTRAGE! They figure they have time and technicians to spare for going around the net hacking other companies and hijacking their DNS as their customers struggle to keep their own clients because their host is too busy playing pranks to fix their hardware, so I do not think I would be laughing if I were in your shoes either my friend. Our legal department are the only ones who should be happy about this, it’s a goldmine for them! DreamHost will pay dearly for their puerile behavior, and in the meantime our techs assure me that we will be back online shortly, and then we can finish saving the rest of you from this clearly very evil company.

  34. 484
    BoulderBronco Says:

    email down again today. 7+ days and counting. My site has been fine the whole time but it’s on the mario server. It’s laughable. 7 days!

  35. 485
    Tenerife webcams Says:

    All my sites down from 36 hours ago…Do you know any alternatives to dreamhost? I like the simple interface they have here, but i don’t mind to pay even double to assure a REAL host service

  36. 486
    Thomas Says:

    Geez! This crap happens like every other month and totally messes up my business. I’m going to have to move to another provider, I can’t have all these email issues constantly!

  37. 487
    Fight Back! Says:

    While you wait, Astonishing Host, we would greatly appreciate it if your company could make a donation at http://dreamhost-classaction-lawsuit.com/donate.html to help those of us trying to get compensation.

  38. 488
    Dave Bernier Says:

    I wasn’t kidding, I want my money back.

    Also, can someone please recommend another provider - SERIOUSLY! I can’t continue with unreliable email.

  39. 489
    Travis Chillemi Says:

    I am moving to MediaTemple. Not as cheap, but worth the reliability. I have worked with several hosts in the past. They all have issues. However, this has just been too many issues for me to put up with.

  40. 490
    Wake Up Says:

    This is going up on every site I own. I am transitioning to IXWebHosting ( www.ixwebhosting.com) as fast as DreamHost’s Yo-Yo FTP access will allow me to.

    I see you really did your research, since you don’t seem to realize that they are heading into 48 hours of downtime (at least) for a data center move, like the one Dreamhost just did in 12 hours.

    Do you really think going to a crappier host makes things better?

    Just write down every host on an individual piece of paper, stick all of them up your ass, then reach in and pull out one. That’s your new host. YOU have just as much chance of picking a good one that way as you seem to using whatever method made you think IX was the way to go.

  41. 491
    Wake Up Says:

    This is going up on every site I own. I am transitioning to IXWebHosting as fast as DreamHost’s Yo-Yo FTP access will allow me to.

    I see you really did your research, since you don’t seem to realize that they are heading into 48 hours of downtime (at least) for a data center move, like the one Dreamhost just did in 12 hours. This affects almost ALL of their customers–not one cluster like Dreamhost.

    Do you really think going to a crappier host makes things better?

    Just write down every host on an individual piece of paper, stick all of them up your ass, then reach in and pull out one. That’s your new host. YOU have just as much chance of picking a good one that way as you seem to using whatever method made you think IX was the way to go.

  42. 492
    Wake Up Says:

    Also, can someone please recommend another provider - SERIOUSLY! I can’t continue with unreliable email.

    Are you really that stupid? You’re asking a bunch of spammers and people with financial interest in the companies they recommend, pretending to be actual customers.

    If you’re not smart enough to do some actual research, or get recommendations from people you actually trust, then follow the advice I gave in my last post for choosing a host.

  43. 493
    nataichera Says:

    I’ve been with DH a few months now and I’m sick and tired of all the outages and system failures. I agree with the previous poster, I feel like I’ve wasted my money by purchasing from DH. As soon as I get my dedicated server, I’m going to relocate all my sites out of DH.

  44. 494
    nataichera Says:

    not to mention that my sites have serious performance issues. it’s quite ironic, that free hosts like the 110mb give faster and more responsive performance. and they have way better uptime too.
    from the last 4 months experience with DH, i must say DH’s paid service is actually worse than 110mb’s free service.

  45. 495
    nataichera Says:

    and why does DH experience so many hardware issues? do they buy all their machines from cheap-ass chinese manufacturers? i’ve read on some blogs that the shared server machines DH deploys are outdated. and they cram as many sites as 1000 into a single machine.

  46. 496
    selfdestruct Says:

    An update is certainly required at this point. It has been over half a day since the last.
    I have several customers who are pointing fingers and I really would love to have a time frame on their e-mail access. I understand downtime is inevitable, but we need some sort of idea as to when we’ll be back up. Thanks for working hard on this,
    -Scott

  47. 497
    Quit Whining Says:

    and why does DH experience so many hardware issues?

    Because they have so many. This is common sense. Go buy between 1,500 and 2,000 servers and let us know if none of them ever have problems.

    If a server broke every single day, that would still be like 3 - 4 years per problem per server.

    and why does DH experience so many hardware issues? do they buy all their machines from cheap-ass chinese manufacturers? i’ve read on some blogs that the shared server machines DH deploys are outdated. and they cram as many sites as 1000 into a single machine.

    and they cram as many sites as 1000 into a single machine.

    If they’re just parked domains, that’s nothing. And if you mean users instead of sites, that’s meaningless since 1 customer can create unlimited users… doesn’t mean they’re actually doing anything.

    Most sites use hardly any resources and the ones that really hammer the server can take it down even if there aren’t any other sites on it.

    Since everything you learned seems meaningless, you might want to look for better sources of info than random blogs.

  48. 498
    bob cobb Says:

    having to reboot my private server every 10 minutes is getting old

  49. 499
    Aries Says:

    I don’t think they’re outdated: megaman is a Dual-Core AMD Opteron(tm) Processor 1218 HE. The problem, I think, is a wrongly designed architecture. But that’s not the real problem. The real problem is keeping the money instead of spending it on new hardware when trouble comes. If they have 100,000 accounts, they have around $15M. You can buy a very nice server farm for 10M, with 5M for salaries. And the next year you can pay the investors. The problem is that they don’t have hot-swap replacements, dunno why. They have the money and the opportunity (in California, there should be enough good vendors).

  50. 500
    JR Bob Dobbs Says:

    The drama continues. There’s a really whiny anti-dreamhost video on YT at youtube.com/watch?v=qS7nqwGt4-I

