Filer problems with blingy cluster.

A filer has crashed and is currently recovering. While it recovers, some customers in the blingy cluster will have trouble loading their websites and email. We apologize for the outage; service is expected to return to normal as soon as the filer recovers.

UPDATE 3:01 AM PDT

The filer has finished recovering and all services are back up and running. We are working with the filer vendor to find the source of the crash to prevent any further outages.

Update 24/03/08 10am: We’re working on the file server again to alleviate the load that’s causing problems with web, mail and MySQL services. Sorry about that.

Update 27/03/08: We are doing emergency data moves to stem the problems recently caused by your file server. During these moves, your data may be inaccessible. We are moving as much data off as we can, as fast as possible. Very sorry about the continued inconvenience!

Update 27/03/08: This series of moves has finished. We are going to keep an eye on things to see how much it helped and may have to do more moves tonight and tomorrow morning to get everything working smoothly again. This post will be updated with more information as soon as possible.

Update 29/03/08

We are continuing to move data off of the problematic file server, but it’s a bit of a catch-22 because customers on that machine are continuing to add data at a very high rate. It filled up for a while this morning, causing device full errors as well as mail problems and issues serving websites (when a file server fills up it causes problems across the board).

To explain in more detail: when we move data, it does not immediately disappear. A ‘snapshot’ of the old data is created and remains in case there was a problem with the move. That ensures that we do not lose customer data, but until the admin team can check that the move went through properly we cannot delete the old data. We just deleted some of it and have some breathing room again, and of course more moves are still in progress, but we are asking customers on this cluster to help us by holding off on any non-essential uploads of data for the next couple of days.

As soon as we have removed a significant portion of the data, the problematic file server will begin to function properly (and additional moves will go much more quickly and smoothly), but right now we’re having trouble moving data off more quickly than it’s being added. If everyone could please limit uploads to absolutely essential data until we reach the turning point where everything is working, this will be resolved much more quickly. In other words, if for example you are setting up a repository of large files, you’ll actually be better off waiting a couple of days and getting the all clear from us on this issue, because you’ll be able to access that data reliably instead of cramming it on there now and slowing the recovery process.
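To make the snapshot behavior described above concrete, here is a minimal sketch in Python. It is a toy model, not the file server's actual software: the `Volume` class, its method names and every number in it are assumptions made up for illustration. It only shows why moving data off does not free space until the snapshot of the old copy is deleted, and why uploads in the meantime can still push the volume toward full.

```python
class Volume:
    """Toy model of a file server volume (hypothetical, for illustration only)."""

    def __init__(self, capacity_gb, live_gb):
        self.capacity_gb = capacity_gb
        self.live_gb = live_gb      # data customers are actively using
        self.snapshot_gb = 0        # old copies kept until a move is verified

    def used_pct(self):
        # Live data and snapshots both count against capacity.
        return 100 * (self.live_gb + self.snapshot_gb) / self.capacity_gb

    def move_off(self, gb):
        # Moved data leaves the live set but stays behind as a snapshot.
        self.live_gb -= gb
        self.snapshot_gb += gb

    def upload(self, gb):
        # New customer data lands in the live set.
        self.live_gb += gb

    def delete_snapshots(self):
        # Only after the move is verified can the old copies be removed.
        self.snapshot_gb = 0


volume = Volume(capacity_gb=1000, live_gb=950)  # assumed numbers
volume.move_off(100)
after_move = volume.used_pct()      # the move alone frees nothing
volume.upload(30)
after_upload = volume.used_pct()    # uploads keep filling the volume
volume.delete_snapshots()
after_cleanup = volume.used_pct()   # space returns only at this point
print(after_move, after_upload, after_cleanup)
```

In this made-up run the volume stays at 95% after the move, climbs to 98% with new uploads, and only drops (to 88%) once the snapshots are deleted, which is the catch-22 the update describes.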

In the meantime we’ll be doing everything we can to safely and quickly move data off and get things back to normal.

Added information: Some of the people recently moved to the new file server are seeing errors because the data did not get set up completely (loading the site will work but just show an empty index). The admin team has been running an rsync that will fully restore all data and should hopefully finish by 9 PM PST - once that is finished all site and email data will be available for those users.

Update 30/03/08

We’re still racing to keep ahead of new data being added so any help we can get on that front is greatly appreciated (we’re still asking for customers to limit uploads as much as possible to speed up the recovery process). Some customers who are being moved are seeing blank directories still but those are due to moves in progress and the data will be fully restored when those complete.

Update, April 1, 2008

We seem to be ahead of the curve right now: we are moving data off the primary volume and onto a secondary one faster than new data is being uploaded. The volume hasn’t filled up completely in a few days. We are working closely with the technical support team to see how we can speed up the process further. Thank you for your patience.

Update, April 1, 2008

I apologize for the late update but we’ve been going over our options (while moving data of course). While we’re not seeing any real relief in terms of data uploads, we do have some very large moves that are almost complete. Once those finish we can start deleting the data (for example, one is around half a TB, or roughly 500 GB, which will be 4-5% of the total, but it’s going to take until around Friday to delete it all, so we’re dealing with a ton of data). Tomorrow’s update should be earlier in the day and hopefully we’ll have some progress to report from the large moves being complete.

Update, April 3, 2008

The data moves to other file servers have been running constantly, but last night and this morning some complications arose with the moves, requiring admin attention. To clear up some space there had to be a short interruption in file serving; this is now finished, space is available and the moves are continuing. The admins are fixing up the last of the web servers that were having issues after file serving was restored. Our apologies again for the continued issues.

Update, April 4, 2008

Today has been a pretty good day of progress. We were able to complete even more moves and free up more space on the file server. Moves have been going quicker and stability is dramatically improving. Monitoring of the servers and email in the blingy cluster today has shown a significant decrease in problems. Issues do still exist but the situation is noticeably getting better. We are also pleased to note that we have more storage coming early next week. We believe that this will go a long way toward helping us fix this major problem.

Update, April 4, 2008

Things are continuing to improve today - when I got in I was pleased to see that we had held firm and even gained a percent (the affected file server was down to 95%, which is as low as I have seen it in the last week, and since I have been working it has dropped to 94%). Performance should improve as we gain ground (this will speed up moving data off as well). This progress, along with the added storage space we are expecting early next week, should hopefully allow us to restore normal service for our customers.

5:45 PM PST: The moves we started a short while ago seem to be causing server problems. We’re looking into it and should have it resolved shortly (they were run just like the ones that had completed, so we have to determine why these specifically caused an issue). Update: this resolved itself before we could determine the cause, but we’re monitoring the situation to ensure that it’s not a recurring issue (we have no indication that it will be).

Update, April 8, 2008

Please see our other posting for details on the work we did on the affected file server:

http://www.dreamhoststatus.com/2008/04/06/30-min-blingy-downtime-tonight/

We are also continuing to offload data and are making good progress (it’s never as fast as we would like it to be of course). There’s excellent detail here in case you missed it:

http://blog.dreamhost.com/2008/04/07/another-anatomy/

which chronicles the situation and fills you in pretty much up to today. We’re seeing usage dip to 90%, so we’re hoping to have it down in the 80s by the end of the week (every percent we gain helps, and as performance improves we can speed up the rate of moving, but we’re still looking to hit the critical mass where you get the proper level of performance).

Update, April 9, 2008

As we had hoped progress is speeding up as we free up more space - while the file server is showing 95% usage, around 11% of that is data that has already been moved and is no longer in use. Due to a software issue we haven’t been able to remove it yet (the admin team is working on the best way to execute that), but once that is gone we should be around 85% usage which is another large step forward.
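As a quick check of the figures above, here is a small Python sketch of the arithmetic. The update does not say whether "around 11%" means 11 percentage points of the whole volume or 11% of the used space, so both readings are shown; the function names are hypothetical and the inputs are just the rounded numbers quoted in this post.

```python
def usage_after_cleanup_points(used_pct, stale_points):
    """Reading 1: the stale data is `stale_points` percentage points of the volume."""
    return used_pct - stale_points


def usage_after_cleanup_share(used_pct, stale_share):
    """Reading 2: the stale data is a fraction `stale_share` of the used space."""
    return used_pct * (1 - stale_share)


by_points = usage_after_cleanup_points(95, 11)
by_share = round(usage_after_cleanup_share(95, 0.11), 2)
print(by_points, by_share)
```

Either way the result lands a little under 85%, which is consistent with the "around 85% usage" estimate.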

In terms of effect, I have already seen improvement in site function for many customers as well as greatly increased speed in moving chunks of data off as well as receiving reports that mail is functioning quite a bit better. That said this issue remains at the High severity rating and in unresolved status as we have not reached a normal level of service. I can’t stress enough how sorry I am that our customers have had to put up with this but I thank those of you who have stuck with us (check the newsletter for details on what we’re doing for Blingy customers) and look forward to providing you with the level of service we strive for at DreamHost.

Update, April 9th, 2008 21:59 PDT

Unfortunately, we need to unmount the volume again to kill these snapshots before they leave us with 0 bytes of free space. In 2 hours (midnight) I will be taking the problematic volume offline to delete the phantom snapshots. Total downtime will be between 10 and 30 minutes. Sorry for the short notice and additional outage!

Update, April 10, 2008

Well the snapshot mentioned yesterday is gone and we’re actually at 83% used today which is below where we were hoping to see marked improvement (85%). Of course we’re still moving data off (which increases the usage on the file server) so that won’t fully translate to customer usage improvement but it should be quite a bit better and keep improving until we stop moving data.

Update, April 14, 2008

Okay, we’re finally getting ready to mark this as resolved. Things have seemed pretty much okay for a while now. But, just to be sure, we’re dropping the severity to Medium for now and leaving it as unresolved.

Update, April 17, 2008

We’re still hearing some reports of site slowness - we were able to resolve an issue causing high loads today which should help but we’re not going to consider this resolved until everyone is receiving good service.


Severity: Medium   Resolved: No

1505 Responses to “Filer problems with blingy cluster.”

Pages: « 1 2 3 [4] 5 6 7 8 9 10 1131 » Show All

  1. 151
    twentynine Says:

    Jeez, a little downtime and the entire service sucks. Give these guys a break.

  2. 152
    bob cobb Says:

    Im FINALLY back up, thank god

  3. 153
    Howie Says:

    @Daniel: “I will lose $1000 if this server and my site are not fully online within the hour. Come on, this is appalling.”

    Come on, guys. If you’re making that much money off your website, go for something a little more robust than shared hosting. You get what you pay for. If your website will make or break you, then shell out the cash and get a dedicated server or colo your own somewhere.

  4. 154
    jeremy Says:

    other shared hosting is better. i’ve asked around. hostgator for instance. DH used to be good. this isn’t an isolated incident. this is constant. and for every outage reported here i have many more that are short and never reported.

  5. 155
    jeremy Says:

    also, i’d glady pay 5-$10 more per month. but there’s a big difference in $10.95 and $150 for dedicated… reliable shared hosting is possible…

  6. 156
    bob cobb Says:

    to be fair howie, a lot of us are on Virtual private servers now, which can be as much as $200/month

  7. 157
    Marc Says:

    twenty nine: first, it wasn’t a little downtime. It was a lot of downtime. Some of us have sites with more important information than comments about how dogs like to sh*t themselves.

  8. 158
    Nat Says:

    Hi,

    I’ve had some problems with Wordpress since this issue.

    On one site, it has converted all my pages to posts, and when I open them up there’s nothing there. I haven’t touched the site in months it just loads the theme and says “No posts matched your criteria” - see www.delicadofoods.com.au.

    On another it just won’t load the post. I’m a student on a short easter break trying to get some web work done to support me at university and get my uni assignments done in the same period. I had planned my time out well, but this has cost me 4 days so far.

    Anyone else experiencing the same problems with Wordpress???

  9. 159
    monkey Says:

    work faster!!! :)

  10. 160
    Andrew Says:

    Anyone else getting error id: “bad_httpd_conf” ?????

  11. 161
    David Says:

    “Anyone else getting error id: “bad_httpd_conf” ?????”

    YES!

    I have a solution:
    CLASS ACTION LAWSUIT

  12. 162
    suckinghost.com Says:

    Wow DH! I got you in a brand new year(2008), and you put me in to this bingy shit.

    Hey, can anyone explain me how I can claim a refund? I have paid dream host using someone else’s creditcard. If I claim for a refund, will this su**nghost send me the money to my paypal account? or will they send it back only to that credit card? This is the first time I am going to use a money back guaranty in my life.

  13. 163
    reviewstash.com Says:

    Can you guys shut the fuck up and stop complaining? For how much DH costs, and the level of service they provide, they’re amazing.

  14. 164
    Daniel Gregory Says:

    I have been a Dreamhost Customer for around 8 years and I have found them to be responsive and very dependable.
    I’m not up to date on what is going on now however I’m sure they will not only solve the issue but also fortify their systems.
    These guys are very sharp and I trust them.

    Daniel Gregory

  15. 165
    Chris Says:

    I’m on Sothe, but in the blingy cluster. Every now and then, any and all dynamic pages, even ones that don’t connect to databases, just regular php files, will completely time out, and freeze for 2-4 minute periods PER REFRESH.

    I’ve been reading through this site, and blingy keeps having the same problems over and over again, quit patching the hardware and replace it instead. My forum in particular is moving at a crawl, and it’s pissing off my community.

    Thankfully I’m still within my money back guarantee, which is the only thing keeping me placid at the moment. This is totally unacceptable, this issue doesn’t need patching, it needs fixing permanently.

    Look at the list of servers under the cluster, then all of the sites under each server. This is affecting a lot of people. More importantly, before I signed up I asked if Dreamhost was the upstream provider or a retailer. Their answer should have been a red flag for me, but I looked past it. For such a short time using this service I am seriously questioning this hosts abilities. I guess if you want something done you have to do it yourself. I’ll have to host my site myself before it’s over.

  16. 166
    Worst Web Host Says:

    “Can you guys shut the fuck up and stop complaining? For how much DH costs, and the level of service they provide, they’re amazing.”

    YOU shut the fuck up, asshole! You think it’s fine for DreamHost to repeatedly JACK hundreds, even thousands of clients and render them completely unable to make a living? Go f yourself!

    A class action lawsuit sounds good to me!

  17. 167
    shiraz Says:

    Mail was down for me from the morning til mid afternoon. A major inconvenience, and definitely worrying, since this is my first Dreamhost account. But I think overall the verdict is good (they replaced the whole server in the end) — they didn’t seem neglectful.

    The one thing that bugged me about this was, how would you know that you’re on the blingy cluster? It’s not in control panel, or not anywhere obvious enough that I’d remember — “hey, that’s my cluster” when I saw the message on the screen. So I did an IP lookup for mail.mydomain and then I did a reverse IP lookup on that IP which showed that my email was indeed on the blingy cluster. But not everyone’s that adept, nor should they be expected to be.

    So my suggestion to Dreamhost would be to put a link in the status page where you can enter your domain name to find out (a) what cluster you’re in, and (b) the status of your particular cluster.

    Peace and respect,

    Shiraz
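For readers who want to try the lookup described in the comment above, here is a minimal Python sketch, assuming the mail host is reachable as `mail.` plus your domain, as the commenter did. The function name `find_mail_host` is hypothetical; `socket.gethostbyname` and `socket.gethostbyaddr` are the standard library calls for the forward and reverse lookups. The demo at the bottom uses stub resolvers with a documentation-range IP and an invented hostname, so it runs without touching the network.

```python
import socket


def find_mail_host(domain, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Resolve mail.<domain> to an IP, then reverse-resolve that IP to a hostname."""
    ip = forward("mail." + domain)
    # gethostbyaddr returns (hostname, alias_list, ip_address_list).
    hostname, _aliases, _addresses = reverse(ip)
    return ip, hostname


# Demo with stub resolvers; the IP and hostname below are invented placeholders.
def fake_forward(host):
    return "192.0.2.10"


def fake_reverse(ip):
    return ("blingy.mail.example.net", [], [ip])


result = find_mail_host("example.com", forward=fake_forward, reverse=fake_reverse)
print(result)
```

For a real check you would call `find_mail_host("yourdomain.com")` with the default resolvers and look for the cluster name in the returned hostname. Whether the cluster name actually appears there depends on how the host names its mail servers, so treat the result as a hint rather than a guarantee.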

  18. 168
    jeremy Says:

    “Look at the list of servers under the cluster, then all of the sites under each server”

    how do you find what cluster you’re on or servers within a cluster?

  19. 169
    Not Me Says:

    Click on the “Account Status” link toward the upper right of the panel, your email server is which cluster you are in. It’s not very intuitive but that would be better addressed through the Suggestions section of the panel.

  20. 170
    suckinghost.com Says:

    @reviewstash.com

    “Can you guys shut the fuck up and stop complaining?”
    Ok.. I will. I will get my refunds, and then shut my fuke*d up account at DH. I won’t have to complain. Believe me, I hate complaining, but my site users are complaining me so much that I couldn’t hold the vomit withing. Sorry if this complain was stinking.

    “For how much DH costs, and the level of service they provide, they’re amazing.”
    Try out http://110mb.com They are free and have far better uptime and speed than DH. Anyways.. if you don’t host porn, probably you won’t need 500Gb of space. Average user may need 1GB max, and they provide you 5GB.

    Oh yeah.. did I mention that?.. the level of headache that DH gives me is far more costlier than the $120 per year hosting charges? Obviously they give you a *lot*(of headache) for very little.

    @ Daniel Gregory

    “I have been a Dreamhost Customer for around 8 years and I have found them to be responsive and very dependable.”

    I have been here for the last 2.5 months, and heres the recent news: Now they su*c*k. They sucked me since day one till day 67. Exhausted. No more fluid left for you to suck. Give me a break DH. I thought I will make a living out of web designing.

    “These guys are very sharp and I trust them.”

    Sure. I agree. My friend is on DH since last 2 years. His sites are working fine. He recommended me, and I came here. But I am on blingy, and like everyone else on blingy I am frustrated. have some mercy dear.. we are humans. If I would have faced these problems after 8 years of hosting, that would have been justified. They sucked me from day one till date. Its also an emotional thing.

  21. 171
    jcisio Says:

    @skh.com: $120 is fairly big, but $10 per month is nothing. And DH allow us to host resource-consuming website for just this little amount, and if you host them elsewhere you’ll need at least $50/mo. Then I don’t count their other features like unlimited domain/acc/mailing list/…

    Don’t compare any other hosting provider with DH. 110mb ? Really funny. 5 GB and then ? 110mb doesn’t allow Joomla!, vBulletin and other scripts that they class “CPU resource intensive scripts”.

    Yet I’m unhappy with DH recently. Weekly uptime:96.26% Downtime:3 hour(s) 14 min(s).

  22. 172
    businessgeeks Says:

    I just got my hosting here and got disappointed immediately. dreamhost guys. Please fix this ASAP or i will need to go back to my previous hosting.

  23. 173
    Mike Says:

    wow………. fucking … FUCK.

  24. 174
    Randy Says:

    Yea, I just got service here too. I’ve been with probably five or six different hosts over the years, and this is by far the worst service I’ve experienced. The fact that they don’t post any updates is particularly troubling.

  25. 175
    suckinghost.com Says:

    @Jcisio

    “Don’t compare any other hosting provider with DH. 110mb ? Really funny. 5 GB and then ? 110mb doesn’t allow Joomla!, vBulletin and other scripts that they class “CPU resource intensive scripts”.

    Yeah.. I agree with you. I don’t want to compare anyone with anyone else. I just did not like what @reviewstash.com said. It made me feel that because we are using a cheap offer, its justified for DH to give us cheap headaches. I gave him even a cheaper(free) site with lesser headaches. Of course 110mb.com is not for serious web hosting. Else I wouldn’t have come here.

    I know DH is good. My friend who recomended me DH, his sites are running fine. I was unlucky to be on blingy. I requested DH to change my cluster, and they said they currently don’t have the technology to change clusters . Change to another server within a cluster is possible, though. Then what option do I have? Have lost all patience. Just the money back guaranty is holding me back here. I wish things get alright before I am too pissed off to cancel.

  26. 176
    JustJoined Says:

    Is there an ETA on a fix?
    Even a status update?

    I haven’t had email working at regular speed since I joined.

    It just doesn’t work.

  27. 177
    Simon Says:

    All of my websites have just gone down - about 30 mins ago. Anyone know an ETR?

  28. 178
    sktle Says:

    My sites are still down…

  29. 179
    FUCKING DREAMHOST Says:

    Dreamhost sucks! It’s a shity hoster, the only things working is the billing method! One month on this hoster and 50 % of the time it’s really reaaaaaaaaaaaaaaaaaaaaaaaaaaallllllllllllllllllllyyyyyyyyyyyyyyyyyyyy sloooooooooooooooooooooooooooooooooowwwwwwwww and 25 % of the time it’s down.

    It’s a shame! Free hosters like ovh works faster and better!

    1&1 is a bad idea: i tested it with Joomla and there is a lot of 500 erros, a lot of times.

    The only one really good hoster is INFOMANIAK but it’s about 90€ for one year ( 180 $ ) , but IT WORKS REALLY FAST AND WELL

    I’m feeling like a chicken to have pay for this nightmarehost shit.

    I’m loosing money and can’t work!

    Nice , really nice Dreamhost.

    Don’t high load your server and it will works! It’s the same everytimes you make high load on a server! Just buy servers and make it slow load .

    Perhaps you’ll make not as money as today but you’ll not loose client and make a bad picture of you on the web!

    Suckers!:

    And you’re blog is a wordpress blog! Really funky to use a opensource CMS for a paying system!

  30. 180
    Randy Says:

    Yep, down here to, but Email is now running like a rocket. Let’s hope that’s a sign of things to come for our web servers. My plain old html files won’t even come up.

  31. 181
    Andrew Says:

    My email is working but all of my sites (like 13) are giving me

    error id: “bad_httpd_conf”

  32. 182
    Andrew Says:

    Are you guys getting the error id: “bad_httpd_conf” or something different like 404s?

  33. 183
    JustJoined Says:

    A “rocket” that takes 3 minutes to open an IMAP folder with a couple hundred messages in it… but “only” one minute to open an IMAP folder with 90 messages. No update for 17 hours, when basic services like web & email are down? Is it always like this?

    The steady trickle of spam reassures that mail is getting through, but even that is slow to delete.

    This is ridiculous. Outbound messages get sent but then can’t even copy to the Sent folder because it times out (after I increased the timeout to 120 seconds).

  34. 184
    Randy Says:

    “A “rocket” that takes 3 minutes to open an IMAP folder with a couple hundred messages in it… but “only” one minute to open an IMAP folder with 90 messages. No update for 17 hours, when basic services like web & email are down? Is it always like this?”

    Well, yea, I’m in Taiwan and my web mail loads in about 5 seconds, and Thunderbird gets my mail in less than 3. That’s not bad. But the web server is still very slow.

  35. 185
    Juan Lupión Says:

    Geez, guess it was the wrong day to migrate from GMail to DH’s IMAP!

  36. 186
    Randy Says:

    Yep, this has been going on far too long. Service comes, service goes, service comes, service goes. I’d be better off running my site from home on my $400 dollar box. At least then I could update myself on a regular basis (no pun intended, sickos.)

  37. 187
    Anonymous Says:

    WRRRRRRRRRRRRRYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY

    I need mah /n/ews, dammit. Work harder you lazy goddam bastards.

  38. 188
    Pepe Says:

    Deamhost: You fucking suck.

  39. 189
    James Says:

    Asked support what is up and if my problems are related…

    Just got a response so passing it on to everyone else on the off chance you are still wondering what they are up to….

    ——

    Response from DH —-

    Yes…this is an issue with blingy, which we are definitely working to
    alleviate. Your site is back up for the moment, and admins are in the
    process of moving data off of the filer that’s been causing problem, and
    we hope to have things stable soon. It’s not the fastest process, given
    the many terabytes of data that need to be moved, but we’re slowly and
    steadily moving data to more reliable filers.

  40. 190
    Lucas Says:

    How can I tell whether my sites’ server is on blingy? Is there anywhere that lists the servers in each DH cluster?

  41. 191
    GUAGUAU Says:

    Please, the service outages are happening far too often now; fix these problems soon....
    VIRTUAL HUG

  42. 192
    Paul Says:

    When I try to log into my email it now comes up as unknown user or wrong password, anyone else encountering this.

  43. 193
    Business client Says:

    I’m a new business client, and I must admit that support is pretty good. But the thingy with Blingy, response times etc. is a disaster. If you don’t know how to implement things correctly, keep you hands away from updating. If you don’t want to spend money on hardware upgrades, don’t promise too much.

    Even though the service is pretty competitive, they don’t offer any professional solutions like MS ADO components, that I can make work on my test servers without any problems (also Linux based with Apache)

    In my opinion, there’s no excuse for having a problem taking more than 4 hours to fix, that is if you know what you’re doing!

  44. 194
    MK Says:

    This is completely ridiculous. This has gone on for days. My business is crippled without email and I am mortified that it has taken this long. What can be done? I cant access anything! The site is up but my business is email focused, not web! Support hasnt offered an update since 10am yesterday! Cmon already

  45. 195
    Jim Sullivan Says:

    1. I’m on toadstool, not bingy so I don’t know why DH is using the bingy excuse on me.
    2. From reading this forum it appears that it is just a place to vent with zero effort to provide resolution.
    3. I’m using dreamhost because of application requirements. Guess I may have to look around for something else. I’ve had good success with cihost in the past. If they have the server requirements I may have to switch.
    4. I’m astounded that dreamhost had to scramble to move things around. Whatever happened to triple redundant systems?
    5. The site is back online, but running VERY SLOW.
    6. I had to put a hold on a press release for the site due to this problem. If the server goes down again after I do send the press release, my company will experience losses which my attorneys will be happy to discuss with you as to how you expect to compensate us for those losses.
    7. Bingy? What kind of lame is that for a server?

  46. 196
    kris Says:

    I don’t think DH support looks here…USE YOUR CREDITS TO VOTE FOR THE DOWNTIME ERROR MESSAGES in the suggestions area of the control panel.

  47. 197
    Chris Says:

    Well my site worked for about 5 minutes this morning and is out again. This sucks.

  48. 198
    Chris Says:

    I am on snocap but my databases are in blingy so my site is down (twice this week now). It seems that there would be a lot less bitching if dreamhost would leave people in a single cluster instead of spreading domains around. From a management standpoint it doesn’t make a whole lot of sense. If my site is hosted in one cluster then by all means, keep my databases in that cluster too. With the current setup, I am literally getting cluster fucked.

  49. 199
    Matt Says:

    All my sites still down. Shonkeh.

  50. 200
    David Streever Says:

    I’m not on blingly (spunky?I think) and it’s still taking 15+ seconds to generate pages from mysql.

    the same site I’ve run on other hosts–MUCH CHEAPER!! I pay 16 bucks a month for DH–and those hosts can generate my page in under .5 seconds… same php back-end, too….

    it’s total bull. I used to advocate DH, but I’m done. I’m done. I keep thinking it’ll get better for 2 years now, it’s like an abusive relationship.
