Fileserver switch troubles (Resolved!)

One of the switches that connects some of our fileservers is experiencing unexplained high CPU load. This is causing some sites to load more slowly than normal. We’re looking into the issue and hope to have it fixed shortly. Updates to follow.

We think we have this one resolved. It seems the CPU load was simply too high, causing the switch to hit watchdog timeouts and reboot. This was affecting about 8-9 fileservers. We installed another switch to handle the extra load and, after watching it for half an hour, saw no problems. We will continue to monitor it throughout the night to ensure that the problem is completely resolved.


24 Responses to “Fileserver switch troubles (Resolved!)”

  1. Sergio Says:

    How come you guys never post the actual server the problem is on? It would be a lot more helpful, that way we would know if this is the problem, or something else is causing it. Well my site is being affected by it, and I doubt it’s the fact that there’s more traffic today.

  2. ivix Says:

    Providing the name of the fileserver won’t help much - the impact would be seen on multiple webservers that use the fileserver for storage.

  3. hardedge Says:

    Read the words, “some of…” More than one. You having a problem? Then you’re probably “one of some.” Not having a problem, you’re okay. Want to know for sure if you’re having a problem? Can’t access your site? You’re having a problem. Can reach your site? You’re not.

    See, the problem isn’t whether or not you’re having a problem. The problem is what appears to be a cascade of problems which have now become quite problematic for everyone.

  4. pow Says:

    yup straight up and down y’all.

  5. Leon Says:

    My site http://www.www.erotofun.com was just down - error
    “Warning: mysql_connect() [function.mysql-connect]: Lost connection to MySQL server during query”

    I checked system wide status for one of DB i use and got:
    “Verified mysql outage: xxx [stop tracking : 1 min 38 secs ago: Outage verified: We are actively looking into resolving it.”

  6. Allen Says:

    Perhaps not resolved entirely — me ol’ websites are slow, slow, slow right now.

  7. kevin Says:

    I am still seeing quite a few VERY long delays…not convinced you’ve nailed the issues.

  8. David Says:

    Agreed - very long delays and connection issues, time outs etc. Whatever the problem is, ya still got it.

  9. HD Says:

    And it’s still a problem …

  10. Pixelman Says:

    I also have very slow performance.
    Damn, I have to finish this project this weekend.

  11. geo Says:

    it isnt just slow i cant get to the web page

  12. seth Says:

    It would be useful to know which web servers depend on the file servers that are down. That way we could know whether or not we are submitting a duplicate site outage report. Currently my webserver on Cletus is not serving even static pages, so I’m wondering if the problem is related to this switch issue or not. Oh, well. It has been down for almost half an hour now, as far as I’ve noticed.

    Seth

  13. Mark Says:

    Is this still happening?
    I am getting problems uploading to FTP, and even the web ftp is throwing out this message :

    Warning: ftp_put(): Transfer aborted. No space left on device in /usr/local/ndn/web/webftp/includes/filesystem.inc.php on line 1145

  14. Dude. Says:

    Sounds like it’s out of disk space. Report it. Somewhere other than here because not one of us can fix it for you.

  15. HsN Says:

    i have the same problem.

  16. DoubleDrive Media Says:

    I hear ya Pixelman, I have two critical web projects to complete by Nov 1 and this is the SECOND weekend in a row where my servers have had problems. I should send a bill to Dreamhost!

  17. Ozh Says:

    Same here, my sites are still intermittently down …..

  18. JC Says:

    Mark, last time that happened to me, I got this response:
    “Your file server ran out of file handles, which causes problems about the same as running out of space. I’ve fixed it. I have no idea why this was not reported by our monitoring systems, but I’m investigating right now, and will keep an eye on your file server to make sure this doesn’t happen again.”

  19. neha Says:

    Been a delay / or complete failure in loading pages for the last 12 hours. Doesn’t look like the issue is actually resolved.

  20. Jethro Says:

    Also getting severe delays still (22:00 GMT).

  21. Sharphead Says:

    I’ve found a workaround for this issue: set up a new unix user, change the hosting of your website to that user, and re-upload the website under the new user login. I was having the same problem with all my sites (disk full, 550 ftp errors) and they were all under the same unix user. After changing the hosting to a new unique username, I can write to the drive again and the sites are back up and running (well, not yet… I have over 20 gigs of files to move up and down and 15 websites to transfer) :)

    Hope that helps.

  22. bd_ Says:

    You can find out what fileserver you’re on by logging in to a shell account and typing ls -ld ~

    You’ll see a result along the lines of:
    lrwxrwxrwx 1 root staff 23 2006-10-13 19:23 /home/username -> .fileserver/username

  23. rengalan Says:

    You guys have had a hell of a time over the last few months, whatever you’ve done recently, my web panel is now as zippppy as ever. Keep up the great work!!

  24. Babs Says:

    Yet to explore the new fileserver !!
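
The tip in comment 22 (`ls -ld ~`) can also be scripted. The following is only a rough sketch, assuming the home directory is a symlink whose target looks like `.fileserver/username`, as in the sample output quoted there. The function names `parse_ls_line` and `fileserver_from_target` are made up for illustration, and the real layout on any given account may differ.

```python
import os
from typing import Optional


def parse_ls_line(line: str) -> Optional[str]:
    """Extract the symlink target from one line of `ls -ld` output.

    Returns None when the line does not describe a symlink.
    """
    if " -> " not in line:
        return None
    return line.rsplit(" -> ", 1)[1].strip()


def fileserver_from_target(target: str) -> Optional[str]:
    """Guess the fileserver name from a symlink target.

    Assumption: the target ends in '<.fileserver>/<username>', so the
    fileserver is the second-to-last path component, minus a leading dot.
    """
    parts = target.strip("/").split("/")
    if len(parts) < 2:
        return None
    return parts[-2].lstrip(".")


if __name__ == "__main__":
    home = os.path.expanduser("~")
    if os.path.islink(home):
        # os.readlink returns the same target that `ls -ld ~` shows after '->'
        print(fileserver_from_target(os.readlink(home)))
    else:
        print("Home directory is not a symlink; this trick does not apply here.")
```

Run from a shell account, this prints the directory name that the home symlink points through, which is the same information read by eye from the `ls -ld ~` output.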
