An update on the recent downtime

Discussion in 'CPL News & Announcements' started by spiritburner, May 25, 2017.

  1. spiritburner

    spiritburner Admin

    Offline
    Joined:
    Sep 6, 2010
    Messages:
    1,333
    Location:
    N.E. England
    I'll be posting this on both forums, so excuse the duplication!

    Thank you for the comments of support while we've been offline & since we came back. It's much appreciated. However I have to admit all I've been able to do is question, investigate, question again & get very stressed! :(

    Some background.

    We're with the same hosting service but it has been under new ownership since late last year. A bit like our old forum package, it was a small operation but very personable & bespoke. The owner was also part owner of the old forum software & specialised in hosting that platform. It was a case of first name terms & a package tuned to our needs rather than selected from the options shown on their website.

    Last year the owner retired & sold up to a bigger outfit. Apart from the relationship & ticket system there were no changes. Previously our VPS was in a rented rack in a data centre in Las Vegas. The new owners have their own data centres & last weekend the servers were physically moved some 300 miles to their data centre in Tempe, Arizona. The servers were opened & cleaned for re-racking in what is a pretty sterile environment & that is where things went wrong for us.

    Initially I was told the situation may be so bad the data would not be recoverable. I was later told there were issues with the VPS but the data could be seen & seemed ok & would be moved to a new VPS & the issue (with the kernel) would be resolved.

    Although we've come through it & I have no doubt someone has been working very hard to fix this, communications have been sparse during the process. I have spoken with more experienced & technically knowledgeable admins & like me they would expect full backups to have been made prior to the move & downtime to have been minimal.

    I therefore have questions to ask of our host & decisions to make about our future with them. They have not responded to my questions about backups during this process. Even if they make routine backups as the previous owner did, I'm concerned that in this instance no specific backup seems to have been done prior to this move & that restoration of service took so long when a fault occurred. I work in IT (non-technical) & this seems plain wrong to me. I want to know what their backup routine is.

    Historically, I used to run regular backups & the previous host owner ran a weekly backup & daily incremental backups as standard. It was part of the TOS. I was comfortable with this as, with my level of tech knowledge, running backups was a chore. The only time I actually used them was to set up duplicate test sites & I always found that difficult & often fell back on our host doing the database restore for me. They could do in minutes what was taking me hours. From talking with them & subsequently with 3rd parties I employed in the platform migration, I learnt that there were easier ways of doing it that were way above my skill level. Ultimately, responsibility for backups rests with me, regardless of what the host maintains for disaster recovery. Due to the recent platform change & ongoing background work I do have recent backups of CPL. My own backup of CCS is much older although our host should have one a max of 1 week old. I don't have the 1 to 1 relationship I had with the host before & likely never will as we are now dealing with a much bigger concern so backup strategy is going to have to change.

    It's all part of the journey as we get bigger.

    Now both sites are on a mainstream platform, things are easier, more secure & more stable. Because it's a major platform, support is more readily available. I've had good 3rd party support since the migration of CCS & continued with that for the migration of CPL.

    On an ad hoc, case by case basis we now have a server support tech, Matt, on board. He has done some good work for us already & is well regarded by the many sites that use his services. He is also well versed in our forum platform. I'll be asking him to look over the VPS remotely today to confirm all is well & make a system wide backup. The way he does it will not involve any downtime or closure. Going forward we'll be setting up an automated backup process that moves the backup to a different server altogether, unrelated to our host, whoever they may be in future.
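
    For anyone curious what an automated off-site backup of that kind can look like, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the process Matt will actually set up: the function name make_backup, the "forum-backup-" file prefix, the folder layout & the retention count of 7 are all made up for the example, & the "offsite" folder simply stands in for storage on a separate server (for instance a remotely mounted directory).

```python
import shutil
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def make_backup(source_dir: Path, staging_dir: Path, offsite_dir: Path, keep: int = 7) -> Path:
    """Archive source_dir, move the archive to offsite_dir, and prune old archives.

    All names here are hypothetical; offsite_dir is assumed to be storage
    that lives on a different server from the one being backed up.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")

    staging_dir.mkdir(parents=True, exist_ok=True)
    offsite_dir.mkdir(parents=True, exist_ok=True)

    # A UTC timestamp in the file name makes archives sort oldest-to-newest.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    archive = staging_dir / f"forum-backup-{stamp}.tar.gz"

    # Build the compressed archive locally first.
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source_dir, arcname=source_dir.name)

    # Copy it off the machine, check the copy is complete, then drop the local one.
    offsite_copy = offsite_dir / archive.name
    shutil.copy2(archive, offsite_copy)
    if offsite_copy.stat().st_size != archive.stat().st_size:
        raise OSError(f"incomplete copy of {archive.name}")
    archive.unlink()

    # Keep only the newest `keep` archives on the off-site storage.
    backups = sorted(offsite_dir.glob("forum-backup-*.tar.gz"))
    for old in backups[:-keep]:
        old.unlink()

    return offsite_copy
```

    Run from a scheduler once a day, something like this would leave a rolling week of archives on storage the host has no hand in. A real setup would also need a consistent database dump & a restore test, which this sketch does not attempt.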

    We've also had a coder, Ant, on board since the migration of CCS. We wouldn't be where we are on both forums without his work. He's still doing some work behind the scenes on CPL. Unfortunately he is moving on once that work is done. Going forward I have engaged the services of a new coder, Jay, who'll be working with me on new features for both sites putting my ideas into practice & helping out with any issues with the forum platform that may occur that I can't deal with.

    So as well as myself, Christer (CPL & CCS) & Trevor (CCS) on the bridge we now have Ant (for now), Jay & Matt helping out when needed in the engine room.
     
  2. Tony Press

    Tony Press Australia Subscriber

    Offline
    Joined:
    Aug 28, 2012
    Messages:
    11,026
    Location:
    Stinkpot Bay, Howden, Tasmania, Australia
    Great work, Ross. I reckon that it must have been very stressful.

    You deserve a big hand, and a good break.

    Cheers

    Tony
     
  3. AussiePete

    AussiePete United States Subscriber

    Offline
    Joined:
    Nov 21, 2015
    Messages:
    3,648
    Location:
    Toowoomba Australia
    Well done Sir, our site is in good hands. I say that with some experience, past Tech Manager of Europe for a major building management systems controls manufacturer and distributor.

    Cheers
    Peter
     
  4. onafest Italy

    Offline
    Joined:
    Feb 3, 2014
    Messages:
    73
    Location:
    SICILY
  5. phaedrus42

    phaedrus42 Subscriber

    Offline
    Joined:
    Jun 14, 2014
    Messages:
    2,056
    Good work getting it back up, guys, but I'm still having timeout problems, especially when posting. This is worse with Chrome than with Firefox. I have a stable 20Mb/s ADSL connection, and no issues with other websites. It has taken me more than 10 minutes and many clicks on the "Post Reply" button to get this comment posted after writing it.
     
  6. Tony Press

    Tony Press Australia Subscriber

    Offline
    Joined:
    Aug 28, 2012
    Messages:
    11,026
    Location:
    Stinkpot Bay, Howden, Tasmania, Australia
    @phaedrus42

    I think Ross is off on a well earned break and out of telecommunications range.

    This site and CCS are being a bit slow, but I haven't had a timeout problem (Chrome on Apple). I assume it's because of the upload problem that almost saw the whole thing disappear.

    I'll tag @Carlsson, but I think we will have to wait a couple of weeks for Ross to return to the fold.

    Cheers

    Tony
     
  7. shagratork

    shagratork Founder Member, R.I.P. Subscriber

    Offline
    Joined:
    Sep 6, 2010
    Messages:
    361
    Location:
    Durham, N.E. England
    It seems that different people are having different experiences.
    I log on to CPL and CCS a number of times each day.
    Since the sites have been working again I have had no timeout problems and no observable slowness.

    I will be able to check posting problems when I post this.
     
  8. shagratork

    shagratork Founder Member, R.I.P. Subscriber

    Offline
    Joined:
    Sep 6, 2010
    Messages:
    361
    Location:
    Durham, N.E. England
    Done. No posting slowness at all.
     
  9. phaedrus42

    phaedrus42 Subscriber

    Offline
    Joined:
    Jun 14, 2014
    Messages:
    2,056
    I fully agree that Ross has earned his break. Seeing that other users are not also having these issues, I've passed it on for investigation to my ISP.
     
  10. Tony Press

    Tony Press Australia Subscriber

    Offline
    Joined:
    Aug 28, 2012
    Messages:
    11,026
    Location:
    Stinkpot Bay, Howden, Tasmania, Australia
    In my case, I might do the "hard boot" and see if that fixes my perceived slowness.

    Cheers

    Tony
     
  11. phaedrus42

    phaedrus42 Subscriber

    Offline
    Joined:
    Jun 14, 2014
    Messages:
    2,056
    It's a Gentoo Linux system, so the only reason I have ever wanted to restart it is after compiling a new kernel, or for hardware upgrades. Up-times of several years are not unusual for these systems.
     
  12. shagratork

    shagratork Founder Member, R.I.P. Subscriber

    Offline
    Joined:
    Sep 6, 2010
    Messages:
    361
    Location:
    Durham, N.E. England
    @phaedrus42 @Tony Press
    I did not mean to imply that there are no problems with timeouts and sluggishness.
    I was only reporting that so far I have not encountered them.
     
  13. phaedrus42

    phaedrus42 Subscriber

    Offline
    Joined:
    Jun 14, 2014
    Messages:
    2,056
    No worries, Trevor :content: If some people experience issues and others do not it gives us techies more data to work with to pinpoint the problem. I suspect it may be a router or caching proxy upstream from me.
     
  14. Carlsson

    Carlsson Sweden Admin/Founder Member

    Offline
    Joined:
    Sep 6, 2010
    Messages:
    3,960
    Some report that the sites have been slower than they should be, but I haven't had any problems myself.
    Not with time-outs either, but I haven't been on much this weekend, so could easily have missed a sluggish period.

    Maybe not a couple of weeks, but at least one more.
    I think he needs the break, but I imagine that he will check the sites out as soon as he gets back, and if it's just a bit sluggish from time to time, I think we can live with it for another week.
    But please report here anyway if you feel that something is wrong, and that it's not related to your own systems. It's good to have some stats.
     
  15. phaedrus42

    phaedrus42 Subscriber

    Offline
    Joined:
    Jun 14, 2014
    Messages:
    2,056
    Interesting that both Jeff and I have experienced much better results and fewer problems on the site using Firefox rather than Chrome.
     
  16. Nils Stephenson

    Nils Stephenson Founder Member

    Offline
    Joined:
    Sep 9, 2010
    Messages:
    3,378
    Location:
    Copenhagen, Denmark
    Everything works but it is slow. I use Chrome.
     
  17. spiritburner

    spiritburner Admin

    Offline
    Joined:
    Sep 6, 2010
    Messages:
    1,333
    Location:
    N.E. England
    Hi all - I still have some comms. I am seeing higher server loads/cpu usage since we were moved onto a new VPS. There are also other issues I'm not happy with. From mid June I'll be working with Matt to resolve them & improve the site to a higher level of performance & service than we had before. The downtime will be minimal & mainly around the refresh of DNS.
     
  18. Tony Press

    Tony Press Australia Subscriber

    Offline
    Joined:
    Aug 28, 2012
    Messages:
    11,026
    Location:
    Stinkpot Bay, Howden, Tasmania, Australia
    :thumbup:. Thanks, Ross.

    No sweat.

    Tony
     
  19. JEFF JOHNSON

    JEFF JOHNSON United Kingdom Subscriber

    Offline
    Joined:
    Nov 28, 2010
    Messages:
    16,572
    Location:
    Shetland Islands UK..
    Hello Ross, thanks for the update!:thumbup:

    Since both websites have been back online I have experienced some problems with them. The performance has been, and still is, a bit sluggish, and the sites have been up and down like a fiddler's elbow: sometimes both sites have been down, and at other times one or the other has been down.

    I have noticed differences in performance between browsers, but hopefully it will all get sorted through time.

    I have not had those problems with any other websites, Jeff.
     
  20. AussiePete

    AussiePete United States Subscriber

    Offline
    Joined:
    Nov 21, 2015
    Messages:
    3,648
    Location:
    Toowoomba Australia
    I'm accessing this site with an iPad. To date I've not experienced any issues. The site seems to work well and I have not noticed any lag or speed issues. I also access the stove site with the same results.

    Cheers
    Peter
     
  21. James

    James Subscriber

    Offline
    Joined:
    Jan 5, 2011
    Messages:
    1,152
    Running like a dog for me. Taking at least 20-30 seconds to load each page. Firefox browser.
     
  22. shagratork

    shagratork Founder Member, R.I.P. Subscriber

    Offline
    Joined:
    Sep 6, 2010
    Messages:
    361
    Location:
    Durham, N.E. England
    I use Firefox on an iMac.
    I have no problems at all on CPL and CCS.
     
  23. JEFF JOHNSON

    JEFF JOHNSON United Kingdom Subscriber

    Offline
    Joined:
    Nov 28, 2010
    Messages:
    16,572
    Location:
    Shetland Islands UK..
    The problems which I mentioned in my previous post are still occurring, and using different browsers no longer makes any difference.
     
  24. spiritburner

    spiritburner Admin

    Offline
    Joined:
    Sep 6, 2010
    Messages:
    1,333
    Location:
    N.E. England
    Hi Jeff. Any performance issues outstanding from the recent disaster will hopefully be eliminated shortly & performance improved, even beyond what we had prior to that. I'll post more updates as I have them.
     
  25. JEFF JOHNSON

    JEFF JOHNSON United Kingdom Subscriber

    Offline
    Joined:
    Nov 28, 2010
    Messages:
    16,572
    Location:
    Shetland Islands UK..
    Hello Ross, that's fair enough!:thumbup:
     
  26. phaedrus42

    phaedrus42 Subscriber

    Offline
    Joined:
    Jun 14, 2014
    Messages:
    2,056
    Hi Ross, just to let you know that I'm still experiencing lags in page loading. There seems to be a delay between requesting a page and the response. Once the response comes the content displays fast enough. Also, it has happened at least twice that I post responses which look to have been accepted but then do not show in the forum. If I go back later I then see half the post in greyed out draft in the editing window. I can then complete the message and post it but again with a delay or failure.
    The above issues occur both with Chrome and Firefox, and only on the forums, nowhere else.

    -Phil
     
  27. spiritburner

    spiritburner Admin

    Offline
    Joined:
    Sep 6, 2010
    Messages:
    1,333
    Location:
    N.E. England
    I reported the sporadic spikes in server load again last night. They are what has been causing the issue since we were put on a new VPS. I'm not satisfied with the server performance or speed of response to tickets. That will be resolved later this week.
     
  28. phaedrus42

    phaedrus42 Subscriber

    Offline
    Joined:
    Jun 14, 2014
    Messages:
    2,056
    Thanks, Ross. Not complaining, but giving you more info/ammo to take them on with :content:
     
  29. spiritburner

    spiritburner Admin

    Offline
    Joined:
    Sep 6, 2010
    Messages:
    1,333
    Location:
    N.E. England
    Cheers. However, they are no longer part of the solution. ;)
     
  30. David Shouksmith

    David Shouksmith India Founder Member

    Offline
    Joined:
    Sep 6, 2010
    Messages:
    8,416
    Location:
    North-East England
