Thoughts on Getting Hacked
In case you haven’t heard, Mecklenburg County government was hacked. As in smoking crater, kick all the power cords out of the wall, run about screaming hacked.
In a nutshell:
- An employee opened a worm-laden attachment. The hackers did a really nice job here - the email appeared to come from another County account and looked very much like normal business.
- That sent a worm looking for vulnerabilities in the network. And it found some. LockCrypt ransomware ate ~10% of the County’s servers before everything was unplugged.
- We got a ransom demand, we didn’t pay it, and we’ve been in recovery mode since.
A lot of things had to go wrong for this to happen. Points where things could have been intercepted include the email system, staff training (opening attachments), PC virus protection, network intrusion detection, server security and virus protection, to name a few.
But each one of these is an ordinary failure. We have to allow human beings and the systems they build to be fallible in ordinary ways. Places that spend many orders of magnitude more on security have been hacked in much worse ways. If there is somebody who deserves a scolding, I haven’t met them yet.
My $5/month Digital Ocean droplet was blissfully unaware of all this stuff and was running fine. But the apps hosted there relied to varying extents on services from a County web server (eaten), which in turn relied on a County Postgres/PostGIS server (Linux - perfectly fine but down as a precaution).
There was the rub - the server was down, so I couldn’t get database backups from it, and I couldn’t get my hands on the server backups because everything was quarantined and IT was busy examining the smoking crater that used to be our infrastructure.
So there I sat, flagellating myself for a week. There were roughly four hours between figuring out what was going on and shutting everything down, and your intrepid scribe took 4.1 hours to think of grabbing a Postgres dump. Had I thought of it in time, some of our primary apps could have had less than an hour of down time. Instead they were down for a week.
Lesson Learned: Have a backup plan for the backup plan. DO NOT TRUST ANYBODY ELSE WITH THIS JOB. The uptime of your apps is your job. I’ll archive important things off the County’s network in the future.
By some miracle, after a week of futile pings I found my Postgres server back online. The web server with the HTTP API was still dead though, so in less than an hour I:
- Grabbed a recent Postgres backup
- Cranked up a Digital Ocean droplet
- Installed Postgres/PostGIS/Address Dictionary
- Restored the data
It should have been more like 30 minutes, but Postgres’ reliability hurt me here - I hadn’t performed a data restore in so long I had forgotten how.
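For future reference (mine included), here is a minimal sketch of that dump-and-restore cycle. The host, user, and database names are made up for illustration; the flags are standard Postgres tooling:

```shell
# Take a compressed, custom-format dump from the (still reachable) server.
# Custom format lets pg_restore do selective and parallel restores later.
pg_dump -h db.county.example -U gis_user -Fc -f gis.dump gis

# On the fresh droplet: install Postgres and PostGIS from Ubuntu packages.
sudo apt-get install -y postgresql postgresql-contrib postgis

# Create the target database, then restore the dump into it.
createdb -U postgres gis
pg_restore -U postgres -d gis --no-owner gis.dump
```

Worth a dry run every year or so - as I learned the hard way, the commands are easy to forget when the database never gives you a reason to need them.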
Lesson Learned: Having an open source software solution helped a lot. I was able to quickly and easily obtain and install all the software I needed on a standard Ubuntu cloud server. Imagine trying that with a proprietary product suite. Hell, our Esri license server was hacked and encrypted.
After that it was a matter of setting up our HTTP API on the Digital Ocean droplet. As that’s open source code sitting on Github, that was no problem.
Lesson Learned: Having up-to-date code sitting on Github was a life saver. Redeployment on another server was a `yarn install` away. Besides making it readily accessible, sharing code with the world makes you think about how one might easily go about installing it.
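Assuming a Node app with its code on GitHub (the repo URL and process manager here are placeholders, not the actual project), the whole redeploy boils down to:

```shell
# Clone the public repo onto the new droplet.
git clone https://github.com/example/http-api.git
cd http-api

# Pull dependencies declared in package.json.
yarn install

# Start it - or hand it to pm2/systemd for something longer-lived.
yarn start
```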
I pointed the proxy on my web server droplet from the County’s HTTP API install to my brand new cloud version, and Bob’s your uncle - I was back in business. I’m not embarrassed to admit that after getting those services back online, I walked to my car, closed the door, and screamed for a bit. Happy screams, not unexpected family visit screams.
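The proxy repoint itself is a one-line config change if the droplet runs nginx (an assumption on my part here - the host names below are invented):

```shell
# In /etc/nginx/sites-available/api, swap the upstream target:
#   proxy_pass https://api.county.example/;    # old County server (down)
#   proxy_pass http://127.0.0.1:3000/;         # new droplet install
# Then validate the config and reload without dropping connections:
sudo nginx -t && sudo systemctl reload nginx
```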
The recovery and post-mortem is ongoing and will probably still be ongoing when the new year arrives. We’re still pretty mortem right now. I might have more thoughts on this later. Here are a couple of parting shots:
- How you recover from a disaster is, in many ways, more important than the disaster itself. If there is to be a reckoning on this later, that’s where I’d expect it to be.
- When you’re hearing a pitch on centralizing and standardizing and optimizing your verticals for win-win and similar stuff, keep repeating the phrase “single point of failure” to yourself. Diversity is a strength, not a weakness. It may be one hell of a basket you’re building, but I don’t want all of my eggs in it.
- IT people have crap jobs. Nobody was thanking them on the 3rd for our infrastructure not being a smoking crater. Be nice to those folks.
- Why are desktop email clients still a thing? Reading email and opening attachments in a sandboxed browser has to be a safer way to go.