The following relates some things that happened at the beginning of March, on 7-8 March, and their effects a month later, when I thought everything had been fixed.
The fragility of software is not due to the fact that it might lack something. No, it's due to the fact that software is becoming extremely complex.
Software has to work on a broad array of hardware architectures, and on a broad array of operating systems, each of which runs on yet another architecture.
Linux is the best example of software that is robust.
But what happens when software isn't robust? When there are new features added, when there are security issues which require a fix?
Then you issue an update!
And so our journey begins. An update can either break something or enhance something. Having trusted Linux systems for the past 10 years, I can say that an update/upgrade rarely breaks the system.
If something goes wrong it's a minor point and you can usually fix it or it won't bother you too much.
I've trusted Debian, Ubuntu, Fedora and Linux Mint, each had a portion of my time.
In 2016 I decided to go once again with Debian as my desktop system.
Debian is great from the security standpoint: it ships older versions of "well tested" programs.
Version 8 was a great enhancement. Of course it wouldn't come close to the new enhancements in Fedora or Linux Mint, but hey, it works!
Forensic Analysis Expert
I had a few moments when I bricked my system because of some settings, or when I was playing around with various things.
At one point I decided to play forensic analysis hero. Without knowing it, I broke my whole partition table with a fat-fingered /sda instead of /sdd.
It's at such a moment in life that you realize you're just one week away from the monthly disaster backup.
Command Line Fu to the rescue: I rebuilt it based on metadata from each disk plus a little bit of luck.
Remember kids, forensic analysis should be done in virtual machines.
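A small guard can make the /sda versus /sdd kind of slip harder to commit. This is only a sketch, not what I actually had in place: `confirm_device` is a made-up helper name, and it assumes `lsblk` (from util-linux) is available to show what lives on the disk before anything destructive runs.

```shell
# Hypothetical guard: show the target disk, then require the user to
# retype its path before a destructive command is allowed to run.
confirm_device() {
    target="$1"
    # Informational only: list size, model and mount points of the target.
    lsblk -o NAME,SIZE,MODEL,MOUNTPOINT "$target" 2>/dev/null || true
    printf 'Retype the device path to confirm: ' >&2
    read -r answer
    # Succeeds only when the retyped path matches exactly.
    [ "$answer" = "$target" ]
}

# Usage (not run here; /dev/sdd is just the example from the story):
#   confirm_device /dev/sdd && echo "ok, going ahead with /dev/sdd"
```

Retyping the path is deliberately annoying: a one-letter typo has to be made twice before it does any damage.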
Then a few months later I decided to do an upgrade to version 9.
Everything worked fine, except that my setup had 3 disks: 1 SSD, which held the / partition and another fast partition, and 2 disks in RAID 0 managed with mdadm. I had also manually set up encryption for my home folder.
After the upgrade I got a rare message: "You need to wait for 1 min and 30 seconds.." After the waiting time was done I could hit CTRL+D to go into EMERGENCY MODE.
Oh crap! What happened?
Well, looking at the logs, it seems that something related to /etc/fstab couldn't mount the /dev/md0 device.
But after I typed CTRL+D I received the login screen.
Even after looking into the logs and trying to figure out what went wrong, I still to this day can't understand WTF!?
Anyway, I fixed the 90-second wait by editing /etc/fstab and adding x-systemd.device-timeout=2 to the mount options.
The error still existed, but everything worked fine. Guess it's a minor bug.
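For illustration, here is a rough sketch of adding such an option without hand-editing the file. `add_mount_option` is a hypothetical helper, and the /home mount point in the usage comment is an assumption (my real fstab entry isn't shown here). It only prints the modified file to stdout, so nothing is overwritten by accident.

```shell
# Hypothetical helper: print FILE with OPTION appended to the options
# field (4th column) of the entry whose mount point is MOUNTPOINT.
# Note: the touched line is re-joined with single spaces.
add_mount_option() {
    file="$1"; mountpoint="$2"; option="$3"
    awk -v mp="$mountpoint" -v opt="$option" '
        # Leave comments and short or blank lines untouched.
        /^[[:space:]]*#/ || NF < 4 { print; next }
        # Append the option only if it is not already present.
        $2 == mp && index("," $4 ",", "," opt ",") == 0 { $4 = $4 "," opt }
        { print }
    ' "$file"
}

# Usage (not run here; review the result before copying it back):
#   add_mount_option /etc/fstab /home x-systemd.device-timeout=2 > /tmp/fstab.new
```

A unitless value for x-systemd.device-timeout is read as seconds, so 2 means systemd gives up waiting for the device after two seconds instead of ninety.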
Fast forward one year, to 2018.
Don't dig too deep for Fossils
I was working on an Elixir project. I updated Fossil from 1.37 to 2.5 by downloading a single self-contained executable from the Fossil website.
Everything seemed to work fine, except that the clone and push commands gave a Segmentation Fault.
I started digging in and found out that some library on my Debian crashed it.
I tried it on another Debian 9 and the binary worked flawlessly. Great. It's my Debian that has issues.
I decided it was time to check for updates. I installed them all. Rebooted. Same error.
CTRL+D didn't seem to react. I could, however, access another TTY, but couldn't start the Xorg server.
Looking in the logs I found some weird errors. I decided: hey, let's do a dist upgrade.
Reboot.. This time, whenever I hit CTRL+D or tried to go to a TTY, my screen began flickering.
Great God NO! Anyway, I had multiple errors and tried to fix multiple things, but after each step I got a new error.
Reinstall Me Please!
Reinstalling Linux is a breeze. I've always had my / and /home partitions kept separate and I encourage you to do the same.
I usually reinstall Linux only if something goes wrong with my fiddling, or if I decide to use another distro as my main one BEFORE I've tested it out in a virtual machine.
I decided to give Linux Mint another try.
During the installation process I saw I couldn't select my RAID disks.
I then looked in GParted. The array was still there.
Then I decided it was time to do it with mdadm.
But mdadm wasn't installed. Another WTF! Why wouldn't they ship it?
sudo apt-get install mdadm
sudo mdadm --examine --scan
ARRAY /dev/md/0 metadata=1.2 UUID=d8c71eda:0f21c7b4:2c5e0ced:71537788 name=zamolxes:0
sudo su -c "mdadm --examine --scan > /etc/mdadm/mdadm.conf"
The mdadm --examine --scan command looks at all disks and scans for arrays. If it finds something, it will output it. So whenever you brick your system, your data is still there. Don't start formatting everything!
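Since that redirect replaces mdadm.conf wholesale, it may be worth checking that the scan actually found an array first. A minimal sketch, assuming the scan output has the ARRAY line format shown above; `array_uuids` is a made-up helper name, not an mdadm feature.

```shell
# Hypothetical helper: read `mdadm --examine --scan` output on stdin,
# print the UUID of every ARRAY line, and fail if there is none.
array_uuids() {
    awk '$1 == "ARRAY" {
             for (i = 2; i <= NF; i++)
                 if ($i ~ /^UUID=/) { print substr($i, 6); found = 1 }
         }
         END { exit (found ? 0 : 1) }'
}

# Usage (not run here):
#   sudo mdadm --examine --scan | array_uuids \
#       || echo "no arrays found, leaving mdadm.conf alone" >&2
```

If the helper prints nothing and fails, writing the empty scan result into /etc/mdadm/mdadm.conf would only wipe whatever configuration was there.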
Great, it still exists. Before I started the installation I decided to back up everything.
I had some external backup HDDs, and considering that I was running RAID 0, where anything could happen, I still had my data there.
To be on the safe side, let me back up everything again.
I went through a little hassle in remounting everything with ecryptfs.
Then I hit another roadblock when trying to back up: the sheer size of the data and the lovely NTFS.
This was the last time I'll ever use NTFS on external backup HDDs for "interoperability" with Windows machines.
Ext4 all the way baby (or something else Linux can work with easily).
Without a proper recent backup (2 months of changes missing), I set out to reinstall, and I said: if the installation bricks my encrypted mdadm RAID, then it's a sign I should become a lumberjack, painter, priest or anything else except a programmer/Linux enthusiast.
All our operators are currently busy bricking your installation, please wait until we brick it again
I reinstalled Linux Mint, tried to login.
POOF Linux Mint was endlessly starting up. It's a sign, I'll have to let my beard grow.
Then I thought the distro probably didn't have my settings nor mdadm. I couldn't log in to a TTY either, since there was no home directory.
Eventually got a recovery shell, yep, no mdadm.
Rebooted. Created /home/myuser as root, logged in.
And then I installed mdadm and generated the mdadm config again:
sudo su -c "mdadm --examine --scan > /etc/mdadm/mdadm.conf"
It was time to find out whether it would work or fail.
After the reboot, I logged in with my user and voila, ecryptfs worked automatically.
I ignored the fact that Cinnamon crashed and went to fallback mode endlessly, probably due to my old Cinnamon settings.
This was later fixed with:
dconf reset -f /org/cinnamon/
Yes, my data is still there! Thank you Super Robust Linux features!
Now for the fun part. I need to back up everything again. This means copying things from 3 external HDDs to my disk and vice versa, then formatting each one of them to Ext4.
All this for a binary that had segfaults!
I could have settled for the old 1.37 version shipped with Debian. It worked, but why settle for an older version if my VPS has the newer one?
I love Linux for its modularity and for the surprises it brings.
I'm pretty sure something was rotten and it wasn't the binary's fault but something else in the OS or some library.
I eventually had luck reinstalling fossil and this worked without any segmentation fault.
fossil clone https://core.tcl.tk/tcl/ tcl.fossil
If you don't know what Fossil is, I encourage you to go to https://fossil-scm.org/index.html/doc/trunk/www/index.wiki
Download fossil and play around with it.
Is everything working?
99% of the things are working as expected.
One of the things that persists in FAILING is the suspend function. I had the same problem back in 2009.
It's now 2018, and this is annoying to the point that I remember why I had chosen Debian in the first place: because there it just worked out of the box.
On a workstation it's a MUST to have suspend. On my laptop, Linux Mint suspend seems to work.
Kernel update to the rescue.
1 month later, another blog update, OpenSSL stuff
One month later I decided to update something that didn't work with the Phoenix Blog.
I also decided to do it late at night before my birthday. Bad decision.
Since I used Distillery to roll out everything, including the Erlang binary distribution and all the packages, I thought everything would work out.
WRONG. It seems that the Debian 9.3 version of OpenSSL is 2 years NEWER than the one shipped with Linux Mint 18.3 (based on Ubuntu Xenial).
I thought I could fix it by compiling the newer version of OpenSSL on my system. Wrong again: the compilation succeeded, but the library had some missing pointers.
So, the keepers of Linux Mint and Ubuntu Xenial never thought to update the OpenSSL version from 2015. I know it's an LTS, but I think OpenSSL is a package that shouldn't lag 3 years.
I decided to do what every professional would do.
Use a staging/development/testing virtual machine for the compilation.
After setting up a Vagrant box with the Debian version I wanted and provisioning it with all the tools needed, I was finally able to update the blog.
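In hindsight, the mismatch could probably have been caught before the release went out by comparing `openssl version` on the build machine and on the server. The sketch below is just that, a sketch: both helper names are made up, and it assumes that for these pre-3.0 OpenSSL releases a difference in the numeric part of the version (ignoring the trailing patch letter) is what signals trouble.

```shell
# Hypothetical helper: pull the version field out of `openssl version` output.
openssl_version_field() {
    awk '{ print $2 }'
}

# Strip the trailing patch letter(s), keeping only the numeric series.
numeric_part() {
    printf '%s\n' "$1" | sed 's/[a-z]*$//'
}

# Succeeds only when both versions belong to the same numeric series.
same_openssl_series() {
    [ "$(numeric_part "$1")" = "$(numeric_part "$2")" ]
}

# Usage (not run here; "myserver" is a placeholder host name):
#   build=$(openssl version | openssl_version_field)
#   server=$(ssh myserver openssl version | openssl_version_field)
#   same_openssl_series "$build" "$server" || echo "OpenSSL mismatch" >&2
```

A failed check doesn't fix anything by itself, but it is a cheap hint to build in a VM that matches the server instead of on the desktop.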
Always have a backup. Even if you already have a backup, make another one BEFORE you start updating your system.
Whenever you update your system, accept the fact that it can brick everything.
Don't spend time figuring out what went wrong. A clean install is the fastest way to solve a bricked system caused by upgrades/updates.
If you use VMs for services/servers etc. you won't lose a thing.
If you dig for fossils you will end up reinstalling your system anyway.
Never roll out an update at night or on the weekend.
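The first lesson above can be sketched as a tiny pre-update habit. This is a minimal example, not my actual backup setup: `backup_dir` is a hypothetical helper that uses plain tar, and it assumes the destination directory is not inside the directory being archived.

```shell
# Hypothetical helper: archive SRC into DEST_DIR as a dated tarball,
# check that the archive can be listed, then print its path.
backup_dir() {
    src="$1"; dest_dir="$2"
    archive="$dest_dir/backup-$(date +%Y%m%d-%H%M%S).tar.gz"
    # Each step runs only if the previous one succeeded.
    mkdir -p "$dest_dir" &&
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" &&
    tar -tzf "$archive" > /dev/null &&
    printf '%s\n' "$archive"
}

# Usage (not run here; both paths are placeholders):
#   backup_dir /home/myuser/projects /mnt/external/backups && sudo apt-get upgrade
```

Chaining the upgrade behind the backup with && means the update simply doesn't start if the backup failed.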