We're back online!
Posted: Sun Oct 28, 2018 3:15 am
Yay! We're back online. And my apologies for the outage.
What went wrong: This site uses HTTPS, which means that your friendly neighborhood script kiddie (or intelligence agency) can't read your password, private messages, etc., by snooping on network traffic. Well, probably the NSA can, but they won't admit it. But in order to secure your connection, this site needs to have a "certificate", which says that we're who we say we are.
Previously, we were getting our certificates from Let's Encrypt, a wonderful non-profit which has done more than anybody else to secure web traffic. They provide free certificates which renew automatically. We used a piece of software called docker-letsencrypt-nginx-proxy-companion to get our certificates from Let's Encrypt. Usually, this worked well—once or twice a year, either rdearman or I needed to reboot the server, but that was all.
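For the curious, here is a rough Python sketch of how one might check how long a site's certificate has left before it needs renewing. This is not the tooling this site uses; the function names and the `example.com` hostname are placeholders made up for illustration, and it relies only on Python's standard `ssl` and `socket` modules.

```python
import socket
import ssl
import time


def days_until_expiry(not_after: str, now: float) -> float:
    """Days between `now` (Unix seconds) and a certificate's notAfter string."""
    expires_at = ssl.cert_time_to_seconds(not_after)
    return (expires_at - now) / 86400


def fetch_not_after(hostname: str, port: int = 443) -> str:
    """Connect over TLS and return the server certificate's notAfter field."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


if __name__ == "__main__":
    host = "example.com"  # placeholder hostname, not this site's
    remaining = days_until_expiry(fetch_not_after(host), time.time())
    print(f"{host}: certificate expires in {remaining:.1f} days")
```

Running something like this on a schedule would, in principle, give a warning before automatic renewal quietly stops working.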
But sometime early yesterday (US time), the certificate software fell over hard. I rebooted the server, and it didn't come back up. So I rebuilt the server from scratch (which takes like 10 minutes). But this meant that we upgraded to new versions of the OS and docker-letsencrypt-nginx-proxy-companion, and everything broke. I fixed three serious problems, but it still wasn't working, and I couldn't see why not. So I dumped docker-letsencrypt-nginx-proxy-companion and installed caddy-docker-proxy, which has like 2,000 fewer moving parts. Unfortunately, this also failed in several different mysterious ways, with no errors.
So I said "Arggghh! No more stupid Docker proxies that talk to Let's Encrypt! I'm going to cough up another $21/month (in addition to the $40/month the site already costs me) and pay for an Amazon Application Load Balancer with a certificate from AWS Certificate Manager!" Now, the downside of this is that it has like a dozen moving parts and it takes hours to set up correctly. The upside is that it's likely to keep working from now until the heat death of the universe, because once Amazon builds something like this, it just keeps running (and very slowly gets cheaper).
So, that finally worked.
The site is mostly back online. However, for now, you must connect using "https:" and not "http:" until I set up a redirect. But right now I need sleep! Once again, my apologies for the unplanned outage.
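As a rough idea of what an "http:" to "https:" redirect involves when a load balancer sits in front of the site, here is a minimal Python sketch. It assumes the load balancer passes along an `X-Forwarded-Proto` header saying which scheme the visitor used (Application Load Balancers do add this header). The function name and the `forum.example.com` hostname are hypothetical, and this is not necessarily how the redirect will end up being done here; the load balancer itself may be able to handle it.

```python
from typing import Mapping, Optional


def https_redirect_target(headers: Mapping[str, str], path: str) -> Optional[str]:
    """Return the https:// URL to redirect to, or None if no redirect is needed.

    `headers` are the request headers as seen by the app behind the load
    balancer; `path` is the request path including any query string.
    """
    # Header names are case-insensitive, so normalize them before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    proto = lowered.get("x-forwarded-proto", "").lower()
    host = lowered.get("host", "")
    if proto == "http" and host:
        return f"https://{host}{path}"
    return None


if __name__ == "__main__":
    request_headers = {"X-Forwarded-Proto": "http", "Host": "forum.example.com"}
    print(https_redirect_target(request_headers, "/index.php"))
```

The web application would then answer with a 301 response whose `Location` header is the returned URL, and serve the page normally whenever the function returns `None`.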
(Also, while this was going on, I had to get rid of about a zillion leaves before an ice storm hit, and deal with a minor plumbing leak.)
Anyway, I'll talk more about this later. But I just wanted to let people know what had been happening.