Maintenance underway on Saturday

Discuss technical problems and features here
User avatar
emk
Black Belt - 1st Dan
Posts: 1620
Joined: Sat Jul 18, 2015 12:07 pm
Location: Vermont, USA
Languages: English (N), French (B2+)
Badly neglected "just for fun" languages: Middle Egyptian, Spanish.
Language Log: viewtopic.php?f=15&t=723
x 6323
Contact:

Maintenance underway on Saturday

Postby emk » Sat Oct 14, 2017 5:13 pm

Server performance has been terrible Saturday, and I'm troubleshooting. More soon, and I apologize for the inconvenience.
8 x

Re: Maintenance underway on Saturday

Postby emk » Sat Oct 14, 2017 5:21 pm

emk wrote:Server performance has been terrible Saturday, and I'm troubleshooting. More soon, and I apologize for the inconvenience.

Ah, there we go! I switched the entire forum to a brand-new cloud server (using our Terraform scripts, so it took about 10 minutes), and now it's moving along at a nice speedy clip again! I'm not sure what was wrong with the old server, but not even rebooting it was helping any, so it's gone. If you see bad performance again in the coming months, please feel free to mention it here. This forum is supposed to be fast, and if it's not, that's a bug.
14 x

Re: Maintenance underway on Saturday

Postby emk » Sat Oct 14, 2017 5:50 pm

OK, we may have another 15 minutes of downtime while I mess around with fixing this issue as well.
6 x

Re: Maintenance underway on Saturday

Postby emk » Sat Oct 14, 2017 6:06 pm

Looks good! The X-Forwarded-For stuff has theoretically been set up, and I'm testing it now.
3 x

Re: Maintenance underway on Saturday

Postby emk » Sun Oct 15, 2017 12:23 am

OK, I'm sad. :cry: The X-Forwarded-For fix didn't work. I'll need to fix it in the web server, not PHP.
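
For anyone wondering what that header is for: when a site sits behind a proxy or load balancer, every request appears to come from the proxy's own address, and the proxy records the original visitor in the X-Forwarded-For header as a comma-separated list. Here is a rough Python sketch of how an application might recover the visitor's address. It is only an illustration, not the forum's actual setup: the function name `client_ip`, the `trusted_proxies` parameter, and the example addresses are all assumptions.

```python
import ipaddress


def client_ip(x_forwarded_for: str, remote_addr: str, trusted_proxies: int = 1) -> str:
    """Return the best guess at the real client address.

    x_forwarded_for: raw header value, e.g. "203.0.113.7, 10.0.0.5"
    remote_addr: address of the socket peer (the nearest proxy)
    trusted_proxies: number of proxies we control in front of the app (assumed)
    """
    # Build the full hop chain: header entries (oldest first) plus the peer.
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    hops.append(remote_addr)
    # Skip the hops contributed by our own proxies, counting from the right.
    # Anything further left was supplied by the client and could be forged.
    index = max(len(hops) - 1 - trusted_proxies, 0)
    candidate = hops[index]
    ipaddress.ip_address(candidate)  # raises ValueError if it is not an IP
    return candidate


# One trusted proxy at 10.0.0.5 forwarding a request from 203.0.113.7.
print(client_ip("203.0.113.7", "10.0.0.5"))
```

Counting trusted hops from the right matters because a client can put anything it likes at the left end of the header.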
0 x

User avatar
Adrianslont
Blue Belt
Posts: 827
Joined: Sun Aug 16, 2015 10:39 am
Location: Australia
Languages: English (N), Learning Indonesian and French
x 1936

Re: Maintenance underway on Saturday

Postby Adrianslont » Sun Oct 15, 2017 1:38 am

emk wrote:OK, I'm sad. :cry: The X-Forwarded-For fix didn't work. I'll need to fix it in the web server, not PHP.

I have no idea what that means but I appreciate that you do and that you are giving your time so that we may quibble over our opinions and share our language learning experiences. You are a legend.
10 x

Re: Maintenance underway on Saturday

Postby emk » Mon Oct 16, 2017 10:28 am

Somebody is hitting the site with a ton of traffic, and the site is not responding gracefully. Check this out:

[ec2-user@ip-172-31-19-62 ~]$ uptime
 10:12:52 up 1 day, 16:56,  1 user,  load average: 152.63, 153.32, 153.24


"load average: 152" means that we've got roughly 150 processes competing for 4 CPUs, which is why everything takes forever.
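
To put a number on that, here is a small Python sketch that pulls the load averages out of the `uptime` line above and divides by the CPU count. The helper names are made up for illustration, and the CPU count of 4 is taken from the post. On Linux the load average also counts tasks stuck waiting on disk, so treat the result as a rough indicator rather than an exact queue length.

```python
import re


def parse_load_averages(uptime_line: str) -> tuple:
    """Extract the 1-, 5- and 15-minute load averages from `uptime` output."""
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load average found in: " + uptime_line)
    one, five, fifteen = (float(g) for g in match.groups())
    return one, five, fifteen


def load_per_cpu(load: float, cpu_count: int) -> float:
    """Load average divided by CPU count; above 1.0 means work is queueing."""
    return load / cpu_count


line = " 10:12:52 up 1 day, 16:56,  1 user,  load average: 152.63, 153.32, 153.24"
one, five, fifteen = parse_load_averages(line)
print(round(load_per_cpu(one, 4), 1))  # 38.2 tasks per CPU
```

Anything much above 1.0 per CPU means requests are waiting in line, and here all three averages are about equally bad, so the overload has been sustained for at least 15 minutes.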

What's happening is that:

  1. We're getting a huge number of inbound requests, so...
  2. The Apache webserver is spinning up more copies of itself to handle the traffic, but...
  3. Eventually it runs out of RAM, and so...
  4. Everything becomes hugely slow, therefore...
  5. Go to 2, and repeat until the server dies.

The solutions are some combination of:

  1. Teach Apache not to start so many copies, and just "shed" the traffic with errors instead, which would at least break the loop above. This is more annoying than it should be, because I'm using the official PHP distribution for Docker, which is apparently garbage in this regard (and several others).
  2. Figure out who's hitting our site with lots of traffic and block them.
  3. Pay for a bigger server.
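
As an illustration of what "shedding" traffic means in option 1, here is a minimal Python sketch: cap the number of requests in flight and answer everything beyond the cap with an immediate error instead of starting more work. This is not how Apache is configured (the analogous cap there is the `MaxRequestWorkers` directive); the `LoadShedder` class and its numbers are hypothetical.

```python
import threading


class LoadShedder:
    """Admit at most `max_in_flight` requests; reject the rest immediately."""

    def __init__(self, max_in_flight: int) -> None:
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def handle(self, work):
        """Run `work()` if a slot is free, else return a 503 without queueing."""
        if not self._slots.acquire(blocking=False):
            return 503, "Service Unavailable"
        try:
            return 200, work()
        finally:
            # Always free the slot, even if `work` raised.
            self._slots.release()


shedder = LoadShedder(max_in_flight=1)
# The outer request holds the only slot, so the nested one is shed.
status, inner = shedder.handle(lambda: shedder.handle(lambda: "page"))
print(status, inner)
```

The point of failing fast is that a rejected request costs almost no RAM or CPU, so the server stays responsive for the requests it does accept.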
0 x

Re: Maintenance underway on Saturday

Postby emk » Mon Oct 16, 2017 10:54 am

emk wrote:Figure out who's hitting our site with lots of traffic and block them.

The "Yandex" search engine indexing bot is just hammering the site, with zero sense of politeness. I'm going to try blocking it using robots.txt and see if it takes a hint.
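
For reference, a robots.txt ban can be sanity-checked offline with Python's standard `urllib.robotparser`. The rules below are a hypothetical example, not the forum's actual file, and they assume the crawler identifies itself with a token containing `Yandex`. Keep in mind that robots.txt is only a request: it works only if the bot chooses to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ban Yandex's crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: Yandex
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if `user_agent` may fetch `path` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


print(is_allowed(ROBOTS_TXT, "YandexBot", "/viewtopic.php"))  # False
print(is_allowed(ROBOTS_TXT, "Googlebot", "/viewtopic.php"))  # True
```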

I still need to tune the PHP Apache image to only start a small number of servers, though, as a longer-term measure.
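
A common rule of thumb for picking that number is to divide the RAM left over for Apache by the memory one Apache+PHP process uses. Here is a back-of-the-envelope Python sketch; the function name and every figure in it are made up for illustration, not measured on this server.

```python
def max_workers(total_ram_mb: int, reserved_mb: int, per_worker_mb: int) -> int:
    """Largest worker count whose combined memory fits in available RAM."""
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    usable = total_ram_mb - reserved_mb
    # Never return less than one worker, even on a tiny machine.
    return max(usable // per_worker_mb, 1)


# Hypothetical numbers: 8 GB box, 2 GB kept for the OS and other services,
# about 64 MB resident per Apache+PHP process.
print(max_workers(total_ram_mb=8192, reserved_mb=2048, per_worker_mb=64))  # 96
```

Capping the worker count below this figure keeps the machine out of swap, which is what breaks the slowdown loop described earlier.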
2 x

Re: Maintenance underway on Saturday

Postby emk » Mon Oct 16, 2017 11:56 am

OK, Yandex is banned, which helped some. But I'm still seeing terrible performance. :-(

Working notes:
  • Performance is slow even when we're not running too many Apache instances. It's extremely fast for a few minutes after restarting Apache (and maybe for a longer period after replacing the machine last night).
  • The slow requests are (1) page rendering and (2) image fetches, but static pages are relatively fast.
  • Normally, this would point the blame at either the database (which looks really fast, however) or the EBS disk (which also looks good).
  • We're not out of magic "burst mode" credits, which is the Amazon technology we use to keep this site running on a shoestring. Specifically, we have a full set of DB and disk credits, so no problems there. CPU credits are dangerously low but not exhausted (this morning at least), so we should be running at full CPU speed.
So I'm a bit confused, to say the least. I've ruled out the obvious culprits, and I'm addressing various issues I know about. But this may take a few days to figure out.
2 x

User avatar
zenmonkey
Black Belt - 2nd Dan
Posts: 2528
Joined: Sun Jul 26, 2015 7:21 pm
Location: California, Germany and France
Languages: Spanish, English, French trilingual - German (B2/C1) on/off study: Persian, Hebrew, Tibetan, Setswana.
Some knowledge of Italian, Portuguese, Ladino, Yiddish ...
Want to tackle Tzotzil, Nahuatl
Language Log: viewtopic.php?f=15&t=859
x 7030
Contact:

Re: Maintenance underway on Saturday

Postby zenmonkey » Mon Oct 16, 2017 4:08 pm

Just for info - I got a lot of "504 Gateway Timeout" errors this morning.
1 x
I am a leaf on the wind, watch how I soar

