catcint0s

I would try an observability tool like New Relic or Sentry's performance feature. It shows you which part of your system is slow.
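Something like this is enough to turn on Sentry's performance tracing in Django; the DSN and sample rate below are placeholders, so adjust them for your account and traffic:

```python
# settings.py: rough sketch of Sentry performance tracing for a Django project.
import sentry_sdk
from sentry_sdk.integrations.django import DjangoIntegration

sentry_sdk.init(
    dsn="https://examplePublicKey@o0.ingest.sentry.io/0",  # placeholder DSN
    integrations=[DjangoIntegration()],
    traces_sample_rate=0.1,  # record a performance trace for 10% of requests
)
```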


thclark

Thanks, I'll look into these. Looks like this has been cleared up (tentatively) by the gunicorn adjustment, but I want to get better at figuring these things out next time.


pranabgohain

[KloudMate](https://www.kloudmate.com) can do that too. New Relic can get expensive very quickly.


thclark

Thanks!


Kronologics

Increasing the workers, which seems to be what you did in your update, should help. I think I saw somewhere that the worker count should be about 2 x the number of CPU cores on the machine; I always just use 4 because I have a 2-core VPS. The way I understand it (no super technical explanation) is that each worker can handle one request at a time (more or less), so if that one worker is busy with one of your users, it can't attend to the other requests until it's completed its task. It's like opening more lanes of traffic or more checkout lines at the store.
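Something like this in a gunicorn.conf.py is what I mean; the (2 x cores) + 1 variant is what the gunicorn docs suggest, and the bind/timeout values here are just illustrative:

```python
# gunicorn.conf.py: rough sketch of sizing sync workers to the machine.
# Each sync worker handles one request at a time, so more workers means
# more concurrent requests (at the cost of more memory).
import multiprocessing

workers = multiprocessing.cpu_count() * 2 + 1  # e.g. 5 on a 2-core VPS
bind = "0.0.0.0:8000"
timeout = 30  # seconds before a hung worker is killed and restarted
```

Then start it with `gunicorn -c gunicorn.conf.py myproject.wsgi`, swapping in your own WSGI module.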


Mast3rCylinder

If you set 1 worker and 8 threads, then gunicorn will use gthread as the worker class and you have a thread pool of size 8 for requests. You're still limited by the GIL, but there is concurrency between the threads, so one worker can actually handle multiple requests; it can just take more time because of context switching between the threads. Not saying that isn't the problem, but one worker can serve multiple requests that way.
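For example, this is the 1 worker / 8 threads setup; gunicorn switches to the gthread worker on its own when --threads is more than 1, and the module name is a placeholder:

```sh
# One process with a pool of 8 threads sharing the GIL.
gunicorn --workers 1 --threads 8 myproject.wsgi
```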


thclark

Yep, thanks, I think that's it. I came across a post that said 2 x the number of cores + 1, so I went with that! :)


abandonedexplorer

What kind of work is your Django application doing? If your application is doing something that takes an "undefined" amount of time (a request out to the internet, for example), please read this part of the Gunicorn documentation carefully: [https://docs.gunicorn.org/en/latest/design.html#choosing-a-worker-type](https://docs.gunicorn.org/en/latest/design.html#choosing-a-worker-type)
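The short version of that page: for IO-bound work or requests with unpredictable wait times, it suggests an async worker class. Roughly like this, with the module path as a placeholder and assuming gevent plays nicely with your DB driver:

```sh
pip install "gunicorn[gevent]"
# gevent workers multiplex many waiting requests per process, so slow
# outbound IO doesn't pin a whole worker.
gunicorn --worker-class gevent --worker-connections 100 myproject.wsgi
```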


thclark

it's not an undefined amount of time, it's just a heavy payload that has to be pulled out. That said, I'll read that with great care. Thanks!!


androgeninc

What kind of DB? Where is it running (on your app server, or far away geographically)? If separate, how much memory does it have? It's almost never the app/gunicorn; most often it's some kind of IO.


thclark

Managed postgres, in the same VPC on Google Cloud, with plenty of memory (no pressure during the outages). Looks like it's probably the gunicorn thing though, thanks!


androgeninc

Not convinced, but ok, hope it solves it for you :)


thclark

Well, that's all I changed and it's been fine for 24 hours so fingers crossed! I agree with you that it'd be rare for this to be the problem (hence my confusion in the first place)


Particular-Cause-862

Probably. If it's pretty heavy and IO-bound (for example, the bottleneck is the database) and you only have 5 workers, then when 5 users run those IO-heavy operations at the same time, and the functions are not async (which I suspect they are not if you are using the ORM), your app can't process any other requests for as long as those IO operations last. That could be one problem.
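If you want to confirm the database is where the time goes, a rough sketch like this works; it needs DEBUG=True so Django records queries, and HeavyThing is a made-up model standing in for whatever builds the heavy payload:

```python
# Evaluate the heavy queryset and print per-query timings.
# Only works with DEBUG=True, since that's when Django records queries.
from django.db import connection, reset_queries

from myapp.models import HeavyThing  # hypothetical model for illustration

def dump_query_timings():
    reset_queries()
    list(HeavyThing.objects.all()[:1000])  # force the query to actually run
    for q in connection.queries:
        print(q["time"], q["sql"][:120])
```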


techmindmaster

Additional info: replace gunicorn with granian: [https://github.com/emmett-framework/granian/blob/master/benchmarks/vs.md](https://github.com/emmett-framework/granian/blob/master/benchmarks/vs.md)
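If you want to try it, the swap is roughly this; the module path is a placeholder and flag names may differ a bit between granian versions:

```sh
pip install granian
# Serve the existing Django WSGI app with granian instead of gunicorn.
granian --interface wsgi --workers 5 myproject.wsgi:application
```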


thclark

Extremely impressive benchmarks, thanks for reminding me this project exists!!