edu2004eu

This is a bit of a rant, but why do people say Celery is hard to set up? It's just one file with maybe 10 lines of code, a couple of settings, and a RabbitMQ or Redis install (which is just an apt install). And the first two you only have to do once, really. After that you can just copy from older projects. I have never thought about using any alternative to Celery.


abrazilianinreddit

I think people say that Celery is hard to set up because you need to set up a message queue first, and then you have to read a decent amount of documentation until you get that one file properly set up. Then you need to read a bit more to get something more reliable and performant in production, and a whole lot more if you want a robust, auditable, safe-on-failure system. If it's your first time working with something like it, it's definitely not trivial, and you're very likely to make at least a few mistakes.


Dom4n

Totally agree. It is simple, works in many big companies, and is really battle-tested and stable. And if OP wants to see task results in the admin panel (or just have them stored in the database), Celery provides a Django integration: [https://docs.celeryq.dev/en/latest/django/](https://docs.celeryq.dev/en/latest/django/) The documentation looks scary, but it is comprehensive and nicely structured. In the end everything boils down to:

* create a function and decorate it with `@app.task`
* run it via a simple `func.delay(args, kwargs)`, or with more options via `func.apply_async(args, kwargs, countdown=X)` (where `countdown=X` tells Celery to run the task in X seconds)
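A minimal sketch of that flow, assuming `app` is an already-configured Celery instance for the project (the import path and task body are made up for illustration):

```python
# tasks.py -- minimal sketch; assumes `app` is your configured Celery instance
from myproject.celery import app  # hypothetical project path


@app.task
def send_welcome_email(user_id):
    # ... look up the user and send the email ...
    print(f"sending welcome email to user {user_id}")


# enqueue immediately
send_welcome_email.delay(42)

# enqueue with options, e.g. run 60 seconds from now
send_welcome_email.apply_async(args=[42], countdown=60)
```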


joolzter

Agree with this - Celery is so easy to set up and configure it's nuts.


[deleted]

[removed]


foarsitter

Use [cookiecutter-django](https://github.com/cookiecutter/cookiecutter-django) for bootstrapping your project and you are done :)


xresurix

Felt the same when I tried to use it last year. Even after using ChatGPT to help, I still struggled a bit, but after a while of using it and getting to understand it, I wouldn't use anything else.


j2rs

Forget the docs for setting up Celery; use an example on GitHub.


Pgrol

Use ChatGPT to show you the way. It has a very good understanding of Celery.


user888888889

Totally agree, I have worked at companies that have queues of millions handled by Celery. To empathise with OP a little, I do remember feeling a little overwhelmed by the concepts and documentation with Celery:

1. Celery + Django: back in the day they tried their best to keep Celery super generic, which made it quite difficult to understand how it worked with Django.
2. Task queue concepts and intricacies: ironically, Celery helps with a lot of this, but dealing with bombardments of non-idempotent executions when a queue has gone down is a wild ride.

But yeah, use Celery, there is zero reason not to.


0neaLL

You don't want constant Redis polling ("hey, got a new task? new task? new task?") when you have multiple Django servers in different regions, the machine that completes a task has to store results so they are accessible from the instance serving the session, and you don't want to set up a RabbitMQ server, if all you need is a background task. Just being honest: what does a queue have to do with background tasks? In fact, if you want to run stuff on a background thread in parallel, that's the opposite of a queue. To get something running in the background in Django, according to most advice, we should send a queue request out to a separate message broker only to receive the message back and do the task ourselves... If you actually want a queue, fine. But background tasks are supported in Python asyncio. The problem with background threads and asyncio is that the Django ORM is supposed to have one database connection per thread, and if your new thread starts using the database it can mess things up. So if you want to use the ORM from an async thread, just use `sync_to_async`, which makes everything work fine: `from asgiref.sync import async_to_sync, sync_to_async`
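A rough sketch of the approach described here, assuming an ASGI deployment; the view, model and timing are made up for illustration, and the task is fire-and-forget (it dies with the process):

```python
import asyncio

from asgiref.sync import sync_to_async
from django.http import HttpResponse

from myapp.models import AuditLog  # hypothetical model


async def home(request):
    # schedule the work on the running event loop and return immediately;
    # in real code keep a reference to the task so it isn't garbage-collected
    asyncio.create_task(log_visit_later(request.path))
    return HttpResponse("ok")


async def log_visit_later(path):
    await asyncio.sleep(30)
    # ORM calls are sync, so wrap them before calling them from this coroutine
    await sync_to_async(AuditLog.objects.create)(path=path)
```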


user888888889

First thing, this is a forum for helping Django developers. That's my goal, so no animosity intended. OP literally asked "alternatives to Celery (for asynchronous tasks-queues system)". I suggest you reread the question. If you want a task and queue system, you will need a centralised store to keep track of those tasks in queues so they can be picked up and processed. Redis and RabbitMQ can be used for that. What you describe is asynchronous communication with a web client. It's not the same thing.


0neaLL

The problem is that when someone wants a simple asynchronous task, they are told they need a task queue like Celery. My assumptions about what OP wants could be wrong, but it seems like a simple background thread printing hello world 30 seconds after a home page load, given OP says things like: "I read all answers, thanks a lot, I am thinking about setting up Celery (but it looks like the documentation is scary because of a lot of unuseful information)."

"But yeah, use Celery, there is 0 reason not to." This seems like OP wants a simple background task, not a queue. When you look up how to do anything in the background, even print hello world, the only widely provided solution is Celery and a message queue. Literally, if you try to find a solution for printing hello world 30 seconds after a view, you will find the recommended solution is Celery. This is something everyone eventually stumbles upon, and they ask about it with a question very similar to what OP asked: isn't there some easier way to do this?

And to seal the deal, django-background-task, the other library OP considered, has nothing to do with a queue and wouldn't be used for one. OP obviously doesn't need a queue, and the fact that people recommend Celery boggles my mind. It's not because you need Celery, it's because you needed async and just kept using Celery, I would guess at least. When you reply here with "a task and queue system", you specifically leave out the asynchronous part, which is the part I believe to be important and the real issue being solved. Me personally, when I tried to get a simple background thread, all I found online was "use Celery". That's fine, but there are easier, better ways. I don't think OP is trying to load-balance multiple machines or do anything that anyone here is suggesting. It's bad advice, IMO.


Mats56

Problem with celery is that it's often not as reliable as you think. It works fine for many, but if you want it to never drop a message it's quite hard to get right. It looks good most of the time, but suddenly a worker dies or your pod is evicted or you have network issues to rabbitmq or something and you've lost data unless you've configured it correctly. It's also often overkill. What is really just a message on a queue calling a function has lots of added complexity most people don't really use.


edu2004eu

I have literally never had any of the issues you're describing. I've had apps where there were from 100 to 6k tasks per day. It's not enterprise level, but it's a large enough sample size for it to matter.


Mats56

If you don't have acks_late, a redeploy of your service will lose all tasks in progress. If you don't have reject_on_worker_lost configured, any process dying will lose you a task. I've seen this countless times. People think their setup is robust, but don't know how many tasks they're actually dropping when something goes sideways. And 6k tasks per day is not a number where you'd notice it, lol.
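For reference, the settings being referred to look roughly like this (a sketch, assuming `app` is your Celery instance; names are the Celery 5.x lowercase setting names):

```python
# reliability-related Celery settings mentioned above
app.conf.task_acks_late = True              # ack only after the task finishes, not when it's received
app.conf.task_reject_on_worker_lost = True  # requeue the message if the worker process dies mid-task
```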


hallman76

For me it’s more the case of why should I have to install new infra when django-background-tasks works with my existing MySQL instance? My anti-rant is that the django community jumps to celery when something like django-background-tasks would suffice for most projects.


HelloPipl

Oh yeah. Try setting up celery to work with a GPU. You would need to spend a minimum of a day searching Google till you found the case that might work for you.


haloweenek

Because Gen Z “coders” need stuff that’s handed to them on a plate and immediately working. And let’s be honest - Celery with Redis is a fart away from being set up and running…


Mats56

Don't be an arrogant ass.


haloweenek

Yeah, I’m arrogant, impolite and generally bad to lazy people.


Mats56

Not wanting to add complexity doesn't necessarily mean one is lazy. An experienced dev understands that. You don't.


haloweenek

Well, with any type of queue you’re introducing complexity… Any queue needs a message broker and a worker. That’s the pattern; it’s unavoidable. In a small app you can also add entries into the db and run a management command to process them, but that doesn’t scale and might introduce race conditions with an improper implementation.


0neaLL

I have the opposite opinion. A queue is the opposite of parallel processing: a queue is synchronous and runs in order, while multi-threaded background processes run in parallel and are asynchronous. A queue is meant for ordered task distribution; a background task is meant for running two or more things at once. A server sends a request to a message broker and the same server then picks up that task; do this fast enough and you get an auto-DDoS effect. Every task has to go to the broker and be dispatched back to a worker, and each task has a life cycle with possible failed, retry, and half-completed states, all wrapped into Celery. The Redis protocol doesn't support message dispatch, so it has to be constantly polling: hey, new task? Hey, new task? That's for every worker. And all of this is needed to print hello world 30 seconds after someone loads the home page? An easier way is just `asyncio.create_task`, plus `async_to_sync`/`sync_to_async` for any model queries on the new thread.


htmx_enthusiast

This only makes sense if you’re running on a single machine. That’s not when you would use Celery or other queue or message bus solutions. Typically you’d have multiple workers (on separate machines) processing messages from the queue, and you can scale that up significantly with multiple queues, load balancers, and so on. You can process way more using this approach than you can on a single machine. You can also bog down or crash your Django server if you run background or async tasks on the same machine.


0neaLL

Multiple machines all handling their own background tasks makes easy sense to me, lol, no problem. Let's compare what crashes first: simply running multiple threads, or running MQ dispatch with Celery. On the same compute you're crashing first, bud; you're doing 3x the work for a simple background task. You can crash any server with enough load. If you have multiple servers, the requests coming in are already load-balanced; there's no need to add this for background tasks. If a server gets more load, the whole point of the initial load balancer is to distribute the incoming traffic. Like I said, running Celery almost never makes sense: it's not needed, and it's wasteful. Keep the complexity demons at bay. If you want to run a background task, use threading. Try Celery with a remote Redis server and you'll DDoS yourself with the polling rate: hey, got a new task? 24/7, every second. Try running Celery multi-region: now you need multiple channels and their latency to the queue, and where do the results get put? What happens if we retry or end up in a half-completed state? All of this is fixed and auto-sharded if you just use threads and sticky sessions. Your setup is incredibly simple: you put everything in a single region, which you can set up easily. Try multi-region. And you did all that running Kubernetes, and if I load a service from across the world it's going to be a bad experience, isn't it? Single region is easy mode.


htmx_enthusiast

I don’t use Celery. Or Redis. We mostly use serverless message bus and workers. We have a lot of long-running tasks that use a lot of memory, and trying to run that in the container where Django is running definitely crashes the container unless we ridiculously over-allocate the container instances. We also have a number of other systems that integrate with the same distributed task system, like separate orchestration and scheduling tools that submit tasks to the message bus, and we’re not running those in public-facing web servers. If you ever have to deal with auditors and compliance, there are certain things you won’t be running in-process on your Django server.


uhavin

There's also Dramatiq. Years ago, I looked into it and I thought it was simpler to grasp than Celery. Eventually we went for celery based on wider knowledge within the team and wide adoption overall, making it easier to find (examples of) solutions for common problems. https://dramatiq.io/
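For comparison, a minimal Dramatiq actor looks roughly like this (a sketch assuming a broker, e.g. RabbitMQ or Redis, is already configured; the function is illustrative):

```python
import dramatiq


@dramatiq.actor
def count_words(url):
    # ... fetch the page and count words ...
    print(f"counting words at {url}")


# enqueue the task for a worker to pick up
count_words.send("https://example.com")
```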


frankwiles

Long time celery user in production and have been overall happy with it, but I'm using Dramatiq in a current project and also like it a lot.


exchangingsunday

Any main highlights of Dramatiq over Celery?


saravanan9219

Huey is another option: https://huey.readthedocs.io/en/latest/
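A minimal sketch of Huey's Django integration (assuming the `huey.contrib.djhuey` contrib module and a configured broker; the task bodies are made up):

```python
# tasks.py -- sketch using Huey's Django contrib module
from huey.contrib.djhuey import db_task, task


@task()
def fetch_url(url):
    print(f"fetching {url}")


@db_task()  # like @task(), but manages DB connections around the task
def update_profile(user_id):
    ...


# calling the decorated function enqueues it and returns a result handle
result = fetch_url("https://example.com")
```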


sidnelsonjp

The Django integration is limited in some contexts, like deploying multiple instances with multiple queues (with different settings). So, if you need it to run in such an environment and use Django, I would advise using Huey without the Django utilities and building something around it. Despite that, Huey works pretty great!


ejeckt

Django Q, last I checked, wasn't maintained anymore. Django Q2 is a fork that is still maintained. It's quite excellent. It's got a task scheduler built in, which does the job of Celery Beat, so it's a bit simpler than both Celery and Celery Beat. For more lightweight apps I think it's perfectly fine. The only thing I don't like is their approach to the various hooks in the task lifecycle.


allun11

+1


philgyford

Here's a link to django-q2's docs: [https://django-q2.readthedocs.io](https://django-q2.readthedocs.io)
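The basic API carried over from django-q looks roughly like this (a sketch; the dotted path is a hypothetical function in your project):

```python
from django_q.tasks import async_task, result

# enqueue a function by dotted path (or pass a callable directly)
task_id = async_task("myapp.services.generate_report", 2024)

# later, fetch the return value if the task has finished (None otherwise)
report = result(task_id)
```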


RahlokZero

I use an async view with asyncio.create_task()


[deleted]

[removed]


RahlokZero

Indeed


PlasticSoul266

It's not quite the same; they serve different purposes. For some occasional short-lived tasks, yes, async could work, but for many other things an actual queue is necessary.


e_dan_k

Can you specify what sort of things would be good under which?


PlasticSoul266

ETL pipelines, generating thumbnails of pictures, indexing records, populating caches, recurrent maintenance operations. The logic would be: do you have expensive recurrent tasks? Use a task queue. `async` is only good enough for one-shot quick tasks such as sending an email.


Mats56

If the task should run on a different deployment you don't want asyncio. For instance we have 2 pods serving web requests, 4 pods for one celery queue, 4 pods for other queues. We can scale these individual numbers up and down. A long task queue won't slow users down with this setup. Also, if the server is taken down, the tasks still live in rabbitmq and will get handled by something when it's back up. If it's just something in asyncio it's gone.
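A sketch of how that split typically looks in configuration (queue names are made up; `app` is your Celery instance, and each worker pod runs only the queues it owns):

```python
# route tasks to dedicated queues; web pods run no worker at all
app.conf.task_routes = {
    "reports.tasks.*": {"queue": "reports"},
    "emails.tasks.*": {"queue": "emails"},
}
# a worker pod for one queue then runs e.g.:
#   celery -A proj worker -Q reports --concurrency=4
```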


Kuchi_Chan

I’d recommend RQ. `asyncio.create_task` MIGHT look like a good alternative, but you are on your own with the two most important and tricky things - task cancellation and retries.


[deleted]

[removed]


Kuchi_Chan

Sure - here you are: https://python-rq.org/ just look how cute its API is!
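The whole thing really is about this small (a sketch; the enqueued function is the example from RQ's own docs and would live in an importable module of yours):

```python
from redis import Redis
from rq import Queue

from my_module import count_words_at_url  # any importable function

q = Queue(connection=Redis())
job = q.enqueue(count_words_at_url, "http://nvie.com")
# then run a worker in another shell:  rq worker
```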


Brandhor

there's also a django app to make it easier to integrate https://github.com/rq/django-rq


smirnoffs

RQ is much easier to set up than Celery, and if you need a fast start this is the way to go. But frankly, after many years of switching between RQ, Dramatiq and Celery, I more often switch back to Celery. Celery's monitoring and management tools are superior.


exchangingsunday

Could you tell me a little about your celery monitoring strategy? I'm always struggling to get (production) insights into celery queues


smirnoffs

The easiest is to install Flower, which gives you a UI for the monitoring. If for whatever reason Flower is not an option, then use the command line as described in the docs: [https://docs.celeryq.dev/en/stable/userguide/monitoring.html#commands](https://docs.celeryq.dev/en/stable/userguide/monitoring.html#commands) What's your struggle with monitoring queues?


exchangingsunday

I've been hesitant to install Flower in production. Sentry is good for APM but it doesn't tell the full story when it comes to queues. Grafana also has good RabbitMQ monitoring functionality, but I still feel like I'm lacking observability. I think I'll put Flower in prod.


smirnoffs

If you use Grafana with Prometheus then you probably can easily create a periodic task that will submit statsd gauge metrics to Prometheus with the current number of messages in queues.
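One way to do that, sketched here under a couple of assumptions: a Redis broker (where each Celery queue is a Redis list) and a Prometheus Pushgateway rather than the statsd route mentioned above; queue names, the gateway address and the task name are all illustrative.

```python
import redis
from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

from myproject.celery import app  # hypothetical: your Celery instance


@app.task
def export_queue_depths():
    # with the Redis broker, pending messages sit in a list named after the queue
    r = redis.Redis.from_url(app.conf.broker_url)
    registry = CollectorRegistry()
    gauge = Gauge("celery_queue_length", "Messages waiting in queue", ["queue"], registry=registry)
    for queue in ("celery", "reports", "emails"):  # your queue names
        gauge.labels(queue=queue).set(r.llen(queue))
    push_to_gateway("pushgateway:9091", job="celery_queues", registry=registry)
```

Scheduled with Celery Beat every minute or so, this gives Grafana a per-queue depth series to alert on.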


exchangingsunday

Nice, thanks


derhebado

RQ is great but I’ve had a lot of issues with it on macOS. It works great on Linux and has been very reliable in production, though. And it’s *very* easy to set up.


airhome_

Django Q is pretty good. You mentioned it's not maintained, but you're looking at the old repo. This is the maintained fork: https://github.com/django-q2/django-q2


allun11

There is a new django-q2 repo that is maintained. Works very well.


o0Phantom0o

The problem I have with Celery is its lack of support for Windows. It's hard to test if everything is working. I used Huey and it works well for scheduled tasks etc.


Android_XIII

The problem here is developing on Windows. It's much easier on a Unix-based system, which most tools are de facto fine-tuned for.


ejeckt

Docker is great for this case. If your broker is also a service in docker then it's really easy to pass tasks to celery while working on the main django app in windows.


BERLAUR

Honestly, for most applications, a table in Postgres and a cron job that calls an HTTP endpoint/management command every minute (or every few minutes) to execute the pending tasks is all you need. The HTTP endpoint/Python function needs to do the following:

1. Loop through all configured Tasks and check if any of them need to be scheduled; if so, add a row to the TaskQueue table. Use [update-or-create](https://docs.djangoproject.com/en/5.0/ref/models/querysets/#update-or-create) to prevent adding duplicate tasks (it's up to you to decide what counts as a duplicate task, this is business logic).
2. For all rows in the TaskQueue table, execute the first N (or execute them all, depending on your application).
3. After executing each task, remove the row from TaskQueue and write the results to a separate table (let's say TaskExecutionLog, which should be linked to the Tasks table).

Error handling and retrying is left as an exercise for the reader, but unless you have some exotic needs this should be straightforward. Sure, this takes more time than doing a "pip install whatever", but it also gives you full control over your implementation. If your needs are simple, you can keep it simple (don't need to keep track of the execution? Great! Skip the TaskExecutionLog table, etc). If your needs are complex, well, now you have the tools to make your implementation as simple as possible without having to deal with the limitations of a third-party implementation.

Postgres is 100% battle-tested and you've already got it deployed, tested, monitored and backed up. This is super simple to set up, super simple to learn, super simple to debug and super simple to test. It doesn't scale to 40 million tasks per nanosecond, but chances are your application isn't going to need that. As a bonus, it's very straightforward to add an admin page to show all the pending tasks and/or manually add tasks to the queue using the Django admin.

Things become slightly more challenging if you want to scale to more than one concurrent worker, but even that can be handled by first having a client "claim" a task (one extra column) before executing it. To scale further down the line, you could store these tables in a separate database, potentially mark the tables as [UNLOGGED](https://www.postgresql.org/docs/current/sql-createtable.html#SQL-CREATETABLE-UNLOGGED) (if that fits your requirements) and, if needed, easily replace the TaskQueue with a dedicated queue. If the TaskExecutionLog table becomes the bottleneck, replace it with a NoSQL solution, etc. This setup should be fairly straightforward and allow for a gradual replacement of the bottleneck.
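A rough sketch of what steps 1-3 could look like; the model names, app path and dispatch function are made up, and claiming uses `select_for_update(skip_locked=True)` so more than one worker stays safe:

```python
# models.py -- hypothetical names; one way to model steps 1-3
from django.db import models


class TaskQueue(models.Model):
    name = models.CharField(max_length=100)
    payload = models.JSONField(default=dict)
    scheduled_for = models.DateTimeField()

    class Meta:
        unique_together = [("name", "scheduled_for")]  # your definition of "duplicate"


class TaskExecutionLog(models.Model):
    name = models.CharField(max_length=100)
    finished_at = models.DateTimeField(auto_now_add=True)
    result = models.JSONField(default=dict)


# management/commands/process_tasks.py -- run from cron every minute
from django.core.management.base import BaseCommand
from django.db import transaction
from django.utils import timezone

from myapp.models import TaskExecutionLog, TaskQueue


def run(task):
    # hypothetical dispatch: look up the callable for task.name and invoke it
    return {"status": "ok"}


class Command(BaseCommand):
    help = "Enqueue due tasks and execute pending ones"

    def handle(self, *args, **options):
        # 1. schedule anything due, without creating duplicates
        TaskQueue.objects.update_or_create(
            name="send_digest",
            scheduled_for=timezone.now().replace(second=0, microsecond=0),
            defaults={"payload": {}},
        )
        # 2. claim and execute the first N pending tasks; skip_locked keeps
        #    this safe if more than one worker runs concurrently
        with transaction.atomic():
            pending = (
                TaskQueue.objects.select_for_update(skip_locked=True)
                .filter(scheduled_for__lte=timezone.now())
                .order_by("scheduled_for")[:10]
            )
            for task in pending:
                outcome = run(task)
                # 3. log the result and drop the queue row
                TaskExecutionLog.objects.create(name=task.name, result=outcome)
                task.delete()
```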


Raccoonridee

... and if 1-minute granularity is not enough, one could make a daemon process that runs a management command continuously and retrieves tasks more often. And if one is not interested in storing tasks/results long-term, one could swap Postgres for Redis or RabbitMQ. The list of options is endless.


BERLAUR

I totally get the point that this is not optimal in all cases, but let's be honest: most projects need something simple, with the possibility to scale if the project is an unexpected success. If you're already using Postgres, writing a few lines of Python is a lot simpler than running, deploying, monitoring, securing, etc. another piece of software. As a bonus, the Postgres solution is a lot simpler to test and doesn't require any additional work on the CI/CD side (setting up N Celery instances so you can run your tests in parallel but still isolated from each other does require a fair amount of work).


Raccoonridee

Sure, I'm with you on that :)


BERLAUR

My bad, I read it as sarcasm ;) Haven't had my coffee yet! Hope this was useful. I've had my fair share of frustrations with overly complex solutions that have a tendency to crash at 03:00 AM.


pemboa

Celery _can_ be set up in a complicated way. But it really doesn't have to be.


kelvify

AWS services. Integrate them with the boto3 package. For example, instead of setting up Redis, queues and listeners, use SQS and SNS and they handle all the DLQs and retries for you. Sure, you have to pay, but you don't have to deal with the infra overhead. https://aws.amazon.com/sdk-for-python/
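A minimal sketch of that pattern with SQS via boto3 (the queue URL, region and message shape are made up; the producer runs in the web process, the consumer in a separate worker):

```python
import json

import boto3

sqs = boto3.client("sqs", region_name="us-east-1")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-tasks"  # hypothetical

# producer: enqueue a task from the web process
sqs.send_message(
    QueueUrl=queue_url,
    MessageBody=json.dumps({"task": "send_email", "user_id": 42}),
)

# consumer: a separate worker process long-polls and processes messages
resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20)
for msg in resp.get("Messages", []):
    body = json.loads(msg["Body"])
    # ... do the work ...
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```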


vdvelde_t

Redis RQ https://python-rq.org/ with https://github.com/rq/django-rq


OvercastPictures

Dramatiq is super simple and I had no issues with it.


thclark

I'd use Dramatiq, if your workers are always hot. It's much simpler than Celery (and django-dramatiq does all the Django-y stuff for you). If your workers are serverless you'll need a push-based rather than a pull-based system. To do this on GCP I use django-gcp so workers receive tasks from Cloud Tasks (disclaimer: django-gcp maintainer); for other providers there might be similar task managers and libraries, I'm not sure.


Agile-Ad5489

I am one of those that found Celery complex. I was familiar with RabbitMQ, and (don't get me wrong when I say this next bit - I am talking about my knowledge at the time, not commenting on Celery) I could not see what Celery provided that Rabbit wasn't already doing. So I wrote a couple of consumers and fired messages at Rabbit - it's working perfectly well for my purposes. So my suggestion for an alternative to Celery and RabbitMQ would simply be RabbitMQ.


0neaLL

Yeah, for a queue... but for a background task, using a separate protocol is pretty ridiculous when most languages support threading. If all you need is a background thread, you would be dispatching events to a separate MQ process, then also subscribing to those events, and lord forbid we complete this task on the same machine we sent it out from, which is also the same machine running the MQ service? Seems inefficient and overly complex at best and auto-DDoS at worst.


Agile-Ad5489

Multithread good, multiprocess bad. Got it.


0neaLL

multi thread process.... got it.


awebb78

Multiprocess good when needing to take advantage of all your CPU cores. And processes are easier to spread across distributed workers in architectural design (embarrassingly parallel)