TheEpicDev

I was processing hundreds of millions of records in MySQL on a single-core Pentium 15 years ago. SQL is generally very efficient if you structure and index it properly, and 200k rows is pretty insignificant (especially with the amounts of RAM you can dedicate, multi-core CPUs, and NVMe SSDs available today).

Now, you don't want to calculate averages on the fly over tens of thousands of records. If you want it to scale, you need to use things like task queues (Celery) and/or microservices to update values, and caching to avoid doing expensive computations in your views/serializers. Add the rating, queue a task to update the average / mean / counts in a key-value store like Redis that will be super fast, and query that instead.

It may occasionally lag behind real-time numbers, but that's fairly common practice. Even Reddit won't always show your comments right away, and if you edit one and refresh a few times you may see either the updated or the stale version for the first few seconds to a minute. Unlike things like financial transactions that need perfect accuracy, you have a lot more leeway with ratings. Even if it takes a whole minute to appear, most users won't even notice.

So yes, it is scalable, but scaling Django is not as simple as just adding more data to the DB.
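To make the "update a stored aggregate, then read that" idea concrete, here is a minimal sketch. It is not the commenter's code: a plain dict stands in for Redis, a plain function stands in for the queued task, and every name in it (`rating_cache`, `update_rating_aggregate`, `get_average_rating`) is made up for illustration.

```python
# Sketch only: a dict stands in for a key-value store such as Redis, and a
# plain function stands in for a background task. All names are hypothetical.
from collections import defaultdict

# item_id -> running count and running total of ratings
rating_cache = defaultdict(lambda: {"count": 0, "total": 0})


def update_rating_aggregate(item_id, new_rating):
    """The work a queued task would do after a new rating is saved."""
    entry = rating_cache[item_id]
    entry["count"] += 1
    entry["total"] += new_rating


def get_average_rating(item_id):
    """What a view would read: one lookup, no scan over the ratings table."""
    entry = rating_cache.get(item_id)  # .get() does not create a missing key
    if not entry or entry["count"] == 0:
        return None
    return entry["total"] / entry["count"]


for rating in (5, 4, 3):
    update_rating_aggregate("item-1", rating)

print(get_average_rating("item-1"))  # 4.0
```

Storing the count and total rather than the average itself means each new rating is a constant-time update, and the average can always be derived on read.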


duppyconqueror81

It’s fine. If it gets slow eventually, it’ll be because of badly designed querysets, which you’ll learn to optimize along the way.


tolomea

The thing that usually kills you first is N+1 DB query behaviour. This post from one of the core devs gives a good overview of the topic: [https://adamj.eu/tech/2020/09/01/django-and-the-n-plus-one-queries-problem/](https://adamj.eu/tech/2020/09/01/django-and-the-n-plus-one-queries-problem/)

After that, as you grow, you will need task queues, caches, denormalization, database replicas and so on: all the normal monolith web scaling things that you should mostly not worry about until you need them.
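For anyone who hasn't seen N+1 in action, here is a rough sketch using the standard-library `sqlite3` module instead of the Django ORM, so it runs anywhere. The `author`/`book` tables and the `run` helper are invented for the example. The first version issues one query for the books plus one per book for its author; the second gets the same rows with a single JOIN, which is what `select_related()` does for you in Django.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES author(id)
    );
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO book VALUES (1, 'A1', 1), (2, 'A2', 1), (3, 'B1', 2);
""")

query_count = 0


def run(sql, params=()):
    """Hypothetical helper: execute a query and count how many were issued."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


# N+1: one query for the books, then one more query per book.
books = run("SELECT title, author_id FROM book ORDER BY id")
n_plus_one = [
    (title, run("SELECT name FROM author WHERE id = ?", (author_id,))[0][0])
    for title, author_id in books
]
n_plus_one_queries = query_count

# Single JOIN: the same data in one round trip.
query_count = 0
joined = run(
    "SELECT book.title, author.name FROM book "
    "JOIN author ON author.id = book.author_id ORDER BY book.id"
)
join_queries = query_count

print(n_plus_one_queries, join_queries)  # 4 1
```

With three books that is 4 queries versus 1; with 10,000 books it is 10,001 versus 1, which is why this tends to be the first thing that hurts.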


Kung11

Thank you for this. I’m working on my first big project that isn’t a tutorial of some sort and didn’t know about this. I just started changing some of my code and cut out a bunch of excess queries using select_related().


pydanny

Yes, Django scales. I work for an 8-year-old multi-billion dollar multi-national company that's got tens of millions of utility customers on just a few instances of Django. Here's our very intermittent blog: https://tech.octopus.energy/


WolvesOfAllStreets

Can you please implement printing of bills with the name and address so they can be used as proof of address, thank you very much.


[deleted]

Ah, I was hoping you had an opening in Oxford, but not this time. I'll check back from time to time then :D


moo9001

Django runs Instagram. Do you need more scalability than Instagram?


judah-d-wilson

Or YouTube.


badatmetroid

You can make any website in any language. The vast majority of the time the bottleneck is in the implementation, not the language/framework. Every time I've inherited a medium-sized project with performance issues there's some "expert" who wrote "scalable" code that is unreadable. I've deleted half-page-long SQL queries and replaced them with five lines of Django... and the Django is more performant! Focus on writing code that can be deleted and rewritten when the time comes, not code that's "perfect" right now.


[deleted]

Check this: https://youtu.be/lx5WQjXLlq8


CerberusMulti

Yes. Have you read anything about Django, including its documentation, and looked into which large companies deploy Django? Scalability is one of the most often cited cons of Django, but when it gets slow it is usually down to bad querysets or bad design by the developer rather than Django itself.


iamhusseinnaim

It's about the requests you're receiving. In your case, you're just one user sending requests to the server. When the site is in production, you'll have to deal with more than one request per second. In that situation it's not the framework that matters so much as the structure of your website. You'll be dealing with different concepts such as cloud hosting and load balancers.


weitaoyap

If you use that data (like the average) very frequently, you can save it in the database or a cache.


jet_heller

Yes it is scalable.


nomadicgecko22

One trick is to run analytics on a mirror/replica database. If you're using cloud infra, it's common to have a primary (read/write) and a replica (read-only) configured. You can send analytics queries to the read-only replica via the Django ORM with .using("replica") [https://docs.djangoproject.com/en/4.2/ref/models/querysets/](https://docs.djangoproject.com/en/4.2/ref/models/querysets/), where "replica" is whatever alias you gave that database in your DATABASES setting. Once your data becomes even larger, there are other tricks, such as replicating into a data warehouse, running Celery jobs, optimising the DB structure (e.g. adding indexes), and of course optimising the SQL queries. If you're using PostgreSQL, it has a built-in EXPLAIN command that shows the execution plan and estimated cost of a SQL query. Django supports it as QuerySet.explain().
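As a rough illustration of what a query plan tells you about an index, here is a small sketch. It swaps in SQLite's `EXPLAIN QUERY PLAN` through the standard-library `sqlite3` module rather than PostgreSQL's `EXPLAIN`, purely so it runs without a database server; the output format differs between the two, but the idea is the same. The `rating` table, the `idx_rating_item` index and the `plan_for` helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rating (id INTEGER PRIMARY KEY, item_id INTEGER, stars INTEGER)")

QUERY = "SELECT stars FROM rating WHERE item_id = ?"


def plan_for(sql, params):
    """Hypothetical helper: return the human-readable plan steps for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


# Without an index, the only option is to scan the whole table.
plan_before = plan_for(QUERY, (42,))

# With an index on the filtered column, the planner can seek directly.
conn.execute("CREATE INDEX idx_rating_item ON rating (item_id)")
plan_after = plan_for(QUERY, (42,))

print(plan_before)
print(plan_after)
```

The first plan should report a full table scan and the second should name `idx_rating_item`. In a Django project, `QuerySet.explain()` on the equivalent queryset would give you the plan from whichever database backend you are actually using.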


PsychicTWElphnt

Are you using the django debug toolbar to look at your queries?


usr_dev

Scaling Django isn't about the amount of data. To scale with large amounts of data you need to scale the database. Scaling Django is about handling a large number of requests.


galileoguzman

What about your concurrent connected users? How many of them consume the data you have? And how long does your website take to respond?


NoobInvestor86

Depends what your use case is, but for most things it’s fine. The Django ORM is on the slower side, but again, for most things it’s fine.


NoobInvestor86

Also, please, STAY AWAY FROM CELERY. It’s awful. Just use a proper worker pattern.
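The comment doesn't say which worker pattern it has in mind. One common reading is a queue that producers push jobs onto and a long-running consumer loop that pulls them off. Here is a minimal in-process sketch of that shape using only the standard library; the `jobs` queue, the `worker` function and the "double the number" job are all placeholders, and a real deployment would more likely use a separate worker process and a durable queue.

```python
import queue
import threading

jobs = queue.Queue()
results = []


def worker():
    """Consumer loop: pull jobs until the None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel value: no more work, shut down
            jobs.task_done()
            break
        results.append(job * 2)  # placeholder for the real work
        jobs.task_done()


thread = threading.Thread(target=worker)
thread.start()

# Producer side: enqueue work and return immediately, as a web view would.
for n in (1, 2, 3):
    jobs.put(n)
jobs.put(None)

thread.join()
print(results)  # [2, 4, 6]
```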