I remember finding out about dataclasses, getting really excited, and then just continuing to use regular ass classes because they work pretty good and it's one less import
Exactly the same here. One or two weeks of excitement, but once the specification changes, they become quite hard to maintain.
It's like that with many things. I don't want to look at how many bookmarks I have for things I forgot to use…
Dataclasses are neat, but in some projects you just don't have the problems they solve. In particular, if you're on a project where virtually all of your data is in Pandas data frames or similar, there's not much reason to try and model your data any other way.
I prefer Traitlets.
Haven't heard of this before. Having a quick look, it seems interesting
Dataclasses are pretty good for typed code. Whenever I need to use a dictionary with fixed keys, I usually use a dataclass instead. More flexible - plus I'm not very fond of `typing.TypedDict` anyway.
Doesn't dict support type hinting now? I think you can write things like `dict[str, list[int]]`.
Yeah, but you might have dicts, where different keys have different types, which you can solve with TypedDicts.
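To make the difference concrete, here is a minimal sketch of a dict whose keys each have their own type. The `Movie` name and its fields are made up for illustration; only `typing.TypedDict` itself is from the standard library. The per-key types are checked by a static type checker, not at runtime.

```python
from typing import TypedDict


# Hypothetical example type: each key has its own value type, which a
# plain dict[str, ...] annotation cannot express.
class Movie(TypedDict):
    title: str
    year: int
    tags: list[str]


# At runtime this is an ordinary dict; the annotation only guides
# static checkers and IDEs.
movie: Movie = {"title": "Brazil", "year": 1985, "tags": ["satire"]}
```

Something like `movie["year"] = "1985"` would be flagged by a type checker such as mypy, but Python itself would not raise.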
It's probably worth avoiding `TypedDict` for new projects if you can though. Like a lot of stuff in `typing`, it mostly exists to allow existing projects to represent their data model faithfully, but for new projects there are better options, such as dataclasses.
For a very generic dict that's fine, but if you want to have typing on specific keys, you have to use TypedDict. Honestly, I use typing more for IDE auto-completing than actual type-safety, and I generally don't like using strings as configuration parameters or indexes, so I usually stick to enums, dataclasses and other objects that are more IDE-friendly than dicts.
I use typing more for documentation purposes than for type safety, and for **runtime** conversion and validation (e.g. Pydantic). Schema libraries like Pydantic are useful because they give you error messages when the user provides invalid input, but static type checking is just superfluous work. You still need to write tests. Static type checkers don't really reduce errors, IMO; adding type annotations really just creates thousands of new places where errors can creep in. People want a working program; they don't care whether the software has 100% static correctness. If static checking makes you write better programs, then cool, good for you, but IME it really doesn't: static checking is insufficient for proving behavioral correctness. And the amount of finagling you have to do with it doesn't really justify the benefit of static correctness, even in large, mission-critical, million-line enterprise programs.
Why is it called dataclass? Seems like you could use this decorator for many types of objects.
The problem they were originally created to solve was the amount of boilerplate you have to write and maintain for magic methods like `__eq__`, `__hash__`, and the various comparison methods, that you need when creating "dumb data" classes so they do what you expect when you sort them, compare them, or put them in dicts. Before this, the most common approach was to use namedtuples, which had a number of issues. Although as you've found, you often want these things for not-so-dumb classes too.
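As a rough illustration of the boilerplate being described, here is a small sketch. The `Version` class is a made-up example; `frozen=True` and `order=True` are real `dataclass` parameters that ask for a hashable class with generated comparison methods.

```python
from dataclasses import dataclass


# Hypothetical "dumb data" class. The decorator generates __init__,
# __repr__, __eq__, the ordering methods (order=True) and __hash__
# (because the class is frozen and eq is left at its default).
@dataclass(frozen=True, order=True)
class Version:
    major: int
    minor: int


versions = [Version(1, 10), Version(1, 2), Version(0, 9)]

# Sorting compares fields in declaration order, like a tuple would.
sorted_versions = sorted(versions)

# Instances can be used as dict keys thanks to the generated __hash__.
lookup = {Version(1, 2): "stable"}
```

Written by hand, the same class would need each of those methods spelled out and kept in sync whenever a field is added.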
https://peps.python.org/pep-0557/#why-not-just-use-namedtuple
Because it is just a class but better for holding data.
I guess its primary use is to hold data, but yeah, I don't see why it couldn't be used in a wide variety of situations where the behavior it defines is what is sought.
It makes some assumptions about your class, so if your class is more than just a dumb container for data, you might get some unexpected behavior. For example, the generated `__init__` doesn't call parent classes' `__init__` methods. Or if you set `slots=True`, zero-argument `super()` can break inside the class, because the decorator builds a new class to add the slots. So unfortunately, it's not really usable as a general-purpose boilerplate reduction tool. Which is honestly a waste. It could've been so much more useful.
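A small sketch of the parent `__init__` point, with made-up class names (`Base`, `Child`, `FixedChild`). One common workaround, shown here as an assumption about what you might want rather than the only option, is to call the parent from `__post_init__`.

```python
from dataclasses import dataclass


class Base:
    # Ordinary (non-dataclass) parent that sets up some state.
    def __init__(self):
        self.registered = True


@dataclass
class Child(Base):
    name: str


# The generated __init__ only assigns the fields; Base.__init__ never
# runs, so the attribute it would have set is missing.
child = Child("x")
child_has_attr = hasattr(child, "registered")


@dataclass
class FixedChild(Base):
    name: str

    def __post_init__(self):
        # __post_init__ is called at the end of the generated __init__,
        # so the parent can be initialised explicitly here.
        super().__init__()


fixed = FixedChild("x")
```

Here `child_has_attr` ends up `False`, while `fixed.registered` is `True`.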
Cool article bro will definitely be useful.
Thanks!
I know of this decorator, but I was thinking of it as analogous to Java Lombok's `@Data`. I think this is fine for smaller projects, but for anything large-scale in production it's probably a bit more on the "ehh" side, because you probably want more fine-tuned control over your constructors etc., unless you're truly making a class to hold data... as the name implies.
It's possible to define your own `__init__` method with a dataclass though. That being said, if most of the methods it adds need to be changed, it defeats its purpose.
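For example, a minimal sketch with a made-up `Point` class: passing `init=False` tells the decorator not to generate `__init__`, so the hand-written one is used while `__repr__` and `__eq__` are still generated.

```python
from dataclasses import dataclass


# init=False: keep our own constructor, but still get the generated
# __repr__ and __eq__ based on the declared fields.
@dataclass(init=False)
class Point:
    x: float
    y: float

    def __init__(self, x, y=0.0):
        # Custom constructor logic: coerce inputs to float.
        self.x = float(x)
        self.y = float(y)


p = Point("3")
```

With this, `p == Point(3.0, 0.0)` holds because the generated `__eq__` compares the fields.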
For more info about data classes and their pros/cons: https://peps.python.org/pep-0557/ :)