Python has over 100 special methods that let developers customize the behavior of their objects. These methods' names are surrounded by double underscores (e.g., __init__, __new__, __str__, and __eq__). You've already met some of these names. They are collectively known as magic methods or dunder methods (dunder stands for double underscore).
There are also a small number of magic variables and magic functions. We'll meet a few of these, but not many.
We've already seen __new__ and __init__, so we won't cover them again. We should mention that they help customize the creation and initialization of new objects. You will create __init__ for nearly every class you define. However, you may spend your entire career without ever writing a custom __new__ method.
When speaking, it's common to pronounce __something__ as "dunder something". Thus, __new__ is "dunder new" and __init__ is "dunder init". For this reason, we use "a" instead of "an" when an indefinite article is required: "a __init__ method", not "an __init__ method".
Let's look at some of the most common magic methods.
We met the str and repr built-in functions in our previous Python book. As you may recall, they both return string representations of an object. The return value of str is meant to be a human-readable representation of an object. In contrast, repr typically depicts how you would recreate an object.
In most cases, str and repr return the same value. However, this isn't universally true. For example, the datetime.datetime type has differing representations:
from datetime import datetime
dt = datetime.now()
print(str(dt)) # 2023-09-21 21:04:54.036563
print(repr(dt))
# datetime.datetime(2023, 9, 21, 21, 4, 54, 36563)
One of the coolest aspects of str and repr is that they work with every object in Python, regardless of type. That's not always obvious, however:
class Cat:
    def __init__(self, name):
        self.name = name

cat = Cat('Fuzzy')
print(str(cat))   # <__main__.Cat object at 0x...>
print(repr(cat))  # <__main__.Cat object at 0x...>
What's happening here? When Python tries to call str(cat), it looks for a __str__ method in the Cat class. Likewise, when it tries to call repr(cat), it looks for Cat.__repr__. Since neither method exists, Python looks elsewhere. But where?
Every object in Python ultimately inherits from the object class. In fact, this is the default superclass for a class that doesn't explicitly subclass another class. Thus, our Cat class inherits from object.
As it happens, object.__str__ and object.__repr__ produce the above output.
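To see this fallback for yourself, you can inspect the class directly. The following is a small sketch that assumes the bare Cat class from above, with no __str__ or __repr__ of its own:

```python
class Cat:
    def __init__(self, name):
        self.name = name

# Cat doesn't name a superclass, so it inherits from object
print(issubclass(Cat, object))          # True

# Looking up either method on Cat finds the one defined by object
print(Cat.__str__ is object.__str__)    # True
print(Cat.__repr__ is object.__repr__)  # True
```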
Suppose you want str and repr to return class-specific results for a class. All you have to do is add __str__ and __repr__ instance methods to the class:
class Cat:
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name

    def __repr__(self):
        return f'Cat({repr(self.name)})'

cat = Cat('Fuzzy')
print(str(cat))   # Fuzzy
print(repr(cat))  # Cat('Fuzzy')
Note that we use repr(self.name) to format the name argument depicted by the return value. That's what puts the quotes around Fuzzy in the output of repr. It's good practice to use repr like this in your __repr__ methods. There's a world of difference between Person('Hall, Annie') and Person(Hall, Annie). The first value (Person('Hall, Annie')) can be copied and pasted into your code to create a new Person object. If you copy and paste Person(Hall, Annie), your code will most likely raise an error.
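Here's a short sketch of that idea. The Person class below is hypothetical (the text only mentions its repr output), and eval appears purely to demonstrate that the repr text is valid Python; it isn't something you'd normally do in production code:

```python
class Person:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        # repr(self.name) supplies the quotes around the name
        return f'Person({repr(self.name)})'

annie = Person('Hall, Annie')
text = repr(annie)
print(text)        # Person('Hall, Annie')

# Because the text is valid Python, evaluating it builds a new Person
clone = eval(text)
print(clone.name)  # Hall, Annie
```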
When a program calls str on an object, Python first searches for a __str__ method in the object's class. If it finds one, it invokes that method to determine the string representation. If it doesn't find a __str__ method there, it then searches any classes that class inherits from (we'll explore inheritance later). If it finds __str__ in one of the inherited classes, it will use that method. If Python doesn't find a __str__ method anywhere, it next looks for a __repr__ method using the same search mechanism used for the __str__ method. If it can't find a __repr__ method anywhere, it calls object.__str__, which returns a somewhat meaningless string that usually looks something like this:
<__main__.MyType object at 0x1052828a0>
When a program calls repr on an object, Python takes a similar path to finding an appropriate __repr__ method. Note that Python never searches for __str__ when it is responding to a call to repr.
# Class definition omitted

cat = Cat('Fuzzy')

# Cat has both __str__ and __repr__
print(str(cat))   # Fuzzy
print(repr(cat))  # Cat('Fuzzy')

# Cat has __str__ but not __repr__
print(str(cat))   # Fuzzy
print(repr(cat))  # <__main__.Cat object at 0x...>

# Cat has __repr__ but not __str__
print(str(cat))   # Cat('Fuzzy')
print(repr(cat))  # Cat('Fuzzy')

# Cat has neither __repr__ nor __str__
print(str(cat))   # <__main__.Cat object at 0x...>
print(repr(cat))  # <__main__.Cat object at 0x...>
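When Cat has __str__ but not __repr__, the call to repr(cat) uses object.__repr__. When Cat has neither method, repr(cat) uses object.__repr__ and str(cat) uses object.__str__. In the remaining print invocations, Python calls either Cat.__repr__ or Cat.__str__ as appropriate.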
Python implicitly calls str or repr in a variety of places:
- str on each positional argument passed to the print function.
- str when performing string interpolation, as in an f-string.
- repr when printing the elements of a container object.
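The following sketch exercises each of those implicit calls. It reuses the Cat class with both __str__ and __repr__ defined; the !r conversion on the last line is an extra that isn't discussed above, shown only as a way to request repr inside an f-string:

```python
class Cat:
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name

    def __repr__(self):
        return f'Cat({repr(self.name)})'

cat = Cat('Fuzzy')

print(cat)              # Fuzzy (print calls str)
print(f'Hello, {cat}')  # Hello, Fuzzy (f-strings call str)
print([cat, cat])       # [Cat('Fuzzy'), Cat('Fuzzy')] (lists use repr)
print(f'{cat!r}')       # Cat('Fuzzy') (!r asks for repr explicitly)
```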
You may recall that you can compare most Python types for equality with the == or != operators. You can also compare many types as ordered quantities with <, <=, >, and >=. Here are the magic methods that correspond to all these operators:
Operator | Method | Description |
---|---|---|
== | __eq__ | Equal to |
!= | __ne__ | Not equal to |
< | __lt__ | Less than |
<= | __le__ | Less than or equal to |
> | __gt__ | Greater than |
>= | __ge__ | Greater than or equal to |
== and !=
Let's first see what happens with == and != when __eq__ and __ne__ aren't defined for an object. We'll use the Cat class from above:
# Class definition omitted

fuzzy = Cat('Fuzzy')
fluffy = Cat('Fluffy')
fluffy2 = Cat('Fluffy')

print(fuzzy == fluffy)    # False
print(fluffy == fluffy)   # True
print(fuzzy != fluffy)    # True
print(fuzzy != fuzzy)     # False

print(fluffy == fluffy2)  # False
print(fluffy != fluffy2)  # True
There shouldn't be any surprises in the first four comparisons. fuzzy is obviously not the same as fluffy, but fluffy is clearly fluffy. The inequalities also work as expected.
However, things get a little strange in the last two comparisons. fluffy and fluffy2 both represent a cat named Fluffy. However, the objects are not equal to each other. This worked when we previously compared the fluffy object with itself, but not now. The problem is that fluffy and fluffy2 are distinct objects. By default, Python assumes that two custom objects are only equal when they are the same object.
If you need more control over equality, you need the __eq__ and __ne__ methods. Without them, Python assumes that equal objects are the same object. With these methods defined, however, Python uses them to check for equality.
class Cat:
    def __init__(self, name):
        self.name = name

    # __str__ and __repr__ omitted

    def __eq__(self, other):
        return self.name == other.name

    def __ne__(self, other):
        return self.name != other.name

fuzzy = Cat('Fuzzy')
fluffy = Cat('Fluffy')
fluffy2 = Cat('Fluffy')

print(fuzzy == fluffy)    # False
print(fluffy == fluffy)   # True
print(fuzzy != fluffy)    # True
print(fuzzy != fuzzy)     # False

print(fluffy == fluffy2)  # True
print(fluffy != fluffy2)  # False
Note that the last two comparisons now return values reflecting the equality of the fluffy and fluffy2 objects.
Inheritance plays a big role in how == works. When Python sees an expression like fluffy == fluffy2, it tries to find a __eq__ method in fluffy's class. That is, it tries to find Cat.__eq__. If the method exists, Python calls it as fluffy.__eq__(fluffy2).
However, if Cat.__eq__ doesn't exist, Python looks elsewhere. In the case of Cat, it looks to the object class, which all objects inherit from. There it finds object.__eq__, which it calls to evaluate fluffy == fluffy2. object.__eq__ checks whether two objects are the same object, so fluffy.__eq__(fluffy2) returns True only when fluffy and fluffy2 reference the same object.
Since the object class has no state, there's nothing that object.__eq__ can compare for equality. As a result, it defaults to treating object identity as equality. That is, two objects are equal only when they are the same object.
This will make more sense when you read the next chapter.
What happens when Python encounters an expression like a == b where the types of a and b are different? Essentially, it first calls a.__eq__(b). If the method doesn't know how to handle an object of b's type, Python then calls b.__eq__(a). If that also doesn't know what to do, Python returns the value of a is b. In more detail, the process looks like the following pseudocode:
1. Call a.__eq__(b).
2. If the result is anything other than NotImplemented, a == b evaluates to that value.
3. If the result is NotImplemented:
   1. Call b.__eq__(a).
   2. If the result is anything other than NotImplemented, a == b evaluates to that value.
   3. If the result is NotImplemented, a == b evaluates as a is b, which will usually be False.
This is a simplified view of how Python evaluates a == b, as there are some subtle differences. However, it does a good job as a mental model.
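The same mental model can be written as a small Python function. This is only a sketch of the simplified process described above, not how the interpreter is actually implemented; simulated_eq is a hypothetical name chosen for illustration:

```python
def simulated_eq(a, b):
    # Ask the left operand first
    result = type(a).__eq__(a, b)
    if result is not NotImplemented:
        return result

    # The left operand declined, so ask the right operand
    result = type(b).__eq__(b, a)
    if result is not NotImplemented:
        return result

    # Neither side knows how to compare: fall back to identity
    return a is b

print(simulated_eq(1, 1.0))  # True (int declines, float answers)
print(simulated_eq('a', 1))  # False (both decline, and 'a' is not 1)
print(simulated_eq(3, 4))    # False (int answers directly)
```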
We're not going to go into any detail about why this process is important. However, if you're not careful to return NotImplemented when appropriate, you may see strange and unexpected behavior. For instance, in the right circumstances, if a == b returns True, b == a might return False. That's rarely what you want.
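To make that asymmetry concrete, here is a contrived sketch. The Loose, Strict, and Polite classes are hypothetical and exist only for this demonstration. Strict returns False for unfamiliar types, while Polite returns NotImplemented so that Python can ask the other operand:

```python
class Loose:
    """Equal to anything that has a matching value attribute."""
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not hasattr(other, 'value'):
            return NotImplemented
        return self.value == other.value

class Strict:
    """Returns False, not NotImplemented, for unfamiliar types."""
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Strict):
            return False
        return self.value == other.value

class Polite:
    """Returns NotImplemented for unfamiliar types."""
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Polite):
            return NotImplemented
        return self.value == other.value

print(Loose(1) == Strict(1))  # True
print(Strict(1) == Loose(1))  # False: the answer depends on operand order

print(Loose(1) == Polite(1))  # True
print(Polite(1) == Loose(1))  # True: Python falls back to Loose.__eq__
```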
With that information, we can rewrite our __eq__ and __ne__ to better conform to Python's expectations:
class Cat:
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name == other.name

    def __ne__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name != other.name
If you want your custom classes to conform with the built-in classes, your __eq__ and __ne__ methods should work similarly.
You can skip the type checks when there's little chance of performing a comparison with a different object type. For instance, a nested class that is marked for internal use can most likely avoid the problem:
class Person:
    class _Name:
        def __init__(self, name):
            self.name = name

        def __eq__(self, other):
            return self.name == other.name

        def __ne__(self, other):
            return self.name != other.name

    def __init__(self, name1, name2):
        print(self._Name(name1) == self._Name(name2))

Person('John', 'John')      # True
Person('Alice', 'Allison')  # False
Since _Name is meant for internal use, you don't have to account for people who violate the single underscore convention. Presumably, they know they shouldn't use _Name, and you know that _Name objects will never be compared with non-_Name objects.
<, <=, >, and >=
Let's see what happens when we try to perform ordered comparisons:
# Cat class omitted for brevity
fluffy = Cat('Fluffy')
whiskers = Cat('Whiskers')
print(fluffy == whiskers) # False
print(fluffy != whiskers) # True
print(fluffy < whiskers)
# TypeError: '<' not supported between instances of
# 'Cat' and 'Cat'
Okay, we don't have a working < operator this time. (We also don't have working <=, >, or >= operators.) Let's define them; we'll compare Cat objects based on the cat names:
class Cat:
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name == other.name

    def __ne__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name != other.name

    def __lt__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name < other.name

    def __le__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name <= other.name

    def __gt__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name > other.name

    def __ge__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name >= other.name

fluffy = Cat('Fluffy')
fluffy2 = Cat('Fluffy')
whiskers = Cat('Whiskers')

print(fluffy < whiskers)   # True
print(fluffy <= whiskers)  # True
print(fluffy <= fluffy2)   # True
print(fluffy > whiskers)   # False
print(fluffy >= whiskers)  # False
print(fluffy >= fluffy2)   # True
If you walk through the last 6 statements, you should see that they all return the correct values for the comparisons.
That code is really repetitive, though. You can use functools.total_ordering to reduce the repetition. However, there are subtle pitfalls when using it with comparison methods that can return NotImplemented. It's usually wiser to write your own code.
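For reference, here is roughly what a functools.total_ordering version of Cat might look like. Treat it as a sketch for comparison rather than a recommendation: you supply __eq__ plus one ordering method (we chose __lt__ here), and the decorator fills in the rest.

```python
from functools import total_ordering

@total_ordering
class Cat:
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name == other.name

    def __lt__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name < other.name

fluffy = Cat('Fluffy')
fluffy2 = Cat('Fluffy')
whiskers = Cat('Whiskers')

# __le__, __gt__, and __ge__ are generated by the decorator
print(fluffy <= whiskers)  # True
print(fluffy > whiskers)   # False
print(fluffy >= fluffy2)   # True
```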
One last thing: it's worth noting that you don't normally want to use isinstance in your code; this is not very object-oriented. However, using isinstance in many dunder methods is almost mandatory.
We won't cover the arithmetic operator methods in great detail, but we can look at one. The __add__ method lets you control how the + operator works with a custom class, while __iadd__ handles augmented assignment with +=. We'll use vector addition as our example:
Adding two vectors, <a, b> and <c, d>, yields the vector result, <a + c, b + d>.
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented

        new_x = self.x + other.x
        new_y = self.y + other.y
        return Vector(new_x, new_y)

    def __iadd__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented

        self.x += other.x
        self.y += other.y
        return self

    def __repr__(self):
        x = repr(self.x)
        y = repr(self.y)
        return f'Vector({x}, {y})'

v1 = Vector(5, 12)
v2 = Vector(13, -4)
print(v1 + v2)  # Vector(18, 8)
As you can see, __add__ takes an other argument and returns a new vector that represents the sum of the self and other vectors. It's "called" by using the + operator. __iadd__, on the other hand, performs an in-place operation by mutating the object to the left of +=. You should normally define __iadd__ when you define __add__. If you don't, Python falls back to using __add__, which may not work as you expect.
Notice that we followed the same "check the type and return NotImplemented" pattern used with the comparison operators. We also returned self from the __iadd__ method; you must do that for all augmented assignment methods.
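The difference between the fallback and a real __iadd__ shows up when a second variable refers to the same object. In this sketch, AddOnly and InPlace are hypothetical stand-ins for a class with and without __iadd__:

```python
class AddOnly:
    def __init__(self, x):
        self.x = x

    def __add__(self, other):
        if not isinstance(other, AddOnly):
            return NotImplemented

        return AddOnly(self.x + other.x)

class InPlace(AddOnly):
    def __iadd__(self, other):
        if not isinstance(other, AddOnly):
            return NotImplemented

        self.x += other.x
        return self

a = AddOnly(1)
alias_a = a
a += AddOnly(2)      # no __iadd__: Python uses __add__ and rebinds a
print(a.x)           # 3
print(a is alias_a)  # False: a now names a brand-new object
print(alias_a.x)     # 1: the original object is unchanged

b = InPlace(1)
alias_b = b
b += InPlace(2)      # __iadd__ mutates the existing object
print(b is alias_b)  # True
print(alias_b.x)     # 3
```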
Other arithmetic operator methods include __sub__ (subtraction with -), __mul__ (multiplication with *), __truediv__ (true division with /), and __floordiv__ (floor division with //). There are several more you can use. As with __add__ and __iadd__, you should normally define the __isub__, __imul__, __itruediv__, and __ifloordiv__ methods when you define the primary method.
Defining arithmetic operators for custom types can lead to elegant code, but only when their use is intuitive and consistent with the rest of Python. The operators must make sense and generally be limited to numeric and sequence types. Don't define arithmetic operators simply because it's cool; it probably isn't.
In particular, the arithmetic operators should obey the commutative and associative laws of arithmetic, as appropriate. For example, + and * should be commutative and associative.
a + b == b + a
a * b == b * a
a + (b + c) == (a + b) + c
a * (b * c) == (a * b) * c
Most other operators do not have to be commutative or associative.
It's worth noting that concatenation isn't commutative, yet Python uses + for strings, lists, and tuples. One could argue that providing + for concatenation is non-intuitive. However, it is familiar and comfortable to many developers, not just Python programmers. At this point, concatenation is a perfectly acceptable use for the + operator.
The * operator for strings, lists, and tuples is commutative, associative, and relatively intuitive. Performing repetition for other types that support concatenation is acceptable.
Think carefully before defining arithmetic operators for non-arithmetic classes. Operators should be intuitive and consistent with Python's built-in types, or they may lead to confusion and errors.
Python has a handful of magic variables, aka dunder variables, that are primarily useful for debugging and testing. Let's look at a few of them.
__name__ holds the current module's name as a string. Here's a simple example involving 2 modules and a main program:
# mod1.py
print(__name__)
# mod2.py
print(__name__)
# test.py
import mod1
import mod2
print(__name__)
Running test.py prints:
mod1
mod2
__main__
As you can see, the print(__name__) statements in mod1.py and mod2.py print mod1 and mod2, respectively. However, our program showed a name of __main__, not test.
This is normal: if the current module is the program being run, __name__ is set to __main__. It's common to see code like this in Python programs to facilitate testing.
if __name__ == '__main__':
    # call the program's main processing function
    main()  # main is a placeholder name for that function
This code runs the entire program when the module is the main program. It does nothing otherwise. This lets you test your code in a more piecemeal style without running the full program version.
__file__ holds the full path name of the file for the current module. This can help your program find the assets and other resources it needs.
import os

print(__file__)
print(os.path.abspath(__file__))
assets = os.path.abspath(f'{__file__}/../assets')
print(assets)
image = f'{assets}/foo.png'
print(image)
Assume we run this code as follows:
mkdir ~/Projects/Bar
cd ~/Projects/Bar
python ../Foo/file.py
This code will output something like this:
/Users/me/Projects/Bar/../Foo/file.py
/Users/me/Projects/Foo/file.py
/Users/me/Projects/Foo/assets
/Users/me/Projects/Foo/assets/foo.png
On line 3 of file.py, we print the value of __file__. The output consists of the file name used in the python command (../Foo/file.py) appended to the absolute path name of the current working directory (/Users/me/Projects/Bar).
If you want to eliminate that relative file reference (../Foo/file.py) from the output, you need to request the absolute path name by passing __file__ as an argument to the os.path.abspath function. We do this on line 4 of file.py. This eliminates the /Bar/.. portion of the file name.
On line 5 of file.py, we want to get the absolute path name of the assets subdirectory in the project directory (/Users/me/Projects/Foo). Since we only have __file__ to work with, we can use a relative path name and pass it to os.path.abspath. To get the relative path name, we just append /../assets to the value of __file__.
Finally, on line 8, we print the absolute path name of our foo.png image file in the assets folder.
Using __file__ can be a little tricky until you get comfortable with it. You might think it'll be easier to just hardcode the file and folder names in your program. However, that's not a good idea. Once you start distributing the program, you'll lose control over where people will put the project files. By using __file__, relative path names, and os.path.abspath, you won't need to care about where people install your software. So long as they don't mess with the folder structure of the project, the program will work.
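If you prefer the standard pathlib module, the same idea can be sketched as follows. This is an alternative to the os.path approach above, not something the earlier code relies on; asset_path is a hypothetical helper, and it takes the module's path as a parameter so it's easy to try with a made-up path:

```python
from pathlib import Path

def asset_path(module_file, filename):
    # Directory that contains the module, with any ".." parts resolved
    project_dir = Path(module_file).resolve().parent
    return project_dir / 'assets' / filename

# Inside a real module you would call: asset_path(__file__, 'foo.png')
image = asset_path('/Users/me/Projects/Bar/../Foo/file.py', 'foo.png')
print(image.name)         # foo.png
print(image.parent.name)  # assets
```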
__dict__ holds a dictionary of all the instance variables defined by an object. This can be helpful in the REPL:
class MyClass:
    def __init__(self, x):
        self.x = x
        self.y = []
        self.z = 'xxx'

obj = MyClass(5)
print(obj.__dict__)
# {'x': 5, 'y': [], 'z': 'xxx'}
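As a small extension, the sketch below shows that __dict__ tracks attributes added after construction and that the built-in vars function returns the same dictionary. The cut-down MyClass here is just for illustration:

```python
class MyClass:
    def __init__(self, x):
        self.x = x

obj = MyClass(5)
print(obj.__dict__)               # {'x': 5}

obj.y = 10                        # add an attribute after construction
print(obj.__dict__)               # {'x': 5, 'y': 10}

print(vars(obj) is obj.__dict__)  # True
```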
This chapter provided a fast and short tour of the world of magic methods, also known as dunder methods. We learned how to customize the behavior of the str and repr functions and how to customize comparisons and arithmetic operations. A few magic variables appeared out of nowhere and vanished just as quickly.
Let's do some exercises before we move on.
Create a Car class that makes the following code work as indicated:
vwbuzz = Car('ID.Buzz', 2024, 'red')
print(vwbuzz) # Red 2024 ID.Buzz
print(repr(vwbuzz)) # Car('ID.Buzz', 2024, 'red')
class Car:
    def __init__(self, model, year, color):
        self.model = model
        self.year = year
        self.color = color

    def __str__(self):
        return f'{self.color.title()} {self.year} {self.model}'

    def __repr__(self):
        color = repr(self.color)
        year = repr(self.year)
        model = repr(self.model)
        return f'Car({model}, {year}, {color})'
Don't let the mathiness of this problem scare you off. You don't have to know any math; you only need to know how to write code.
Earlier, we wrote the following class:
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented

        new_x = self.x + other.x
        new_y = self.y + other.y
        return Vector(new_x, new_y)

    # __iadd__ method omitted; we don't need it for this exercise

    def __repr__(self):
        x = repr(self.x)
        y = repr(self.y)
        return f'Vector({x}, {y})'

v1 = Vector(5, 12)
v2 = Vector(13, -4)
print(v1 + v2)  # Vector(18, 8)
Update this class so the following code works as indicated:
print(v1 - v2) # Vector(-8, 16)
print(v1 * v2) # 17
print(abs(v1)) # 13.0
In this code, the * operator should compute the dot product of the two vectors. For instance, if you have Vector(a, b) and Vector(c, d), the dot product is a * c + b * d, where * and + are the usual arithmetic operators.
The abs function computes the magnitude of a vector. If you have a vector Vector(a, b), the magnitude is given by sqrt(a**2 + b**2). You will need the math module to access the sqrt function. Note that abs is a built-in function, so you don't want to override it entirely; you only want to change its behavior for Vector objects. There's a magic method you can use.
Don't worry about augmented assignment in this exercise.
import math

class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented

        new_x = self.x + other.x
        new_y = self.y + other.y
        return Vector(new_x, new_y)

    def __sub__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented

        new_x = self.x - other.x
        new_y = self.y - other.y
        return Vector(new_x, new_y)

    def __mul__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented

        dot_product = ((self.x * other.x) +
                       (self.y * other.y))
        return dot_product

    def __abs__(self):
        sum_of_squares = ((self.x ** 2) +
                          (self.y ** 2))
        return math.sqrt(sum_of_squares)

    def __repr__(self):
        x = repr(self.x)
        y = repr(self.y)
        return f'Vector({x}, {y})'
You can define __sub__ and __mul__ to make - and * work with Vector objects. The __abs__ method lets us customize the built-in abs function for Vector objects without changing its ability to compute absolute values for ordinary numbers.
Challenge: Create the classes needed to make the following code work as shown:
mike_jones = Candidate('Mike Jones')
susan_dore = Candidate('Susan Dore')
kim_waters = Candidate('Kim Waters')
candidates = {
    mike_jones,
    susan_dore,
    kim_waters,
}

votes = [
    mike_jones,
    susan_dore,
    mike_jones,
    susan_dore,
    susan_dore,
    kim_waters,
    susan_dore,
    mike_jones,
]

for candidate in votes:
    candidate += 1

election = Election(candidates)
election.results()
Mike Jones: 3 votes
Susan Dore: 4 votes
Kim Waters: 1 votes
Susan Dore won: 50.0% of votes
Don't worry about ties or whether votes should be singular.
You should use the __iadd__ method to customize the behavior of += in the for loop. __iadd__ is similar to __add__ except that it implements +=. You don't need a __add__ method.
class Candidate:
    def __init__(self, name):
        self.name = name
        self.votes = 0

    def __iadd__(self, other):
        if not isinstance(other, int):
            return NotImplemented

        self.votes += other
        return self

class Election:
    def __init__(self, candidates):
        self.candidates = candidates

    def results(self):
        max_votes = 0
        vote_count = 0
        winner = None

        # Use the candidates stored on this Election, not a global name
        for candidate in self.candidates:
            vote_count += candidate.votes
            if candidate.votes > max_votes:
                max_votes = candidate.votes
                winner = candidate.name

        for candidate in self.candidates:
            name = candidate.name
            votes = candidate.votes
            print(f'{name}: {votes} votes')

        percent = 100 * (max_votes / vote_count)
        print()
        print(f'{winner} won: {percent}% of votes')
The __iadd__ method is crucial to this solution. Note that we are adding integers to the Candidate objects; thus, our __iadd__ method needs to deal with integers to the right of the +=.
We didn't customize + in this example. Had we done so, we would have had to create a new Candidate object, which implies multiple Candidate objects for the same candidate. They might even end up with different vote counts.
In general, you shouldn't customize + and += when you need to create an unwanted new object. Use something like an add_vote method instead.
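A minimal sketch of that alternative might look like this. The add_vote name comes from the suggestion above; the optional count parameter is our own assumption, added for convenience:

```python
class Candidate:
    def __init__(self, name):
        self.name = name
        self.votes = 0

    def add_vote(self, count=1):
        # An explicit method makes it clear that the object is mutated
        self.votes += count

susan_dore = Candidate('Susan Dore')
for _ in range(4):
    susan_dore.add_vote()

print(f'{susan_dore.name}: {susan_dore.votes} votes')
# Susan Dore: 4 votes
```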