Magic Methods

Python has over 100 special methods that let developers customize the behavior of their objects. These methods' names are surrounded by double underscores (e.g., __init__, __new__, __str__, __eq__, etc.). You've already met some of these names. They are collectively known as magic methods or dunder methods (dunder stands for double underscore).

There are also a small number of magic variables and magic functions. We'll meet a few of these, but not many.

We've already seen __new__ and __init__, so we won't cover them again. We should mention that they help customize the creation and initialization of new objects. You will create __init__ for nearly every class you define. However, you may spend your entire career without ever writing a custom __new__ method.

When speaking, it's common to pronounce __something__ as "dunder something". Thus, __new__ is "dunder new" and __init__ is "dunder init". For this reason, we use "a" instead of "an" when an indefinite article is required: "a __init__ method", not "an __init__ method".

Let's look at some of the most common magic methods.

The __str__ and __repr__ Methods

We met the str and repr built-in functions in our previous Python book. As you may recall, they both return string representations of an object. The return value of str is meant to be a human-readable representation of an object. In contrast, repr typically depicts how you would recreate an object.

In most cases, str and repr return the same value. However, this isn't universally true. For example, the datetime.datetime type has differing representations:

from datetime import datetime

dt = datetime.now()
print(str(dt))      # 2023-09-21 21:04:54.036563
print(repr(dt))
# datetime.datetime(2023, 9, 21, 21, 4, 54, 36563)

One of the coolest aspects of str and repr is that they work with every object in Python, regardless of type. That's not always obvious, however:

class Cat:

    def __init__(self, name):
        self.name = name

cat = Cat('Fuzzy')
print(str(cat))  # <__main__.Cat object at 0x...>
print(repr(cat)) # <__main__.Cat object at 0x...>

What's happening here? When Python tries to call str(cat), it looks for a __str__ method in the Cat class. Likewise, when it tries to call repr(cat), it looks for Cat.__repr__. Since neither method exists, Python looks elsewhere. But where?

Every object in Python ultimately inherits from the object class. In fact, this is the default superclass for a class that doesn't explicitly subclass another class. Thus, our Cat class inherits from object.

As it happens, object.__str__ and object.__repr__ produce the above output.

Suppose you want to define class-specific str and repr methods for a class. All you have to do is add __str__ and __repr__ instance methods to the class:

class Cat:

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name

    def __repr__(self):
        return f'Cat({repr(self.name)})'

cat = Cat('Fuzzy')
print(str(cat))  # Fuzzy
print(repr(cat)) # Cat('Fuzzy')

Note that we use repr(self.name) to format the name argument depicted by the return value. That's what puts the quotes around Fuzzy in the output using repr. It's good practice to use repr like this in your __repr__ methods. There's a world of difference between Person('Hall, Annie') and Person(Hall, Annie). The first value (Person('Hall, Annie')) can be copied and pasted into your code to create a new Person object. If you copy and paste Person(Hall, Annie), your code will most likely raise an error.
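
To make the difference concrete, here's a minimal sketch that contrasts the two approaches. The Person and UnquotedPerson classes are hypothetical and exist only for this illustration. The sketch also uses the !r conversion, which is f-string shorthand for calling repr on the value.

```python
class UnquotedPerson:
    """Hypothetical class whose __repr__ forgets to call repr on its argument."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f'UnquotedPerson({self.name})'


class Person:
    """Hypothetical class whose __repr__ calls repr on its argument."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        # {self.name!r} is shorthand for {repr(self.name)}
        return f'Person({self.name!r})'


unquoted = UnquotedPerson('Hall, Annie')
person = Person('Hall, Annie')

print(repr(unquoted))  # UnquotedPerson(Hall, Annie)
print(repr(person))    # Person('Hall, Annie')
```

Only the second output is valid Python that recreates the object when pasted into a program.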

When a program calls str on an object, Python first searches for a __str__ method in the object. If it finds one, it invokes that method to determine the string representation. If it doesn't find a __str__ method in the object, it then searches any classes it inherits from (we'll explore inheritance later). If it finds __str__ in one of the inherited classes, it will use that method. If Python doesn't find a __str__ method anywhere, it next looks for a __repr__ method using the same search mechanism used for the __str__ method. If it can't find a __repr__ method anywhere, it calls object.__str__, which returns a somewhat meaningless string that usually looks something like this:

<__main__.MyType object at 0x1052828a0>

When a program calls repr on an object, Python takes a similar path to finding an appropriate __repr__ method. Note that Python never searches for __str__ when it is responding to a call to repr.

# Class definition omitted

cat = Cat('Fuzzy')

# Cat has both __str__ and __repr__
print(str(cat))  # Fuzzy
print(repr(cat)) # Cat('Fuzzy')

# Cat has __str__ but not __repr__
print(str(cat))  # Fuzzy
print(repr(cat)) # <__main__.Cat object at 0x...>

# Cat has __repr__ but not __str__
print(str(cat))  # Cat('Fuzzy')
print(repr(cat)) # Cat('Fuzzy')

# Cat has neither __repr__ nor __str__
print(str(cat))  # <__main__.Cat object at 0x...>
print(repr(cat)) # <__main__.Cat object at 0x...>

Lines 11 and 19 use object.__repr__, while line 18 uses object.__str__. In the remaining print invocations, Python calls either Cat.__repr__ or Cat.__str__ as appropriate.
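
If you'd like to run the two middle cases yourself, the following sketch spells them out with complete classes. The class names StrOnlyCat and ReprOnlyCat are hypothetical and were chosen only so both variants can live in one file.

```python
class StrOnlyCat:
    """Hypothetical cat that defines __str__ but not __repr__."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name


class ReprOnlyCat:
    """Hypothetical cat that defines __repr__ but not __str__."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f'ReprOnlyCat({self.name!r})'


str_only = StrOnlyCat('Fuzzy')
print(str(str_only))    # Fuzzy
print(repr(str_only))   # default output: <...StrOnlyCat object at 0x...>

repr_only = ReprOnlyCat('Fuzzy')
print(str(repr_only))   # ReprOnlyCat('Fuzzy')
print(repr(repr_only))  # ReprOnlyCat('Fuzzy')
```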

Python implicitly calls str or repr in a variety of places:

  • It implicitly calls str on each positional argument passed to the print function.
  • It implicitly calls str when performing string interpolation, as in an f-string.
  • It implicitly calls repr when printing the elements of a container object.
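
The three cases in the list above can be seen in a short sketch. It reuses a Cat class like the one defined earlier in this section.

```python
class Cat:

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name

    def __repr__(self):
        return f'Cat({repr(self.name)})'


cat = Cat('Fuzzy')

print(cat)               # Fuzzy           (print uses str)
print(f'My cat: {cat}')  # My cat: Fuzzy   (interpolation uses str)
print([cat])             # [Cat('Fuzzy')]  (list elements use repr)
print(f'{cat!r}')        # Cat('Fuzzy')    (!r requests repr explicitly)
```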

The Comparison Methods

You may recall that you can compare most Python types for equality with the == or != operators. You can also compare many types as ordered quantities with <, <=, >, and >=. Here are the magic methods that correspond to all these operators:

Operator  Method  Description
==        __eq__  Equal to
!=        __ne__  Not equal to
<         __lt__  Less than
<=        __le__  Less than or equal to
>         __gt__  Greater than
>=        __ge__  Greater than or equal to

Customizing == and !=

Let's first see what happens with == and != when __eq__ and __ne__ aren't defined for an object. We'll use the Cat class from above:

# Class definition omitted

fuzzy = Cat('Fuzzy')
fluffy = Cat('Fluffy')
fluffy2 = Cat('Fluffy')

print(fuzzy == fluffy)        # False
print(fluffy == fluffy)       # True
print(fuzzy != fluffy)        # True
print(fuzzy != fuzzy)         # False

print(fluffy == fluffy2)      # False
print(fluffy != fluffy2)      # True

There shouldn't be any surprises here on lines 7-10. fuzzy is obviously not the same as fluffy, but fluffy is clearly fluffy. The inequalities also work as expected.

However, things get a little strange on lines 12 and 13. fluffy and fluffy2 represent someone named Fluffy. However, the objects are not equal to each other. This worked when we previously compared the fluffy object with itself, but not now. The problem is that fluffy and fluffy2 are distinct objects. By default, Python assumes that two custom objects are only equal when they are the same object.

If you need more control over equality, you need the __eq__ and __ne__ methods. Without them, Python assumes that equal objects are the same object. With these methods defined, however, Python uses them to check for equality.

class Cat:

    def __init__(self, name):
        self.name = name

    # __str__ and __repr__ omitted

    def __eq__(self, other):
        return self.name == other.name

    def __ne__(self, other):
        return self.name != other.name

fuzzy = Cat('Fuzzy')
fluffy = Cat('Fluffy')
fluffy2 = Cat('Fluffy')

print(fuzzy == fluffy)        # False
print(fluffy == fluffy)       # True
print(fuzzy != fluffy)        # True
print(fuzzy != fuzzy)         # False

print(fluffy == fluffy2)      # True
print(fluffy != fluffy2)      # False

Note that lines 23 and 24 return values reflecting the equality of the fluffy and fluffy2 objects.

Inheritance plays a big role in how == works. When Python sees an expression like fluffy == fluffy2, it tries to find a __eq__ method in fluffy's class. That is, it tries to find Cat.__eq__. If the method exists, Python calls it as fluffy.__eq__(fluffy2).

However, if Cat.__eq__ doesn't exist, Python looks elsewhere. In the case of Cat, it looks to the object class -- a class that all objects inherit from. There it finds object.__eq__, which it calls to evaluate fluffy == fluffy2. object.__eq__ checks whether two objects are the same object, so fluffy.__eq__(fluffy2) returns True only when fluffy and fluffy2 reference the same object.

Since the object class has no state, there's nothing that object.__eq__ can compare for equality. As a result, it defaults to treating object identity as equality. That is, two objects are equal only when they are the same object.

This will make more sense when you read the next chapter.

Comparing Different Types

What happens when Python encounters an expression like a == b where the types of a and b are different? Essentially, it first calls a.__eq__(b). If the method doesn't know how to handle an object of b's type, Python then calls b.__eq__(a). If that also doesn't know what to do, Python returns the value of a is b. In more detail, the process looks like the following pseudocode:

  • Python calls a.__eq__(b).
  • If the return value is a boolean, a == b evaluates to that value.
  • If the return value is NotImplemented:
    • Python calls b.__eq__(a).
    • If the return value is a boolean, a == b evaluates to that value.
    • If the return value is NotImplemented:
      • a == b evaluates as a is b, which will usually be False.

This is a simplified view of how Python evaluates a == b, as there are some subtle differences. However, it does a good job as a mental model.

We're not going to go into any detail about why this process is important. However, if you're not careful to return NotImplemented when appropriate, you may see strange and unexpected behavior. For instance, in the right circumstances, if a == b returns True, b == a might return False. That's rarely what you want.
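
Here's one way such an asymmetry can arise. This is only a sketch: LooseCat and StrictDog are hypothetical classes invented for the illustration, and neither returns NotImplemented for types it doesn't understand.

```python
class LooseCat:
    """Hypothetical class whose __eq__ performs no type check."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Anything with a matching name attribute counts as "equal"
        return self.name == other.name


class StrictDog:
    """Hypothetical class whose __eq__ returns False for other types."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, StrictDog):
            # Answers on its own instead of returning NotImplemented
            return False

        return self.name == other.name


cat = LooseCat('Rex')
dog = StrictDog('Rex')

print(cat == dog)  # True:  LooseCat.__eq__ compares the names
print(dog == cat)  # False: StrictDog.__eq__ rejects the LooseCat
```

Because each method answers definitively on its own, Python never gets the chance to consult the other operand, and the result depends on which object appears on the left.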

With that information, we can rewrite our __eq__ and __ne__ to better conform to Python's expectations:

class Cat:

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name == other.name

    def __ne__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name != other.name

If you want your custom classes to conform with the built-in classes, your __eq__ and __ne__ methods should work similarly.
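
As a quick check, the sketch below compares a Cat with a plain string. It repeats the type-checking Cat class so that it can run on its own.

```python
class Cat:

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name == other.name

    def __ne__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name != other.name


fuzzy = Cat('Fuzzy')

print(fuzzy == Cat('Fuzzy'))  # True
print(fuzzy == 'Fuzzy')       # False
print('Fuzzy' == fuzzy)       # False
print(fuzzy != 'Fuzzy')       # True

# Calling the method directly shows the sentinel value
print(fuzzy.__eq__('Fuzzy'))  # NotImplemented
```

Without the isinstance check, fuzzy == 'Fuzzy' would raise an AttributeError, since strings have no name attribute.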

You can skip the type checks when there's little chance of performing a comparison with a different object type. For instance, a nested class that is marked for internal use can most likely avoid the problem:

class Person:

    class _Name:

        def __init__(self, name):
            self.name = name

        def __eq__(self, other):
            return self.name == other.name

        def __ne__(self, other):
            return self.name != other.name

    def __init__(self, name1, name2):
        print(self._Name(name1) == self._Name(name2))

Person('John', 'John')           # True
Person('Alice', 'Allison')       # False

Since _Name is meant for internal use, you don't have to account for people who violate the single underscore convention. Presumably, they know they shouldn't use _Name, and you know that _Name objects will never be compared with non-_Name objects.

Customizing <, <=, >, and >=

Let's see what happens when we try to perform ordered comparisons:

# Cat class omitted for brevity

fluffy = Cat('Fluffy')
whiskers = Cat('Whiskers')

print(fluffy == whiskers)          # False
print(fluffy != whiskers)          # True

print(fluffy < whiskers)
# TypeError: '<' not supported between instances of
# 'Cat' and 'Cat'

Okay, we don't have a working < operator this time. (We also don't have working <=, >, or >= operators). Let's define them; we'll compare Cat objects based on the cat names:

class Cat:

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name == other.name

    def __ne__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name != other.name

    def __lt__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name < other.name

    def __le__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name <= other.name

    def __gt__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name > other.name

    def __ge__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name >= other.name

fluffy = Cat('Fluffy')
fluffy2 = Cat('Fluffy')
whiskers = Cat('Whiskers')

print(fluffy < whiskers)      # True
print(fluffy <= whiskers)     # True
print(fluffy <= fluffy2)      # True
print(fluffy > whiskers)      # False
print(fluffy >= whiskers)     # False
print(fluffy >= fluffy2)      # True

If you walk through the last 6 statements, you should see that they all return the correct values for the comparisons.

That code is really repetitive, though. You can use functools.total_ordering to get around this repetitiveness. However, there are subtle issues involved when using it with comparison methods that can return NotImplemented. It's usually wiser to write your own code.
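
For reference, here's a minimal sketch of what the functools.total_ordering approach looks like. You define __eq__ and one ordering method (we assume __lt__ here), and the decorator supplies the rest. Treat it as an illustration of the decorator rather than a recommendation.

```python
from functools import total_ordering


@total_ordering
class Cat:

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name == other.name

    def __lt__(self, other):
        if not isinstance(other, Cat):
            return NotImplemented

        return self.name < other.name


fluffy = Cat('Fluffy')
fluffy2 = Cat('Fluffy')
whiskers = Cat('Whiskers')

# __le__, __gt__, and __ge__ are generated by the decorator
print(fluffy <= whiskers)     # True
print(fluffy > whiskers)      # False
print(fluffy >= fluffy2)      # True
```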

One last thing: it's worth noting that you don't normally want to use isinstance in your code; this is not very object-oriented. However, using isinstance in many dunder methods is almost mandatory.

The Arithmetic Methods

We won't cover the arithmetic operator methods in great detail, but we can look at one. The __add__ method lets you control how the + operator works with a custom class, while __iadd__ handles augmented assignment with +=. We'll use vector addition as our example:

Adding two vectors, <a, b> and <c, d>, yields the vector result, <a + c, b + d>.

class Vector:

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented

        new_x = self.x + other.x
        new_y = self.y + other.y
        return Vector(new_x, new_y)

    def __iadd__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented

        self.x += other.x
        self.y += other.y
        return self

    def __repr__(self):
        x = repr(self.x)
        y = repr(self.y)
        return f'Vector({x}, {y})'

v1 = Vector(5, 12)
v2 = Vector(13, -4)
print(v1 + v2)   # Vector(18, 8)

As you can see, __add__ takes an other argument and returns a new vector that represents the sum of the self and other vectors. It's "called" by using the + operator. __iadd__, on the other hand, performs an in-place operation by mutating the object to the left of +=. You should normally define __iadd__ when you define __add__. If you don't, Python falls back to using __add__, which builds a new object and rebinds the variable instead of mutating the original. That may not work as you expect.

Notice that we followed the same "check the type and return NotImplemented" pattern used with the comparison operators. We also returned self from the __iadd__ method; you must do that for all augmented assignment methods.
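
The difference between the __add__ fallback and a real __iadd__ shows up when a second variable references the same object. The following sketch uses two hypothetical classes, AddOnly and InPlace, invented for this comparison. InPlace omits __add__ only to keep the example short.

```python
class AddOnly:
    """Hypothetical class that defines __add__ but not __iadd__."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if not isinstance(other, AddOnly):
            return NotImplemented

        return AddOnly(self.value + other.value)


class InPlace:
    """Hypothetical class that defines __iadd__."""

    def __init__(self, value):
        self.value = value

    def __iadd__(self, other):
        if not isinstance(other, InPlace):
            return NotImplemented

        self.value += other.value
        return self


a = AddOnly(1)
a_alias = a
a += AddOnly(2)         # falls back to __add__; a now names a new object
print(a.value)          # 3
print(a_alias.value)    # 1
print(a is a_alias)     # False

b = InPlace(1)
b_alias = b
b += InPlace(2)         # mutates the existing object
print(b.value)          # 3
print(b_alias.value)    # 3
print(b is b_alias)     # True
```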

Other arithmetic operator methods include __sub__ (subtraction with -), __mul__ (multiplication with *), __truediv__ (floating division with /), and __floordiv__ (integer division with //). There are several more you can use. As with __add__ and __iadd__, you should normally define the __isub__, __imul__, __itruediv__, and __ifloordiv__ methods when you define the primary method.

Defining arithmetic operators for custom types can lead to elegant code, but only when their use is intuitive and consistent with the rest of Python. The operators must make sense and generally be limited to numeric and sequence types. Don't define arithmetic operators simply because it's cool; it probably isn't.

In particular, the arithmetic operators should obey the commutative and associative laws of arithmetic, as appropriate. For example, + and * should be commutative and associative.

  • Commutative law:
    • a + b == b + a
    • a * b == b * a
  • Associative law:
    • a + (b + c) == (a + b) + c
    • a * (b * c) == (a * b) * c

Most other operators do not have to be commutative or associative.

It's worth noting that concatenation isn't commutative, yet Python uses + for strings, lists, and tuples. One could argue that providing + for concatenation is non-intuitive. However, it is familiar and comfortable to many developers, not just Python programmers. At this point, concatenation is a perfectly acceptable use for the + operator.

The * operator for strings, lists, and tuples is commutative, associative, and relatively intuitive. Performing repetition for other types that support concatenation is acceptable.
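
You can confirm these properties of the built-in sequence types with a few quick comparisons:

```python
# Concatenation is associative but not commutative
concat_commutes = ('ab' + 'cd') == ('cd' + 'ab')
concat_associates = (('ab' + 'cd') + 'ef') == ('ab' + ('cd' + 'ef'))
print(concat_commutes)      # False
print(concat_associates)    # True

# Repetition is commutative and associative
repeat_commutes = ([1, 2] * 3) == (3 * [1, 2])
repeat_associates = (('ab' * 2) * 3) == ('ab' * (2 * 3))
print(repeat_commutes)      # True
print(repeat_associates)    # True
```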

Think carefully before defining arithmetic operators for non-arithmetic classes. Operators should be intuitive and consistent with Python's built-in types, or they may lead to confusion and errors.

Magic Variables

Python has a handful of magic variables, aka dunder variables, that are primarily useful for debugging and testing. Let's look at a few of them.

The __name__ Variable

__name__ returns the current module's name as a string. Here's a simple example involving 2 modules and a main program:

# mod1.py
print(__name__)

# mod2.py
print(__name__)

# test.py
import mod1
import mod2

print(__name__)

# Output
mod1
mod2
__main__

As you can see, the print(__name__) statements in mod1.py and mod2.py print mod1 and mod2, respectively. However, our program showed a name of __main__, not test.

This is normal: if the current module is the program being run, __name__ is set to '__main__'. It's common to see code like this in Python programs to facilitate testing:

if __name__ == '__main__':
    main()  # main is a placeholder for the program's main processing function

This code runs the entire program when the module is the main program. It does nothing otherwise. This lets you test your code in a more piecemeal style without running the full program version.
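
Here's a small, complete sketch of the pattern. The total_votes and main functions are hypothetical names chosen for the example. Another module (or a test) can import the file and call total_votes directly without triggering main.

```python
def total_votes(counts):
    """Return the sum of a list of vote counts."""
    return sum(counts)


def main():
    # The program's main processing function
    print(total_votes([3, 4, 1]))  # 8


if __name__ == '__main__':
    # Runs only when this file is executed as the main program
    main()
```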

The __file__ Variable

__file__ contains the full path name of the currently running program's file. This can help your program find the assets and other resources it needs.

import os

print(__file__)
print(os.path.abspath(__file__))
assets = os.path.abspath(f'{__file__}/../assets')
print(assets)

image = f'{assets}/foo.png'
print(image)

Assume we run this code as follows:

mkdir ~/Projects/Bar
cd ~/Projects/Bar
python ../Foo/file.py

This code will output something like this:

/Users/me/Projects/Bar/../Foo/file.py
/Users/me/Projects/Foo/file.py
/Users/me/Projects/Foo/assets
/Users/me/Projects/Foo/assets/foo.png

On line 3 of file.py, we print the value of __file__. The output consists of the file name used in the python command (../Foo/file.py) appended to the absolute path name of the current working directory (/Users/me/Projects/Bar).

If you want to eliminate that relative file reference (../Foo/file.py) from the output, you need to request the absolute path name by passing __file__ as an argument to the path.abspath function from the os module. We do this on line 4 of file.py. This eliminates the /Bar/.. portion of the file name.

On line 5 of file.py, we want to get the absolute path name of the assets subdirectory in the project directory (/Users/me/Projects/Foo). Since we only have __file__ to work with, we can use a relative path name and pass it to os.path.abspath. To get the relative path name, we just append ../assets to the value of __file__.

Finally, on lines 8 and 9, we build and print the absolute path name of our foo.png image file in the assets folder.

Using __file__ can be a little tricky until you get comfortable with it. You might think it'll be easier to just hardcode the file and folder names in your program. However, that's not a good idea. Once you start distributing the program, you'll lose control over where people will put the project files. By using __file__, relative path names, and os.path.abspath, you won't need to care about where people install your software. So long as they don't mess with the folder structure of the project, the program will work.
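
If you'd rather not build paths with string interpolation, os.path.dirname and os.path.join offer an alternative. The sketch below is one possible approach, not the only one. The asset_path helper is a hypothetical function, and it takes the file's path as a parameter so you can try it with any path. In a real script, you would normally pass __file__.

```python
import os


def asset_path(module_file, filename):
    """Return the absolute path of filename in the assets folder
    that sits beside module_file."""
    project_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(project_dir, 'assets', filename)


# In a script, you would typically call asset_path(__file__, 'foo.png')
image = asset_path('/Users/me/Projects/Bar/../Foo/file.py', 'foo.png')
print(image)
# On a Unix-like system: /Users/me/Projects/Foo/assets/foo.png
```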

The __dict__ Variable

__dict__ returns a dictionary of all the instance variables defined by an object. This can be helpful in the REPL:

class MyClass:

    def __init__(self, x):
        self.x = x
        self.y = []
        self.z = 'xxx'

obj = MyClass(5)
print(obj.__dict__)
# {'x': 5, 'y': [], 'z': 'xxx'}
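
The dictionary is live: it reflects instance variables added after the object is created, and it doesn't include methods, which belong to the class. The built-in vars function returns the same dictionary. Here's a brief sketch using a trimmed-down MyClass:

```python
class MyClass:

    def __init__(self, x):
        self.x = x


obj = MyClass(5)
print(obj.__dict__)                # {'x': 5}

obj.y = 'new'                      # add an instance variable later
print(obj.__dict__)                # {'x': 5, 'y': 'new'}

print(vars(obj) is obj.__dict__)   # True
print('__init__' in obj.__dict__)  # False
```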

Summary

This chapter provided a fast and short tour of the world of magic methods, also known as dunder methods. We learned how to customize the behavior of the str and repr functions and how to customize comparisons and arithmetic operations. A few magic variables appeared out of nowhere and vanished just as quickly.

Let's do some exercises before we move on.

Exercises

  1. Create a Car class that makes the following code work as indicated:

    vwbuzz = Car('ID.Buzz', 2024, 'red')
    print(vwbuzz)        # Red 2024 ID.Buzz
    print(repr(vwbuzz))  # Car('ID.Buzz', 2024, 'red')
    

    Solution

    class Car:
    
        def __init__(self, model, year, color):
            self.model = model
            self.year = year
            self.color = color
    
        def __str__(self):
            return f'{self.color.title()} {self.year} {self.model}'
    
        def __repr__(self):
            color = repr(self.color)
            year = repr(self.year)
            model = repr(self.model)
            return f'Car({model}, {year}, {color})'
    
  2. Don't let the mathiness of this problem scare you off. You don't have to know any math; you only need to know how to write code.

    Earlier, we wrote the following class:

    class Vector:
    
        def __init__(self, x, y):
            self.x = x
            self.y = y
    
        def __add__(self, other):
            if not isinstance(other, Vector):
                return NotImplemented
    
            new_x = self.x + other.x
            new_y = self.y + other.y
            return Vector(new_x, new_y)
    
        # __iadd__ method omitted; we don't need it for this exercise
    
        def __repr__(self):
            x = repr(self.x)
            y = repr(self.y)
            return f'Vector({x}, {y})'
    
    v1 = Vector(5, 12)
    v2 = Vector(13, -4)
    print(v1 + v2)      # Vector(18, 8)
    

    Update this class so the following code works as indicated:

    print(v1 - v2) # Vector(-8, 16)
    print(v1 * v2) # 17
    print(abs(v1)) # 13.0
    

    In this code, the * operator should compute the dot product of the two vectors. For instance, if you have Vector(a, b) and Vector(c, d), the dot product is a * c + b * d, where * and + are the usual arithmetic operators.

    The abs function computes the magnitude of a vector. If you have a vector Vector(a, b), the magnitude is given by sqrt(a**2 + b**2). You will need the math module to access the sqrt function. Note that abs is a built-in function, so you don't want to override it entirely; you only want to change its behavior for Vector objects. There's a magic method you can use.

    Don't worry about augmented assignment in this exercise.

    Solution

    import math
    
    class Vector:
    
        def __init__(self, x, y):
            self.x = x
            self.y = y
    
        def __add__(self, other):
            if not isinstance(other, Vector):
                return NotImplemented
    
            new_x = self.x + other.x
            new_y = self.y + other.y
            return Vector(new_x, new_y)
    
        def __sub__(self, other):
            if not isinstance(other, Vector):
                return NotImplemented
    
            new_x = self.x - other.x
            new_y = self.y - other.y
            return Vector(new_x, new_y)
    
        def __mul__(self, other):
            if not isinstance(other, Vector):
                return NotImplemented
    
            dot_product = ((self.x * other.x) +
                            (self.y * other.y))
            return dot_product
    
        def __abs__(self):
            sum_of_squares = ((self.x ** 2) +
                              (self.y ** 2))
            return math.sqrt(sum_of_squares)
    
        def __repr__(self):
            x = repr(self.x)
            y = repr(self.y)
            return f'Vector({x}, {y})'
    

    You can define __sub__ and __mul__ to make - and * work with Vector objects. The __abs__ method lets us customize the built-in abs function for Vector objects without changing its ability to compute absolute values for ordinary numbers.

  3. Challenge: Create the classes needed to make the following code work as shown:

    mike_jones = Candidate('Mike Jones')
    susan_dore = Candidate('Susan Dore')
    kim_waters = Candidate('Kim Waters')
    
    candidates = {
        mike_jones,
        susan_dore,
        kim_waters,
    }
    
    votes = [
        mike_jones,
        susan_dore,
        mike_jones,
        susan_dore,
        susan_dore,
        kim_waters,
        susan_dore,
        mike_jones,
    ]
    
    for candidate in votes:
        candidate += 1
    
    election = Election(candidates)
    election.results()
    
    Mike Jones: 3 votes
    Susan Dore: 4 votes
    Kim Waters: 1 votes
    
    Susan Dore won: 50.0% of votes
    

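    Don't worry about ties or whether votes should be singular. Since candidates is a set, the candidates may be listed in a different order when you run the code.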

    You should use the __iadd__ method to customize the behavior of += in the for loop. __iadd__ is similar to __add__ except that it implements +=. You don't need a __add__ method.

    Solution

    class Candidate:
    
        def __init__(self, name):
            self.name = name
            self.votes = 0
    
        def __iadd__(self, other):
            if not isinstance(other, int):
                return NotImplemented
    
            self.votes += other
            return self
    
    class Election:
    
        def __init__(self, candidates):
            self.candidates = candidates
    
        def results(self):
            max_votes = 0
            vote_count = 0
            winner = None
    
            for candidate in self.candidates:
                vote_count += candidate.votes
                if candidate.votes > max_votes:
                    max_votes = candidate.votes
                    winner = candidate.name
    
            for candidate in self.candidates:
                name = candidate.name
                votes = candidate.votes
                print(f'{name}: {votes} votes')
    
            percent = 100 * (max_votes / vote_count)
            print()
            print(f'{winner} won: {percent}% of votes')
    

    The __iadd__ method is crucial to this solution. Note that we are adding integers to the Candidate objects; thus, our __iadd__ method needs to deal with integers to the right of the +=.

    We didn't customize + in this example. Had we done so, we would have had to create a new Candidate object, which implies multiple Candidate objects for the same candidate. They might even end up with different vote counts.

    In general, you shouldn't customize + and += when you need to create an unwanted new object. Use something like an add_vote method instead.
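
For the last exercise, an add_vote version might look something like the following sketch. The method name comes from the suggestion above; the rest is just one possible arrangement.

```python
class Candidate:

    def __init__(self, name):
        self.name = name
        self.votes = 0

    def add_vote(self):
        # An explicit method makes it obvious that the object is mutated
        self.votes += 1


susan_dore = Candidate('Susan Dore')
for _ in range(4):
    susan_dore.add_vote()

print(f'{susan_dore.name}: {susan_dore.votes} votes')
# Susan Dore: 4 votes
```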