2013-08-11 by sourpoi in misc, tagged: python metaclass class

If you are having trouble customizing types or moving beyond basic class definitions, hopefully this will help. In the context of instantiating a simple class, we'll learn about the mechanics that require more attention when working with complex types. In progress - everything after the review is sketchy.

A basic class

As of Python 3.3.2 and 2.7.3, this is how we might use a class definition to define a Person, instantiate a Person named Moe, and have him greet Larry:

class Person(object):

    def __init__(self, name):
        self.name = name

    def say_hello(self, friend):
        print('{} says: Hello {}!'.format(self.name, friend))

moe = Person('Moe')
moe.say_hello('Larry')
  • The Person type declares itself to be based on the built-in Python object,
  • __init__() and say_hello() are included in the Person type's namespace by nesting them within the Person scope,
  • moe, an instance of Person, is created by calling Person() with the argument 'Moe', which is expected by the initializer __init__(),
  • __init__() automatically initializes a Person instance (but where the instance comes from is a mystery), and
  • moe can say_hello() "natively" because say_hello() is a method of Person instances.

Classes - or more generally, types - bundle related data and methods in a namespace. While the Person type above is defined in terms of a class declaration and a nested scope, the components of the Person type can also be assembled manually to the same effect.

Let's follow the evolution of a manually-assembled type to identify the components involved and clarify the assumptions that must be satisfied (the mystery in the fourth point above) in the process of instantiating a basic class.

The basic type and object

The key players in a manual assembly are type.__new__() and object.__new__(). From the docstring:

T.__new__(S, ...) -> a new object with type S, a subtype of T

In context: we'll use type.__new__() to create a new type that object.__new__() will instantiate. Note that the signature of __new__() depends on the type or object for which it was defined.

clsname = 'Person'
bases = (object,)
namespace = {}
Person = type.__new__(type, clsname, bases, namespace)

moe = object.__new__(Person)
moe.name = 'Moe'

def say_hello(person, target):
    print('{} says: Hello {}!'.format(person.name, target))

say_hello(moe, 'Larry')
  • The Person type is returned by type.__new__(), which was provided a) a type object type, b) the new type's name, c) a tuple of base classes the type derives from (here, the basic Python object), and d) an empty namespace dictionary,
  • no methods are explicitly included in the Person type's namespace,
  • moe, an instance of Person, was created by calling the built-in object.__new__() without initialization information,
  • nothing automatically initializes a Person instance so we just tacked a name attribute on to moe, and
  • say_hello() is just a function that accommodates moe or any other object that has a printable name attribute.

Providing type as the first argument in type.__new__(type, ...) requests type.__new__() to return a new "type object." In our case, we've named that type object Person.
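
As a quick check of the claims above, the following sketch inspects what type.__new__() and object.__new__() hand back. The assertions and the name Person2 are illustrations added here, not part of the walkthrough. The sketch also assumes that the three-argument call type(name, bases, namespace) is an equivalent shorthand for building the type.

```python
# Build the type by hand, as in the listing above.
Person = type.__new__(type, 'Person', (object,), {})

# The result is a type object carrying the name and bases we supplied.
assert isinstance(Person, type)
assert Person.__name__ == 'Person'
assert Person.__bases__ == (object,)

# object.__new__() produces an instance whose type is Person.
moe = object.__new__(Person)
assert type(moe) is Person

# Person2 is a hypothetical second type built with the three-argument
# shorthand; it is equivalent in shape but a distinct type object.
Person2 = type('Person', (object,), {})
assert isinstance(Person2, type)
assert Person2 is not Person
```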

Bound methods

In this step we'll formalize the Person type's capabilities by explicitly including them in the Person type's namespace.

def __init__(self, name):
    self.name = name

def say_hello(self, target):
    print('{} says: Hello {}!'.format(self.name, target))

clsname = 'Person'
bases = (object,)
namespace = {
    '__init__': __init__,
    'say_hello': say_hello }
Person = type.__new__(type, clsname, bases, namespace)

moe = object.__new__(Person)
moe.__init__('Moe')
moe.say_hello('Larry')
  • We create a Person type using the same mechanism as above, but this time..
  • __init__() and say_hello() have been included in the Person type's namespace,
  • we use the same instantiation mechanism as above,
  • moe initializes himself by calling his bound method __init__() (details follow), and
  • moe can again say_hello() "natively" because say_hello() was included in the Person type's namespace.

When a function in a type's namespace is looked up through an instance of that type, Python returns a "bound method": the function bound to that instance. This applies to most of the functions included in the type's namespace. When an instance's bound method is called, the method implicitly receives the instance it is bound to as its first argument; hence the convention of using self as the first argument in most method signatures and calling an instance's method with only the remaining arguments in the signature. [1]
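
A minimal sketch of that binding, assuming the Person class from the first listing with one change made for illustration: say_hello() returns its string instead of printing it, so the calls can be compared. The names plain and bound are hypothetical.

```python
class Person(object):

    def __init__(self, name):
        self.name = name

    def say_hello(self, friend):
        return '{} says: Hello {}!'.format(self.name, friend)

moe = Person('Moe')

# In the type's namespace, say_hello is stored as a plain function.
plain = Person.__dict__['say_hello']

# Looking it up through the instance yields a bound method that remembers moe.
bound = moe.say_hello
assert bound.__self__ is moe
assert bound.__func__ is plain

# Calling the bound method is equivalent to passing the instance explicitly.
assert bound('Larry') == plain(moe, 'Larry')
assert bound('Larry') == Person.say_hello(moe, 'Larry')
```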

Again, this is true for most of the functions in a type's namespace. __new__() is one exception, and it is treated differently by virtue of its name alone. A function in a type's namespace named __new__() - or, with less pedantry, "a type method named __new__()" - is automatically treated as a static method ..which is a fancy way of saying "a method that does not implicitly receive a first argument." A static method is simply a function in a type's namespace.
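
The special treatment can be observed directly. This is a sketch of what CPython does, not a statement about every implementation: when type.__new__() finds a plain function named __new__ in the namespace, it stores it wrapped in a staticmethod, while other functions are stored unchanged. The shortened say_hello() is for illustration only.

```python
def __new__(cls):
    return object.__new__(cls)

def say_hello(self, target):
    return 'Hello {}!'.format(target)

namespace = {'__new__': __new__, 'say_hello': say_hello}
Person = type.__new__(type, 'Person', (object,), namespace)

# __new__ was wrapped in a staticmethod; say_hello is still a plain function.
assert isinstance(Person.__dict__['__new__'], staticmethod)
assert not isinstance(Person.__dict__['say_hello'], staticmethod)

# Because it is static, the type to instantiate must be passed explicitly.
moe = Person.__new__(Person)
assert type(moe) is Person
```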

Type.__new__()

We'll wrap object.__new__() in Person.__new__() to include all the components needed to instantiate a Person within the Person type's scope.

def __new__(cls):
    return object.__new__(cls)

def __init__(self, name):
    self.name = name

def say_hello(self, target):
    print('{} says: Hello {}!'.format(self.name, target))

clsname = 'Person'
bases = (object,)
namespace = {
    '__new__': __new__,
    '__init__': __init__,
    'say_hello': say_hello }
Person = type.__new__(type, clsname, bases, namespace)

moe = Person.__new__(Person)
moe.__init__('Moe')

moe.say_hello('Larry')
  • We create a Person type using the same mechanism as above,
  • __new__() joins the Person type's namespace,
  • moe is now created by calling Person.__new__(Person),
  • moe is initialized the same way as above,
  • say_hello() hasn't changed either.

object is a base of the Person type, so Person.__new__(Person) would wind up calling object.__new__(Person) if __new__() wasn't included in the Person type's namespace. We wrapped object.__new__() instead of eliminating it (as redundant as it is) to emphasize that the Person type is free to override the otherwise inherited behavior of object.__new__().
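
To illustrate the inherited case, here is a small sketch in which the namespace is left empty, so Person.__new__ is found on object through normal attribute lookup. The assertions are added here for illustration.

```python
# No __new__ in the namespace this time.
Person = type.__new__(type, 'Person', (object,), {})
assert '__new__' not in Person.__dict__

# The lookup falls through to object, which still builds a Person.
moe = Person.__new__(Person)
assert type(moe) is Person
```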

If we're free to override __new__(), what guarantees that it will return a Person instance, let alone an instance with the expected __init__()? Nothing but your honor.

In most cases, we expect __new__() to return an instance of the type which provided it, and we assume that this instance is compatible with the type's __init__() (if present). Unlike __init__(), __new__() is always present in new-style classes because they inherit from object.

Conventional instantiation

We're almost finished. Although the manual instantiation and initialization was straightforward,

>>> moe = Person.__new__(Person)
>>> moe.__init__('Moe')

..our Person type can't be called conventionally:

>>> larry = Person('Larry')
TypeError: __new__() takes 1 positional argument but 2 were given

This call requires that __new__() receive the type to instantiate as its first argument (as in Person.__new__(Person)) and accommodate any following arguments for the sake of initialization. Because instantiation and initialization are independent steps, the convenience of the conventional call requires that a type's __new__() method, which may be inherited, play nicely with its __init__() method.

To support the conventional call for our case specifically, the least we would need to do is modify our existing Person.__new__()..

def __new__(cls):
    return object.__new__(cls)

..to accommodate the name argument expected by __init__():

def __new__(cls, name):
    return object.__new__(cls)

..regardless of whether or not the argument played a role in instantiation (which in this case, it doesn't).
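
A more permissive alternative, which is not the one this walkthrough uses, is to let __new__() accept and ignore arbitrary arguments so that it keeps working if the signature of __init__() changes. A minimal sketch under that assumption:

```python
def __new__(cls, *args, **kwargs):
    # Ignore initialization arguments; object.__new__() only needs the type.
    return object.__new__(cls)

def __init__(self, name):
    self.name = name

namespace = {'__new__': __new__, '__init__': __init__}
Person = type.__new__(type, 'Person', (object,), namespace)

# Positional and keyword forms of the conventional call both work.
larry = Person('Larry')
curly = Person(name='Curly')
assert larry.name == 'Larry'
assert curly.name == 'Curly'
```

The trade-off is that a mistyped argument is only reported by __init__(), since __new__() no longer checks anything.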

In an excellent article about __new__(), Arion Sprague summarizes object creation:

def instantiate(cls, *args, **kwargs):
    obj = cls.__new__(cls, *args, **kwargs)
    if isinstance(obj,cls):
        cls.__init__(obj, *args, **kwargs)
    return obj

This assumes that cls.__new__() and cls.__init__() accept those arguments for a given cls, but it's not a bad idea for a type to be accommodating if it invites customization. Also, note that initialization does not necessarily follow instantiation; it depends on isinstance(obj, cls). It is up to you to coordinate __new__() and __init__() if you want a type to behave conventionally.
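
The isinstance() check can be seen in action with a deliberately unconventional type. In this sketch, Odd and its init_calls counter are hypothetical names invented for the demonstration; instantiate() is the summary function quoted above.

```python
def instantiate(cls, *args, **kwargs):
    obj = cls.__new__(cls, *args, **kwargs)
    if isinstance(obj, cls):
        cls.__init__(obj, *args, **kwargs)
    return obj

class Odd(object):
    init_calls = 0

    def __new__(cls, value):
        # Break the convention on purpose: return an int, not an Odd.
        return int(value)

    def __init__(self, value):
        # Not reached via Odd(...), because __new__() did not return an Odd.
        Odd.init_calls += 1

class Person(object):
    def __init__(self, name):
        self.name = name

# The conventional call and the summary function agree: no initialization.
assert type(Odd('7')) is int
assert instantiate(Odd, '7') == 7
assert Odd.init_calls == 0

# A conventional type is initialized as usual.
assert instantiate(Person, 'Moe').name == 'Moe'
```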

For our purposes, we'll just include the bare essentials in __new__() and conclude our manual assembly of the Person type.

The assembled type

def __new__(cls, name):
    return object.__new__(cls)

def __init__(self, name):
    self.name = name

def say_hello(self, target):
    print('{} says: Hello {}!'.format(self.name, target))

clsname = 'Person'
bases = (object,)
namespace = {
    '__new__': __new__,
    '__init__': __init__,
    'say_hello': say_hello }
Person = type.__new__(type, clsname, bases, namespace)

moe = Person('Moe')
moe.say_hello('Larry')

Review

Mechanisms that Python enforces:

  • The built-in type.__new__() returns a new type (or class).
  • The built-in object.__new__() returns a new object instance.
  • "Bound methods" automatically receive their instance as a first argument. Most functions in a type's namespace become bound methods when they are accessed through an instance.
  • A type's __new__() method, if it exists, is treated as a static method. A "static method" acts like a normal function within a type's namespace; it does not automatically receive a first argument.

Conventions that you probably want to follow:

  • __new__() expects to receive the type to instantiate as its first argument and is expected to return an instance of that type. Initialization code should be left in __init__().
  • __init__() expects to receive an instance of the type which defined it as its first argument and is expected to initialize that instance. It is not expected to return anything.
  • A conventional type or class call of the form obj = Type(*args, **kwargs) requires that Type.__new__() accommodate the same *args, **kwargs that Type.__init__() accepts.

__new__() considers super()

By convention, calling Type.__new__(Type, ...) should return an instance of Type. If Type does not define its own __new__(), Base.__new__(Type, ...) should deliver an instance of Type if Type inherits from Base (we used object, the default base type, above). If Type does define its own __new__(), then it must use a base type's __new__() method to avoid infinite recursion.

class Type(Base):

    # Results in RuntimeError: maximum recursion depth exceeded
    #
    #def __new__(cls, arg):
    #    return Type.__new__(cls, arg)

    # Better.
    def __new__(cls, arg):
        return Base.__new__(cls)

    def __init__(self, arg):
        self.arg = arg
        print("I coordinated the arguments of __init__ and __new__!")
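
For reference, here is a runnable sketch of the same idea with a concrete Base and with super() in place of the hard-coded base. The empty Base class is an assumption made for illustration. Since __new__() is static, cls still has to be passed explicitly.

```python
class Base(object):
    pass

class Type(Base):

    def __new__(cls, arg):
        # super() resolves to the next __new__ after Type in cls.__mro__
        # (here object's, reached through Base).
        return super(Type, cls).__new__(cls)

    def __init__(self, arg):
        self.arg = arg

t = Type(42)
assert isinstance(t, Type)
assert t.arg == 42
```
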
TODO:

Talk about using super() as a convenience, then move on to __mro__. Maybe add something about https://fuhm.net/super-harmful/ and https://rhettinger.wordpress.com/2011/05/26/super-considered-super/, opinionize that super() isn't really about a generic capability like an iterator or context manager - it should be treated something like a decorator: a convenience within the context of a framework or library. Using them for more than a convenience assumes that you're coordinating classes and informing your consumers.

TODO:

Also emphasize the functional connection between inherited methods, self, and the original __mro__ that super() satisfies.

Class protocol

TODO:

The beauty of Python's native protocols (descriptors, context managers, etc.) is that they are conventions which leverage familiar, built-in mechanisms; in order to take full advantage you just learn some new magic methods and understand how Python implements them. I'm not aware of a Python "class protocol" per se, but the built-in type and object capabilities and the relationship between __new__() and __init__() resembles one. Go into more detail about Python's object-orientedness being built on functions and namespaces..

[1] On the other hand, functions destined to be bound to an instance are considered "unbound" while they are just sitting in the type's namespace. It is unusual, but you can call a type's unbound method with a compatible object as the first argument: MyType.method(some_object, ...)
