Python2.自定义扩展类型:教程

python lib

Python 允许编写 C 扩展模块定义可以从 Python 代码中操纵的新类型,这很像内置的 strlist 类型。所有扩展类型的代码都遵循一个模式,但是在您开始之前,您需要了解一些细节。这份文件是对这个主题介绍。

2.1. 基础¶

CPython 运行时将所有 Python 对象都视为类型 PyObject* 的变量,即所有 Python 对象的"基础类型"。 PyObject 结构体本身包含了对象的 reference count 和对象的"类型对象"。 类型对象确定解释器需要调用哪些 (C) 函数,例如一个属性查询一个对象,一个方法调用,或者与另一个对象相乘。 这些 C 函数被称为“类型方法”。

所以,如果你想要定义新的扩展类型,需要创建新的类型对象。

这类事情只能用例子解释,这里用一个最小化但完整的的模块,定义了新的类型叫做 Custom 在C扩展模块 custom 里。

注解

这里展示的方法是定义 static 扩展类型的传统方法。可以适合大部分用途。C API也可以定义在堆上分配的扩展类型,使用 PyType_FromSpec() 函数,但不在本入门里讨论。

#define PY_SSIZE_T_CLEAN

#include<Python.h>

typedefstruct{

PyObject_HEAD

/* Type-specific fields go here. */

}CustomObject;

staticPyTypeObjectCustomType={

PyVarObject_HEAD_INIT(NULL,0)

.tp_name="custom.Custom",

.tp_doc="Custom objects",

.tp_basicsize=sizeof(CustomObject),

.tp_itemsize=0,

.tp_flags=Py_TPFLAGS_DEFAULT,

.tp_new=PyType_GenericNew,

};

staticPyModuleDefcustommodule={

PyModuleDef_HEAD_INIT,

.m_name="custom",

.m_doc="Example module that creates an extension type.",

.m_size=-1,

};

PyMODINIT_FUNC

PyInit_custom(void)

{

PyObject*m;

if(PyType_Ready(&CustomType)<0)

returnNULL;

m=PyModule_Create(&custommodule);

if(m==NULL)

returnNULL;

Py_INCREF(&CustomType);

if(PyModule_AddObject(m,"Custom",(PyObject*)&CustomType)<0){

Py_DECREF(&CustomType);

Py_DECREF(m);

returnNULL;

}

returnm;

}

这部分很容易理解,这是为了跟上一章能对接上。这个文件定义了三件事:

  1. Custom 类的对象 object 包含了: CustomObject 结构,这会为每个 Custom 实例分配一次。

  2. Custom type 的行为:这是 CustomType 结构体,其定义了一堆标识和函数指针,会指向解释器里请求的操作。

  3. 初始化 custom 模块: PyInit_custom 函数和对应的 custommodule 结构体。

结构的第一块是

typedefstruct{

PyObject_HEAD

}CustomObject;

这就是一个自定义对象所包含的。 PyObject_HEAD 是强制要求必须在每个对象结构体之前,用以定义一个类型为 PyObject 的字段叫 ob_base ,包含了一个指针指向类型对象和一个引用计数(这可以用宏 Py_REFCNTPy_TYPE 来区分)。用宏来抽象,使得附加字段可以用调试构建。

注解

注意在宏 PyObject_HEAD 后没有分号。意外添加分号会导致编译器提示出错。

当然,对象除了在 PyObject_HEAD 存储数据外,还有额外数据;例如,如下定义了标准的Python浮点数:

typedefstruct{

PyObject_HEAD

doubleob_fval;

}PyFloatObject;

第二个位是类型对象的定义:

staticPyTypeObjectCustomType={

PyVarObject_HEAD_INIT(NULL,0)

.tp_name="custom.Custom",

.tp_doc="Custom objects",

.tp_basicsize=sizeof(CustomObject),

.tp_itemsize=0,

.tp_flags=Py_TPFLAGS_DEFAULT,

.tp_new=PyType_GenericNew,

};

注解

推荐使用如上C99风格的初始化,以避免列出所有的 PyTypeObject 字段,其中很多是你不需要关心的,这样也可以避免关注字段的定义顺序。

object.h 中实际定义的 PyTypeObject 具有比如上定义更多的 字段。 剩余的字段会由 C 编译器用零来填充,通常的做法是不显式地指定它们,除非你确实需要它们。

我们先挑选一部分,每次一个字段:

PyVarObject_HEAD_INIT(NULL,0)

这一行是强制的样板,用以初始化如上提到的 ob_base 字段:

.tp_name="custom.Custom",

我们的类型的名称。 这将出现在我们的对象的默认文本表示形式和某些错误消息中,例如:

>>> ""+custom.Custom()

Traceback (most recent call last):

File "<stdin>", line 1, in <module>

TypeError: can only concatenate str (not "custom.Custom") to str

Note that the name is a dotted name that includes both the module name and the

name of the type within the module. The module in this case is custom and

the type is Custom, so we set the type name to custom.Custom.

Using the real dotted import path is important to make your type compatible

with the pydoc and pickle modules.

.tp_basicsize=sizeof(CustomObject),

.tp_itemsize=0,

This is so that Python knows how much memory to allocate when creating

new Custom instances. tp_itemsize is

only used for variable-sized objects and should otherwise be zero.

注解

If you want your type to be subclassable from Python, and your type has the same

tp_basicsize as its base type, you may have problems with multiple

inheritance. A Python subclass of your type will have to list your type first

in its __bases__, or else it will not be able to call your type's

__new__() method without getting an error. You can avoid this problem by

ensuring that your type has a larger value for tp_basicsize than its

base type does. Most of the time, this will be true anyway, because either your

base type will be object, or else you will be adding data members to

your base type, and therefore increasing its size.

We set the class flags to Py_TPFLAGS_DEFAULT.

.tp_flags=Py_TPFLAGS_DEFAULT,

All types should include this constant in their flags. It enables all of the

members defined until at least Python 3.3. If you need further members,

you will need to OR the corresponding flags.

We provide a doc string for the type in tp_doc.

.tp_doc="Custom objects",

To enable object creation, we have to provide a tp_new

handler. This is the equivalent of the Python method __new__(), but

has to be specified explicitly. In this case, we can just use the default

implementation provided by the API function PyType_GenericNew().

.tp_new=PyType_GenericNew,

Everything else in the file should be familiar, except for some code in

PyInit_custom():

if(PyType_Ready(&CustomType)<0)

return;

This initializes the Custom type, filling in a number of members

to the appropriate default values, including ob_type that we initially

set to NULL.

Py_INCREF(&CustomType);

if(PyModule_AddObject(m,"Custom",(PyObject*)&CustomType)<0){

Py_DECREF(&CustomType);

Py_DECREF(m);

returnNULL;

}

This adds the type to the module dictionary. This allows us to create

Custom instances by calling the Custom class:

>>> importcustom

>>> mycustom=custom.Custom()

That's it! All that remains is to build it; put the above code in a file called

custom.c and:

python notranslate">
fromdistutils.coreimportsetup,Extension

setup(name="custom",version="1.0",

ext_modules=[Extension("custom",["custom.c"])])

in a file called setup.py; then typing

$ python setup.py build

at a shell should produce a file custom.so in a subdirectory; move to

that directory and fire up Python --- you should be able to importcustom and

play around with Custom objects.

这并不难,对吗?

Of course, the current Custom type is pretty uninteresting. It has no data and

doesn't do anything. It can't even be subclassed.

注解

While this documentation showcases the standard distutils module

for building C extensions, it is recommended in real-world use cases to

use the newer and better-maintained setuptools library. Documentation

on how to do this is out of scope for this document and can be found in

the Python Packaging User's Guide.

2.2. Adding data and methods to the Basic example¶

Let's extend the basic example to add some data and methods. Let's also make

the type usable as a base class. We'll create a new module, custom2 that

adds these capabilities:

#define PY_SSIZE_T_CLEAN

#include<Python.h>

#include"structmember.h"

typedefstruct{

PyObject_HEAD

PyObject*first;/* first name */

PyObject*last;/* last name */

intnumber;

}CustomObject;

staticvoid

Custom_dealloc(CustomObject*self)

{

Py_XDECREF(self->first);

Py_XDECREF(self->last);

Py_TYPE(self)->tp_free((PyObject*)self);

}

staticPyObject*

Custom_new(PyTypeObject*type,PyObject*args,PyObject*kwds)

{

CustomObject*self;

self=(CustomObject*)type->tp_alloc(type,0);

if(self!=NULL){

self->first=PyUnicode_FromString("");

if(self->first==NULL){

Py_DECREF(self);

returnNULL;

}

self->last=PyUnicode_FromString("");

if(self->last==NULL){

Py_DECREF(self);

returnNULL;

}

self->number=0;

}

return(PyObject*)self;

}

staticint

Custom_init(CustomObject*self,PyObject*args,PyObject*kwds)

{

staticchar*kwlist[]={"first","last","number",NULL};

PyObject*first=NULL,*last=NULL,*tmp;

if(!PyArg_ParseTupleAndKeywords(args,kwds,"|OOi",kwlist,

&first,&last,

&self->number))

return-1;

if(first){

tmp=self->first;

Py_INCREF(first);

self->first=first;

Py_XDECREF(tmp);

}

if(last){

tmp=self->last;

Py_INCREF(last);

self->last=last;

Py_XDECREF(tmp);

}

return0;

}

staticPyMemberDefCustom_members[]={

{"first",T_OBJECT_EX,offsetof(CustomObject,first),0,

"first name"},

{"last",T_OBJECT_EX,offsetof(CustomObject,last),0,

"last name"},

{"number",T_INT,offsetof(CustomObject,number),0,

"custom number"},

{NULL}/* Sentinel */

};

staticPyObject*

Custom_name(CustomObject*self,PyObject*Py_UNUSED(ignored))

{

if(self->first==NULL){

PyErr_SetString(PyExc_AttributeError,"first");

returnNULL;

}

if(self->last==NULL){

PyErr_SetString(PyExc_AttributeError,"last");

returnNULL;

}

returnPyUnicode_FromFormat("%S %S",self->first,self->last);

}

staticPyMethodDefCustom_methods[]={

{"name",(PyCFunction)Custom_name,METH_NOARGS,

"Return the name, combining the first and last name"

},

{NULL}/* Sentinel */

};

staticPyTypeObjectCustomType={

PyVarObject_HEAD_INIT(NULL,0)

.tp_name="custom2.Custom",

.tp_doc="Custom objects",

.tp_basicsize=sizeof(CustomObject),

.tp_itemsize=0,

.tp_flags=Py_TPFLAGS_DEFAULT|Py_TPFLAGS_BASETYPE,

.tp_new=Custom_new,

.tp_init=(initproc)Custom_init,

.tp_dealloc=(destructor)Custom_dealloc,

.tp_members=Custom_members,

.tp_methods=Custom_methods,

};

staticPyModuleDefcustommodule={

PyModuleDef_HEAD_INIT,

.m_name="custom2",

.m_doc="Example module that creates an extension type.",

.m_size=-1,

};

PyMODINIT_FUNC

PyInit_custom2(void)

{

PyObject*m;

if(PyType_Ready(&CustomType)<0)

returnNULL;

m=PyModule_Create(&custommodule);

if(m==NULL)

returnNULL;

Py_INCREF(&CustomType);

if(PyModule_AddObject(m,"Custom",(PyObject*)&CustomType)<0){

Py_DECREF(&CustomType);

Py_DECREF(m);

returnNULL;

}

returnm;

}

This version of the module has a number of changes.

We've added an extra include:

#include<structmember.h>

This include provides declarations that we use to handle attributes, as

described a bit later.

The Custom type now has three data attributes in its C struct,

first, last, and number. The first and last variables are Python

strings containing first and last names. The number attribute is a C integer.

The object structure is updated accordingly:

typedefstruct{

PyObject_HEAD

PyObject*first;/* first name */

PyObject*last;/* last name */

intnumber;

}CustomObject;

Because we now have data to manage, we have to be more careful about object

allocation and deallocation. At a minimum, we need a deallocation method:

staticvoid

Custom_dealloc(CustomObject*self)

{

Py_XDECREF(self->first);

Py_XDECREF(self->last);

Py_TYPE(self)->tp_free((PyObject*)self);

}

which is assigned to the tp_dealloc member:

.tp_dealloc=(destructor)Custom_dealloc,

This method first clears the reference counts of the two Python attributes.

Py_XDECREF() correctly handles the case where its argument is

NULL (which might happen here if tp_new failed midway). It then

calls the tp_free member of the object's type

(computed by Py_TYPE(self)) to free the object's memory. Note that

the object's type might not be CustomType, because the object may

be an instance of a subclass.

注解

The explicit cast to destructor above is needed because we defined

Custom_dealloc to take a CustomObject* argument, but the tp_dealloc

function pointer expects to receive a PyObject* argument. Otherwise,

the compiler will emit a warning. This is object-oriented polymorphism,

in C!

We want to make sure that the first and last names are initialized to empty

strings, so we provide a tp_new implementation:

staticPyObject*

Custom_new(PyTypeObject*type,PyObject*args,PyObject*kwds)

{

CustomObject*self;

self=(CustomObject*)type->tp_alloc(type,0);

if(self!=NULL){

self->first=PyUnicode_FromString("");

if(self->first==NULL){

Py_DECREF(self);

returnNULL;

}

self->last=PyUnicode_FromString("");

if(self->last==NULL){

Py_DECREF(self);

returnNULL;

}

self->number=0;

}

return(PyObject*)self;

}

and install it in the tp_new member:

.tp_new=Custom_new,

The tp_new handler is responsible for creating (as opposed to initializing)

objects of the type. It is exposed in Python as the __new__() method.

It is not required to define a tp_new member, and indeed many extension

types will simply reuse PyType_GenericNew() as done in the first

version of the Custom type above. In this case, we use the tp_new

handler to initialize the first and last attributes to non-NULL

default values.

tp_new is passed the type being instantiated (not necessarily CustomType,

if a subclass is instantiated) and any arguments passed when the type was

called, and is expected to return the instance created. tp_new handlers

always accept positional and keyword arguments, but they often ignore the

arguments, leaving the argument handling to initializer (a.k.a. tp_init

in C or __init__ in Python) methods.

注解

tp_new shouldn't call tp_init explicitly, as the interpreter

will do it itself.

The tp_new implementation calls the tp_alloc

slot to allocate memory:

self=(CustomObject*)type->tp_alloc(type,0);

Since memory allocation may fail, we must check the tp_alloc

result against NULL before proceeding.

注解

We didn't fill the tp_alloc slot ourselves. Rather

PyType_Ready() fills it for us by inheriting it from our base class,

which is object by default. Most types use the default allocation

strategy.

注解

If you are creating a co-operative tp_new (one

that calls a base type's tp_new or __new__()),

you must not try to determine what method to call using method resolution

order at runtime. Always statically determine what type you are going to

call, and call its tp_new directly, or via

type->tp_base->tp_new. If you do not do this, Python subclasses of your

type that also inherit from other Python-defined classes may not work correctly.

(Specifically, you may not be able to create instances of such subclasses

without getting a TypeError.)

We also define an initialization function which accepts arguments to provide

initial values for our instance:

staticint

Custom_init(CustomObject*self,PyObject*args,PyObject*kwds)

{

staticchar*kwlist[]={"first","last","number",NULL};

PyObject*first=NULL,*last=NULL,*tmp;

if(!PyArg_ParseTupleAndKeywords(args,kwds,"|OOi",kwlist,

&first,&last,

&self->number))

return-1;

if(first){

tmp=self->first;

Py_INCREF(first);

self->first=first;

Py_XDECREF(tmp);

}

if(last){

tmp=self->last;

Py_INCREF(last);

self->last=last;

Py_XDECREF(tmp);

}

return0;

}

by filling the tp_init slot.

.tp_init=(initproc)Custom_init,

The tp_init slot is exposed in Python as the

__init__() method. It is used to initialize an object after it's

created. Initializers always accept positional and keyword arguments,

and they should return either 0 on success or -1 on error.

Unlike the tp_new handler, there is no guarantee that tp_init

is called at all (for example, the pickle module by default

doesn't call __init__() on unpickled instances). It can also be

called multiple times. Anyone can call the __init__() method on

our objects. For this reason, we have to be extra careful when assigning

the new attribute values. We might be tempted, for example to assign the

first member like this:

if(first){

Py_XDECREF(self->first);

Py_INCREF(first);

self->first=first;

}

But this would be risky. Our type doesn't restrict the type of the

first member, so it could be any kind of object. It could have a

destructor that causes code to be executed that tries to access the

first member; or that destructor could release the

Global interpreter Lock and let arbitrary code run in other

threads that accesses and modifies our object.

To be paranoid and protect ourselves against this possibility, we almost

always reassign members before decrementing their reference counts. When

don't we have to do this?

  • when we absolutely know that the reference count is greater than 1;

  • when we know that deallocation of the object 1 will neither release

    the GIL nor cause any calls back into our type's code;

  • when decrementing a reference count in a tp_dealloc

    handler on a type which doesn't support cyclic garbage collection 2.

We want to expose our instance variables as attributes. There are a

number of ways to do that. The simplest way is to define member definitions:

staticPyMemberDefCustom_members[]={

{"first",T_OBJECT_EX,offsetof(CustomObject,first),0,

"first name"},

{"last",T_OBJECT_EX,offsetof(CustomObject,last),0,

"last name"},

{"number",T_INT,offsetof(CustomObject,number),0,

"custom number"},

{NULL}/* Sentinel */

};

and put the definitions in the tp_members slot:

.tp_members=Custom_members,

Each member definition has a member name, type, offset, access flags and

documentation string. See the 泛型属性管理 section

below for details.

A disadvantage of this approach is that it doesn't provide a way to restrict the

types of objects that can be assigned to the Python attributes. We expect the

first and last names to be strings, but any Python objects can be assigned.

Further, the attributes can be deleted, setting the C pointers to NULL. Even

though we can make sure the members are initialized to non-NULL values, the

members can be set to NULL if the attributes are deleted.

We define a single method, Custom.name(), that outputs the objects name as the

concatenation of the first and last names.

staticPyObject*

Custom_name(CustomObject*self)

{

if(self->first==NULL){

PyErr_SetString(PyExc_AttributeError,"first");

returnNULL;

}

if(self->last==NULL){

PyErr_SetString(PyExc_AttributeError,"last");

returnNULL;

}

returnPyUnicode_FromFormat("%S %S",self->first,self->last);

}

The method is implemented as a C function that takes a Custom (or

Custom subclass) instance as the first argument. Methods always take an

instance as the first argument. Methods often take positional and keyword

arguments as well, but in this case we don't take any and don't need to accept

a positional argument tuple or keyword argument dictionary. This method is

equivalent to the Python method:

defname(self):

return"%s%s"%(self.first,self.last)

Note that we have to check for the possibility that our first and

last members are NULL. This is because they can be deleted, in which

case they are set to NULL. It would be better to prevent deletion of these

attributes and to restrict the attribute values to be strings. We'll see how to

do that in the next section.

Now that we've defined the method, we need to create an array of method

definitions:

staticPyMethodDefCustom_methods[]={

{"name",(PyCFunction)Custom_name,METH_NOARGS,

"Return the name, combining the first and last name"

},

{NULL}/* Sentinel */

};

(note that we used the METH_NOARGS flag to indicate that the method

is expecting no arguments other than self)

and assign it to the tp_methods slot:

.tp_methods=Custom_methods,

Finally, we'll make our type usable as a base class for subclassing. We've

written our methods carefully so far so that they don't make any assumptions

about the type of the object being created or used, so all we need to do is

to add the Py_TPFLAGS_BASETYPE to our class flag definition:

.tp_flags=Py_TPFLAGS_DEFAULT|Py_TPFLAGS_BASETYPE,

We rename PyInit_custom() to PyInit_custom2(), update the

module name in the PyModuleDef struct, and update the full class

name in the PyTypeObject struct.

Finally, we update our setup.py file to build the new module:

fromdistutils.coreimportsetup,Extension

setup(name="custom",version="1.0",

ext_modules=[

Extension("custom",["custom.c"]),

Extension("custom2",["custom2.c"]),

])

2.3. Providing finer control over data attributes¶

In this section, we'll provide finer control over how the first and

last attributes are set in the Custom example. In the previous

version of our module, the instance variables first and last

could be set to non-string values or even deleted. We want to make sure that

these attributes always contain strings.

#define PY_SSIZE_T_CLEAN

#include<Python.h>

#include"structmember.h"

typedefstruct{

PyObject_HEAD

PyObject*first;/* first name */

PyObject*last;/* last name */

intnumber;

}CustomObject;

staticvoid

Custom_dealloc(CustomObject*self)

{

Py_XDECREF(self->first);

Py_XDECREF(self->last);

Py_TYPE(self)->tp_free((PyObject*)self);

}

staticPyObject*

Custom_new(PyTypeObject*type,PyObject*args,PyObject*kwds)

{

CustomObject*self;

self=(CustomObject*)type->tp_alloc(type,0);

if(self!=NULL){

self->first=PyUnicode_FromString("");

if(self->first==NULL){

Py_DECREF(self);

returnNULL;

}

self->last=PyUnicode_FromString("");

if(self->last==NULL){

Py_DECREF(self);

returnNULL;

}

self->number=0;

}

return(PyObject*)self;

}

staticint

Custom_init(CustomObject*self,PyObject*args,PyObject*kwds)

{

staticchar*kwlist[]={"first","last","number",NULL};

PyObject*first=NULL,*last=NULL,*tmp;

if(!PyArg_ParseTupleAndKeywords(args,kwds,"|UUi",kwlist,

&first,&last,

&self->number))

return-1;

if(first){

tmp=self->first;

Py_INCREF(first);

self->first=first;

Py_DECREF(tmp);

}

if(last){

tmp=self->last;

Py_INCREF(last);

self->last=last;

Py_DECREF(tmp);

}

return0;

}

staticPyMemberDefCustom_members[]={

{"number",T_INT,offsetof(CustomObject,number),0,

"custom number"},

{NULL}/* Sentinel */

};

staticPyObject*

Custom_getfirst(CustomObject*self,void*closure)

{

Py_INCREF(self->first);

returnself->first;

}

staticint

Custom_setfirst(CustomObject*self,PyObject*value,void*closure)

{

PyObject*tmp;

if(value==NULL){

PyErr_SetString(PyExc_TypeError,"Cannot delete the first attribute");

return-1;

}

if(!PyUnicode_Check(value)){

PyErr_SetString(PyExc_TypeError,

"The first attribute value must be a string");

return-1;

}

tmp=self->first;

Py_INCREF(value);

self->first=value;

Py_DECREF(tmp);

return0;

}

staticPyObject*

Custom_getlast(CustomObject*self,void*closure)

{

Py_INCREF(self->last);

returnself->last;

}

staticint

Custom_setlast(CustomObject*self,PyObject*value,void*closure)

{

PyObject*tmp;

if(value==NULL){

PyErr_SetString(PyExc_TypeError,"Cannot delete the last attribute");

return-1;

}

if(!PyUnicode_Check(value)){

PyErr_SetString(PyExc_TypeError,

"The last attribute value must be a string");

return-1;

}

tmp=self->last;

Py_INCREF(value);

self->last=value;

Py_DECREF(tmp);

return0;

}

staticPyGetSetDefCustom_getsetters[]={

{"first",(getter)Custom_getfirst,(setter)Custom_setfirst,

"first name",NULL},

{"last",(getter)Custom_getlast,(setter)Custom_setlast,

"last name",NULL},

{NULL}/* Sentinel */

};

staticPyObject*

Custom_name(CustomObject*self,PyObject*Py_UNUSED(ignored))

{

returnPyUnicode_FromFormat("%S %S",self->first,self->last);

}

staticPyMethodDefCustom_methods[]={

{"name",(PyCFunction)Custom_name,METH_NOARGS,

"Return the name, combining the first and last name"

},

{NULL}/* Sentinel */

};

staticPyTypeObjectCustomType={

PyVarObject_HEAD_INIT(NULL,0)

.tp_name="custom3.Custom",

.tp_doc="Custom objects",

.tp_basicsize=sizeof(CustomObject),

.tp_itemsize=0,

.tp_flags=Py_TPFLAGS_DEFAULT|Py_TPFLAGS_BASETYPE,

.tp_new=Custom_new,

.tp_init=(initproc)Custom_init,

.tp_dealloc=(destructor)Custom_dealloc,

.tp_members=Custom_members,

.tp_methods=Custom_methods,

.tp_getset=Custom_getsetters,

};

staticPyModuleDefcustommodule={

PyModuleDef_HEAD_INIT,

.m_name="custom3",

.m_doc="Example module that creates an extension type.",

.m_size=-1,

};

PyMODINIT_FUNC

PyInit_custom3(void)

{

PyObject*m;

if(PyType_Ready(&CustomType)<0)

returnNULL;

m=PyModule_Create(&custommodule);

if(m==NULL)

returnNULL;

Py_INCREF(&CustomType);

if(PyModule_AddObject(m,"Custom",(PyObject*)&CustomType)<0){

Py_DECREF(&CustomType);

Py_DECREF(m);

returnNULL;

}

returnm;

}

To provide greater control, over the first and last attributes,

we'll use custom getter and setter functions. Here are the functions for

getting and setting the first attribute:

staticPyObject*

Custom_getfirst(CustomObject*self,void*closure)

{

Py_INCREF(self->first);

returnself->first;

}

staticint

Custom_setfirst(CustomObject*self,PyObject*value,void*closure)

{

PyObject*tmp;

if(value==NULL){

PyErr_SetString(PyExc_TypeError,"Cannot delete the first attribute");

return-1;

}

if(!PyUnicode_Check(value)){

PyErr_SetString(PyExc_TypeError,

"The first attribute value must be a string");

return-1;

}

tmp=self->first;

Py_INCREF(value);

self->first=value;

Py_DECREF(tmp);

return0;

}

The getter function is passed a Custom object and a "closure", which is

a void pointer. In this case, the closure is ignored. (The closure supports an

advanced usage in which definition data is passed to the getter and setter. This

could, for example, be used to allow a single set of getter and setter functions

that decide the attribute to get or set based on data in the closure.)

The setter function is passed the Custom object, the new value, and the

closure. The new value may be NULL, in which case the attribute is being

deleted. In our setter, we raise an error if the attribute is deleted or if its

new value is not a string.

We create an array of PyGetSetDef structures:

staticPyGetSetDefCustom_getsetters[]={

{"first",(getter)Custom_getfirst,(setter)Custom_setfirst,

"first name",NULL},

{"last",(getter)Custom_getlast,(setter)Custom_setlast,

"last name",NULL},

{NULL}/* Sentinel */

};

and register it in the tp_getset slot:

.tp_getset=Custom_getsetters,

The last item in a PyGetSetDef structure is the "closure" mentioned

above. In this case, we aren't using a closure, so we just pass NULL.

We also remove the member definitions for these attributes:

staticPyMemberDefCustom_members[]={

{"number",T_INT,offsetof(CustomObject,number),0,

"custom number"},

{NULL}/* Sentinel */

};

We also need to update the tp_init handler to only

allow strings 3 to be passed:

staticint

Custom_init(CustomObject*self,PyObject*args,PyObject*kwds)

{

staticchar*kwlist[]={"first","last","number",NULL};

PyObject*first=NULL,*last=NULL,*tmp;

if(!PyArg_ParseTupleAndKeywords(args,kwds,"|UUi",kwlist,

&first,&last,

&self->number))

return-1;

if(first){

tmp=self->first;

Py_INCREF(first);

self->first=first;

Py_DECREF(tmp);

}

if(last){

tmp=self->last;

Py_INCREF(last);

self->last=last;

Py_DECREF(tmp);

}

return0;

}

With these changes, we can assure that the first and last members are

never NULL so we can remove checks for NULL values in almost all cases.

This means that most of the Py_XDECREF() calls can be converted to

Py_DECREF() calls. The only place we can't change these calls is in

the tp_dealloc implementation, where there is the possibility that the

initialization of these members failed in tp_new.

We also rename the module initialization function and module name in the

initialization function, as we did before, and we add an extra definition to the

setup.py file.

2.4. Supporting cyclic garbage collection¶

Python has a cyclic garbage collector (GC) that

can identify unneeded objects even when their reference counts are not zero.

This can happen when objects are involved in cycles. For example, consider:

>>> l=[]

>>> l.append(l)

>>> dell

In this example, we create a list that contains itself. When we delete it, it

still has a reference from itself. Its reference count doesn't drop to zero.

Fortunately, Python's cyclic garbage collector will eventually figure out that

the list is garbage and free it.

In the second version of the Custom example, we allowed any kind of

object to be stored in the first or last attributes 4.

Besides, in the second and third versions, we allowed subclassing

Custom, and subclasses may add arbitrary attributes. For any of

those two reasons, Custom objects can participate in cycles:

>>> importcustom3

>>> classDerived(custom3.Custom):pass

...

>>> n=Derived()

>>> n.some_attribute=n

To allow a Custom instance participating in a reference cycle to

be properly detected and collected by the cyclic GC, our Custom type

needs to fill two additional slots and to enable a flag that enables these slots:

#define PY_SSIZE_T_CLEAN

#include<Python.h>

#include"structmember.h"

typedefstruct{

PyObject_HEAD

PyObject*first;/* first name */

PyObject*last;/* last name */

intnumber;

}CustomObject;

staticint

Custom_traverse(CustomObject*self,visitprocvisit,void*arg)

{

Py_VISIT(self->first);

Py_VISIT(self->last);

return0;

}

staticint

Custom_clear(CustomObject*self)

{

Py_CLEAR(self->first);

Py_CLEAR(self->last);

return0;

}

staticvoid

Custom_dealloc(CustomObject*self)

{

PyObject_GC_UnTrack(self);

Custom_clear(self);

Py_TYPE(self)->tp_free((PyObject*)self);

}

staticPyObject*

Custom_new(PyTypeObject*type,PyObject*args,PyObject*kwds)

{

CustomObject*self;

self=(CustomObject*)type->tp_alloc(type,0);

if(self!=NULL){

self->first=PyUnicode_FromString("");

if(self->first==NULL){

Py_DECREF(self);

returnNULL;

}

self->last=PyUnicode_FromString("");

if(self->last==NULL){

Py_DECREF(self);

returnNULL;

}

self->number=0;

}

return(PyObject*)self;

}

staticint

Custom_init(CustomObject*self,PyObject*args,PyObject*kwds)

{

staticchar*kwlist[]={"first","last","number",NULL};

PyObject*first=NULL,*last=NULL,*tmp;

if(!PyArg_ParseTupleAndKeywords(args,kwds,"|UUi",kwlist,

&first,&last,

&self->number))

return-1;

if(first){

tmp=self->first;

Py_INCREF(first);

self->first=first;

Py_DECREF(tmp);

}

if(last){

tmp=self->last;

Py_INCREF(last);

self->last=last;

Py_DECREF(tmp);

}

return0;

}

staticPyMemberDefCustom_members[]={

{"number",T_INT,offsetof(CustomObject,number),0,

"custom number"},

{NULL}/* Sentinel */

};

staticPyObject*

Custom_getfirst(CustomObject*self,void*closure)

{

Py_INCREF(self->first);

returnself->first;

}

staticint

Custom_setfirst(CustomObject*self,PyObject*value,void*closure)

{

if(value==NULL){

PyErr_SetString(PyExc_TypeError,"Cannot delete the first attribute");

return-1;

}

if(!PyUnicode_Check(value)){

PyErr_SetString(PyExc_TypeError,

"The first attribute value must be a string");

return-1;

}

Py_INCREF(value);

Py_CLEAR(self->first);

self->first=value;

return0;

}

staticPyObject*

Custom_getlast(CustomObject*self,void*closure)

{

Py_INCREF(self->last);

returnself->last;

}

staticint

Custom_setlast(CustomObject*self,PyObject*value,void*closure)

{

if(value==NULL){

PyErr_SetString(PyExc_TypeError,"Cannot delete the last attribute");

return-1;

}

if(!PyUnicode_Check(value)){

PyErr_SetString(PyExc_TypeError,

"The last attribute value must be a string");

return-1;

}

Py_INCREF(value);

Py_CLEAR(self->last);

self->last=value;

return0;

}

staticPyGetSetDefCustom_getsetters[]={

{"first",(getter)Custom_getfirst,(setter)Custom_setfirst,

"first name",NULL},

{"last",(getter)Custom_getlast,(setter)Custom_setlast,

"last name",NULL},

{NULL}/* Sentinel */

};

staticPyObject*

Custom_name(CustomObject*self,PyObject*Py_UNUSED(ignored))

{

returnPyUnicode_FromFormat("%S %S",self->first,self->last);

}

staticPyMethodDefCustom_methods[]={

{"name",(PyCFunction)Custom_name,METH_NOARGS,

"Return the name, combining the first and last name"

},

{NULL}/* Sentinel */

};

staticPyTypeObjectCustomType={

PyVarObject_HEAD_INIT(NULL,0)

.tp_name="custom4.Custom",

.tp_doc="Custom objects",

.tp_basicsize=sizeof(CustomObject),

.tp_itemsize=0,

.tp_flags=Py_TPFLAGS_DEFAULT|Py_TPFLAGS_BASETYPE|Py_TPFLAGS_HAVE_GC,

.tp_new=Custom_new,

.tp_init=(initproc)Custom_init,

.tp_dealloc=(destructor)Custom_dealloc,

.tp_traverse=(traverseproc)Custom_traverse,

.tp_clear=(inquiry)Custom_clear,

.tp_members=Custom_members,

.tp_methods=Custom_methods,

.tp_getset=Custom_getsetters,

};

staticPyModuleDefcustommodule={

PyModuleDef_HEAD_INIT,

.m_name="custom4",

.m_doc="Example module that creates an extension type.",

.m_size=-1,

};

PyMODINIT_FUNC

PyInit_custom4(void)

{

PyObject*m;

if(PyType_Ready(&CustomType)<0)

returnNULL;

m=PyModule_Create(&custommodule);

if(m==NULL)

returnNULL;

Py_INCREF(&CustomType);

if(PyModule_AddObject(m,"Custom",(PyObject*)&CustomType)<0){

Py_DECREF(&CustomType);

Py_DECREF(m);

returnNULL;

}

returnm;

}

First, the traversal method lets the cyclic GC know about subobjects that could

participate in cycles:

staticint

Custom_traverse(CustomObject*self,visitprocvisit,void*arg)

{

intvret;

if(self->first){

vret=visit(self->first,arg);

if(vret!=0)

returnvret;

}

if(self->last){

vret=visit(self->last,arg);

if(vret!=0)

returnvret;

}

return0;

}

For each subobject that can participate in cycles, we need to call the

visit() function, which is passed to the traversal method. The

visit() function takes as arguments the subobject and the extra argument

arg passed to the traversal method. It returns an integer value that must be

returned if it is non-zero.

Python provides a Py_VISIT() macro that automates calling visit

functions. With Py_VISIT(), we can minimize the amount of boilerplate

in Custom_traverse:

staticint

Custom_traverse(CustomObject*self,visitprocvisit,void*arg)

{

Py_VISIT(self->first);

Py_VISIT(self->last);

return0;

}

注解

The tp_traverse implementation must name its

arguments exactly visit and arg in order to use Py_VISIT().

Second, we need to provide a method for clearing any subobjects that can

participate in cycles:

staticint

Custom_clear(CustomObject*self)

{

Py_CLEAR(self->first);

Py_CLEAR(self->last);

return0;

}

Notice the use of the Py_CLEAR() macro. It is the recommended and safe

way to clear data attributes of arbitrary types while decrementing

their reference counts. If you were to call Py_XDECREF() instead

on the attribute before setting it to NULL, there is a possibility

that the attribute's destructor would call back into code that reads the

attribute again (especially if there is a reference cycle).

注解

You could emulate Py_CLEAR() by writing:

PyObject*tmp;

tmp=self->first;

self->first=NULL;

Py_XDECREF(tmp);

Nevertheless, it is much easier and less error-prone to always

use Py_CLEAR() when deleting an attribute. Don't

try to micro-optimize at the expense of robustness!

The deallocator Custom_dealloc may call arbitrary code when clearing

attributes. It means the circular GC can be triggered inside the function.

Since the GC assumes reference count is not zero, we need to untrack the object

from the GC by calling PyObject_GC_UnTrack() before clearing members.

Here is our reimplemented deallocator using PyObject_GC_UnTrack()

and Custom_clear:

staticvoid

Custom_dealloc(CustomObject*self)

{

PyObject_GC_UnTrack(self);

Custom_clear(self);

Py_TYPE(self)->tp_free((PyObject*)self);

}

Finally, we add the Py_TPFLAGS_HAVE_GC flag to the class flags:

.tp_flags=Py_TPFLAGS_DEFAULT|Py_TPFLAGS_BASETYPE|Py_TPFLAGS_HAVE_GC,

That's pretty much it. If we had written custom tp_alloc or

tp_free handlers, we'd need to modify them for cyclic

garbage collection. Most extensions will use the versions automatically provided.

2.5. Subclassing other types¶

It is possible to create new extension types that are derived from existing

types. It is easiest to inherit from the built in types, since an extension can

easily use the PyTypeObject it needs. It can be difficult to share

these PyTypeObject structures between extension modules.

In this example we will create a SubList type that inherits from the

built-in list type. The new type will be completely compatible with

regular lists, but will have an additional increment() method that

increases an internal counter:

>>> importsublist

>>> s=sublist.SubList(range(3))

>>> s.extend(s)

>>> print(len(s))

6

>>> print(s.increment())

1

>>> print(s.increment())

2

#define PY_SSIZE_T_CLEAN

#include<Python.h>

typedefstruct{

PyListObjectlist;

intstate;

}SubListObject;

staticPyObject*

SubList_increment(SubListObject*self,PyObject*unused)

{

self->state++;

returnPyLong_FromLong(self->state);

}

staticPyMethodDefSubList_methods[]={

{"increment",(PyCFunction)SubList_increment,METH_NOARGS,

PyDoc_STR("increment state counter")},

{NULL},

};

staticint

SubList_init(SubListObject*self,PyObject*args,PyObject*kwds)

{

if(PyList_Type.tp_init((PyObject*)self,args,kwds)<0)

return-1;

self->state=0;

return0;

}

staticPyTypeObjectSubListType={

PyVarObject_HEAD_INIT(NULL,0)

.tp_name="sublist.SubList",

.tp_doc="SubList objects",

.tp_basicsize=sizeof(SubListObject),

.tp_itemsize=0,

.tp_flags=Py_TPFLAGS_DEFAULT|Py_TPFLAGS_BASETYPE,

.tp_init=(initproc)SubList_init,

.tp_methods=SubList_methods,

};

staticPyModuleDefsublistmodule={

PyModuleDef_HEAD_INIT,

.m_name="sublist",

.m_doc="Example module that creates an extension type.",

.m_size=-1,

};

PyMODINIT_FUNC

PyInit_sublist(void)

{

PyObject*m;

SubListType.tp_base=&PyList_Type;

if(PyType_Ready(&SubListType)<0)

returnNULL;

m=PyModule_Create(&sublistmodule);

if(m==NULL)

returnNULL;

Py_INCREF(&SubListType);

if(PyModule_AddObject(m,"SubList",(PyObject*)&SubListType)<0){

Py_DECREF(&SubListType);

Py_DECREF(m);

returnNULL;

}

returnm;

}

As you can see, the source code closely resembles the Custom examples in

previous sections. We will break down the main differences between them.

typedefstruct{

PyListObjectlist;

intstate;

}SubListObject;

The primary difference for derived type objects is that the base type's

object structure must be the first value. The base type will already include

the PyObject_HEAD() at the beginning of its structure.

When a Python object is a SubList instance, its PyObject* pointer

can be safely cast to both PyListObject* and SubListObject*:

staticint

SubList_init(SubListObject*self,PyObject*args,PyObject*kwds)

{

if(PyList_Type.tp_init((PyObject*)self,args,kwds)<0)

return-1;

self->state=0;

return0;

}

We see above how to call through to the __init__ method of the base

type.

This pattern is important when writing a type with custom

tp_new and tp_dealloc

members. The tp_new handler should not actually

create the memory for the object with its tp_alloc,

but let the base class handle it by calling its own tp_new.

The PyTypeObject struct supports a tp_base

specifying the type's concrete base class. Due to cross-platform compiler

issues, you can't fill that field directly with a reference to

PyList_Type; it should be done later in the module initialization

function:

PyMODINIT_FUNC

PyInit_sublist(void)

{

PyObject*m;

SubListType.tp_base=&PyList_Type;

if(PyType_Ready(&SubListType)<0)

returnNULL;

m=PyModule_Create(&sublistmodule);

if(m==NULL)

returnNULL;

Py_INCREF(&SubListType);

if(PyModule_AddObject(m,"SubList",(PyObject*)&SubListType)<0){

Py_DECREF(&SubListType);

Py_DECREF(m);

returnNULL;

}

returnm;

}

Before calling PyType_Ready(), the type structure must have the

tp_base slot filled in. When we are deriving an

existing type, it is not necessary to fill out the tp_alloc

slot with PyType_GenericNew() -- the allocation function from the base

type will be inherited.

After that, calling PyType_Ready() and adding the type object to the

module is the same as with the basic Custom examples.

备注

1

This is true when we know that the object is a basic type, like a string or a

float.

2

We relied on this in the tp_dealloc handler

in this example, because our type doesn't support garbage collection.

3

We now know that the first and last members are strings, so perhaps we

could be less careful about decrementing their reference counts, however,

we accept instances of string subclasses. Even though deallocating normal

strings won't call back into our objects, we can't guarantee that deallocating

an instance of a string subclass won't call back into our objects.

4

Also, even with our attributes restricted to strings instances, the user

could pass arbitrary str subclasses and therefore still create

reference cycles.

以上是 Python2.自定义扩展类型:教程 的全部内容, 来源链接: utcz.com/z/508146.html

回到顶部