Here is a patch that adds checking for NULL-termination of keyword arguments to PyArg_ParseTupleAndKeywords. This problem has bit us in gdb a couple of times.
I've included a doc update and a test case.
Let me know what you think.
I needed to add an 'operand' field to Unary. After looking at it a little, I think this code does not model GCC precisely enough.
E.g., 'location' is added just for 'Expression':
if localname == 'Expression': add_simple_getter('location', 'gcc_python_make_wrapper_location(EXPR_LOCATION(self->t))', "The source location of this expression")
But, I think this should actually be added for any class that uses tree_exp as its representation: tcc_expression, tcc_comparison, tcc_unary, tcc_binary, and tcc_statement.
Tom
diff --git a/docs/tree.rst b/docs/tree.rst index 8a0a89f..4a8a0fd 100644 --- a/docs/tree.rst +++ b/docs/tree.rst @@ -503,6 +503,10 @@ Unary Expressions Corresponds to the `tcc_unary` value of `enum tree_code_class` within GCC's own C sources.
+ .. py:attribute:: operand + + The operand of this operator, as a `gcc.Tree`. + ====================================== ================================================== Subclass Meaning; C/C++ operators ====================================== ================================================== diff --git a/generate-tree-c.py b/generate-tree-c.py index 419aaaa..4c52c63 100644 --- a/generate-tree-c.py +++ b/generate-tree-c.py @@ -253,6 +253,11 @@ PyObject* pytype.tp_methods = methods.identifier cu.add_defn(methods.c_defn())
+ if localname == 'Unary': + add_simple_getter('operand', + 'gcc_python_make_wrapper_tree(TREE_OPERAND (self->t, 0))', + 'The operand of this expression, as a gcc.Tree') + if localname == 'Expression': add_simple_getter('location', 'gcc_python_make_wrapper_location(EXPR_LOCATION(self->t))', diff --git a/libcpychecker/PyArg_ParseTuple.py b/libcpychecker/PyArg_ParseTuple.py index bf99e35..8e0b977 100644 --- a/libcpychecker/PyArg_ParseTuple.py +++ b/libcpychecker/PyArg_ParseTuple.py @@ -542,6 +542,33 @@ def check_pyargs(fun): if isinstance(operand, gcc.StringCst): return operand.constant
+ def check_keyword_array(stmt, idx): + keywords = stmt.args[idx] + if isinstance(keywords, gcc.AddrExpr): + operand = keywords.operand + if isinstance(operand, gcc.VarDecl): + initial = operand.initial + if isinstance(initial, gcc.Constructor): + elements = [None] * len(initial.elements) + for elt in initial.elements: + (num, contents) = elt + elt_idx = num.constant + if isinstance(contents, gcc.NopExpr): + contents = contents.operand + if isinstance(contents, gcc.AddrExpr): + contents = contents.operand + if isinstance(contents, gcc.StringCst): + elements[elt_idx] = contents.constant + elif isinstance(contents, gcc.IntegerCst): + elements[elt_idx] = contents.constant + if elements[-1] != 0: + gcc.permerror(stmt.loc, 'keywords to PyArg_ParseTupleAndKeywords are not NULL-terminated') + i = 0 + for elt in elements[0:-1]: + if not elt: + gcc.permerror(stmt.loc, 'keyword argument %d missing in PyArg_ParseTupleAndKeywords call' % i) + i = i + 1 + def check_callsite(stmt, funcname, format_idx, varargs_idx): log('got call at %s' % stmt.loc) log(get_src_for_loc(stmt.loc)) @@ -607,5 +634,6 @@ def check_pyargs(fun): if stmt.fndecl.name == 'PyArg_ParseTuple': check_callsite(stmt, 'PyArg_ParseTuple', 1, 2) elif stmt.fndecl.name == 'PyArg_ParseTupleAndKeywords': + check_keyword_array(stmt, 3) check_callsite(stmt, 'PyArg_ParseTupleAndKeywords', 2, 4)
diff --git a/tests/cpychecker/PyArg_ParseTuple/keywords/input.c b/tests/cpychecker/PyArg_ParseTuple/keywords/input.c new file mode 100644 index 0000000..d9d567e --- /dev/null +++ b/tests/cpychecker/PyArg_ParseTuple/keywords/input.c @@ -0,0 +1,45 @@ +/* + Copyright 2011 David Malcolm dmalcolm@redhat.com + Copyright 2011 Red Hat, Inc. + + This is free software: you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see + http://www.gnu.org/licenses/. +*/ + +/* + A simple check of PyArg_ParseTupleAndKeywords +*/ + +#include <Python.h> + +static PyObject * +parse_to_a_typedef(PyObject *self, PyObject *args, PyObject *kwargs) +{ + int val; + static char *keywords[] = { "hi" }; /* Note no NULL terminator. */ + + if (!PyArg_ParseTupleAndKeywords(args, kwargs, "i", keywords, &val)) { + return NULL; + } + + Py_RETURN_NONE; +} + +/* + PEP-7 +Local variables: +c-basic-offset: 4 +indent-tabs-mode: nil +End: +*/ diff --git a/tests/cpychecker/PyArg_ParseTuple/keywords/script.py b/tests/cpychecker/PyArg_ParseTuple/keywords/script.py new file mode 100644 index 0000000..e120c53 --- /dev/null +++ b/tests/cpychecker/PyArg_ParseTuple/keywords/script.py @@ -0,0 +1,19 @@ +# Copyright 2011 David Malcolm dmalcolm@redhat.com +# Copyright 2011 Red Hat, Inc. +# +# This is free software: you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see +# http://www.gnu.org/licenses/. + +from libcpychecker import main +main() diff --git a/tests/cpychecker/PyArg_ParseTuple/keywords/stderr.txt b/tests/cpychecker/PyArg_ParseTuple/keywords/stderr.txt new file mode 100644 index 0000000..3168ae9 --- /dev/null +++ b/tests/cpychecker/PyArg_ParseTuple/keywords/stderr.txt @@ -0,0 +1,2 @@ +tests/cpychecker/PyArg_ParseTuple/keywords/input.c: In function ‘parse_to_a_typedef’: +tests/cpychecker/PyArg_ParseTuple/keywords/input.c:32:37: error: keywords to PyArg_ParseTupleAndKeywords are not NULL-terminated [-fpermissive]
On Mon, 2011-06-27 at 08:29 -0600, Tom Tromey wrote:
Here is a patch that adds checking for NULL-termination of keyword arguments to PyArg_ParseTupleAndKeywords. This problem has bit us in gdb a couple of times.
I've included a doc update and a test case.
Let me know what you think.
Good idea - thanks for looking into this!
I spent some time this weekend looking into local variables and their initializers, and unfortunately there's a problem with the approach the patch uses: the "initial" field of the VarDecl is only set up for variables in global scope, and for those local variables marked with "static". For local variables that aren't "static", the "initial" of the VarDecl is NULL/None and instead, you have to look at the gimple statements (it might be that I'm missing something here, in the early passes). [Somewhat bizarrely, this seems to be the case even if the variables are marked as "const": they still seem to receive assignments]
I wrote up some notes for the docs on this, and a test as: http://git.fedorahosted.org/git/?p=gcc-python-plugin.git;a=commitdiff;h=644e... where you can see the kind of GimpleAssign statements that we have to deal with.
So your code seems to work as is, but it won't detect cases without the "static" on the "char *[]".
I think the robust fix for this will be to integrate this test into the reference-count checker. The latter works by tracking the values of variables, and so when it reaches a call to PyArg etc it can look up what the value of the param that's passed in is. That's my hope, anyway - that code is much less mature than the rest of the project.
Having said that, overall, the patch is an improvement, so I've pushed it patch (with the caveat in a comment) as: http://git.fedorahosted.org/git/?p=gcc-python-plugin.git;a=commit;h=c806965f...
I've started a "contributors.rst" file in that patch also.
I needed to add an 'operand' field to Unary. After looking at it a little, I think this code does not model GCC precisely enough.
E.g., 'location' is added just for 'Expression':
if localname == 'Expression': add_simple_getter('location', 'gcc_python_make_wrapper_location(EXPR_LOCATION(self->t))', "The source location of this expression")
But, I think this should actually be added for any class that uses tree_exp as its representation: tcc_expression, tcc_comparison, tcc_unary, tcc_binary, and tcc_statement.
Agreed.
I tend to look in gcc/print-tree.c for this kind of thing, and that code has this in print_node(): if (EXPR_HAS_LOCATION (node)) { expanded_location xloc = expand_location (EXPR_LOCATION (node)); indent_to (file, indent+4); fprintf (file, "%s:%d:%d", xloc.file, xloc.line, xloc.column); }
and EXPR_HAS_LOCATION ultimately boils down to this being not UNKNOWN_LOCATION, from gcc/tree.h:
#define EXPR_LOCATION(NODE) \ (EXPR_P ((NODE)) ? (NODE)->exp.locus : UNKNOWN_LOCATION)
#define EXPR_P(NODE) IS_EXPR_CODE_CLASS (TREE_CODE_CLASS (TREE_CODE (NODE)))
which applies for these tcc values:
#define IS_EXPR_CODE_CLASS(CLASS)\ ((CLASS) >= tcc_reference && (CLASS) <= tcc_expression)
So, yes: it applies to this range of tcc values: tcc_reference, /* A reference to storage. */ tcc_comparison, /* A comparison expression. */ tcc_unary, /* A unary arithmetic expression. */ tcc_binary, /* A binary arithmetic expression. */ tcc_statement, /* A statement expression, which have side effects but usually no interesting value. */ tcc_vl_exp, /* A function call or other expression with a variable-length operand vector. */ tcc_expression /* Any other expression. */
I've extended the "location" property to be added to all of these tcc classes: http://git.fedorahosted.org/git/?p=gcc-python-plugin.git;a=commit;h=fc1946c2...
I wonder if it makes sense to make all of these be part of a new "Expression" subclass of gcc.Tree? We could eliminate the class for tcc_expression, moving all of its subclasses one level up. That would mean not following gcc so closely, but it would also stop us having 7 base classes that mean almost the same thing. Not sure.
Hope this all makes sense; thanks again for the patch. Dave
gcc-python-plugin@lists.stg.fedorahosted.org