Tuesday, August 8, 2023

Swap_8_and_9: A simple import can modify the Python interpreter

Back to the code for our module: we want to modify two elements of small_ints[], but we do not have direct access to the array. Instead, we can use the API declared in longobject.h, namely the function that returns the PyObject for a given (plain C) long value: PyAPI_FUNC(PyObject *) PyLong_FromLong(long).

Its definition is in longobject.c. We can see that it checks for small integers (IS_SMALL_INT(ival)) and then calls get_small_int((sdigit)ival), which returns _PyLong_SMALL_INTS[_PY_NSMALLNEGINTS + ival].

Finally, in pycore_long.h, you can see that this refers to the small_ints[] array: #define _PyLong_SMALL_INTS _Py_SINGLETON(small_ints).
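The lookup chain above can be sketched as a small standalone model. This is not CPython's code: model_int, small_ints_model, and model_from_long are hypothetical stand-ins, and the cache bounds (5 negative values, 257 non-negative values, so -5 through 256) are the values of _PY_NSMALLNEGINTS and _PY_NSMALLPOSINTS as I understand them for CPython 3.11. The model returns NULL outside the cached range, where the real function would allocate a new object.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache bounds (CPython 3.11): integers in [-5, 256] are preallocated. */
#define NSMALLNEGINTS 5
#define NSMALLPOSINTS 257
#define CACHE_LEN (NSMALLNEGINTS + NSMALLPOSINTS)

/* Hypothetical stand-in for PyLongObject: one digit holding the magnitude. */
typedef struct {
    unsigned int digit;
} model_int;

/* Hypothetical stand-in for the interpreter's small_ints[] array. */
static model_int small_ints_model[CACHE_LEN];
static int cache_ready = 0;

/* Fill slot i with the magnitude of the integer (i - NSMALLNEGINTS). */
static void init_cache(void) {
    for (int i = 0; i < CACHE_LEN; i++) {
        int v = i - NSMALLNEGINTS;
        small_ints_model[i].digit = (unsigned int)(v < 0 ? -v : v);
    }
    cache_ready = 1;
}

/* Same range test as the IS_SMALL_INT check described above. */
static int is_small_int(long ival) {
    return -NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS;
}

/* Model of PyLong_FromLong: small values come straight out of the cache,
   so every caller asking for 8 receives the same object. */
static model_int *model_from_long(long ival) {
    if (!cache_ready) {
        init_cache();
    }
    if (!is_small_int(ival)) {
        return NULL;  /* the real function would allocate a fresh object here */
    }
    size_t idx = (size_t)(NSMALLNEGINTS + ival);
    assert(idx < CACHE_LEN);
    return &small_ints_model[idx];
}
```

The point of the model is pointer identity: two calls with the same small value return the same address, which is why a reference obtained this way gives write access to the shared cached object.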

Now, we have the first part of our code. It will get the references to the two integers we care about:

PyLongObject* obj8 = (PyLongObject*)PyLong_FromLong(8);
PyLongObject* obj9 = (PyLongObject*)PyLong_FromLong(9);

How do we change the value?

Python's int (CPython's PyLongObject) represents arbitrarily large integers, positive or negative. longintrepr.h defines how these integers are stored. I’ll include the whole comment below because it is so informative, followed by the struct definition:

/* Long integer representation.
   The absolute value of a number is equal to
        SUM(for i=0 through abs(ob_size)-1) ob_digit[i] * 2**(SHIFT*i)
   Negative numbers are represented with ob_size < 0;
   zero is represented by ob_size == 0.
   In a normalized number, ob_digit[abs(ob_size)-1] (the most significant
   digit) is never zero.  Also, in all cases, for all valid i,
        0 <= ob_digit[i] <= MASK.
   The allocation function takes care of allocating extra memory
   so that ob_digit[0] ... ob_digit[abs(ob_size)-1] are actually available.
   We always allocate memory for at least one digit, so accessing ob_digit[0]
   is always safe. However, in the case ob_size == 0, the contents of
   ob_digit[0] may be undefined.

   CAUTION:  Generic code manipulating subtypes of PyVarObject has to
   aware that ints abuse  ob_size's sign bit.
*/

struct _longobject {
    PyObject_VAR_HEAD
    digit ob_digit[1];
};

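A couple of other typedefs relate the types shown above: PyLongObject is a typedef for struct _longobject, and digit is an unsigned integer type (uint32_t in builds that use 30-bit digits).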

So, for integers small enough to fit in a single digit, which includes 8 and 9, the "raw value" is simply ob_digit[0].
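To make the representation concrete, here is a minimal sketch of the SUM formula from the comment, assuming 30-bit digits (SHIFT of 30). The helpers to_digits and from_digits are hypothetical names for illustration, not CPython functions, and they work on a plain array rather than a real PyLongObject.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SHIFT 30                                   /* assumed: 30-bit digits */
#define MASK  ((uint32_t)((1UL << SHIFT) - 1))

/* Split |value| into base-2**SHIFT digits, least significant first.
   Returns the signed digit count, mirroring ob_size: negative for
   negative numbers, zero for zero. Three digits cover any int64_t. */
static int to_digits(int64_t value, uint32_t out[3]) {
    assert(value != INT64_MIN);  /* keeps negation well defined in this sketch */
    uint64_t mag = (uint64_t)(value < 0 ? -value : value);
    int n = 0;
    while (mag != 0) {
        out[n++] = (uint32_t)(mag & MASK);
        mag >>= SHIFT;
    }
    return value < 0 ? -n : n;
}

/* Rebuild the value: SUM over i of digits[i] * 2**(SHIFT*i), signed by size. */
static int64_t from_digits(const uint32_t *digits, int size) {
    int n = abs(size);
    uint64_t mag = 0;
    for (int i = n - 1; i >= 0; i--) {
        mag = (mag << SHIFT) | digits[i];
    }
    return size < 0 ? -(int64_t)mag : (int64_t)mag;
}
```

With these helpers, 8 becomes a single digit with value 8 and a size of 1, while 2**30 + 5 needs two digits (5, then 1). Any value below 2**30 in magnitude lives entirely in the first digit.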

Now we have all the code we need: get a reference to each cached integer object and change its stored value. These four lines can go in the //CODE HERE part above:

PyLongObject* obj8 = (PyLongObject*)PyLong_FromLong(8);
PyLongObject* obj9 = (PyLongObject*)PyLong_FromLong(9);

obj8->ob_digit[0] = 9;
obj9->ob_digit[0] = 8;
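Why two assignments affect the whole interpreter can be illustrated with a tiny standalone model. It is only a sketch under stated assumptions: model_int, cache, cached, and swap_8_and_9 are hypothetical names, and the ten-element array stands in for the much larger small_ints[] cache.

```c
#include <assert.h>

/* Hypothetical stand-in for PyLongObject: one digit of payload. */
typedef struct {
    unsigned int digit;
} model_int;

/* Hypothetical shared cache for the values 0 through 9. */
static model_int cache[10] = {{0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}};

/* Every request for a cached value returns the same shared object. */
static model_int *cached(int v) {
    assert(0 <= v && v < 10);
    return &cache[v];
}

/* The same four steps as the module code, applied to the model. */
static void swap_8_and_9(void) {
    model_int *obj8 = cached(8);
    model_int *obj9 = cached(9);

    obj8->digit = 9;
    obj9->digit = 8;
}
```

In this model, a pointer obtained from cached(8) before the swap reads 9 afterwards, and so does every later lookup. Nothing is copied, so the change is visible to every holder of the object.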


from Hacker News https://ift.tt/n9ouRqP
