-
-
Notifications
You must be signed in to change notification settings - Fork 10.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MNT: Add linter for thread-unsafe C API uses #26159
Comments
…Ref (#26282) Replaces all usages of the private _PyDict_GetItemStringWithError Python C API function with PyDict_GetItemStringRef. The latter is available in the python 3.13 C API and via python-capicompat in older python versions. The new function has the same semantics as PyDict_GetItemRef but takes a UTF-8 C string instead of a python object. Like _PyDict_GetItemStringWithError, it does not suppress errors. It also has the nice improvement that the return type is now int and it signals success or failure. This lets me re-structure control flow a bit and avoid PyErr_Occurred() calls. I used the new PyDict_ContainsString function if all we care about is whether a key is in the dict. I also used Py_SETREF in a few places that were using the unsafe decref then set pattern. See #26159 for the relevant tracking issue. --- * MNT: replace _PyDict_GetItemStringWithError with PyDict_GetItemStringRef * MNT: fix issue spotted in code review * MNT: fix two more issues spotted in code review * MNT: more missing decrefs spotted in code review * MNT: delete unnecessary decrefs (covered by goto fail logic)
@ngoldbaum Can I work on replacing usage of |
So, careful, we don’t want to wholesale replace API usages. Sometimes we intentionally create borrowed references. Any usages where in the same region of code we first create a borrowed reference and then soon after upgrade it to an owned reference by calling Basically, be careful, because reference counting bugs can be hard to track down and spot, and we don’t want to introduce any new ones in a refactor. |
Ref #26159 See also the CPython HOWTO on this topic: https://docs.python.org/3.13/howto/free-threading-extensions.html#freethreading-extensions-howto. The remaining usages of PyDict_GetItem and PyDict_Next are all around the fields attribute of structured dtypes. I'm pretty sure that dictionary is effectively frozen after the DType is constructed, so I don't worry about those uses. It's not straightforward to write tests for this, I'm just applying static refactorings in places where the refactoring shouldn't introduce new reference counting bugs. * ENH: fix thread-unsafe C API usages * ENH: use critical sections in einsum * BUG: fix error handling in loadtxt C code * revert einsum changes
Ref numpy#26159 See also the CPython HOWTO on this topic: https://docs.python.org/3.13/howto/free-threading-extensions.html#freethreading-extensions-howto. The remaining usages of PyDict_GetItem and PyDict_Next are all around the fields attribute of structured dtypes. I'm pretty sure that dictionary is effectively frozen after the DType is constructed, so I don't worry about those uses. It's not straightforward to write tests for this, I'm just applying static refactorings in places where the refactoring shouldn't introduce new reference counting bugs. * ENH: fix thread-unsafe C API usages * ENH: use critical sections in einsum * BUG: fix error handling in loadtxt C code * revert einsum changes
I think the remaining issue on this task is to add a linter to NumPy's CI to check PRs for new uses of possibly problematic C API items. We should double-check and explicitly whitelist all the remaining ones, then make any new uses get whitelisted as they come in. I added the sprintable label to this since this should be doable for anyone who understands how to work with github actions and parse git diffs. |
Ref numpy#26159 See also the CPython HOWTO on this topic: https://docs.python.org/3.13/howto/free-threading-extensions.html#freethreading-extensions-howto. The remaining usages of PyDict_GetItem and PyDict_Next are all around the fields attribute of structured dtypes. I'm pretty sure that dictionary is effectively frozen after the DType is constructed, so I don't worry about those uses. It's not straightforward to write tests for this, I'm just applying static refactorings in places where the refactoring shouldn't introduce new reference counting bugs. * ENH: fix thread-unsafe C API usages * ENH: use critical sections in einsum * BUG: fix error handling in loadtxt C code * revert einsum changes
As noted in PEP 703 C API functions that return borrowed references are problematic in the nogil build.
We need to audit our usages and replace any that upgrade a borrowed reference to an owned reference with alternate C API functions that return new references.
Problematic Functions:
PyList_GetItem
(MNT: migrate PyList_GetItem usages to PyList_GetItemRef #26246)PyList_GET_ITEM
(all existing usages are in argument parsing)PyDict_GetItem
PyDict_GetItemWithError
PyDict_Next
PyDict_GetItemString
(ENH: fix thread-unsafe C API usages #27145 for the PyDict items)._PyDict_GetItemStringWithError
(MNT: replace _PyDict_GetItemStringWithError with PyDict_GetItemStringRef #26282)The text was updated successfully, but these errors were encountered: