Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MNT: Add linter for thread-unsafe C API uses #26159

Open
7 of 8 tasks
ngoldbaum opened this issue Mar 28, 2024 · 3 comments
Open
7 of 8 tasks

MNT: Add linter for thread-unsafe C API uses #26159

ngoldbaum opened this issue Mar 28, 2024 · 3 comments
Labels
03 - Maintenance 39 - free-threading PRs and issues related to support for free-threading CPython (a.k.a. no-GIL, PEP 703) sprintable Issue fits the time-frame and setting of a sprint

Comments

@ngoldbaum
Copy link
Member

ngoldbaum commented Mar 28, 2024

As noted in PEP 703 C API functions that return borrowed references are problematic in the nogil build.

We need to audit our usages and replace any that upgrade a borrowed reference to an owned reference with alternate C API functions that return new references.

Problematic Functions:

@rgommers rgommers added the 39 - free-threading PRs and issues related to support for free-threading CPython (a.k.a. no-GIL, PEP 703) label Apr 10, 2024
seberg pushed a commit that referenced this issue Apr 17, 2024
…Ref (#26282)

Replaces all usages of the private _PyDict_GetItemStringWithError Python C API function with PyDict_GetItemStringRef. The latter is available in the python 3.13 C API and via python-capicompat in older python versions.

The new function has the same semantics as PyDict_GetItemRef but takes a UTF-8 C string instead of a python object. Like _PyDict_GetItemStringWithError, it does not suppress errors. It also has the nice improvement that the return type is now int and it signals success or failure. This lets me re-structure control flow a bit and avoid PyErr_Occurred() calls.

I used the new PyDict_ContainsString function if all we care about is whether a key is in the dict. I also used Py_SETREF in a few places that were using the unsafe decref then set pattern.

See #26159 for the relevant tracking issue.

---

* MNT: replace _PyDict_GetItemStringWithError with PyDict_GetItemStringRef

* MNT: fix issue spotted in code review

* MNT: fix two more issues spotted in code review

* MNT: more missing decrefs spotted in code review

* MNT: delete unnecessary decrefs (covered by goto fail logic)
@arunppsg
Copy link
Contributor

@ngoldbaum Can I work on replacing usage of PyList_GET_ITEM? I intend to replace it with PyList_GetItemRef as it returns a strong reference. Is there anything else I should look for?

@ngoldbaum
Copy link
Member Author

So, careful, we don’t want to wholesale replace API usages. Sometimes we intentionally create borrowed references.

Any usages where in the same region of code we first create a borrowed reference and then soon after upgrade it to an owned reference by calling Py_INCREF can be replaced by GetItemRef. Replacing those usages (and deleting the subsequent INCREF) shouldn’t introduce new reference cou ting bugs unless we’re only doing an INCREF conditionally.

Basically, be careful, because reference counting bugs can be hard to track down and spot, and we don’t want to introduce any new ones in a refactor.

seberg pushed a commit that referenced this issue Aug 9, 2024
Ref #26159

See also the CPython HOWTO on this topic: https://docs.python.org/3.13/howto/free-threading-extensions.html#freethreading-extensions-howto.

The remaining usages of PyDict_GetItem and PyDict_Next are all around the fields attribute of structured dtypes. I'm pretty sure that dictionary is effectively frozen after the DType is constructed, so I don't worry about those uses.

It's not straightforward to write tests for this, I'm just applying static refactorings in places where the refactoring shouldn't introduce new reference counting bugs.

* ENH: fix thread-unsafe C API usages

* ENH: use critical sections in einsum

* BUG: fix error handling in loadtxt C code

* revert einsum changes
charris pushed a commit to charris/numpy that referenced this issue Aug 9, 2024
Ref numpy#26159

See also the CPython HOWTO on this topic: https://docs.python.org/3.13/howto/free-threading-extensions.html#freethreading-extensions-howto.

The remaining usages of PyDict_GetItem and PyDict_Next are all around the fields attribute of structured dtypes. I'm pretty sure that dictionary is effectively frozen after the DType is constructed, so I don't worry about those uses.

It's not straightforward to write tests for this, I'm just applying static refactorings in places where the refactoring shouldn't introduce new reference counting bugs.

* ENH: fix thread-unsafe C API usages

* ENH: use critical sections in einsum

* BUG: fix error handling in loadtxt C code

* revert einsum changes
@ngoldbaum ngoldbaum added the sprintable Issue fits the time-frame and setting of a sprint label Oct 24, 2024
@ngoldbaum ngoldbaum changed the title MNT: Replace C API usages that return borrowed references MNT: Add linter for thread-unsafe C API uses Oct 24, 2024
@ngoldbaum
Copy link
Member Author

I think the remaining issue on this task is to add a linter to NumPy's CI to check PRs for new uses of possibly problematic C API items. We should double-check and explicitly whitelist all the remaining ones, then make any new uses get whitelisted as they come in.

I added the sprintable label to this since this should be doable for anyone who understands how to work with github actions and parse git diffs.

@ngoldbaum ngoldbaum removed their assignment Oct 24, 2024
ArvidJB pushed a commit to ArvidJB/numpy that referenced this issue Nov 1, 2024
Ref numpy#26159

See also the CPython HOWTO on this topic: https://docs.python.org/3.13/howto/free-threading-extensions.html#freethreading-extensions-howto.

The remaining usages of PyDict_GetItem and PyDict_Next are all around the fields attribute of structured dtypes. I'm pretty sure that dictionary is effectively frozen after the DType is constructed, so I don't worry about those uses.

It's not straightforward to write tests for this, I'm just applying static refactorings in places where the refactoring shouldn't introduce new reference counting bugs.

* ENH: fix thread-unsafe C API usages

* ENH: use critical sections in einsum

* BUG: fix error handling in loadtxt C code

* revert einsum changes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
03 - Maintenance 39 - free-threading PRs and issues related to support for free-threading CPython (a.k.a. no-GIL, PEP 703) sprintable Issue fits the time-frame and setting of a sprint
Projects
None yet
Development

No branches or pull requests

3 participants
  NODES
Bugs 5
COMMUNITY 2
Note 1
Project 5
USERS 1