Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added exception translator specific mutex used with try_translate_exceptions #5362

Merged
merged 5 commits into from
Sep 17, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 11 additions & 11 deletions include/pybind11/detail/exception_translation.h
Original file line number Diff line number Diff line change
Expand Up @@ -50,17 +50,17 @@ inline void try_translate_exceptions() {
- delegate translation to the next translator by throwing a new type of exception.
*/

bool handled = with_internals([&](internals &internals) {
auto &local_exception_translators = get_local_internals().registered_exception_translators;
if (detail::apply_exception_translators(local_exception_translators)) {
return true;
}
auto &exception_translators = internals.registered_exception_translators;
if (detail::apply_exception_translators(exception_translators)) {
return true;
}
return false;
});
bool handled = with_exception_translators(
[&](std::forward_list<ExceptionTranslator> &exception_translators,
std::forward_list<ExceptionTranslator> &local_exception_translators) {
if (detail::apply_exception_translators(local_exception_translators)) {
return true;
}
if (detail::apply_exception_translators(exception_translators)) {
return true;
}
return false;
});

if (!handled) {
set_error(PyExc_SystemError, "Exception escaped from default exception translator!");
Expand Down
20 changes: 19 additions & 1 deletion include/pybind11/detail/internals.h
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,11 @@
# if PY_VERSION_HEX >= 0x030C0000 || defined(_MSC_VER)
// Version bump for Python 3.12+, before first 3.12 beta release.
// Version bump for MSVC piggy-backed on PR #4779. See comments there.
# define PYBIND11_INTERNALS_VERSION 5
# ifdef Py_GIL_DISABLED
# define PYBIND11_INTERNALS_VERSION 6
Copy link
Collaborator

@rwgk rwgk Sep 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless #5296 is merged first, this will lead to incompatibility between extension modules compiled with or without Py_GIL_DISABLED defined. — EDIT next day: THIS IS NOT AN ISSUE. See comments posted after this one.

I just only now realize ... Huston we have a problem?

We already added pymutex mutex to struct internals but didn't change PYBIND11_INTERNALS_VERSION. Ouch? Sorry I missed that before.

What do we want here? I'm very uncertain, @colesbury could you please advise?

struct internals {
#if PY_VERSION_HEX >= 0x030D0000
    pymutex mutex;
    pymutex exception_translator_mutex;
#endif
...

Could that work? The idea here is that struct internals is the same with and without Py_GIL_DISABLED defined.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Free-threading is experimental so changing the ABI before was fine. Now I think packages might be beginning to ship, so probably not such a great idea to change without a bump. But not fond of these confusing bumps. The ABI is currently a mess, by the way - the Windows only bump broke conda-forge and the only solution was to pin <2.12, which now is breaking packages (see the feedstock for more info).

Will the internals version matter after #5296? Could we just basically do away with the concept, and only do compiler/stdlib, etc?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we just basically do away with the concept

Not entirely, I think. The internals versions still need to match for the inheritance functionality to work correctly. (I'd probably need a day or two to figure out conclusively if that's fixable or just fundamentally impossible.)

But the more I think about it, the more I'm getting worried about having a different struct internals with and without Py_GIL_DISABLED defined:

What happens if there is one extension built with the GIL enabled, another built with free-threading, and both are imported into the same process?

Currently we only have this:

  • #define PYBIND11_CHECK_PYTHON_VERSION \
    { \
    const char *compiled_ver \
    = PYBIND11_TOSTRING(PY_MAJOR_VERSION) "." PYBIND11_TOSTRING(PY_MINOR_VERSION); \
    const char *runtime_ver = Py_GetVersion(); \
    size_t len = std::strlen(compiled_ver); \
    if (std::strncmp(runtime_ver, compiled_ver, len) != 0 \
    || (runtime_ver[len] >= '0' && runtime_ver[len] <= '9')) { \
    PyErr_Format(PyExc_ImportError, \
    "Python version mismatch: module was compiled for Python %s, " \
    "but the interpreter version is incompatible: %s.", \
    compiled_ver, \
    runtime_ver); \
    return nullptr; \
    } \
    }

Do we need to also check for Py_GIL_DISABLED compatibility there?

Or do we have to do something here?

  • #define PYBIND11_INTERNALS_ID \
    "__pybind11_internals_v" PYBIND11_TOSTRING(PYBIND11_INTERNALS_VERSION) \
    PYBIND11_INTERNALS_KIND PYBIND11_COMPILER_TYPE PYBIND11_STDLIB \
    PYBIND11_BUILD_ABI PYBIND11_BUILD_TYPE "__"
    #define PYBIND11_MODULE_LOCAL_ID \
    "__pybind11_module_local_v" PYBIND11_TOSTRING(PYBIND11_INTERNALS_VERSION) \
    PYBIND11_INTERNALS_KIND PYBIND11_COMPILER_TYPE PYBIND11_STDLIB \
    PYBIND11_BUILD_ABI PYBIND11_BUILD_TYPE "__"

@colesbury

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if there is one extension built with the GIL enabled, another built with free-threading, and both are imported into the same process?

You can't have extensions compiled with and without Py_GIL_DISABLED in the same process. An extension might require the GIL to be dynamically enabled, but that's different: that's a matter of calling (or not calling) PyUnstable_Module_SetGIL(). The extension will still need to be compiled with Py_GIL_DISABLED to be loaded in 3.13t.

In other words:

  • Py_GIL_DISABLED is part of the extension ABI, similar to Py_VERSION_HEX
  • It's part of the wheel tag (i.e., the "t" in cp313-cp313t), just like the version number

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or as @henryiii wrote in #5148 (comment): "A free-threaded build can't talk to a non-free-threaded build".

Copy link
Contributor Author

@vfdev-5 vfdev-5 Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I should have built using same python version, e.g. 3.13t and 3.13.
Doing so, above example is giving either a segfault:

$ gdb --args python3.13t main.py

Starting program: /usr/bin/python3.13t main.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7be1f0b in pybind11::dict::dict(pybind11::object&&) () from /pybind11_playground/ft_or_not_ft/test_gil.so

(gdb) bt
#0  0x00007ffff796bf0b in pybind11::dict::dict(pybind11::object&&) () from /pybind11_playground/ft_or_not_ft/test_gil.so
#1  0x00007ffff79683c5 in pybind11::detail::get_internals() () from /pybind11_playground/ft_or_not_ft/test_gil.so
#2  0x00007ffff7967bdd in PyInit_test_gil () from /pybind11_playground/ft_or_not_ft/test_gil.so
#3  0x00000000005e7800 in ?? ()

or

$ python3.13 main.py
Traceback (most recent call last):
  File "/pybind11_playground/ft_or_not_ft/main.py", line 4, in <module>
    import test_ft
ImportError: /pybind11_playground/ft_or_not_ft/test_ft.so: undefined symbol: _Py_DecRefShared

If this is again unrelated sorry for the noise.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot! That really helps me understand the situation better. So I think what we need to do is work on this code:

#define PYBIND11_CHECK_PYTHON_VERSION \
{ \
const char *compiled_ver \
= PYBIND11_TOSTRING(PY_MAJOR_VERSION) "." PYBIND11_TOSTRING(PY_MINOR_VERSION); \
const char *runtime_ver = Py_GetVersion(); \
size_t len = std::strlen(compiled_ver); \
if (std::strncmp(runtime_ver, compiled_ver, len) != 0 \
|| (runtime_ver[len] >= '0' && runtime_ver[len] <= '9')) { \
PyErr_Format(PyExc_ImportError, \
"Python version mismatch: module was compiled for Python %s, " \
"but the interpreter version is incompatible: %s.", \
compiled_ver, \
runtime_ver); \
return nullptr; \
} \
}

We want to catch if there is a mismatch between the Python interpreter kind (gil, free-threading) and the way the extension was built.

In a separate PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I think packages might be beginning to ship, so probably not such a great idea to change without a bump. But not fond of these confusing bumps.

Yeah ... "version bump chaos"

Trying to summarize how I understand the situation:

  • The internals version wasn't changed for Python 3.13t — but that's ok because there were no released packages anyway that could have been built with a non-matching struct internals.

  • Assuming there are released packages for 3.13t now, yes, it would indeed be safest to bump the internals version for 3.13t. — But pegging that on Py_GIL_DISABLED will make it very difficult for the general user community to reason about the internals version.

If this PR gets merged as is, we'll have internals version

  • 4 for Python < 3.12 if _MSC_VER is not defined.
  • 5 for Python 3.12, or 3.13, or if _MSC_VER is defined.
  • 6 for Python 3.13t only.

Who can (or wants to) understand that? :-)

And then we also still have #5339 (smart_holder) and #5280 (native enum) that I'd really like to see merged.

I believe this is a situation where the maintainers need to step up to balance "making progress" with "minimizing confusion".

@henryiii wrote:

Free-threading is experimental

That makes me think, let's not mess with the internals version in this PR:

  • Maybe a small number of users experimenting will stumble,
  • but a much larger number of production users will not have to understand the weirdness summarized above.

WDYT?

Looking beyond that question, I feel it'd be best to approach this strategically:

That's a discussion far beyond this PR, but I'm mentioning it here because it would be another reason to not mess with the internals version here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes me think, let's not mess with the internals version in this PR

I'm concerned that landing this PR without changing the version number will cause problems not just for users but also package maintainers (e.g., scipy, PyTorch) that use pybind11 and support free-threading. The failure modes for different internals can be complicated -- in this case, I think you run the risk of treating uninitialized memory like a mutex.

I think the approach in this PR is a good long term fix, but if the version bump is the issue we can first pursue a different approach that avoids changing the internals struct (and consider landing this change later when #5280 is ready).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First priority, I want to be helpful. Second minimize potential for confusion.

Unfortunately I'm technically somewhat handicapped at the moment (no Linux workstation), but let me try to get #5296 merged asap, with @henryiii's formal approval, to be sure we're on the same.

After that version bumps will have a lot less impact, and it won't matter as much that people fully understand the numbering.

# else
# define PYBIND11_INTERNALS_VERSION 5
# endif
# else
# define PYBIND11_INTERNALS_VERSION 4
# endif
Expand Down Expand Up @@ -177,6 +181,7 @@ static_assert(sizeof(instance_map_shard) % 64 == 0,
struct internals {
#ifdef Py_GIL_DISABLED
pymutex mutex;
pymutex exception_translator_mutex;
#endif
// std::type_index -> pybind11's type information
type_map<type_info *> registered_types_cpp;
Expand Down Expand Up @@ -643,6 +648,19 @@ inline auto with_internals(const F &cb) -> decltype(cb(get_internals())) {
return cb(internals);
}

template <typename F>
inline auto with_exception_translators(const F &cb)
-> decltype(cb(get_internals().registered_exception_translators,
get_local_internals().registered_exception_translators)) {
auto &internals = get_internals();
#ifdef Py_GIL_DISABLED
std::unique_lock<pymutex> lock((internals).exception_translator_mutex);
#endif
auto &local_internals = get_local_internals();
return cb(internals.registered_exception_translators,
local_internals.registered_exception_translators);
}

inline std::uint64_t mix64(std::uint64_t z) {
// David Stafford's variant 13 of the MurmurHash3 finalizer popularized
// by the SplitMix PRNG.
Expand Down
21 changes: 12 additions & 9 deletions include/pybind11/pybind11.h
Original file line number Diff line number Diff line change
Expand Up @@ -2586,10 +2586,12 @@ void implicitly_convertible() {
}

inline void register_exception_translator(ExceptionTranslator &&translator) {
detail::with_internals([&](detail::internals &internals) {
internals.registered_exception_translators.push_front(
std::forward<ExceptionTranslator>(translator));
});
detail::with_exception_translators(
[&](std::forward_list<ExceptionTranslator> &exception_translators,
std::forward_list<ExceptionTranslator> &local_exception_translators) {
(void) local_exception_translators;
exception_translators.push_front(std::forward<ExceptionTranslator>(translator));
});
}

/**
Expand All @@ -2599,11 +2601,12 @@ inline void register_exception_translator(ExceptionTranslator &&translator) {
* the exception.
*/
inline void register_local_exception_translator(ExceptionTranslator &&translator) {
detail::with_internals([&](detail::internals &internals) {
(void) internals;
detail::get_local_internals().registered_exception_translators.push_front(
std::forward<ExceptionTranslator>(translator));
});
detail::with_exception_translators(
[&](std::forward_list<ExceptionTranslator> &exception_translators,
std::forward_list<ExceptionTranslator> &local_exception_translators) {
(void) exception_translators;
local_exception_translators.push_front(std::forward<ExceptionTranslator>(translator));
});
}

/**
Expand Down
10 changes: 10 additions & 0 deletions tests/custom_exceptions.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
from __future__ import annotations


class PythonMyException7(Exception):
def __init__(self, message):
self.message = message
super().__init__(message)

def __str__(self):
return "[PythonMyException7]: " + self.message.a
39 changes: 39 additions & 0 deletions tests/test_exceptions.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -111,6 +111,16 @@ struct PythonAlreadySetInDestructor {
py::str s;
};

struct CustomData {
explicit CustomData(const std::string &a) : a(a) {}
std::string a;
};

struct MyException7 {
explicit MyException7(const CustomData &message) : message(message) {}
CustomData message;
};

TEST_SUBMODULE(exceptions, m) {
m.def("throw_std_exception",
[]() { throw std::runtime_error("This exception was intentionally thrown."); });
Expand Down Expand Up @@ -385,4 +395,33 @@ TEST_SUBMODULE(exceptions, m) {

// m.def("pass_exception_void", [](const py::exception<void>&) {}); // Does not compile.
m.def("return_exception_void", []() { return py::exception<void>(); });

m.def("throws7", []() {
auto data = CustomData("abc");
throw MyException7(data);
});

py::class_<CustomData>(m, "CustomData", py::module_local())
.def(py::init<const std::string &>())
.def_readwrite("a", &CustomData::a);

PYBIND11_CONSTINIT static py::gil_safe_call_once_and_store<py::object>
PythonMyException7_storage;
PythonMyException7_storage.call_once_and_store_result([&]() {
auto mod = py::module_::import("custom_exceptions");
py::object obj = mod.attr("PythonMyException7");
return obj;
});

py::register_local_exception_translator([](std::exception_ptr p) {
try {
if (p) {
std::rethrow_exception(p);
}
} catch (const MyException7 &e) {
auto exc_type = PythonMyException7_storage.get_stored();
py::object exc_inst = exc_type(e.message);
PyErr_SetObject(PyExc_Exception, exc_inst.ptr());
}
});
}
5 changes: 5 additions & 0 deletions tests/test_exceptions.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
import sys

import pytest
from custom_exceptions import PythonMyException7

import env
import pybind11_cross_module_tests as cm
Expand Down Expand Up @@ -195,6 +196,10 @@ def test_custom(msg):
raise RuntimeError("Exception error: caught child from parent") from err
assert msg(excinfo.value) == "this is a helper-defined translated exception"

with pytest.raises(PythonMyException7) as excinfo:
m.throws7()
assert msg(excinfo.value) == "[PythonMyException7]: abc"


def test_nested_throws(capture):
"""Tests nested (e.g. C++ -> Python -> C++) exception handling"""
Expand Down