New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-116738: Make _csv module thread-safe #118344
base: main
Are you sure you want to change the base?
Conversation
Modules/_csv.c
Outdated
@@ -1582,7 +1585,8 @@ csv_register_dialect(PyObject *module, PyObject *args, PyObject *kwargs) | |||
dialect = _call_dialect(module_state, dialect_obj, kwargs); | |||
if (dialect == NULL) | |||
return NULL; | |||
if (PyDict_SetItem(module_state->dialects, name_obj, dialect) < 0) { | |||
res = PyDict_SetItem(module_state->dialects, name_obj, dialect); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we don't have to use res
anymore. :)
Modules/_csv.c
Outdated
@@ -34,7 +35,7 @@ typedef struct { | |||
PyTypeObject *dialect_type; | |||
PyTypeObject *reader_type; | |||
PyTypeObject *writer_type; | |||
long field_limit; /* max parsed field size */ | |||
int32_t field_limit; /* max parsed field size */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, why is this change needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw there is not a _Py_atomic_load_long
so I changed it to int32_t
, but now I think we can just add it after you mentioned it, thanks for noticing!
This extension module has some module state values which may be read or written in different threads:
field_limit
: A simple integer value to store the maximum field size for the parser;dialects
: A dict value to store all the dialects by string names.field_limit
There is a public function
field_size_limit
that can read and write it, using a critical section to protect it. For other internal usages,_Py_atomic_load_int32
is used to make it thread safe.dialects
There are three public functions
get_dialect
,unregister_dialect
andlist_dialects
that can access this field.For the first two functions, a critical section is used to protect it.And the
list_dialects
function just returns the dict's keys withPyDict_Keys
. I think its result is readonly andPyDict_Keys
itself is thread safe, so we don't need to do anything.Other internal usages of
dialects
involve accessing its value withget_dialect_from_registry
, which simply callsPyDict_GetItemRef
and returns the result. I think we don't need to change it too.Update: I think these APIs is just call
dict
APIs which is already thread-safe, and there are no other state changes inside these APIs, so we don't need add locks on them.