DOC: Enforce Numpy Docstring Validation | pandas.Series #58504

Open
gboeker opened this issue May 1, 2024 · 10 comments
Labels
Docs, Needs Triage (issue that has not been reviewed by a pandas team member)

Comments

@gboeker
Contributor

gboeker commented May 1, 2024

DOC: Enforce Numpy Docstring Validation (Parent Issue) #58063

Pandas has a script for validating docstrings in code_checks.sh. Currently, some methods fail some of these checks.
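
For anyone unfamiliar with the checks, here is a rough sketch of a numpydoc-style docstring that has the sections most of the codes below ask for. `clip_values` is a made-up function for illustration, not part of pandas. My understanding of the codes is that PR01/PR07 concern undocumented or undescribed parameters, RT03 a missing return description, and SA01 a missing See Also section; treat that mapping as an assumption and rely on the validator's own messages.

```python
def clip_values(values, lower, upper):
    """
    Limit each value to the closed interval [lower, upper].

    Parameters
    ----------
    values : list of float
        Numbers to clip.
    lower : float
        Smallest value allowed in the result.
    upper : float
        Largest value allowed in the result.

    Returns
    -------
    list of float
        A new list with every element clipped to the interval.

    See Also
    --------
    min : Return the smallest item of an iterable.
    max : Return the largest item of an iterable.

    Examples
    --------
    >>> clip_values([1.0, 5.0, 9.0], 2.0, 8.0)
    [2.0, 5.0, 8.0]
    """
    # Hypothetical helper: clamp from below first, then from above.
    return [min(max(v, lower), upper) for v in values]
```

Every parameter has a description, the return value is described, and a See Also section is present, which is the general shape a fixed docstring tends to take.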

pandas.Series

pandas/ci/code_checks.sh

Lines 318 to 490 in c468028

-i "pandas.Series SA01" \
-i "pandas.Series.T SA01" \
-i "pandas.Series.__iter__ RT03,SA01" \
-i "pandas.Series.add PR07" \
-i "pandas.Series.at_time PR01" \
-i "pandas.Series.backfill PR01,SA01" \
-i "pandas.Series.bfill SA01" \
-i "pandas.Series.case_when RT03" \
-i "pandas.Series.cat PR07,SA01" \
-i "pandas.Series.cat.add_categories PR01,PR02" \
-i "pandas.Series.cat.as_ordered PR01" \
-i "pandas.Series.cat.as_unordered PR01" \
-i "pandas.Series.cat.codes SA01" \
-i "pandas.Series.cat.ordered SA01" \
-i "pandas.Series.cat.remove_categories PR01,PR02" \
-i "pandas.Series.cat.remove_unused_categories PR01" \
-i "pandas.Series.cat.rename_categories PR01,PR02" \
-i "pandas.Series.cat.reorder_categories PR01,PR02" \
-i "pandas.Series.cat.set_categories PR01,PR02" \
-i "pandas.Series.copy SA01" \
-i "pandas.Series.div PR07" \
-i "pandas.Series.droplevel SA01" \
-i "pandas.Series.dt.as_unit PR01,PR02" \
-i "pandas.Series.dt.ceil PR01,PR02,SA01" \
-i "pandas.Series.dt.components SA01" \
-i "pandas.Series.dt.date SA01" \
-i "pandas.Series.dt.day SA01" \
-i "pandas.Series.dt.day_name PR01,PR02,SA01" \
-i "pandas.Series.dt.day_of_year SA01" \
-i "pandas.Series.dt.dayofyear SA01" \
-i "pandas.Series.dt.days SA01" \
-i "pandas.Series.dt.days_in_month SA01" \
-i "pandas.Series.dt.daysinmonth SA01" \
-i "pandas.Series.dt.floor PR01,PR02,SA01" \
-i "pandas.Series.dt.freq GL08" \
-i "pandas.Series.dt.hour SA01" \
-i "pandas.Series.dt.is_leap_year SA01" \
-i "pandas.Series.dt.microsecond SA01" \
-i "pandas.Series.dt.microseconds SA01" \
-i "pandas.Series.dt.minute SA01" \
-i "pandas.Series.dt.month SA01" \
-i "pandas.Series.dt.month_name PR01,PR02,SA01" \
-i "pandas.Series.dt.nanosecond SA01" \
-i "pandas.Series.dt.nanoseconds SA01" \
-i "pandas.Series.dt.normalize PR01" \
-i "pandas.Series.dt.quarter SA01" \
-i "pandas.Series.dt.qyear GL08" \
-i "pandas.Series.dt.round PR01,PR02,SA01" \
-i "pandas.Series.dt.second SA01" \
-i "pandas.Series.dt.seconds SA01" \
-i "pandas.Series.dt.strftime PR01,PR02" \
-i "pandas.Series.dt.time SA01" \
-i "pandas.Series.dt.timetz SA01" \
-i "pandas.Series.dt.to_period PR01,PR02,RT03" \
-i "pandas.Series.dt.total_seconds PR01" \
-i "pandas.Series.dt.tz SA01" \
-i "pandas.Series.dt.tz_convert PR01,PR02,RT03" \
-i "pandas.Series.dt.tz_localize PR01,PR02" \
-i "pandas.Series.dt.unit GL08" \
-i "pandas.Series.dt.year SA01" \
-i "pandas.Series.dtype SA01" \
-i "pandas.Series.dtypes SA01" \
-i "pandas.Series.empty GL08" \
-i "pandas.Series.eq PR07,SA01" \
-i "pandas.Series.ffill SA01" \
-i "pandas.Series.first_valid_index SA01" \
-i "pandas.Series.floordiv PR07" \
-i "pandas.Series.ge PR07,SA01" \
-i "pandas.Series.get SA01" \
-i "pandas.Series.gt PR07,SA01" \
-i "pandas.Series.hasnans SA01" \
-i "pandas.Series.infer_objects RT03" \
-i "pandas.Series.is_monotonic_decreasing SA01" \
-i "pandas.Series.is_monotonic_increasing SA01" \
-i "pandas.Series.is_unique SA01" \
-i "pandas.Series.item SA01" \
-i "pandas.Series.keys SA01" \
-i "pandas.Series.kurt RT03,SA01" \
-i "pandas.Series.kurtosis RT03,SA01" \
-i "pandas.Series.last_valid_index SA01" \
-i "pandas.Series.le PR07,SA01" \
-i "pandas.Series.list.__getitem__ SA01" \
-i "pandas.Series.list.flatten SA01" \
-i "pandas.Series.list.len SA01" \
-i "pandas.Series.lt PR07,SA01" \
-i "pandas.Series.mask RT03" \
-i "pandas.Series.max RT03" \
-i "pandas.Series.mean RT03,SA01" \
-i "pandas.Series.median RT03,SA01" \
-i "pandas.Series.min RT03" \
-i "pandas.Series.mod PR07" \
-i "pandas.Series.mode SA01" \
-i "pandas.Series.mul PR07" \
-i "pandas.Series.nbytes SA01" \
-i "pandas.Series.ndim SA01" \
-i "pandas.Series.ne PR07,SA01" \
-i "pandas.Series.nunique RT03" \
-i "pandas.Series.pad PR01,SA01" \
-i "pandas.Series.plot PR02,SA01" \
-i "pandas.Series.pop RT03,SA01" \
-i "pandas.Series.pow PR07" \
-i "pandas.Series.prod RT03" \
-i "pandas.Series.product RT03" \
-i "pandas.Series.radd PR07" \
-i "pandas.Series.rdiv PR07" \
-i "pandas.Series.reorder_levels RT03,SA01" \
-i "pandas.Series.rfloordiv PR07" \
-i "pandas.Series.rmod PR07" \
-i "pandas.Series.rmul PR07" \
-i "pandas.Series.rpow PR07" \
-i "pandas.Series.rsub PR07" \
-i "pandas.Series.rtruediv PR07" \
-i "pandas.Series.sem PR01,RT03,SA01" \
-i "pandas.Series.shape SA01" \
-i "pandas.Series.size SA01" \
-i "pandas.Series.skew RT03,SA01" \
-i "pandas.Series.sparse PR01,SA01" \
-i "pandas.Series.sparse.density SA01" \
-i "pandas.Series.sparse.fill_value SA01" \
-i "pandas.Series.sparse.from_coo PR07,SA01" \
-i "pandas.Series.sparse.npoints SA01" \
-i "pandas.Series.sparse.sp_values SA01" \
-i "pandas.Series.sparse.to_coo PR07,RT03,SA01" \
-i "pandas.Series.std PR01,RT03,SA01" \
-i "pandas.Series.str PR01,SA01" \
-i "pandas.Series.str.capitalize RT03" \
-i "pandas.Series.str.casefold RT03" \
-i "pandas.Series.str.center RT03,SA01" \
-i "pandas.Series.str.decode PR07,RT03,SA01" \
-i "pandas.Series.str.encode PR07,RT03,SA01" \
-i "pandas.Series.str.find RT03" \
-i "pandas.Series.str.fullmatch RT03" \
-i "pandas.Series.str.get RT03,SA01" \
-i "pandas.Series.str.index RT03" \
-i "pandas.Series.str.ljust RT03,SA01" \
-i "pandas.Series.str.lower RT03" \
-i "pandas.Series.str.lstrip RT03" \
-i "pandas.Series.str.match RT03" \
-i "pandas.Series.str.normalize RT03,SA01" \
-i "pandas.Series.str.partition RT03" \
-i "pandas.Series.str.repeat SA01" \
-i "pandas.Series.str.replace SA01" \
-i "pandas.Series.str.rfind RT03" \
-i "pandas.Series.str.rindex RT03" \
-i "pandas.Series.str.rjust RT03,SA01" \
-i "pandas.Series.str.rpartition RT03" \
-i "pandas.Series.str.rstrip RT03" \
-i "pandas.Series.str.strip RT03" \
-i "pandas.Series.str.swapcase RT03" \
-i "pandas.Series.str.title RT03" \
-i "pandas.Series.str.translate RT03,SA01" \
-i "pandas.Series.str.upper RT03" \
-i "pandas.Series.str.wrap RT03,SA01" \
-i "pandas.Series.str.zfill RT03" \
-i "pandas.Series.struct.dtypes SA01" \
-i "pandas.Series.sub PR07" \
-i "pandas.Series.sum RT03" \
-i "pandas.Series.swaplevel SA01" \
-i "pandas.Series.to_dict SA01" \
-i "pandas.Series.to_frame SA01" \
-i "pandas.Series.to_list RT03" \
-i "pandas.Series.to_markdown SA01" \
-i "pandas.Series.to_period SA01" \
-i "pandas.Series.to_string SA01" \
-i "pandas.Series.to_timestamp RT03,SA01" \
-i "pandas.Series.truediv PR07" \
-i "pandas.Series.tz_convert SA01" \
-i "pandas.Series.tz_localize SA01" \
-i "pandas.Series.unstack SA01" \
-i "pandas.Series.update PR07,SA01" \
-i "pandas.Series.value_counts RT03" \
-i "pandas.Series.var PR01,RT03,SA01" \
-i "pandas.Series.where RT03" \
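
If it helps when picking methods, the list above can be grouped by error code with a few lines of Python. This is only a sketch: `IGNORE_LINES`, `ENTRY` and `parse_ignores` are names I made up, and the sample holds just four entries copied from the list rather than the whole file.

```python
import re
from collections import Counter

# A few entries copied from the list above; the real file has many more.
IGNORE_LINES = r'''
-i "pandas.Series SA01" \
-i "pandas.Series.__iter__ RT03,SA01" \
-i "pandas.Series.add PR07" \
-i "pandas.Series.dt.ceil PR01,PR02,SA01" \
'''

# Assumed entry shape: -i "<object name> <comma-separated codes>"
ENTRY = re.compile(r'-i\s+"(?P<name>\S+)\s+(?P<codes>[A-Z0-9,]+)"')


def parse_ignores(text):
    """Map each object name to the list of error codes ignored for it."""
    return {m["name"]: m["codes"].split(",") for m in ENTRY.finditer(text)}


ignores = parse_ignores(IGNORE_LINES)

# Count how many entries mention each error code.
code_counts = Counter(code for codes in ignores.values() for code in codes)
print(code_counts.most_common())
```

Grouping by code makes it easier to batch similar fixes, for example taking several SA01-only entries in one pull request.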

The task is:

  1. take 1-5 methods

  2. run: scripts/validate_docstrings.py --format=actions <method-name>

example command: scripts/validate_docstrings.py --format=actions pandas.Categorical.__array__
example output:

################################################################################
################################## Validation ##################################
################################################################################

2 Errors found for `pandas.Categorical.__array__`:
	ES01	No extended summary found
	SA01	See Also section not found

  3. check whether docstring validation passes for those methods and, if necessary, fix the docstrings according to the errors reported. Note: we've chosen to ignore ES01 errors; these do not need to be fixed.

  4. remove those methods from code_checks.sh if all errors are cleared and the docstring is correct; otherwise, remove only the specific error that was fixed from that method's list of errors.

  5. commit, push, and open a pull request
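
The bookkeeping in the steps above (read the validator output, skip ES01, then trim or drop the matching entry in code_checks.sh) could be sketched like this. It is an illustration only: `remaining_codes` and `update_ignore_line` are hypothetical helpers, not part of the pandas scripts, and the output format is assumed to match the example shown earlier in this issue.

```python
import re

# Sample taken from the example output in this issue (tab-separated lines).
SAMPLE_OUTPUT = (
    "2 Errors found for `pandas.Categorical.__array__`:\n"
    "\tES01\tNo extended summary found\n"
    "\tSA01\tSee Also section not found\n"
)

# The issue says ES01 errors are ignored and need not be fixed.
NOT_REQUIRED = {"ES01"}


def remaining_codes(output):
    """Return the reported error codes that still need fixing."""
    codes = re.findall(r"^[ \t]+([A-Z]{2}\d{2})[ \t]", output, flags=re.MULTILINE)
    return [c for c in codes if c not in NOT_REQUIRED]


def update_ignore_line(line, remaining):
    """Rewrite one ignore entry; return None when the entry can be deleted."""
    match = re.search(r'-i "(\S+) ([A-Z0-9,]+)"', line)
    if match is None:
        return line  # not an ignore entry, leave it untouched
    name, codes = match.group(1), match.group(2).split(",")
    # Keep only the ignored codes that the validator still reports.
    kept = [c for c in codes if c in remaining]
    if not kept:
        return None
    return '-i "{} {}" \\'.format(name, ",".join(kept))


print(remaining_codes(SAMPLE_OUTPUT))
print(update_ignore_line('-i "pandas.Series.sem PR01,RT03,SA01" \\', ["RT03"]))
```

Returning `None` for a fully fixed entry keeps "delete the line" distinct from "rewrite the line", which mirrors the two cases described in step 4.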

Please don't comment "take", as multiple people can work on this issue. You also don't need to ask for permission to work on this; just comment with the methods you are going to work on : )

If you're a new contributor, please check the contributing guide.

Thanks @datapythonista @jordan-d-murphy for the inspiration for this issue!

@gboeker added the Docs and Needs Triage labels May 1, 2024
@tuhinsharma121
Contributor

tuhinsharma121 commented May 2, 2024

working on

-i "pandas.Series SA01" \
-i "pandas.Series.backfill PR01,SA01" \
-i "pandas.Series.cat.codes SA01" \
-i "pandas.Series.dt.days SA01" \

@gboeker
Contributor Author

gboeker commented May 2, 2024

working on

 -i "pandas.Series.cat PR07,SA01" \
 -i "pandas.Series.__iter__ RT03,SA01" \

@tuhinsharma121
Contributor

working on

-i "pandas.Series.max RT03" \
-i "pandas.Series.mean RT03,SA01" \
-i "pandas.Series.median RT03,SA01" \
-i "pandas.Series.min RT03" \
-i "pandas.Series.mod PR07" \
-i "pandas.Series.mode SA01" \
-i "pandas.Series.mul PR07" \

@shriyakalakata
Contributor

shriyakalakata commented May 2, 2024

Working on

-i "pandas.Series.dtype SA01" \ 
-i "pandas.Series.is_unique SA01" \
-i "pandas.Series.shape SA01" \

@shriyakalakata
Contributor

shriyakalakata commented May 3, 2024

Working on

-i "pandas.Series.is_monotonic_decreasing SA01" \
-i "pandas.Series.is_monotonic_increasing SA01" \
-i "pandas.Series.hasnans SA01" \

@tuhinsharma121
Contributor

tuhinsharma121 commented May 11, 2024

working on

-i "pandas.Series.add PR07" \
-i "pandas.Series.cat PR07" \

@tuhinsharma121
Contributor

tuhinsharma121 commented May 13, 2024

working on

pandas.Series.case_when RT03
pandas.Series.str.translate RT03,SA01

@sam-baumann
Contributor

sam-baumann commented May 15, 2024

Working on

pandas.Series.floordiv
pandas.Series.pow
pandas.Series.rmod
pandas.Series.rmod
pandas.Series.rtruediv
pandas.Series.sub

@03darius

Working on

-i "pandas.Series.cat.add_categories PR01,PR02" \
-i "pandas.Series.cat.as_ordered PR01" \
-i "pandas.Series.cat.as_unordered PR01" \
-i "pandas.Series.cat.remove_categories PR01,PR02" \
-i "pandas.Series.cat.remove_unused_categories PR01" \
-i "pandas.Series.cat.rename_categories PR01,PR02" \
-i "pandas.Series.cat.reorder_categories PR01,PR02" \
-i "pandas.Series.cat.set_categories PR01,PR02" \

@tuhinsharma121
Contributor

tuhinsharma121 commented May 18, 2024

working on

-i "pandas.Series.eq PR07,SA01" \ 
-i "pandas.Series.kurtosis RT03,SA01" \
-i "pandas.Series.kurt RT03,SA01" \


5 participants