Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use with argparse #3

Open
janash opened this issue Apr 6, 2020 · 2 comments
Open

use with argparse #3

janash opened this issue Apr 6, 2020 · 2 comments
Assignees
Labels
enhancement New feature or request

Comments

@janash
Copy link

janash commented Apr 6, 2020

Hello! This issue is concerning your comments on using argparse. I haven't run your code, but here are a few comments.

I see you using global debug a lot. This should be necessary. The variable debug as declared should already be "global" (ie readable inside of the functions). Since you are only reading it, I believe you do not need debug. This should be true when the file is run as a script whether or not it is in __main__. However, if the definition is in __main__ and you run with pytest (or run functions from an import), debug will not be defined, so tests will fail.

You might try putting everything with argparse in __main__ and adding a default argument (debug=False) to all of your function definitions which can be overridden with the value from argparse when the script is run. This way, debug is defined automatically to be False. When you use it, you will call it with the appropriate values (in __main__, this would be debug=args.debug).

@khoivan88
Copy link
Owner

Hi @janash ,

You are absolutely right on the global debug. I told myself to remove that and I keep forgetting.

I thought of your suggestion of adding default argument of debug=False but hesitated because it would required some more rewriting. However, it might be my last option for what I want the code to do.

Thank you very much for your time and comments!!

@khoivan88 khoivan88 self-assigned this Apr 6, 2020
@khoivan88 khoivan88 added the enhancement New feature or request label Apr 6, 2020
@khoivan88
Copy link
Owner

@janash : One note: I did want to have a global debug here in find_sds() function because I was hoping that maybe by passing a debug value of True into this function via argparse, I can modify the global debug.

find_sds() function basically calls download_sds() in the try clause, which in turn calls all others functions. However, after a bunch of testing, even though global debug == True at find_sds() (by passing argparse in '__main__' or by sys.argv), all other functions still read global debug as False. This is a weird behavior that I am not sure why but at this point, I assume it has something to do with the map and/or partial functions I used. Perhaps, map() and partial() create some closure and trap debug as False from the global variable.

def find_sds(cas_list: List[str], download_path: str = None, pool_size: int = 10) -> None:
"""Find safety data sheet (SDS) for list of CAS numbers
Parameters
----------
cas_list : List[str]
List of CAS numbers
download_path : str, optional
the path for downloaded file,
by default None. If so, SDS will be downloaded into folder 'SDS'
inside folder containing the python file
pool_size : int, optional
the number of multithread that are running simultaneously,
by default 10
Returns
-------
None:
Summary of result is print to screen
"""
global debug
# If the list of CAS is empty, exit the program
if not cas_list:
print('List of CAS numbers is empty!')
exit(0)
# # print out extra info in debug mode in case SDS is not found
# if len(sys.argv) == 2 and sys.argv[1] in ['--debug=True', '--debug=true', '--debug', '-d']:
# debug = True
# print('debug value: {}'.format(debug))
# Set download_path to 'SDS' folder inside the parent folder of python file
if not download_path:
download_path = Path(__file__).resolve().parent / 'SDS'
# Get the set of CAS for molecule missing sds:
to_be_downloaded = set(cas_list)
# Step 1: downloading sds file
# Check if download path directory exists. If not, create it
# https://stackoverflow.com/questions/12517451/automatically-creating-directories-with-file-output
# https://docs.python.org/3/library/os.html#os.makedirs
os.makedirs(download_path, exist_ok=True)
print('Downloading missing SDS files. Please wait!')
download_result = []
# # Using multithreading
try:
with Pool(pool_size) as p:
download_result = p.map(partial(
download_sds,
download_path=download_path),
to_be_downloaded)
except Exception as error:
# if debug:
traceback_str = ''.join(traceback.format_exception(etype=type(error), value=error, tb=error.__traceback__))
print(traceback_str)
# Step 2: print out summary
finally:
# Sometimes Pool worker return 'None', remove 'None' as the following
# print(download_result)
download_result = [x for x in download_result if x]
missing_sds = set()
updated_sds = set()
for cas_nr, sds_existed, sds_source in download_result:
if sds_existed:
updated_sds.add(cas_nr)
else:
missing_sds.add(cas_nr)
if missing_sds:
print('\nStill missing SDS:\n{}'.format(missing_sds))
print('\nSummary: ')
print('\t{} SDS files are missing.'.format(len(missing_sds)))
print('\t{} SDS files downloaded.'.format(len(updated_sds)))
# Advice user about turning on debug mode for more error printing
if not debug:
print('\n\n(Optional): you can turn on debug mode (more error printing during search) using the following command:')
print('python find_sds/find_sds.py --debug\n')

Anyway, all of this probably comes from my poor planning for this code. It has evolved over time with more functions added.

Thank you for your time again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants