
parameters used in cell type annotation tutorial #174

Open
yueming-ding opened this issue Mar 28, 2024 · 0 comments

@subercui I have a question about your cell type annotation tutorial, https://github.com/bowang-lab/scGPT/blob/main/tutorials/Tutorial_Annotation.ipynb. You use "normalize_total=1e4" in the following Preprocessor call:
preprocessor = Preprocessor(
    use_key="X",  # the key in adata.layers to use as raw data
    filter_gene_by_counts=filter_gene_by_counts,  # step 1
    filter_cell_by_counts=False,  # step 2
    normalize_total=1e4,  # 3. whether to normalize the raw data and to what sum
    result_normed_key="X_normed",  # the key in adata.layers to store the normalized data
    log1p=data_is_raw,  # 4. whether to log1p the normalized data
    result_log1p_key="X_log1p",
    subset_hvg=False,  # 5. whether to subset the raw data to highly variable genes
    hvg_flavor="seurat_v3" if data_is_raw else "cell_ranger",
    binning=n_bins,  # 6. whether to bin the raw data and to what number of bins
    result_binned_key="X_binned",  # the key in adata.layers to store the binned data
)

However, I checked the data you used. In both datasets (c_data.h5ad and filtered_ms_adata.h5ad), adata.X already holds log1p-transformed values, not raw counts. I think the correct setting is therefore "normalize_total=False" rather than "normalize_total=1e4". Could you explain why "normalize_total=1e4" is used here when adata.X is log1p data? With that setting, the log1p values themselves are normalized by their per-cell total. We found that the X_binned values and the final metrics (accuracy, precision, etc.) differ depending on whether "normalize_total=1e4" or "normalize_total=False" is used for these datasets.
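
For anyone who wants to reproduce the check, here is a minimal NumPy-only sketch of the idea. It is not scGPT or scanpy code: `looks_like_raw_counts` and `normalize_rows` are hypothetical helpers written for illustration, the integer test is only a heuristic, and `normalize_rows` is a simplified stand-in for total-count normalization. For a sparse adata.X, I assume you would pass a dense copy (e.g. `adata.X.toarray()`).

```python
import numpy as np


def looks_like_raw_counts(X, tol=1e-6):
    """Heuristic (assumption): raw count matrices hold non-negative integers."""
    X = np.asarray(X, dtype=float)
    return bool(np.all(X >= 0) and np.all(np.abs(X - np.round(X)) < tol))


def normalize_rows(X, target_sum=1e4):
    """Simplified stand-in for total-count normalization:
    scale each cell (row) so that its values sum to target_sum."""
    X = np.asarray(X, dtype=float)
    totals = X.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # avoid division by zero for empty cells
    return X / totals * target_sum


# Toy data: 2 cells x 3 genes of raw counts.
raw = np.array([[0, 3, 7], [12, 0, 1]])

# What a pre-processed dataset would store: log1p of normalized counts.
logged = np.log1p(normalize_rows(raw))

print(looks_like_raw_counts(raw))     # True
print(looks_like_raw_counts(logged))  # False

# Normalizing again rescales the already-logged values, so any
# downstream binning sees different inputs.
renormalized = normalize_rows(logged)
print(np.allclose(renormalized, logged))  # False
```

If the first check returns False for adata.X, the values have probably been transformed already, and applying total-count normalization on top changes what gets binned.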
