PanicException: called Option::unwrap() on a None value #193

Open
nroak opened this issue Mar 8, 2023 · 6 comments

@nroak

nroak commented Mar 8, 2023

Setup

I am reporting a problem with GSEApy version, Python version, and operating
system as follows:

import sys; print(sys.version)
import platform; print(platform.python_implementation()); print(platform.platform())
import gseapy; print(gseapy.__version__)

3.10.9 (main, Jan 11 2023, 09:18:20) [Clang 14.0.6 ]
CPython
macOS-10.16-x86_64-i386-64bit
1.0.4

Expected behaviour

I'm running gsea step after converting mouse genes to human genes from my single cell data matrix.

gs_res = gp.gsea(data=bdata2.human, # or data='./P53_resampling_data.txt'
                gene_sets='h.all.v7.5.1.symbols.gmt', # or enrichr library names
                cls= "./gsea/"+cell_type+".cls", # cls=class_vector
                # set permutation_type to phenotype if samples >=15
                permutation_type='phenotype',
                permutation_num=1000, # reduce number to speed up test
                outdir=None,  # do not write output to disk
                method='signal_to_noise',
                threads=4, seed= 7)

Actual behaviour

2023-03-07 21:49:00,178 [WARNING] Dropping duplicated gene names, only keep the first values
thread '<unnamed>' panicked at 'called `Option::unwrap()` on a `None` value', src/utils.rs:67:33

PanicException Traceback (most recent call last)
Cell In[119], line 1
----> 1 gs_res = gp.gsea(data=bdata2.human, # or data='./P53_resampling_data.txt'
2 gene_sets='h.all.v7.5.1.symbols.gmt', # or enrichr library names
3 cls= "./gsea/"+cell_type+".cls", # cls=class_vector
4 # set permutation_type to phenotype if samples >=15
5 permutation_type='phenotype',
6 permutation_num=1000, # reduce number to speed up test
7 outdir=None, # do not write output to disk
8 method='signal_to_noise',
9 threads=4, seed= 7)

File ~/opt/anaconda3/envs/scanpy/lib/python3.10/site-packages/gseapy/__init__.py:150, in gsea(data, gene_sets, cls, outdir, min_size, max_size, permutation_num, weighted_score_type, permutation_type, method, ascending, threads, figsize, format, graph_num, no_plot, seed, verbose, *arg, **kwarg)
128 threads = kwarg["processes"]
130 gs = GSEA(
131 data,
132 gene_sets,
(...)
148 verbose,
149 )
--> 150 gs.run()
152 return gs

File ~/opt/anaconda3/envs/scanpy/lib/python3.10/site-packages/gseapy/gsea.py:305, in GSEA.run(self)
301 else: # phenotype permutation
302 group = list(
303 map(lambda x: True if x == self.pheno_pos else False, cls_vector)
304 )
--> 305 gsum = gsea_rs(
306 dat.index.values.tolist(),
307 dat.values.tolist(), # each row is gene values across samples
308 gmt,
309 group,
310 method,
311 self.weight,
312 self.min_size,
313 self.max_size,
314 self.permutation_num,
315 self._threads,
316 self.seed,
317 )
319 if self._outdir is not None:
320 self._logger.info(
321 "Start to generate GSEApy reports and figures............"
322 )

PanicException: called Option::unwrap() on a None value

@zqfang
Owner

zqfang commented Mar 9, 2023

Have you set the index of bdata2.human to the gene symbols?

I don't see any issue with my code here

[image]

@BelanaZ

BelanaZ commented Jun 13, 2023

I get the same kind of error in a different case. This is the trace:


thread '<unnamed>' panicked at 'called `Option::unwrap()` on a `None` value', src\utils.rs:67:33
thread '<unnamed>' panicked at 'called `Option::unwrap()` on a `None` value', src\utils.rs:67:33
thread '<unnamed>' panicked at 'called `Option::unwrap()` on a `None` value', thread '<unnamed>thread 'src\utils.rs:67thread '' panicked at ':<unnamed>' panicked at 'called `Option::unwrap()` on a `None` value', src\utils.rs<unnamed>' panicked at ':called `Option::unwrap()` on a `None` value', 33
67:called `Option::unwrap()` on a `None` value', src\utils.rs33
src\utils.rs::6767:33
:33
thread '<unnamed>' panicked at 'called `Option::unwrap()` on a `None` value', src\utils.rs:67:33

gs.run()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <gseapy.gsea.GSEA object at 0x000002624D259790>

    def run(self):
        """GSEA main procedure"""
        m = self.method.lower()
        if m in ["signal_to_noise", "s2n"]:
            method = Metric.Signal2Noise
        elif m in ["s2n", "abs_signal_to_noise", "abs_s2n"]:
            method = Metric.AbsSignal2Noise
        elif m == "t_test":
            method = Metric.Ttest
        elif m == "ratio_of_classes":
            method = Metric.RatioOfClasses
        elif m == "diff_of_classes":
            method = Metric.DiffOfClasses
        elif m == "log2_ratio_of_classes":
            method = Metric.Log2RatioOfClasses
        else:
            raise Exception("Sorry, input method %s is not supported" % m)
    
        assert self.permutation_type in ["phenotype", "gene_set"]
        assert self.min_size <= self.max_size
    
        # Start Analysis
        self._logger.info("Parsing data files for GSEA.............................")
        # phenotype labels parsing
        cls_vector = self.load_classes()
        # select correct expression genes and values.
        dat, cls_dict = self.load_data(cls_vector)
        self.cls_dict = cls_dict
        # data frame must have length > 1
        assert len(dat) > 1
        # filtering out gene sets and build gene sets dictionary
        gmt = self.load_gmt(gene_list=dat.index.values, gmt=self.gene_sets)
        self.gmt = gmt
        self._logger.info(
            "%04d gene_sets used for further statistical testing....." % len(gmt)
        )
        self._logger.info("Start to run GSEA...Might take a while..................")
        # cpu numbers
        # compute ES, NES, pval, FDR, RES
        if self.permutation_type == "gene_set":
            # ranking metrics calculation.
            idx, dat2 = self.calculate_metric(
                df=dat,
                method=self.method,
                pos=self.pheno_pos,
                neg=self.pheno_neg,
                classes=cls_dict,
                ascending=self.ascending,
            )
            gsum = prerank_rs(
                dat2.index.values.tolist(),  # gene list
                dat2.squeeze().values.tolist(),  # ranking values
                gmt,  # must be a dict object
                self.weight,
                self.min_size,
                self.max_size,
                self.permutation_num,
                self._threads,
                self.seed,
            )
            ## need to update indices, prerank_rs only stores input's order
            # so compatible with code code below
            indices = gsum.indices
            indices[0] = idx
            gsum.indices = indices  # only accept [[]]
        else:  # phenotype permutation
            group = list(
                map(lambda x: True if x == self.pheno_pos else False, cls_vector)
            )
>           gsum = gsea_rs(
                dat.index.values.tolist(),
                dat.values.tolist(),  # each row is gene values across samples
                gmt,
                group,
                method,
                self.weight,
                self.min_size,
                self.max_size,
                self.permutation_num,
                self._threads,
                self.seed,
            )
E           pyo3_runtime.PanicException: called `Option::unwrap()` on a `None` value

I use the gsea method like this:

        gs_res = gp.gsea(
            data=df,
            gene_sets=gene_sets, #["KEGG_2016"]
            cls=cls,
            min_size=min_size, #15
            max_size=max_size, #500
            method=ranking_method, #"log2_ratio_of_classes"
            permutation_type=permutation_type, #"phenotype"
            permutation_num=number_of_permutations, #1000
            weighted_score_type=weighted_score, #1
            outdir=None,
            seed=seed, #123
        )

I think the likely cause is that my input dataframe looked like the one below. I normalised the data by z-score during preprocessing, so it contained negative values. If a class mean is negative, taking the log2 to compute the ranking with log2_ratio_of_classes won't work. This was obviously an error on my part, but maybe this case could be caught in some way so that it doesn't break the Python code.

		Sample1  ...  SampleN
Gene symbol                        ...                       
ABCA8                   -0.288208  ...              -0.288208
ADCY9                   -0.288208  ...              -0.288208
ADH5                    -0.180052  ...              -0.190491
AGA                     -0.288208  ...              -0.288208
AGT                     -0.288208  ...              -0.258255
...                           ...  ...                    ...
UTP20                   -0.288208  ...              -0.288208
VGF                     -0.288208  ...              -0.288208
VIM                     -0.170777  ...              -0.221162
VPS26A                  -0.288208  ...              -0.288208
WNT7A                   -0.288208  ...              -0.288208
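
A minimal sketch of a pre-check for the situation described above. It assumes that log2_ratio_of_classes ranks each gene by log2(mean of one class / mean of the other), which is not confirmed in this thread, and the helper name `nonpositive_class_means` is hypothetical and not part of GSEApy. It lists the genes for which a class mean is zero or negative, so the ranking value would be undefined or not meaningful.

```python
import math


def log2_ratio(pos_values, neg_values):
    """log2(mean(pos) / mean(neg)); NaN if either class mean is <= 0.

    Assumption: this mirrors the idea behind log2_ratio_of_classes,
    not GSEApy's actual implementation.
    """
    pos_mean = sum(pos_values) / len(pos_values)
    neg_mean = sum(neg_values) / len(neg_values)
    if pos_mean <= 0 or neg_mean <= 0:
        return math.nan
    return math.log2(pos_mean / neg_mean)


def nonpositive_class_means(expr, is_pos):
    """Return the genes whose log2 ratio cannot be computed safely.

    expr:   dict mapping gene symbol -> list of values, one per sample
    is_pos: list of bools, True where the sample is in the positive class
    """
    flagged = []
    for gene, values in expr.items():
        pos = [v for v, p in zip(values, is_pos) if p]
        neg = [v for v, p in zip(values, is_pos) if not p]
        if math.isnan(log2_ratio(pos, neg)):
            flagged.append(gene)
    return flagged


expr = {
    "ABCA8": [-0.288208, -0.288208, -0.288208, -0.288208],  # z-scored values
    "GENE_OK": [2.0, 4.0, 1.0, 2.0],  # hypothetical non-negative values
}
is_pos = [True, True, False, False]
print(nonpositive_class_means(expr, is_pos))  # ['ABCA8']
```

If the list is not empty, a ranking method that does not take a logarithm, or input data that was not z-scored, is probably the safer choice.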

@gouinK

gouinK commented Jul 11, 2023

I am getting the same error when using the prerank module with v1.0.4, but I just get a warning when using v0.10.7 and it produces results.

Here is a look at my input:
[Screenshot: Screen Shot 2023-07-11 at 6 27 30 AM]

Here is my invocation:

            import gseapy as gp

            pre_res = gp.prerank(rnk=rnk, 
                                 gene_sets='/Users/xxxxxxx/.cache/gseapy/enrichr.GO_Biological_Process_2018.gmt',
                                 processes=4,
                                 permutation_num=100,
                                 ascending=False,
                                 outdir='test/prerank_report_kegg', 
                                 format='png', 
                                 seed=6,
                                 min_size=0,
                                 max_size=500,
                                 verbose=True)

Here is the error message from v1.0.4:

2023-07-11 06:37:45,371 [WARNING] Duplicated values found in preranked stats: 0.77% of genes
The order of those genes will be arbitrary, which may produce unexpected results.
2023-07-11 06:37:45,378 [INFO] Parsing data files for GSEA.............................
2023-07-11 06:37:45,453 [INFO] 0000 gene_sets have been filtered out when max_size=500 and min_size=0
2023-07-11 06:37:45,453 [INFO] 5103 gene_sets used for further statistical testing.....
2023-07-11 06:37:45,454 [INFO] Start to run GSEA...Might take a while..................
thread '<unnamed>' panicked at 'called `Option::unwrap()` on a `None` value', src/stats.rs:302:55
---------------------------------------------------------------------------
PanicException                            Traceback (most recent call last)
Cell In[82], line 12
      8 genesets = ['/Users/xxxxxxx/.cache/gseapy/enrichr.GO_Biological_Process_2018.gmt']
     10 for g in genesets:
---> 12     pre_res = gp.prerank(rnk=rnk, 
     13                          gene_sets=g,
     14                          processes=4,
     15                          permutation_num=100,
     16                          ascending=False,
     17                          outdir='test/prerank_report_kegg', 
     18                          format='png', 
     19                          seed=6,
     20                          min_size=0,
     21                          max_size=500,
     22                          verbose=True)
     24     pre_res = pre_res.res2d
     25     pre_res = pre_res[(pre_res.pval<=0.05)]

File /opt/homebrew/Caskroom/mambaforge/base/envs/scviEnv/lib/python3.9/site-packages/gseapy/__init__.py:358, in prerank(rnk, gene_sets, outdir, pheno_pos, pheno_neg, min_size, max_size, permutation_num, weighted_score_type, ascending, threads, figsize, format, graph_num, no_plot, seed, verbose, *arg, **kwarg)
    338     threads = kwarg["processes"]
    339 pre = Prerank(
    340     rnk,
    341     gene_sets,
   (...)
...
    462     indices=gsum.indices if isinstance(dat2, pd.DataFrame) else None,
    463 )
    464 if self._outdir is not None:

PanicException: called `Option::unwrap()` on a `None` value

Here is the output from using v0.10.7:

2023-07-11 06:34:48,840 Parsing data files for GSEA.............................
2023-07-11 06:34:52,536 0000 gene_sets have been filtered out when max_size=500 and min_size=0
2023-07-11 06:34:52,537 5103 gene_sets used for further statistical testing.....
2023-07-11 06:34:52,537 Start to run GSEA...Might take a while..................
/Users/xxxxxxx/miniconda3/envs/scanpyEnv/lib/python3.8/site-packages/gseapy/algorithm.py:71: RuntimeWarning: divide by zero encountered in true_divide
  norm_tag =  1.0/sum_correl_tag
/Users/xxxxxxx/miniconda3/envs/scanpyEnv/lib/python3.8/site-packages/gseapy/algorithm.py:74: RuntimeWarning: invalid value encountered in multiply
  RES = np.cumsum(tag_indicator * correl_vector * norm_tag - no_tag_indicator * norm_no_tag, axis=axis)
2023-07-11 06:35:12,418 Start to generate gseapy reports, and produce figures...
2023-07-11 06:35:18,310 Congratulations. GSEApy runs successfully................
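
The v0.10.7 warnings point at the normalisation `norm_tag = 1.0/sum_correl_tag`. Below is a simplified pure-Python sketch of that step, assuming the usual weighted running-sum formulation of the enrichment score. It is not GSEApy's code, and `running_es` is a hypothetical name. It shows where the score becomes undefined: when the genes of a set contribute a total weight of 0 (all their ranking values are 0, or the set has no hits in the ranked list).

```python
def running_es(rank_values, in_set, weight=1.0):
    """Weighted running enrichment score along a ranked gene list.

    rank_values: ranking metric per gene, already sorted
    in_set:      bools, True where the gene belongs to the gene set
    """
    # Each hit contributes |rank| ** weight; misses contribute nothing here.
    hit_weights = [abs(r) ** weight if hit else 0.0
                   for r, hit in zip(rank_values, in_set)]
    sum_hits = sum(hit_weights)  # plays the role of sum_correl_tag
    n_miss = sum(1 for hit in in_set if not hit)

    # This is the division that v0.10.7 only warns about.
    if sum_hits == 0 or n_miss == 0:
        raise ValueError(
            "enrichment score undefined: hit weights sum to 0 "
            "or the gene set covers every gene"
        )

    running, res = 0.0, []
    for w, hit in zip(hit_weights, in_set):
        running += w / sum_hits if hit else -1.0 / n_miss
        res.append(running)
    return res


res = running_es([2.0, 1.0, -1.0, -2.0], [True, False, True, False])
print(max(res, key=abs))  # the enrichment score is the extreme of the running sum

try:
    running_es([0.0, 0.0, 1.0], [True, True, False])
except ValueError as err:
    print(err)
```

With `min_size=0`, as in the invocation above, gene sets with few or no hits are not filtered out, which may make this degenerate case more likely.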

@zqfang
Owner

zqfang commented Jul 13, 2023

@gouinK, could you share the ranking list with me, if you don't mind?

@solo7773

Same problem for me in PyCharm and JupyterLab. I don't know how to fix it there, but running the .py file in the terminal, rather than in an interactive prompt, solves it.

@maarten-devries

This happened for me using gp.prerank when the sum of the rank vector was 0.
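
A pre-flight check along these lines might catch such input before it reaches the Rust backend. It is only a sketch: `validate_ranking` is a hypothetical helper, not a GSEApy function, the exact conditions that trigger the panic are not confirmed in this thread, and reading "sum is 0" as "every ranking value is 0" is an assumption.

```python
import math


def validate_ranking(ranking):
    """Raise ValueError for ranking values likely to break enrichment scoring.

    ranking: dict mapping gene symbol -> ranking value
    """
    if not ranking:
        raise ValueError("ranking is empty")

    # NaN or infinite values cannot be sorted or normalised meaningfully.
    non_finite = [g for g, v in ranking.items() if not math.isfinite(v)]
    if non_finite:
        raise ValueError(f"non-finite ranking values for: {non_finite[:5]}")

    # Assumption: an all-zero ranking gives a zero normalising sum.
    if sum(abs(v) for v in ranking.values()) == 0:
        raise ValueError("all ranking values are 0")

    return True


print(validate_ranking({"VIM": 1.5, "AGT": -0.7}))  # True

try:
    validate_ranking({"VIM": 0.0, "AGT": 0.0})
except ValueError as err:
    print(err)  # all ranking values are 0
```

A ranking held in a pandas Series could be passed as `rnk.to_dict()` before calling gp.prerank.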
