Thanks for releasing this great resource. I noticed some discrepancies between the semicolon-separated lists in clinical_significance_ordered and submitters_ordered; in 120 of the 749 example rows the two lists have different lengths:
In [1]: import pandas as pd
In [2]: df = pd.read_csv('clinvar_alleles_example_750_rows.single.b37.tsv', sep='\t')
In [3]: df.shape
Out[3]: (749, 39)
In [4]: for col in 'rcv scv clinical_significance_ordered submitters_ordered'.split():
   ...:     df['len_' + col] = df[col].apply(lambda x: len(x.split(';')))  # items per list
In [5]: diffs = df[df.len_clinical_significance_ordered != df.len_submitters_ordered]
In [6]: diffs.shape
Out[6]: (120, 43)
Ordered clinical significance doesn't seem to match the RCV or SCV lists either. Is this intended?
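For reference, here is a minimal sketch of how the RCV and SCV comparison could be run alongside the submitters one. The toy rows are invented for illustration and are not taken from the ClinVar file, and the helper names list_lengths and mismatch_counts are my own, not part of the repository.

```python
import pandas as pd

# Toy rows, invented for illustration only (not taken from the ClinVar file).
toy = pd.DataFrame({
    'rcv': ['RCV1;RCV2', 'RCV3'],
    'scv': ['SCV1;SCV2', 'SCV3;SCV4'],
    'clinical_significance_ordered': ['benign;pathogenic', 'benign'],
    'submitters_ordered': ['LabA;LabB', 'LabA;LabB'],
})

cols = ['rcv', 'scv', 'clinical_significance_ordered', 'submitters_ordered']


def list_lengths(df, cols, sep=';'):
    """Return a DataFrame holding the number of sep-separated items per column.

    Values are cast to str first, so a missing value counts as one item.
    """
    return pd.DataFrame(
        {c: df[c].astype(str).str.split(sep).str.len() for c in cols}
    )


def mismatch_counts(lengths, reference):
    """Count rows where each column's list length differs from the reference."""
    return {
        c: int((lengths[c] != lengths[reference]).sum())
        for c in lengths.columns
        if c != reference
    }


lengths = list_lengths(toy, cols)
counts = mismatch_counts(lengths, 'clinical_significance_ordered')
print(counts)
```

On the real file, the same two calls can be applied to df in place of toy to get a per-column count of rows whose list length disagrees with clinical_significance_ordered.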
Thanks