Skip to content

changebio/SEQLinkage

 
 

Repository files navigation

SEQLinkage

Collapsed Haplotype Pattern Method for Linkage Analysis of Next-Generation Sequencing Data

Features

  • It can do linkage analysis on single variants and CHP markers.
  • It can analyze families from different population.
  • It can handle large-scale whole-genome linkage analysis.

Pre-requisites

Make sure you install the pre-requisited before running seqlink:

#install cstatgen
conda install -c conda-forge xeus-cling
conda install -c anaconda swig 
conda install -c conda-forge gsl
pip install egglib
git clone https://github.com/statgenetics/cstatgen.git
cd cstatgen
python setup.py install

#install paramlink2
R
install.packages("paramlink2")

Install

pip install SEQLinkage

How to use

!seqlink --help
usage: seqlink [-h] [--single-marker] --fam FILE --vcf FILE [--anno FILE]
               [--pop FILE] [--included-vars FILE] [-b FILE] [-c P] [-o Name]
               [--build STRING] [--window INT] [--freq INFO]
               [--chrom-prefix STRING] [--run-linkage] [-K FLOAT]
               [--moi STRING] [-W FLOAT] [-M FLOAT] [--theta-max FLOAT]
               [--theta-inc FLOAT]

SEQLinkage V2, linkage analysis using sequence data

options:
  -h, --help            show this help message and exit

Collapsed haplotype pattern method arguments:
  --single-marker       Use single variant as the marker. Otherwise, use CHP
                        markers.
  --fam FILE            Input pedigree and phenotype information in FAM
                        format.
  --vcf FILE            Input VCF file, bgzipped.
  --anno FILE           Input annotation file from annovar.
  --pop FILE            Input two columns file, first column is family ID,
                        second column population information.
  --included-vars FILE  Variants to be included for linkage analysis, if None,
                        the analysis won't filter any variants. But you can
                        still set AF cutoff by -c argment.
  -b FILE, --blueprint FILE
                        Blueprint file that defines regional marker (format:
                        "chr startpos endpos name avg.distance male.distance
                        female.distance").
  -c P, --maf-cutoff P  MAF cutoff to define variants to be excluded from
                        analyses. this is useful, if you need to analyse
                        multiple population together.
  -o Name, --output Name
                        Output name prefix.
  --build STRING        Reference genome version for VCF file.
  --window INT          If no blueprint, seprate chromosome to pseudogenes
                        with 1000 (as default) variants.
  --freq INFO           Info field name for allele frequency in VCF file.
  --chrom-prefix STRING
                        Prefix to chromosome name in VCF file if applicable,
                        e.g. "chr".

LINKAGE options:
  --run-linkage         Perform Linkage analysis.
  -K FLOAT, --prevalence FLOAT
                        Disease prevalence. Default to 0.001.
  --moi STRING          Mode of inheritance, AD/AR: autosomal
                        dominant/recessive. Default to AD.
  -W FLOAT, --wt-pen FLOAT
                        Penetrance for wild type. Default to 0.01.
  -M FLOAT, --mut-pen FLOAT
                        Penetrance for mutation. Default to 0.9.
  --theta-max FLOAT     Theta upper bound. Default to 0.5.
  --theta-inc FLOAT     Theta increment. Default to 0.05.

Linkage analysis on specific regions

Normally, the regions are gene regions. you can also use self-defined regions, such as promoter regions, enhancer regions.

1.run seqlink on CHP marker

!seqlink --fam testdata/test_ped.fam --vcf testdata/test_snps.vcf.gz --anno testdata/test_chr1_anno.csv --pop testdata/test_fam_pop.txt --blueprint testdata/test_blueprint_ext.txt --included-vars testdata/test_chr1_included_vars.txt -o data/test_chp --run-linkage
�[1;40;32mMESSAGE: Binary trait detected in [/mnt/vast/hpc/csg/yin/Github/linkage/SEQpy3/nbs/testdata/test_ped.fam]�[0m
�[1;40;32mMESSAGE: Namespace(single_marker=False, tfam='/mnt/vast/hpc/csg/yin/Github/linkage/SEQpy3/nbs/testdata/test_ped.fam', vcf='/mnt/vast/hpc/csg/yin/Github/linkage/SEQpy3/nbs/testdata/test_snps.vcf.gz', anno='testdata/test_chr1_anno.csv', pop='testdata/test_fam_pop.txt', included_vars='testdata/test_chr1_included_vars.txt', blueprint='testdata/test_blueprint_ext.txt', maf_cutoff=None, output='data/test_chp', build='hg38', window=1000, freq='AF', chr_prefix=None, run_linkage=True, prevalence=0.001, inherit_mode='AD', wild_pen=0.01, muta_pen=0.9, theta_max=0.5, theta_inc=0.05)�[0m
�[1;40;32mMESSAGE: 18 samples found in FAM file but not in VCF file:�[0m

�[1;40;32mMESSAGE: 18 samples found in [/mnt/vast/hpc/csg/yin/Github/linkage/SEQpy3/nbs/testdata/test_snps.vcf.gz]�[0m
�[1;40;32mMESSAGE: Loading marker map from [testdata/test_blueprint_ext.txt] ...�[0m
�[1;40;32mMESSAGE: 6 families with a total of 18 samples will be scanned for 12 pre-defined units�[0m
SNVHap MIR6859-1@1,MIR6859-2@1,MIR6859-3@1,MIR6859-4@1
�[1;40;32mMESSAGE: write to pickle: data/test_chp/chr1result/chr1result0.pickle,Gene number:2,Time:5.62837730265326e-05�[0m
create data/test_chp/chr1result/chr1result0_AFcutoffNone_linkage.input
create data/test_chp/chr1result/chr1result0_AFcutoffNone_linkage.lods
0.21258915215730667
create data/test_chp/chr1result/chr1result0_AFcutoffNone_linkage.besthlod
�[1;40;32mMESSAGE: ============= Finish analysis ==============�[0m

2.run seqlink on variants

!seqlink --single-marker --fam testdata/test_ped.fam --vcf testdata/test_snps.vcf.gz --anno testdata/test_chr1_anno.csv --pop testdata/test_fam_pop.txt --blueprint testdata/test_blueprint_ext.txt -c 0.05 -o data/test_var --run-linkage
�[1;40;32mMESSAGE: Binary trait detected in [/mnt/vast/hpc/csg/yin/Github/linkage/SEQpy3/nbs/testdata/test_ped.fam]�[0m
�[1;40;32mMESSAGE: Namespace(single_marker=True, tfam='/mnt/vast/hpc/csg/yin/Github/linkage/SEQpy3/nbs/testdata/test_ped.fam', vcf='/mnt/vast/hpc/csg/yin/Github/linkage/SEQpy3/nbs/testdata/test_snps.vcf.gz', anno='testdata/test_chr1_anno.csv', pop='testdata/test_fam_pop.txt', included_vars=None, blueprint='testdata/test_blueprint_ext.txt', maf_cutoff=0.05, output='data/test_var', build='hg38', window=1000, freq='AF', chr_prefix=None, run_linkage=True, prevalence=0.001, inherit_mode='AD', wild_pen=0.01, muta_pen=0.9, theta_max=0.5, theta_inc=0.05)�[0m
�[1;40;32mMESSAGE: 18 samples found in FAM file but not in VCF file:�[0m

�[1;40;32mMESSAGE: 18 samples found in [/mnt/vast/hpc/csg/yin/Github/linkage/SEQpy3/nbs/testdata/test_snps.vcf.gz]�[0m
�[1;40;32mMESSAGE: Loading marker map from [testdata/test_blueprint_ext.txt] ...�[0m
�[1;40;32mMESSAGE: 6 families with a total of 18 samples will be scanned for 12 pre-defined units�[0m
�[1;40;32mMESSAGE: write to pickle: data/test_var/chr1result/chr1result0.pickle,Gene number:4,Time:4.1139241204493574e-05�[0m
create data/test_var/chr1result/chr1result0_AFcutoff0.05_linkage.input
create data/test_var/chr1result/chr1result0_AFcutoff0.05_linkage.lods
0.3724569082260132
create data/test_var/chr1result/chr1result0_AFcutoff0.05_linkage.besthlod
�[1;40;32mMESSAGE: ============= Finish analysis ==============�[0m

No annotation

If you don't have the annotation file. there is no need to add --pop. And --freq should be setted based on the INFO column in vcf file.

!seqlink --fam testdata/test_ped.fam --vcf testdata/test_snps.vcf.gz --freq='AF' --blueprint testdata/test_blueprint_ext.txt -c 0.05 -o data/test_chp_na --run-linkage
�[1;40;32mMESSAGE: Binary trait detected in [/mnt/vast/hpc/csg/yin/Github/linkage/SEQpy3/nbs/testdata/test_ped.fam]�[0m
�[1;40;32mMESSAGE: Namespace(single_marker=False, tfam='/mnt/vast/hpc/csg/yin/Github/linkage/SEQpy3/nbs/testdata/test_ped.fam', vcf='/mnt/vast/hpc/csg/yin/Github/linkage/SEQpy3/nbs/testdata/test_snps.vcf.gz', anno=None, pop=None, included_vars=None, blueprint='testdata/test_blueprint_ext.txt', maf_cutoff=0.05, output='data/test_chp_na', build='hg38', window=1000, freq='AF', chr_prefix=None, run_linkage=True, prevalence=0.001, inherit_mode='AD', wild_pen=0.01, muta_pen=0.9, theta_max=0.5, theta_inc=0.05)�[0m
�[1;40;32mMESSAGE: 18 samples found in FAM file but not in VCF file:�[0m

�[1;40;32mMESSAGE: 18 samples found in [/mnt/vast/hpc/csg/yin/Github/linkage/SEQpy3/nbs/testdata/test_snps.vcf.gz]�[0m
�[1;40;32mMESSAGE: Loading marker map from [testdata/test_blueprint_ext.txt] ...�[0m
�[1;40;32mMESSAGE: 6 families with a total of 18 samples will be scanned for 12 pre-defined units�[0m
SNVHap MIR6859-1@1,MIR6859-2@1,MIR6859-3@1,MIR6859-4@1
�[1;40;32mMESSAGE: write to pickle: data/test_chp_na/chrallresult/chrallresult0.pickle,Gene number:4,Time:9.55304606921143e-05�[0m
create data/test_chp_na/chrallresult/chrallresult0_AFcutoff0.05_linkage.input
create data/test_chp_na/chrallresult/chrallresult0_AFcutoff0.05_linkage.lods
0.3595982789993286
create data/test_chp_na/chrallresult/chrallresult0_AFcutoff0.05_linkage.besthlod
�[1;40;32mMESSAGE: ============= Finish analysis ==============�[0m

Whole-genome linkage analysis

if --blueprint is not provided, the genomic region will be seperated to pseudogenes with 1000 variants. you can change the variant number per pseudogene by --window.

!seqlink --single-marker --fam testdata/test_ped.fam --vcf testdata/test_snps.vcf.gz --anno testdata/test_chr1_anno.csv --pop testdata/test_fam_pop.txt -c 0.05 -o data/test_wg --run-linkage
�[1;40;32mMESSAGE: Binary trait detected in [/mnt/vast/hpc/csg/yin/Github/linkage/SEQpy3/nbs/testdata/test_ped.fam]�[0m
�[1;40;32mMESSAGE: Generate regions by annotation�[0m
�[1;40;32mMESSAGE: Namespace(single_marker=True, tfam='/mnt/vast/hpc/csg/yin/Github/linkage/SEQpy3/nbs/testdata/test_ped.fam', vcf='/mnt/vast/hpc/csg/yin/Github/linkage/SEQpy3/nbs/testdata/test_snps.vcf.gz', anno='testdata/test_chr1_anno.csv', pop='testdata/test_fam_pop.txt', included_vars=None, blueprint=None, maf_cutoff=0.05, output='data/test_wg', build='hg38', window=1000, freq='AF', chr_prefix=None, run_linkage=True, prevalence=0.001, inherit_mode='AD', wild_pen=0.01, muta_pen=0.9, theta_max=0.5, theta_inc=0.05)�[0m
�[1;40;32mMESSAGE: 18 samples found in FAM file but not in VCF file:�[0m

�[1;40;32mMESSAGE: 18 samples found in [/mnt/vast/hpc/csg/yin/Github/linkage/SEQpy3/nbs/testdata/test_snps.vcf.gz]�[0m
�[1;40;32mMESSAGE: separate chromosome to regions�[0m
�[1;40;32mMESSAGE: 6 families with a total of 18 samples will be scanned for 1 pre-defined units�[0m
�[1;40;32mMESSAGE: write to pickle: data/test_wg/chr1result/chr1result0.pickle,Gene number:1,Time:9.195781416363186e-05�[0m
create data/test_wg/chr1result/chr1result0_AFcutoff0.05_linkage.input
create data/test_wg/chr1result/chr1result0_AFcutoff0.05_linkage.lods
0.7846571207046509
create data/test_wg/chr1result/chr1result0_AFcutoff0.05_linkage.besthlod
�[1;40;32mMESSAGE: ============= Finish analysis ==============�[0m

Input format

  • --fam, Fam file (required, format: "fid iid fathid mothid sex trait[1 control, 2 case, -9 or 0 missing]")
%%writefile testdata/test_ped.fam
1033    1033_2  0       0       2       -9
1033    1033_1  0       0       1       -9
1033    1033_99 1033_1  1033_2  2       1
1033    1033_9  1033_1  1033_2  2       1
1033    1033_3  1033_1  1033_2  2       2
1036    1036_99 1036_1  1036_2  2       2
1036    1036_6  0       0       1       2
1036    1036_1  0       0       1       -9
1036    1036_3  1036_6  1036_99 2       1
1036    1036_4  1036_6  1036_99 2       1
1036    1036_2  0       0       2       -9
1036    1036_5  1036_6  1036_99 1       1
10J_103 10J_103_10      0       0       1       -9
10J_103 10J_103_4       0       0       1       -9
10J_103 10J_103_3       0       0       2       -9
10J_103 10J_103_2       10J_103_4       10J_103_3       2       2
10J_103 10J_103_1       10J_103_10      10J_103_3       1       2
10J_109 10J_109_2       10J_109_6       10J_109_5       1       2
10J_109 10J_109_3       10J_109_6       10J_109_5       1       2
10J_109 10J_109_4       10J_109_6       10J_109_5       1       2
10J_109 10J_109_6       0       0       1       -9
10J_109 10J_109_1       10J_109_6       10J_109_5       1       2
10J_109 10J_109_5       0       0       2       2
10J_109 10J_109_7       10J_109_6       10J_109_5       1       1
10J_112 10J_112_3       0       0       1       1
10J_112 10J_112_5       10J_112_3       10J_112_2       1       2
10J_112 10J_112_1       10J_112_3       10J_112_2       2       1
10J_112 10J_112_7       10J_112_3       10J_112_2       1       1
10J_112 10J_112_2       0       0       2       2
10J_119 10J_119_2       0       0       1       1
10J_119 10J_119_5       0       0       2       1
10J_119 10J_119_4       0       0       1       1
10J_119 10J_119_6       10J_119_4       10J_119_5       1       2
10J_119 10J_119_7       10J_119_4       10J_119_5       2       2
10J_119 10J_119_1       10J_119_4       10J_119_5       2       2
10J_119 10J_119_3       10J_119_2       10J_119_1       1       1
Overwriting ../testdata/test_ped.fam
  • --vcf, VCF file (required, vcf.gz with vcf.gz.tbi)
bgzip -c file.vcf > file.vcf.gz
tabix -p vcf file.vcf.gz
  • --anno, Annotation file from ANNOVAR, It must contains the allele frequency for population in the file of family population information. For example in here, The annotation file must have AF_amr, AF_afr, AF_nfe columns.
anno=pd.read_csv('testdata/test_chr1_anno.csv')
anno
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
Chr Start End Ref Alt Func.refGene Gene.refGene GeneDetail.refGene ExonicFunc.refGene AAChange.refGene ... CLNDISDB CLNREVSTAT CLNSIG DN ID Patient ID Phenotype Platform Study Pubmed ID Otherinfo1
0 1 10140 10147 ACCCTAAC A intergenic NONE;DDX11L1 dist=NONE;dist=1727 . . ... . . . . . . . . . chr1:10140:ACCCTAAC:A
1 1 10146 10147 AC A intergenic NONE;DDX11L1 dist=NONE;dist=1727 . . ... . . . . . . . . . chr1:10146:AC:A
2 1 10146 10148 ACC * . . . . . ... . . . . . . . . . chr1:10146:ACC:*
3 1 10150 10151 CT C intergenic NONE;DDX11L1 dist=NONE;dist=1723 . . ... . . . . . . . . . chr1:10150:CT:C
4 1 10172 10177 CCCTAA C intergenic NONE;DDX11L1 dist=NONE;dist=1697 . . ... . . . . . . . . . chr1:10172:CCCTAA:C
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
995 1 66479 66487 TATTTATAG * . . . . . ... . . . . . . . . . chr1:66479:TATTTATAG:*
996 1 66480 66481 AT A intergenic FAM138A;OR4F5 dist=30399;dist=2610 . . ... . . . . . . . . . chr1:66480:AT:A
997 1 66480 66483 ATTT * . . . . . ... . . . . . . . . . chr1:66480:ATTT:*
998 1 66481 66488 TTTATAGA T intergenic FAM138A;OR4F5 dist=30400;dist=2603 . . ... . . . . . . . . . chr1:66481:TTTATAGA:T
999 1 66481 66488 TTTATAGA * . . . . . ... . . . . . . . . . chr1:66481:TTTATAGA:*

1000 rows × 152 columns

anno.loc[:,['AF_amr', 'AF_afr', 'AF_nfe']]
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
AF_amr AF_afr AF_nfe
0 0.0006 0.0008 0.0007
1 0.6380 0.6300 0.6413
2 . . .
3 0.0357 0.0426 0.0370
4 0.0086 0.0097 0.0084
... ... ... ...
995 . . .
996 0.0039 0.0036 0.0076
997 . . .
998 0.0163 0.0410 0.0279
999 . . .

1000 rows × 3 columns

Or, create a self-defined annotation file like this:

	Chr	Start	AF_amr	AF	AF_nfe	AF_afr
chr1:10140:ACCCTAAC:A	1	10140	0.0006	0.0007	0.0007	0.0008
chr1:10146:AC:A	1	10146	0.638	0.6328	0.6413	0.63
chr1:10150:CT:C	1	10150	0.0357	0.0375	0.037	0.0426
chr1:10172:CCCTAA:C	1	10172	0.0086	0.0082	0.0084	0.0097
chr1:10178:CCTAA:C	1	10178	0.5	0.3333	0.2955	0.4375
chr1:10198:TAACCC:T	1	10198	0.0	0.0	0.0	0.0
chr1:10231:C:A	1	10231	0.2	0.0366	0.0	0.05
chr1:10236:AACCCT:A	1	10236	0.0	0.0	0.0	0.0
chr1:10247:TAAACCCTA:T	1	10247	0.2222	0.2089	0.1429	0.4211

The index must match with the ID in vcf file.

  • --pop, The file of family population information
%%writefile testdata/test_fam_pop.txt
1033 AF_amr
1036 AF_amr
10J_103 AF_afr
10J_109 AF_nfe
10J_112 AF_nfe
10J_119 AF_nfe
Writing ../testdata/test_fam_pop.txt
  • --included-vars, The file with one column of variants For example:
chr1:10140:ACCCTAAC:A
chr1:10172:CCCTAA:C
chr1:10198:TAACCC:T
chr1:10236:AACCCT:A
chr1:10261:T:TA
chr1:10262:AACCCT:A
  • --blueprint, The blueprint file that defines regional marker (format: "chr startpos endpos name avg.distance male.distance female.distance"). The first four columns are required.
%%writefile testdata/test_blueprint_ext.txt
1       11868   14362   LOC102725121@1  9.177127474362311e-07   1.1657192989882668e-06  6.814189157634088e-07
1       11873   14409   DDX11L1 9.195320788455595e-07   1.1680302941673515e-06  6.82769803434766e-07
1       14361   29370   WASH7P  1.5299877409602128e-06  1.94345806118021e-06    1.136044574393209e-06
1       17368   17436   MIR6859-1@1,MIR6859-2@1,MIR6859-3@1,MIR6859-4@1 1.217692507120495e-06   1.5467668502473368e-06  9.041595098829462e-07
1       30365   30503   MIR1302-10@1,MIR1302-11@1,MIR1302-2@1,MIR1302-9@1       2.1295973889038703e-06  2.705108741548526e-06   1.5812659765416382e-06
1       34610   36081   FAM138A@1,FAM138C@1,FAM138F@1   2.4732411024120156e-06  3.1416201771056266e-06  1.8364278747737466e-06
1       69090   70008   OR4F5   4.866641545668504e-06   6.181823219621424e-06   3.6135725636621673e-06
1       134772  140566  LOC729737       9.633289838108921e-06   1.2236630588823159e-05  7.152898262617822e-06
1       490755  495445  LOC100132062@1,LOC100132287@1   2.2828130832833112e-05  2.8997300893994373e-05  1.6950315013571593e-05
1       450739  451678  OR4F16@1,OR4F29@1,OR4F3@1       2.575942360468604e-05   3.2720758549649544e-05  1.912685483821856e-05
1       627379  629009  LOC101928626    3.943568768003252e-05   5.009295373297854e-05   2.9281737249900675e-05
1       632614  632685  MIR12136        3.974742311959244e-05   5.048893386847169e-05   2.9513206656665908e-05
Overwriting testdata/test_blueprint_ext.txt

Or

%%writefile testdata/test_blueprint.txt
1       11868   14362   LOC102725121@1
1       11873   14409   DDX11L1
1       14361   29370   WASH7P
1       17368   17436   MIR6859-1@1,MIR6859-2@1,MIR6859-3@1,MIR6859-4@1
1       30365   30503   MIR1302-10@1,MIR1302-11@1,MIR1302-2@1,MIR1302-9@1
1       34610   36081   FAM138A@1,FAM138C@1,FAM138F@1
1       69090   70008   OR4F5
1       134772  140566  LOC729737
1       490755  495445  LOC100132062@1,LOC100132287@1
1       450739  451678  OR4F16@1,OR4F29@1,OR4F3@1
1       627379  629009  LOC101928626
1       632614  632685  MIR12136
Overwriting testdata/test_blueprint.txt

Output format

- InfoFam: the number of families with the variant or the CHP marker.

LOD Score.

It is calculated from 0 to 0.5 with step 0.05 per family per gene. you can change them by --theta-inc and --theta-max.

- LOD0: the sum of LOD score at theta=0 among all families
- LODmax: the max of the sum of LOD score among all families between the range of thetas.

HLOD Score

- theta: the theta of best HLOD score.
- alpha: the alpha of best HLOD score.
- hlod: the max HLOD of these HLOD between the range of thetas.

The summary result of CHP markers

result=pd.read_csv('data/test_chp/chr1result_lod_summary.csv',index_col=0)
result
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
chrom start end name InfoFam LOD0 LODmax theta alpha hlod
MIR6859-1@1,MIR6859-2@1,MIR6859-3@1,MIR6859-4@1 1 17368 17436 MIR6859-1@1,MIR6859-2@1,MIR6859-3@1,MIR6859-4@1 3 -0.864448 0.000000 LOD0.5 0.0 0.000000
WASH7P 1 14361 29370 WASH7P 2 -0.507697 0.019594 LOD0.3 1.0 0.019594

The summary result of single variants

result=pd.read_csv('data/test_var/chr1result_lod_summary.csv',index_col=0)
result
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
chrom pos a0 a1 InfoFam LOD0 LODmax theta alpha hlod
chr1:13302:C:T chr1 13302 C T 1 -0.008503 0.000000 LOD0.5 0.000000 0.000000
chr1:13687:GCCAT:G chr1 13687 GCCAT G 1 0.024553 0.024553 LOD0.0 1.000000 0.024553
chr1:14464:A:T chr1 14464 A T 1 -0.113847 0.000000 LOD0.5 0.000000 0.000000
chr1:14470:G:A chr1 14470 G A 1 -0.122592 0.000000 LOD0.5 0.000000 0.000000
chr1:14773:C:T chr1 14773 C T 1 -0.007627 0.000000 LOD0.5 0.000000 0.000000
chr1:14843:G:A chr1 14843 G A 1 -0.280266 0.000000 LOD0.5 0.000000 0.000000
chr1:14933:G:A chr1 14933 G A 1 0.000000 0.000000 LOD0.5 0.000000 0.000000
chr1:16103:T:G chr1 16103 T G 4 -0.414168 0.079880 LOD0.0 0.376008 0.080043
chr1:17147:G:A chr1 17147 G A 1 -0.000219 0.000000 LOD0.5 0.000000 0.000000
chr1:17358:ACTT:A chr1 17358 ACTT A 1 0.000000 0.000000 LOD0.5 0.000000 0.000000
chr1:17379:G:A chr1 17379 G A 1 -0.741666 0.000000 LOD0.5 0.000000 0.000000
chr1:17406:C:T chr1 17406 C T 1 -0.122782 0.000000 LOD0.5 0.000000 0.000000
chr1:17407:G:A chr1 17407 G A 1 0.016021 0.016021 LOD0.0 1.000000 0.016021
chr1:17408:C:G chr1 17408 C G 1 0.356048 0.356048 LOD0.0 1.000000 0.356048
chr1:17519:G:T chr1 17519 G T 2 0.099799 0.112751 LOD0.0 0.728031 0.114024
chr1:17594:C:T chr1 17594 C T 2 0.100280 0.113080 LOD0.0 0.729654 0.114300
chr1:17614:G:A chr1 17614 G A 1 -0.120530 0.000000 LOD0.5 0.000000 0.000000
chr1:17716:G:A chr1 17716 G A 1 -0.122820 0.000000 LOD0.5 0.000000 0.000000
chr1:17722:A:G chr1 17722 A G 1 -0.122079 0.000000 LOD0.5 0.000000 0.000000
chr1:17767:G:A chr1 17767 G A 1 -0.122801 0.000000 LOD0.5 0.000000 0.000000
chr1:17928:T:A chr1 17928 T A 1 -0.000219 0.000000 LOD0.5 0.000000 0.000000
chr1:17929:C:A chr1 17929 C A 1 -0.000219 0.000000 LOD0.5 0.000000 0.000000
chr1:20184:A:G chr1 20184 A G 1 -0.000219 0.000000 LOD0.5 0.000000 0.000000
chr1:20231:T:G chr1 20231 T G 1 -0.741144 0.000000 LOD0.5 0.000000 0.000000
chr1:20235:G:A chr1 20235 G A 1 -0.118623 0.000000 LOD0.5 0.000000 0.000000
chr1:20443:G:A chr1 20443 G A 1 -0.280236 0.000000 LOD0.5 0.000000 0.000000
chr1:20485:CA:C chr1 20485 CA C 1 0.000000 0.000000 LOD0.5 0.000000 0.000000
chr1:20522:T:G chr1 20522 T G 1 0.000000 0.000000 LOD0.5 0.000000 0.000000
chr1:29368:G:A chr1 29368 G A 2 -0.280978 0.000000 LOD0.5 0.000000 0.000000

About

Collapsed Haplotype Pattern Method for Linkage Analysis of Next-Generation Sequencing Data

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Jupyter Notebook 62.5%
  • Python 36.8%
  • Other 0.7%