Different junction and cdr3 reported for same annotation in different sections #84

Open
zacmon opened this issue May 18, 2020 · 2 comments

zacmon commented May 18, 2020

Hi,
I have an annotation where some fields don't match and don't make sense. The annotation is for an IgG that is productive and has no indels. The 'cdr3_nt' and (trimmed) 'junc_nt' fields match. However, when I use the reported region lengths to extract the CDR3 sequence, the result does not match 'cdr3_nt'. Further, the 'junc_nt_breakdown' field doesn't match the 'junc_nt' field. Shouldn't fields that describe the same region agree with each other?
[Screenshot attached: "Screen Shot 2020-05-18 at 4 13 38 PM"]

@zacmon zacmon closed this as completed May 25, 2020
@zacmon zacmon reopened this May 25, 2020

zacmon commented May 25, 2020

I'd like to bring up a more concrete example. My input sequence is
'NNNNNNNNNNNNNNNNNNNNNNNNGCTGCCTCTGGAGGGTCTTTCAGTGGCTTCTATTGGAGTTGGGGTCGAGTCACCATATCAGTGGACAGGTCCAAGAACCAGTTCTCACTGAGGCTGAGGTCTGTGACCGCCGCGGACACGGCTGTTTATTACTGTGCGAGAGGCAGAGGTCTCTATTATGAGAGTAGTGGTGGACTTTATTACATGGACGTCTGGGGCGAAGGGACCACGGTCACCGTCTCCTCAGCCTCCACCAAGGGCCCATCGGTCTTCCCCCTGGCACCCTCCTCCAAGAGCACCACTCTGGGCACAGCGGCCCTGGGCTGCCTGGTCAGGGACTACTTCCCCGAACCGGTGACG'
and the abstar output for this sequence is

{'seq_id': 'M04617:111:000000000-CYR64:1:2118:20947:8496=81|CPRIMER=CHG-R|VPRIMER=IGHV4-F|DUPCOUNT=1|TIME=0|SEVERITY=Healthy|REPLICATE=0',
 'chain': 'heavy',
 'v_gene': {'full': 'IGHV4-34*02',
  'fam': 'IGHV4',
  'gene': 'IGHV4-34',
  'score': 271,
  'assigner_score': 141.062,
  'others': [{'full': 'IGHV4-34*01', 'assigner_score': 141.062},
   {'full': 'IGHV4-34*05', 'assigner_score': 137.892},
   {'full': 'IGHV4-34*04', 'assigner_score': 137.892},
   {'full': 'IGHV4-34*12', 'assigner_score': 137.892},
   {'full': 'IGHV4-4*02', 'assigner_score': 136.307}]},
 'd_gene': {'full': 'IGHD3-22*01',
  'fam': 'IGHD3',
  'gene': 'IGHD3-22',
  'score': 44,
  'assigner_score': 44,
  'others': [{'full': 'IGHD3-3*01', 'assigner_score': 31},
   {'full': 'IGHD2-15*01', 'assigner_score': 30},
   {'full': 'IGHD3-9*01', 'assigner_score': 30},
   {'full': 'IGHD3-22*01', 'assigner_score': 29},
   {'full': 'IGHD2-8*02', 'assigner_score': 27}]},
 'j_gene': {'full': 'IGHJ6*04',
  'gene': 'IGHJ6',
  'score': 124,
  'assigner_score': 66.5684,
  'others': [{'full': 'IGHJ6*01', 'score': 63.3985},
   {'full': 'IGHJ4*03', 'score': 52.3038},
   {'full': 'IGHJ4*01', 'score': 49.1338},
   {'full': 'IGHJ3*01', 'score': 49.1338},
   {'full': 'IGHJ4*02', 'score': 45.9639}]},
 'assigner_scores': {'v': 141.062, 'j': 66.5684, 'd': 44},
 'vdj_assigner': 'blastn',
 'isotype': 'IgG1',
 'isotype_score': 319,
 'isotype_alignment': {'query': 'CCTCCACCAAGGGCCCATCGGTCTTCCCCCTGGCACCCTCCTCCAAGAGCACCACTCTGGGCACAGCGGCCCTGGGCTGCCTGGTCAGGGACTACTTCCCCGAACCGGTGACG',
  'midline': '||||||||||||||||||||||||||||||||||||||||||||||||||||| ||  ||||||||||||||||||||||||||||| |||||||||||||||||||||||||',
  'isotype': 'CCTCCACCAAGGGCCCATCGGTCTTCCCCCTGGCACCCTCCTCCAAGAGCACCTCTGGGGGCACAGCGGCCCTGGGCTGCCTGGTCAAGGACTACTTCCCCGAACCGGTGACG'},
 'nt_identity': {'v': 93.13725490196079, 'j': 97.67441860465117},
 'aa_identity': {'v': 82.35294117647058, 'j': 92.3076923076923},
 'junc_len': 21,
 'cdr3_len': 19,
 'vdj_nt': 'GGGTCGAGTCACCATATCAGTGGACAGGTCCAAGAACCAGTTCTCACTGAGGCTGAGGTCTGTGACCGCCGCGGACACGGCTGTTTATTACTGTGCGAGAGGCAGAGGTCTCTATTATGAGAGTAGTGGTGGACTTTATTACATGGACGTCTGGGGCGAAGGGACCACGGTCACCGTCTCCTCAG',
 'gapped_vdj_nt': 'GGGTCGAGTCACCATATCAGTGGACAGGTCCAAGAACCAGTTCTCACTGAGGCTGAGGTCTGTGACCGCCGCGGACACGGCTGTTTATTACTGTGCGAGAGGCAGAGGTCTCTATTATGAGAGTAGTGGTGGACTTTATTACATGGACGTCTGGGGCGAAGGGACCACGGTCACCGTCTCCTCAG',
 'fr3_nt': 'GGGTCGAGTCACCATATCAGTGGACAGGTCCAAGAACCAGTTCTCACTGAGGCTGAGGTCTGTGACCGCCGCGGACACGGCTGTTTATTACTG',
 'cdr3_nt': 'GCGAGAGGCAGAGGTCTCTATTATGAGAGTAGTGGTGGACTTTATTACATGGACGTC',
 'fr4_nt': 'TGGGGCGAAGGGACCACGGTCACCGTCTCCTCA',
 'vdj_germ_nt': 'GAGTCGAGTCACCATATCAGTAGACACGTCCAAGAACCAGTTCTCCCTGAAGCTGAGCTCTGTGACCGCCGCGGACACGGCTGTGTATTACTGTGCGAGAGGCAGAGGTCTCTACTATGATAGTAGTGGTGGACTTTATTACATGGACGTCTGGGGCAAAGGGACCACGGTCACCGTCTCCTCAG',
 'gapped_vdj_germ_nt': 'GAGTCGAGTCACCATATCAGTAGACACGTCCAAGAACCAGTTCTCCCTGAAGCTGAGCTCTGTGACCGCCGCGGACACGGCTGTGTATTACTGTGCGAGAGGCAGAGGTCTCTACTATGATAGTAGTGGTGGACTTTATTACATGGACGTCTGGGGCAAAGGGACCACGGTCACCGTCTCCTCAG',
 'junc_nt': 'TGTGCGAGAGGCAGAGGTCTCTATTATGAGAGTAGTGGTGGACTTTATTACATGGACGTCTGG',
 'region_len_nt': {'fr1': 0,
  'cdr1': 0,
  'fr2': 0,
  'cdr2': 0,
  'fr3': 93,
  'cdr3': 57,
  'fr4': 33},
 'var_muts_nt': {'num': 7,
  'muts': [{'was': 'A',
    'is': 'G',
    'raw_position': 66,
    'position': 221,
    'codon': 74},
   {'was': 'A', 'is': 'G', 'raw_position': 86, 'position': 241, 'codon': 81},
   {'was': 'C', 'is': 'G', 'raw_position': 91, 'position': 246, 'codon': 82},
   {'was': 'C', 'is': 'A', 'raw_position': 110, 'position': 265, 'codon': 89},
   {'was': 'A', 'is': 'G', 'raw_position': 115, 'position': 270, 'codon': 90},
   {'was': 'C', 'is': 'G', 'raw_position': 122, 'position': 277, 'codon': 93},
   {'was': 'G',
    'is': 'T',
    'raw_position': 149,
    'position': 304,
    'codon': 102}]},
 'join_muts_nt': {'num': 1,
  'muts': [{'was': 'A',
    'is': 'G',
    'raw_position': 222,
    'position': 358,
    'codon': 120}]},
 'mut_count_nt': 8,
 'vdj_aa': 'GRVTISVDRSKNQFSLRLRSVTAADTAVYYCARGRGLYYESSGGLYYMDVWGEGTTVTVSS',
 'fr3_aa': 'GSSHHISGQVQEPVLTEAEVCDRRGHGCLLL',
 'cdr3_aa': 'ARGRGLYYESSGGLYYMDV',
 'fr4_aa': 'WGEGTTVTVSS',
 'vdj_germ_aa': 'SRVTISVDTSKNQFSLKLSSVTAADTAVYYCARGRGLYYDSSGGLYYMDVWGKGTTVTVSS',
 'junc_aa': 'CARGRGLYYESSGGLYYMDVW',
 'region_len_aa': {'fr1': 0,
  'cdr1': 0,
  'fr2': 0,
  'cdr2': 0,
  'fr3': 31,
  'cdr3': 19,
  'fr4': 11},
 'var_muts_aa': {'num': 6,
  'muts': [{'was': 'E',
    'is': 'G',
    'raw_position': None,
    'position': 74,
    'codon': 74},
   {'was': 'R', 'is': 'G', 'raw_position': None, 'position': 81, 'codon': 81},
   {'was': 'H', 'is': 'Q', 'raw_position': None, 'position': 82, 'codon': 82},
   {'was': 'P', 'is': 'T', 'raw_position': None, 'position': 89, 'codon': 89},
   {'was': 'L', 'is': 'V', 'raw_position': None, 'position': 93, 'codon': 93},
   {'was': 'V',
    'is': 'L',
    'raw_position': None,
    'position': 102,
    'codon': 102}]},
 'join_muts_aa': {'num': 1,
  'muts': [{'was': 'K',
    'is': 'E',
    'raw_position': None,
    'position': 120.0,
    'codon': 120.0}]},
 'region_muts_nt': {'fr1': {'num': 0, 'muts': []},
  'cdr1': {'num': 0, 'muts': []},
  'fr2': {'num': 0, 'muts': []},
  'cdr2': {'num': 0, 'muts': []},
  'fr3': {'num': 7,
   'muts': [{'was': 'A',
     'is': 'G',
     'raw_position': 66,
     'position': 221,
     'codon': 74},
    {'was': 'A', 'is': 'G', 'raw_position': 86, 'position': 241, 'codon': 81},
    {'was': 'C', 'is': 'G', 'raw_position': 91, 'position': 246, 'codon': 82},
    {'was': 'C', 'is': 'A', 'raw_position': 110, 'position': 265, 'codon': 89},
    {'was': 'A', 'is': 'G', 'raw_position': 115, 'position': 270, 'codon': 90},
    {'was': 'C', 'is': 'G', 'raw_position': 122, 'position': 277, 'codon': 93},
    {'was': 'G',
     'is': 'T',
     'raw_position': 149,
     'position': 304,
     'codon': 102}]},
  'fr4': {'num': 1,
   'muts': [{'was': 'A',
     'is': 'G',
     'raw_position': 222,
     'position': 358,
     'codon': 120}]}},
 'region_muts_aa': {'fr1': {'num': 0, 'muts': []},
  'cdr1': {'num': 0, 'muts': []},
  'fr2': {'num': 0, 'muts': []},
  'cdr2': {'num': 0, 'muts': []},
  'fr3': {'num': 6,
   'muts': [{'was': 'E',
     'is': 'G',
     'raw_position': None,
     'position': 74,
     'codon': 74},
    {'was': 'R', 'is': 'G', 'raw_position': None, 'position': 81, 'codon': 81},
    {'was': 'H', 'is': 'Q', 'raw_position': None, 'position': 82, 'codon': 82},
    {'was': 'P', 'is': 'T', 'raw_position': None, 'position': 89, 'codon': 89},
    {'was': 'L', 'is': 'V', 'raw_position': None, 'position': 93, 'codon': 93},
    {'was': 'V',
     'is': 'L',
     'raw_position': None,
     'position': 102,
     'codon': 102}]},
  'fr4': {'num': 1,
   'muts': [{'was': 'K',
     'is': 'E',
     'raw_position': None,
     'position': 120.0,
     'codon': 120.0}]}},
 'prod': 'yes',
 'junction_in_frame': 'yes',
 'raw_input': 'NNNNNNNNNNNNNNNNNNNNNNNNGCTGCCTCTGGAGGGTCTTTCAGTGGCTTCTATTGGAGTTGGGGTCGAGTCACCATATCAGTGGACAGGTCCAAGAACCAGTTCTCACTGAGGCTGAGGTCTGTGACCGCCGCGGACACGGCTGTTTATTACTGTGCGAGAGGCAGAGGTCTCTATTATGAGAGTAGTGGTGGACTTTATTACATGGACGTCTGGGGCGAAGGGACCACGGTCACCGTCTCCTCAGCCTCCACCAAGGGCCCATCGGTCTTCCCCCTGGCACCCTCCTCCAAGAGCACCACTCTGGGCACAGCGGCCCTGGGCTGCCTGGTCAGGGACTACTTCCCCGAACCGGTGACG',
 'oriented_input': 'NNNNNNNNNNNNNNNNNNNNNNNNGCTGCCTCTGGAGGGTCTTTCAGTGGCTTCTATTGGAGTTGGGGTCGAGTCACCATATCAGTGGACAGGTCCAAGAACCAGTTCTCACTGAGGCTGAGGTCTGTGACCGCCGCGGACACGGCTGTTTATTACTGTGCGAGAGGCAGAGGTCTCTATTATGAGAGTAGTGGTGGACTTTATTACATGGACGTCTGGGGCGAAGGGACCACGGTCACCGTCTCCTCAGCCTCCACCAAGGGCCCATCGGTCTTCCCCCTGGCACCCTCCTCCAAGAGCACCACTCTGGGCACAGCGGCCCTGGGCTGCCTGGTCAGGGACTACTTCCCCGAACCGGTGACG',
 'germ_alignments_nt': {'var': {'query': 'GGGTCGAGTCACCATATCAGTGGACAGGTCCAAGAACCAGTTCTCACTGAGGCTGAGGTCTGTGACCGCCGCGGACACGGCTGTTTATTACTGTGCGAGAGG',
   'germ': 'GAGTCGAGTCACCATATCAGTAGACACGTCCAAGAACCAGTTCTCCCTGAAGCTGAGCTCTGTGACCGCCGCGGACACGGCTGTGTATTACTGTGCGAGAGG',
   'midline': '| ||||||||||||||||||| |||| |||||||||||||||||| |||| |||||| |||||||||||||||||||||||||| |||||||||||||||||'},
  'join': {'query': 'ATGGACGTCTGGGGCGAAGGGACCACGGTCACCGTCTCCTCAG',
   'germ': 'ATGGACGTCTGGGGCAAAGGGACCACGGTCACCGTCTCCTCAG',
   'midline': '||||||||||||||| |||||||||||||||||||||||||||'},
  'div': {'query': 'TATTATGAGAGTAGTGGT',
   'germ': 'TACTATGATAGTAGTGGT',
   'midline': '|| ||||| |||||||||'}},
 'exo_trimming': {'var_3': 0, 'join_5': 20, 'div_5': 4, 'div_3': 9},
 'junc_nt_breakdown': {'v_nt': 'TGTGCGAGAGG',
  'n1_nt': 'CAGAGGTCTC',
  'd_nt': 'TATTATGAGAGTAGTGGT',
  'n2_nt': 'GGACTTTATTAC',
  'j_nt': 'TGTGCGAGAGG',
  'd_dist_from_cdr3_start': 18,
  'd_dist_from_cdr3_end': 21},
 'align_info': {'v_start': 191,
  'v_end': 292,
  'j_start': 20,
  'j_end': 62,
  'd_start': 4,
  'd_end': 21}}
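
A minimal sketch of the 'junc_nt_breakdown' check, using only values copied from the annotation above. The helper names `rebuild_junction` and `implied_j_nt` are made up for illustration and are not part of abstar. It concatenates the breakdown parts, compares them to 'junc_nt', and reports what the J portion would have to be for the two to agree.

```python
# Fields copied from the first annotation in this report.
junc_nt = "TGTGCGAGAGGCAGAGGTCTCTATTATGAGAGTAGTGGTGGACTTTATTACATGGACGTCTGG"
breakdown = {
    "v_nt": "TGTGCGAGAGG",
    "n1_nt": "CAGAGGTCTC",
    "d_nt": "TATTATGAGAGTAGTGGT",
    "n2_nt": "GGACTTTATTAC",
    "j_nt": "TGTGCGAGAGG",
}
ORDER = ("v_nt", "n1_nt", "d_nt", "n2_nt", "j_nt")


def rebuild_junction(parts):
    """Concatenate the breakdown parts in V, N1, D, N2, J order."""
    return "".join(parts[key] for key in ORDER)


def implied_j_nt(junction, parts):
    """Return what is left of the junction after V, N1, D and N2.

    Returns None if the junction does not start with those four parts.
    """
    prefix = "".join(parts[key] for key in ORDER[:-1])
    if not junction.startswith(prefix):
        return None
    return junction[len(prefix):]


rebuilt = rebuild_junction(breakdown)
print(rebuilt == junc_nt)                      # False
print(breakdown["j_nt"] == breakdown["v_nt"])  # True
print(implied_j_nt(junc_nt, breakdown))        # ATGGACGTCTGG
```

In this record the reported 'j_nt' is identical to 'v_nt', while the remainder of 'junc_nt' after V, N1, D and N2 is 'ATGGACGTCTGG', which is also how the 'join' query in 'germ_alignments_nt' begins. Whether that is the intended value of 'j_nt' is for the maintainers to confirm.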

The length of vdj_nt is 185, but the region lengths sum to 183. Why are two nucleotides missing from the region lengths?
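
To see where the two unaccounted nucleotides sit, here is a small sketch that walks the reported region strings through vdj_nt in order and lists any bases that no region covers. The sequences are copied from the annotation above; the function name `uncovered_bases` is hypothetical, and the approach assumes the regions appear in vdj_nt in the order FR3, CDR3, FR4.

```python
# Sequences copied from the first annotation in this report.
vdj_nt = "GGGTCGAGTCACCATATCAGTGGACAGGTCCAAGAACCAGTTCTCACTGAGGCTGAGGTCTGTGACCGCCGCGGACACGGCTGTTTATTACTGTGCGAGAGGCAGAGGTCTCTATTATGAGAGTAGTGGTGGACTTTATTACATGGACGTCTGGGGCGAAGGGACCACGGTCACCGTCTCCTCAG"
regions = {
    "fr3": "GGGTCGAGTCACCATATCAGTGGACAGGTCCAAGAACCAGTTCTCACTGAGGCTGAGGTCTGTGACCGCCGCGGACACGGCTGTTTATTACTG",
    "cdr3": "GCGAGAGGCAGAGGTCTCTATTATGAGAGTAGTGGTGGACTTTATTACATGGACGTC",
    "fr4": "TGGGGCGAAGGGACCACGGTCACCGTCTCCTCA",
}


def uncovered_bases(vdj, region_seqs):
    """Return (start, end, bases) for stretches of vdj that no region covers.

    Regions are searched left to right, each one starting no earlier than
    the end of the previous region.
    """
    gaps = []
    cursor = 0
    for name, seq in region_seqs.items():
        start = vdj.find(seq, cursor)
        if start == -1:
            raise ValueError(f"{name} not found in vdj at or after position {cursor}")
        if start > cursor:
            gaps.append((cursor, start, vdj[cursor:start]))
        cursor = start + len(seq)
    if cursor < len(vdj):
        gaps.append((cursor, len(vdj), vdj[cursor:]))
    return gaps


print(len(vdj_nt), sum(len(s) for s in regions.values()))  # 185 183
print(uncovered_bases(vdj_nt, regions))  # [(93, 94, 'T'), (184, 185, 'G')]
```

For this record, one uncovered base falls between fr3_nt and cdr3_nt and the other is the final base of vdj_nt. This only locates the missing nucleotides; it does not say why abstar leaves them out.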

If I run

# Sum the lengths of every region upstream of CDR3 (FR1 through FR3).
cdr3_start = sum(length
                 for region, length in in_a['region_len_nt'].items()
                 if region not in ('cdr3', 'fr4'))
cdr3_end = cdr3_start + in_a['region_len_nt']['cdr3']
cdr3_from_lengths = in_a['vdj_nt'][cdr3_start:cdr3_end]

and compare cdr3_from_lengths to

in_a['cdr3_nt']

the two should be the same. In my sample of 94942 annotations, that holds for 90163 of them, including sequences with SHM indels. Why is there a discrepancy for the remaining 4779? This is very concerning if I am to extract the germline CDR3 and find a consensus progenitor for my clonotypes. It would presumably also cause difficulties for phylogenetic analysis of clonotypes, no?
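
One way to characterize the mismatches across a whole dataset is to tally how far the observed CDR3 start is from the start implied by the region lengths. The sketch below does that on three toy records; they are hypothetical and only mimic the field names used above, they are not real abstar output. The function name `cdr3_start_offset` is also made up for illustration.

```python
from collections import Counter

# Regions that precede CDR3; missing keys are treated as length 0.
UPSTREAM = ("fr1", "cdr1", "fr2", "cdr2", "fr3")


def cdr3_start_offset(ann):
    """Observed CDR3 start in vdj_nt minus the start implied by region lengths.

    Returns None when cdr3_nt is not a substring of vdj_nt. Uses the first
    occurrence of cdr3_nt, which could mislead for very short CDR3s.
    """
    expected = sum(ann["region_len_nt"].get(region, 0) for region in UPSTREAM)
    observed = ann["vdj_nt"].find(ann["cdr3_nt"])
    return None if observed == -1 else observed - expected


# Toy records (hypothetical) covering three situations.
toy = [
    # Region lengths and cdr3_nt agree.
    {"vdj_nt": "AAACCCGGGTTT", "cdr3_nt": "GGG",
     "region_len_nt": {"fr3": 6, "cdr3": 3, "fr4": 3}},
    # One base sits between FR3 and CDR3, so the slice is shifted by one.
    {"vdj_nt": "AAACCCTGGGTT", "cdr3_nt": "GGG",
     "region_len_nt": {"fr3": 6, "cdr3": 3, "fr4": 2}},
    # cdr3_nt runs past the end of vdj_nt.
    {"vdj_nt": "AAACCCGG", "cdr3_nt": "GGGTTT",
     "region_len_nt": {"fr3": 6, "cdr3": 6, "fr4": 0}},
]

offsets = [cdr3_start_offset(ann) for ann in toy]
print(offsets)           # [0, 1, None]
print(Counter(offsets))
```

Run over the real annotations, a tally like this would show whether the mismatching records share a common shift or fall into several distinct groups.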

Using

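def abstar_cdr3(in_a):
    # Find where cdr3_nt starts inside vdj_nt, then slice the reported CDR3 length.
    try:
        cdr3_start = in_a['vdj_nt'].index(in_a['cdr3_nt'])
        cdr3_end = cdr3_start + in_a['region_len_nt']['cdr3']
        return in_a['vdj_nt'][cdr3_start:cdr3_end]
    except (KeyError, ValueError):
        # A field is missing, or cdr3_nt is not a substring of vdj_nt.
        return None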

gives better results: 94918 of the 94942 annotations yield the correct CDR3. Still, I don't understand why 24 fail. Only 1 of these 24 has SHM indels. For all 24, the reported cdr3_nt is present in the raw sequence. 23 of the 24 are out of frame; one is in frame. Why is the cdr3_nt not in the vdj_nt string? This makes no sense to me. Could you please clarify?

Here is an example of one of the out-of-frame annotations among the 24 that fail:

{'seq_id': 'M04617:111:000000000-CYR64:1:2111:14734:3825=81|CPRIMER=CHG-R|VPRIMER=IGHV4-F|DUPCOUNT=1|TIME=0|SEVERITY=Healthy|REPLICATE=0',
 'chain': 'heavy',
 'v_gene': {'full': 'IGHV4-39*07',
  'fam': 'IGHV4',
  'gene': 'IGHV4-39',
  'score': 591,
  'assigner_score': 302.728,
  'others': [{'full': 'IGHV4-39*01', 'assigner_score': 301.143},
   {'full': 'IGHV4-39*06', 'assigner_score': 299.558},
   {'full': 'IGHV4-39*02', 'assigner_score': 299.558},
   {'full': 'IGHV4-30-2*03', 'assigner_score': 280.538},
   {'full': 'IGHV4-61*05', 'assigner_score': 278.953}]},
 'd_gene': {'full': 'IGHD6-13*01',
  'fam': 'IGHD6',
  'gene': 'IGHD6-13',
  'score': 36,
  'assigner_score': 36,
  'others': [{'full': 'IGHD2-2*01', 'assigner_score': 33},
   {'full': 'IGHD2-2*03', 'assigner_score': 33},
   {'full': 'IGHD2-2*02', 'assigner_score': 33},
   {'full': 'IGHD2-2*01', 'assigner_score': 30},
   {'full': 'IGHD2-2*03', 'assigner_score': 30}]},
 'j_gene': {'full': 'IGHJ1*01',
  'gene': 'IGHJ1',
  'score': 52,
  'assigner_score': 22.1895,
  'others': [{'full': 'IGHJ2*01', 'score': 20.6045},
   {'full': 'IGHJ5*02', 'score': 19.0196},
   {'full': 'IGHJ4*02', 'score': 19.0196},
   {'full': 'IGKJ3*01', 'score': 17.4346},
   {'full': 'IGLJ7*01', 'score': 15.8496}]},
 'assigner_scores': {'v': 302.728, 'j': 22.1895, 'd': 36},
 'vdj_assigner': 'blastn',
 'isotype': 'IgG2',
 'isotype_score': 329,
 'isotype_alignment': {'query': 'CCTCCACCAAGGGCCCATCGGTCTTCCCCCTGGCACCCTGCTCCAGGAGCACCTCCGAGAGCACAGCGGCCCTGGGCTGCCTGGTCAAGGACTACTTCCCCGAACCGGTGACG',
  'midline': '|||||||||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||||',
  'isotype': 'CCTCCACCAAGGGCCCATCGGTCTTCCCCCTGGCGCCCTGCTCCAGGAGCACCTCCGAGAGCACAGCCGCCCTGGGCTGCCTGGTCAAGGACTACTTCCCCGAACCGGTGACG'},
 'nt_identity': {'v': 90.94827586206897, 'j': 61.224489795918366},
 'aa_identity': {'v': 84.41558441558442, 'j': 37.5},
 'junc_len': 49,
 'cdr3_len': 47,
 'vdj_nt': 'ACTGTCTCTGGTGCCTCCATCACCAGTAGTATTTACTACTGGGGCTGGATCCGCCAGTCCCCAGGGAAGGGCCTGGAGTGGATTGGGAGTATATATTATAGTGGGAACTCCTTCTACCAGCCGTCCCTCAAGAGTCGAATCACCATGGCCGTAGACACGTCCAAGAACCAGTTCTCCCTGAAACTCAGCTCTGTGACCGCCGCTGACACGGCCATCTATTACTGCGCGAGAGTCTTCAGCAGCTGGTATGTCGGATGGTTCACCCCCGGGTTCCAGTGAACGCTGGACACAGTGTGGTGAG',
 'gapped_vdj_nt': 'ACTGTCTCTGGTGCCTCCATCACCAGTAGTATTTACTACTGGGGCTGGATCCGCCAGTCCCCAGGGAAGGGCCTGGAGTGGATTGGGAGTATATATTATAGTGGGAACTCCTTCTACCAGCCGTCCCTCAAGAGTCGAATCACCATGGCCGTAGACACGTCCAAGAACCAGTTCTCCCTGAAACTCAGCTCTGTGACCGCCGCTGACACGGCCATCTATTACTGCGCGAGAGTCTTCAGCAGCTGGTATGTCGGATGGTTCACCCCCGGGTTCCAGTGAACGCTGGACACAGTGTGGTGAG',
 'fr1_nt': 'ACTGTCTCT',
 'cdr1_nt': 'GGTGCCTCCATCACCAGTAGTATTTACTAC',
 'fr2_nt': 'TGGGGCTGGATCCGCCAGTCCCCAGGGAAGGGCCTGGAGTGGATTGGGAGT',
 'cdr2_nt': 'ATATATTATAGTGGGAACTCC',
 'fr3_nt': 'TTCTACCAGCCGTCCCTCAAGAGTCGAATCACCATGGCCGTAGACACGTCCAAGAACCAGTTCTCCCTGAAACTCAGCTCTGTGACCGCCGCTGACACGGCCATCTATTACTGC',
 'cdr3_nt': 'GCGAGAGTCTTCAGCAGCTGGTATGTCGGATGGTTCACCCCCGGGTTCCAGTGAACGCTGGACACAGTGTGGTGAGCCTCCACCAAGGGCCCATCGGTCTTCCCCCTGGCACCCTGCTCCAGGAGCACCTCCGAGAGCACA',
 'fr4_nt': 'GGGTTCCAGTGAACGCTGGACACAGTGTGGTGA',
 'vdj_germ_nt': 'ACTGTCTCTGGTGGCTCCATCAGCAGTAGTAGTTACTACTGGGGCTGGATCCGCCAGCCCCCAGGGAAGGGGCTGGAGTGGATTGGGAGTATCTATTATAGTGGGAGCACCTACTACAACCCGTCCCTCAAGAGTCGAGTCACCATATCAGTAGACACGTCCAAGAACCAGTTCTCCCTGAAGCTGAGCTCTGTGACCGCCGCGGACACGGCCGTGTATTACTGTGCGAGAGTCTTCAGCAGCTGGTATGTCGAATACTTCCAGCACTGGGGCCAGGGCACCCTGGTCACCGTCTCCTCAG',
 'gapped_vdj_germ_nt': 'ACTGTCTCTGGTGGCTCCATCAGCAGTAGTAGTTACTACTGGGGCTGGATCCGCCAGCCCCCAGGGAAGGGGCTGGAGTGGATTGGGAGTATCTATTATAGTGGGAGCACCTACTACAACCCGTCCCTCAAGAGTCGAGTCACCATATCAGTAGACACGTCCAAGAACCAGTTCTCCCTGAAGCTGAGCTCTGTGACCGCCGCGGACACGGCCGTGTATTACTGTGCGAGAGTCTTCAGCAGCTGGTATGTCGAATACTTCCAGCACTGGGGCCAGGGCACCCTGGTCACCGTCTCCTCAG',
 'junc_nt': 'TGCGCGAGAGTCTTCAGCAGCTGGTATGTCGGATGGTTCACCCCCGGGTTCCAGTGAACGCTGGACACAGTGTGGTGAGCCTCCACCAAGGGCCCATCGGTCTTCCCCCTGGCACCCTGCTCCAGGAGCACCTCCGAGAGCACAGCG',
 'region_len_nt': {'fr1': 9,
  'cdr1': 30,
  'fr2': 51,
  'cdr2': 21,
  'fr3': 114,
  'cdr3': 141,
  'fr4': 33},
 'var_muts_nt': {'num': 21,
  'muts': [{'was': 'G',
    'is': 'C',
    'raw_position': 37,
    'position': 83,
    'codon': 28},
   {'was': 'G', 'is': 'C', 'raw_position': 46, 'position': 92, 'codon': 31},
   {'was': 'G', 'is': 'T', 'raw_position': 55, 'position': 107, 'codon': 36},
   {'was': 'C', 'is': 'T', 'raw_position': 81, 'position': 133, 'codon': 45},
   {'was': 'G', 'is': 'C', 'raw_position': 95, 'position': 147, 'codon': 49},
   {'was': 'C', 'is': 'A', 'raw_position': 116, 'position': 168, 'codon': 56},
   {'was': 'G', 'is': 'A', 'raw_position': 130, 'position': 191, 'codon': 64},
   {'was': 'A', 'is': 'T', 'raw_position': 132, 'position': 193, 'codon': 65},
   {'was': 'A', 'is': 'T', 'raw_position': 136, 'position': 197, 'codon': 66},
   {'was': 'A', 'is': 'C', 'raw_position': 141, 'position': 202, 'codon': 68},
   {'was': 'C', 'is': 'G', 'raw_position': 143, 'position': 204, 'codon': 68},
   {'was': 'G', 'is': 'A', 'raw_position': 162, 'position': 226, 'codon': 76},
   {'was': 'A', 'is': 'G', 'raw_position': 170, 'position': 234, 'codon': 78},
   {'was': 'T', 'is': 'G', 'raw_position': 171, 'position': 235, 'codon': 79},
   {'was': 'A', 'is': 'C', 'raw_position': 173, 'position': 237, 'codon': 79},
   {'was': 'G', 'is': 'A', 'raw_position': 206, 'position': 270, 'codon': 90},
   {'was': 'G', 'is': 'C', 'raw_position': 209, 'position': 273, 'codon': 91},
   {'was': 'G', 'is': 'T', 'raw_position': 227, 'position': 291, 'codon': 97},
   {'was': 'G', 'is': 'A', 'raw_position': 237, 'position': 301, 'codon': 101},
   {'was': 'G', 'is': 'C', 'raw_position': 239, 'position': 303, 'codon': 101},
   {'was': 'T',
    'is': 'C',
    'raw_position': 248,
    'position': 312,
    'codon': 104}]},
 'join_muts_nt': {'num': 19,
  'muts': [{'was': 'A',
    'is': 'G',
    'raw_position': 277,
    'position': 338,
    'codon': 113},
   {'was': 'A', 'is': 'G', 'raw_position': 280, 'position': 341, 'codon': 114},
   {'was': 'C', 'is': 'G', 'raw_position': 281, 'position': 342, 'codon': 114},
   {'was': 'C', 'is': 'A', 'raw_position': 285, 'position': 346, 'codon': 116},
   {'was': 'A', 'is': 'C', 'raw_position': 286, 'position': 347, 'codon': 116},
   {'was': 'G', 'is': 'C', 'raw_position': 287, 'position': 348, 'codon': 116},
   {'was': 'A', 'is': 'C', 'raw_position': 289, 'position': 350, 'codon': 117},
   {'was': 'T', 'is': 'G', 'raw_position': 291, 'position': 352, 'codon': 118},
   {'was': 'G', 'is': 'T', 'raw_position': 294, 'position': 355, 'codon': 119},
   {'was': 'G', 'is': 'T', 'raw_position': 295, 'position': 356, 'codon': 119},
   {'was': 'G', 'is': 'T', 'raw_position': 300, 'position': 361, 'codon': 121},
   {'was': 'C', 'is': 'A', 'raw_position': 302, 'position': 363, 'codon': 121},
   {'was': 'C', 'is': 'G', 'raw_position': 305, 'position': 366, 'codon': 122},
   {'was': 'T', 'is': 'A', 'raw_position': 310, 'position': 371, 'codon': 124},
   {'was': 'C', 'is': 'A', 'raw_position': 314, 'position': 375, 'codon': 125},
   {'was': 'C', 'is': 'G', 'raw_position': 317, 'position': 378, 'codon': 126},
   {'was': 'C', 'is': 'G', 'raw_position': 319, 'position': 380, 'codon': 127},
   {'was': 'C', 'is': 'G', 'raw_position': 320, 'position': 381, 'codon': 127},
   {'was': 'C',
    'is': 'G',
    'raw_position': 322,
    'position': 383,
    'codon': 128}]},
 'mut_count_nt': 40,
 'vdj_aa': 'TVSGASITSSIYYWGWIRQSPGKGLEWIGSIYYSGNSFYQPSLKSRITMAVDTSKNQFSLKLSSVTAADTAIYYCARVFSSWYVGWFTPGFQ*TLDTVW*',
 'fr1_aa': 'TVS',
 'cdr1_aa': 'GASITSSIYY',
 'fr2_aa': 'WGWIRQSPGKGLEWIGS',
 'cdr2_aa': 'IYYSGNS',
 'fr3_aa': 'FYQPSLKSRITMAVDTSKNQFSLKLSSVTAADTAIYYC',
 'cdr3_aa': 'ARVFSSWYVGWFTPGFQ*TLDTVW*ASTKGPSVFPLAPCSRSTSEST',
 'fr4_aa': 'GFQ*TLDTVW*',
 'vdj_germ_aa': 'TVSGGSISSSSYYWGWIRQPPGKGLEWIGSIYYSGSTYYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARVFSSWYVEYFQHWGQGTLVTVSS',
 'junc_aa': 'CARVFSSWYVGWFTPGFQ*TLDTVW*ASTKGPSVFPLAPCSRSTSESTA',
 'region_len_aa': {'fr1': 3,
  'cdr1': 10,
  'fr2': 17,
  'cdr2': 7,
  'fr3': 38,
  'cdr3': 47,
  'fr4': 11},
 'var_muts_aa': {'num': 12,
  'muts': [{'was': 'G',
    'is': 'A',
    'raw_position': None,
    'position': 28,
    'codon': 28},
   {'was': 'S', 'is': 'T', 'raw_position': None, 'position': 31, 'codon': 31},
   {'was': 'S', 'is': 'I', 'raw_position': None, 'position': 36, 'codon': 36},
   {'was': 'P', 'is': 'S', 'raw_position': None, 'position': 45, 'codon': 45},
   {'was': 'S', 'is': 'N', 'raw_position': None, 'position': 64, 'codon': 64},
   {'was': 'T', 'is': 'S', 'raw_position': None, 'position': 65, 'codon': 65},
   {'was': 'Y', 'is': 'F', 'raw_position': None, 'position': 66, 'codon': 66},
   {'was': 'N', 'is': 'Q', 'raw_position': None, 'position': 68, 'codon': 68},
   {'was': 'V', 'is': 'I', 'raw_position': None, 'position': 76, 'codon': 76},
   {'was': 'I', 'is': 'M', 'raw_position': None, 'position': 78, 'codon': 78},
   {'was': 'S', 'is': 'A', 'raw_position': None, 'position': 79, 'codon': 79},
   {'was': 'V',
    'is': 'I',
    'raw_position': None,
    'position': 101,
    'codon': 101}]},
 'join_muts_aa': {'num': 10,
  'muts': [{'was': 'E',
    'is': 'G',
    'raw_position': None,
    'position': 113.0,
    'codon': 113.0},
   {'was': 'Y',
    'is': 'W',
    'raw_position': None,
    'position': 114.0,
    'codon': 114.0},
   {'was': 'Q',
    'is': 'T',
    'raw_position': None,
    'position': 116.0,
    'codon': 116.0},
   {'was': 'H',
    'is': 'P',
    'raw_position': None,
    'position': 117.0,
    'codon': 117.0},
   {'was': 'W',
    'is': 'G',
    'raw_position': None,
    'position': 118.0,
    'codon': 118.0},
   {'was': 'G',
    'is': 'F',
    'raw_position': None,
    'position': 119.0,
    'codon': 119.0},
   {'was': 'G',
    'is': '*',
    'raw_position': None,
    'position': 121.0,
    'codon': 121.0},
   {'was': 'V',
    'is': 'D',
    'raw_position': None,
    'position': 124.0,
    'codon': 124.0},
   {'was': 'S',
    'is': 'W',
    'raw_position': None,
    'position': 127.0,
    'codon': 127.0},
   {'was': 'S',
    'is': '*',
    'raw_position': None,
    'position': 128.0,
    'codon': 128.0}]},
 'region_muts_nt': {'fr1': {'num': 0, 'muts': []},
  'cdr1': {'num': 3,
   'muts': [{'was': 'G',
     'is': 'C',
     'raw_position': 37,
     'position': 83,
     'codon': 28},
    {'was': 'G', 'is': 'C', 'raw_position': 46, 'position': 92, 'codon': 31},
    {'was': 'G',
     'is': 'T',
     'raw_position': 55,
     'position': 107,
     'codon': 36}]},
  'fr2': {'num': 2,
   'muts': [{'was': 'C',
     'is': 'T',
     'raw_position': 81,
     'position': 133,
     'codon': 45},
    {'was': 'G',
     'is': 'C',
     'raw_position': 95,
     'position': 147,
     'codon': 49}]},
  'cdr2': {'num': 3,
   'muts': [{'was': 'C',
     'is': 'A',
     'raw_position': 116,
     'position': 168,
     'codon': 56},
    {'was': 'G', 'is': 'A', 'raw_position': 130, 'position': 191, 'codon': 64},
    {'was': 'A',
     'is': 'T',
     'raw_position': 132,
     'position': 193,
     'codon': 65}]},
  'fr3': {'num': 13,
   'muts': [{'was': 'A',
     'is': 'T',
     'raw_position': 136,
     'position': 197,
     'codon': 66},
    {'was': 'A', 'is': 'C', 'raw_position': 141, 'position': 202, 'codon': 68},
    {'was': 'C', 'is': 'G', 'raw_position': 143, 'position': 204, 'codon': 68},
    {'was': 'G', 'is': 'A', 'raw_position': 162, 'position': 226, 'codon': 76},
    {'was': 'A', 'is': 'G', 'raw_position': 170, 'position': 234, 'codon': 78},
    {'was': 'T', 'is': 'G', 'raw_position': 171, 'position': 235, 'codon': 79},
    {'was': 'A', 'is': 'C', 'raw_position': 173, 'position': 237, 'codon': 79},
    {'was': 'G', 'is': 'A', 'raw_position': 206, 'position': 270, 'codon': 90},
    {'was': 'G', 'is': 'C', 'raw_position': 209, 'position': 273, 'codon': 91},
    {'was': 'G', 'is': 'T', 'raw_position': 227, 'position': 291, 'codon': 97},
    {'was': 'G',
     'is': 'A',
     'raw_position': 237,
     'position': 301,
     'codon': 101},
    {'was': 'G',
     'is': 'C',
     'raw_position': 239,
     'position': 303,
     'codon': 101},
    {'was': 'T',
     'is': 'C',
     'raw_position': 248,
     'position': 312,
     'codon': 104}]},
  'fr4': {'num': 12,
   'muts': [{'was': 'T',
     'is': 'G',
     'raw_position': 291,
     'position': 352,
     'codon': 118},
    {'was': 'G',
     'is': 'T',
     'raw_position': 294,
     'position': 355,
     'codon': 119},
    {'was': 'G',
     'is': 'T',
     'raw_position': 295,
     'position': 356,
     'codon': 119},
    {'was': 'G',
     'is': 'T',
     'raw_position': 300,
     'position': 361,
     'codon': 121},
    {'was': 'C',
     'is': 'A',
     'raw_position': 302,
     'position': 363,
     'codon': 121},
    {'was': 'C',
     'is': 'G',
     'raw_position': 305,
     'position': 366,
     'codon': 122},
    {'was': 'T',
     'is': 'A',
     'raw_position': 310,
     'position': 371,
     'codon': 124},
    {'was': 'C',
     'is': 'A',
     'raw_position': 314,
     'position': 375,
     'codon': 125},
    {'was': 'C',
     'is': 'G',
     'raw_position': 317,
     'position': 378,
     'codon': 126},
    {'was': 'C',
     'is': 'G',
     'raw_position': 319,
     'position': 380,
     'codon': 127},
    {'was': 'C',
     'is': 'G',
     'raw_position': 320,
     'position': 381,
     'codon': 127},
    {'was': 'C',
     'is': 'G',
     'raw_position': 322,
     'position': 383,
     'codon': 128}]}},
 'region_muts_aa': {'fr1': {'num': 0, 'muts': []},
  'cdr1': {'num': 3,
   'muts': [{'was': 'G',
     'is': 'A',
     'raw_position': None,
     'position': 28,
     'codon': 28},
    {'was': 'S', 'is': 'T', 'raw_position': None, 'position': 31, 'codon': 31},
    {'was': 'S',
     'is': 'I',
     'raw_position': None,
     'position': 36,
     'codon': 36}]},
  'fr2': {'num': 1,
   'muts': [{'was': 'P',
     'is': 'S',
     'raw_position': None,
     'position': 45,
     'codon': 45}]},
  'cdr2': {'num': 2,
   'muts': [{'was': 'S',
     'is': 'N',
     'raw_position': None,
     'position': 64,
     'codon': 64},
    {'was': 'T',
     'is': 'S',
     'raw_position': None,
     'position': 65,
     'codon': 65}]},
  'fr3': {'num': 6,
   'muts': [{'was': 'Y',
     'is': 'F',
     'raw_position': None,
     'position': 66,
     'codon': 66},
    {'was': 'N', 'is': 'Q', 'raw_position': None, 'position': 68, 'codon': 68},
    {'was': 'V', 'is': 'I', 'raw_position': None, 'position': 76, 'codon': 76},
    {'was': 'I', 'is': 'M', 'raw_position': None, 'position': 78, 'codon': 78},
    {'was': 'S', 'is': 'A', 'raw_position': None, 'position': 79, 'codon': 79},
    {'was': 'V',
     'is': 'I',
     'raw_position': None,
     'position': 101,
     'codon': 101}]},
  'fr4': {'num': 6,
   'muts': [{'was': 'W',
     'is': 'G',
     'raw_position': None,
     'position': 118.0,
     'codon': 118.0},
    {'was': 'G',
     'is': 'F',
     'raw_position': None,
     'position': 119.0,
     'codon': 119.0},
    {'was': 'G',
     'is': '*',
     'raw_position': None,
     'position': 121.0,
     'codon': 121.0},
    {'was': 'V',
     'is': 'D',
     'raw_position': None,
     'position': 124.0,
     'codon': 124.0},
    {'was': 'S',
     'is': 'W',
     'raw_position': None,
     'position': 127.0,
     'codon': 127.0},
    {'was': 'S',
     'is': '*',
     'raw_position': None,
     'position': 128.0,
     'codon': 128.0}]}},
 'prod': 'no',
 'productivity_issues': 'Contains stop codon(s), Junction (CARVFSSWYVGWFTPGFQ*TLDTVW*ASTKGPSVFPLAPCSRSTSESTA) lacks conserved start and/or end residue',
 'junction_in_frame': 'yes',
 'raw_input': 'NNNNNNNNNNNNNNNNNNNNNNNNACTGTCTCTGGTGCCTCCATCACCAGTAGTATTTACTACTGGGGCTGGATCCGCCAGTCCCCAGGGAAGGGCCTGGAGTGGATTGGGAGTATATATTATAGTGGGAACTCCTTCTACCAGCCGTCCCTCAAGAGTCGAATCACCATGGCCGTAGACACGTCCAAGAACCAGTTCTCCCTGAAACTCAGCTCTGTGACCGCCGCTGACACGGCCATCTATTACTGCGCGAGAGTCTTCAGCAGCTGGTATGTCGGATGGTTCACCCCCGGGTTCCAGTGAACGCTGGACACAGTGTGGTGAGCCTCCACCAAGGGCCCATCGGTCTTCCCCCTGGCACCCTGCTCCAGGAGCACCTCCGAGAGCACAGCGGCCCTGGGCTGCCTGGTCAAGGACTACTTCCCCGAACCGGTGACG',
 'oriented_input': 'NNNNNNNNNNNNNNNNNNNNNNNNACTGTCTCTGGTGCCTCCATCACCAGTAGTATTTACTACTGGGGCTGGATCCGCCAGTCCCCAGGGAAGGGCCTGGAGTGGATTGGGAGTATATATTATAGTGGGAACTCCTTCTACCAGCCGTCCCTCAAGAGTCGAATCACCATGGCCGTAGACACGTCCAAGAACCAGTTCTCCCTGAAACTCAGCTCTGTGACCGCCGCTGACACGGCCATCTATTACTGCGCGAGAGTCTTCAGCAGCTGGTATGTCGGATGGTTCACCCCCGGGTTCCAGTGAACGCTGGACACAGTGTGGTGAGCCTCCACCAAGGGCCCATCGGTCTTCCCCCTGGCACCCTGCTCCAGGAGCACCTCCGAGAGCACAGCGGCCCTGGGCTGCCTGGTCAAGGACTACTTCCCCGAACCGGTGACG',
 'germ_alignments_nt': {'var': {'query': 'ACTGTCTCTGGTGCCTCCATCACCAGTAGTATTTACTACTGGGGCTGGATCCGCCAGTCCCCAGGGAAGGGCCTGGAGTGGATTGGGAGTATATATTATAGTGGGAACTCCTTCTACCAGCCGTCCCTCAAGAGTCGAATCACCATGGCCGTAGACACGTCCAAGAACCAGTTCTCCCTGAAACTCAGCTCTGTGACCGCCGCTGACACGGCCATCTATTACTGCGCGAGAG',
   'germ': 'ACTGTCTCTGGTGGCTCCATCAGCAGTAGTAGTTACTACTGGGGCTGGATCCGCCAGCCCCCAGGGAAGGGGCTGGAGTGGATTGGGAGTATCTATTATAGTGGGAGCACCTACTACAACCCGTCCCTCAAGAGTCGAGTCACCATATCAGTAGACACGTCCAAGAACCAGTTCTCCCTGAAGCTGAGCTCTGTGACCGCCGCGGACACGGCCGTGTATTACTGTGCGAGAG',
   'midline': '||||||||||||| |||||||| |||||||| ||||||||||||||||||||||||| ||||||||||||| |||||||||||||||||||| ||||||||||||| | ||| |||| | |||||||||||||||||| |||||||  | |||||||||||||||||||||||||||||||| || ||||||||||||||||| ||||||||| | |||||||| |||||||'},
  'join': {'query': 'GGATGGTTCACCCCCGGGTTCCAGTGAACGCTGGACACAGTGTGGTGAG',
   'germ': 'GAATACTTCCAGCACTGGGGCCAGGGCACCCTGGTCACCGTCTCCTCAG',
   'midline': '| ||  |||   | | ||  |||| | || |||| ||| || |  | ||'},
  'div': {'query': 'CAGCAGCTGGTA',
   'germ': 'CAGCAGCTGGTA',
   'midline': '||||||||||||'}},
 'exo_trimming': {'var_3': 1, 'join_5': 3, 'div_5': 8, 'div_3': 1},
 'junc_nt_breakdown': {'v_nt': 'TGCGCGAGAG',
  'n1_nt': 'TCTT',
  'd_nt': 'CAGCAGCTGGTA',
  'n2_nt': 'TGTC',
  'j_nt': 'TGCGCGAGAG',
  'd_dist_from_cdr3_start': 11,
  'd_dist_from_cdr3_end': 118},
 'align_info': {'v_start': 66,
  'v_end': 297,
  'j_start': 3,
  'j_end': 51,
  'd_start': 8,
  'd_end': 19}}
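
For the record above, a short sketch can show how much of cdr3_nt lies inside vdj_nt and where the rest comes from. The three sequences are copied from that annotation; the helper name `overlap_with_end` is hypothetical. It finds the longest prefix of cdr3_nt that is also a suffix of vdj_nt and compares the leftover bases with the start of the 'isotype_alignment' query.

```python
# Sequences copied from the annotation above (seq_id ending 2111:14734:3825...).
cdr3_nt = "GCGAGAGTCTTCAGCAGCTGGTATGTCGGATGGTTCACCCCCGGGTTCCAGTGAACGCTGGACACAGTGTGGTGAGCCTCCACCAAGGGCCCATCGGTCTTCCCCCTGGCACCCTGCTCCAGGAGCACCTCCGAGAGCACA"
vdj_nt = "ACTGTCTCTGGTGCCTCCATCACCAGTAGTATTTACTACTGGGGCTGGATCCGCCAGTCCCCAGGGAAGGGCCTGGAGTGGATTGGGAGTATATATTATAGTGGGAACTCCTTCTACCAGCCGTCCCTCAAGAGTCGAATCACCATGGCCGTAGACACGTCCAAGAACCAGTTCTCCCTGAAACTCAGCTCTGTGACCGCCGCTGACACGGCCATCTATTACTGCGCGAGAGTCTTCAGCAGCTGGTATGTCGGATGGTTCACCCCCGGGTTCCAGTGAACGCTGGACACAGTGTGGTGAG"
isotype_query = "CCTCCACCAAGGGCCCATCGGTCTTCCCCCTGGCACCCTGCTCCAGGAGCACCTCCGAGAGCACAGCGGCCCTGGGCTGCCTGGTCAAGGACTACTTCCCCGAACCGGTGACG"


def overlap_with_end(region, vdj):
    """Length of the longest prefix of `region` that is also a suffix of `vdj`."""
    for k in range(min(len(region), len(vdj)), 0, -1):
        if vdj.endswith(region[:k]):
            return k
    return 0


inside = overlap_with_end(cdr3_nt, vdj_nt)
overrun = cdr3_nt[inside:]  # bases of cdr3_nt that lie beyond the end of vdj_nt

print(cdr3_nt in vdj_nt)                  # False
print(inside, len(overrun))               # 76 65
print(isotype_query.startswith(overrun))  # True
```

In this record the first 76 bases of cdr3_nt are the last 76 bases of vdj_nt, and the remaining 65 bases match the start of the isotype alignment query, so str.index cannot find cdr3_nt in vdj_nt. This shows where the extra bases come from, not why the reported CDR3 extends that far.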

This is the in-frame annotation that fails:

{'seq_id': 'M04617:111:000000000-CYR64:1:2119:7294:8546=81|CPRIMER=CHG-R|VPRIMER=IGHV4-F|DUPCOUNT=1|TIME=0|SEVERITY=Healthy|REPLICATE=0',
 'chain': 'heavy',
 'v_gene': {'full': 'IGHV4-59*01',
  'fam': 'IGHV4',
  'gene': 'IGHV4-59',
  'score': 633,
  'assigner_score': 331.257,
  'others': [{'full': 'IGHV4-59*07', 'assigner_score': 329.672},
   {'full': 'IGHV4-59*02', 'assigner_score': 328.087},
   {'full': 'IGHV4-59*08', 'assigner_score': 323.332},
   {'full': 'IGHV4-4*08', 'assigner_score': 315.408},
   {'full': 'IGHV4-4*07', 'assigner_score': 309.068}]},
 'd_gene': {'full': 'IGHD3-10*02',
  'fam': 'IGHD3',
  'gene': 'IGHD3-10',
  'score': 14,
  'assigner_score': 23,
  'others': [{'full': 'IGHD6-6*01', 'assigner_score': 19},
   {'full': 'IGHD2-15*01', 'assigner_score': 18},
   {'full': 'IGHD1-26*01', 'assigner_score': 18},
   {'full': 'IGHD2-21*01', 'assigner_score': 17},
   {'full': 'IGHD6-13*01', 'assigner_score': 16}]},
 'j_gene': {'full': 'IGHJ1*01',
  'gene': 'IGHJ1',
  'score': 47,
  'assigner_score': 22.1895,
  'others': [{'full': 'IGHJ5*02', 'score': 19.0196},
   {'full': 'IGHJ4*02', 'score': 19.0196},
   {'full': 'IGHJ2*01', 'score': 19.0196},
   {'full': 'IGKJ3*01', 'score': 17.4346},
   {'full': 'IGHJ4*03', 'score': 15.8496}]},
 'assigner_scores': {'v': 331.257, 'j': 22.1895, 'd': 23},
 'vdj_assigner': 'blastn',
 'isotype': 'IgG1',
 'isotype_score': 54,
 'isotype_alignment': {'query': 'TTCCCCGAACCGGTGACG',
  'midline': '||||||||||||||||||',
  'isotype': 'TTCCCCGAACCGGTGACG'},
 'nt_identity': {'v': 96.01769911504425, 'j': 72.41379310344828},
 'aa_identity': {'v': 90.66666666666667, 'j': 33.33333333333333},
 'junc_len': 10,
 'cdr3_len': 8,
 'vdj_nt': 'ACTGTCTCTGGTGGCTCCATCAGTAGTTACTACTGGAGCTGGATCCGGCACCCCCCAGGGAAGGGACTGGAGTGGATTGGGTATATCTATTACAGTTGGAGCACCAACTACAACCCCGCCCTCATGAGTCGAGTCACCATCTCACTAGACACGTCCAAGAACCAGTTCTCCCTGACGCTGAACTCTGTAACCGCTGCGGACACGGCCGTGTATTACTGTGCGAGAGGAGCACCTCCGAGAGCACAGCGGCCCTGGGCTGCCTGGTCAAGGACTACT',
 'gapped_vdj_nt': 'ACTGTCTCTGGTGGCTCCATCAGTAGTTACTACTGGAGCTGGATCCGGCACCCCCCAGGGAAGGGACTGGAGTGGATTGGGTATATCTATTACAGTTGGAGCACCAACTACAACCCCGCCCTCATGAGTCGAGTCACCATCTCACTAGACACGTCCAAGAACCAGTTCTCCCTGACGCTGAACTCTGTAACCGCTGCGGACACGGCCGTGTATTACTGTGCGAGAGGAGCACCTCCGAGAGCACAGCGGCCCTGGGCTGCCTGGTCAAGGACTACT',
 'fr1_nt': 'ACTGTCTCT',
 'cdr1_nt': 'GGTGGCTCCATCAGTAGTTACTAC',
 'fr2_nt': 'TGGAGCTGGATCCGGCACCCCCCAGGGAAGGGACTGGAGTGGATTGGGTAT',
 'cdr2_nt': 'ATCTATTACAGTTGGAGCACC',
 'fr3_nt': 'AACTACAACCCCGCCCTCATGAGTCGAGTCACCATCTCACTAGACACGTCCAAGAACCAGTTCTCCCTGACGCTGAACTCTGTAACCGCTGCGGACACGGCCGTGTATTACTGT',
 'cdr3_nt': 'GCGAGAGGAGCACCTCCGAGAGCACA',
 'fr4_nt': 'GGCCCTGGGCTGCCTGGTCAAGGACTACT',
 'vdj_germ_nt': 'ACTGTCTCTGGTGGCTCCATCAGTAGTTACTACTGGAGCTGGATCCGGCAGCCCCCAGGGAAGGGACTGGAGTGGATTGGGTATATCTATTACAGTGGGAGCACCAACTACAACCCCTCCCTCAAGAGTCGAGTCACCATATCAGTAGACACGTCCAAGAACCAGTTCTCCCTGAAGCTGAGCTCTGTGACCGCTGCGGACACGGCCGTGTATTACTGTGCGAGAGGAGCACCTCGGGGAGCACAGCGGGCCAGGGCACCCTGGTCACCGTCTCCT',
 'gapped_vdj_germ_nt': 'ACTGTCTCTGGTGGCTCCATCAGTAGTTACTACTGGAGCTGGATCCGGCAGCCCCCAGGGAAGGGACTGGAGTGGATTGGGTATATCTATTACAGTGGGAGCACCAACTACAACCCCTCCCTCAAGAGTCGAGTCACCATATCAGTAGACACGTCCAAGAACCAGTTCTCCCTGAAGCTGAGCTCTGTGACCGCTGCGGACACGGCCGTGTATTACTGTGCGAGAGGAGCACCTCGGGGAGCACAGCGGGCCAGGGCACCCTGGTCACCGTCTCCT',
 'junc_nt': 'TGTGCGAGAGGAGCACCTCCGAGAGCACAGCG',
 'region_len_nt': {'fr1': 9,
  'cdr1': 24,
  'fr2': 51,
  'cdr2': 21,
  'fr3': 114,
  'cdr3': 26,
  'fr4': 29},
 'var_muts_nt': {'num': 9,
  'muts': [{'was': 'G',
    'is': 'C',
    'raw_position': 74,
    'position': 132,
    'codon': 44},
   {'was': 'G', 'is': 'T', 'raw_position': 120, 'position': 187, 'codon': 63},
   {'was': 'T', 'is': 'G', 'raw_position': 141, 'position': 208, 'codon': 70},
   {'was': 'A', 'is': 'T', 'raw_position': 148, 'position': 215, 'codon': 72},
   {'was': 'A', 'is': 'C', 'raw_position': 164, 'position': 234, 'codon': 78},
   {'was': 'G', 'is': 'C', 'raw_position': 168, 'position': 238, 'codon': 80},
   {'was': 'A', 'is': 'C', 'raw_position': 199, 'position': 269, 'codon': 90},
   {'was': 'G', 'is': 'A', 'raw_position': 205, 'position': 275, 'codon': 92},
   {'was': 'G',
    'is': 'A',
    'raw_position': 212,
    'position': 282,
    'codon': 94}]},
 'join_muts_nt': {'num': 8,
  'muts': [{'was': 'G',
    'is': 'C',
    'raw_position': 273,
    'position': 356,
    'codon': 119},
   {'was': 'A', 'is': 'T', 'raw_position': 276, 'position': 359, 'codon': 120},
   {'was': 'A', 'is': 'T', 'raw_position': 281, 'position': 364, 'codon': 122},
   {'was': 'C', 'is': 'G', 'raw_position': 282, 'position': 365, 'codon': 122},
   {'was': 'C', 'is': 'A', 'raw_position': 291, 'position': 374, 'codon': 125},
   {'was': 'C', 'is': 'G', 'raw_position': 292, 'position': 375, 'codon': 125},
   {'was': 'T', 'is': 'A', 'raw_position': 294, 'position': 377, 'codon': 126},
   {'was': 'C',
    'is': 'A',
    'raw_position': 297,
    'position': 380,
    'codon': 127}]},
 'mut_count_nt': 17,
 'vdj_aa': 'TVSGGSISSYYWSWIRHPPGKGLEWIGYIYYSWSTNYNPALMSRVTISLDTSKNQFSLTLNSVTAADTAVYYCARGAPPRAQRPWAAWSRT',
 'fr1_aa': 'TVS',
 'cdr1_aa': 'GGSISSYY',
 'fr2_aa': 'WSWIRHPPGKGLEWIGY',
 'cdr2_aa': 'IYYSWST',
 'fr3_aa': 'NYNPALMSRVTISLDTSKNQFSLTLNSVTAADTAVYYC',
 'cdr3_aa': 'ARGAPPRA',
 'fr4_aa': 'GPGLPGQGL',
 'vdj_germ_aa': 'TVSGGSISSYYWSWIRQPPGKGLEWIGYIYYSGSTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARGAPRGAQRARAPWSPSP',
 'junc_aa': 'CARGAPPRAQ',
 'region_len_aa': {'fr1': 3,
  'cdr1': 8,
  'fr2': 17,
  'cdr2': 7,
  'fr3': 38,
  'cdr3': 8,
  'fr4': 9},
 'var_muts_aa': {'num': 7,
  'muts': [{'was': 'Q',
    'is': 'H',
    'raw_position': None,
    'position': 44,
    'codon': 44},
   {'was': 'G', 'is': 'W', 'raw_position': None, 'position': 63, 'codon': 63},
   {'was': 'S', 'is': 'A', 'raw_position': None, 'position': 70, 'codon': 70},
   {'was': 'K', 'is': 'M', 'raw_position': None, 'position': 72, 'codon': 72},
   {'was': 'V', 'is': 'L', 'raw_position': None, 'position': 80, 'codon': 80},
   {'was': 'K', 'is': 'T', 'raw_position': None, 'position': 90, 'codon': 90},
   {'was': 'S',
    'is': 'N',
    'raw_position': None,
    'position': 92,
    'codon': 92}]},
 'join_muts_aa': {'num': 6,
  'muts': [{'was': 'G',
    'is': 'A',
    'raw_position': None,
    'position': 119.0,
    'codon': 119.0},
   {'was': 'Q',
    'is': 'L',
    'raw_position': None,
    'position': 120.0,
    'codon': 120.0},
   {'was': 'T',
    'is': 'C',
    'raw_position': None,
    'position': 122.0,
    'codon': 122.0},
   {'was': 'T',
    'is': 'K',
    'raw_position': None,
    'position': 125.0,
    'codon': 125.0},
   {'was': 'V',
    'is': 'D',
    'raw_position': None,
    'position': 126.0,
    'codon': 126.0},
   {'was': 'S',
    'is': 'Y',
    'raw_position': None,
    'position': 127.0,
    'codon': 127.0}]},
 'region_muts_nt': {'fr1': {'num': 0, 'muts': []},
  'cdr1': {'num': 0, 'muts': []},
  'fr2': {'num': 1,
   'muts': [{'was': 'G',
     'is': 'C',
     'raw_position': 74,
     'position': 132,
     'codon': 44}]},
  'cdr2': {'num': 1,
   'muts': [{'was': 'G',
     'is': 'T',
     'raw_position': 120,
     'position': 187,
     'codon': 63}]},
  'fr3': {'num': 7,
   'muts': [{'was': 'T',
     'is': 'G',
     'raw_position': 141,
     'position': 208,
     'codon': 70},
    {'was': 'A', 'is': 'T', 'raw_position': 148, 'position': 215, 'codon': 72},
    {'was': 'A', 'is': 'C', 'raw_position': 164, 'position': 234, 'codon': 78},
    {'was': 'G', 'is': 'C', 'raw_position': 168, 'position': 238, 'codon': 80},
    {'was': 'A', 'is': 'C', 'raw_position': 199, 'position': 269, 'codon': 90},
    {'was': 'G', 'is': 'A', 'raw_position': 205, 'position': 275, 'codon': 92},
    {'was': 'G',
     'is': 'A',
     'raw_position': 212,
     'position': 282,
     'codon': 94}]},
  'fr4': {'num': 8,
   'muts': [{'was': 'G',
     'is': 'C',
     'raw_position': 273,
     'position': 356,
     'codon': 119},
    {'was': 'A',
     'is': 'T',
     'raw_position': 276,
     'position': 359,
     'codon': 120},
    {'was': 'A',
     'is': 'T',
     'raw_position': 281,
     'position': 364,
     'codon': 122},
    {'was': 'C',
     'is': 'G',
     'raw_position': 282,
     'position': 365,
     'codon': 122},
    {'was': 'C',
     'is': 'A',
     'raw_position': 291,
     'position': 374,
     'codon': 125},
    {'was': 'C',
     'is': 'G',
     'raw_position': 292,
     'position': 375,
     'codon': 125},
    {'was': 'T',
     'is': 'A',
     'raw_position': 294,
     'position': 377,
     'codon': 126},
    {'was': 'C',
     'is': 'A',
     'raw_position': 297,
     'position': 380,
     'codon': 127}]}},
 'region_muts_aa': {'fr1': {'num': 0, 'muts': []},
  'cdr1': {'num': 0, 'muts': []},
  'fr2': {'num': 1,
   'muts': [{'was': 'Q',
     'is': 'H',
     'raw_position': None,
     'position': 44,
     'codon': 44}]},
  'cdr2': {'num': 1,
   'muts': [{'was': 'G',
     'is': 'W',
     'raw_position': None,
     'position': 63,
     'codon': 63}]},
  'fr3': {'num': 5,
   'muts': [{'was': 'S',
     'is': 'A',
     'raw_position': None,
     'position': 70,
     'codon': 70},
    {'was': 'K', 'is': 'M', 'raw_position': None, 'position': 72, 'codon': 72},
    {'was': 'V', 'is': 'L', 'raw_position': None, 'position': 80, 'codon': 80},
    {'was': 'K', 'is': 'T', 'raw_position': None, 'position': 90, 'codon': 90},
    {'was': 'S',
     'is': 'N',
     'raw_position': None,
     'position': 92,
     'codon': 92}]},
  'fr4': {'num': 6,
   'muts': [{'was': 'G',
     'is': 'A',
     'raw_position': None,
     'position': 119.0,
     'codon': 119.0},
    {'was': 'Q',
     'is': 'L',
     'raw_position': None,
     'position': 120.0,
     'codon': 120.0},
    {'was': 'T',
     'is': 'C',
     'raw_position': None,
     'position': 122.0,
     'codon': 122.0},
    {'was': 'T',
     'is': 'K',
     'raw_position': None,
     'position': 125.0,
     'codon': 125.0},
    {'was': 'V',
     'is': 'D',
     'raw_position': None,
     'position': 126.0,
     'codon': 126.0},
    {'was': 'S',
     'is': 'Y',
     'raw_position': None,
     'position': 127.0,
     'codon': 127.0}]}},
 'prod': 'no',
 'productivity_issues': 'Junction (CARGAPPRAQ) lacks conserved start and/or end residue, Junction (TGTGCGAGAGGAGCACCTCCGAGAGCACAGCG) is out of frame',
 'junction_in_frame': 'no',
 'raw_input': 'NNNNNNNNNNNNNNNNNNNNNNNNACTGTCTCTGGTGGCTCCATCAGTAGTTACTACTGGAGCTGGATCCGGCACCCCCCAGGGAAGGGACTGGAGTGGATTGGGTATATCTATTACAGTTGGAGCACCAACTACAACCCCGCCCTCATGAGTCGAGTCACCATCTCACTAGACACGTCCAAGAACCAGTTCTCCCTGACGCTGAACTCTGTAACCGCTGCGGACACGGCCGTGTATTACTGTGCGAGAGGAGCACCTCCGAGAGCACAGCGGCCCTGGGCTGCCTGGTCAAGGACTACTTCCCCGAACCGGTGACG',
 'oriented_input': 'NNNNNNNNNNNNNNNNNNNNNNNNACTGTCTCTGGTGGCTCCATCAGTAGTTACTACTGGAGCTGGATCCGGCACCCCCCAGGGAAGGGACTGGAGTGGATTGGGTATATCTATTACAGTTGGAGCACCAACTACAACCCCGCCCTCATGAGTCGAGTCACCATCTCACTAGACACGTCCAAGAACCAGTTCTCCCTGACGCTGAACTCTGTAACCGCTGCGGACACGGCCGTGTATTACTGTGCGAGAGGAGCACCTCCGAGAGCACAGCGGCCCTGGGCTGCCTGGTCAAGGACTACTTCCCCGAACCGGTGACG',
 'germ_alignments_nt': {'var': {'query': 'ACTGTCTCTGGTGGCTCCATCAGTAGTTACTACTGGAGCTGGATCCGGCACCCCCCAGGGAAGGGACTGGAGTGGATTGGGTATATCTATTACAGTTGGAGCACCAACTACAACCCCGCCCTCATGAGTCGAGTCACCATCTCACTAGACACGTCCAAGAACCAGTTCTCCCTGACGCTGAACTCTGTAACCGCTGCGGACACGGCCGTGTATTACTGTGCGAGAG',
   'germ': 'ACTGTCTCTGGTGGCTCCATCAGTAGTTACTACTGGAGCTGGATCCGGCAGCCCCCAGGGAAGGGACTGGAGTGGATTGGGTATATCTATTACAGTGGGAGCACCAACTACAACCCCTCCCTCAAGAGTCGAGTCACCATATCAGTAGACACGTCCAAGAACCAGTTCTCCCTGAAGCTGAGCTCTGTGACCGCTGCGGACACGGCCGTGTATTACTGTGCGAGAG',
   'midline': '|||||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||||| |||||| ||||||||||||||| ||| |||||||||||||||||||||||||||||| ||||| |||||| |||||||||||||||||||||||||||||||||||||'},
  'join': {'query': 'GGCCCTGGGCTGCCTGGTCAAGGACTACT',
   'germ': 'GGGCCAGGGCACCCTGGTCACCGTCTCCT',
   'midline': '|| || ||||  ||||||||  | || ||'},
  'div': {'query': 'TCCGAGAG', 'germ': 'TCGGGGAG', 'midline': '|| | |||'}},
 'exo_trimming': {'var_3': 1, 'join_5': 20, 'div_5': 12, 'div_3': 10},
 'junc_nt_breakdown': {'v_nt': 'TGTGCGAGAG',
  'n1_nt': 'GAGCACC',
  'd_nt': 'TCCGAGAG',
  'n2_nt': 'CACAGC',
  'j_nt': 'TGTGCGAGAG',
  'd_dist_from_cdr3_start': 14,
  'd_dist_from_cdr3_end': 4},
 'align_info': {'v_start': 66,
  'v_end': 291,
  'j_start': 20,
  'j_end': 48,
  'd_start': 12,
  'd_end': 19}}
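
A quick way to check whether the reported regions account for every nucleotide of 'vdj_nt' is to walk through the sequence region by region and collect anything left uncovered. This is only a sketch: `unassigned_stretches` is a hypothetical helper, not part of abstar, and to keep it short only the 3' end of 'vdj_nt' (from the start of 'cdr3_nt' onward) is copied from the output above.

```python
def unassigned_stretches(vdj_nt, regions):
    """Walk through vdj_nt region by region; collect nucleotides no region covers."""
    gaps = []
    pos = 0
    for name, seq in regions:
        start = vdj_nt.find(seq, pos)
        if start == -1:
            # Region string does not occur after the previous region.
            gaps.append((name, None))
            continue
        if start > pos:
            # Nucleotides skipped immediately before this region.
            gaps.append((name, vdj_nt[pos:start]))
        pos = start + len(seq)
    if pos < len(vdj_nt):
        # Nucleotides left over after the last region.
        gaps.append(("end", vdj_nt[pos:]))
    return gaps


# 3' end of 'vdj_nt' from the output above, starting at the first base of 'cdr3_nt'.
vdj_tail = "GCGAGAGGAGCACCTCCGAGAGCACAGCGGCCCTGGGCTGCCTGGTCAAGGACTACT"
regions = [
    ("cdr3", "GCGAGAGGAGCACCTCCGAGAGCACA"),    # 'cdr3_nt'
    ("fr4", "GGCCCTGGGCTGCCTGGTCAAGGACTACT"),  # 'fr4_nt'
]
print(unassigned_stretches(vdj_tail, regions))
```

An empty list would mean the regions tile the sequence with no gaps. Any entry names the region that follows the uncovered nucleotides.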

Thanks and all the best,
Zach

@zacmon
Author

zacmon commented Jun 17, 2020

Another problem I noticed, both in these annotations and in all of my other annotations:

In the 'junc_nt_breakdown' field, 'j_nt' is identical to 'v_nt', which is clearly not right. In the code at https://github.com/briney/abstar/blob/85438527024105461f8959c4324aa29e153a6120/abstar/utils/output.py#L166 there is a note to change from v to j, but that change doesn't seem to have made it into abstar 0.3.5?
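
The mismatch can be checked directly from the output by reassembling the junction from its parts. The following is a minimal sketch, assuming the breakdown is ordered V, N1, D, N2, J from 5' to 3'. `check_junction_breakdown` is a hypothetical helper, not an abstar function, and the values are copied from the annotation above.

```python
def check_junction_breakdown(annotation):
    """Compare 'junc_nt' with the pieces listed in 'junc_nt_breakdown'."""
    junc = annotation["junc_nt"]
    parts = annotation["junc_nt_breakdown"]
    # Everything upstream of the J portion, in 5' -> 3' order.
    prefix = "".join(parts[key] for key in ("v_nt", "n1_nt", "d_nt", "n2_nt"))
    report = {
        "j_equals_v": parts["j_nt"] == parts["v_nt"],
        "prefix_matches": junc.startswith(prefix),
        "reassembled_matches": prefix + parts["j_nt"] == junc,
    }
    # If V/N1/D/N2 line up with the junction, the remainder should be the J part.
    report["expected_j_nt"] = junc[len(prefix):] if report["prefix_matches"] else None
    return report


annotation = {
    "junc_nt": "TGTGCGAGAGGAGCACCTCCGAGAGCACAGCG",
    "junc_nt_breakdown": {
        "v_nt": "TGTGCGAGAG",
        "n1_nt": "GAGCACC",
        "d_nt": "TCCGAGAG",
        "n2_nt": "CACAGC",
        "j_nt": "TGTGCGAGAG",
    },
}
print(check_junction_breakdown(annotation))
```

For this annotation the V, N1, D and N2 parts do line up with the start of 'junc_nt', so only the 'j_nt' entry looks wrong. Under the ordering assumed here, the J portion of the junction would be the single trailing 'G'.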
