
Parser that reassembles sequences from the NetSurfP CSV output file.

cbalbin-bio/netsurfp-parser

This is a NetSurfP-2 parser. It parses the CSV file generated at the end of a run, which contains the results for all sequences when a multi-FASTA file is provided to NetSurfP-2. NetSurfP writes one row per residue for each sequence in the output CSV file, so this parser reassembles the sequences and returns the results per sequence.

>>> from netsurfp import NETSURFPparser
>>>
>>> for result in NETSURFPparser("netsurfp_results.csv"):
...     print(result['id'][0])
...     print(result['rsa'])
...     break
...
ELMI002704|P13497|seqslice=66-171|elmslice=50-55
['0.8824259042739868', '0.7219752073287964', '0.7457220554351807', '0.6805842518806458', '0.5998258590698242', '0.5735031962394714', '0.6254182457923889', '0.5849741697311401', '0.6569890975952148', '0.6303611993789673', '0.6466432809829712', '0.6035148501396179', '0.6293766498565674', '0.6780522465705872', '0.6442797183990479', '0.6478316783905029', '0.6660124063491821', '0.6093245148658752', '0.6838065385818481', '0.6431317329406738', '0.6329817771911621', '0.6243690252304077', '0.6941267251968384', '0.7378093004226685', '0.6897199153900146', '0.7077065110206604', '0.7052865624427795', '0.6809216737747192', '0.6005478501319885', '0.624451756477356', '0.48799192905426025', '0.6386111974716187', '0.6114245057106018', '0.5538621544837952', '0.7809500098228455', '0.7154213190078735', '0.595727801322937', '0.5981651544570923', '0.5797094702720642', '0.6753348112106323', '0.565435528755188', '0.5729950070381165', '0.5051811933517456', '0.5441534519195557', '0.644923985004425', '0.4528184235095978', '0.692238986492157', '0.7202738523483276', '0.6652660369873047', '0.5935485363006592', '0.6235430240631104', '0.6077220439910889', '0.6550257205963135', '0.4952102601528168', '0.4107619524002075', '0.2789534628391266', '0.4713062644004822', '0.6082459688186646', '0.48249685764312744', '0.5480955243110657', '0.7335834503173828', '0.4044587016105652', '0.3395043909549713', '0.21154814958572388', '0.6250642538070679', '0.7737423777580261', '0.3212422728538513', '0.42769813537597656', '0.09081855416297913', '0.344725102186203', '0.11605754494667053', '0.3412134647369385', '0.14678719639778137', '0.26121950149536133', '0.6530312299728394', '0.8427003622055054', '0.21201205253601074', '0.5743263363838196', '0.7047423124313354', '0.7192980051040649', '0.3266800045967102', '0.4016511142253876', '0.4487822651863098', '0.3031940460205078', '0.15461444854736328', '0.48169514536857605', '0.524113655090332', '0.11701568961143494', '0.23772254586219788', '0.565127432346344', 
'0.4307790994644165', '0.20844152569770813', '0.48519811034202576', '0.6687926054000854', '0.5307827591896057', '0.2597864866256714', '0.34090763330459595', '0.21127831935882568', '0.5403017401695251', '0.456318736076355', '0.5217075943946838', '0.7526267766952515', '0.6122457385063171', '0.8588874340057373', '0.9573432803153992']
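
As the output above shows, the per-residue values come back as strings, so they usually need converting before any numeric work. The following is a minimal sketch of that step, not part of the parser itself: `rsa_strings` stands in for `result['rsa']` and holds only the first five values printed above, and the `EXPOSED_CUTOFF` of 0.25 is an assumption chosen for illustration.

```python
# Hypothetical post-processing of one parsed record.
# `rsa_strings` stands in for result['rsa'] (first five values shown above).
rsa_strings = [
    '0.8824259042739868',
    '0.7219752073287964',
    '0.7457220554351807',
    '0.6805842518806458',
    '0.5998258590698242',
]

# Convert the string values to floats for numeric work.
rsa = [float(value) for value in rsa_strings]

# Assumed cutoff for calling a residue "exposed"; adjust to your own criterion.
EXPOSED_CUTOFF = 0.25

mean_rsa = sum(rsa) / len(rsa)
exposed_fraction = sum(value > EXPOSED_CUTOFF for value in rsa) / len(rsa)

print(f"mean RSA: {mean_rsa:.3f}")
print(f"fraction above cutoff: {exposed_fraction:.2f}")
```

The same conversion should apply to the other numeric keys, such as `asa`, `phi`, `psi` and `disorder`, if they are returned as strings as well.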

To view all possible keys:

>>> for result in NETSURFPparser("netsurfp_results.csv"):
...     print(result.keys())
...     break
...
dict_keys(['id', 'seq', 'n', 'rsa', 'asa', 'q3', 'p[q3_H]', 'p[q3_E]', 'p[q3_C]', 'q8', 'p[q8_G]', 'p[q8_H]', 'p[q8_I]', 'p[q8_B]', 'p[q8_E]', 'p[q8_S]', 'p[q8_T]', 'p[q8_C]', 'phi', 'psi', 'disorder'])
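
To illustrate the per-residue to per-sequence reassembly described at the top, here is a minimal sketch using only the standard library. It is not the code in this repository: `group_by_sequence` is a hypothetical helper, the sample CSV is made-up data with only four of the columns, and it assumes that the header names match the keys listed above exactly and that all rows of a sequence are contiguous in the file.

```python
import csv
import io
from collections import defaultdict


def group_by_sequence(handle):
    """Yield one dict per sequence, mapping each column name to a list
    of per-residue values (hypothetical helper, not the repository's code)."""
    current_id = None
    record = defaultdict(list)
    for row in csv.DictReader(handle):
        # A change in the 'id' column marks the start of a new sequence.
        if current_id is not None and row['id'] != current_id:
            yield dict(record)
            record = defaultdict(list)
        current_id = row['id']
        for column, value in row.items():
            record[column].append(value)
    # Emit the last sequence, if the file had any rows at all.
    if record:
        yield dict(record)


# Made-up sample data: two sequences, one row per residue.
sample = (
    "id,seq,n,rsa\n"
    "seqA,M,1,0.9\n"
    "seqA,K,2,0.5\n"
    "seqB,G,1,0.7\n"
)

records = list(group_by_sequence(io.StringIO(sample)))
for record in records:
    print(record['id'][0], ''.join(record['seq']), record['rsa'])
```

Keeping every column as a list, including `id`, mirrors the `result['id'][0]` access in the usage example above. Real NetSurfP-2 files may need extra handling, for example stripping whitespace from header names.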