JBI-Special-Issue-BITS-2022

Prerequisites

To run the data processing procedure, make sure you have installed in the MAIN folder the following:

R version ≥ 3.5.0. The R version used to write this data processing procedure is 4.1.1 (2021-08-10) -- "Kick Things", and it is available at https://cran.r-project.org/src/base/R-4/ for macOS, https://cran.r-project.org/bin/windows/base/old/4.1.1/ for Windows and https://cran.r-project.org/doc/manuals/r-patched/R-admin.html for Unix.
PHP, available at https://www.php.net/manual/en/install.php
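
The prerequisites above can be checked before running anything. The following is a minimal sketch in Python (not part of this repository): it assumes that `R --version` prints a banner of the form shown above ("R version 4.1.1 (2021-08-10) ...") and that Rscript and php are expected on the PATH. The helper names are hypothetical.

```python
import re
import shutil
import subprocess

MIN_R_VERSION = (3, 5, 0)  # minimum R version stated in the prerequisites


def parse_r_version(banner: str) -> tuple:
    """Extract (major, minor, patch) from text such as 'R version 4.1.1 (2021-08-10)'."""
    match = re.search(r"R version (\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError("no R version found in: " + banner)
    return tuple(int(part) for part in match.groups())


def r_is_recent_enough(banner: str) -> bool:
    """True if the version in the banner is at least MIN_R_VERSION."""
    return parse_r_version(banner) >= MIN_R_VERSION


if __name__ == "__main__":
    # Report whether the two command-line tools used in this README are on the PATH.
    for tool in ("Rscript", "php"):
        print(tool, "found" if shutil.which(tool) else "NOT found on PATH")
    if shutil.which("R"):
        out = subprocess.run(["R", "--version"], capture_output=True, text=True)
        print("R version OK:", r_is_recent_enough(out.stdout))
```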

To visualize and interact with the data processing procedure, make sure you have installed Cytoscape version 3.9.1, available at https://cytoscape.org/download.html

Input preparation for the data processing procedure

Pick a Vitis vinifera or a Homo sapiens gene and find its OneGenE expansion list.

Vitis vinifera

To check whether your Vitis vinifera (Vv) gene has already been expanded, go to VvOneGenE and, under gene name(s), type the Ordered Locus Name (VIT_XXsYYYYgZZZZZ) or the gene name of your Vv gene. You can also type multiple names (space separated) and get multiple expansion lists as a result. For example:

Type VIT_04s0008g06000 in the gene name(s) box; this name corresponds to the transcription factor VvERF045.
You are then redirected to the output page where you can check the expansion list of VIT_04s0008g06000 and press the download button.
The expansion list will appear in your Downloads folder as a zip-compressed file; extract it (54651_Vv-VIT_04s0008g06000.exp.csv) to use it. The expansion list is already annotated with additional information about the candidate genes that could be useful for a biologist.
Move the expansion list to the MAIN folder of this project to provide it as input to the data processing procedure.
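
Before submitting a space-separated query, it can help to see which entries look like Ordered Locus Names and which are plain gene names. The sketch below is an illustration only: the regular expression assumes that XX, YYYY and ZZZZZ in VIT_XXsYYYYgZZZZZ are always digits, which matches the example above but is not stated by VvOneGenE. The function name is hypothetical.

```python
import re

# Assumed pattern: "VIT_" + 2 digits + "s" + 4 digits + "g" + 5 digits.
LOCUS_PATTERN = re.compile(r"VIT_\d{2}s\d{4}g\d{5}")


def split_locus_names(query: str):
    """Split a space-separated query into (locus names, other names)."""
    locus_names, other_names = [], []
    for name in query.split():
        if LOCUS_PATTERN.fullmatch(name):
            locus_names.append(name)
        else:
            # Not necessarily wrong: it may be a gene name such as VvERF045.
            other_names.append(name)
    return locus_names, other_names


print(split_locus_names("VIT_04s0008g06000 VvERF045"))
# → (['VIT_04s0008g06000'], ['VvERF045'])
```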

Homo sapiens

To check whether your human gene has already been expanded, go to HsOneGenE, choose Homo sapiens (Hs) in the Organism box and leave Tile size and Iterations blank; the significance level alpha is set to 0.05 by default. Under LGN name, type the gene symbol of your Hs gene. For example:

Type MFSD2A in the LGN name box.
You are then redirected to the output page where the first result contains the expansion list of MFSD2A. Click on its pcim_id (193111) and the download will start automatically.
The expansion list will appear in your Downloads folder as a zip-compressed folder (193111_Hs.zip). Unzip the folder to access its content: the .interactions file (193111_Hs.interactions) is the output of NES2RA, while the .expansion file (193111_Hs.expansion) is the actual expansion list. The latter is not yet in its working form: you have to annotate it first.
Move the .interactions file to the MAIN folder of this project, open a new terminal panel in this folder and type the following command:

% php anno-hsf5.php file.interactions

Your expansion list is now available in MAIN in csv format (193111_Hs_p1@MFSD2A.csv) and you can provide it as input to the data processing procedure.

Transcriptomic dataset preparation for the data processing procedure

Homo sapiens

Right now, the file fantom_mat.csv is a placeholder for the actual FANTOM-full transcriptomic dataset, a gene@home version of the FANTOM5 transcriptomic dataset. The placeholder must be replaced with the actual dataset for the data processing procedure to work. The FANTOM-full transcriptomic dataset can be downloaded from the Human OneGenE download page. The file needs to be extracted, renamed to fantom_mat.csv and placed in the MAIN folder.

Vitis vinifera

Right now, the file vespucci_mat.csv is a placeholder for the actual VESPUCCI transcriptomic dataset. The placeholder must be replaced with the actual dataset for the data processing procedure to work. The VESPUCCI transcriptomic dataset can be downloaded from the Vitis OneGenE download page. The file needs to be extracted, renamed to vespucci_mat.csv and placed in the MAIN folder.
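
A quick pre-flight check can confirm that the dataset for the chosen organism is in place. This is a minimal sketch, not part of the repository: the file names come from the text above, while treating an empty file as a leftover placeholder is an assumption (the actual placeholder content is not described here). The function name is hypothetical.

```python
from pathlib import Path

# File names taken from the text; keys are the organism codes used on the command line.
DATASET_FILES = {"Hs": "fantom_mat.csv", "Vv": "vespucci_mat.csv"}


def dataset_path(main_dir, organism):
    """Return the expected dataset path, or raise if it is missing or empty."""
    path = Path(main_dir) / DATASET_FILES[organism]
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found")
    if path.stat().st_size == 0:
        # Assumption: an empty file is most likely still the placeholder.
        raise ValueError(f"{path} is empty; it may still be the placeholder")
    return path
```

For example, `dataset_path("MAIN", "Vv")` would return the path to MAIN/vespucci_mat.csv once the VESPUCCI file has been extracted and renamed.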

Input submission to the data processing procedure

To run the data processing procedure, make sure you have a terminal panel open in the MAIN folder and type the following command:

% Rscript install_packages.R

This installs all the necessary R packages that are not already present.

Next, you can type the actual command that runs the data processing procedure:

% Rscript --vanilla data_processing_procedure.R explist.csv organism_type n

The arguments you can provide are:

explist.csv, which corresponds to the annotated expansion list of the gene under investigation. For Vv gene VIT_04s0008g06000 it is 54651_Vv-VIT_04s0008g06000.exp.csv and for Hs gene MFSD2A, it is 193111_Hs_p1@MFSD2A.csv (as explained in input-preparation);
organism_type, which corresponds to the organism to which the gene under investigation belongs;
n, which is either the number of top-ranked genes to select from the expansion list (make sure that n is not greater than the expansion list length) or a relative frequency threshold used to cut the expansion list, keeping only the candidate genes with relative frequency >= n (0 < n <= 1).
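
The two meanings of n can be illustrated with a short Python sketch. It is not the logic of data_processing_procedure.R, whose internals are not shown here: the rule used to tell the two modes apart (values in (0, 1] are thresholds, integers greater than 1 are counts) is an assumption, and the gene identifiers g1, g2, g3 are made up.

```python
def select_candidates(candidates, n):
    """Select genes from a ranked list of (gene_id, relative_frequency) pairs.

    Assumed rule: 0 < n <= 1 is a relative frequency threshold,
    an integer n > 1 is the number of top-ranked genes to keep.
    """
    value = float(n)
    if value <= 0:
        raise ValueError("n must be positive")
    if value <= 1:
        # Threshold mode: keep genes with relative frequency >= n.
        return [gene for gene, frel in candidates if frel >= value]
    if not value.is_integer():
        raise ValueError("n greater than 1 must be a whole number")
    count = int(value)
    if count > len(candidates):
        raise ValueError("n is greater than the expansion list length")
    # Count mode: keep the first n genes of the ranked list.
    return [gene for gene, _ in candidates[:count]]


ranked = [("g1", 0.996), ("g2", 0.80), ("g3", 0.55)]  # hypothetical genes
print(select_candidates(ranked, "0.7"))  # → ['g1', 'g2']
print(select_candidates(ranked, "3"))    # → ['g1', 'g2', 'g3']
```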

Here are some examples:

Vitis vinifera

% Rscript --vanilla data_processing_procedure.R 54651_Vv-VIT_04s0008g06000.exp.csv Vv 0.7

% Rscript --vanilla data_processing_procedure.R 54651_Vv-VIT_04s0008g06000.exp.csv Vv 150

Homo sapiens

% Rscript --vanilla data_processing_procedure.R 193111_Hs_p1@MFSD2A.csv Hs 0.5

% Rscript --vanilla data_processing_procedure.R 193111_Hs_p1@MFSD2A.csv Hs 200

Note: The expansion lists of NFKB1 and TNF, used in the biological validation of the paper, are made available for testing.

Output visualisation in Cytoscape

The data processing procedure has two output files:

a list of edges (gene_edges.csv), which represents the interactions retrieved by pc_parallel() between the surviving input gene nodes. Each edge is split into source and target, with the direction of the interaction given as --- if undirected or --> if directed. The Pearson correlation (cor) computed between the input genes, used as the zero-order conditional independence test, is also provided, along with its sign (cor_sign).

Homo sapiens example p1@MFSD2A_edges.csv

source	interaction	target	cor	cor_sign
T178190	---	T009518	0.455	+
T032201	-->	T054717	0.699	+
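
The edge list can also be inspected outside Cytoscape. The following Python sketch reads an edges table and separates directed from undirected interactions. It assumes the file is comma-separated and that the column names are source, interaction, target, cor and cor_sign, as described above; the delimiter is a parameter because the actual separator is not stated here. The function name is hypothetical.

```python
import csv
import io


def read_edges(handle, delimiter=","):
    """Yield (source, target, directed, cor) tuples from an edges table."""
    for row in csv.DictReader(handle, delimiter=delimiter):
        # "-->" marks a directed edge, "---" an undirected one.
        directed = row["interaction"].strip() == "-->"
        yield row["source"], row["target"], directed, float(row["cor"])


# The two example rows from the table above, written as comma-separated text.
sample = io.StringIO(
    "source,interaction,target,cor,cor_sign\n"
    "T178190,---,T009518,0.455,+\n"
    "T032201,-->,T054717,0.699,+\n"
)
edges = list(read_edges(sample))
directed_edges = [edge for edge in edges if edge[2]]
print(directed_edges)  # → [('T032201', 'T054717', True, 0.699)]
```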

a list of nodes (gene_nodes.csv), which represents the input gene nodes that survived after pc_parallel() application and for which an interaction was found in the output graph. Additional information, extracted from human and grapevine annotation files, is added for the biological interpretation of the results.

These two files are contained in the Vv folder inside the MAIN folder, if a grapevine expansion list was chosen as input to the data processing procedure, or they are contained in the Hs folder inside the MAIN folder, if a human expansion list was chosen as input.

Homo sapiens example p1@MFSD2A_nodes.csv

ID	association_with_transcript	entrezgene_id	hgnc_id	uniprot_id	description	rank	Frel	type
T178190	p4@CDK6	CDK6	1777	Q00534	cyclin dependent kinase 6	14	0.996	gene with protein product
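
Node annotations can be looked up by ID to interpret a given edge. The sketch below indexes a nodes table by its ID column; it assumes a comma-separated file with the column names shown in the example above, and the function name is hypothetical.

```python
import csv
import io


def load_nodes(handle, delimiter=","):
    """Index the rows of a nodes table by their ID column."""
    return {row["ID"]: row for row in csv.DictReader(handle, delimiter=delimiter)}


# The example row from the table above, written as comma-separated text.
sample_nodes = io.StringIO(
    "ID,association_with_transcript,entrezgene_id,hgnc_id,uniprot_id,"
    "description,rank,Frel,type\n"
    "T178190,p4@CDK6,CDK6,1777,Q00534,cyclin dependent kinase 6,14,0.996,"
    "gene with protein product\n"
)
nodes = load_nodes(sample_nodes)
print(nodes["T178190"]["description"])  # → cyclin dependent kinase 6
```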

To visualize the pc_parallel() output graph in Cytoscape, follow these steps:

Open Cytoscape and allow the app to accept incoming network connections;
Select the network icon from the main horizontal toolbar, which stands for Import Network from File System (or from File -> Import -> Network from file) and select the gene_edges.csv file from the Vv folder or Hs folder inside MAIN;
Click the OK button in the Import Network from Table panel and wait for the network to load;
Select the table icon from the main horizontal toolbar, which stands for Import Table from File (or from File -> Import -> Table from file) and select the gene_nodes.csv file from the Vv folder or Hs folder inside MAIN;
Click the OK button in the Import Columns from Table panel and wait for the table to load;
The Vv and Hs folders contain, respectively, a Vv_style.xml and a Hs_style.xml file that can be imported into Cytoscape to customize the network appearance (from File -> Import -> Styles from file...). This feature is managed by the Style panel (under Network in the main vertical toolbar), from which you can select the imported style and visualize the network in a more human-friendly and enriched way.

Camilla9347/JBI-Special-Issue-BITS-2022
