View_corr_mat() - make new || leave old #149

Open
wingedRuslan opened this issue Aug 20, 2019 · 5 comments

@wingedRuslan
Collaborator

Heya!

I'm opening this issue to decide what to do with view_corr_mat and, in part, to address issue #148.

view_corr_mat currently exists in two places:

  • make_figures.py - loads the corr mat from a file, plots it, and saves the output.

  • scripts/visualisation_commands.py - an almost identical function, but the better of the two: PR #108 updated it so that it now accepts corr_mat as a dataframe object, a numpy array, or a path to a file containing the corr_mat.
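To illustrate the three input types mentioned above (dataframe, numpy array, file path), here is a minimal sketch of how such an input could be normalised before plotting. The helper name `load_corr_mat` is hypothetical and this is not the code from PR #108; in particular, the file format (whitespace-delimited text) is an assumption.

```python
from pathlib import Path

import numpy as np
import pandas as pd


def load_corr_mat(corr_mat):
    """Return corr_mat as a square 2-D float array.

    Hypothetical helper for illustration only, not the project's function.
    """
    if isinstance(corr_mat, pd.DataFrame):
        M = corr_mat.to_numpy()
    elif isinstance(corr_mat, np.ndarray):
        M = corr_mat
    elif isinstance(corr_mat, (str, Path)):
        # Assumption: the file is whitespace-delimited text with numbers only.
        M = np.loadtxt(corr_mat)
    else:
        raise TypeError(
            "corr_mat must be a DataFrame, a NumPy array, or a path to a file"
        )

    M = np.asarray(M, dtype=float)
    if M.ndim != 2 or M.shape[0] != M.shape[1]:
        raise ValueError("corr_mat must be a square 2-D matrix")
    return M
```

Doing the conversion once at the top keeps the plotting code itself independent of how the matrix was supplied.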

My suggestion is to redesign view_corr_mat so that it uses seaborn.heatmap().
Why? From my perspective, it is better to rely on existing solutions, and seaborn.heatmap has a wide variety of parameters, so it gives a lot of flexibility to adjust the plot to your preferences.

seaborn docs

Here are comparisons (saved plot outputs) between the existing function and the potential new one:

  1. Existing function, default call:
    [image: KirstieCode]

  2. Seaborn, recreating the same plot:
    [image: TheSame]

Using a new colormap, 'YlGnBu':

  1. Existing function:
    [image: Attempt3_Kirstie]
  2. Seaborn version:
    [image: Attempt3]

To sum up, with seaborn.heatmap we could produce the same figures as with the existing view_corr_mat and, as a bonus, get a lot more nice features (read: function parameters).

@KirstieJane
Member

Thanks @wingedRuslan - can you add an example of the additional functions that seaborn heatmap gives us? I personally prefer the current version because I like the black line around the outside of the plot, and I can't think of options that heatmap will give us that are useful in this case. Happy to be wrong though!

@wingedRuslan
Collaborator Author

@KirstieJane, if the black line around the outside of the plot is the only reason, I could add this feature to the new function as well :)

You did this with the following lines of code:

    # Put a box around your data
    # (assumes `import matplotlib.patches as mpatches` and an existing Axes `ax`)
    ax.add_patch(
        mpatches.Rectangle(
            (ax.get_xlim()[0], ax.get_ylim()[1]),
            ax.get_xlim()[1],
            ax.get_ylim()[0],
            fill=False,      # remove background
            color='k',
            linewidth=1,
        )
    )
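For the seaborn version, one possible way to get the same black outline is to switch the axes spines back on, since seaborn.heatmap hides them by default. This is only a sketch of one option, not the code from the existing function; the helper name `add_outline`, the random test matrix, and the 'RdBu_r' colormap are assumptions made for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so this runs without a display

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns


def add_outline(ax, color="k", linewidth=1):
    """Draw a box around the plot by re-enabling all four axes spines."""
    for spine in ax.spines.values():
        spine.set_visible(True)
        spine.set_color(color)
        spine.set_linewidth(linewidth)
    return ax


# Illustrative data only: correlation matrix of random columns.
rng = np.random.default_rng(1)
M = np.corrcoef(rng.normal(size=(40, 6)), rowvar=False)

fig, ax = plt.subplots()
sns.heatmap(M, vmin=-1, vmax=1, cmap="RdBu_r", square=True, cbar=False, ax=ax)
add_outline(ax)
```

Using the spines avoids computing rectangle coordinates from the axis limits, at the cost of being slightly different from the Rectangle patch approach quoted above.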

@wingedRuslan
Collaborator Author

@KirstieJane, the only difference (that we care about) is the improved colorbar handling:

  • In the existing version, the colorbar range is hardcoded to [-1, 0, 1]. That's why only a diverging colormap can sensibly be used to plot corr_mat; you can see the output above for what happens with, for instance, a sequential colormap.
  • In the seaborn version, we can adjust the colormap with these parameters (descriptions quoted from the seaborn docs):
    vmin, vmax - Values to anchor the colormap, otherwise they are inferred from the data and other keyword arguments.
    cbar - Whether to draw a colorbar (although it would not be a problem to add this parameter to the existing version).
    cbar_kws - Keyword arguments for fig.colorbar, used to adjust colorbar properties.
    I used cbar_kws to set custom labels on the colorbar (-1, 0, 1) and recreate the same plot as the existing version.
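As a rough sketch of how these parameters fit together, the snippet below plots a correlation matrix with seaborn.heatmap, anchoring the colormap with vmin/vmax and setting the colorbar ticks through cbar_kws. The function name `plot_corr_heatmap`, the default 'RdBu_r' colormap, and the random test data are assumptions for illustration; this is not the proposed replacement for view_corr_mat.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so this runs without a display

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns


def plot_corr_heatmap(M, cmap="RdBu_r", vmin=-1.0, vmax=1.0, ax=None):
    """Plot matrix M as a heatmap with an explicitly anchored colorbar."""
    if ax is None:
        _, ax = plt.subplots(figsize=(6, 5))
    sns.heatmap(
        M,
        cmap=cmap,
        vmin=vmin,            # lower anchor of the colormap
        vmax=vmax,            # upper anchor of the colormap
        square=True,
        xticklabels=False,
        yticklabels=False,
        cbar=True,
        # Passed through to fig.colorbar: ticks at both ends and the midpoint.
        cbar_kws={"ticks": [vmin, (vmin + vmax) / 2, vmax]},
        ax=ax,
    )
    return ax


# Illustrative data only: correlation matrix of random columns.
rng = np.random.default_rng(0)
M = np.corrcoef(rng.normal(size=(50, 8)), rowvar=False)

ax_div = plot_corr_heatmap(M)                                   # diverging, [-1, 1]
ax_seq = plot_corr_heatmap(M, cmap="YlGnBu", vmin=0.0, vmax=1.0)  # sequential, [0, 1]
```

With the range exposed as arguments, a sequential colormap such as 'YlGnBu' can be paired with a range that suits it instead of the fixed [-1, 1].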

@wingedRuslan
Collaborator Author

Link to the existing function, view_corr_mat: here

@KirstieJane
Member

KirstieJane commented Aug 21, 2019

Thanks for the link @wingedRuslan! I like that function actually 😄

This is the code I was thinking of that's at the top of the tutorials:

# Read in sample data from the NSPN WhitakerVertes PNAS 2016 paper.
df, names, covars, centroids = datasets.NSPN_WhitakerVertes_PNAS2016.import_data()

# calculate residuals of the matrix df for the columns of names
df_res = scn.create_residuals_df(df, names, covars)

# create a correlation matrix over the columns of df_res
M = scn.create_corrmat(df_res, method='pearson')

# Initialise a weighted graph G from the correlation matrix M
G = scn.BrainNetwork(network=M, parcellation=names, centroids=centroids)

# Threshold G at cost 10 to create a binary graph with 10% as many edges as the complete graph G.
G10 = G.threshold(10)

M is the correlation matrix.

The correct way to threshold the graph is using the G.threshold() command because it makes a minimum spanning tree to make sure that every node is connected for any threshold (cost). BUT for visualization purposes that isn't super important, so we can just threshold using the np.percentile function as in the current code.

To be clear - we do not need to pass a graph G to the function. We can stick with giving it M or a dataframe etc.

The bin option is so that you can visualize the thresholded network as a (usually) black and white correlation matrix. It just sets all the values below the threshold to 0 and all those above to 1. This is what we do when we calculate the global measures.
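A minimal sketch of the percentile-based thresholding and binarising described above, for visualization only. The function name `binarise_at_cost` is hypothetical, and two details are assumptions rather than the behaviour of the current code: the percentile is taken over every entry of the matrix (diagonal included), and values exactly equal to the threshold are set to 1.

```python
import numpy as np


def binarise_at_cost(M, cost):
    """Keep roughly the top `cost` percent of entries as 1, set the rest to 0.

    Unlike G.threshold(), this does not build a minimum spanning tree, so
    it does not guarantee that every node stays connected.
    """
    M = np.asarray(M, dtype=float)
    if not 0 < cost <= 100:
        raise ValueError("cost must be in the interval (0, 100]")
    # Value below which (100 - cost) percent of the entries fall.
    threshold = np.percentile(M, 100 - cost)
    return (M >= threshold).astype(int)


# Illustrative matrix with 100 distinct values from 0.00 to 0.99.
M = np.arange(100).reshape(10, 10) / 100.0
B = binarise_at_cost(M, cost=10)
```

The result contains only 0s and 1s, so it can be shown with a two-colour (black and white) colormap.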

So the current code looks great. Let's add vmin and vmax in whatever way you think is best (seaborn or adding those options to the current code), and maybe add a symmetric option (or whatever you used for the other colormaps) to force vmin to be -vmax or vmax to be -vmin, depending on which gives the largest range 😄
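One way the symmetric option could work is sketched below: take whichever of vmin and vmax has the larger absolute value and mirror it around zero. The function name `symmetric_limits` and the fallback to the data's own min and max when a limit is not given are assumptions, not an agreed design.

```python
import numpy as np


def symmetric_limits(M, vmin=None, vmax=None):
    """Return (vmin, vmax) mirrored around zero, using the larger magnitude."""
    M = np.asarray(M, dtype=float)
    # Assumption: fall back to the data range when a limit is not supplied.
    lo = float(np.nanmin(M)) if vmin is None else float(vmin)
    hi = float(np.nanmax(M)) if vmax is None else float(vmax)
    bound = max(abs(lo), abs(hi))
    return -bound, bound


# Illustrative matrix: diagonal of 1.0, off-diagonal of -0.2.
M = np.array([[1.0, -0.2], [-0.2, 1.0]])
limits_from_data = symmetric_limits(M)
limits_from_args = symmetric_limits(M, vmin=-0.2, vmax=0.7)
```

The returned pair can then be passed on as vmin and vmax to whichever plotting call is used, so that zero always sits at the centre of a diverging colormap.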

Thanks for checking!

2 participants