Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

check_zero() causes sankeyNetwork() and forceNetwork() to fail silently on specific inputs #281

Open
lewinfox opened this issue Apr 10, 2021 · 0 comments

Comments

@lewinfox
Copy link

Hi, thanks for an awesome package!

I've found an issue where assumptions made by check_zero() cause some graph plotting operations to fail silently if you have a 1-indexed rather than 0-indexed dataset for Links . I have a fix ready and will raise a PR for review in a sec.

To reproduce

It's a bit involved so bear with me. I initially found this when using sankeyNetwork() but it also affects forceNetwork().

First we pretend we have some 1-indexed link data:

URL <- paste0('https://cdn.rawgit.com/christophergandrud/networkD3/',
              'master/JSONdata/energy.json')

energy <- jsonlite::fromJSON(URL)

# Simulate 1-indexed data
bad_energy <- energy
bad_energy$links$source <- energy$links$source + 1
bad_energy$links$source <- energy$links$source + 1

Normally we get a warning for this

sankeyNetwork(
    Links = bad_energy$links, 
    Nodes = bad_energy$nodes,
    Source = 'source',
    Target = 'target',
    Value = 'value'
)
#> Warning message:
#> It looks like Source/Target is not zero-indexed. This is required in JavaScript and so your plot may not render. 

However, if we pass in a tbl_df we get no warning and no plot. We do see a message about data frame conversion but nothing about the 1-indexing error, and no plot is rendered.

sankeyNetwork(
    Links = tibble::as_tibble(bad_energy$links), 
    Nodes = bad_energy$nodes,
    Source = 'source',
    Target = 'target',
    Value = 'value'
)
#> Links is a tbl_df. Converting to a plain data frame.

This is caused by the way Links is subset in the call to check_zero():

# R/sankeyNetwork.R:82
check_zero(Links[, Source], Links[, Target])

The problem arises when Links is a tbl_df because [.tbl_df returns a tbl_df even if only one column is selected, unlike [.data.frame. This breaks check_zero() because it assumes its inputs are vectors.

We can get the same effect by omitting Source or Target - when these are missing the subsetting operation returns the entire data frame, which again breaks check_zero().

# Both of these calls produce no plot and no warning

sankeyNetwork(
    Links = bad_energy$links, 
    Nodes = bad_energy$nodes,
    # Source = 'source',
    Target = 'target',
    Value = 'value'
)

sankeyNetwork(
    Links = bad_energy$links, 
    Nodes = bad_energy$nodes,
    Source = 'source',
    # Target = 'target',
    Value = 'value'
)

Fix

I've created a PR to fix this (basically just move check_zero() after input checking and tbl_df_strip()) which I will link in a sec.

Hope this is helpful, thanks again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant