creating a matrix #15

Open
mictadlo opened this issue Nov 14, 2017 · 10 comments

Comments

@mictadlo

Unfortunately, I do not understand how to create a matrix for my data. It is split across two JSON files. The first one describes the label, color, and size of each segment of the circle:

[
  {
    "color": "#996600", 
    "id": "chr03", 
    "len": 35020413, 
    "label": "chr03"
  }, 
  {
    "color": "#666600", 
    "id": "tig00007144", 
    "len": 40868, 
    "label": "tig00007144"
  }, 
  {
    "color": "#666600", 
    "id": "tig00026480", 
    "len": 95961, 
    "label": "tig00026480"
  },
...
]

The second file describes the relationships between the labels in the chord chart:

[
  {
    "source": {
      "start": 30824, 
      "end": 23113, 
      "id": "tig00007144"
    }, 
    "target": {
      "start": 33203431, 
      "end": 33211142, 
      "id": "chr03"
    }
  }, 
  {
    "source": {
      "start": 48387, 
      "end": 1, 
      "id": "tig00026480"
    }, 
    "target": {
      "start": 35010628, 
      "end": 34962190, 
      "id": "chr03"
    }
  }, 
...
]

How do I convert the above relationship file to a proper matrix?

Thank you in advance.

@mattflor
Owner

Well, as this is an R wrapper around some JavaScript code, you need to provide your data as an R matrix. Here's some code from the README that should give you an idea (think of 'have' as the source and of 'prefer' as the target):

m <- matrix(c(11975,  5871, 8916, 2868,
              1951, 10048, 2060, 6171,
              8010, 16145, 8090, 8045,
              1013,   990,  940, 6907),
            byrow = TRUE,
            nrow = 4, ncol = 4)
haircolors <- c("black", "blonde", "brown", "red")
dimnames(m) <- list(have = haircolors,
                    prefer = haircolors)
m
#>         prefer
#> have     black blonde brown  red
#>   black  11975   5871  8916 2868
#>   blonde  1951  10048  2060 6171
#>   brown   8010  16145  8090 8045
#>   red     1013    990   940 6907

@mictadlo
Author

Hi, thank you for your example, but I still do not know how to convert my data to the required matrix.

@mattflor
Owner

Start by converting your JSON files to data frames, e.g. using the jsonlite R package.
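As a rough illustration of that first step, here is a minimal sketch that reads the two links from the question into a data frame. It assumes the jsonlite package is installed; the JSON is passed as a string only to keep the example self-contained, and the variable names `links_json` and `links` are made up for this sketch.

```r
library(jsonlite)

# Two links copied from the question. In practice you would pass the path of
# your own JSON file to fromJSON() instead of a string.
links_json <- '[
  {"source": {"start": 30824, "end": 23113, "id": "tig00007144"},
   "target": {"start": 33203431, "end": 33211142, "id": "chr03"}},
  {"source": {"start": 48387, "end": 1, "id": "tig00026480"},
   "target": {"start": 35010628, "end": 34962190, "id": "chr03"}}
]'

# flatten = TRUE turns the nested source/target objects into plain columns
# such as source.id and target.start.
links <- fromJSON(links_json, flatten = TRUE)
str(links)
```

The same call should work for the first file (the one with `color`, `id`, `len`, and `label`), which is already flat.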

@mictadlo
Author

Without any coding, what would the matrix look like for the two JSON files above?

@mattflor
Owner

You may want to look at the d3 chord layout: https://github.com/d3/d3-chord

[...] each matrix[i][j] represents the flow from the ith node in the network to the jth node. Each number matrix[i][j] must be nonnegative, though it can be zero if there is no flow from node i to node j.

I.e. the chord layout handles everything automatically: it calculates total node/group sizes and chord start and end positions, so you don't specify those explicitly...
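To make the matrix[i][j] description concrete, here is a hedged sketch of one way the two links from the question could be turned into a flow matrix in base R. The choice of "flow" is an assumption made for illustration: it uses the aligned length on the source sequence, which is not something the thread prescribes. Note that such a matrix records only how much flows between two ids, not where on the sequence the alignment sits.

```r
# Links from the question, written out by hand as a data frame.
links <- data.frame(
  source_id    = c("tig00007144", "tig00026480"),
  source_start = c(30824, 48387),
  source_end   = c(23113, 1),
  target_id    = c("chr03", "chr03"),
  stringsAsFactors = FALSE
)

# Assumption: the "flow" of a link is its aligned length on the source.
links$weight <- abs(links$source_end - links$source_start) + 1

# One row and one column per id; cells without a link stay at zero.
ids <- c("chr03", "tig00007144", "tig00026480")
m <- matrix(0, nrow = length(ids), ncol = length(ids),
            dimnames = list(source = ids, target = ids))

# Add each link's weight to the cell [source, target].
for (k in seq_len(nrow(links))) {
  i <- links$source_id[k]
  j <- links$target_id[k]
  m[i, j] <- m[i, j] + links$weight[k]
}
m
```

With this weighting, the only nonzero cells are the two contig rows in the `chr03` column.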

@mictadlo
Author

Thank you for your link, but I still do not understand what the matrix should look like for my data.

@mattflor
Owner

Sorry, but please show me some code for what you have tried so far, and explain in more detail what your data means. Otherwise your questions are not specific enough to answer.

@mictadlo
Author

mictadlo commented Nov 17, 2017

One of my files is the reference file (chr03), which contains one long sequence (35020413 characters). The other file contains multiple sequences with different lengths and different IDs (tig00007144, tig00026480, ...). I used the alignment software BLAST to align the sequences in the second file against the reference. The original alignment results are stored in a tab-separated file:

tig00007144	chr03	23113	30824	33203431	33211142
tig00026480	chr03	1	48387	35010628	34962190
tig00003221	chr03	16916	29961	2127862	2140878
tig00010111	chr03	218	6989	23106738	23113500
tig00000318	chr03	1	18244	28621116	28639312
tig00009327	chr03	32147	40878	34160279	34151526
tig00025208	chr03	65878	79311	17006900	17020370
tig00019172	chr03	43720	50583	23113500	23106638
tig00004923	chr03	44154	50849	21159875	21153164

This is the column explanation:

  • query_name = column 0
  • subject_name = column 1
  • query_start = column 2
  • query_end = column 3
  • subject_start = column 4
  • subject_end = column 5
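
The column layout above can be applied when loading the alignment table into R. The following is a minimal base R sketch using the first three rows shown earlier; the variable names (`blast_txt`, `hits`, `subject_span`) are invented for this example, and in practice you would pass the path of your own tab-separated file instead of `text =`.

```r
# First three rows of the alignment table, with explicit tab separators.
blast_txt <- "tig00007144\tchr03\t23113\t30824\t33203431\t33211142
tig00026480\tchr03\t1\t48387\t35010628\t34962190
tig00003221\tchr03\t16916\t29961\t2127862\t2140878"

# Column names follow the list above (columns 0 to 5).
hits <- read.delim(
  text = blast_txt,
  header = FALSE,
  col.names = c("query_name", "subject_name",
                "query_start", "query_end",
                "subject_start", "subject_end"),
  stringsAsFactors = FALSE
)

# Length of each hit on chr03; abs() covers rows where start > end.
hits$subject_span <- abs(hits$subject_end - hits$subject_start) + 1
hits
```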

I would like to see where each of these sequences (tig00007144, tig00026480, ...) mapped on chr03.

Thank you in advance.

@mattflor
Owner

I'm afraid the chorddiag package is not suited for your purpose. You may be able to misuse it to some (but certainly not satisfactory) degree. I would suggest you look for another tool.

@jelias1

jelias1 commented Dec 20, 2017

Mictadlo, your data reminded me of an example in the RCircos package. Here is some documentation on that package.
