get_table_columns does not get all columns #98

Open
kozo2 opened this issue Dec 13, 2022 · 9 comments

@kozo2
Member

kozo2 commented Dec 13, 2022

I tried p4c.get_table_columns(table='node', network=1162) on the second network in this cys file.
https://www.dropbox.com/s/r9s09p4p54naw7z/test.cys?dl=0
But it didn't return all of the columns.
It returned only the first 10 columns.
Please let me know if you have any ideas about this.
I'm using Python 3.10.8 + py4cytoscape 1.6.0 on Windows 11.

@bdemchak
Collaborator

Hi, Kozo ... I get 28 columns when I do this using Python 3.10.4 + py4cytoscape 1.6.0 + Windows 10. Also, I'm using Pandas version 1.4.2. All is fine on my system.

Would you mind trying a few things?

  1. Call p4c.get_table_column_types on this network/table ... good to see how many columns you get.

  2. In Swagger (via Help | Automation | CyREST API), try /v1/networks/{networkId}/tables/defaultnode/columns ... good to see what Cytoscape reports.

Has this failure been happening long? Do you know what changed between now and the last time this worked?

Thanks!

@kozo2
Member Author

kozo2 commented Dec 15, 2022

Hi Barry,
I'm sorry.
The steps I gave you do return all the columns, as you say.
That's because I only gave you part of the steps needed to reproduce the problem.
I will write up a more detailed description of how to reproduce it.
Please give me some time.

@bdemchak
Collaborator

bdemchak commented Dec 15, 2022 via email

@kozo2
Member Author

kozo2 commented Dec 15, 2022

Thanks Barry.
Please ignore the cys file.
Below is the answer to your question number 1, for the case where p4c.get_table_columns doesn't work.

  1. Call p4c.get_table_column_types on this network/table ... good to see how many columns you get.

It returns all 28 columns.
The screenshots are below.

image
image

I passed the keys() of the p4c.get_table_column_types result as the columns= argument of get_table_columns.
But it did not return the 28 columns.

image
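
For anyone following along without the screenshots, the comparison described above could be sketched roughly like this. The helper names `missing_columns` and `check_live` are hypothetical, and `check_live` assumes Cytoscape with CyREST is running locally; only `p4c.get_table_column_types` and `p4c.get_table_columns` come from the thread.

```python
def missing_columns(expected_names, returned_names):
    """Return the expected column names that are absent from the result, in order."""
    returned = set(returned_names)
    return [name for name in expected_names if name not in returned]


def check_live(network):
    """Compare the two py4cytoscape calls (hypothetical helper; needs a running Cytoscape)."""
    import py4cytoscape as p4c  # imported lazily so the helper above works without it

    types = p4c.get_table_column_types(table='node', network=network)
    df = p4c.get_table_columns(table='node', columns=list(types.keys()), network=network)
    return missing_columns(types.keys(), df.columns)
```

If `check_live(1162)` returned a non-empty list, those would be the columns that `get_table_columns` dropped. It may also raise an error instead of returning, depending on the column contents.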

@kozo2
Member Author

kozo2 commented Dec 15, 2022

And this is the answer to your question number 2.

In Swagger (via Help | Automation | CyREST API), try /v1/networks/{networkId}/tables/defaultnode/columns ... good to see what Cytoscape reports.

Curl

curl -X GET --header 'Accept: application/json' 'http://localhost:1234/v1/networks/1162/tables/defaultnode/columns'

Request URL

http://localhost:1234/v1/networks/1162/tables/defaultnode/columns

Response Body

[
  {
    "name": "SUID",
    "type": "Long",
    "immutable": true,
    "primaryKey": true
  },
  {
    "name": "shared name",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "name",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "selected",
    "type": "Boolean",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_X",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_Y",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_WIDTH",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_HEIGHT",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_LABEL",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_LABEL_LIST_FIRST",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_LABEL_LIST",
    "type": "List",
    "listType": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_ID",
    "type": "List",
    "listType": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_LABEL_COLOR",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_FILL_COLOR",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_REACTIONID",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_TYPE",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_SHAPE",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_LINK",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "row.names",
    "type": "Integer",
    "immutable": false,
    "primaryKey": false
  },
  {
    "name": "x_location",
    "type": "String",
    "immutable": false,
    "primaryKey": false
  },
  {
    "name": "y_location",
    "type": "String",
    "immutable": false,
    "primaryKey": false
  },
  {
    "name": "source",
    "type": "Integer",
    "immutable": false,
    "primaryKey": false
  },
  {
    "name": "target",
    "type": "Integer",
    "immutable": false,
    "primaryKey": false
  },
  {
    "name": "orig_edge_SUID",
    "type": "String",
    "immutable": false,
    "primaryKey": false
  },
  {
    "name": "shared interaction",
    "type": "String",
    "immutable": false,
    "primaryKey": false
  },
  {
    "name": "interaction",
    "type": "String",
    "immutable": false,
    "primaryKey": false
  },
  {
    "name": "KEGG_REACTION_TYPE",
    "type": "String",
    "immutable": false,
    "primaryKey": false
  },
  {
    "name": "KEGG_REACTION_GENE",
    "type": "String",
    "immutable": false,
    "primaryKey": false
  }
]
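
A response body like the one above can be checked quickly in Python instead of counting by eye. This is only a sketch: `column_names` and `list_typed_columns` are hypothetical helper names, and `SAMPLE_BODY` is a shortened copy of three entries from the response, not the full output.

```python
import json

# Shortened sample: three entries copied from the response body above.
SAMPLE_BODY = """
[
  {"name": "SUID", "type": "Long", "immutable": true, "primaryKey": true},
  {"name": "KEGG_NODE_LABEL_LIST_FIRST", "type": "String", "immutable": true, "primaryKey": false},
  {"name": "KEGG_NODE_LABEL_LIST", "type": "List", "listType": "String", "immutable": true, "primaryKey": false}
]
"""


def column_names(response_body):
    """Return the column names in the order CyREST reported them."""
    return [col["name"] for col in json.loads(response_body)]


def list_typed_columns(response_body):
    """Return the names of columns whose type is List."""
    return [col["name"] for col in json.loads(response_body) if col["type"] == "List"]


names = column_names(SAMPLE_BODY)
print(len(names), names)
print(list_typed_columns(SAMPLE_BODY))
```

Run against the full response body, `len(column_names(...))` gives the number of columns Cytoscape reports, and `list_typed_columns(...)` shows which of them are List columns.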

@kozo2
Member Author

kozo2 commented Dec 15, 2022

I think I found the key to the problem. (See the attached image.)
It seems that if a column has no elements, it stops fetching the subsequent columns.

image

@kozo2
Member Author

kozo2 commented Dec 15, 2022

When I skipped the 10th column, I got a strange result.
I could not get a data frame.
image

@bdemchak
Collaborator

bdemchak commented Dec 15, 2022 via email

@bdemchak
Collaborator

Hi, Kozo ...

Finally, I'm able to look at this closely. Sorry for the delay.

The problem I see is with the error message "Column "KEGG_NODE_LABEL_LIST" has only 3321 elements, but should have 4637"

For background, get_table_columns() fetches a list of SUIDs for the table, and it uses the length of the list to determine how many data elements each column should have.

Then, for each column, all column values are fetched. When CyREST returns the values, it returns them as a simple list, and doesn't tag the values with their SUIDs. So, if there are fewer values than SUIDs, there's no way to know which values go with which SUIDs. That's why get_table_columns() gives an error when it finds a column that doesn't have the same number of values as SUIDs.
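
To make the reasoning above concrete, here is a rough sketch of that kind of length check. It is an illustration only, not the actual py4cytoscape code; `build_table` and its arguments are hypothetical names.

```python
def build_table(suids, columns):
    """Pair each column's values with SUIDs by position.

    suids:   list of row SUIDs, in the order the server returned them
    columns: dict mapping column name -> list of values (no SUID tags)
    """
    table = {}
    for name, values in columns.items():
        # Values carry no SUIDs, so position is the only link between a value
        # and its row. A shorter list cannot be aligned safely.
        if len(values) != len(suids):
            raise ValueError(
                f'Column "{name}" has only {len(values)} elements, '
                f'but should have {len(suids)}'
            )
        table[name] = dict(zip(suids, values))
    return table


# Three rows, but one column arrives with a value missing.
suids = [101, 102, 103]
print(build_table(suids, {"name": ["a", "b", "c"]}))
try:
    build_table(suids, {"KEGG_NODE_LABEL_LIST": [["C20408"], ["C00022"]]})
except ValueError as err:
    print(err)
```

With two values for three SUIDs, there is no way to tell which row lost its value, so raising is safer than guessing.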

From the screen shot, it looks like you're querying the "Metabolic pathways [rno01100]" table, which has 4637 nodes. It also looks like you're asking for the KEGG_NODE_LABEL_LIST column, whose datatype is List of Strings. Looking at the table with Cytoscape, I see that some values are empty ([]) and some are not (e.g., [C20408]).

Am I seeing this right?

When I try get_table_columns() with this data, I see what I expect ... that the KEGG_NODE_LABEL_LIST column contains 4637 values, where each value is a Python list, and each list has a single element (e.g., [''] or ['C20408']). All good.

The error you're reporting indicates that only 3321 values were found. In my debugging, I see all 4637 values. So, I don't get the same result as you do.

I do notice that if I remove all of the empty values (i.e., ['']), there are 3321 values remaining, which matches what the error is reporting.

So, somehow, the empty values are being eliminated either in get_table_columns() or before.

The quickest test would be to use Swagger to fetch the values and see what CyREST is actually returning. Use the
GET /v1/networks/{networkId}/tables/defaultnode/columns/KEGG_NODE_LABEL_LIST function for this.

The CURL for this would be:

curl -X GET --header 'Accept: application/json' 'http://localhost:1234/v1/networks/124/tables/defaultnode/columns/KEGG_NODE_LABEL_LIST'

... where {networkId} or "124" is the SUID for the "Metabolic pathways [rno01100]" network. (You can use Cytoscape to find the network SUID by selecting the Network Table, then adding the "SUID" column to the visible column list, and then seeing the SUID value in the table.)

I would want to know all of the values being returned, and am especially interested in knowing whether there are 3321 values or 4637 values. And I'm also very interested to know whether the empty values are actually [''] or something else.
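
If it's easier than reading the Swagger output, something like the following could do the counting. It is only a sketch under assumptions: `fetch_column_values` and `summarize` are hypothetical helpers, the base URL is the default from the curl command above, and I'm assuming the response is either a JSON object with a "values" list or a bare JSON list.

```python
import json
from collections import Counter
from urllib.parse import quote
from urllib.request import urlopen


def fetch_column_values(network_suid, column, base_url="http://localhost:1234/v1"):
    """Fetch one node-table column from CyREST (needs a running Cytoscape)."""
    url = f"{base_url}/networks/{network_suid}/tables/defaultnode/columns/{quote(column)}"
    with urlopen(url) as resp:
        body = json.load(resp)
    # Assumption: values sit under "values"; fall back to a bare list otherwise.
    return body["values"] if isinstance(body, dict) else body


def summarize(values):
    """Count the values and record exactly how the empty ones are represented."""
    empty_forms = Counter(repr(v) for v in values if v in (None, "", [], [""]))
    n_empty = sum(empty_forms.values())
    return {
        "total": len(values),
        "non_empty": len(values) - n_empty,
        "empty_forms": dict(empty_forms),
    }


# Offline example with made-up values, so no Cytoscape is needed to try it.
print(summarize([["C20408"], [""], [], None, ["C00022"]]))
```

Against a live session, `summarize(fetch_column_values(124, "KEGG_NODE_LABEL_LIST"))` would show whether the total is 3321 or 4637 and which empty forms appear.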

Thanks!
