new analysis for heliconia test case #16

chodon opened this issue Sep 17, 2014 · 9 comments

@chodon
Collaborator

chodon commented Sep 17, 2014

After seeing the result, search OpenTree for other Zingiberales with branch lengths, gather character data from MorphoBank, and run on other taxa.

@chodon chodon self-assigned this Sep 17, 2014
@curtislisle
Collaborator

You might try the new Arbor module "Explore Study Trees (Return Tree info) from Taxon Name", which I just added to Arbor. It is an improvement over the previous "Explore Study Trees", which is still available. The table that comes back from this analysis has a column entitled "tree summary" that lists the tree name and, if the tree has branch lengths, what type of scaling is applied. I assumed that branch lengths exist when the tree has a non-empty 'to:branchLengthMode' attribute. That is only an assumption, but it seems to turn up useful trees. Maybe it can help your search.
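
The branch-length check described above could look roughly like the sketch below. This is not Arbor's actual code: it assumes each tree is available as a plain dictionary of attributes, the function names `branch_length_mode` and `tree_summary` are hypothetical, and the attribute key is passed in as a parameter (defaulting to the spelling used in this comment) because the exact key depends on how the study trees are serialized.

```python
def branch_length_mode(tree, key="to:branchLengthMode"):
    """Return the branch length mode string, or None when it is absent or empty."""
    value = tree.get(key)
    if isinstance(value, str) and value.strip():
        return value.strip()
    return None


def tree_summary(name, tree, key="to:branchLengthMode"):
    """Build a 'tree summary' style string: the name plus the scaling, if any."""
    mode = branch_length_mode(tree, key)
    if mode is None:
        return name
    return f"{name} (branch lengths: {mode})"


# Hypothetical trees: one with a mode, one with an empty mode, one without the key.
trees = {
    "tree1": {"to:branchLengthMode": "time"},
    "tree2": {"to:branchLengthMode": ""},
    "tree3": {},
}
summaries = [tree_summary(name, attrs) for name, attrs in trees.items()]
```

Only the first tree is reported as having branch lengths, since an empty attribute is treated the same as a missing one.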

@curtislisle
Collaborator

I am finally getting around to looking at the empty cell analysis problem now. I hope to push fixes to the aggregation steps soon.

@curtislisle
Collaborator

Would it help if we made a version of the aggregation method that accepted a list of columns to ignore? I see the floating point averages of the ID column. I'm sure that is really helpful.... Is this a major thing or a minor annoyance to worry about later?

@chodon
Collaborator Author

chodon commented Sep 18, 2014

No, we can already select columns. The issue is that a column might be
populated, but not for all individuals. It is a minor annoyance. Outside of
Arbor, you can simply sort by column and remove all the unpopulated cells. It
is annoying because you have to loop through and do this for each column,
but it is no big deal for right now.
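
A rough sketch of that outside-of-Arbor workaround, using only the Python standard library. The sample table is hypothetical (the `species` column and all values are made up for illustration; `elevation` and `Lat_y` are column names mentioned elsewhere in this thread), and `populated_values` is an illustrative helper, not part of any tool.

```python
import csv
import io


def populated_values(rows, column):
    """Return the non-empty values of one column, skipping unpopulated cells."""
    values = []
    for row in rows:
        cell = (row.get(column) or "").strip()
        if cell:
            values.append(cell)
    return values


# Hypothetical sample standing in for a table with missing entries.
sample = io.StringIO(
    "species,elevation,Lat_y\n"
    "sp_a,1200,\n"
    "sp_b,,9.5\n"
    "sp_c,800,10.1\n"
)
rows = list(csv.DictReader(sample))

# Repeat the filtering for every column, as in the manual workaround.
per_column = {col: populated_values(rows, col) for col in rows[0].keys()}
```

Each column ends up with its own list of populated values, so columns filled in for different individuals can still be analyzed separately.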

@curtislisle
Collaborator

I have updated the Aggregate Table by Average. It seems to produce better output now on the dataset with missing entries. Try it out, please, when you have a second. Also, there is a basic scatterplot viewer available in Arbor. It is just a tease now, because it doesn't do enough, but you can run aggregation, then go to the visualization tab and select "scatterplot". Pick the aggregated output file and do some plotting: x=Long_x, y=Lat_y gives a map (minus the map), x=Long_x, y=elevation gives elevation as a function of longitude, etc. This plot could be helpful if we joined more of your continuous traits with this matrix. You can at least include a picture of a plot in your standup talk, if you choose.

@curtislisle
Collaborator

Aggregate Table by Max is also tolerant of missing data. However, I noticed that the max value is currently output for cells when there is never any data. That seems wrong. What should we output if there is never any data: zero, "NA", or blank?

@chodon
Collaborator Author

chodon commented Sep 18, 2014

Hooray! I could get Aggregate Table by Average and Max to work on
HeliconiaElev.csv. Thanks!

@chodon
Collaborator Author

chodon commented Sep 18, 2014

I think that NA is the best output for columns when there is never any data.
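
For illustration, an aggregation that tolerates missing cells and emits "NA" when a group never has data might be sketched as below. This is not the Arbor implementation: the `aggregate` and `to_number` helpers, the `species` grouping column, and the sample rows are all hypothetical, and treating any non-numeric cell as missing is an assumption.

```python
def to_number(cell):
    """Parse a cell as a float; return None for blank or non-numeric cells."""
    try:
        return float(cell)
    except (TypeError, ValueError):
        return None


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def aggregate(rows, group_col, value_cols, reducer):
    """Group rows by group_col and reduce each value column, skipping missing cells."""
    groups = {}
    for row in rows:
        per_col = groups.setdefault(row[group_col], {c: [] for c in value_cols})
        for col in value_cols:
            number = to_number(row.get(col))
            if number is not None:
                per_col[col].append(number)
    # Emit "NA" when a group never had any data for a column.
    return {
        key: {col: (reducer(vals) if vals else "NA") for col, vals in per_col.items()}
        for key, per_col in groups.items()
    }


# Hypothetical rows with missing entries.
rows = [
    {"species": "sp_a", "elevation": "1200", "Lat_y": ""},
    {"species": "sp_a", "elevation": "800", "Lat_y": ""},
    {"species": "sp_b", "elevation": "", "Lat_y": "9.5"},
]
by_max = aggregate(rows, "species", ["elevation", "Lat_y"], max)
by_avg = aggregate(rows, "species", ["elevation", "Lat_y"], mean)
```

Passing the reducer as an argument keeps the missing-data handling in one place, so average and max behave the same way for empty groups.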

@curtislisle
Collaborator

I am so happy you were able to run aggregations on some of your data. We will be adding visual displays to Arbor to allow for interactive exploration of tables soon. I want to spend more time developing tools that will help you extract meaning & hopefully answer biology questions. Let’s keep working on this over the coming weeks…
