duplicate 'row.names' are not allowed. when by_species = TRUE #21

Open
rossmounce opened this issue Feb 5, 2016 · 2 comments

@rossmounce

I have a large list of genera to look up, and I (knowingly) have lots of duplicates. I don't want it all uniq'd down to unique genera only.

If the list starts with "Aa" first and "Aa" second, I want two separate rows in the output, e.g.:

  genus      family       order       group
1    Aa Orchidaceae Asparagales Angiosperms
2    Aa Orchidaceae Asparagales Angiosperms

But instead it just throws an error and doesn't output any table. I can only get an output table with by_species=FALSE, and it only has ~12,000 rows in it (not what I want).

str(A)
 chr [1:337444] "Aa" "Aa" "Aaronsohnia" "Narthecium" "Abarema" ...

lookup_table(A,missing_action="NA",by_species=TRUE)
Error in `row.names<-.data.frame`(`*tmp*`, value = value) : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘Aa’, ‘Abarema’, ‘Abelia’, ‘Abeliophyllum’, ‘Abelmoschus’, ‘Abies’, ‘Abrodictyum’, ‘Abroma’, ‘Abronia’, ‘Abrophyllum’, ‘Abrotanella’, ‘Abrus’, ‘Abuta’, ‘Abutilon’, ‘Acacia’, ‘Acaciella’, ‘Acaena’, ‘Acalypha’, ‘Acampe’, ‘Acamptopappus’, ‘Acanthephippium’, ‘Acanthocalycium’, ‘Acanthocalyx’, ‘Acanthocarpus’, ‘Acanthocereus’, ‘Acantholimon’, ‘Acantholippia’, ‘Acantholobivia’, ‘Acanthomintha’, ‘Acanthopale’, ‘Acanthopanax’, ‘Acanthophoenix’, ‘Acanthophyllum’, ‘Acanthoprasium’, ‘Acanthopsis’, ‘Acanthorhipsalis’, ‘Acanthorrhiza’, ‘Acanthoscyphus’, ‘Acanthosicyos’, ‘Acanthospermum’, ‘Acanthostachys’, ‘Acanthosyris’, ‘Acanthus’, ‘Acaulimalva’, ‘Acca’, ‘Acer’, ‘Aceratium’, ‘Achatocarpus’, ‘Achetaria’, ‘Achillea’, ‘Achimenes’, ‘Achlys’, ‘Achnather [... truncated] 

Is it possible to coerce it to the style of output I desire, outputting 'dumbly' for each and every input name, even if there are duplicates?

@richfitz
Member

richfitz commented Feb 5, 2016

Hi Ross. I appreciate the bug reports, but could you do us a favour and provide a minimal reproducible example? (i.e., 4-5 lines that we can run from beginning to end to recreate your problem -- see here for more information).

I'm sure we can reverse engineer the issues you've found but it just makes it a lot easier, and therefore keeps it a little higher in the list of things to do.

@rossmounce
Author

Sure.

Small example of the feature request is below:

plant_lookup_version_current()
[1] "1.1.1"
lookup_table(c("Aa","Aa","Aaronsohnia","Narthecium","Abarema","Abarema","Abarema"),missing_action="NA")
        genus        family        order       group
1          Aa   Orchidaceae  Asparagales Angiosperms
2 Aaronsohnia    Asteraceae    Asterales Angiosperms
3  Narthecium Nartheciaceae Dioscoreales Angiosperms
4     Abarema      Fabaceae      Fabales Angiosperms

I would expect/want this output instead:

        genus        family        order       group
1          Aa   Orchidaceae  Asparagales Angiosperms
2          Aa   Orchidaceae  Asparagales Angiosperms
3 Aaronsohnia    Asteraceae    Asterales Angiosperms
4  Narthecium Nartheciaceae Dioscoreales Angiosperms
5     Abarema      Fabaceae      Fabales Angiosperms
6     Abarema      Fabaceae      Fabales Angiosperms
7     Abarema      Fabaceae      Fabales Angiosperms
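
One possible way to get this shape today is to look up the unique names once and then expand the result with match(). The following is only a minimal sketch in base R: the `tab` data frame is a hand-built stand-in copied from the output above so that it runs without the package, and `expand_lookup` is a hypothetical helper, not part of the package.

```r
# Stand-in for the de-duplicated table shown above (assumption: the real
# lookup result has a `genus` column, as in that output).
tab <- data.frame(
  genus  = c("Aa", "Aaronsohnia", "Narthecium", "Abarema"),
  family = c("Orchidaceae", "Asteraceae", "Nartheciaceae", "Fabaceae"),
  order  = c("Asparagales", "Asterales", "Dioscoreales", "Fabales"),
  group  = rep("Angiosperms", 4),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one output row per input name, in input order,
# duplicates included.
expand_lookup <- function(names, tab) {
  idx <- match(names, tab$genus)   # NA where a name is not in the table
  out <- tab[idx, , drop = FALSE]
  out$genus <- names               # keep the queried name even when unmatched
  rownames(out) <- NULL            # plain 1..n row names, so duplicates are fine
  out
}

x <- c("Aa", "Aa", "Aaronsohnia", "Narthecium", "Abarema", "Abarema", "Abarema")
res <- expand_lookup(x, tab)
print(res)
```

With the package loaded, `tab` could presumably be built as lookup_table(unique(x), missing_action="NA") instead, assuming it returns the columns shown above.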

As for the bug, I've uploaded a smaller set of 1000 names to a GitHub gist to enable reproduction:

testnames <- readLines("https://gist.githubusercontent.com/rossmounce/fcac3b61324f1dcf721e/raw/5665d9de08907f6dd54898f2fbcb52a62d0279d0/A%2520list%2520of%2520plant%2520genera", warn=FALSE)
zzz <- lookup_table(testnames,missing_action="NA",by_species=TRUE)
Error in `row.names<-.data.frame`(`*tmp*`, value = value) : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘Aa’, ‘Abarema’, ‘Abelia’, ‘Abeliophyllum’, ‘Abelmoschus’, ‘Abies’, ‘Abrodictyum’, ‘Abroma’, ‘Abronia’, ‘Abrophyllum’, ‘Abrotanella’, ‘Abrus’, ‘Abuta’, ‘Abutilon’, ‘Acacia’, ‘Corynabutilon’, ‘Diabelia’
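
For reference, the same error can be triggered in base R alone by assigning duplicated values as data frame row names. This suggests that the by_species=TRUE path uses the input names as row names, but that is an assumption and has not been checked against the package source. A small sketch with made-up data:

```r
# Made-up two-row data frame; the values are illustrative only.
df <- data.frame(family = c("Orchidaceae", "Orchidaceae"))

# Assigning duplicated row names fails; capture the error message instead
# of stopping. The accompanying warning is suppressed to keep output short.
msg <- tryCatch(
  suppressWarnings({
    rownames(df) <- c("Aa", "Aa")
    "no error"
  }),
  error = function(e) conditionMessage(e)
)
print(msg)
```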
