validation error on save and error pushing a dataset from cli #1991

Open
greggles opened this issue Dec 8, 2021 · 3 comments

@greggles

greggles commented Dec 8, 2021

Hi,

I'm trying to create a dataset in qri.io by following this tutorial.

Two observations, followed by my terminal session:

  1. It would be helpful if the message "this dataset has 10 validation errors" suggested running qri validate to see the problems.
  2. I don't know where to go to debug the error "ref contained in log data does not match".

greggles@gknaddison-mbp > brew install qri-io/qri/qri
...
🍺  /usr/local/Cellar/qri/0.10.0: 4 files, 56MB, built in 6 seconds
greggles@gknaddison-mbp [1]> qri setup
choose username (leave empty to generate a default name):greggles
set up qri repo at: /Users/greggles/.qri

greggles@gknaddison-mbp > qri list
you have no datasets
greggles@gknaddison-mbp > qri registry prove --email greg.knaddison@gmail.com --username greggles
password:
proved user greggles to registry, connected local key
greggles@gknaddison-mbp > vi population.csv
greggles@gknaddison-mbp > qri save --body population.csv me/population-data-set-name
saving done [==============================================================================]
dataset saved: greggles/population-data-set-name@/ipfs/QmVmaF36x3K7LiBwrPCvgbRxfrDH8ZNuwqNmdFQZvMZJ98
this dataset has 10 validation errors
greggles@gknaddison-mbp > qri list
1   greggles/population-data-set-name
    /ipfs/QmVmaF36x3K7LiBwrPCvgbRxfrDH8ZNuwqNmdFQZvMZJ98
    300 B, 10 entries, 10 errors

greggles@gknaddison-mbp [1]> qri validate --body population.csv me/population-data-set-name
#  ROW  COL  VALUE     ERROR
0  0    2     8622357  type should be integer, got string
1  1    2     4085014  type should be integer, got string
2  2    2     2670406  type should be integer, got string
3  3    2     2378146  type should be integer, got string
4  4    2     1743469  type should be integer, got string
5  5    2     1590402  type should be integer, got string
6  6    2     1579504  type should be integer, got string
7  7    2     1469490  type should be integer, got string
8  8    2     1400337  type should be integer, got string
9  9    2     1036242  type should be integer, got string
greggles@gknaddison-mbp > qri push greggles/population-data-set-name
ref contained in log data does not match
greggles@gknaddison-mbp [1]> qri push me/population-data-set-name
ref contained in log data does not match
greggles@gknaddison-mbp [1]> qri log me/population-data-set-name
1   Commit:  /ipfs/QmVmaF36x3K7LiBwrPCvgbRxfrDH8ZNuwqNmdFQZvMZJ98
    Date:    Wed Dec  8 08:55:08 MST 2021
    Storage: local
    Size:    300 B

    created dataset from population.csv

greggles@gknaddison-mbp >
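
A side note on the validation errors in the session above: each VALUE entry appears to be printed with an extra leading space, which may mean the third column of population.csv has whitespace after the comma, so the cells are read as strings rather than integers. That is only a guess from the output, since the file itself is not shown in this thread. A minimal Go sketch of that effect, using hypothetical sample rows and a hypothetical helper named nonIntegerCells (not part of qri), might look like this:

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
)

// nonIntegerCells is a hypothetical helper (not part of qri). It lists every
// cell in column col that does not parse as a base-10 integer. When trim is
// true, leading whitespace in each field is dropped before parsing.
func nonIntegerCells(data string, col int, trim bool) ([]string, error) {
	r := csv.NewReader(strings.NewReader(data))
	r.TrimLeadingSpace = trim
	rows, err := r.ReadAll()
	if err != nil {
		return nil, err
	}
	var bad []string
	for i, row := range rows {
		if col >= len(row) {
			continue // row is too short to have this column
		}
		if _, err := strconv.Atoi(row[col]); err != nil {
			bad = append(bad, fmt.Sprintf("row %d: %q", i, row[col]))
		}
	}
	return bad, nil
}

func main() {
	// Hypothetical rows with a space after each comma; the real
	// population.csv is an assumption here.
	sample := "Alpha, AA, 100\nBeta, BB, 200\n"

	asIs, err := nonIntegerCells(sample, 2, false)
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	fmt.Println("without trimming:", asIs)

	trimmed, err := nonIntegerCells(sample, 2, true)
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	fmt.Println("with trimming:", trimmed)
}
```

If whitespace is indeed the cause, removing the spaces after the commas in the CSV might clear the errors; this has not been checked against qri itself.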
@Arqu
Contributor

Arqu commented Dec 8, 2021

Appreciate the thorough feedback here. It also highlights our need to revamp/update a lot of our error handling to make it more user friendly. The qri validate suggestion is a great note and we should do that right away (I'll file a separate issue).
For the ref contained in log data does not match error, that's definitely a bit more obscure and harder to debug unless you look at the qri code itself (you should not have to do that).

In any case, the error itself comes from one of the following:

  • the username provided (greggles or whatever me gets resolved to) doesn't match what the internal log store expects (and that's the final authority on who owns what and how)
  • the profile IDs don't match, in the same way as above; a profile ID is the internal, stable identifier that a username resolves to when data is moved around, so it is used in place of the username to represent the user

Those can come from a few different places (and probably some I'm not listing/aware of):

  • when you initialized your CLI instance you gave it another name and are now using greggles - low likelihood, as me/ also resolves badly
  • you fumbled with your config - I also don't think that's the case here, you would probably know if you did
  • you did the setup process multiple times and that made things weird? - not sure about this one
  • you signed up on cloud, created your profile on the CLI independently, and later ran qri registry prove - this might have messed up your internal profile ID. There's a long technical answer to this, but in short we're transitioning some things here, and there is some funkiness between CLI and Cloud for fresh user accounts in some edge cases - this one seems pretty likely
  • or maybe similar to above, but just the plain CLI signup flow (qri registry signup) for cloud results in a similar issue

If any of the above ring a bell, let me know and we can work from there.
Also we should pretty soon have the ability to manually create datasets directly on cloud which should help you skip a bunch of these steps and just upload and edit your dataset directly from the web interface.

@greggles
Author

greggles commented Dec 9, 2021

I don't think any of those profile ID scenarios happened in my case. When I registered on the website, though, I did have a somewhat weird experience.

  1. I filled in my info and everything was fine
  2. I clicked submit and the browser seemed to show page loading but then nothing happened
  3. I did not get an error message or welcome email at that time
  4. I waited a bit and clicked the button to submit again and got an error about the username being already taken
  5. I changed the username and submitted and got an error that the email was already taken
  6. I tried logging in and found that the initial values I created worked

So maybe my profile on the server side is not complete?

@Arqu Arqu added this to Backlog in Sprints via automation Dec 9, 2021
@Arqu Arqu moved this from Backlog to In progress in Sprints Dec 9, 2021
@ramfox
Member

ramfox commented Dec 9, 2021

Pulling in convo from discord:

Arqu — Today at 3:18 PM
Could you try a qri registry prove to get your public key to sync up to cloud?

greggles — Today at 3:41 PM
I'm getting an error - cannot prove with a non-empty repository

This error comes from:

qri/lib/registry.go

Lines 85 to 100 in e499860

// ProveProfileKey sends proof to the registry that this user has control of a
// specified private key, and modifies the user's config in order to reconcile
// it with any already existing identity the registry knows about
func (registryImpl) ProveProfileKey(scope scope, p *RegistryProfileParams) error {
	// Check if the repository has any saved datasets. If so, calling prove is
	// not allowed, because doing so would essentially throw away the old profile,
	// making those references unreachable. In the future, this can be changed
	// such that the old identity is given a different username, and is merged
	// into the client's collection.
	numRefs, err := scope.Repo().RefCount()
	if err != nil {
		return err
	}
	if numRefs > 0 {
		return fmt.Errorf("cannot prove with a non-empty repository")
	}

Plus it looks like there may be some confusion about which system (local or remote) gets precedence over the keys.
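
The guard quoted above can be tried in isolation with a small Go sketch. It assumes a hypothetical refCounter interface and a fakeRepo type standing in for the real repository, and a hypothetical canProve function; only the RefCount call and the error message come from the excerpt. It may help explain the session in this issue: the first qri registry prove ran before any dataset was saved, while the later attempt ran after qri save.

```go
package main

import (
	"errors"
	"fmt"
)

// refCounter is a hypothetical, minimal interface exposing only the call
// the guard relies on.
type refCounter interface {
	RefCount() (int, error)
}

// fakeRepo is a stand-in for a real repository; refs is the number of
// saved dataset references.
type fakeRepo struct{ refs int }

func (r fakeRepo) RefCount() (int, error) { return r.refs, nil }

// canProve mirrors the guard from the excerpt: proving is refused as soon
// as the repository holds at least one saved dataset.
func canProve(repo refCounter) error {
	n, err := repo.RefCount()
	if err != nil {
		return err
	}
	if n > 0 {
		return errors.New("cannot prove with a non-empty repository")
	}
	return nil
}

func main() {
	// Fresh repository, as right after qri setup.
	fmt.Println("empty repo:", canProve(fakeRepo{refs: 0}))
	// Repository after one qri save.
	fmt.Println("one dataset:", canProve(fakeRepo{refs: 1}))
}
```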
