Type error upon loading - index.json getting clobbered #103

Open
marcboivin opened this issue Jan 7, 2022 · 8 comments

@marcboivin

Great idea, I was dumping as much stuff as I could into it because it solves a real problem I have.

Thing is, I realized my index didn't have everything in it. I looked at the archive folder and all the pages were there.

I tried opening a new session and started re-indexing content. Still the same issue: not everything was showing in the index.

I started a third time and got:

TypeError: Cannot destructure property 'id' of 'Mo.Index.get(...)' as it is undefined.
    at /Users/mboivin/.nvm/versions/node/v16.13.1/lib/node_modules/diskernet/build/22120.js:3:8330661
    at Array.map (<anonymous>)
    at n.flex (/Users/mboivin/.nvm/versions/node/v16.13.1/lib/node_modules/diskernet/build/22120.js:3:8330639)
    at Object.search (/Users/mboivin/.nvm/versions/node/v16.13.1/lib/node_modules/diskernet/build/22120.js:3:8330865)
    at async /Users/mboivin/.nvm/versions/node/v16.13.1/lib/node_modules/diskernet/build/22120.js:3:8357152

Now I can't use the tool and my indexed content is unusable.
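
For readers unfamiliar with this class of error, here is a minimal sketch of how it can arise and be guarded against. It is not diskernet's actual code: the `index` Map, `lookupUnsafe`, `lookupSafe`, and `resolveHits` are hypothetical names, and the assumption is only that the failing call destructures the result of a Map-style `get()` that returned `undefined`.

```javascript
// Hypothetical stand-in for the tool's index: URL -> metadata records
// plus search-id -> URL pointers. Names and data are illustrative only.
const index = new Map([
  ['http://example.com/a', { id: 1, title: 'A' }],
  ['ndx1000001', 'http://example.com/a'],
  ['ndx1000002', 'http://example.com/missing'], // pointer whose record was lost
]);

// Unguarded lookup: get() returns undefined for an absent key, and
// destructuring undefined throws a TypeError like the one in the trace.
function lookupUnsafe(url) {
  const { id, title } = index.get(url);
  return { id, title, url };
}

// Guarded lookup: return null for a missing record so callers can skip it.
function lookupSafe(url) {
  const record = index.get(url);
  if (record === undefined) return null;
  const { id, title } = record;
  return { id, title, url };
}

// Resolve search hits to records, dropping hits with no surviving record.
function resolveHits(hitKeys) {
  return hitKeys
    .map((key) => index.get(key))
    .filter((url) => typeof url === 'string')
    .map((url) => lookupSafe(url))
    .filter((record) => record !== null);
}
```

A guard like this would hide the symptom rather than fix the underlying index damage, so it is best read as a way to keep search usable while the index is repaired.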

Attached is the MASSIVE error log I got from trying to restart diskernet.

Any way to solve this?
out.log

Thanks

@o0101
Collaborator

o0101 commented Jan 7, 2022

Thank you! I'm really sorry about this issue. I've seen it as well.

I still have not isolated the cause.

Basically what has happened is the index.json file has been clobbered.

So all your cached resources are still there, and I believe the cache.json file should still be OK.

This is a really terrible thing to happen to your index, I'm sorry!

I don't have a solution right now but I believe it may be possible to rebuild the index.json file and recover it.

A patch I'm intending to release will keep a backup of index.json and recover from it if the main file gets clobbered. It will also add a check before any write that we are not overwriting an existing index, and in any case it will save the existing index to the backup before writing.

I still can't isolate where index.json is overwritten with an empty copy, even though there are only a couple of places where it is written.
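
The backup-and-check behaviour described above could look roughly like the following. This is a sketch under stated assumptions, not the planned patch: `saveIndexSafely`, `loadIndexWithRecovery`, `readEntries`, and the `.bak` / `.tmp` suffixes are all hypothetical, and it assumes index.json holds a JSON array of entries.

```javascript
const fs = require('node:fs');

// Read a JSON array from disk; return null if the file is missing or unparsable.
function readEntries(filePath) {
  try {
    const parsed = JSON.parse(fs.readFileSync(filePath, 'utf8'));
    return Array.isArray(parsed) ? parsed : null;
  } catch {
    return null;
  }
}

// Hypothetical save routine: never replace a populated index with an empty
// one, and keep the last populated index as a backup before each write.
function saveIndexSafely(indexPath, entries) {
  const existing = readEntries(indexPath);
  const existingHasData = existing !== null && existing.length > 0;

  if (existingHasData && entries.length === 0) {
    throw new Error('Refusing to overwrite a non-empty index with an empty one');
  }
  // Only back up a populated index, so a clobbered file cannot replace a good backup.
  if (existingHasData) {
    fs.copyFileSync(indexPath, indexPath + '.bak');
  }
  // Write to a temporary file first, then rename it into place, so an
  // interrupted write does not leave a truncated index.json behind.
  const tempPath = indexPath + '.tmp';
  fs.writeFileSync(tempPath, JSON.stringify(entries, null, 2));
  fs.renameSync(tempPath, indexPath);
}

// Hypothetical load routine: fall back to the backup if the main file is
// missing, unparsable, or empty.
function loadIndexWithRecovery(indexPath) {
  for (const candidate of [indexPath, indexPath + '.bak']) {
    const entries = readEntries(candidate);
    if (entries !== null && entries.length > 0) return entries;
  }
  return [];
}
```

The design choice here is to treat an empty index as suspicious rather than valid, which matches the failure described in this issue but would need an explicit escape hatch for users who really do want to reset their index.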

@o0101
Collaborator

o0101 commented Jan 7, 2022

Thanks again for the report @marcboivin ! I really appreciate it and I'm very sorry for you that this happened 😢

o0101 changed the title from "Type error upon loading" to "Type error upon loading - index.json getting clobbered" on Jan 7, 2022

@o0101
Collaborator

o0101 commented Jan 7, 2022

I just checked out the out.log -- that is an impressively long error isn't it 😂 😆

It's basically just dumped the entirety of the bundled JavaScript for the entire project out of the executable. I'm still not sure why that happens on crash -- it used to happen with nexe and still occurs with pkg.

I think it is happening because it's trying to output the line where the error occurred, but of course the built JS is all one single "line" (8 MB long...).

Anyway, this is not the cause of the crash / corruption.

@marcboivin
Author

marcboivin commented Jan 7, 2022

(Edit: corrected typo)

I can confirm the cache looks intact.

I could rebuild the index. Don't mind trying at least.

Pretty sure it's looking for an array index that doesn't exist, because my index.json is nothing like what I would expect it to be:


[
  [
    "http://www.lockwiki.com/index.php/Main_Page",
    {
      "date": 1641520105000,
      "id": 4,
      "ndx_id": 1000016,
      "title": "Lockwiki"
    }
  ],
  [
    4,
    "http://www.lockwiki.com/index.php/Main_Page"
  ],
  [
    "http://bjoernkarmann.dk/project_alias",
    {
      "date": 1641520105765,
      "id": 6,
      "ndx_id": 1000017,
      "title": "Bjørn Karmann › project_alias"
    }
  ],
  [
    "https://playbook.cio.gov/?utm_content=buffere045d&utm_medium=social&utm_source=linkedin.com&utm_campaign=buffer",
    {
      "date": 1641520104911,
      "id": 5,
      "ndx_id": 1000015,
      "title": "The Digital Services Playbook — from the U.S. Digital Service"
    }
  ],
  [
    "ndx1000003",
    "http://www.lockwiki.com/index.php/Main_Page"
  ],
  [
    5,
    "https://playbook.cio.gov/?utm_content=buffere045d&utm_medium=social&utm_source=linkedin.com&utm_campaign=buffer"
  ],
  [
    "ndx1000004",
    "https://playbook.cio.gov/?utm_content=buffere045d&utm_medium=social&utm_source=linkedin.com&utm_campaign=buffer"
  ],
  [
    6,
    "http://bjoernkarmann.dk/project_alias"
  ],
  [
    "ndx1000005",
    "http://bjoernkarmann.dk/project_alias"
  ],
  [
    "ndx1000006",
    "https://playbook.cio.gov/?utm_content=buffere045d&utm_medium=social&utm_source=linkedin.com&utm_campaign=buffer"
  ],
  [
    "ndx1000007",
    "http://www.lockwiki.com/index.php/Main_Page"
  ],
  [
    "ndx1000008",
    "https://playbook.cio.gov/?utm_content=buffere045d&utm_medium=social&utm_source=linkedin.com&utm_campaign=buffer"
  ],
  [
    "ndx1000009",
    "http://bjoernkarmann.dk/project_alias"
  ],
  [
    "ndx1000010",
    "http://bjoernkarmann.dk/project_alias"
  ],
  [
    "ndx1000011",
    "http://www.lockwiki.com/index.php/Main_Page"
  ],
  [
    "ndx1000012",
    "https://playbook.cio.gov/?utm_content=buffere045d&utm_medium=social&utm_source=linkedin.com&utm_campaign=buffer"
  ],
  [
    "ndx1000013",
    "http://www.lockwiki.com/index.php/Main_Page"
  ],
  [
    "ndx1000014",
    "http://bjoernkarmann.dk/project_alias"
  ],
  [
    "ndx1000015",
    "https://playbook.cio.gov/?utm_content=buffere045d&utm_medium=social&utm_source=linkedin.com&utm_campaign=buffer"
  ],
  [
    "ndx1000016",
    "http://www.lockwiki.com/index.php/Main_Page"
  ],
  [
    "ndx1000017",
    "http://bjoernkarmann.dk/project_alias"
  ]
]
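
One way to check a file like this for broken references is sketched below. It rests on an assumption drawn only from the dump above: that index.json is a list of Map entries of three kinds (URL to metadata record, numeric id to URL, and `"ndx<number>"` to URL). `auditIndex` is a hypothetical helper, not part of diskernet.

```javascript
// Assumed entry kinds, inferred from the dump in this thread:
//   url -> { id, ndx_id, title, date }
//   id (number) -> url
//   "ndx<number>" -> url
function auditIndex(entries) {
  const index = new Map(entries);
  const danglingPointers = []; // pointers whose target URL has no metadata record
  const staleNdxPointers = []; // ndx keys that differ from the record's current ndx_id

  for (const [key, value] of index) {
    if (typeof value !== 'string') continue; // skip metadata records
    const record = index.get(value);
    if (record === null || typeof record !== 'object') {
      danglingPointers.push(key);
    } else if (
      typeof key === 'string' &&
      key.startsWith('ndx') &&
      key !== 'ndx' + record.ndx_id
    ) {
      staleNdxPointers.push(key);
    }
  }
  return { danglingPointers, staleNdxPointers };
}
```

In the dump above, each URL is the target of several `ndx` keys while its record carries only one `ndx_id`; whether those older keys are intended or leftovers from re-indexing is not something this thread establishes, so the sketch only reports them.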

@o0101
Collaborator

o0101 commented Jan 8, 2022

That's awesome, how did you rebuild the index??

@marcboivin
Author

By hand my good sir.

Backed up the public folder and with a bit of ls and grep magic figured out all the URLs I wanted to index and revisited them ;)

Doing that, I found out some of the archive was still not processed, and diskernet asked if I wanted to recover. I did and got some back.

So something tells me the process fails at some point but we have no way of knowing when.

Also, I noted that one folder was corrupted and Finder (I'm on a Mac) wouldn't open the folder.

@marcboivin
Author

If you're curious, I used this as a starting point

grep -r GEThttp ./ | cut -d ':' -f 4 | cut -d '?' -f 1
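
The same idea can be sketched in Node for anyone who prefers to post-process the text there. `extractUrls` is a hypothetical helper; the only assumption about the archive is the one the grep pattern already makes, namely that cached entries contain the string `GET` immediately followed by the URL.

```javascript
// Pull URLs that directly follow "GET" out of a blob of text, strip the
// query string (like `cut -d '?' -f 1`), and de-duplicate the result.
function extractUrls(text) {
  const urls = new Set();
  for (const match of text.matchAll(/GET(https?:\/\/[^\s"'<>]+)/g)) {
    urls.add(match[1].split('?')[0]);
  }
  return [...urls].sort();
}
```

Note that stripping the query string can change which page gets revisited, since the index entries shown earlier in this thread are keyed by the full URL including its query string.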

@o0101
Collaborator

o0101 commented Jan 10, 2022

> By hand my good sir.
>
> Backed up the public folder and with a bit of ls and grep magic figured out all the URLs I wanted to index and revisited them ;)
>
> Doing that, I found out some of the archive was still not processed, and diskernet asked if I wanted to recover. I did and got some back.
>
> So something tells me the process fails at some point but we have no way of knowing when.
>
> Also, I noted that one folder was corrupted and Finder (I'm on a Mac) wouldn't open the folder.

You're awesome! That's so good! 😆 😂 ✊🏻 !!
