Duplicated document links in navigation and collection history #5948

Closed
almereyda opened this issue Oct 5, 2023 · 19 comments
Assignees
Labels
bug self-hosted Issues related to self-hosting the code

Comments

@almereyda

almereyda commented Oct 5, 2023

There are a few documents in our instance, which appear multiple times in listings, such as the navigation sidebar, and in the document list of a collection.

To Reproduce
Steps to reproduce the behavior:

  1. Create a document
  2. Change its title
  3. Choose to publish it
  4. See it multiple times in the sidebar and in the list

Expected behavior
Any document is only listed once.

Screenshots

The documents called Untitled, Mondeggi and Digitales Verrotten appear multiple times here and there. Note the browser status bar indicating the same link destination when hovering the individual instances.

[Eleven screenshots were attached here: three untitled images and eight screenshots taken on 2023-10-06 between 00:21:25 and 00:24:52.]

Here we also see

in effect.

This happened a while after collaborative editing was enabled, and was maybe caused by flaky connections. Yet since this is reproducible across browsers and platforms, it seems there may be some database corruption, which would need to be repaired. Or there is another regression somewhere else.

Outline:

  • Self-hosted version: 0.72.0

Desktop:

  • OS: Ubuntu 23.10, Fedora 38
  • Browser: Firefox, Chromium
  • Version: a lot

Mobile:

  • Device: Samsung Galaxy Tab S6, CAT S62 Pro
  • OS: Android
  • Browser: Firefox, Chrome

This has been nagging me for a long time, and effectively stopped me from rolling out the application broadly. This might have to do with collaborative editing, but then I don't know how real-time versions are materialised to the database or how snapshots are being kept.

@tommoor
Member

tommoor commented Oct 12, 2023

@almereyda I'm not sure what's going on with your instance to be honest, we've not had any other reports of this and it's never happened in the cloud-hosted instance to my knowledge

@ovizii

ovizii commented Oct 13, 2023

@almereyda happy to hear I am not the only one with the same problem :-)

I have seen the exact same behaviour except this instance is only used by me for note-taking so no collaboration happening.
I am on version 0.72.2
Accessing outline behind traefik as reverse proxy.

Outline has been working perfectly fine for a couple of months for me, I had this problem once before, I think it was a few weeks ago, and it disappeared after a docker compose down followed by docker-compose up -d.
Unfortunately this time it does not help.

OS: win 11
Browser latest Chrome

I clicked the "+" sign next to the collection named docker, edited the new page then published it. Didn't notice an issue. Later on, I wondered why I had a second draft and noticed the draft had the same name as the published page. If I publish the draft, it creates a second page with the same name. If I delete it, the published page also disappears.

Initial situation:
image

If I delete the draft, the draft stays and the published page disappears:
image

Once I click the draft it also appears in the sidebar menu:
image

As soon as I hit publish:
image

This is very confusing, and the logs do not show any errors. Happy to share them, I'm just not sure whether there's anything confidential in them. Let me know if you'd like to see them.

The logs show quite a lot of these entries:

outline | {"event":{"actorId":"36490d7c-e628-42ea-bc05-16d0a5c9dc67","collectionId":"f025880e-6f1f-4f1b-a209-407244a0c2ec","createdAt":"2023-10-13T12:22:36.413Z","data":null,"documentId":"b58421b1-b1df-47a4-bc80-a605b95bc035","id":"235c7ab4-4491-4802-b8af-b552fe370942","ip":"82.76.153.242","modelId":"b5584af9-0063-4640-aaa6-897a202ce2ce","name":"revisions.create","teamId":"39849d48-6d55-41a5-8af7-c93d913e0039","userId":null},"label":"worker","level":"info","message":"WebhookProcessor running revisions.create"}

@almereyda
Author

This data is returned from a call to collections.documents, so we could trace back the code path it took from being entered in this seemingly fragile state.
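
To check such a response for repeated entries without eyeballing the sidebar, something like the following could be run against a saved payload. This is only a sketch: it assumes the `data` array is a nested tree of nodes that each carry an `id` and an optional `children` list, and the sample tree and its ids are made up for illustration.

```python
from collections import Counter
from typing import Any, Iterator


def iter_nodes(nodes: list[dict[str, Any]]) -> Iterator[dict[str, Any]]:
    """Yield every node of a nested navigation tree, depth first."""
    for node in nodes:
        yield node
        # "children" may be missing or null; treat both as "no children".
        yield from iter_nodes(node.get("children") or [])


def duplicate_ids(tree: list[dict[str, Any]]) -> dict[str, int]:
    """Return {document id: occurrences} for ids listed more than once."""
    counts = Counter(node["id"] for node in iter_nodes(tree))
    return {doc_id: n for doc_id, n in counts.items() if n > 1}


# Hypothetical payload, shaped like the assumed "data" array of a
# collections.documents response. The ids are placeholders.
sample_tree = [
    {"id": "doc-a", "title": "Mondeggi", "children": []},
    {"id": "doc-b", "title": "Untitled", "children": [
        {"id": "doc-a", "title": "Mondeggi", "children": None},
    ]},
    {"id": "doc-a", "title": "Mondeggi"},
]

print(duplicate_ids(sample_tree))
```

An empty result would mean the duplication happens in the frontend rather than in the stored collection structure.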

image

Any hints on how this may be debugged are highly appreciated. If we have some kind of (frontend) profiling mode, I'm also happy to enable this.

Thank you for the excellent reproducer. I was able to witness failing WebSocket connections, followed by failing API requests:

image

It appears the WebSocket connection is flaky, and some expected objects do not exist yet. Is there some kind of handover of state from the WebSocket connection to the HTTP API service?

In my reenactment, I found all of the above. To summarise in other words:

  • A published document shows in the sidebar. When it is clicked, its Draft sibling also appears, and the document is opened.
  • Publishing it again will create a second (published) draft in the collection.
  • Multiple publishes to a collection will all show in the sidebar, but are shown as drafts in the collection's overview.
  • The single draft always remains in the global drafts section and in the collection's sidebar listing. It does not have the three-dot menu and cannot be deleted.

Screenshots

image

image

image

image

Screenshot from 2023-10-14 21-54-11

Screenshot from 2023-10-14 21-54-08

It is also Traefik that we use, so I suspect there's some issue with the WebSockets connection, and that's about it.

Outline could fail a little more gracefully, though.

@ovizii

ovizii commented Oct 14, 2023

I doubt it's Traefik; I have a few more apps using WebSockets behind it. Happy to provide logs or help with debugging.

@almereyda
Author

You don't have WebSocket errors and subsequently failing POST requests in your browser console?

I will follow down along that path for a moment, and report back here.

Actually, I'm already glad to know that we're not alone with this edge case.

@tommoor
Member

tommoor commented Oct 14, 2023

As it seems to be reproducible, the best thing to do is take a screen recording with the inspector open – to show all errors, network requests etc.

@ovizii

ovizii commented Oct 24, 2023

I'm not quite sure how to help; here are a few screenshots. If you would like more details, please ask. I am not sure what to provide or how.

This happens when reloading my outline instance and simply accessing it. There is a forever pending WS connection.
image

When entering "edit" mode on the problematic draft:
image

When triggering the bug by editing and publishing the problematic draft which results in multiple copies I see these WS entries:
image

image

image

@tommoor
Member

tommoor commented Oct 25, 2023

FWIW these screenshots all seem like expected behavior, which would mostly rule out websocket issues

@ovizii

ovizii commented Oct 30, 2023

Any other tips what to do to sort this out?
I have two drafts I cannot delete as they keep reappearing, and I can't publish them either without creating duplicates.

Willing to show all my configs and debug given more instructions.

edited to add this info:

I am willing to start from scratch, given I can easily export my kb.
When starting with Outline I went with a self-hosted minio instance but since this is for home usage I'd like to move to local storage.
I am using outline as my own personal kb, nothing has yet been shared with anyone, I would just like to make sure I am able to easily export/import and keep my document structure.

Any advice?

@tommoor
Member

tommoor commented Oct 30, 2023

Probably best not to wipe the install.

  • Are you still able to reproduce the issue reliably?
  • Are you using the standard Postgres database in Docker?

If the bad data persisted after reloading the app then it's definitely not a websockets issue FWIW, we can count that out. I've traced through the code again and it continues to confound – the entire publish operation takes place in a transaction, it should be impossible for the data to be added to the sidebar and not be published.

@ovizii

ovizii commented Oct 30, 2023

First of all, sorry to take over this thread, happy to open a new one if we notice it is unrelated to the original poster.

I can still reproduce it.
After this exporting issue #6094 is sorted I'd make a complete export and after that I am happy to try anything you like with this instance.

For now, here is my docker-compose
version: "2.4"

services:

  outline-redis:
    image: redis:7-alpine
    container_name: outline-redis
    hostname: outline-redis
    restart: "no"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 30s
      retries: 3
    cpus: 1
    mem_limit: 512M
    networks:
      - outline

  outline-postgres:
    image: postgres:13-alpine
    hostname: outline-postgres
    container_name: outline-postgres
    restart: "no"
    environment:
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_USER=outline
      - POSTGRES_DB=outline
    volumes:
      - ./postgres:/var/lib/postgresql/data:Z
#    healthcheck:
#      test: ["CMD", "pg_isready"]
#      interval: 30s
#      timeout: 20s
#      retries: 3
    cpus: 1
    mem_limit: 1G
    networks:
      - outline


  outline:
    image: outlinewiki/outline:latest
    container_name: outline
    hostname: outline
    user: root
    restart: "no"
    command: sh -c "yarn db:migrate --env=production-ssl-disabled && yarn start --env=production-ssl-disabled"
    volumes:
      - ./outline:/var/lib/outline/data
    environment:
      - PGSSLMODE=disable
      - NODE_ENV=production
      - SECRET_KEY=${SECRET_KEY}
      - UTILS_SECRET=${UTILS_SECRET}
      - DATABASE_URL=postgres://outline:${POSTGRES_PASSWORD}@outline-postgres:5432/outline
      - REDIS_URL=redis://outline-redis:6379
      - URL=https://wiki.domain.tld
      - PORT=3000
      - FILE_STORAGE_UPLOAD_MAX_SIZE=26214400
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
      - AWS_REGION=eu-de-mil
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
      - AWS_S3_UPLOAD_BUCKET_URL=https://s3.domain.tld
      - AWS_S3_UPLOAD_BUCKET_NAME=outline
      - AWS_S3_FORCE_PATH_STYLE=true
      - AWS_S3_ACL=private
      - AWS_S3_ACCELERATE_URL=
      - FORCE_HTTPS=false
      - SMTP_HOST=10.10.10.10
      - SMTP_PORT=25
      - SMTP_FROM_EMAIL=Outline <ovizii+outline@domain.tld>
      - SMTP_REPLY_EMAIL=Outline <ovizii+outline@domain.tld>
      - SMTP_SECURE=false
      - SMTP_USERNAME=
      - SMTP_PASSWORD
      - ALLOWED_DOMAINS=mydomain.tld
      - WEB_CONCURRENCY=2
      - RATE_LIMITER_ENABLED=true
      - RATE_LIMITER_DURATION_WINDOW=60
      - RATE_LIMITER_REQUESTS=1000
    depends_on:
      - outline-postgres
      - outline-redis
    cpus: 1
    mem_limit: 1G
    networks:
      - outline
      - traefik
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=traefik"
      - "traefik.http.routers.outline.tls=true"
      - "traefik.http.routers.outline.entrypoints=websecure"
      - "traefik.http.routers.outline.middlewares=secHeaders@file"
      - "traefik.http.routers.outline.rule=Host(`wiki.domain.tld`)"
      - "traefik.http.routers.outline.service=outline"
      - "traefik.http.services.outline.loadbalancer.server.port=3000"

networks:
    traefik:
        external: true
        name: traefik
    outline:
        external: true
        name: outline


@tommoor
Member

tommoor commented Oct 30, 2023

It's possible the exporting issue is related as I can't reproduce that bug in a development or the cloud production environment either 🤷

When you publish a document and the bug reproduces, in the API response is the publishedAt property set to the current time or is it null?
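
For anyone answering this from a saved API response, the check could be sketched roughly as below. It assumes the response body is JSON with the document under a top-level `data` key and an ISO 8601 `publishedAt` string; the helper names and the 60-second tolerance are invented for the example.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def published_at(response: dict[str, Any]) -> Optional[datetime]:
    """Return publishedAt as an aware datetime, or None if it is null or absent."""
    raw = (response.get("data") or {}).get("publishedAt")
    if raw is None:
        return None
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    return datetime.fromisoformat(raw.replace("Z", "+00:00"))


def describe_published_at(
    response: dict[str, Any], now: datetime, tolerance_seconds: float = 60.0
) -> str:
    """Classify publishedAt as 'null', 'current' (close to now) or 'stale'."""
    timestamp = published_at(response)
    if timestamp is None:
        return "null"
    age = abs((now - timestamp).total_seconds())
    return "current" if age <= tolerance_seconds else "stale"


# Hypothetical response body; the timestamp is a placeholder.
example = {"data": {"id": "doc-a", "publishedAt": "2023-10-30T12:00:05.123Z"}}
reference = datetime(2023, 10, 30, 12, 0, 30, tzinfo=timezone.utc)
print(describe_published_at(example, reference))
```

In real use, `now` would be `datetime.now(timezone.utc)` taken right after the publish request returns.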

@almereyda
Author

@ovizii No offense taken. Thanks for moving forward in chasing this Heisenbug.

I'm inclined to record some reproducers, yet struggle to find the time just yet.

And in a way I would even feel inclined to run a ZFS snapshot from this instance's containers against the commits from #5553. Maybe with it all the above just disappears?

@tommoor
Member

tommoor commented Jan 19, 2024

closed in 2505fea

@tommoor tommoor closed this as completed Jan 19, 2024
@tommoor tommoor self-assigned this Jan 19, 2024
@ovizii

ovizii commented Jan 19, 2024

closed in 2505fea

Sorry for the beginner question but will this fix end up in an official build and retrospectively fix the issue? I don't know enough about coding or git to figure this out on my own.

@tommoor
Member

tommoor commented Jan 19, 2024

The fix will be in the next release, if you have existing broken data structure then that will need fixing manually – the best way to do so would be by unpublishing and re-publishing the document
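
With many affected documents, the unpublish and re-publish cycle could in principle be scripted against the API instead of clicked through. The sketch below only builds the two requests and sends nothing. `documents.unpublish` appears in the logs in this thread; that publishing goes through `documents.update` with a `publish` flag, that the payload key is `id`, and that authentication is a Bearer token are all assumptions to verify against the API reference for the installed version. The base URL, token and document id are placeholders.

```python
import json
import urllib.request


def build_repair_requests(
    base_url: str, token: str, document_id: str
) -> list[urllib.request.Request]:
    """Prepare (but do not send) an unpublish request followed by a publish request."""

    def post(endpoint: str, payload: dict) -> urllib.request.Request:
        return urllib.request.Request(
            url=f"{base_url.rstrip('/')}/api/{endpoint}",
            data=json.dumps(payload).encode("utf-8"),
            headers={
                "Authorization": f"Bearer {token}",  # assumed auth scheme
                "Content-Type": "application/json",
            },
            method="POST",
        )

    return [
        post("documents.unpublish", {"id": document_id}),
        # Assumption: publishing is done via documents.update with publish=True.
        post("documents.update", {"id": document_id, "publish": True}),
    ]


# Placeholder values for illustration only.
requests_to_send = build_repair_requests("https://wiki.domain.tld/", "TOKEN", "doc-a")
for request in requests_to_send:
    print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```

Each prepared request could then be passed to `urllib.request.urlopen`, ideally against a backup or test instance first.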

@almereyda
Author

Glad you found it Tom.

I will be able to test this with the next release, esp. with regard to fixing existing broken data structures. The previous tests left us with a large number of candidates for the unpublish-republish remediation strategy.

Are there dailies/nightlies of the main branch that one could test?

Else I could also try from a source build, if feedback is needed earlier.

@almereyda
Author

almereyda commented Jan 26, 2024

I have now tested this with a fresh build from main, but am unfortunately not able to confirm that 2505fea resolves the issue entirely with existing documents. New documents seem to be repaired now.

Note about running a source build with Docker Compose

Assuming a fresh clone with clean working area at the desired commit exists in a subdirectory called outline/, the following Compose configuration will yield a running intermediary build.

Converting an existing Docker Compose deployment from an image: deployment to a source build: requires adding a separate container for the base image, which will not be started, but tagged as if it were upstream.


services:

  outline-base:
    image: outlinewiki/outline-base
    build:
      context: ./outline/
      dockerfile: Dockerfile.base
      args:
        pull: 1
    profiles:
      - donotstart

  outline:
#    image: outlinewiki/outline:$OUTLINE_IMAGE_TAG
    build:
      context: ./outline/
      dockerfile: Dockerfile
      args:
        pull: 1
    command: sh -c "yarn db:migrate --env production-ssl-disabled && yarn start"
    
docker compose build outline-base
docker compose build outline

(I think I have read once that db:migrate is not needed anymore, and is taken care of during yarn start. I reactivated it out of nostalgia.)

Running a source build is also harder with the separation of the two Dockerfiles. Maybe it could be more convenient for long-term maintenance to have both stages in the same file.

It seems this was partially resolved: Some of the published documents switched to only having one copy of their title in the sidebar or collection overview. Out of nine drafts, I was able to delete and publish almost half of them. Toggling deletion or publication status succeeded more often in the collection overview, than when a document was opened.

Some documents could be deleted when they were unpublished. All documents that were completely published could be deleted. Some of the broken documents end up in a superposition that displays as being unpublished, and that offers the actions in both directions.

image

A new error appears when trying to unpublish a document that was seen in the superposition, but is now published:

image

{"label":"http","level":"info","message":"  <-- POST /api/documents.unpublish"}
{"label":"http","level":"info","message":"  xxx POST /api/documents.unpublish 500 32ms -"}

The 500 error does not reveal further details, even when ENVIRONMENT is set to development.

The same appears for documents in the superposition that were published.

image

The superposition comes into place after views.create when a document is opened, and just in the moment when the first documents.update is run.

Calls to documents.delete yield a 200, but the broken drafts will remain present.

Unpublished documents also sometimes cannot be reached directly via their URL. One must go through the collection overview or the drafts list to access them.


When successfully publishing a document, the same event is recognised by Processing documents.publish, BacklinksProcessor, NotificationsProcessor, RevisionsProcessor, DocumentPublishedNotificationsTask, SlackProcessor and WebhookProcessor, with modelId and userId always being null. These events are doubled by a second run in the log, when publishing succeeds.

outline_1       | {"attempt":0,"event":{"actorId":"8553e03c-9135-4ea6-978a-ea80a1128db5","collectionId":"08de809f-4357-4204-9322-24103e3b8c13","createdAt":"2024-01-26T23:15:12.271Z","data":{"title":"Doppelte Titel titeln dem doppelten"},"documentId":"9b13eaad-c3c1-49b8-bf25-56f455d901e9","id":"0d3af33a-1204-452a-91f2-3718a0dc2c99","ip":"88.134.24.179","modelId":null,"name":"documents.publish","teamId":"7f96ac4b-2242-4023-8826-1d424c8b5e5e","userId":null},"label":"worker","level":"info","message":"Processing documents.publish"}
outline_1       | {"event":{"actorId":"8553e03c-9135-4ea6-978a-ea80a1128db5","collectionId":"08de809f-4357-4204-9322-24103e3b8c13","createdAt":"2024-01-26T23:15:12.271Z","data":{"title":"Doppelte Titel titeln dem doppelten"},"documentId":"9b13eaad-c3c1-49b8-bf25-56f455d901e9","id":"0d3af33a-1204-452a-91f2-3718a0dc2c99","ip":"88.134.24.179","modelId":null,"name":"documents.publish","teamId":"7f96ac4b-2242-4023-8826-1d424c8b5e5e","userId":null},"label":"worker","level":"info","message":"BacklinksProcessor running documents.publish"}
outline_1       | {"event":{"actorId":"8553e03c-9135-4ea6-978a-ea80a1128db5","collectionId":"08de809f-4357-4204-9322-24103e3b8c13","createdAt":"2024-01-26T23:15:12.271Z","data":{"title":"Doppelte Titel titeln dem doppelten"},"documentId":"9b13eaad-c3c1-49b8-bf25-56f455d901e9","id":"0d3af33a-1204-452a-91f2-3718a0dc2c99","ip":"88.134.24.179","modelId":null,"name":"documents.publish","teamId":"7f96ac4b-2242-4023-8826-1d424c8b5e5e","userId":null},"label":"worker","level":"info","message":"NotificationsProcessor running documents.publish"}
outline_1       | {"event":{"actorId":"8553e03c-9135-4ea6-978a-ea80a1128db5","collectionId":"08de809f-4357-4204-9322-24103e3b8c13","createdAt":"2024-01-26T23:15:12.271Z","data":{"title":"Doppelte Titel titeln dem doppelten"},"documentId":"9b13eaad-c3c1-49b8-bf25-56f455d901e9","id":"0d3af33a-1204-452a-91f2-3718a0dc2c99","ip":"88.134.24.179","modelId":null,"name":"documents.publish","teamId":"7f96ac4b-2242-4023-8826-1d424c8b5e5e","userId":null},"label":"worker","level":"info","message":"RevisionsProcessor running documents.publish"}
outline_1       | {"actorId":"8553e03c-9135-4ea6-978a-ea80a1128db5","collectionId":"08de809f-4357-4204-9322-24103e3b8c13","createdAt":"2024-01-26T23:15:12.271Z","data":{"title":"Doppelte Titel titeln dem doppelten"},"documentId":"9b13eaad-c3c1-49b8-bf25-56f455d901e9","id":"0d3af33a-1204-452a-91f2-3718a0dc2c99","ip":"88.134.24.179","label":"worker","level":"info","message":"DocumentPublishedNotificationsTask running","modelId":null,"name":"documents.publish","teamId":"7f96ac4b-2242-4023-8826-1d424c8b5e5e","userId":null}
outline_1       | {"event":{"actorId":"8553e03c-9135-4ea6-978a-ea80a1128db5","collectionId":"08de809f-4357-4204-9322-24103e3b8c13","createdAt":"2024-01-26T23:15:12.271Z","data":{"title":"Doppelte Titel titeln dem doppelten"},"documentId":"9b13eaad-c3c1-49b8-bf25-56f455d901e9","id":"0d3af33a-1204-452a-91f2-3718a0dc2c99","ip":"88.134.24.179","modelId":null,"name":"documents.publish","teamId":"7f96ac4b-2242-4023-8826-1d424c8b5e5e","userId":null},"label":"worker","level":"info","message":"SlackProcessor running documents.publish"}
outline_1       | {"event":{"actorId":"8553e03c-9135-4ea6-978a-ea80a1128db5","collectionId":"08de809f-4357-4204-9322-24103e3b8c13","createdAt":"2024-01-26T23:15:12.271Z","data":{"title":"Doppelte Titel titeln dem doppelten"},"documentId":"9b13eaad-c3c1-49b8-bf25-56f455d901e9","id":"0d3af33a-1204-452a-91f2-3718a0dc2c99","ip":"88.134.24.179","modelId":null,"name":"documents.publish","teamId":"7f96ac4b-2242-4023-8826-1d424c8b5e5e","userId":null},"label":"worker","level":"info","message":"WebhookProcessor running documents.publish"

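To see at a glance which processor messages are doubled for the same event, worker lines like the ones above could be grouped by event id. This is a rough sketch that assumes each line has the `container | {json}` shape shown in this thread, with the event either nested under `event` or, as in the task record, flattened into the top level. The sample lines are shortened, made-up stand-ins rather than real log output.

```python
import json
from collections import Counter, defaultdict
from typing import Optional


def parse_compose_line(line: str) -> Optional[dict]:
    """Split 'container | {json}' and return the JSON object, or None if unparsable."""
    _, separator, payload = line.partition("|")
    if not separator:
        return None
    try:
        record = json.loads(payload.strip())
    except json.JSONDecodeError:
        return None
    return record if isinstance(record, dict) else None


def repeated_messages(lines: list[str]) -> dict[str, dict[str, int]]:
    """Return {event id: {message: count}} for messages logged more than once."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for line in lines:
        record = parse_compose_line(line)
        if record is None:
            continue
        # Fall back to the record itself when the event fields are top-level.
        event = record.get("event", record)
        if not isinstance(event, dict) or event.get("id") is None:
            continue
        counts[event["id"]][record.get("message", "")] += 1
    return {
        event_id: {msg: n for msg, n in per_event.items() if n > 1}
        for event_id, per_event in counts.items()
        if any(n > 1 for n in per_event.values())
    }


# Shortened, hypothetical log lines for illustration.
sample_lines = [
    'outline_1 | {"event":{"id":"evt-1","name":"documents.publish"},"message":"Processing documents.publish"}',
    'outline_1 | {"event":{"id":"evt-1","name":"documents.publish"},"message":"WebhookProcessor running documents.publish"}',
    'outline_1 | {"event":{"id":"evt-1","name":"documents.publish"},"message":"WebhookProcessor running documents.publish"}',
    'outline_1 | {"id":"evt-1","name":"documents.publish","message":"DocumentPublishedNotificationsTask running"}',
    "outline_1 | not json",
]

print(repeated_messages(sample_lines))
```

Lines that are truncated or not valid JSON are skipped rather than repaired, so the counts only reflect records that parsed cleanly.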
Also it seems the translation for two modals is not complete, or the wrong key is being used. In both cases it just says "document" instead of the title:

Screenshot from 2024-01-27 00-10-07

Screenshot from 2024-01-27 00-26-28

5 out of 9 broken drafts could be deleted by repeatedly toggling publication states and trying to delete, whether from the sidebar, the collection overview or the top-right of a document view.

The last mystery is posed by documents which show as published on the main page, but can be neither deleted nor unpublished. When they get unpublished, they show up as a draft, but the context menu will be empty. Reloading the page shows them as complete documents again. POSTs to documents.info will yield an occasional 404.

Interestingly the translation string for document is correct in another collection:

image

One major caveat here is that some calls to documents.delete return a 200 even though they did not finish successfully, which is why reloading the page shows the documents again.

The second, more confusing thing is the documents that are published, but only show up in the main overview or in the recently changed tab of the collection overview.


I'm now more confident about setting up new instances, but would welcome suggestions on how to repair or delete the stuck documents, whether they are stuck in the published state or undeletable in any publication state.

@almereyda
Author

When trying to open one of the garbled documents by accessing its direct URL, the POST call to documents.info will yield a 404. The same document will also go through superposition states, or can be unpublished twice, without the UI responding to the intent at the first try. Also, an old friend returned:

image

Other documents created after 2505fea was included can be accessed directly.

Judging that the database corruption will not occur anymore with the newer frontend logic, it appears the safest way to return this (testing-)instance to a consistent state will be a full export-import cycle with resetting the database state.

Fortunately, #6438 pretty much blocked onboarding users until now, so the amount of undesired side effects is minimal.
