Creating the search index

Algolia runs a free DocSearch program in which it hosts, manages, and updates the search index for your documentation.

That program is by application only, and approval takes at least two weeks. In the meantime, they have a (legacy) option to run your own scraper, as documented here.

In either case, Algolia hosts the index on its servers. The (frontend-only) docs site fetches search data from Algolia's servers and returns relevant search results to the user. The difference is that, if you are accepted into the program, Algolia also takes care of regularly scraping the site (once a week) and updating the index. Without acceptance, you can do this manually using their open-source (no longer maintained) Docker container, as described below.

Prerequisites

  • Docker

Creating the search index

  • Sign up for the Algolia free plan.
  • Create a new index and make a note of the index name.
  • Visit API Keys under Settings.
  • Make a note of the application ID and the two API keys: the Admin API key, which is used to update the index, and the search key, a read-only token that is included in the frontend to access the index and perform searches.
  • Create a .env file with the application ID and the Admin API key, as described in the docs.
  • Create a config.json file with the contents copied from the Indexer configuration below.
  • If necessary, change the start_urls and sitemap_urls to the correct domain (dev, prod, etc.).
  • Run the Docker command.
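The last few steps can be sketched as a small Python helper. This is only an illustration under stated assumptions: the variable names APPLICATION_ID, API_KEY, and CONFIG and the image name algolia/docsearch-scraper are recalled from Algolia's legacy "run your own scraper" docs and are not given in this document, so check them against those docs before relying on them. The function names are hypothetical.

```python
import json
import subprocess
from pathlib import Path


def render_env(application_id, admin_api_key):
    """Return the text of the .env file (variable names are an assumption)."""
    return f"APPLICATION_ID={application_id}\nAPI_KEY={admin_api_key}\n"


def build_docker_command(config, env_file=".env"):
    """Build the argv list that runs the legacy scraper container.

    The configuration is passed as one compact JSON string in the CONFIG
    environment variable (assumed from the legacy docs).
    """
    compact = json.dumps(config, separators=(",", ":"))
    return [
        "docker", "run", "-it",
        f"--env-file={env_file}",
        "-e", f"CONFIG={compact}",
        "algolia/docsearch-scraper",  # assumed image name
    ]


def run_scraper(config_path="config.json", env_file=".env"):
    """Load config.json and run the scraper; raises if Docker exits non-zero."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    subprocess.run(build_docker_command(config, env_file), check=True)
```

Calling run_scraper() from a terminal in the directory that holds .env and config.json would start the crawl. Keeping the command builder separate from the subprocess call makes it easy to print and inspect the command before running it.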

This should update the index in your Algolia account, which you can see in the portal.
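Besides checking the portal, the updated index could be spot-checked from a script. The sketch below assumes Algolia's REST search endpoint (POST to /1/indexes/{index}/query on the {appId}-dsn.algolia.net host, authenticated with the X-Algolia-Application-Id and X-Algolia-API-Key headers) and uses the read-only search key, never the admin key. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_search_request(app_id, search_key, index_name, query, hits_per_page=3):
    """Build (but do not send) a search request against one index."""
    index = urllib.parse.quote(index_name, safe="")
    url = f"https://{app_id}-dsn.algolia.net/1/indexes/{index}/query"
    # Search parameters are sent as a URL-encoded string under "params".
    params = urllib.parse.urlencode({"query": query, "hitsPerPage": hits_per_page})
    body = json.dumps({"params": params}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "X-Algolia-Application-Id": app_id,
            "X-Algolia-API-Key": search_key,
            "Content-Type": "application/json",
        },
    )


def search(app_id, search_key, index_name, query):
    """Send the request and return the list of hits from the response."""
    request = build_search_request(app_id, search_key, index_name, query)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["hits"]
```

If the scraper ran successfully, a call such as search(app_id, search_key, "device42", "some term from the docs") should return a non-empty list whose entries carry the url and hierarchy attributes listed in the indexer configuration.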

If necessary, update the src/pages/index.js file to reflect the index name, appId, and search key (not the admin key) that you got from Algolia.

            <DocSearch
              indexName="device42"
              appId="SCH7N4RLU6"
              apiKey="acebf9e8f4b83b8c1e7270713d7f70b8"
            />

Indexer configuration

{
    "index_name": "device42",
    "start_urls": ["https://dev.docs.device42.com"],
    "sitemap_urls": ["https://dev.docs.device42.com/sitemap.xml"],
    "selectors": {
        "lvl0": {
            "selector": "(//ul[contains(@class,'menu__list')]//a[contains(@class, 'menu__link menu__link--sublist menu__link--active')]/text() | //nav[contains(@class, 'navbar')]//a[contains(@class, 'navbar__link--active')]/text())[last()]",
            "type": "xpath",
            "global": true,
            "default_value": "Documentation"
        },
        "lvl1": "header h1",
        "lvl2": "article h2",
        "lvl3": "article h3",
        "lvl4": "article h4",
        "lvl5": "article h5, article td:first-child",
        "lvl6": "article h6",
        "text": "article p, article li, article td:last-child"
    },
    "strip_chars": " .,;:#",
    "custom_settings": {
        "separatorsToIndex": "_",
        "attributesForFaceting": [
            "language",
            "version",
            "type",
            "docusaurus_tag"
        ],
        "attributesToRetrieve": [
            "hierarchy",
            "content",
            "anchor",
            "url",
            "url_without_anchor",
            "type"
        ]
    }
}
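The configuration above points at the dev site. Switching start_urls and sitemap_urls to another environment can be done by hand, or with a small helper like the sketch below. It assumes both lists contain plain URL strings, as they do here, and the function name retarget and the host docs.example.com are hypothetical placeholders, not the real production domain.

```python
import copy
from urllib.parse import urlsplit, urlunsplit


def retarget(config, base_url):
    """Return a copy of config whose crawl URLs point at base_url's host.

    Only the scheme and host are swapped; paths such as /sitemap.xml are kept.
    The original config is left untouched.
    """
    target = urlsplit(base_url)

    def swap(url):
        parts = urlsplit(url)
        return urlunsplit(
            (target.scheme, target.netloc, parts.path, parts.query, parts.fragment)
        )

    updated = copy.deepcopy(config)
    for key in ("start_urls", "sitemap_urls"):
        updated[key] = [swap(url) for url in config.get(key, [])]
    return updated
```

For example, retarget(config, "https://docs.example.com") would yield a configuration that crawls https://docs.example.com and reads https://docs.example.com/sitemap.xml, leaving the selectors and custom settings unchanged.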