This repository has been archived by the owner on Apr 2, 2024. It is now read-only.

hugoalh-studio/string-dissect-js

String Dissect (JavaScript)

⚖️ MIT

🗂️ GitHub: hugoalh-studio/string-dissect-js NPM: @hugoalh/string-dissect

🆙 Latest Release Version (Latest Release Date)

A JavaScript module to dissect strings; safe with emojis, URLs, and words.

🎯 Target

  • Bun ^ v1.0.0
  • Cloudflare Workers
  • Deno >= v1.34.0

    🛡️ Require Permission

    N/A

  • NodeJS >= v20.9.0

🔗 Other Edition

🔰 Usage

Via Installation

🎯 Supported Target

  • Cloudflare Workers
  • NodeJS
  1. Install via console/shell/terminal:
    • Via NPM
      npm install @hugoalh/string-dissect[@<Tag>]
    • Via PNPM
      pnpm add @hugoalh/string-dissect[@<Tag>]
    • Via Yarn
      yarn add @hugoalh/string-dissect[@<Tag>]
  2. Import in the script (<ScriptName>.js):
    import ... from "@hugoalh/string-dissect";

    ℹ️ Note

    Although importing the entire module is recommended, parts of the module can also be imported via sub paths where available; see the exports property in the package.json file for the available sub paths.

Via NPM Specifier

🎯 Supported Target

  • Bun
  • Deno
  1. Import in the script (<ScriptName>.js):
    import ... from "npm:@hugoalh/string-dissect[@<Tag>]";

    ℹ️ Note

    Although importing the entire module is recommended, parts of the module can also be imported via sub paths where available; see the exports property in the package.json file for the available sub paths.

🧩 API

  • class StringDissector {
      constructor(options: StringDissectorOptions = {}): StringDissector;
      dissect(item: string, optionsOverride: StringDissectorOptions = {}): Generator<StringSegmentDescriptor>;
      dissectExtend(item: string, optionsOverride: StringDissectorOptions = {}): Generator<StringSegmentDescriptorExtend>;
      static dissect(item: string, options: StringDissectorOptions = {}): Generator<StringSegmentDescriptor>;
      static dissectExtend(item: string, options: StringDissectorOptions = {}): Generator<StringSegmentDescriptorExtend>;
    }
  • function dissectString(item: string, options: StringDissectorOptions = {}): Generator<StringSegmentDescriptor>;
  • function dissectStringExtend(item: string, options: StringDissectorOptions = {}): Generator<StringSegmentDescriptorExtend>;
  • enum StringSegmentType {
      ansi = "ansi",
      ANSI = "ansi",
      character = "character",
      Character = "character",
      emoji = "emoji",
      Emoji = "emoji",
      url = "url",
      Url = "url",
      URL = "url",
      word = "word",
      Word = "word"
    }
  • interface StringDissectorOptions {
      /**
       * The locale(s) to use in the operation; The JavaScript implementation examines locales, and then computes a locale it understands that comes closest to satisfying the expressed preference. By default, the implementation's default locale will be used. For more information, please visit https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl#locales_argument.
       * @default undefined
       */
      locales?: StringDissectorLocales;
      /**
       * Whether to remove ANSI escape codes.
       * @default false
       */
      removeANSI?: boolean;
      /**
       * Whether to prevent URLs from being split.
       * @default true
       */
      safeURLs?: boolean;
      /**
       * Whether to prevent words from being split.
       * @default true
       */
      safeWords?: boolean;
    }
  • interface StringSegmentDescriptor {
      type: StringSegmentType;
      value: string;
    }
  • interface StringSegmentDescriptorExtend extends StringSegmentDescriptor {
      indexEnd: number;
      indexStart: number;
    }
  • type StringDissectorLocales = ConstructorParameters<typeof Intl.Segmenter>[0];
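The locales option above is typed after the first constructor parameter of Intl.Segmenter, so the following is a minimal sketch of how word-safe dissection with index information could look when built on Intl.Segmenter directly. It is not the module's source code: sketchDissectExtend is a hypothetical name, URL, ANSI, and emoji classification are left out (every non-word piece is labelled "character"), and indexEnd is assumed to be exclusive, which the API listing above does not specify.

```javascript
// Illustrative sketch only; NOT the implementation of @hugoalh/string-dissect.
// `sketchDissectExtend` is a hypothetical helper name.
// Assumptions: `indexEnd` is exclusive; URL, ANSI, and emoji handling are omitted.
function* sketchDissectExtend(item, { locales, safeWords = true } = {}) {
  const graphemes = new Intl.Segmenter(locales, { granularity: "grapheme" });
  if (!safeWords) {
    // Without word safety, every grapheme cluster becomes its own segment.
    for (const { segment, index } of graphemes.segment(item)) {
      yield { type: "character", value: segment, indexStart: index, indexEnd: index + segment.length };
    }
    return;
  }
  const words = new Intl.Segmenter(locales, { granularity: "word" });
  for (const { segment, index, isWordLike } of words.segment(item)) {
    if (isWordLike) {
      // Word-like segments are kept whole.
      yield { type: "word", value: segment, indexStart: index, indexEnd: index + segment.length };
      continue;
    }
    // Non-word segments (spaces, punctuation) are broken into grapheme clusters.
    for (const part of graphemes.segment(segment)) {
      const start = index + part.index;
      yield { type: "character", value: part.segment, indexStart: start, indexEnd: start + part.segment.length };
    }
  }
}

const sketchResult = Array.from(sketchDissectExtend("Vel ex.", { locales: "en" }));
console.log(sketchResult);
/*=>
[
  { type: "word", value: "Vel", indexStart: 0, indexEnd: 3 },
  { type: "character", value: " ", indexStart: 3, indexEnd: 4 },
  { type: "word", value: "ex", indexStart: 4, indexEnd: 6 },
  { type: "character", value: ".", indexStart: 6, indexEnd: 7 }
]
*/
```

Indexes in this sketch are UTF-16 code unit offsets, as reported by Intl.Segmenter, so value can be recovered with item.slice(indexStart, indexEnd).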

✍️ Example

  • import { StringDissector, dissectString } from "@hugoalh/string-dissect";
    
    const sample1 = "Vel ex sit est sit est tempor enim et voluptua consetetur gubergren gubergren ut.";
    
    /* Either */
    Array.from(new StringDissector().dissect(sample1));
    Array.from(dissectString(sample1));
    /*=>
    [
      { value: "Vel", type: "word" },
      { value: " ", type: "character" },
      { value: "ex", type: "word" },
      { value: " ", type: "character" },
      { value: "sit", type: "word" },
      { value: " ", type: "character" },
      { value: "est", type: "word" },
      { value: " ", type: "character" },
      ... +20
    ]
    */
    
    /* Either */
    Array.from(new StringDissector({ safeWords: false }).dissect(sample1));
    Array.from(dissectString(sample1, { safeWords: false }));
    /*=>
    [
      { value: "V", type: "character" },
      { value: "e", type: "character" },
      { value: "l", type: "character" },
      { value: " ", type: "character" },
      { value: "e", type: "character" },
      { value: "x", type: "character" },
      { value: " ", type: "character" },
      { value: "s", type: "character" },
      ... +73
    ]
    */
  • /* Either */
    Array.from(new StringDissector().dissect("GitHub homepage is https://github.com."));
    Array.from(dissectString("GitHub homepage is https://github.com."));
    /*=>
    [
      { value: "GitHub", type: "word" },
      { value: " ", type: "character" },
      { value: "homepage", type: "word" },
      { value: " ", type: "character" },
      { value: "is", type: "word" },
      { value: " ", type: "character" },
      { value: "https://github.com", type: "url" },
      { value: ".", type: "character" }
    ]
    */
  • /* Either */
    Array.from(new StringDissector().dissect("🤝💑💏👪👨‍👩‍👧‍👦👩‍👦👩‍👧‍👦🧑‍🤝‍🧑")).map((element) => { return element.value; });
    Array.from(dissectString("🤝💑💏👪👨‍👩‍👧‍👦👩‍👦👩‍👧‍👦🧑‍🤝‍🧑")).map((element) => { return element.value; });
    //=> [ "🤝", "💑", "💏", "👪", "👨‍👩‍👧‍👦", "👩‍👦", "👩‍👧‍👦", "🧑‍🤝‍🧑" ]
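Because the dissect functions return generators of segment descriptors, the segments can be consumed one at a time, for example to shorten a string without cutting through a word, URL, or emoji. The following is a minimal sketch of that idea. truncateBySegments is a hypothetical helper that is not part of the module, and the input is a hand-written array shaped like the documented URL example output, so the snippet runs without the module installed; with the module available, the result of dissectString(text) could presumably be passed instead.

```javascript
// Hypothetical helper; not part of @hugoalh/string-dissect.
// Joins segment values in order and stops before the first segment
// that would push the result past `maxLength` (UTF-16 code units).
function truncateBySegments(segments, maxLength) {
  let result = "";
  for (const { value } of segments) {
    if (result.length + value.length > maxLength) {
      // Breaking out of for...of also stops pulling from a generator.
      break;
    }
    result += value;
  }
  return result;
}

// Hand-written descriptors mirroring the documented output for
// "GitHub homepage is https://github.com."
const sampleSegments = [
  { value: "GitHub", type: "word" },
  { value: " ", type: "character" },
  { value: "homepage", type: "word" },
  { value: " ", type: "character" },
  { value: "is", type: "word" },
  { value: " ", type: "character" },
  { value: "https://github.com", type: "url" },
  { value: ".", type: "character" }
];

truncateBySegments(sampleSegments, 30);
//=> "GitHub homepage is "
truncateBySegments(sampleSegments, 40);
//=> "GitHub homepage is https://github.com."
```

With a limit of 30, the URL segment is dropped as a whole instead of being cut in the middle, which is the behaviour a plain slice(0, 30) would not give.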