This .NET 6.0 library uses AngleSharp to parse an HTML string into a DOM. It is designed to be easy to use from PowerShell Core.

nstevens1040/AngleSharpParser

AngleSharpParser

PowerShell Core Quick Start

Load the DLL into PowerShell Core

# Allow script execution for this process only; errors are ignored where this is unsupported
try { Set-ExecutionPolicy Bypass -Scope Process -Force } catch {}
# Enable TLS 1.2 (3072) in addition to the protocols already enabled
[System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072
# Download the latest release package to the Desktop
[System.Net.WebClient]::New().DownloadFile(
    "https://github.com/nstevens1040/AngleSharpParser/releases/latest/download/AngleSharpParser-latest.nupkg",
    "$($ENV:USERPROFILE)\Desktop\AngleSharpParser-latest.nupkg"
)
# A .nupkg file is a zip archive; extract it into its own folder
$null = mkdir "$($ENV:USERPROFILE)\Desktop\AngleSharpParser-latest"
Expand-Archive -Path "$($ENV:USERPROFILE)\Desktop\AngleSharpParser-latest.nupkg" -DestinationPath "$($ENV:USERPROFILE)\Desktop\AngleSharpParser-latest"
# Load the library from the extracted package
Add-Type -Path "$($ENV:USERPROFILE)\Desktop\AngleSharpParser-latest\lib\net6.0\AngleSharpParser.dll"
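
To confirm that the assembly loaded, the type can be inspected with standard .NET reflection. This is only a sketch: it assumes the `Add-Type` call above succeeded and that the library exposes the `[Angle.Sharp]` type used in the test further down. The methods it lists are whatever the library declares; none are assumed here.

```powershell
# Assumes Add-Type above succeeded; [Angle.Sharp] is the type used in the test below.
# Show which file the type was loaded from
[Angle.Sharp].Assembly.Location

# List the public methods declared by the type itself (inherited members are skipped)
[Angle.Sharp].GetMethods() |
    Where-Object { $_.DeclaringType -eq [Angle.Sharp] } |
    Select-Object -ExpandProperty Name
```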

Do a simple test

$html_string = @"
<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8"/>
        <meta name="viewport" content="width=device-width,initial-scale=1"/>
        <title>Testing HTML</title>
    </head>
    <body>
        <h1>Heading</h1>
        <article>
            <section>
                <h2>subtitle</h2>
                <p>paragraph</p>
                <span id="test">Test succeeded!</span>
            </section>
        </article>
    </body>
</html>
"@
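# Create the parser and parse the HTML string into a DOM document
$parser = [Angle.Sharp]::New()
$document = $parser.GetDomDocument($html_string)
# Look up the element with id "test" and print its text
$document.GetElementById("test").TextContent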

The output should read:

Test succeeded!
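
Beyond `GetElementById`, CSS selectors are often more convenient. The sketch below continues the same session and reuses `$document` from the test above. It assumes that `GetDomDocument` returns a standard AngleSharp document object, in which case AngleSharp's `QuerySelector` and `QuerySelectorAll` methods should be available on it. The README does not confirm the return type, so treat this as an illustration rather than documented behavior.

```powershell
# Assumption: $document is an AngleSharp document (from the test above),
# so the usual AngleSharp selector methods are available on it.

# First element matching a CSS selector
$document.QuerySelector("section > h2").TextContent

# All matching elements; print each <meta> tag's attributes
foreach ($meta in $document.QuerySelectorAll("meta")) {
    "name={0} charset={1} content={2}" -f `
        $meta.GetAttribute("name"),
        $meta.GetAttribute("charset"),
        $meta.GetAttribute("content")
}
```

`GetAttribute` returns `$null` for attributes an element does not have, so the missing fields in the formatted lines appear empty.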