Fix parsing of HTML returned from raw API (#144)
* Fix parsing of HTML returned from raw API

GitHub must have changed something on their end; the HTML seems to have changed a little.

* Update gh-md-toc

Co-authored-by: Vladyslav Diachenko <82767850+vlad-diachenko@users.noreply.github.com>

---------

Co-authored-by: Eugene Kalinin <ekalinin@users.noreply.github.com>
Co-authored-by: Vladyslav Diachenko <82767850+vlad-diachenko@users.noreply.github.com>
3 people committed Sep 25, 2023
1 parent 661b5c5 commit c836e5e
Showing 1 changed file with 6 additions and 6 deletions.
12 changes: 6 additions & 6 deletions gh-md-toc
@@ -241,17 +241,17 @@ gh_toc_grab() {
grepcmd="pcregrep -o"
echoargs=""
awkscript='{
-level = substr($0, length($0), 1)
-text = substr($0, match($0, /a>.*<\/h/)+2, RLENGTH-5)
+level = substr($0, 3, 1)
+text = substr($0, match($0, />[^<]*<span aria-hidden/)+1, RLENGTH-18)
href = substr($0, match($0, "href=\"([^\"]+)?\"")+6, RLENGTH-7)
'"$common_awk_script"'
}'
else
grepcmd="grep -Eo"
echoargs="-e"
awkscript='{
-level = substr($0, length($0), 1)
-text = substr($0, match($0, /a>.*<\/h/)+2, RLENGTH-5)
+level = substr($0, 3, 1)
+text = substr($0, match($0, />[^<]*<span aria-hidden/)+1, RLENGTH-18)
href = substr($0, match($0, "href=\"[^\"]+?\"")+6, RLENGTH-7)
'"$common_awk_script"'
}'
@@ -266,7 +266,7 @@ gh_toc_grab() {
sed -e ':a' -e 'N' -e '$!ba' -e 's/\n<\/h/<\/h/g' |

# find strings that corresponds to template
-$grepcmd '<a.*id="user-content-[^"]*".*</h[1-6]' |
+$grepcmd '<h.*id="user-content-[^"]*".*</h[1-6]' |

# remove code tags
sed 's/<code>//g' | sed 's/<\/code>//g' |
@@ -275,7 +275,7 @@ gh_toc_grab() {
sed 's/<g-emoji[^>]*[^<]*<\/g-emoji> //g' |

# now all rows are like:
-# <a id="user-content-..." href="..."><span ...></span></a> ... </h1
+# <h1 id="user-content-..."><a href="..."> ... <span ...></span></a></h1
# format result line
# * $0 - whole string
# * last element of each row: "</hN" where N in (1,2,3,...)
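
To make the new extraction easier to check by hand, here is a minimal sketch that runs the updated `grep` pattern and the updated awk fields against two sample rows. The sample HTML is hypothetical and only follows the row shape given in the diff comment (`<hN id="user-content-..."><a href="...">text<span aria-hidden ...>`); it is not captured from the API. The helper names `sample_html` and `toc_rows` are also made up for illustration, and the `href` pattern is simplified to `[^\"]+` rather than copied verbatim from the script.

```shell
#!/bin/sh
# Hypothetical sample input shaped like the row described in the diff comment.
sample_html() {
    printf '%s\n' \
        '<h1 id="user-content-title"><a href="#title">Title<span aria-hidden="true"></span></a></h1>' \
        '<p>Not a heading</p>' \
        '<h2 id="user-content-installation"><a href="#installation">Installation<span aria-hidden="true"></span></a></h2>'
}

# Illustrative helper: select heading rows, then print level|text|href.
toc_rows() {
    grep -Eo '<h.*id="user-content-[^"]*".*</h[1-6]' |
    awk '{
        # The row now starts with "<hN", so the level is the 3rd character.
        level = substr($0, 3, 1)
        # Match ">text<span aria-hidden": skip the leading ">" (+1) and drop
        # ">" plus "<span aria-hidden" (1 + 17 = 18 characters) from the length.
        text = substr($0, match($0, />[^<]*<span aria-hidden/)+1, RLENGTH-18)
        # Skip the 6 characters of href=" and drop them plus the closing quote (7).
        href = substr($0, match($0, "href=\"[^\"]+\"")+6, RLENGTH-7)
        print level "|" text "|" href
    }'
}

sample_html | toc_rows
# prints:
# 1|Title|#title
# 2|Installation|#installation
```

The offsets are tied to the literal strings in the patterns, so a further change to the surrounding markup (for example a renamed `aria-hidden` attribute) would need matching changes to `+1` and `-18`.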