Fix two list item related bugs #514

Merged
nicholasserra merged 4 commits into trentm:master on Jun 22, 2023

@Crozzers Crozzers commented Jun 4, 2023

Bug 1: List items being incorrectly nested if preceded by \n\n

- Item 1
  ABCDEF

- Item 2
  - Item 3
    - Item 4

While processing Item 2, Item 4 would erroneously be flattened to the same level as Item 3.
In _list_item_sub, if the last processed list item (here, Item 1) ended with \n\n, the current list item was passed to self._outdent, which flattened its child list items.

The fix passes the current list item to self._uniform_outdent instead, which outdents it by 1-4 spaces while keeping the relative indentation of its children.
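
To illustrate why the choice of outdent matters, here is a minimal sketch of the two strategies. The functions greedy_outdent and uniform_outdent and the TAB_WIDTH constant are hypothetical stand-ins written for this example; they are not the library's _outdent or _uniform_outdent, and the real implementations may differ in detail.

```python
import re

TAB_WIDTH = 4  # assumed tab width for this sketch


def greedy_outdent(text: str) -> str:
    """Strip 1 to TAB_WIDTH leading spaces from every line independently."""
    return re.sub(r"^ {1,%d}" % TAB_WIDTH, "", text, flags=re.MULTILINE)


def uniform_outdent(text: str, max_outdent: int = TAB_WIDTH) -> str:
    """Strip the same number of leading spaces from every non-blank line."""
    lines = text.split("\n")
    indents = [len(l) - len(l.lstrip(" ")) for l in lines if l.strip()]
    amount = min(min(indents, default=0), max_outdent)
    return "\n".join(l[amount:] if l.strip() else l for l in lines)


# The children of "Item 2" from the example above
children = "  - Item 3\n    - Item 4"

flattened = greedy_outdent(children)   # both lines lose all of their indent
preserved = uniform_outdent(children)  # both lines lose exactly two spaces

print(flattened)
print(preserved)
```

In the first result Item 3 and Item 4 end up at the same column, which mirrors the flattening described in Bug 1. In the second, Item 4 stays nested one level below Item 3.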

Bug 2: HTML blocks with markdown="1" not being processed correctly when inside list items

The _list_item_sub function assumes that most list items, if they don't contain \n\n, should be run through the span gamut.
This assumption causes the following markdown to be incorrectly converted:

- Item 1
  <div markdown="1">
  ###### This line would not be converted
  Some text
  </div>

List items are allowed to contain div blocks, which can contain other block elements such as headers. However, since this example does not contain \n\n, it would be run through the span gamut and the header would not be processed.
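
The heuristic at fault can be sketched roughly as below. pick_gamut is a hypothetical helper invented for this illustration; the actual decision inside _list_item_sub may involve additional conditions.

```python
def pick_gamut(item: str) -> str:
    """Return which gamut a list item's text would be sent through (sketch)."""
    # Assumption: a blank line inside the item is the only trigger for block processing
    if "\n\n" in item:
        return "block"
    return "span"


item = (
    '<div markdown="1">\n'
    "###### This line would not be converted\n"
    "Some text\n"
    "</div>"
)

# No blank line inside the item, so the header is only ever treated as inline text
gamut = pick_gamut(item)
print(gamut)
```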

The fix adds a dedicated _do_markdown_in_html function that builds on _strict_tag_block_sub.
It finds and hashes indented HTML blocks, producing the following markdown:

- Item 1

  md5-somehash

  ###### This line would not be converted
  Some text

  md5-somehash

Because the item now contains \n\n, _list_item_sub processes it correctly using the block gamut.
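
The hashing step can be approximated as follows. This is a simplified sketch, not the actual _do_markdown_in_html: the function name hash_markdown_in_html, the store dictionary, and the single hard-coded div pattern are all assumptions made for illustration.

```python
import hashlib
import re

# Assumption: only indented <div markdown="1"> blocks are handled in this sketch
_BLOCK_RE = re.compile(
    r'^(?P<indent>[ ]+)(?P<open><div markdown="1">)\n'
    r'(?P<body>.*?)\n'
    r'(?P=indent)(?P<close></div>)$',
    re.MULTILINE | re.DOTALL,
)


def hash_markdown_in_html(text: str, store: dict) -> str:
    """Swap the opening and closing tags for placeholders padded by blank lines."""

    def placeholder(html: str) -> str:
        key = "md5-" + hashlib.md5(html.encode("utf-8")).hexdigest()
        store[key] = html  # remember the original tag so it can be restored later
        return key

    def repl(match: re.Match) -> str:
        indent = match.group("indent")
        return "\n%s%s\n\n%s\n\n%s%s" % (
            indent, placeholder(match.group("open")),
            match.group("body"),
            indent, placeholder(match.group("close")),
        )

    return _BLOCK_RE.sub(repl, text)


source = (
    "- Item 1\n"
    '  <div markdown="1">\n'
    "  ###### This line would not be converted\n"
    "  Some text\n"
    "  </div>"
)
tags = {}
hashed = hash_markdown_in_html(source, tags)
print(hashed)
```

The blank lines inserted around each placeholder are what introduce \n\n into the list item, so the header is later seen by block-level processing.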

Alternatives considered:

  • Running all list items through the block gamut
    • Results in <li><p>some text</p></li>
  • Tweaking _strict_tag_block_sub to always hash indented HTML blocks
    • Could not get this working. Caused 40+ test failures

Other notable changes

  • Refactored _uniform_indent to give finer control over how whitespace-only lines are handled
  • Made _uniform_indent and _uniform_outdent static methods and added docstrings
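
As a rough idea of what control over whitespace-only lines could look like, consider the sketch below. The function uniform_indent and its include_empty_lines parameter are assumptions for this example and are not taken from the refactored method's real signature.

```python
def uniform_indent(text: str, indent: str, include_empty_lines: bool = False) -> str:
    """Prefix each line with indent; whitespace-only lines are skipped by default."""
    out = []
    for line in text.splitlines(True):  # keep line endings intact
        if line.strip() or include_empty_lines:
            out.append(indent + line)
        else:
            out.append(line)
    return "".join(out)


default_result = uniform_indent("a\n\nb", "  ")
padded_result = uniform_indent("a\n\nb", "  ", include_empty_lines=True)
print(repr(default_result))
print(repr(padded_result))
```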

@nicholasserra

Nice, thank you!

@nicholasserra nicholasserra merged commit 9f6b529 into trentm:master Jun 22, 2023
18 checks passed
Crozzers added a commit to Crozzers/python-markdown2 that referenced this pull request Jul 2, 2023
Merge master to pull in trentm#514 for conversion to new Extra format