Releases · Filimoa/open-parse

Update pymupdf.md by @ada-lovecraft in #20
update the cookbooks link by @brianjking in #24
fix: Fix "sequence item 2: expected str instance, NoneType found" exception when table output is set to markdown. by @ic-xu in #27
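The exception text in the last item is what Python's `str.join` raises when one element of the sequence is `None`. The sketch below reproduces that message and shows one way to guard against it when rendering a markdown table row. The helper names `markdown_row` and `join_error_message` are hypothetical and are not taken from the open-parse codebase or from the fix in #27.

```python
from typing import List, Optional


def markdown_row(cells: List[Optional[str]]) -> str:
    """Render one table row as markdown, treating missing cells as empty."""
    # str.join raises TypeError on None, so normalise every cell to a string first.
    safe_cells = ["" if cell is None else str(cell) for cell in cells]
    return "| " + " | ".join(safe_cells) + " |"


def join_error_message(cells: List[Optional[str]]) -> str:
    """Return the TypeError message produced by joining the raw cells, if any."""
    try:
        " | ".join(cells)  # type: ignore[arg-type]
    except TypeError as exc:
        return str(exc)
    return ""
```

Replacing `None` with an empty string keeps the column count intact, which matters for markdown tables where every row needs the same number of cells.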

New Contributors

@ada-lovecraft made their first contribution in #20
@brianjking made their first contribution in #24
@ic-xu made their first contribution in #27

Full Changelog: v0.5.2...v0.5.3

Contributors

ada-lovecraft, brianjking, and ic-xu

11 Apr 14:23

Filimoa

v0.5.2

987b43a

v0.5.2 (2024-04-11)

Features

Better version display
Fixed pytorch device bug. Thanks @jinmang2
Add global config to set pytorch device
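A global device setting usually amounts to one shared object that model-loading code consults instead of hard-coding a device. The following is a minimal sketch of that pattern only. The `Config` class, its method names, and the list of accepted devices are assumptions made for illustration, not open-parse's actual implementation.

```python
# Hypothetical set of device names accepted by this sketch.
_VALID_DEVICES = ("cpu", "cuda", "mps")


class Config:
    """Hypothetical process-wide settings holder (not open-parse's own class)."""

    def __init__(self) -> None:
        # "cpu" is a conservative default because it is always available.
        self._device = "cpu"

    def set_device(self, device: str) -> None:
        """Validate and store the device that later model loads should use."""
        if device not in _VALID_DEVICES:
            raise ValueError(
                f"Unknown device {device!r}; expected one of {_VALID_DEVICES}"
            )
        self._device = device

    def get_device(self) -> str:
        """Return the currently configured device name."""
        return self._device


# Single shared instance: callers set it once, model code reads it when loading.
config = Config()
```

Validating the name at the point where it is set surfaces typos early, rather than when a model is first moved to the device.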

Contributors

jinmang2

08 Apr 20:58

Filimoa

v0.5.1

06509d1

v0.5.1 (2024-04-05)

Bug Fixes

Fixed type hinting bug for python < 3.10
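The release note does not say which annotation caused the problem. A common cause of this class of bug is using newer annotation syntax that older interpreters cannot evaluate at runtime: `str | None` needs Python 3.10, and `list[str]` needs Python 3.9. The sketch below, which assumes that kind of cause, shows the spelling from `typing` that works on older versions too. The function names are hypothetical.

```python
from typing import List, Optional, Union


def first_token(tokens: List[str]) -> Optional[str]:
    """Portable spelling of `list[str] -> str | None`."""
    return tokens[0] if tokens else None


def to_text(value: Union[int, str]) -> str:
    """Portable spelling of a parameter annotated `int | str`."""
    return str(value)
```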

Full Changelog: v0.5.0...v0.5.1

08 Apr 04:20

Filimoa

v0.5.0

d4dbaff

v0.5.0 (2024-04-07)

0.5.0 (2024-04-01)

What's Changed

SemanticProcessing! This is the recommended processing pipeline.
Add optional annotations to the pdf draw functions
Fixed reading order bug

Breaking Changes

Renaming

Node.aggregate_position renamed to Node.reading_order.
RemoveStubs renamed to RemoveNodesBelowNTokens.

Refactored processing pipelines to use a class to promote ease of reuse

Previously

import openparse
from openparse import ProcessingStep, default_pipeline, Node
from typing import List


class CustomCombineTables(ProcessingStep):
    def process(self, nodes: List[Node]) -> List[Node]:
        return nodes


# copy the default pipeline (or create a new one)
custom_pipeline = default_pipeline.copy()
custom_pipeline.append(CustomCombineTables())

parser = openparse.DocumentParser(
    table_args={"parsing_algorithm": "pymupdf"}, processing_pipeline=custom_pipeline
)
custom_10k = parser.parse(meta10k_path)

Now becomes

import openparse
from openparse import processing, Node
from typing import List


class CustomCombineTables(processing.ProcessingStep):
    def process(self, nodes: List[Node]) -> List[Node]:
        return nodes


# instantiate a built-in pipeline (or define your own pipeline class)
custom_pipeline = processing.BasicIngestionPipeline()
custom_pipeline.append_transform(CustomCombineTables())

parser = openparse.DocumentParser(
    table_args={"parsing_algorithm": "pymupdf"}, processing_pipeline=custom_pipeline
)
custom_10k = parser.parse(meta10k_path)
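To illustrate why a pipeline class is easier to reuse than a shared list, here is a self-contained toy version of the pattern. It does not use open-parse: nodes are plain strings, and `ToyPipeline`, `DropShortNodes`, and `run` are hypothetical names chosen for this sketch. Only `ProcessingStep`, `process`, and `append_transform` mirror names from the example above.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class ProcessingStep(ABC):
    """One transformation applied to the full list of nodes."""

    @abstractmethod
    def process(self, nodes: List[str]) -> List[str]:
        ...


class DropShortNodes(ProcessingStep):
    """Toy step: drop nodes with fewer than `min_words` whitespace-separated words."""

    def __init__(self, min_words: int) -> None:
        self.min_words = min_words

    def process(self, nodes: List[str]) -> List[str]:
        return [n for n in nodes if len(n.split()) >= self.min_words]


class ToyPipeline:
    """Each instance owns its own ordered list of steps."""

    def __init__(self, transformations: Optional[List[ProcessingStep]] = None) -> None:
        # Copy the input so separate pipelines never share one mutable list.
        self.transformations: List[ProcessingStep] = list(transformations or [])

    def append_transform(self, step: ProcessingStep) -> None:
        self.transformations.append(step)

    def run(self, nodes: List[str]) -> List[str]:
        # Apply the steps in order, feeding each step's output to the next.
        for step in self.transformations:
            nodes = step.process(nodes)
        return nodes
```

Because every instance starts with its own list, appending a step to one pipeline cannot leak into another, which is the failure mode that copying a module-level default list was guarding against.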

openai and numpy are now required dependencies; these will likely be split out in the future.

Full Changelog: v0.4.1...v0.5.0

05 Apr 19:33

Filimoa

v0.4.1

dd33fb0

v0.4.1 (2024-04-05)

What's Changed

Better error messages for missing weights
Type hinting bug with python 3.8 fixed

Full Changelog: v0.4.0...v0.4.1

05 Apr 04:54

Filimoa

v0.4.0

80e2df9

0.4.0 (2024-04-04)

What's Changed

✨ Unitable support for table content extraction!
🐛 Fixed bug with table transformers failing on multiple pages.
✨ Improved table docs

What's Changed

Unitable by @Filimoa in #6

New Contributors

@Filimoa made their first contribution in #6

Full Changelog: v0.3.1...v0.4.0

Contributors

Filimoa

01 Apr 21:36

Filimoa

v0.3.1

252390f

0.3.1 (2024-04-01)

What's Changed

Fixed #4
