Any plans to multitarget pdflexer? #88

Closed
podprad opened this issue Sep 7, 2023 · 5 comments

Comments

@podprad

podprad commented Sep 7, 2023

First, I would like to say you did an awesome job with this library.
I would like to use it to extract and modify some information in PDF files in cases where regular high-level PDF libraries cannot do it.

Do you have any plans for multitargeting it or moving to .NET Standard 2.0?
I would like to use it both in .net48 and .net6.

@plaisted
Contributor

plaisted commented Sep 7, 2023

hey @podprad, this is something I struggled deciding on during development of the library. I initially targeted .net 5+ & .net standard 2.1+ and was hoping to keep it compatible with those (there were a few things that made .net standard 2.0 not possible from the start although I forget what exactly). As time went on the code base became more and more complicated to keep backwards compatibility and when I added support for generic math I decided to only support .net 7+ going forward. My reasoning was basically:

  • There are several options out there for pdf libraries that support older .net versions (PdfSharp, PdfPig, itext / itext MPL 4.2). Having PdfLexer focus on a clean start with new .net versions / features gives it a niche in the .net ecosystem.
  • As stated above the code base was getting messy with multiple implementations and a lot of conditional compilation flags (eg. included zlib/gz support in .net 6+, missing Span<T> / Sequence<T> extensions, etc)
  • Some of the things that turned out to be core PdfLexer features needed .net 7 (generic math for parsing, memory mapped files for efficient data access in particular)
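To illustrate the last two points, here is a minimal sketch of how a generic-math parsing helper ends up needing a second, non-generic implementation behind a conditional compilation flag on older targets. This is not PdfLexer's actual code: `RealParser` is a hypothetical name, and the fallback simply hardcodes `double`. `NET7_0_OR_GREATER` is the SDK's predefined symbol for .NET 7 and later.

```csharp
using System;
using System.Globalization;
#if NET7_0_OR_GREATER
using System.Numerics;
#endif

// Hypothetical helper for illustration only; not part of PdfLexer.
public static class RealParser
{
#if NET7_0_OR_GREATER
    // .NET 7+: generic math lets one implementation serve float, double, decimal, ...
    public static T Parse<T>(string text) where T : IFloatingPoint<T>
        => T.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);
#else
    // Older targets: static abstract interface members are unavailable,
    // so the numeric type has to be fixed (here: double).
    public static double Parse(string text)
        => double.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);
#endif
}
```

On .NET 7+ a caller writes `RealParser.Parse<float>("0.5")`, while on older targets only `RealParser.Parse("0.5")` exists, so every call site also diverges. Multiply that by each feature that differs between targets and the code base gets messy quickly.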

If there is an easy way to support the older .net versions that I'm unaware of I'd be open to it, but as it is I do not have plans to try adding support back in. I understand that a lot of existing software is difficult to upgrade to newer .net versions so I hate excluding those environments, but ultimately I think it's best for the library. Going forward I plan to support at least all in-support versions after .net 7 (see .net support lifecycle).

@plaisted plaisted pinned this issue Sep 7, 2023
@podprad
Author

podprad commented Sep 10, 2023

Thank you for the answer.

I tried to downgrade PdfLexer to .net6.0 and I managed to do that by removing the generics for floating-point types and hardcoding the double type. However, it is a destructive change and some of the tests are still failing. There is more to do, because I will also need to downgrade the C# language level to 7.3 in order to support .net48.

I saw your code and it looks like you know what you are doing, making performant software by using the latest .net7 features.

In my case I can't just use iText, because its licensing is unacceptable for the project I'm working on. Other free libs do not allow such low-level access to PDF internal structures. That's why PdfLexer is a unique gem for me.

What I need is to extract some unusual data from PDF objects and maybe modify it. I will try to implement custom parsing and just write changes as incremental updates. It seems like a lot of work, but it's not rocket science at all :)

Alternatively, I will wrap PdfLexer as a self-contained executable.

@plaisted
Contributor

It looks like https://github.com/pdflexer/pdflexer/tree/dfe2182ecb9eec8ed0c4601a5d7e3ee0432c88b0 is the last version with .net 6 support and it isn't missing too much work, so it would probably be easier to start from there if you need .net 6 support. The core library is in pretty good shape (and was at that point as well), but I've been and still am adding extra features and improving the API / dev experience, so if you went that route you'd be "left behind" so to speak. If all you care about is the lower level parsing then that may be a decent option since that area of the library is pretty set.

Alternatively, if you go with a self-contained executable I'd be interested to know a little more about your use case and would be open to tweaking the library some to make it easier.

@podprad
Author

podprad commented Sep 10, 2023

Thanks for the hints.

I just need to read and modify additional tags at the per-page level. It's a really exotic case ;)

I also see now that I somehow missed it, but PdfSharp also provides a low-level API. So maybe I will go with PdfSharp as the cheapest option for .NET Framework 4.8 and replace it with PdfLexer after upgrading my project to .NET 8.

Anyway, thank you for the help and answers. I wish you good luck with this library.

Btw, some of the unit tests (on the master branch) are failing on my machine, because my locale settings use the "," character as the decimal separator. Anyway, not a big deal.

Best regards :)

@plaisted
Contributor

Thanks, I'll look into the tests / locale settings! Closing the issue, feel free to re-open if you change your mind.
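For context on the locale issue, a likely cause (an assumption, since the failing tests are not identified in this thread) is number formatting or parsing that picks up the machine's current culture. The sketch below uses a hand-built `NumberFormatInfo` with "," as the decimal separator to stand in for such a locale; `LocaleDemo` and its members are hypothetical names, not PdfLexer code.

```csharp
using System;
using System.Globalization;

// Hypothetical demo for illustration only; not part of PdfLexer.
public static class LocaleDemo
{
    // Stand-in for a locale that uses "," as the decimal separator.
    public static readonly NumberFormatInfo CommaDecimal = new NumberFormatInfo
    {
        NumberDecimalSeparator = ",",
        NumberGroupSeparator = "."
    };

    // Formats with whatever provider is passed in, as culture-sensitive code would.
    public static string FormatWith(double value, IFormatProvider provider)
        => value.ToString(provider);

    // Formats independently of machine settings; PDF syntax requires "." in reals.
    public static string FormatInvariant(double value)
        => value.ToString(CultureInfo.InvariantCulture);
}
```

`LocaleDemo.FormatWith(1.5, LocaleDemo.CommaDecimal)` yields `"1,5"`, while `LocaleDemo.FormatInvariant(1.5)` yields `"1.5"`. A test that compares against a hardcoded `"1.5"` therefore passes or fails depending on the machine unless the code under test (or the test itself) pins `CultureInfo.InvariantCulture`.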

@plaisted plaisted closed this as not planned Sep 11, 2023