Issues: adbar/trafilatura

Issues list

Extract text from buttons for semantic elements [question]
#573 opened Apr 23, 2024 by zirkelc
Question: check if page is readable? [question]
#572 opened Apr 23, 2024 by zirkelc
No timeout for some URLs when using focused_crawler [enhancement]
#566 opened Apr 19, 2024 by JER-CE
Why lzma for data compression? [question]
#559 opened Apr 15, 2024 by Yomguithereal
Scraping websites which are protected by WAF [question]
#558 opened Apr 15, 2024 by thebigbone
Preserve horizontal space in code blocks [enhancement]
#553 opened Apr 9, 2024 by mittsommer
Downloads: Add ZStandard as optional Accept-Encoding header [enhancement]
#537 opened Apr 3, 2024 by adbar
List element inside a table is lost [bug]
#531 opened Mar 29, 2024 by mikhainin
Link proportion heuristic fails for link paragraph [bug]
#529 opened Mar 27, 2024 by adbar
Doesn't extract links in table [bug]
#523 opened Mar 18, 2024 by obeone
PDF as output format? [feedback]
#519 opened Mar 4, 2024 by adbar
Link section missed at bottom of page [bug]
#518 opened Mar 1, 2024 by adbar
OVERALL_DISCARD_XPATH not discarding in some cases [question]
#510 opened Feb 19, 2024 by felipehertzer
Extract more text [bug]
#488 opened Jan 26, 2024 by vulinh48936
Deprecate functions and arguments [documentation]
#480 opened Jan 18, 2024 by adbar (milestone: v2.0)
save cookies on redirect [enhancement]
#478 opened Jan 16, 2024 by zeliboba7
include_links option mixes texts and links [bug]
#476 opened Jan 12, 2024 by hugoobauer
Add support for Netscape cookies file format [enhancement]
#473 opened Jan 11, 2024 by adbar
Add HTML output option [enhancement]
#472 opened Jan 9, 2024 by adbar
Missing Yoast FAQ block headers [bug]
#471 opened Jan 8, 2024 by adbar
Configure pre-commit for this repository and update documentation [documentation] [up for grabs]
#466 opened Jan 2, 2024 by adbar
Here is an interesting example... any tips? [question]
#459 opened Dec 19, 2023 by krstp