software + digital preservation // senior applications and tools engineer @webrecorder
- Montreal, QC, Canada
- https://bitarchivist.net
Pinned
-
brunnhilde (Public): Siegfried-based characterization tool for directories and disk images
-
bulk-reviewer/bulk-reviewer (Public): Identify, review, and remove sensitive files
-
CCA-Public/diskimageprocessor (Public): Tool for automated processing of disk images in BitCurator
811 contributions in the last year
Activity overview
Contributed to webrecorder/browsertrix-cloud, webrecorder/browsertrix-crawler, CCA-Public/diskimageprocessor, and 25 other repositories
Contribution activity
March 2024
Created 21 commits in 3 repositories
Created 1 repository
-
tw4l/js-wacz
JavaScript
This contribution was made on Mar 7
Created a pull request in harvard-lil/js-wacz that received 12 comments
Modify --pages option to copy pages files directly into WACZ
Fixes #91. Adds tests as well. Happy to make any changes you see fit. Thanks for the review!
+149 −27 lines changed • 12 comments
Opened 9 other pull requests in 3 repositories
webrecorder/browsertrix-cloud: 6 merged
-
Add single page QA GET endpoint
This contribution was made on Mar 27
-
Give test_crawl_timeout 10 mins to finish
This contribution was made on Mar 26
-
Add additional filters to page list endpoints
This contribution was made on Mar 21
-
Fix intermittent crawl timeout test failure
This contribution was made on Mar 21
-
Add updatable QA reviewStatus field to crawls
This contribution was made on Mar 5
-
Temporarily remove pages migration
This contribution was made on Mar 4
webrecorder/browsertrix-crawler: 2 merged
-
Temporarily disable tmp-cdx creation
This contribution was made on Mar 18
-
Add MKDocs documentation site for Browsertrix Crawler 1.0.0
This contribution was made on Mar 13
harvard-lil/js-wacz: 1 merged
-
Add option to use existing CDXJ rather than indexing from WARCs
This contribution was made on Mar 7
Reviewed 27 pull requests in 3 repositories
webrecorder/browsertrix-crawler: 17 pull requests
-
upgrade puppeteer-core to 22.6.1
This contribution was made on Mar 27
-
Unify WARC writing + CDXJ indexing into single class
This contribution was made on Mar 26
-
sitemap improvements: gz support + application/xml + extraHops fix
This contribution was made on Mar 26
-
Use RFC2606 invalid domain names
This contribution was made on Mar 26
-
fixes redirected seed (from #475) being counted against page limit:
This contribution was made on Mar 26
-
service worker capture fix: disable by default for now
This contribution was made on Mar 22
-
Switch to using JS WACZ
This contribution was made on Mar 22
-
improvements to 'non-graceful' interrupt to ensure WARCs are still closed gracefully
This contribution was made on Mar 21
-
Improved support for running as non-root
This contribution was made on Mar 21
-
Docs: Minor fixes to edit link & clarifications
This contribution was made on Mar 20
-
profiles: handle terminate signals directly
This contribution was made on Mar 18
-
SAX-based sitemap parser
This contribution was made on Mar 18
-
Fix Save/Load State
This contribution was made on Mar 15
-
Add MKDocs documentation site for Browsertrix Crawler 1.0.0
This contribution was made on Mar 14
-
Better tracking of failed requests + logging context exclude
This contribution was made on Mar 7
-
Fail on status code option + requeue fix
This contribution was made on Mar 4
-
warc: add Network.resourceType (https://chromedevtools.github.io/devt…
This contribution was made on Mar 4
webrecorder/browsertrix-cloud: 7 pull requests
-
Crawler pod memory padding + auto scaling
This contribution was made on Mar 27
-
MetaController update
This contribution was made on Mar 27
-
QA Runs Initial Backend Implementation
This contribution was made on Mar 20
-
kubernetes api: avoid overriding content-type header in kubernetes-asyncio, pass in via arg instead (main)
This contribution was made on Mar 18
-
profile browser fixes: better resource usage + load retry
This contribution was made on Mar 15
-
Docs: Update docs theme
This contribution was made on Mar 14
-
Fix execution time checking by keeping lastUpdatedTime in db
This contribution was made on Mar 4
harvard-lil/js-wacz: 3 pull requests
-
0.1.0 RC
This contribution was made on Mar 22
-
Modify --pages option to copy pages files directly into WACZ
This contribution was made on Mar 22
-
Add option to use existing CDXJ rather than indexing from WARCs
This contribution was made on Mar 7
Created an issue in webrecorder/browsertrix-crawler that received 1 comment
Use js-wacz to create WACZ files
Improvements for 1.0.0 branch of crawler:
Switch from using py-wacz to js-wacz for WACZ generation
Pass in indexes from /tmp-cdx rather than rein…
1 comment
Opened 8 other issues in 3 repositories
webrecorder/browsertrix-cloud: 3 closed, 1 open
-
QA: Add single page QA GET API endpoint
This contribution was made on Mar 27
-
Nightly crawl timeout test fails intermittently
This contribution was made on Mar 21
-
Add method of populating pages for older crawls
This contribution was made on Mar 14
-
Publish API docs separately
This contribution was made on Mar 6
harvard-lil/js-wacz: 2 closed
-
Add option to copy existing pages.jsonl/extraPages.jsonl files directly in WACZ
This contribution was made on Mar 20
-
Add option to use existing CDXJ indices rather than indexing from WARCs
This contribution was made on Mar 7
webrecorder/browsertrix-crawler: 2 closed
-
Temporarily disable temp-cdx generation
This contribution was made on Mar 18
-
Update documentation for 1.0.0
This contribution was made on Mar 13