How to: news metadata -> full text -> pipeline #165

kmcdono2 · 2022-10-04T13:40:15Z

Ticket to strategize getting a sample of full text articles into the toponym resolution pipeline.

kallewesterling · 2022-10-04T13:44:46Z

Just to start a thread here for myself: this pull request is where @thobson88 has created a workflow for generating fulltext for a given set of items, based on locally downloaded zip files. (Not the best fit for the pipeline outlined here.)

Meanwhile, I'm working on ingesting the fulltext into a separate table. (And then @thobson88's method will have to be used inside a VM for BNA access somehow!)

kmcdono2 · 2022-10-05T17:14:06Z

@kallewesterling let us know how we can help test this method, when it's useful!

kallewesterling · 2022-10-06T08:40:22Z

Thanks, definitely @kmcdono2! We're almost there with the fulltext table as well, which will be available outside of the BNA!

kmcdono2 · 2022-10-12T08:20:56Z

@kallewesterling @lukehare can we add some information here about how this process will work with the new API for toponym resolution?

kallewesterling · 2022-10-12T08:31:08Z

Yeah, that's a great point. We need to keep this in mind. I assume we could just tag it on outside the db code base (separation of concerns, which makes the most sense to me), but I can also think of ways to build it into the db code (which would be easier for the end user)... food for thought. Let's chat more!

ruthahnert · 2022-10-20T14:17:14Z

Would love to chat more about framing narratives

kmcdono2 · 2022-10-20T14:20:14Z

That would be great, @ruthahnert! Will follow up on Slack, but you can see here and in other tickets how we are trying to prepare to understand this as part of working with newspaper full text content.

kmcdono2 created this issue from a note in Applications (To do) Oct 4, 2022

kallewesterling self-assigned this Oct 4, 2022

dcsw2 moved this from To do to To review in Applications Feb 21, 2023
