Sequence labeling for documents #372

mouthgalya · 2019-09-05T02:26:37Z

Hi,
We would like to import a whole document (the full file content) and assign labels to its content. Currently doccano automatically parses the input and splits it into individual sentences. Can we do sequence labeling at the document level instead?
MG

icoxfog417 · 2019-09-05T02:40:01Z

You can do sequence labeling at the document level. However, I recommend splitting the document into sentences, which makes model training easier.

mouthgalya · 2019-09-05T14:17:24Z

But we are noticing that the document is split into lines by default (because of line feed characters) when we import the data. Is there any way to override this default behavior so that we can see the entire document content on the screen?

icoxfog417 · 2019-09-06T03:43:55Z

doccano currently splits the imported data at each line break. As a workaround, you can replace each line break with the literal text \n. Please refer to #330.
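
The workaround above can be sketched as a small preprocessing step run before import. This is only an illustration, assuming a plain-text import where each line of the uploaded file becomes one document; the helper names `escape_line_breaks` and `to_import_lines` are hypothetical and not part of doccano.

```python
from typing import Iterable


def escape_line_breaks(text: str) -> str:
    """Collapse one document onto a single line.

    Real line breaks are replaced with the two-character sequence
    backslash + "n", so a line-based importer sees one line per document.
    """
    # Normalize Windows (CRLF) and old Mac (CR) line endings to LF first.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    # Drop leading/trailing breaks so the escaped text has no stray "\n" ends.
    return normalized.strip("\n").replace("\n", "\\n")


def to_import_lines(documents: Iterable[str]) -> str:
    """Build the content of an upload file: one escaped document per line."""
    return "\n".join(escape_line_breaks(doc) for doc in documents) + "\n"


if __name__ == "__main__":
    docs = ["First line.\nSecond line.", "Another document."]
    print(to_import_lines(docs), end="")
```

Note that character offsets of any labels would then refer to the escaped text, where each original line break occupies two characters, so offsets need adjusting if the original line breaks are restored after export.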

icoxfog417 added the question label Sep 5, 2019

icoxfog417 closed this as completed Sep 5, 2019
