data state restoration #64

TimotheeMickus · 2024-03-26T07:14:22Z

closes #63.

same idea as v2, didn't bother porting from it.

does not include bucket states, although this could maybe be done by collecting the line indices of all examples in the reservoir. That would come at a (potentially high) communication overhead cost.
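
For readers unfamiliar with the idea, here is a minimal sketch of restoring a reader position from a saved line index. It is not the code in this PR: `strided_lines`, the `resume_after` parameter, and the `{"line_idx": ...}` state layout are all hypothetical names chosen for illustration. Only `stride` and `offset` come from the diff in this thread.

```python
from typing import Iterable, Iterator, Tuple


def strided_lines(
    lines: Iterable[str],
    stride: int = 1,
    offset: int = 0,
    resume_after: int = -1,
) -> Iterator[Tuple[int, str]]:
    """Yield (line_idx, line) for one worker's share of a corpus.

    A worker with a given offset reads lines offset, offset + stride, ...
    Lines with index <= resume_after are skipped, so passing the last
    line index recorded in a checkpoint resumes right after it.
    """
    for line_idx, line in enumerate(lines):
        if line_idx % stride != offset:
            continue  # this line belongs to another worker
        if line_idx <= resume_after:
            continue  # already consumed before the checkpoint
        yield line_idx, line


corpus = [f"sentence {i}" for i in range(10)]

# First run: worker 1 of 2 consumes two examples, then "checkpoints".
reader = strided_lines(corpus, stride=2, offset=1)
consumed = [next(reader), next(reader)]
data_state = {"line_idx": consumed[-1][0]}  # hypothetical state layout

# Restart: resume from the saved line index instead of from line 0.
resumed = list(
    strided_lines(corpus, stride=2, offset=1, resume_after=data_state["line_idx"])
)
```

Examples still sitting in a bucket or reservoir at checkpoint time are not covered by a single line index, which is the limitation described above. Covering them would mean storing one index per buffered example.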

TimotheeMickus · 2024-05-21T07:48:23Z

also includes a minor fix for update_vocab being only partially removed (following #61)

jrvc · 2024-05-22T13:26:14Z

mammoth/inputters/dataset.py

@@ -162,13 +184,13 @@ def _cast(example_dict):
                if self.transforms is not None else lambda x: x
            ),
            stride=self.stride,
-            offset=self.offset,
+            offset=offset,
        )
        examples = map(_cast, examples)
        yield from examples

    # FIXME: some RNN archs require sorting src's by length


Outside this PR's concerns... do we still support RNNs? If not, we can just delete this.

No, we no longer support RNNs. The move is to externalize transformer variants via lucidrains (see #56). Will remove the comment.

jrvc · 2024-05-22T13:32:08Z

mammoth/models/model_saver.py

@@ -95,12 +96,14 @@ def save(self, step, moving_average=None):
                self._rm_checkpoint(todel)
            self.checkpoint_queue.append(chkpt_names)

-    def _save(self, step):
+    def _save(self, step, save_model, data_state, device_context):
        """Save a resumable checkpoint.

        Args:
            step (int): step number
            model (nn.Module): torch model to save


missing: save_model (type): description

yeah, i think it was renamed from model to save_model? not sure why. will fix that.

jrvc

lgtm :)

TimotheeMickus requested a review from jrvc March 26, 2024 07:14

Mickus Timothee added 2 commits May 20, 2024 18:41

data state restoration, first pass

441f13c

fix test expectations (add line index support)

ec347bc

TimotheeMickus force-pushed the feats/data-restoration branch from edaa78b to ec347bc Compare May 20, 2024 15:44

Mickus Timothee added 5 commits May 20, 2024 18:50

making tests valid

1497952

fix after merge

279876e

runable

cc807c7

unplug model vocab (remnants of opts cleaning?)

9308e7d

smoketesting ok?

b659522

TimotheeMickus marked this pull request as ready for review May 21, 2024 07:46

jrvc reviewed May 22, 2024

jrvc approved these changes May 22, 2024

Mickus Timothee added 2 commits May 22, 2024 17:35

minor fixes

af4b212

oop,s translator was broken

ba3abb5

TimotheeMickus merged commit dc01039 into main May 22, 2024
2 checks passed

TimotheeMickus deleted the feats/data-restoration branch May 22, 2024 14:53
