Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the AMR graph constrction and RGCN in example #578

Open
wants to merge 25 commits into
base: develop
Choose a base branch
from

Conversation

cminusQAQ
Copy link

Description

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [Doc], [Feature]])
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented
  • To the my best knowledge, examples are either not affected by this change,
    or have been fixed to be compatible with this change
  • Related issue is referred in this PR
  • If the PR is for a new model/paper, I've updated the example index here.

Changes

@cminusQAQ cminusQAQ force-pushed the dev_amr_rgnn_demo branch 4 times, most recently from 52d81ee to dee086d Compare September 9, 2022 09:05
Copy link
Contributor

@AlanSwift AlanSwift left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check the comments.

@@ -195,6 +218,18 @@ def node_features(self) -> NodeFeatView:

return self.nodes[:].features

@property
def ntypes(self) -> List[str]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the graph is homogeneous, it should be an exception.

@@ -157,6 +168,12 @@ def add_nodes(self, node_num: int):
node_num > 0
), "The number of nodes to be added should be greater than 0. (Got {})".format(node_num)

if not self.is_hetero:
assert ntypes is None, "The graph is homogeneous, ntypes should be None."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please raise exceptions

@@ -326,6 +361,18 @@ def edges(self) -> EdgeView:
"""
return EdgeView(self)

@property
def etypes(self) -> List[Tuple[str, str, str]]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When it is homogeneous, self._etypes is not assigned.

@@ -641,6 +713,41 @@ def edge_attributes(self) -> List[Dict[str, torch.Tensor]]:
return self._edge_attributes

# Conversion utility functions
def make_data_dict(self) -> Dict[Tuple[str, str, str], Tuple[torch.Tensor, torch.Tensor]]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be a utility function.

edge_token = graph.edge_attributes[edge_idx]["token"]
s.add(edge_token)
except Exception as e:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please handle the catched exception

@@ -705,6 +709,41 @@ def _process(self):
self.test = self.build_topology(self.test)
if "val" in self.__dict__:
self.val = self.build_topology(self.val)
# build_edge_vocab and save
if self.init_edge_vocab:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be a new function similar to "build_vocab()". E.g., build_edge_vocab()

@@ -311,6 +311,7 @@ def __init__(
for_inference=False,
reused_vocab_model=None,
nlp_processor_args=None,
init_edge_vocab=False,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

init_edge_vocab is not clear. It covers 1.build edge vocab, 2. an indicator to use heterogeneous graph.
TODO: discuss

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. build_edge_vocab: bool
  2. is_hetero: bool

)
data_item_collect.append(data_item)

data_item_collect = self.dataset.build_topology(data_items=data_item_collect)
for i in range(len(data_item_collect)):
data_item_collect[i] = self.dataset._vectorize_one_dataitem(
data_item_collect[i], self.vocab_model, use_ie=use_ie
data_item_collect[i], self.vocab_model, use_ie=use_ie, edge_vocab=self.edge_vocab
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the api in dataset._vectorize_one_dataitem doesn't have the parameter: "edge_vocab", please check it.
_vectorize_one_dataitem

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. move the edge vocab to the library code
  2. unify edge_vocab with vocab_model

@@ -140,6 +140,9 @@ def __init__(
"w2v_bert",
"w2v_bert_bilstm",
"w2v_bert_bigru",
"w2v_amr",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why amr shoud be a general embedding strategy?
I don't think it should be regarded as a new general embedding strategy.

@@ -211,6 +222,17 @@ def __init__(
else:
rnn_input_size = word_emb_size

if "pos" in word_emb_type:
self.word_emb_layers["pos"] = WordEmbedding(37, 50)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please don't use magic numbers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants