Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make SubGraph more pythonic #266

Open
1 task done
Mec-iS opened this issue Sep 5, 2022 · 5 comments
Open
1 task done

Make SubGraph more pythonic #266

Mec-iS opened this issue Sep 5, 2022 · 5 comments
Assignees
Labels
good first issue Good for newcomers
Projects

Comments

@Mec-iS
Copy link
Contributor

Mec-iS commented Sep 5, 2022

I'm submitting a

  • feature request.

Current Behaviour:

SubGraph uses transform and inverse_transform to provide Node-to-index and index-to-Node. This is quite non-standard as "trasformations" are usually functions applied to data structures, plus Python has its fluent interface to define this kind of behaviour

Expected Behaviour:

The __getitem__/__setitem__ protocol should be implemented for SubGraph so that will be possible to do:

# currently done by `inverse_transform`
subgraph = kglab.SubgraphMatrix(kg, sparql)
node = subgraph[i]

to return a node at i position. And:

# currently done by `transform`
subgraph = kglab.SubgraphMatrix(kg, sparql)
node_index = subgraph.index(node)

to return the index of a given node.

This will allow to use Subgraph as an iterator with minor additions.

This requires:

  • to port inverse_transform to __getitem__
  • to port transform to __setitem__
  • and implement Subgraph.index()

For example:

class test:
    some_list = []
    def __getitem__(self, key):
        return self.some_list[key]
    def __delitem__(self, key):
       del self.some_list[key]
    def index(self, node):
        return self.some_list.index(node)

This implies modifying all the existing examples and tests.

@Mec-iS Mec-iS added the good first issue Good for newcomers label Sep 5, 2022
@Mec-iS
Copy link
Contributor Author

Mec-iS commented Sep 5, 2022

cc: @ceteri @tomaarsen

@tomaarsen
Copy link
Collaborator

Are you aiming to remove transform and inverse_transform in favour of the Pythonic alternatives, or merely also implement the Pythonic alternatives? Obviously the former is a breaking change, while the latter is not.
I'm all for this change, by the way.

@Mec-iS
Copy link
Contributor Author

Mec-iS commented Sep 5, 2022

yes I would like to remove the current interface. If there are breaking changes, maybe the time for applying them is now, before it stabilises. Anyway we could have both for some time if needed.

@Ankush-Chander
Copy link
Collaborator

Hi @Mec-iS ,
Change sounds great. Will make usage more intuitive.
I intend to take this up.

@ceteri
Copy link
Collaborator

ceteri commented Sep 5, 2022

Good points, all around!

It's fine to have a breaking change, especially now as we're going through major refactoring and will need to bump to a major release update afterwards.

I really like the more pythonic approach!

In terms of breaking changes, we have use cases for this in the tutorial notebooks, which might be reused for unit tests?

Longer-term perspectives (beyond this refactoring):

  • My one caveat is that for graphs at scale, this indexing can become a critical performance bottleneck.
  • For example, our colleagues at the manufacturing customer have had to use keyvi simply to handle the scale of indexing, and that requires the keys (i.e., the node symbols) to be known in advance so that an FST can be built.
  • Also, we foresee some changes later to the definition of a subgraph such as having these boundaries based on ML products (e.g., "motifs" or various kinds of embeddings).

My main points about these long-term perspectives is to plan on having the indexing process become "pluggable", possibly with some outside framework to handle transform/inverse-transform lookups, and also the construction of some variants of the Subgraph class may be empirically driven.

@ceteri ceteri added this to the Refactor KnowledgeGraph class milestone Sep 5, 2022
@ceteri ceteri added this to In progress in kglab Sep 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers
Projects
No open projects
kglab
  
In progress
Development

No branches or pull requests

4 participants