Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DON'T MERGE: Example matching decision trees for sample patterns #1482

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

Upabjojr
Copy link
Contributor

This example shows how two matching patterns can be represented by a decision tree in SymEngine.

The patterns are x**w and y, where w is a wildcard. This example explicitly avoids the commutative property.

The SymPy-MatchPy script to create this is:

matcher = ManyToOneMatcher()
var("x y z")
w = WC("w")
matcher.add(Pattern(x**w))
matcher.add(Pattern(y))

Files:

  • example.cpp contains the decision tree in the form of a function, that is all yield's have been replaced by return's. This has the disadvantage of returning only the first match.
  • example_yield.cpp contains the decision tree in a class. The generator function has been broken into many functions contained in the match_root class. The match_root class contains a pointer to a function to execute. After the pointer is called, the function will change to pointer to point to the next function to execute. Local variables of the generator are stored as class variables.

Generators in C++

Alternatives for generators in C++ are:

  • Threads: they are expansive to initialize, discarded.
  • boost::coroutines seems interesting, but using low level code to hack the stack trace is concern for potential issues. Furthermore, better keep the code free of boost library dependencies.
  • The trick of breaking up the generator function in blocks and using a pointer to a function.

@Upabjojr
Copy link
Contributor Author

For completeness, the Python code used to create those files is:

from collections import deque

def match_root(subject):
        subjects = deque([subject]) if subject is not None else deque()
        subst0 = Substitution()
        # State 2200
        if len(subjects) >= 1 and isinstance(subjects[0], Pow):
                tmp1 = subjects.popleft()
                subjects2 = deque(op_iter(tmp1))
                # State 2201
                if len(subjects2) >= 1 and subjects2[0] == x:
                        tmp3 = subjects2.popleft()
                        # State 2202
                        if len(subjects2) >= 1:
                                tmp4 = subjects2.popleft()
                                subst1 = Substitution(subst0)
                                try:
                                        subst1.try_add_variable('i2', tmp4)
                                except ValueError:
                                        pass
                                else:
                                        # State 2203
                                        if len(subjects2) == 0:
                                                # State 2204
                                                if len(subjects) == 0:
                                                        tmp_subst = Substitution()
                                                        tmp_subst['w'] = subst1['i2']
                                                        # 0: x**w
                                                        yield 0, tmp_subst
                                subjects2.appendleft(tmp4)
                        subjects2.appendleft(tmp3)
                subjects.appendleft(tmp1)
        if len(subjects) >= 1 and subjects[0] == y:
                tmp6 = subjects.popleft()
                # State 2205
                if len(subjects) == 0:
                        # 1: y
                        yield 1, subst0
                subjects.appendleft(tmp6)
        return
        yield

@Upabjojr
Copy link
Contributor Author

@henrik227 @wheerd I am trying to find a way to translate parts of MatchPy into C++ to allow the generated decision tree to be compiled with SymEngine. The generators (functions with yield) are particularly complicated to translate into C++.

I have proposed to use a class with a function pointer, which will point to blocks of code of the generator. An example is contained in the file example_yield.cpp.

Do you have by chance a better suggestion for this?

{
public:
typedef tuple<int, Substitution2> YieldType;
void (match_root::*current)();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should use the std::function for this to avoid handling raw pointers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you're right. I'll give it a try later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, using std::function would introduce a double overhead. You need to assign a method like:

current = std::bind(&match_root::next_function, this);

or

current = [=](){ this->next_function(); };

I would stick with the function pointers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I didn't realize you need to store the this argument also. C++ is usually good at inlining things, so the current = [=](){ this->next_function(); }; approach might not introduce an overhead, but it has to be benchmarked. So for now use the raw function pointers, and we can figure out later how to wrap it in a safer way without introducing any overhead.

@hbarthels
Copy link

Do you have by chance a better suggestion for this?

Unfortunately, I don't. I'm really not an expert in C++.

However, one of the reasons why we use yield is flexibility. You can stop matching as soon as the first match was found, but you can also get all matches. If you know that you only need one of those two cases, there might be no need to implement yield at all.

@Upabjojr
Copy link
Contributor Author

If you know that you only need one of those two cases, there might be no need to implement yield at all.

The commutative matcher in MatchPy makes heavy usage of generators. Especially your implementation of the Uno, Fukuda, Matsui algorithm. That is needed here as well, so we have to find a way to port generators into C++.

It takes less time to just translate your code than to rewrite everything in a new structure, so I'll try to keep the generator structure.

@hbarthels
Copy link

If you know that you only need one of those two cases, there might be no need to implement yield at all.

The commutative matcher in MatchPy makes heavy usage of generators. Especially your implementation of the Uno, Fukuda, Matsui algorithm. That is needed here as well, so we have to find a way to port generators into C++.

It takes less time to just translate your code than to rewrite everything in a new structure, so I'll try to keep the generator structure.

Yes, I guess that makes sense. I was only thinking about the generation of matches, not the internals.

return d;
}

tuple<int, Substitution2> match_root(RCP<const Basic> subject)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's ok to use std::variant here, if that's what is intended.

For older compilers, we can provide a compatible implementation, say using Boost variant, i.e., by having SymEngine::variant, that will be equal to std::variant for compilers that support it, and boost::variant for older compilers. For now just use std::variant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, std::variant was meant to be used in the other PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type is a tuple, the integer stands for the rule number (e.g. RUBI rules are numbered), while Substitution2 is a map of wildcards to expressions to be substituted. Substitution2 is then used in a replacement pattern to return the result of the integration.

@isuruf
Copy link
Member

isuruf commented Jan 7, 2019

https://github.com/tonbit/coroutine is a small header only library implementing coroutines supporting linux, osx and windows

@Upabjojr
Copy link
Contributor Author

Upabjojr commented Jan 7, 2019

I suggest to create a module with the currently working stuff, which means everything except for the commutative support. The commutative matcher relies a lot on coroutines and it's pretty hard to port to C++.

You have a Teuchos submodule. The code could be appended as a side module in that location.

@symengine symengine deleted a comment from isuruf-bot Jan 11, 2019
@symengine symengine deleted a comment from isuruf-bot Jan 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants