Improve unhelpful logging of rule engine #4317
Comments
@ThomDietrich yes we should work on this, maybe we can start by collecting some example which lead to those errors and work from there on improving the error handling. If needed we can even pair up around the smarthome day 👍 |
I absolutely agree, it would help a lot if these error messages would be more helpful. However, while this sounds like a trivial task on the first glance, it probably will turn out to be pretty nasty in the end: these error messages are not created by us, but by the xtext parser, lexer and the scoping. Therefore I second @martinvw, we need to collect them (ideally a sample rule + error message) and then have a deeper look individually on what can be done in order to support xtext in that situation to generate better error messages. Sometimes there might have been taken a short-cut so that the scoping is clueless what to expect and we can change it to the better. Sometimes we might be able to add our custom validations with also our own error messages. In any case it's gonna be some effort. |
One common misleading error that comes to mind is, for example, when I pass something to a Number Item that cannot be parsed into a Number I end up getting a NullPointerException (NPE) rather than a ParseException (or whatever would be appropriate). In fact, NPE seems to be the default exception thrown for pretty much all type and parsing problems. One thing I have noticed is when I wrap my rules in a try/catch myself, I will actually get the ParseException so somewhere along the way that exception is being lost and replaced with a NPE. I agree with @SJKA though, I suspect the root of the problem is the underlying Rules engine, and not that the developers have just ignored this issue. One thing that would go a really long way towards more helpful errors would be the line in the Rule that generated the error. I suspect that once the Rule gets parsed that information gets lost, but if not just knowing specifically what line generated the error can be very helpful. I am not completely up on what is going on with the Experimental Rules Engine but I hope that this sort of thing is being taken into account. |
...so that sendCommand or postUpdate don't fail with a weird NPE anymore if e.g. a non-parsable string is attempted to be sent to a number item. See eclipse-archived#4317 (comment) Signed-off-by: Simon Kaufmann <simon.kfm@googlemail.com>
That's indeed a good example for something which we have in our hands and can easily fix: In |
Great, that we can already make steps! |
...so that sendCommand or postUpdate don't fail with a weird NPE anymore if e.g. a non-parsable string is attempted to be sent to a number item. See #4317 (comment) Signed-off-by: Simon Kaufmann <simon.kfm@googlemail.com>
@ThomDietrich do you have the rules belonging to these exception then I can take a quick look as well :-) |
Sure thing. One thing I need to add because I ran into it while testing: There seems to be a huge difference between the execution of Item-triggered rules and "System started" rules. The below examples will show that. This needs to be addressed as well or we will fight a battle on multiple fronts forever. 1. "Rule 'myRuleTest': java.lang.Number"Short: Type casting of NULL Example: rule "Test"
when
System started
then
logInfo("Test", "Item: " + Test_Item_1.name)
logInfo("Test", "State: " + Test_Item_1.state)
logInfo("Test", "Number: " + (Test_Item_1.state as Number))
logInfo("Test", "Item: " + Test_Item_2.name)
logInfo("Test", "State: " + Test_Item_2.state)
logInfo("Test", "Number: " + (Test_Item_2.state as Number))
end Results in:
Pay attention to the exact error.
2. "Rule 'myRuleTest': The name '.blaaa' cannot be resolved to an item or type."Short: non-existent method Example: rule "Test"
when
System started
or Item Test_Item_1 changed
then
logInfo("Test", "Item: " + Test_Item_1.name)
logInfo("Test", "Number: " + (Test_Item_1.state as Number).blaaa)
end
3. Non-existent ItemWhile we are at it, let's add yet another. rule "Test"
when
System started
or Item Test_Item_1 changed
then
logInfo("Test", "Item: " + Non_Existent_Item.name)
end
|
@SJKA as promissed 😉 |
Thanks 😄 Your first point about the two different places where the script interpreter is used obviously is very valid. This looks suspicious! I will check if I can figure out if there is/was a good reason for this. Otherwise this should be unified, of course. The other two examples look perfectly fine to me when run at run-time. For the |
Oh, I just found out some additional details regarding points 2.) and 3.). Apparently there is this "special handling" for startup rules here: It basically intends to detect if a rule failed because of a missing item and then will re-execute it later, once a new item was added. It will just log a This obviously is a good idea if the item really exists and will be added later to the system. However, as you used a wrong item in 2.) which will never show up the rule will never get executed in a way that it fails "loudly". Which is a little sad if you didn't do this on purpose but accidentally have a typo in your item names (but remember, this only happens at startup). The 3.)rd case is a little sad because the error message apparently is the same (the interpreter cannot know if it is a item or not which it is looking for). So it gets also "tricked" into the same repeat-once-an-item-gets-added-behavior like above. To be honest, I'm not really sure what to do about it nevertheless. I still think solving #1896 would be the best choice because then we can completely eliminate this "special" behavior and get the same error messages in both cases. |
👍 |
Yes sounds like it. I also agree that "...cannot be resolved to an item or type." is an acceptable result. I was wondering if they can be further distinguished. ".blaaa" is clearly neither an item nor a type, correct? What's your opinion about the first case ("java.lang.Number")? |
I did some interpreter archeology and I think I found a place to hook into (#4462) for those cases. Not exactly perfect, but good enough I think. How about
and
? |
Oh, wait, hang on... I just figured out how to find out where it happened. So we could have something like this:
|
Sounds great 👌 |
I have to second the 👌
This will make a lot of people very happy!! Thanks Simon! Minor details: Where does the semicolon come from? |
Good question. The ScriptError.java adds it by "itself". I'm not 100% sure though in which contexts it is used and how it is intended to look like where (no unit tests unfortunately). So changing this just needs a little more archeology to be done. |
I see. Well it's not that important. From the looks of it we'd be fine if you removed the periods at the end of your error strings. |
You're right, that made it look weird |
@SJKA Thanks for your changes! I had the chance to see those additions in action yesterday and it's a huge improvement in the overall workflow. I'll close this issue. |
https://github.com/eclipse/smarthome/blob/a4dea98a1c91a36d4dbe6d5d616879988f176459/bundles/model/org.eclipse.smarthome.model.rule.runtime/src/org/eclipse/smarthome/model/rule/runtime/internal/engine/RuleEngineImpl.java#L341-L345
Hello, the above logging lines are often encountered while DSL rule development. Their minimalistic form can sometimes be confusing or non-helpful. One interesting example is:
Also quite common is:
I believe this could easily be improved for the sake of user experience. One simple approach would be to distinguish commonly known error reasons and add constructive feedback to the logging message.
@SJKA I believe you are the right person to trigger in this matter? What do you think?
@martinvw I think we discussed something similar a while back.
The text was updated successfully, but these errors were encountered: