Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

temporary QueryWorkflowAnalysis code #684

Open
wants to merge 14 commits into
base: rku/aql-machine
Choose a base branch
from

Conversation

Kelerchian
Copy link
Contributor

No description provided.

}
}

pub fn check(&'a self, query: &'a Query) -> Vec<QueryWorkflowAnalysisError> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The query argument should be the same as self.query, right? In that case it should be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes.

.workflows
.iter()
.for_each(|(_ident, workflow)| errors.extend(self.check_workflow(workflow)));
errors
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d use self.query.workflows.values().flat_map(|wf| self.check_workflow(wf)).collect()

}
Call {
workflow: workflow_ident,
cases: _,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cases need to be checked as well

..
} => {
if let Some(workflow) = self.query.workflows.get(&*workflow_ident) {
self.check_workflow(workflow)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn’t do this here, since we have a reliable top-level enumeration of all workflows. This means the tracker isn’t needed.

}

#[derive(Default)]
pub(crate) struct WorkflowTracker<'a>(pub(crate) RefCell<BTreeSet<&'a Workflow<'a>>>);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably no longer needed, but still: why pub(crate)? Such internal things should always stay as local as possible, especially with a RefCell inside.

Comment on lines 736 to 737
pub(crate) query: &'a Query<'a>,
pub(crate) workflow_tracker: WorkflowTracker<'a>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think neither of these fields are needed: the (recursive) analysis functions can just get the outer Query and the syntax element in question as arguments. When analysing the flow inside a workflow we’ll probably need to have some more context, though, so it might be appropriate to keep this struct with the query field in any case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also imagine the type checker information to be stored in this struct too because it will be used inside while checking binders

Copy link
Contributor

@jmg-duarte jmg-duarte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was faced with similar vec vs iter decisions while writing the code to dive down the workflow getting the steps. Curious to learn your take on that.

}

fn check_step(&'a self, step: &'a WorkflowStep<'a>) -> Vec<QueryWorkflowAnalysisError> {
use super::workflow::WorkflowStep::*;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: Not a fan of using all the enum members. In this case is not particularly problematic but take a look into this https://youtu.be/8j_FbjiowvE?si=DOchI2_0vieFkEjg&t=103 (not the full video)

Copy link
Contributor Author

@Kelerchian Kelerchian Mar 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wat?? TIL thanks for the reference

rust/actyx/ax-aql/src/language/parser.rs Outdated Show resolved Hide resolved
rust/actyx/ax-aql/src/language/parser.rs Outdated Show resolved Hide resolved
});
}

errors.extend(cases.iter().flat_map(|(_, steps)| self.check_steps(steps)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this always runs, you can do:

let mut errors = cases.iter().flat_map(|(_, steps)| self.check_steps(steps)).collect::<Vec<_>>();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason, I feel that workflow_identity error should go first before the cases error. Not that I am sure why this ordering matter at all

rust/actyx/ax-aql/src/language/parser.rs Outdated Show resolved Hide resolved
rust/actyx/ax-aql/src/language/parser.rs Outdated Show resolved Hide resolved
@Kelerchian
Copy link
Contributor Author

I was faced with similar vec vs iter decisions while writing the code to dive down the workflow getting the steps. Curious to learn your take on that.

My first thought was: if these methods can return an iterator rather than vec, then at the end a chain and collect is executed, then it might result in a better code. But I do not know how to do that nor I can find any references on the internet on how to do that

}
_ => Err(format!("{:?} cannot be accessed by {:?}", x, first)),
Type::Atom(type_atom) => match type_atom {
TypeAtom::Universal => return Ok(Type::Atom(TypeAtom::Universal)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be

return Ok(*cur_type) // or cur_type.clone() if Type doesn't implement Copy

Or even just:

return Err(anyhow!("UNIVERSAL type is not indexable"))

rust/actyx/ax-aql/src/language/type_check.rs Outdated Show resolved Hide resolved
rust/actyx/ax-aql/src/language/type_check.rs Outdated Show resolved Hide resolved
rust/actyx/ax-aql/src/language/type_check.rs Outdated Show resolved Hide resolved
rust/actyx/ax-aql/src/language/type_check.rs Outdated Show resolved Hide resolved
rust/actyx/ax-aql/src/language/type_check.rs Outdated Show resolved Hide resolved
pub(crate) error_type: QueryTypeCheckErrorType,
}

type ExtraMessage = String;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is usually not done in Rust: type aliases are used to write down complex types with less verbosity, otherwise newtypes like struct ExtraMessage(String) are used to create a separate type with different semantics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With struct how can I make it so that ExtraMessage can be operated as if it is a string?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you want to achieve? In which way shall the new type be different from String?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to state that this is a string, but also an "ExtraMessage" for InvalidIndex error.
The reason is, DrillDownResult cannot be Result<Type, QueryTypeCheckErrorType::InvalidIndex> and Result<Type, String> isn't clear enough on what the string are

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case I’d just use Result<Type, QueryTypeCheckErrorType> as the return type.

Copy link
Member

@rkuhn rkuhn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general the control flow structure will need to be geared towards traversing the statements of a workflow in the same order as during evaluation, including a context that is enriched along the way. This enables checking that eventLabel @ whatever refers to a valid whatever — which may either be a workflow argument or a binding.

}
}

pub(crate) struct QueryTypeCheck<'a> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, type checking is a function, not an object, so I’d start from functions, not methods. If the argument lists become too long and repetitive, then some arguments can be passed within a single struct.

tail: vec![Index::String(String::from("some_string"))].try_into().unwrap(),
},
)
.is_err());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add the second parameter to assert! with a short explanation of what failed

}

#[test]
fn tuple() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I'd separate all the asserts in this test into multiple tests. AFAIK you also get to take more advantage of parallelism that way

rust/actyx/ax-aql/src/language/type_check/mod.rs Outdated Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this file is lacking some docs, they should help you pick the file back up in the future too as well as any potential contributors.

rust/actyx/ax-aql/src/language/types.rs Outdated Show resolved Hide resolved
rust/actyx/ax-aql/src/language/types.rs Outdated Show resolved Hide resolved
rust/actyx/ax-aql/src/language/types.rs Outdated Show resolved Hide resolved
rust/actyx/ax-aql/src/language/types.rs Outdated Show resolved Hide resolved
rust/actyx/ax-aql/src/language/types.rs Outdated Show resolved Hide resolved
(Type::Record(a), Type::Record(b)) => {
let a = a.iter().collect::<BTreeSet<_>>();
let b = b.iter().collect::<BTreeSet<_>>();
let intersection = BTreeSet::intersection(&a, &b).cloned().collect::<BTreeSet<_>>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider the following records:

type A = { x: number, y: string }
type B = { y: string }

As far as your code goes, this makes it so that A supertypeof B which I agree with, however, what if we have the following:

type A = { x: number, y: { y_x: string, y_z: string } }
type B = { x: number, y: { y_x: string } }

Logic follows that A supertypeof B is still true, however I don't think this piece of code handles this case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven’t yet read the code, but @jmg-duarte your comment above is incorrect: a subtype can be used in place of its supertype (Liskov substitution principle), which means that it must have at least the properties of the supertype. A is a subtype of B, in both cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

really nice catch. thanks for this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven’t yet read the code, but @jmg-duarte your comment above is incorrect: a subtype can be used in place of its supertype (Liskov substitution principle), which means that it must have at least the properties of the supertype. A is a subtype of B, in both cases.

True, I applied the logic for refinements here where NUMBER is supertype of 10.

But I think the nesting issue still applies?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nesting issue still applies I think.

One(T),
}

fn spread<T>(vec: Vec<T>, spreader: impl Fn(&T) -> Spread<T>) -> Vec<T> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the point of this function, could you add docs or explain?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do. and it will be moved down because it doesn't make sense being between struct Type and impl Type

Comment on lines +322 to +347
/// Spread union tree into a set of type references without collapsing
fn spread_union((a, b): &(Type, Type)) -> BTreeSet<&Type> {
let spread = spread(vec![a, b], |x| match x {
Type::Union(union) => {
let (a, b) = union.as_ref();
Spread::Many(vec![a, b])
}
x => Spread::One(x),
});

spread.into_iter().collect()
}

/// Spread union tree into a set of types with collapsing
fn spread_union_collapsing((a, b): &(Type, Type)) -> BTreeSet<Type> {
let spread = spread(vec![a.clone(), b.clone()], |x| match x.clone().collapse() {
Type::Union(union) => {
let (a, b) = union.as_ref();
Spread::Many(vec![a.clone(), b.clone()])
}
x => Spread::One(x),
});

spread.into_iter().collect()
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel there must be a better way of writing this API. The double clone that ends up happening in line 337 seems wrong.

Maybe you could use a Collapse trait and implement it for the various BTreeSet, delaying the cloning until its really necessary for example. Furthermore, as we discussed, collapse probably doesn't need to consume the type, you're just checking a bunch of things and returning a new type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while collapse doesn't consume, it still produces a new type.
this in turn forces this function to return BTreeSet and not BTreeSet<&Type>
because we cannot return references to objects created within the function (produced by collapse)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that makes sense but my point was also about how (if possible) to make a slightly better API by just having a single function instead of these two.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants