Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Seek by id runtime optimization #643

Merged
merged 15 commits into from
Apr 19, 2024
Merged

Conversation

AviAvni
Copy link
Contributor

@AviAvni AviAvni commented Apr 16, 2024

fix #632

Summary by CodeRabbit

  • New Features
    • Enhanced node retrieval operations to allow filtering by IDs, improving precision and efficiency.
    • Introduced dynamic handling and manipulation of node filters in label scans, enabling more refined search capabilities.
    • Added functionality for resolving filter expressions on graph data structures using roaring bitmaps.
    • Implemented memory alignment function for efficient memory allocation in roaring bitmaps initialization.

Copy link

coderabbitai bot commented Apr 16, 2024

Walkthrough

The updates involve enhancing node retrieval mechanisms in RedisGraph, focusing on efficient and precise data access by implementing ID filters, dynamic bitmap operations, and improved node seeking methods.

Changes

Files Change Summary
.../op_node_by_id_seek.c Significant modifications including ID filtering, dynamic bitmap manipulation, and improved node seeking mechanisms.
.../op_node_by_label_scan.c Added functions and fields for ID-based filtering, updated functions for efficiency in handling filters.
.../shared/filter_functions.c Introduces functionality for resolving filter expressions using roaring bitmaps.
module.c Added header inclusion for "util/roaring.h" and introduced rm_aligned_malloc function for memory alignment.

Possibly related issues

  • allow optimization where apply op is present #139: The changes in .../op_node_by_label_scan.c could optimize filter operations, potentially addressing the optimization issues discussed.
  • Index is not utilized #232: The enhancements in node seeking and filtering might resolve the index utilization problems by ensuring efficient query execution, aligning with the issue's objectives.

🐇💻🌟
To the nodes we seek, both near and far,
With filters fine as a gossamer star.
Bitmaps shift and IDs align,
In RedisGraph, efficiency we find.
Hop, hop, hooray, for code that shines! 🌟💾
🐇


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Comment on lines 28 to 42
OpBase *NewNodeByIdSeekOp
(
const ExecutionPlan *plan,
const char *alias,
FilterID *filters
) {

NodeByIdSeek *op = rm_malloc(sizeof(NodeByIdSeek));
op->g = QueryCtx_GetGraph();
op->child_record = NULL;
op->alias = alias;

op->minId = id_range->include_min ? id_range->min : id_range->min + 1;
/* The largest possible entity ID is the same as Graph_RequiredMatrixDim.
* This value will be set on Init, to allow operation clone be independent
* on the current graph size.*/
op->maxId = id_range->include_max ? id_range->max : id_range->max - 1;

op->currentId = op->minId;
op->filters = filters;
op->ids = roaring64_bitmap_create();
op->it = NULL;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider checking the result of rm_malloc for null before using the op pointer.

+	if (!op) return NULL;

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
OpBase *NewNodeByIdSeekOp
(
const ExecutionPlan *plan,
const char *alias,
FilterID *filters
) {
NodeByIdSeek *op = rm_malloc(sizeof(NodeByIdSeek));
op->g = QueryCtx_GetGraph();
op->child_record = NULL;
op->alias = alias;
op->minId = id_range->include_min ? id_range->min : id_range->min + 1;
/* The largest possible entity ID is the same as Graph_RequiredMatrixDim.
* This value will be set on Init, to allow operation clone be independent
* on the current graph size.*/
op->maxId = id_range->include_max ? id_range->max : id_range->max - 1;
op->currentId = op->minId;
op->filters = filters;
op->ids = roaring64_bitmap_create();
op->it = NULL;
NodeByIdSeek *op = rm_malloc(sizeof(NodeByIdSeek));
if (!op) return NULL;
op->g = QueryCtx_GetGraph();
op->child_record = NULL;
op->alias = alias;
op->filters = filters;
op->ids = roaring64_bitmap_create();
op->it = NULL;

Comment on lines 211 to 232
static void NodeByIdSeekFree
(
OpBase *opBase
) {
NodeByIdSeek *op = (NodeByIdSeek *)opBase;
if(op->child_record) {
OpBase_DeleteRecord(op->child_record);
op->child_record = NULL;
}

if(op->filters) {
for(int i = 0; i < array_len(op->filters); i++) {
AR_EXP_Free(op->filters[i].id_exp);
}
array_free(op->filters);
op->filters = NULL;
}

if(op->it) {
roaring64_iterator_free(op->it);
op->it = NULL;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reorder the resource freeing in NodeByIdSeekFree to avoid potential use-after-free errors. Free the iterator before freeing the filters and child records.

+	if(op->it) {
+		roaring64_iterator_free(op->it);
+		op->it = NULL;
+	}
+	if(op->filters) {
+		for(int i = 0; i < array_len(op->filters); i++) {
+			AR_EXP_Free(op->filters[i].id_exp);
+		}
+		array_free(op->filters);
+		op->filters = NULL;
+	}
+	if(op->child_record) {
+		OpBase_DeleteRecord(op->child_record);
+		op->child_record = NULL;
+	}

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
static void NodeByIdSeekFree
(
OpBase *opBase
) {
NodeByIdSeek *op = (NodeByIdSeek *)opBase;
if(op->child_record) {
OpBase_DeleteRecord(op->child_record);
op->child_record = NULL;
}
if(op->filters) {
for(int i = 0; i < array_len(op->filters); i++) {
AR_EXP_Free(op->filters[i].id_exp);
}
array_free(op->filters);
op->filters = NULL;
}
if(op->it) {
roaring64_iterator_free(op->it);
op->it = NULL;
}
static void NodeByIdSeekFree
(
OpBase *opBase
) {
NodeByIdSeek *op = (NodeByIdSeek *)opBase;
if(op->it) {
roaring64_iterator_free(op->it);
op->it = NULL;
}
if(op->filters) {
for(int i = 0; i < array_len(op->filters); i++) {
AR_EXP_Free(op->filters[i].id_exp);
}
array_free(op->filters);
op->filters = NULL;
}
if(op->child_record) {
OpBase_DeleteRecord(op->child_record);
op->child_record = NULL;
}

@FalkorDB FalkorDB deleted a comment from coderabbitai bot Apr 17, 2024
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

roaring64_bitmap_add_range_closed(op->ids, 0, node_count);
for(int i = 0; i < count; i++) {
SIValue v = AR_EXP_Evaluate(op->filters[i].id_exp, op->child_record);
if(SI_TYPE(v) != T_INT64) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handle the case where SIValue type is not T_INT64 to prevent incorrect behavior.

+	else return false;

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
if(SI_TYPE(v) != T_INT64) {
if(SI_TYPE(v) != T_INT64) {
else return false;

static Record NodeByIdSeekConsumeFromChild
(
OpBase *opBase
) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refine the logic to ensure that the iterator is reset only when a new record is successfully consumed from the child.

+	if(op->child_record == NULL) {
+		op->child_record = OpBase_Consume(op->op.children[0]);
+		if(op->child_record == NULL) return NULL; // Child depleted.
+		NodeByIdSeekReset(opBase);
+	}

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
) {
if(op->child_record == NULL) {
op->child_record = OpBase_Consume(op->op.children[0]);
if(op->child_record == NULL) return NULL; // Child depleted.
NodeByIdSeekReset(opBase);
}

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Comment on lines 9 to 60
bool FilterExpression_Resolve
(
Graph *g,
FilterExpression *filters,
roaring64_bitmap_t *ids,
Record r
) {
size_t node_count = Graph_UncompactedNodeCount(g);
int count = array_len(filters);
roaring64_bitmap_add_range_closed(ids, 0, node_count);
for(int i = 0; i < count; i++) {
SIValue v = AR_EXP_Evaluate(filters[i].id_exp, r);
if(SI_TYPE(v) != T_INT64) {
return false;
}
switch(filters[i].op) {
case OP_LT: // <
if(roaring64_bitmap_maximum(ids) >= v.longval) {
roaring64_bitmap_remove_range_closed(ids, v.longval, roaring64_bitmap_maximum(ids));
}
break;
case OP_LE: // <=
if(roaring64_bitmap_maximum(ids) > v.longval) {
roaring64_bitmap_remove_range_closed(ids, v.longval + 1, roaring64_bitmap_maximum(ids));
}
break;
case OP_GT: // >
if(roaring64_bitmap_minimum(ids) <= v.longval) {
roaring64_bitmap_remove_range_closed(ids, roaring64_bitmap_minimum(ids), v.longval);
}
break;
case OP_GE: // >=
if(roaring64_bitmap_minimum(ids) < v.longval) {
roaring64_bitmap_remove_range(ids, roaring64_bitmap_minimum(ids), v.longval);
}
break;
case OP_EQUAL: // =
if(!roaring64_bitmap_contains(ids, v.longval)) {
return false;
}

roaring64_bitmap_remove_range_closed(ids, v.longval + 1, roaring64_bitmap_maximum(ids));
roaring64_bitmap_remove_range(ids, roaring64_bitmap_minimum(ids), v.longval);
break;
default:
ASSERT(false && "operation not supported");
break;
}
}

return true;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation of FilterExpression_Resolve handles various filter operations on IDs using a bitmap. Ensure that all edge cases, such as unsupported operations and non-integer values, are handled correctly to prevent runtime errors.

+    if (SI_TYPE(v) != T_INT64) {
+        // Log error or handle the case where the value type is not an integer
+        return false;
+    }

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
bool FilterExpression_Resolve
(
Graph *g,
FilterExpression *filters,
roaring64_bitmap_t *ids,
Record r
) {
size_t node_count = Graph_UncompactedNodeCount(g);
int count = array_len(filters);
roaring64_bitmap_add_range_closed(ids, 0, node_count);
for(int i = 0; i < count; i++) {
SIValue v = AR_EXP_Evaluate(filters[i].id_exp, r);
if(SI_TYPE(v) != T_INT64) {
return false;
}
switch(filters[i].op) {
case OP_LT: // <
if(roaring64_bitmap_maximum(ids) >= v.longval) {
roaring64_bitmap_remove_range_closed(ids, v.longval, roaring64_bitmap_maximum(ids));
}
break;
case OP_LE: // <=
if(roaring64_bitmap_maximum(ids) > v.longval) {
roaring64_bitmap_remove_range_closed(ids, v.longval + 1, roaring64_bitmap_maximum(ids));
}
break;
case OP_GT: // >
if(roaring64_bitmap_minimum(ids) <= v.longval) {
roaring64_bitmap_remove_range_closed(ids, roaring64_bitmap_minimum(ids), v.longval);
}
break;
case OP_GE: // >=
if(roaring64_bitmap_minimum(ids) < v.longval) {
roaring64_bitmap_remove_range(ids, roaring64_bitmap_minimum(ids), v.longval);
}
break;
case OP_EQUAL: // =
if(!roaring64_bitmap_contains(ids, v.longval)) {
return false;
}
roaring64_bitmap_remove_range_closed(ids, v.longval + 1, roaring64_bitmap_maximum(ids));
roaring64_bitmap_remove_range(ids, roaring64_bitmap_minimum(ids), v.longval);
break;
default:
ASSERT(false && "operation not supported");
break;
}
}
return true;
}
bool FilterExpression_Resolve
(
Graph *g,
FilterExpression *filters,
roaring64_bitmap_t *ids,
Record r
) {
size_t node_count = Graph_UncompactedNodeCount(g);
int count = array_len(filters);
roaring64_bitmap_add_range_closed(ids, 0, node_count);
for(int i = 0; i < count; i++) {
SIValue v = AR_EXP_Evaluate(filters[i].id_exp, r);
if (SI_TYPE(v) != T_INT64) {
// Log error or handle the case where the value type is not an integer
return false;
}
switch(filters[i].op) {
case OP_LT: // <
if(roaring64_bitmap_maximum(ids) >= v.longval) {
roaring64_bitmap_remove_range_closed(ids, v.longval, roaring64_bitmap_maximum(ids));
}
break;
case OP_LE: // <=
if(roaring64_bitmap_maximum(ids) > v.longval) {
roaring64_bitmap_remove_range_closed(ids, v.longval + 1, roaring64_bitmap_maximum(ids));
}
break;
case OP_GT: // >
if(roaring64_bitmap_minimum(ids) <= v.longval) {
roaring64_bitmap_remove_range_closed(ids, roaring64_bitmap_minimum(ids), v.longval);
}
break;
case OP_GE: // >=
if(roaring64_bitmap_minimum(ids) < v.longval) {
roaring64_bitmap_remove_range(ids, roaring64_bitmap_minimum(ids), v.longval);
}
break;
case OP_EQUAL: // =
if(!roaring64_bitmap_contains(ids, v.longval)) {
return false;
}
roaring64_bitmap_remove_range_closed(ids, v.longval + 1, roaring64_bitmap_maximum(ids));
roaring64_bitmap_remove_range(ids, roaring64_bitmap_minimum(ids), v.longval);
break;
default:
ASSERT(false && "operation not supported");
break;
}
}
return true;
}

Comment on lines 28 to 42
OpBase *NewNodeByIdSeekOp
(
const ExecutionPlan *plan,
const char *alias,
FilterExpression *filters
) {

NodeByIdSeek *op = rm_malloc(sizeof(NodeByIdSeek));
op->g = QueryCtx_GetGraph();
op->child_record = NULL;
op->alias = alias;

op->minId = id_range->include_min ? id_range->min : id_range->min + 1;
/* The largest possible entity ID is the same as Graph_RequiredMatrixDim.
* This value will be set on Init, to allow operation clone be independent
* on the current graph size.*/
op->maxId = id_range->include_max ? id_range->max : id_range->max - 1;

op->currentId = op->minId;
op->filters = filters;
op->ids = NULL;
op->it = NULL;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation of NewNodeByIdSeekOp correctly initializes the NodeByIdSeek structure. Ensure that memory allocation failures are handled to prevent potential null pointer dereferences.

+    if (!op) return NULL;

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
OpBase *NewNodeByIdSeekOp
(
const ExecutionPlan *plan,
const char *alias,
FilterExpression *filters
) {
NodeByIdSeek *op = rm_malloc(sizeof(NodeByIdSeek));
op->g = QueryCtx_GetGraph();
op->child_record = NULL;
op->alias = alias;
op->minId = id_range->include_min ? id_range->min : id_range->min + 1;
/* The largest possible entity ID is the same as Graph_RequiredMatrixDim.
* This value will be set on Init, to allow operation clone be independent
* on the current graph size.*/
op->maxId = id_range->include_max ? id_range->max : id_range->max - 1;
op->currentId = op->minId;
op->filters = filters;
op->ids = NULL;
op->it = NULL;
OpBase *NewNodeByIdSeekOp
(
const ExecutionPlan *plan,
const char *alias,
FilterExpression *filters
) {
NodeByIdSeek *op = rm_malloc(sizeof(NodeByIdSeek));
if (!op) return NULL;
op->g = QueryCtx_GetGraph();
op->child_record = NULL;
op->alias = alias;
op->filters = filters;
op->ids = NULL;
op->it = NULL;

Comment on lines 53 to 62
static OpResult NodeByIdSeekInit
(
OpBase *opBase
) {
ASSERT(opBase->type == OPType_NODE_BY_ID_SEEK);
NodeByIdSeek *op = (NodeByIdSeek *)opBase;
// The largest possible entity ID is the number of nodes - deleted and real - in the DataBlock.
size_t node_count = Graph_UncompactedNodeCount(op->g);
op->maxId = MIN(node_count - 1, op->maxId);
if(opBase->childCount > 0) OpBase_UpdateConsume(opBase, NodeByIdSeekConsumeFromChild);
op->ids = roaring64_bitmap_create();
if(opBase->childCount > 0) {
OpBase_UpdateConsume(opBase, NodeByIdSeekConsumeFromChild);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The NodeByIdSeekInit function initializes the bitmap for ID storage and sets the appropriate consume function based on the presence of a child. Verify that the bitmap creation is checked for null to handle memory allocation failures.

+    if (!op->ids) return OP_ERR;

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
static OpResult NodeByIdSeekInit
(
OpBase *opBase
) {
ASSERT(opBase->type == OPType_NODE_BY_ID_SEEK);
NodeByIdSeek *op = (NodeByIdSeek *)opBase;
// The largest possible entity ID is the number of nodes - deleted and real - in the DataBlock.
size_t node_count = Graph_UncompactedNodeCount(op->g);
op->maxId = MIN(node_count - 1, op->maxId);
if(opBase->childCount > 0) OpBase_UpdateConsume(opBase, NodeByIdSeekConsumeFromChild);
op->ids = roaring64_bitmap_create();
if(opBase->childCount > 0) {
OpBase_UpdateConsume(opBase, NodeByIdSeekConsumeFromChild);
}
static OpResult NodeByIdSeekInit
(
OpBase *opBase
) {
ASSERT(opBase->type == OPType_NODE_BY_ID_SEEK);
NodeByIdSeek *op = (NodeByIdSeek *)opBase;
op->ids = roaring64_bitmap_create();
if (!op->ids) return OP_ERR;
if(opBase->childCount > 0) {
OpBase_UpdateConsume(opBase, NodeByIdSeekConsumeFromChild);
}

Comment on lines 172 to 198
static void NodeByIdSeekFree
(
OpBase *opBase
) {
NodeByIdSeek *op = (NodeByIdSeek *)opBase;
if(op->child_record) {
OpBase_DeleteRecord(op->child_record);
op->child_record = NULL;
}

if(op->filters) {
for(int i = 0; i < array_len(op->filters); i++) {
AR_EXP_Free(op->filters[i].id_exp);
}
array_free(op->filters);
op->filters = NULL;
}

if(op->it) {
roaring64_iterator_free(op->it);
op->it = NULL;
}

if(op->ids) {
roaring64_bitmap_free(op->ids);
op->ids = NULL;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reorder the resource freeing in NodeByIdSeekFree to avoid potential use-after-free errors. Free the iterator before freeing the filters and child records.

+    if(op->it) {
+        roaring64_iterator_free(op->it);
+        op->it = NULL;
+    }
+    if(op->filters) {
+        for(int i = 0; i < array_len(op->filters); i++) {
+            AR_EXP_Free(op->filters[i].id_exp);
+        }
+        array_free(op->filters);
+        op->filters = NULL;
+    }
+    if(op->child_record) {
+        OpBase_DeleteRecord(op->child_record);
+        op->child_record = NULL;
+    }

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
static void NodeByIdSeekFree
(
OpBase *opBase
) {
NodeByIdSeek *op = (NodeByIdSeek *)opBase;
if(op->child_record) {
OpBase_DeleteRecord(op->child_record);
op->child_record = NULL;
}
if(op->filters) {
for(int i = 0; i < array_len(op->filters); i++) {
AR_EXP_Free(op->filters[i].id_exp);
}
array_free(op->filters);
op->filters = NULL;
}
if(op->it) {
roaring64_iterator_free(op->it);
op->it = NULL;
}
if(op->ids) {
roaring64_bitmap_free(op->ids);
op->ids = NULL;
}
static void NodeByIdSeekFree
(
OpBase *opBase
) {
NodeByIdSeek *op = (NodeByIdSeek *)opBase;
if(op->it) {
roaring64_iterator_free(op->it);
op->it = NULL;
}
if(op->filters) {
for(int i = 0; i < array_len(op->filters); i++) {
AR_EXP_Free(op->filters[i].id_exp);
}
array_free(op->filters);
op->filters = NULL;
}
if(op->child_record) {
OpBase_DeleteRecord(op->child_record);
op->child_record = NULL;
}
if(op->ids) {
roaring64_bitmap_free(op->ids);
op->ids = NULL;
}

Comment on lines 88 to 105
op->L = Graph_GetLabelMatrix(op->g, op->n->label_id);

// make sure range is within matrix bounds
UnsignedRange_TightenRange(op->id_range, OP_GE, 0);
UnsignedRange_TightenRange(op->id_range, OP_LT, nrows);
if(array_len(op->filters) > 0) {
if(!FilterExpression_Resolve(op->g, op->filters, op->ids, op->child_record)) {
return GrB_INVALID_INDEX;
}

if(!UnsignedRange_IsValid(op->id_range)) return GrB_DIMENSION_MISMATCH;
if(roaring64_bitmap_get_cardinality(op->ids) == 0) {
return GrB_DIMENSION_MISMATCH;
}

if(op->id_range->include_min) minId = op->id_range->min;
else minId = op->id_range->min + 1;
if(op->it) {
roaring64_iterator_free(op->it);
}
op->it = roaring64_iterator_create(op->ids);

if(op->id_range->include_max) maxId = op->id_range->max;
else maxId = op->id_range->max - 1;
return GrB_SUCCESS;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _ConstructIterator function handles the construction of an iterator based on the label matrix and filters. Ensure that all edge cases, such as empty bitmaps and invalid indexes, are handled to prevent runtime errors.

+    if (roaring64_bitmap_get_cardinality(op->ids) == 0) {
+        // Handle the case where no IDs are found
+        return GrB_DIMENSION_MISMATCH;
+    }

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
op->L = Graph_GetLabelMatrix(op->g, op->n->label_id);
// make sure range is within matrix bounds
UnsignedRange_TightenRange(op->id_range, OP_GE, 0);
UnsignedRange_TightenRange(op->id_range, OP_LT, nrows);
if(array_len(op->filters) > 0) {
if(!FilterExpression_Resolve(op->g, op->filters, op->ids, op->child_record)) {
return GrB_INVALID_INDEX;
}
if(!UnsignedRange_IsValid(op->id_range)) return GrB_DIMENSION_MISMATCH;
if(roaring64_bitmap_get_cardinality(op->ids) == 0) {
return GrB_DIMENSION_MISMATCH;
}
if(op->id_range->include_min) minId = op->id_range->min;
else minId = op->id_range->min + 1;
if(op->it) {
roaring64_iterator_free(op->it);
}
op->it = roaring64_iterator_create(op->ids);
if(op->id_range->include_max) maxId = op->id_range->max;
else maxId = op->id_range->max - 1;
return GrB_SUCCESS;
}
op->L = Graph_GetLabelMatrix(op->g, op->n->label_id);
if(array_len(op->filters) > 0) {
if(!FilterExpression_Resolve(op->g, op->filters, op->ids, op->child_record)) {
return GrB_INVALID_INDEX;
}
if(roaring64_bitmap_get_cardinality(op->ids) == 0) {
// Handle the case where no IDs are found
return GrB_DIMENSION_MISMATCH;
}
if(op->it) {
roaring64_iterator_free(op->it);
}
op->it = roaring64_iterator_create(op->ids);
return GrB_SUCCESS;
}

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

const ExecutionPlan *plan,
const char *alias,
FilterExpression *filters
) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check for NULL after memory allocation to prevent potential null pointer dereference.

+	if (!op) return NULL;

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
) {
) {
if (!op) return NULL;

size_t node_count = Graph_UncompactedNodeCount(op->g);
op->maxId = MIN(node_count - 1, op->maxId);
if(opBase->childCount > 0) OpBase_UpdateConsume(opBase, NodeByIdSeekConsumeFromChild);
op->ids = roaring64_bitmap_create();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure that roaring64_bitmap_create does not return NULL to avoid potential issues.

+	if (!op->ids) return OP_ERR;

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
op->ids = roaring64_bitmap_create();
op->ids = roaring64_bitmap_create();
if (!op->ids) return OP_ERR;

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Comment on lines 55 to 68
OpBase *opBase
) {
ASSERT(opBase->type == OPType_NODE_BY_ID_SEEK);
NodeByIdSeek *op = (NodeByIdSeek *)opBase;
// The largest possible entity ID is the number of nodes - deleted and real - in the DataBlock.
size_t node_count = Graph_UncompactedNodeCount(op->g);
op->maxId = MIN(node_count - 1, op->maxId);
if(opBase->childCount > 0) OpBase_UpdateConsume(opBase, NodeByIdSeekConsumeFromChild);
op->ids = roaring64_bitmap_create();
if(opBase->childCount > 0) {
OpBase_UpdateConsume(opBase, NodeByIdSeekConsumeFromChild);
} else {
if(!FilterExpression_Resolve(op->g, op->filters, op->ids, op->child_record)) {
OpBase_UpdateConsume(opBase, NodeByIdSeekNoOpConsume);
return OP_OK;
}
op->it = roaring64_iterator_create(op->ids);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure proper error handling when roaring64_bitmap_create returns NULL, as this would lead to a failure in initializing the bitmap for IDs.

+	if (!op->ids) return OP_ERR;

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
OpBase *opBase
) {
ASSERT(opBase->type == OPType_NODE_BY_ID_SEEK);
NodeByIdSeek *op = (NodeByIdSeek *)opBase;
// The largest possible entity ID is the number of nodes - deleted and real - in the DataBlock.
size_t node_count = Graph_UncompactedNodeCount(op->g);
op->maxId = MIN(node_count - 1, op->maxId);
if(opBase->childCount > 0) OpBase_UpdateConsume(opBase, NodeByIdSeekConsumeFromChild);
op->ids = roaring64_bitmap_create();
if(opBase->childCount > 0) {
OpBase_UpdateConsume(opBase, NodeByIdSeekConsumeFromChild);
} else {
if(!FilterExpression_Resolve(op->g, op->filters, op->ids, op->child_record)) {
OpBase_UpdateConsume(opBase, NodeByIdSeekNoOpConsume);
return OP_OK;
}
op->it = roaring64_iterator_create(op->ids);
}
OpBase *opBase
) {
ASSERT(opBase->type == OPType_NODE_BY_ID_SEEK);
NodeByIdSeek *op = (NodeByIdSeek *)opBase;
op->ids = roaring64_bitmap_create();
if (!op->ids) return OP_ERR;
if(opBase->childCount > 0) {
OpBase_UpdateConsume(opBase, NodeByIdSeekConsumeFromChild);
} else {
if(!FilterExpression_Resolve(op->g, op->filters, op->ids, op->child_record)) {
OpBase_UpdateConsume(opBase, NodeByIdSeekNoOpConsume);
return OP_OK;
}
op->it = roaring64_iterator_create(op->ids);
}

Comment on lines 162 to 164
if(op->it && op->child_record == 0) {
roaring64_iterator_reinit(op->ids, op->it);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verify that the condition op->child_record == 0 is intended to check for a NULL pointer rather than comparing to zero, as this seems to be a potential logical error.

-	if(op->it && op->child_record == 0) {
+	if(op->it && op->child_record == NULL) {

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
if(op->it && op->child_record == 0) {
roaring64_iterator_reinit(op->ids, op->it);
}
if(op->it && op->child_record == NULL) {
roaring64_iterator_reinit(op->ids, op->it);
}

Comment on lines 120 to 128
bool has_filters = array_len(op->filters) > 0;

OpBase_UpdateConsume(opBase, has_filters
? NodeByLabelAndIDScanConsume
: NodeByLabelScanConsume); // default consume function

if(has_filters) {
op->ids = roaring64_bitmap_create();
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure that the bitmap for IDs is created successfully and handle potential memory allocation failures to prevent runtime errors.

+	if (!op->ids) return OP_ERR;

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
bool has_filters = array_len(op->filters) > 0;
OpBase_UpdateConsume(opBase, has_filters
? NodeByLabelAndIDScanConsume
: NodeByLabelScanConsume); // default consume function
if(has_filters) {
op->ids = roaring64_bitmap_create();
}
bool has_filters = array_len(op->filters) > 0;
OpBase_UpdateConsume(opBase, has_filters
? NodeByLabelAndIDScanConsume
: NodeByLabelScanConsume); // default consume function
if(has_filters) {
op->ids = roaring64_bitmap_create();
if (!op->ids) return OP_ERR;
}

Comment on lines 342 to 346
if(op->child_record > 0) {
OpBase_DeleteRecord(op->child_record); // Free old record.
op->child_record = NULL;
} else {
_ConstructIterator(op);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verify that the condition op->child_record > 0 is intended to check for a non-NULL pointer rather than comparing to zero, as this seems to be a potential logical error.

-	if(op->child_record > 0) {
+	if(op->child_record != NULL) {

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
if(op->child_record > 0) {
OpBase_DeleteRecord(op->child_record); // Free old record.
op->child_record = NULL;
} else {
_ConstructIterator(op);
if(op->child_record != NULL) {
OpBase_DeleteRecord(op->child_record); // Free old record.
op->child_record = NULL;
} else {
_ConstructIterator(op);

Comment on lines 362 to 396
static void NodeByLabelScanFree
(
OpBase *opBase
) {
NodeByLabelScan *op = (NodeByLabelScan *)opBase;

GrB_Info info = RG_MatrixTupleIter_detach(&(nodeByLabelScan->iter));
GrB_Info info = RG_MatrixTupleIter_detach(&(op->iter));
ASSERT(info == GrB_SUCCESS);

if(nodeByLabelScan->child_record) {
OpBase_DeleteRecord(nodeByLabelScan->child_record);
nodeByLabelScan->child_record = NULL;
if(op->child_record) {
OpBase_DeleteRecord(op->child_record);
op->child_record = NULL;
}

if(op->n != NULL) {
NodeScanCtx_Free(op->n);
op->n = NULL;
}

if(op->filters) {
for(int i = 0; i < array_len(op->filters); i++) {
AR_EXP_Free(op->filters[i].id_exp);
}
array_free(op->filters);
op->filters = NULL;
}

if(nodeByLabelScan->id_range) {
UnsignedRange_Free(nodeByLabelScan->id_range);
nodeByLabelScan->id_range = NULL;
if(op->it) {
roaring64_iterator_free(op->it);
op->it = NULL;
}

if(nodeByLabelScan->n != NULL) {
NodeScanCtx_Free(nodeByLabelScan->n);
nodeByLabelScan->n = NULL;
if(op->ids) {
roaring64_bitmap_free(op->ids);
op->ids = NULL;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure that all resources are freed in the correct order to avoid potential use-after-free errors. Verify that the iterator and filters are freed before the node scan context to maintain proper resource management.

+    if(op->it) {
+        roaring64_iterator_free(op->it);
+        op->it = NULL;
+    }
+    if(op->filters) {
+        for(int i = 0; i < array_len(op->filters); i++) {
+            AR_EXP_Free(op->filters[i].id_exp);
+        }
+        array_free(op->filters);
+        op->filters = NULL;
+    }
+    if(op->n != NULL) {
+        NodeScanCtx_Free(op->n);
+        op->n = NULL;
+    }

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
static void NodeByLabelScanFree
(
OpBase *opBase
) {
NodeByLabelScan *op = (NodeByLabelScan *)opBase;
GrB_Info info = RG_MatrixTupleIter_detach(&(nodeByLabelScan->iter));
GrB_Info info = RG_MatrixTupleIter_detach(&(op->iter));
ASSERT(info == GrB_SUCCESS);
if(nodeByLabelScan->child_record) {
OpBase_DeleteRecord(nodeByLabelScan->child_record);
nodeByLabelScan->child_record = NULL;
if(op->child_record) {
OpBase_DeleteRecord(op->child_record);
op->child_record = NULL;
}
if(op->n != NULL) {
NodeScanCtx_Free(op->n);
op->n = NULL;
}
if(op->filters) {
for(int i = 0; i < array_len(op->filters); i++) {
AR_EXP_Free(op->filters[i].id_exp);
}
array_free(op->filters);
op->filters = NULL;
}
if(nodeByLabelScan->id_range) {
UnsignedRange_Free(nodeByLabelScan->id_range);
nodeByLabelScan->id_range = NULL;
if(op->it) {
roaring64_iterator_free(op->it);
op->it = NULL;
}
if(nodeByLabelScan->n != NULL) {
NodeScanCtx_Free(nodeByLabelScan->n);
nodeByLabelScan->n = NULL;
if(op->ids) {
roaring64_bitmap_free(op->ids);
op->ids = NULL;
static void NodeByLabelScanFree
(
OpBase *opBase
) {
NodeByLabelScan *op = (NodeByLabelScan *)opBase;
GrB_Info info = RG_MatrixTupleIter_detach(&(op->iter));
ASSERT(info == GrB_SUCCESS);
if(op->child_record) {
OpBase_DeleteRecord(op->child_record);
op->child_record = NULL;
}
if(op->it) {
roaring64_iterator_free(op->it);
op->it = NULL;
}
if(op->filters) {
for(int i = 0; i < array_len(op->filters); i++) {
AR_EXP_Free(op->filters[i].id_exp);
}
array_free(op->filters);
op->filters = NULL;
}
if(op->n != NULL) {
NodeScanCtx_Free(op->n);
op->n = NULL;
}
if(op->ids) {
roaring64_bitmap_free(op->ids);
op->ids = NULL;

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

src/module.c Outdated
Comment on lines 98 to 104
static void *rm_aligned_malloc
(
size_t alignment,
size_t size
) {
return rm_malloc(size);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation of rm_aligned_malloc is correct and aligns with the memory management practices of the module. Consider adding a comment explaining the purpose of this function, especially its role in supporting roaring bitmaps.

+ // rm_aligned_malloc: Allocates memory with specified alignment, used for roaring bitmaps
  static void *rm_aligned_malloc
  (
  	size_t alignment,
  	size_t size
  ) {
  	return rm_malloc(size);
  }

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
static void *rm_aligned_malloc
(
size_t alignment,
size_t size
) {
return rm_malloc(size);
}
// rm_aligned_malloc: Allocates memory with specified alignment, used for roaring bitmaps
static void *rm_aligned_malloc
(
size_t alignment,
size_t size
) {
return rm_malloc(size);
}

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Actionable comments outside the diff hunks (2)
src/module.c (2)

Line range hint 41-41: Use formatted strings to prevent potential information leaks when logging errors.

- printf("Failed to register GraphContext type\n");
+ printf("%s", "Failed to register GraphContext type\n");

Line range hint 46-46: Use formatted strings to prevent potential information leaks when logging errors.

- printf("Failed to register GraphMeta type\n");
+ printf("%s", "Failed to register GraphMeta type\n");

Comment on lines +113 to +120
roaring_init_memory_hook((roaring_memory_t) {
.free = rm_free,
.malloc = rm_malloc,
.calloc = rm_calloc,
.realloc = rm_realloc,
.aligned_free = rm_free,
.aligned_malloc = rm_aligned_malloc
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation of roaring_init_memory_hook correctly sets up memory hooks for roaring bitmaps. Consider adding a comment explaining the purpose of each hook, especially for clarity and maintainability.

+ // Set up memory hooks for roaring bitmaps
  roaring_init_memory_hook((roaring_memory_t) {

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
roaring_init_memory_hook((roaring_memory_t) {
.free = rm_free,
.malloc = rm_malloc,
.calloc = rm_calloc,
.realloc = rm_realloc,
.aligned_free = rm_free,
.aligned_malloc = rm_aligned_malloc
});
// Set up memory hooks for roaring bitmaps
roaring_init_memory_hook((roaring_memory_t) {
.free = rm_free,
.malloc = rm_malloc,
.calloc = rm_calloc,
.realloc = rm_realloc,
.aligned_free = rm_free,
.aligned_malloc = rm_aligned_malloc
});

Comment on lines +76 to +82
RangeExpression *ranges // ID range expressions
) {
UnsignedRange_Free(op->id_range);
op->id_range = UnsignedRange_Clone(id_range);
ASSERT(op != NULL);
ASSERT(ranges != NULL);
ASSERT(op->ranges == NULL);

op->ranges = ranges;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tip

Codebase Verification

Several files in the codebase still use the old function signature NodeByLabelScanOp_SetIDRange. This contradicts the review comment suggesting a rename to NodeByLabelScanOp_SetFilterID. The locations needing attention are:

  • src/execution_plan/ops/op_node_by_label_scan.h
  • src/execution_plan/ops/op_node_by_label_scan.c
  • src/execution_plan/optimizations/seek_by_id.c
Analysis chain

Renaming NodeByLabelScanOp_SetIDRange to NodeByLabelScanOp_SetFilterID and updating the parameter type enhances the clarity and functionality of ID filtering. Ensure that all references to the old function name are updated across the project.

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Search for old function name to ensure it's updated everywhere.
rg --type c "NodeByLabelScanOp_SetIDRange"

Length of output: 303

Comment on lines +30 to +32
static inline void *rm_aligned_malloc(size_t alignment, size_t size) {
return rm_malloc(size);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function rm_aligned_malloc does not handle the alignment parameter, which could lead to unexpected behavior if alignment is critical for the caller. Consider implementing memory alignment handling within this function.

-	return rm_malloc(size);
+	return aligned_alloc(alignment, size);

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
static inline void *rm_aligned_malloc(size_t alignment, size_t size) {
return rm_malloc(size);
}
static inline void *rm_aligned_malloc(size_t alignment, size_t size) {
return aligned_alloc(alignment, size);
}

Comment on lines +30 to +48
OpBase *NewNodeByIdSeekOp
(
const ExecutionPlan *plan, // execution plan
const char *alias, // node alias
RangeExpression *ranges // ID range expressions
) {
NodeByIdSeek *op = rm_malloc(sizeof(NodeByIdSeek));

op->g = QueryCtx_GetGraph();
op->it = NULL;
op->ids = NULL;
op->alias = alias;
op->ranges = ranges;
op->child_record = NULL;
op->alias = alias;

op->minId = id_range->include_min ? id_range->min : id_range->min + 1;
/* The largest possible entity ID is the same as Graph_RequiredMatrixDim.
* This value will be set on Init, to allow operation clone be independent
* on the current graph size.*/
op->maxId = id_range->include_max ? id_range->max : id_range->max - 1;

op->currentId = op->minId;

OpBase_Init((OpBase *)op, OPType_NODE_BY_ID_SEEK, "NodeByIdSeek", NodeByIdSeekInit,
NodeByIdSeekConsume, NodeByIdSeekReset, NodeByIdSeekToString, NodeByIdSeekClone, NodeByIdSeekFree,
false, plan);
OpBase_Init((OpBase *)op, OPType_NODE_BY_ID_SEEK, "NodeByIdSeek",
NodeByIdSeekInit, NodeByIdSeekConsume, NodeByIdSeekReset,
NodeByIdSeekToString, NodeByIdSeekClone, NodeByIdSeekFree, false,
plan);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure to check for NULL after calling rm_malloc to prevent potential null pointer dereference.

+	if (!op) return NULL;

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
OpBase *NewNodeByIdSeekOp
(
const ExecutionPlan *plan, // execution plan
const char *alias, // node alias
RangeExpression *ranges // ID range expressions
) {
NodeByIdSeek *op = rm_malloc(sizeof(NodeByIdSeek));
op->g = QueryCtx_GetGraph();
op->it = NULL;
op->ids = NULL;
op->alias = alias;
op->ranges = ranges;
op->child_record = NULL;
op->alias = alias;
op->minId = id_range->include_min ? id_range->min : id_range->min + 1;
/* The largest possible entity ID is the same as Graph_RequiredMatrixDim.
* This value will be set on Init, to allow operation clone be independent
* on the current graph size.*/
op->maxId = id_range->include_max ? id_range->max : id_range->max - 1;
op->currentId = op->minId;
OpBase_Init((OpBase *)op, OPType_NODE_BY_ID_SEEK, "NodeByIdSeek", NodeByIdSeekInit,
NodeByIdSeekConsume, NodeByIdSeekReset, NodeByIdSeekToString, NodeByIdSeekClone, NodeByIdSeekFree,
false, plan);
OpBase_Init((OpBase *)op, OPType_NODE_BY_ID_SEEK, "NodeByIdSeek",
NodeByIdSeekInit, NodeByIdSeekConsume, NodeByIdSeekReset,
NodeByIdSeekToString, NodeByIdSeekClone, NodeByIdSeekFree, false,
plan);
NodeByIdSeek *op = rm_malloc(sizeof(NodeByIdSeek));
if (!op) return NULL;
op->g = QueryCtx_GetGraph();
op->it = NULL;
op->ids = NULL;
op->alias = alias;
op->ranges = ranges;
op->child_record = NULL;
OpBase_Init((OpBase *)op, OPType_NODE_BY_ID_SEEK, "NodeByIdSeek",
NodeByIdSeekInit, NodeByIdSeekConsume, NodeByIdSeekReset,
NodeByIdSeekToString, NodeByIdSeekClone, NodeByIdSeekFree, false,
plan);

Comment on lines +63 to +85
// create empty ID range
op->ids = roaring64_bitmap_create();
op->it = roaring64_iterator_create(op->ids);

// operation is not a tap
// update consume function
if(opBase->childCount > 0) {
OpBase_UpdateConsume(opBase, NodeByIdSeekConsumeFromChild);
return OP_OK;
}

/* As long as we're within range bounds
* and we've yet to get a node. */
while(!_outOfBounds(op)) {
if(Graph_GetNode(op->g, op->currentId, &n)) break;
op->currentId++;
// operation is a tap
// evaluate ID ranges
if(!BitmapRange_FromRanges(op->ranges, op->ids, op->child_record, 0,
Graph_UncompactedNodeCount(op->g))) {
// failed to tighten range, update consume function to return NULL
OpBase_UpdateConsume(opBase, NodeByIdSeekDepleted);
return OP_OK;
}

// Advance id for next consume call.
op->currentId++;
// ID range set, reattach iterator
roaring64_iterator_reinit(op->ids, op->it);

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure that roaring64_bitmap_create and roaring64_iterator_create do not return NULL to avoid potential issues.

+	if (!op->ids || !op->it) return OP_ERR;

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
// create empty ID range
op->ids = roaring64_bitmap_create();
op->it = roaring64_iterator_create(op->ids);
// operation is not a tap
// update consume function
if(opBase->childCount > 0) {
OpBase_UpdateConsume(opBase, NodeByIdSeekConsumeFromChild);
return OP_OK;
}
/* As long as we're within range bounds
* and we've yet to get a node. */
while(!_outOfBounds(op)) {
if(Graph_GetNode(op->g, op->currentId, &n)) break;
op->currentId++;
// operation is a tap
// evaluate ID ranges
if(!BitmapRange_FromRanges(op->ranges, op->ids, op->child_record, 0,
Graph_UncompactedNodeCount(op->g))) {
// failed to tighten range, update consume function to return NULL
OpBase_UpdateConsume(opBase, NodeByIdSeekDepleted);
return OP_OK;
}
// Advance id for next consume call.
op->currentId++;
// ID range set, reattach iterator
roaring64_iterator_reinit(op->ids, op->it);
// create empty ID range
op->ids = roaring64_bitmap_create();
op->it = roaring64_iterator_create(op->ids);
if (!op->ids || !op->it) return OP_ERR;
// operation is not a tap
// update consume function
if(opBase->childCount > 0) {
OpBase_UpdateConsume(opBase, NodeByIdSeekConsumeFromChild);
return OP_OK;
}
// operation is a tap
// evaluate ID ranges
if(!BitmapRange_FromRanges(op->ranges, op->ids, op->child_record, 0,
Graph_UncompactedNodeCount(op->g))) {
// failed to tighten range, update consume function to return NULL
OpBase_UpdateConsume(opBase, NodeByIdSeekDepleted);
return OP_OK;
}
// ID range set, reattach iterator
roaring64_iterator_reinit(op->ids, op->it);

Comment on lines +117 to +137
static Record NodeByIdSeekConsumeFromChild
(
OpBase *opBase
) {
ASSERT(opBase != NULL);

NodeByIdSeek *op = (NodeByIdSeek *)opBase;

// try to get a new record from child op
pull:
if(op->child_record == NULL) {
op->child_record = OpBase_Consume(op->op.children[0]);
if(op->child_record == NULL) return NULL;
else NodeByIdSeekReset(opBase);
op->child_record = OpBase_Consume(OpBase_GetChild(opBase, 0));
if(op->child_record == NULL) return NULL; // child depleted

// re-evealuate ID ranges
if(!BitmapRange_FromRanges(op->ranges, op->ids, op->child_record, 0,
Graph_UncompactedNodeCount(op->g))) {
return NULL;
}

roaring64_iterator_reinit(op->ids, op->it);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refine the logic to ensure that the iterator is reset only when a new record is successfully consumed from the child.

+	if(op->child_record == NULL) {
+		op->child_record = OpBase_Consume(op->op.children[0]);
+		if(op->child_record == NULL) return NULL; // Child depleted.
+		NodeByIdSeekReset(opBase);
+	}

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
static Record NodeByIdSeekConsumeFromChild
(
OpBase *opBase
) {
ASSERT(opBase != NULL);
NodeByIdSeek *op = (NodeByIdSeek *)opBase;
// try to get a new record from child op
pull:
if(op->child_record == NULL) {
op->child_record = OpBase_Consume(op->op.children[0]);
if(op->child_record == NULL) return NULL;
else NodeByIdSeekReset(opBase);
op->child_record = OpBase_Consume(OpBase_GetChild(opBase, 0));
if(op->child_record == NULL) return NULL; // child depleted
// re-evealuate ID ranges
if(!BitmapRange_FromRanges(op->ranges, op->ids, op->child_record, 0,
Graph_UncompactedNodeCount(op->g))) {
return NULL;
}
roaring64_iterator_reinit(op->ids, op->it);
static Record NodeByIdSeekConsumeFromChild
(
OpBase *opBase
) {
ASSERT(opBase != NULL);
NodeByIdSeek *op = (NodeByIdSeek *)opBase;
// try to get a new record from child op
pull:
if(op->child_record == NULL) {
op->child_record = OpBase_Consume(OpBase_GetChild(opBase, 0));
if(op->child_record == NULL) return NULL; // child depleted
// re-evealuate ID ranges
if(!BitmapRange_FromRanges(op->ranges, op->ids, op->child_record, 0,
Graph_UncompactedNodeCount(op->g))) {
return NULL;
}
roaring64_iterator_reinit(op->ids, op->it);
}

Comment on lines +225 to +252
static void NodeByIdSeekFree
(
OpBase *opBase
) {
NodeByIdSeek *op = (NodeByIdSeek *)opBase;

if(op->child_record) {
OpBase_DeleteRecord(op->child_record);
op->child_record = NULL;
}

if(op->ranges) {
for(int i = 0; i < array_len(op->ranges); i++) {
RangeExpression_Free(op->ranges + i);
}
array_free(op->ranges);
op->ranges = NULL;
}

if(op->it) {
roaring64_iterator_free(op->it);
op->it = NULL;
}

if(op->ids) {
roaring64_bitmap_free(op->ids);
op->ids = NULL;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reorder the resource freeing in NodeByIdSeekFree to avoid potential use-after-free errors. Free the iterator before freeing the filters and child records.

+	if(op->it) {
+		roaring64_iterator_free(op->it);
+		op->it = NULL;
+	}
+	if(op->filters) {
+		for(int i = 0; i < array_len(op->filters); i++) {
+			AR_EXP_Free(op->filters[i].id_exp);
+		}
+		array_free(op->filters);
+		op->filters = NULL;
+	}
+	if(op->child_record) {
+		OpBase_DeleteRecord(op->child_record);
+		op->child_record = NULL;
+	}

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
static void NodeByIdSeekFree
(
OpBase *opBase
) {
NodeByIdSeek *op = (NodeByIdSeek *)opBase;
if(op->child_record) {
OpBase_DeleteRecord(op->child_record);
op->child_record = NULL;
}
if(op->ranges) {
for(int i = 0; i < array_len(op->ranges); i++) {
RangeExpression_Free(op->ranges + i);
}
array_free(op->ranges);
op->ranges = NULL;
}
if(op->it) {
roaring64_iterator_free(op->it);
op->it = NULL;
}
if(op->ids) {
roaring64_bitmap_free(op->ids);
op->ids = NULL;
}
static void NodeByIdSeekFree
(
OpBase *opBase
) {
NodeByIdSeek *op = (NodeByIdSeek *)opBase;
if(op->it) {
roaring64_iterator_free(op->it);
op->it = NULL;
}
if(op->filters) {
for(int i = 0; i < array_len(op->filters); i++) {
AR_EXP_Free(op->filters[i].id_exp);
}
array_free(op->filters);
op->filters = NULL;
}
if(op->child_record) {
OpBase_DeleteRecord(op->child_record);
op->child_record = NULL;
}
if(op->ranges) {
for(int i = 0; i < array_len(op->ranges); i++) {
RangeExpression_Free(op->ranges + i);
}
array_free(op->ranges);
op->ranges = NULL;
}
if(op->ids) {
roaring64_bitmap_free(op->ids);
op->ids = NULL;
}

@swilly22 swilly22 merged commit bb04c83 into master Apr 19, 2024
14 checks passed
@swilly22 swilly22 deleted the runtime-seek-by-id-optimization branch April 19, 2024 09:14
swilly22 added a commit that referenced this pull request Apr 19, 2024
* 600 add browser usage for the docker image (#601)

* add instructions to run browser UI

* set hostname to 0.0.0.0

* fix crash in rewrite call subquery (#604)

* fix crash in rewrite call subquery

* simplified test query

---------

Co-authored-by: Roi Lipman <swilly22@users.noreply.github.com>

* use debug assert in rust (#606)

* fix merge init (#609)

* fix merge init

* remove unused function, simplified branch assignment logic

* clean

---------

Co-authored-by: Roi Lipman <roilipman@gmail.com>

* Unwind persist (#613)

* persist unwind record

* deep clone base record

* Disable jit (#612)

* persist unwind record

* disable GraphBLAS JIT

* Update module api (#617)

* update RediSearch submodule

* bump RediSearch version

* updated redis module api header file

* make sure there's a record to emit (#623)

* update RediSearch (#640)

* Fix #634 Add Cloud link to README.md (#635)

* fix #645 Add objective to README (#646)

* Add objective to README

* Update wordlist.txt

---------

Co-authored-by: Roi Lipman <swilly22@users.noreply.github.com>

* fix union validation with call subquery (#648)

* Seek by id runtime optimization (#643)

* wip runtime optimization

* fix leak

* fix node by label scan

* fix leak

* fix leak

* address review

* add tests

* address review

* review

* review

* review

* add memory hook to roaring bitmap

* introduce bitmap range

---------

Co-authored-by: Roi Lipman <roilipman@gmail.com>
Co-authored-by: Roi Lipman <swilly22@users.noreply.github.com>

* bump version

---------

Co-authored-by: Dudi <16744955+dudizimber@users.noreply.github.com>
Co-authored-by: Avi Avni <avi.avni@gmail.com>
Co-authored-by: Guy Korland <gkorland@gmail.com>
@gkorland gkorland linked an issue Apr 23, 2024 that may be closed by this pull request
swilly22 added a commit that referenced this pull request May 13, 2024
* 600 add browser usage for the docker image (#601)

* add instructions to run browser UI

* set hostname to 0.0.0.0

* fix crash in rewrite call subquery (#604)

* fix crash in rewrite call subquery

* simplified test query

---------

Co-authored-by: Roi Lipman <swilly22@users.noreply.github.com>

* use debug assert in rust (#606)

* fix merge init (#609)

* fix merge init

* remove unused function, simplified branch assignment logic

* clean

---------

Co-authored-by: Roi Lipman <roilipman@gmail.com>

* Unwind persist (#613)

* persist unwind record

* deep clone base record

* Disable jit (#612)

* persist unwind record

* disable GraphBLAS JIT

* Update module api (#617)

* update RediSearch submodule

* bump RediSearch version

* updated redis module api header file

* make sure there's a record to emit (#623)

* update RediSearch (#640)

* Fix #634 Add Cloud link to README.md (#635)

* fix #645 Add objective to README (#646)

* Add objective to README

* Update wordlist.txt

---------

Co-authored-by: Roi Lipman <swilly22@users.noreply.github.com>

* fix union validation with call subquery (#648)

* Seek by id runtime optimization (#643)

* wip runtime optimization

* fix leak

* fix node by label scan

* fix leak

* fix leak

* address review

* add tests

* address review

* review

* review

* review

* add memory hook to roaring bitmap

* introduce bitmap range

---------

Co-authored-by: Roi Lipman <roilipman@gmail.com>
Co-authored-by: Roi Lipman <swilly22@users.noreply.github.com>

* zero dim vector save/load (#654)

* update redisearch, do not escape TAG field (#667)

* Code coverage (#674)

* add coverage build

* rename

* fix

* fix cache

* install redis in coverage

* ignore some path in code coverage (#678)

* Ref count record (#663)

* ref count record

* nullify record on deletion

* handle record ref count within execution plan return record

* address PR comments

* remove head & tail calls to persist

* trying to simplify and make value passing safer and simpler

* free aggregated group once yield

* removed Record_PersistScalars

* removed override flag

* remove extra persists

* group no longer holds a keys array

* only persists keys which are not graph entities

* allow for record cloning from different execution plans

---------

Co-authored-by: Dudi <16744955+dudizimber@users.noreply.github.com>
Co-authored-by: Avi Avni <avi.avni@gmail.com>
Co-authored-by: Guy Korland <gkorland@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Extend support of the "Node By ID Scan" optimization
2 participants