Extract nodes from json based on user input preserving a portion of the higher level object as well #247

Open
vineetsingh065 opened this issue Nov 18, 2022 · 1 comment


vineetsingh065 commented Nov 18, 2022

I need to extract objects from the given JSON based on the node chain passed by the user, ignore those that are not in
the user input, and then create a new JSON object.

My master JSON is:

{
    	"menustructure": 
    	[
    			{
    			 "node":"Admin",
    			 "path":"admin",
    				"child":[
    						{
    						    "node": "Admin.resouce1",
    							"path":"resouce1",
    							"rank":1
    						 },
    					
    						{"node":"Admin.resouce2",
    							"path": "oath",
    							"rank":2
    						}
    					   ]
    			},
    			{
    				"node":"Workspace",
    				"path": "wsp",
    				"child":[{
    						"node": "Workspace.system1",
    						"path":"sys1"
    					
    					},
    					{
    						"node": "Workspace.system2",
    						"path":"sys2"
    					}
    				]
    			}
    
    		
    	]
    }

For example, if the user passes ['Admin.resouce1', 'Workspace'], the expected output JSON is shown below.
Note: a '.' in an element of the user-supplied list indicates a parent-child chain; the new JSON will contain the details of those child nodes together with the parent node's details.

{
 	"menustructure": 
 	[
 			{
 			 "node":"Admin",
 			 "path":"admin",
 				"child":[
 						{
 						    "node": "Admin.resouce1",
 							"path":"resouce1",
 							"rank":1
 						 }
 					   ]
 			},
 			{
 				"node":"Workspace",
 				"path": "wsp",
 				"child":[{
 						"node": "Workspace.system1",
 						"path":"sys1"
 					
 					},
 					{
 						"node": "Workspace.system2",
 						"path":"sys2"
 					}
 				]
 			}
 
 		
 	]
 }

Another example: for ['Admin.resouce2', 'Workspace.system1'] the expected JSON will be:

  {
   	"menustructure": 
   	[
   			{
   			 "node":"Admin",
   			 "path":"admin",
   				"child":[
   						
   						{"node":"Admin.resouce2",
   							"path": "oath",
   							"rank":2
   						}
   					   ]
   			},
   			{
   				"node":"Workspace",
   				"path": "wsp",
   				"child":[{
   						"node": "Workspace.system1",
   						"path":"sys1"
   					
   					}
   				]
   			}
   	]
   }

Or, if only a single node is passed, ['Admin'], the output JSON will be:

{
    	"menustructure": 
    	[
    			{
    			 "node":"Admin",
    			 "path":"admin",
    				"child":[
    						{
    						    "node": "Admin.resouce1",
    							"path":"resouce1",
    							"rank":1
    						 },
    					
    						{"node":"Admin.resouce2",
    							"path": "oath",
    							"rank":2
    						}
    					   ]
    			}	
    	]
    }

Another example: for ['Admin.resouce1', 'Admin.resouce2'] the expected JSON will be:

{
 	"menustructure": 
 	[
 			{
 			 "node":"Admin",
 			 "path":"admin",
 				"child":[
 						
 						{"node":"Admin.resouce1",
 							"path": "resouce1",
 							"rank":1
 						},
 						{"node":"Admin.resouce2",
 							"path": "oath",
 							"rank":2
 						}
 					   ]
 			}
 	]
 }

How would I achieve that using Glom?

Collaborator

kurtbrose commented Nov 28, 2022

First off, it's a recursive problem, so we'd need to use Ref in order to express the recursion.

Then, it's a filtering problem, so we will need to use the SKIP marker object to drop things that don't match.

Additionally, there are two inputs here: one is the nested nodes, the other is the list of attribute nodes.
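
To make the two inputs concrete, here is a minimal sketch that collects every "node" name from the nested structure and checks a requested list against it. The helper name collect_node_names and the sample data are made up for illustration, and it assumes leaf nodes may simply lack a "child" key; none of this is part of glom.

```python
def collect_node_names(nodes):
    """Return the set of every "node" value in a list of nested node dicts."""
    names = set()
    for node in nodes:
        names.add(node["node"])
        # Assumption: leaf nodes may have no "child" key at all.
        names |= collect_node_names(node.get("child", []))
    return names


# Input 1: the nested nodes (a trimmed-down version of the menu in the question).
menu = {
    "menustructure": [
        {"node": "Admin", "path": "admin", "child": [
            {"node": "Admin.resouce1", "path": "resouce1", "rank": 1},
        ]},
        {"node": "Workspace", "path": "wsp", "child": [
            {"node": "Workspace.system1", "path": "sys1"},
        ]},
    ]
}

# Input 2: the list of node names requested by the user.
requested = ["Admin.resouce1", "Workspace", "Admin.missing"]

known = collect_node_names(menu["menustructure"])
# Names that match no node exactly (comparison is case-sensitive).
unknown = [name for name in requested if name not in known]
```

A check like this only catches names that match nothing; the filtering itself still has to combine both inputs.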

This might be better solved with a direct recursion or the remap() recursion helper from boltons. I don't want to over-promise that glom is the right solution. But, I'll take a shot at it.

Reformulating the problem:

  • we want to recurse on "child" attribute of dict
  • for every dict, if the "node" attribute is in the input set, completely hold onto it and all child nodes
  • on the way back out, we want to drop any node that wasn't included in the result set and none of whose children were included either

node_spec = Ref("node-spec", 
   Or(
      # case 1: this node is in the input; return it and all children
      And(lambda t: t["node"] in t[S]["input-nodes"], T),
      # case 2: one of the children is in the input
      And((
         A.node,
         "child",
         [Ref("node-spec")],
         Merge(S.node, 

Oof, even as I'm writing this I can tell it's a bad fit for glom; there are multiple inputs and a lot of internal state. This is really just a recursion problem. I'm not going to bother trying to finish it, it would be really tortured.

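def filter_node(cur, to_include):
   # A requested node is kept whole, including all of its children.
   if cur["node"] in to_include:
      return cur
   # Otherwise keep only the children that survive filtering.
   children = []
   for child in cur.get("child", []):  # leaf nodes have no "child" key
      filtered = filter_node(child, to_include)
      if filtered:
         children.append(filtered)
   if children:
      return {**cur, "child": children}
   # Neither this node nor any descendant was requested: drop it.
   return None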
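
One possible way to apply this to the whole document, sketched under a few assumptions: the top-level key is "menustructure" as in the question, leaf nodes may lack a "child" key, and names are matched exactly (case-sensitive). The wrapper name filter_menu is hypothetical. The block repeats a version of filter_node so that it runs on its own.

```python
def filter_node(cur, to_include):
    """Return cur, a pruned copy of cur, or None if nothing in it was requested."""
    if cur["node"] in to_include:
        return cur
    children = []
    for child in cur.get("child", []):  # assumption: leaves have no "child" key
        filtered = filter_node(child, to_include)
        if filtered:
            children.append(filtered)
    if children:
        # Shallow copy with the pruned child list; the input is not mutated.
        return {**cur, "child": children}
    return None


def filter_menu(menu, to_include):
    """Hypothetical wrapper: filter every top-level node under "menustructure"."""
    wanted = set(to_include)
    kept = [filter_node(node, wanted) for node in menu["menustructure"]]
    return {"menustructure": [node for node in kept if node is not None]}


master = {
    "menustructure": [
        {"node": "Admin", "path": "admin", "child": [
            {"node": "Admin.resouce1", "path": "resouce1", "rank": 1},
            {"node": "Admin.resouce2", "path": "oath", "rank": 2},
        ]},
        {"node": "Workspace", "path": "wsp", "child": [
            {"node": "Workspace.system1", "path": "sys1"},
            {"node": "Workspace.system2", "path": "sys2"},
        ]},
    ]
}

result = filter_menu(master, ["Admin.resouce2", "Workspace.system1"])
```

Here result keeps "Admin" with only "Admin.resouce2" and "Workspace" with only "Workspace.system1". A name that matches no node exactly is silently ignored.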
