Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

p5.XML.listChildren() inserts #text unexpectedly #7047

Open
1 of 17 tasks
nickmcintyre opened this issue May 16, 2024 · 6 comments
Open
1 of 17 tasks

p5.XML.listChildren() inserts #text unexpectedly #7047

nickmcintyre opened this issue May 16, 2024 · 6 comments

Comments

@nickmcintyre
Copy link
Member

nickmcintyre commented May 16, 2024

Most appropriate sub-area of p5.js?

  • Accessibility
  • Color
  • Core/Environment/Rendering
  • Data
  • DOM
  • Events
  • Image
  • IO
  • Math
  • Typography
  • Utilities
  • WebGL
  • Build process
  • Unit testing
  • Internationalization
  • Friendly errors
  • Other (specify if possible)

p5.js version

1.9.3

Web browser and version

Chrome 124.0.6367.208

Operating system

macOS 14.4.1

Steps to reproduce this

Steps:

  1. Call myXML.listChildren().
  2. View the returned array.

Snippet:

/*
<?xml version="1.0"?>
<animals>
  <mammal id="0" species="Capra hircus">Goat</mammal>
  <mammal id="1" species="Panthera pardus">Leopard</mammal>
  <mammal id="2" species="Equus zebra">Zebra</mammal>
  <reptile id="3" species="Caretta caretta">Turtle</reptile>
</animals>
*/

let myXML;

// Load the XML and create a p5.XML object.
function preload() {
  myXML = loadXML('assets/animals.xml');
}

function setup() {
  noCanvas();

  // Get the names of the element's children as an array.
  let children = myXML.listChildren();

  // Print the array to the console.
  // Desired: ["mammal", "mammal", "mammal", "reptile"]
  // Actual: ["#text", "mammal", "#text", "mammal", "#text", "mammal", "#text", "reptile", "#text"]
  print(children);
}

Here's the sketch for reference.

@kris08052000
Copy link

Hi,

I found the bug,

If you run this correct code in here it works, I just did changes without setting up the code in my machine.

  1. Go here add this place for testing https://editor.p5js.org/mcintyre/sketches/jnRZA8KHw
  2. Add this code

let myXML;

// Load the XML and create a p5.XML object.
function preload() {
myXML = loadXML('assets/animals.xml');
}

function listChildren(xml) {
const arr = [];
for (let i = 0; i < xml.DOM.childNodes.length; i++) {
// Only include element nodes, filter out text nodes
if (xml.DOM.childNodes[i].nodeType === 1) { // 1 is the nodeType for element nodes
arr.push(xml.DOM.childNodes[i].nodeName);
}
}
return arr;
}

function setup() {
noCanvas();

// Get the names of the element's children as an array.
let children = listChildren(myXML);

// Print the array to the console.
// Desired: ["mammal", "mammal", "mammal", "reptile"]
// Actual: ["#text", "mammal", "#text", "mammal", "#text", "mammal", "#text", "reptile", "#text"]
print(children);
}

  1. Run it, it works.

@kris08052000
Copy link

kris08052000 commented May 17, 2024

If you want to do full code changes in repository you need to make changes in these places.

  1.    [p5.js]/src/io/p5.XML.js  :   Here Inside listChildren() function,  add the  if statement to // Only include element nodes, filter out text nodes.
    

correct code :

listChildren(xml) {
const arr = [];
for (let i = 0; i < xml.DOM.childNodes.length; i++) {
// Only include element nodes, filter out text nodes
if (xml.DOM.childNodes[i].nodeType === 1) { // 1 is the nodeType for element nodes
arr.push(xml.DOM.childNodes[i].nodeName);
}
}
return arr;
}

  1. Inside the Setup() function : Give the "myxml" as arguments of the listChildren() function.

correct code :

function setup() {
noCanvas();

// Get the names of the element's children as an array.
let children = listChildren(myXML);

// Print the array to the console.
// Desired: ["mammal", "mammal", "mammal", "reptile"]
print(children);
}

@raquelcps
Copy link

👋
I don't see that this bug has been assigned to anyone or that a PR is up so I'd like to volunteer...unless @kris08052000 you were planning to do it?

I suspect the suggested solution of checking for nodeType will work so I would follow through with that idea and also add a test to the existing suite.

I did read through the contributor docs, but I am new to this codebase and contributing community so apologies if I'm not following protocol in some way.
✌️

@limzykenneth
Copy link
Member

Simply filtering out #text nodes will not work as #text nodes are legitimate child nodes. For example

<?xml version="1.0"?>
<animals>
  There are 4 animals here.
  <mammal id="0" species="Capra hircus">Goat</mammal>
  <mammal id="1" species="Panthera pardus">Leopard</mammal>
  <mammal id="2" species="Equus zebra">Zebra</mammal>
  <reptile id="3" species="Caretta caretta">Turtle</reptile>
</animals>

One would expect the #text node containing the string "There are 4 animals here." to be one of the child nodes of <animals>. The "extra" #text nodes that shows up in this case is because of how whitespace is treated in XML (and HTML as well), in that any whitespace either space characters or new line characters still count as text in the tags, more here.

A possible way is to only ignore #text nodes with only whitespace content but this implementation should first evaluate possible performance impact.

@raquelcps
Copy link

@limzykenneth thanks for the link and explanation about the string being considered a child.

So with listChildren(), my question is, what is the intended behavior as to what counts as a child - is it all nodes (which include elements, strings and comments) or strictly elements?

Based on the language used in the Reference docs, it seems like only elements should be included as children.
Here'e the Reference description for listChildren() method:

Returns an array with the names of the element's child elements as Strings.

How I understand the description, I do not expect that calling the method on this

<?xml version="1.0"?>
<animals>
  There are 4 animals here.
  <mammal id="0" species="Capra hircus">Goat</mammal>
  <mammal id="1" species="Panthera pardus">Leopard</mammal>
  <mammal id="2" species="Equus zebra">Zebra</mammal>
  <reptile id="3" species="Caretta caretta">Turtle</reptile>
</animals>

will include "There are 4 animals here" in the returned array.

What's your take on the description and the intended behavior of the method?

@limzykenneth
Copy link
Member

limzykenneth commented May 31, 2024

@raquelcps For me, elements is basically everything. What is being referred to as elements here is more usually referred to as tags as in HTML/XML tags, although the definition can drift.

Regardless, in terms of utility, I do think having #text included is usually more useful than not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants