Axis scale includes null value counts #222

Open
willeppy opened this issue Nov 29, 2023 · 5 comments

willeppy commented Nov 29, 2023

When plotting a binned bar chart, I noticed that the y axis scale is affected by the presence of nulls. Is this expected?

For example, the following chart does not include null values and the y axis has a max of 1:

await vg.coordinator().exec(vg.loadObjects("testData", [{ colA: 1 }, { colA: 2 }]));

vg.plot(
  vg.rectY(vg.from("testData"), {
    x: vg.bin("colA"),
    y: vg.count(),
    inset: 0.5,
  }),
  vg.height(200)
);

[Screenshot: resulting chart, y-axis max of 1]

Whereas this data has null values, and the y-axis max is 6:

await vg.coordinator().exec(
  vg.loadObjects("testData", [
    { colA: 1 },
    { colA: 2 },
    { colA: null },
    { colA: null },
    { colA: null },
    { colA: null },
    { colA: null },
    { colA: null },
    { colA: null },
  ])
);

vg.plot(
  vg.rectY(vg.from("testData"), {
    x: vg.bin("colA"),
    y: vg.count(),
    inset: 0.5,
  }),
  vg.height(200)
);

[Screenshot: resulting chart, y-axis max of 6]

I would expect these two plots to look the same; however, the presence of nulls seems to affect the axis scale. One workaround I've found is to first create a view that filters out nulls before plotting, but I'm curious whether filtering them out should be the default behavior.
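
For reference, the view-based workaround might look roughly like the sketch below. The helper name `notNullViewSQL` and the view name `testDataNoNulls` are hypothetical, and passing a raw SQL string to `vg.coordinator().exec()` is an assumption based on how `vg.loadObjects` is used above. The Mosaic calls are left commented out so the snippet runs on its own.

```javascript
// Hypothetical helper: build SQL for a view that drops rows where `column` is NULL.
// Identifiers are double-quoted (with embedded quotes doubled) to keep them valid SQL.
function notNullViewSQL(view, table, column) {
  const q = (name) => `"${String(name).replace(/"/g, '""')}"`;
  return `CREATE OR REPLACE VIEW ${q(view)} AS SELECT * FROM ${q(table)} WHERE ${q(column)} IS NOT NULL`;
}

const sql = notNullViewSQL("testDataNoNulls", "testData", "colA");
console.log(sql);

// Assumed usage with Mosaic (not run here):
// await vg.coordinator().exec(sql);
// vg.plot(
//   vg.rectY(vg.from("testDataNoNulls"), { x: vg.bin("colA"), y: vg.count(), inset: 0.5 }),
//   vg.height(200)
// );
```

Plotting from the filtered view keeps the original table intact, so the null rows remain available for other marks such as the bar chart.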

willeppy (Author) commented

One follow-up comment: I quite like that the default behavior for bars is to actually plot the nulls:

await vg
    .coordinator()
    .exec(
      vg.loadObjects("testData", [
        { colA: 1 },
        { colA: 2 },
        { colA: null },
        { colA: null },
        { colA: null },
        { colA: null },
        { colA: null },
        { colA: null },
      ])
    );

  return vg.vconcat(
    vg.plot(
      vg.barX(vg.from("testData"), {
        x: vg.count(),
        y: "colA",
        order: "colA",
      }),
      vg.height(200)
    )
  );

[Screenshot: resulting bar chart, with the nulls plotted]

This makes more sense to me than the example above since the nulls are actually plotted. Is it reasonable to expect nulls to be plotted for bars but filtered out when they are not actually included in the plot?

jheer (Member) commented Nov 29, 2023

The aggregate (bin/count) query must be returning an entry corresponding to the null values. I'm guessing the resulting x1 and x2 values that map to the x-axis are null. As a result, Observable Plot does not draw a corresponding bar but does include the count in the axis scale determination.

The question is where we might want null filtering/suppression to kick in. Should the underlying query suppress nulls? In general I don't think so, as your barX example suggests. But we might try to do this as part of the semantics of the bin transform? Or via an explicit push-down null filter option? Or something else?
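
To make the suspected mechanism concrete, here is a plain-JavaScript simulation. It is not Mosaic or Observable Plot internals: `binCount` is a hypothetical stand-in for the bin/count query, and the scale logic is deliberately simplified to a max over counts. It only illustrates how a null group can inflate the y domain without producing a drawable rect.

```javascript
// Group rows into fixed-width bins and count them, as a SQL GROUP BY would:
// all null values end up in a single group whose bin bounds are null.
function binCount(rows, field, step = 1) {
  const groups = new Map();
  for (const row of rows) {
    const v = row[field];
    const x1 = v == null ? null : Math.floor(v / step) * step;
    const g = groups.get(x1) ?? { x1, x2: x1 == null ? null : x1 + step, count: 0 };
    g.count += 1;
    groups.set(x1, g);
  }
  return [...groups.values()];
}

// Two numeric rows plus seven nulls, matching the second example in the issue.
const rows = [
  { colA: 1 },
  { colA: 2 },
  ...Array.from({ length: 7 }, () => ({ colA: null })),
];
const bins = binCount(rows, "colA");

// A y-scale computed over every returned row sees the null group's count...
const maxAll = Math.max(...bins.map((b) => b.count));
// ...while only rows with non-null bounds can be drawn as rects.
const drawable = bins.filter((b) => b.x1 != null && b.x2 != null);
const maxDrawn = Math.max(...drawable.map((b) => b.count));

console.log({ maxAll, maxDrawn });
```

In this simulation the domain computed over all rows is much larger than the one computed over drawable rows, which is consistent with the stretched axis in the report.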

domoritz (Member) commented

I think it would be nice to have a separate bar for nulls, or an indicator of how many nulls there are in a histogram, similar to Tableau. Then the right thing would probably be to remove the nulls before passing the data to Plot but still query for them.
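
A rough sketch of that idea in plain JavaScript: keep the null group in the query result, but split it off before the binned rows reach the plot. The function `splitNullBin` and the `{ x1, x2, count }` row shape are assumptions for illustration, not an existing Mosaic API.

```javascript
// Hypothetical post-processing step: separate the null group from the drawable bins.
function splitNullBin(bins) {
  const drawable = [];
  let nullCount = 0;
  for (const b of bins) {
    if (b.x1 == null || b.x2 == null) nullCount += b.count;
    else drawable.push(b);
  }
  return { drawable, nullCount };
}

// Assumed shape of an aggregate result: one row per bin plus one row for nulls.
const result = [
  { x1: 1, x2: 2, count: 1 },
  { x1: 2, x2: 3, count: 1 },
  { x1: null, x2: null, count: 7 },
];
const { drawable, nullCount } = splitNullBin(result);
console.log(drawable.length, nullCount);
```

The `drawable` rows could drive the histogram and its scale, while `nullCount` could feed a separate bar or a text indicator.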

jheer (Member) commented Nov 30, 2023

@domoritz we can already do this with an ordinal domain, but with rectX/Y here we have a solely continuous domain. But the bin transform should already work with barX/Y for an ordinal scale including nulls.

domoritz (Member) commented

I'm thinking of something like https://vega.github.io/vega/examples/histogram-null-values/.
