Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect measurement of area between intersecting 2D curves #20

Open
rafalszulejko opened this issue May 19, 2022 · 11 comments
Open

Incorrect measurement of area between intersecting 2D curves #20

rafalszulejko opened this issue May 19, 2022 · 11 comments

Comments

@rafalszulejko
Copy link

It appears that either my understanding of area between the curves or its calculation in the library is incorrect (the referenced paper is paywalled). In the following example, I have two plots, where grey line is original data, and there are two different blue splines. Visually, you can clearly see that the area between two lines on the left plot is several times larger than the area on the right plot, but the calculation with similaritymeasures.area_between_two_curves shows only a 2x difference.
image
image

Here is the GitHub gist, where I present the calculation (the GDrive zip with airfoils is public, so the whole thing can be ran in Colab or elsewhere if you modify the path in the 2nd cell):
https://gist.github.com/rafalszulejko/2c9ff645b448d60d857975a8f7965045#file-wing-optimization2-ipynb

@cjekel
Copy link
Owner

cjekel commented May 19, 2022

The algorithm is not behind a paywall, and here is a copy of the paper if you want to read it.

You are not using enough discrete data points to effectively calculate the area. The method makes no assumptions about the curvature between discrete data points, other than data points can be connected with a line. This is important, because straight lines allows us to use quadrilaterals.

Let me run that script and plot the quadrilaterals so you can get a sense of what is happening. Give me an hour or so.

@rafalszulejko
Copy link
Author

Thank you for the quick reply. To clarify - these big blue points are just nodes, through which the splines are interpolated. Then the spline is evaluated to produce discrete data points.
The actual blue line (out in the notebook) has 101 data points, and the grey one (af_pts) has 97 - is that still not enough?

I really appreciate you looking into the script, thank you!

@cjekel
Copy link
Owner

cjekel commented May 19, 2022

When I ran your code, I noticed there were only 15 data points... Are you indeed using 100? Let me plot the quadrilaterals still to illustrate what's happening.

Edit: Ah I see now you did have more than 15 points in the comparison... It might be a sampling rate issue.

@rafalszulejko
Copy link
Author

rafalszulejko commented May 19, 2022

I believe you are still referring to XX and YY variables, which in the second plot indeed have the length of 15. As implicitly mentioned in my previous comment, while important to my project, this variable is unrelated to area calculation. The data points are stored in af_pts and out arrays. I am sorry, I should have stated that clearly in the first place, and the plot should better reflect the problem and not something I wanted to illustrate in my project.

I removed that information from the plot. Now blue points refer to the evaluated points from the spline (the data is unchanged). I updated the gist: https://gist.github.com/rafalszulejko/2c9ff645b448d60d857975a8f7965045#file-wing-optimization2-ipynb

image

@cjekel
Copy link
Owner

cjekel commented May 19, 2022

The quadrilaterals actually look good... seems like you are actually getting a good area value...
image

@cjekel
Copy link
Owner

cjekel commented May 19, 2022

So here is a convergent study by increasing the data points. You can see the case on the right sees a more drastic change than the airfoil on the left.

I would consider using at least 200 data points.

Instead of doing a area comparison directly, you might want to compare the area difference relative to the true cross sectional area of the airfoil. This should help normalize the area comparisons.

My bisection algorithm to add more data points is very slow. If you an make both curves have the exact same number of data points, then the calculations should be quicker. ( at least I think so from memory )

Again the quadrilaterals all look good.

202 data

  unew = np.arange(0, 1.01, 0.005) #dlaczego 1.01 jest lepsze niż u[-1]???

image

1010 data

  unew = np.arange(0, 1.01, 0.001) #dlaczego 1.01 jest lepsze niż u[-1]???

image

2022 data

  unew = np.arange(0, 1.01, 0.0005) #dlaczego 1.01 jest lepsze niż u[-1]???

image

xxxx data (still running)...

  unew = np.arange(0, 1.01, 0.0001) #dlaczego 1.01 jest lepsze niż u[-1]???

@cjekel
Copy link
Owner

cjekel commented May 19, 2022

And another tip.

def interpolateAirfoilWithNodes(af_ptsx, af_ptsy, nodes):
  LE_index = np.argmin(af_ptsx)
  xx, yy = getXYnodes(af_ptsx, af_ptsy, nodes);
  tck, u = interpolate.splprep([xx, yy], s=0)
  unew = np.arange(0, 1.01, 0.01) #dlaczego 1.01 jest lepsze niż u[-1]???
  out = interpolate.splev(unew, tck)
  return out, u

Instead of picking unew evenly spaced, you may want to sample unew exactly at af_ptsx to ensure the exact same sampling rate with respect to the x axis.

@rafalszulejko
Copy link
Author

rafalszulejko commented May 19, 2022

Interesting - these new results more closely resemble what I have just obtained with MATLAB polyshapes. For original numbers of data points (97 and 101), I obtained 0.006484 for the left plot and 0.001548 for the right. I think I disabled every automatic correction, but I am not 100% sure this should be the 'ground truth'.

@rafalszulejko
Copy link
Author

rafalszulejko commented May 21, 2022

Okay, here is the gist with calculation of area in MATLAB, I just exported the data points from the notebook and put into the script.
https://gist.github.com/rafalszulejko/6c961edb0cf4f1b1fc544678f90ca65d

Are you able to run it? And even if we can take its value as the ground truth, what then? I am not very familiar with how things are done in open source community. I am also not sure if I could fix this myself (and post a PR).

@cjekel
Copy link
Owner

cjekel commented Jun 6, 2022

I'm sorry for the delayed response. I've been completely swamped, preparing , and I'm currently at a conference. This is airfoil work is something I am interested in.

I have not had a chance to run the matlab routine.

I will just say, my method is always going to be an approximate method. One thing you can do to improve the approximations, would be to try tweaking the number of data points as I suggested. Either with more data points, or by tweaking such that the sampling rate is better.

That said I do have some new ideas on how to get a better, more 'exact' area.

Where are you on this? I expect I'll have more free time towards the end of the month.

@rafalszulejko
Copy link
Author

No worries, the project I'm working on is at a stage when I can comfortably work on a different part of it. The workaround helps a lot (and I really appreciate the time you put into that problem already) for a quick estimation, though for a step equal to 0.0005 we still have 5.7% error for the plot on the right assuming MATLAB's result is correct. Also I am not sure if and how the shape of the calculated area influences the error.

If you'd like to and have time to continue investigation of this, I'll be happy to provide more "problematic" data along with results obtained with matlab.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants