use fixed point math for Offer resource calculations #161

Open
erikdw opened this issue Jul 21, 2016 · 0 comments

erikdw commented Jul 21, 2016

Problem Description

During testing of PR #154 we noticed that the floating-point math in the AggregatedOffers logic can account incorrectly for the resources used. For example, with 4.1 CPUs available and a worker needing 1.1 CPUs, the subtraction left 2.9999999999999996 CPUs instead of the expected 3.0. [1]

Effect?

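The effect is a failure to squeeze every bit of resources out of a set of Offers: a worker that needs exactly the amount that should remain can be turned away because the computed remainder comes up slightly short.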
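
To make the effect concrete, here is a minimal sketch using the numbers from the problem description. The class `DoubleFitSketch` and the method `fits` are hypothetical names made up for illustration; they are not taken from the storm-mesos code, and the real AggregatedOffers check may look different.

```java
// Hypothetical sketch; names are illustrative and not from storm-mesos.
final class DoubleFitSketch {

    // Returns true if a request for `needed` CPUs fits in `available` CPUs.
    static boolean fits(double available, double needed) {
        return available >= needed;
    }

    public static void main(String[] args) {
        // The subtraction from the problem description, done with doubles.
        double remaining = 4.1 - 1.1;
        System.out.println(remaining);            // prints 2.9999999999999996

        // A worker that needs exactly 3.0 CPUs no longer fits.
        System.out.println(fits(remaining, 3.0)); // prints false
    }
}
```

The shortfall is tiny, but a plain `>=` comparison treats it the same as a real shortage.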

Implementation Options

Switch from Double/double to one of these fixed-point implementations:

  1. BigDecimal
  2. decimal4j (supposedly more performant than BigDecimal, although I'm not sure performance is a concern for us)
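
As a rough sketch of option 1, the same subtraction can be done with `java.math.BigDecimal`. The class `ResourceMathSketch` and its methods are hypothetical and only for illustration. The sketch assumes resource values can be obtained as decimal strings, because constructing a `BigDecimal` directly from a `double` would carry over the binary rounding error. decimal4j is not shown.

```java
import java.math.BigDecimal;

// Hypothetical sketch; names are illustrative and not from storm-mesos.
final class ResourceMathSketch {

    // The subtraction as it behaves today with primitive doubles.
    static double subtractWithDouble(double available, double needed) {
        return available - needed;
    }

    // The same subtraction with BigDecimal, built from decimal strings
    // so that 4.1 and 1.1 are represented exactly.
    static BigDecimal subtractWithBigDecimal(String available, String needed) {
        return new BigDecimal(available).subtract(new BigDecimal(needed));
    }

    // Fit check using compareTo, which ignores scale (3.0 vs 3.00).
    static boolean fits(BigDecimal available, BigDecimal needed) {
        return available.compareTo(needed) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(subtractWithDouble(4.1, 1.1)); // prints 2.9999999999999996

        BigDecimal remaining = subtractWithBigDecimal("4.1", "1.1");
        System.out.println(remaining);                    // prints 3.0

        System.out.println(fits(remaining, new BigDecimal("3.0"))); // prints true
    }
}
```

Comparisons use `compareTo` rather than `equals`, since `BigDecimal.equals` also compares scale and would treat 3.0 and 3.00 as different values.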

Appendix

[1] Specifically, this value is the "bad" one: CreateMesosWorkerSlot()'s aggregatedOffers.availableResources[0].value.totalAvailableResource.

erikdw added a commit to erikdw/storm-mesos that referenced this issue Jul 22, 2016
…e being

1. Our floating point tabulations in AggregatedOffers can lead to confusing results
where a 1.0 becomes 0.999999999996 for example.  We add a fudge factor by adding 0.01
more CPU resources in one test to work around this until the following issue is fixed:
  * mesos#161
2. Since we are disabling the ability of allSlotsAvailableForScheduling to account
for supervisors already existing when calculating resource needs, we disable the test
that is validating that behavior.  That will be reenabled once we fix this issue:
* mesos#160
erikdw added a commit to erikdw/storm-mesos that referenced this issue Jul 22, 2016
…e being

1. Our floating point tabulations in AggregatedOffers can lead to confusing results
where a 1.0 becomes 0.999999999996 for example.  We add a fudge factor by adding 0.01
more CPU resources in one test to work around this until the following issue is fixed:
  * mesos#161
2. Since we are disabling the ability of allSlotsAvailableForScheduling to account
for supervisors already existing when calculating resource needs, we disable the test
that is validating that behavior.  That will be reenabled once we fix this issue:
* mesos#160

Also a few cosmetic-ish changes:
1. Use more standard hostnames with a domain of "example.org", a standard domain
for documentation.  See http://example.org
2. Rearrange/renumber some of the offers to prevent confusion.
erikdw added a commit to erikdw/storm-mesos that referenced this issue Jul 23, 2016
JessicaLHartog pushed a commit to JessicaLHartog/mesos-storm that referenced this issue Jul 23, 2016
erikdw changed the title from "use BigDecimal or fixed point math for Offer resource calculations" to "use fixed point math for Offer resource calculations" on Jul 27, 2016
JessicaLHartog pushed a commit to JessicaLHartog/mesos-storm that referenced this issue Jul 28, 2016
JessicaLHartog pushed a commit to JessicaLHartog/mesos-storm that referenced this issue Jul 29, 2016