wrong data quality values when the same flow occurs multiple times #120

Open
msrocka opened this issue Oct 15, 2020 · 0 comments
msrocka commented Oct 15, 2020

When the same flow occurs multiple times in a process but with different data quality values:

image

only one data quality value is selected for the flow of the process in the result calculation:

image

This is an issue in openLCA 1.10.x and 2.0.x. In openLCA 2.0.x, however, it works when the flow has different locations in the process and a regionalized calculation is performed:

image

image

It does not work with a normal calculation, though.

In openLCA 2.0 we collect the DQ values in a single table scan and organize them in flow × process matrices for the different DQ indicators (allocating just a single byte per DQI score). This structure is fast for calculating aggregated DQ results based on inventory and impact results. But in order to fix this bug, we need to calculate process-based aggregations during the table scan. Depending on the aggregation function, this has to include the respective amount of the flow, which in turn may depend on formulas, unit conversions, etc. Thus, it would be best to collect the DQ information directly in the inventory builder, where the formula interpreter, conversion table, allocation factors, etc. are present. We could then attach the DQ matrices to the output matrices of the inventory builder (as we do for the uncertainty parameters).
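To make the failure mode concrete, here is a minimal Python sketch of a byte matrix with one cell per flow/process pair. It is not openLCA code: `DQMatrix` and its methods are hypothetical names, and the write-per-exchange scan is an assumption about how such a structure would typically be filled. With only one cell per pair, a second exchange of the same flow simply overwrites the first score.

```python
class DQMatrix:
    """Dense flow x process matrix with one byte-sized DQ score per cell.

    Hypothetical illustration, not the openLCA data structure.
    """

    def __init__(self, n_flows: int, n_processes: int):
        self.n_flows = n_flows
        self.n_processes = n_processes
        # One byte per cell; 0 is used here to mean "no score".
        self.data = bytearray(n_flows * n_processes)

    def set(self, flow: int, process: int, score: int) -> None:
        self.data[flow * self.n_processes + process] = score

    def get(self, flow: int, process: int) -> int:
        return self.data[flow * self.n_processes + process]


# Assumed table scan: flow 0 occurs twice in process 0, with scores 5 and 1.
m = DQMatrix(n_flows=2, n_processes=1)
for flow, process, score in [(0, 0, 5), (1, 0, 3), (0, 0, 1)]:
    m.set(flow, process, score)

# Only the score that was written last is left in the cell.
overwritten = m.get(0, 0)
```

Any fix therefore has to combine the scores of repeated exchanges before (or while) they are written into the cell.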

To calculate such process-based aggregations during the table scan, we need to keep the running total amounts and aggregated DQ values per flow, e.g.:

# Each exchange: (flow ID, amount, DQ scores per indicator).
exchanges = [
  (1, 0.5, [1, 2, 3, 4, 5]),
  (2, -1.0, [5, 5, 5, 5, 5]),
  (1, 2.0, [1, 1, 1, 1, 1]),
  (2, 1.0, [1, 1, 1, 1, 1]),
  (2, -1.0, [1, 1, 1, 1, 1]),
]

# Running totals per flow: the summed absolute amount and the
# amount-weighted mean of the DQ scores seen so far.
total_amounts = Dict{Int, Float64}()
total_dqs = Dict{Int, Vector{Float64}}()

for (flowID, amount, dq) in exchanges
  abs_amount = abs(amount)
  total_amount = get(total_amounts, flowID, 0.0)

  # First exchange of this flow (or nothing weighted so far):
  # take the scores as they are.
  if total_amount == 0
    total_amounts[flowID] = abs_amount
    total_dqs[flowID] = Float64.(dq)
    continue
  end

  # Fold the new scores into the weighted mean.
  total_dq = total_dqs[flowID]
  next_total = total_amount + abs_amount
  for i in eachindex(total_dq)
    total_dq[i] = (
      total_dq[i] * total_amount + dq[i] * abs_amount
      ) / next_total
  end
  total_amounts[flowID] = next_total
end
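Written out, the loop above maintains an amount-weighted mean per flow and DQ indicator. The notation is introduced here only for illustration: $a_k$ is the amount of the $k$-th exchange of a flow, $q_{k,i}$ its score for indicator $i$, and $A_n$ the running total of absolute amounts (assumed to be non-zero).

```math
\bar{q}_i^{(n)} = \frac{\bar{q}_i^{(n-1)} A_{n-1} + q_{n,i}\,|a_n|}{A_{n-1} + |a_n|},
\qquad A_n = \sum_{k=1}^{n} |a_k|
\quad\Longrightarrow\quad
\bar{q}_i^{(n)} = \frac{\sum_{k=1}^{n} |a_k|\, q_{k,i}}{\sum_{k=1}^{n} |a_k|}
```

For the example data this gives $(1,\ 1.2,\ 1.4,\ 1.6,\ 1.8)$ for flow 1 (amounts 0.5 and 2.0) and $7/3 \approx 2.33$ for every indicator of flow 2 (three exchanges with absolute amount 1.0 and scores 5, 1, 1). These values are not integers, so they would have to be rounded before they fit into single-byte DQI scores; which rounding rule to use is left open here.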

Another problem is that the BMatrix structures in which we store the DQ values may have to grow during the table scan, as we build the flow index during that scan. Also, there is currently no way to calculate result-based DQ values for intermediate product and waste flows (e.g. based on their total requirements). When we fix this bug, we should also think about this.
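One way a byte matrix could grow while the flow index is still being built is sketched below in Python. This is only an illustration under stated assumptions: `GrowableByteMatrix` is a hypothetical class and not the BMatrix API, flows are stored as rows so that a new flow just appends a row, and the number of processes (columns) is assumed to be known before the scan starts.

```python
class GrowableByteMatrix:
    """Row-major byte matrix that appends rows on demand (hypothetical)."""

    def __init__(self, n_cols: int, row_capacity: int = 4):
        self.n_cols = n_cols
        self.n_rows = 0
        self._data = bytearray(row_capacity * n_cols)

    def _ensure_row(self, row: int) -> None:
        needed = (row + 1) * self.n_cols
        if needed > len(self._data):
            # Grow geometrically so repeated appends stay cheap on average.
            new_size = max(needed, 2 * len(self._data))
            self._data.extend(bytes(new_size - len(self._data)))
        self.n_rows = max(self.n_rows, row + 1)

    def set(self, row: int, col: int, value: int) -> None:
        if not 0 <= col < self.n_cols:
            raise IndexError("column out of range")
        self._ensure_row(row)
        self._data[row * self.n_cols + col] = value

    def get(self, row: int, col: int) -> int:
        if row >= self.n_rows:
            return 0  # rows that were never written have no score
        return self._data[row * self.n_cols + col]


# The flow index is built during the scan; each new flow gets the next row.
flow_index = {}
m = GrowableByteMatrix(n_cols=3, row_capacity=1)
for flow_id, process, score in [(42, 0, 2), (7, 1, 4), (42, 2, 5), (99, 0, 1)]:
    row = flow_index.setdefault(flow_id, len(flow_index))
    m.set(row, process, score)
```

Storing flows as rows keeps growth to a plain append; with flows as columns, every new flow would require re-laying out all existing rows.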

@msrocka msrocka added the bug label Oct 15, 2020
@msrocka msrocka added this to the v2.0 milestone Jul 29, 2022
@msrocka msrocka removed this from the v2.0 milestone May 17, 2023