
C/S update error #188

Open
hdoupe opened this issue Nov 8, 2021 · 3 comments
Comments

@hdoupe
Collaborator

hdoupe commented Nov 8, 2021

I'm getting this error when trying to update Tax-Brain for Compute Studio:

ValueError: Must have equal len keys and value when setting with an iterable
Full stack trace
_______________________________________________________________________________ TestFunctions1.test_run_model ________________________________________________________________________________

self = <cs_config.tests.test_functions.TestFunctions1 object at 0x7fefc584fb50>

    def test_run_model(self):
        self.test_all_data_specified()
        inputs = self.get_inputs({})
        check_get_inputs(inputs)
    
        class MetaParams(Parameters):
            defaults = inputs["meta_parameters"]
    
        mp_spec = MetaParams().specification(serializable=True)
    
>       result = self.run_model(mp_spec, self.ok_adjustment)

../miniconda3/envs/taxbrain-dev/lib/python3.9/site-packages/cs_kit/validate.py:221: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
cs-config/cs_config/functions.py:150: in run_model
    results = compute(*delayed_list)
../miniconda3/envs/taxbrain-dev/lib/python3.9/site-packages/dask/base.py:568: in compute
    results = schedule(dsk, keys, **kwargs)
../miniconda3/envs/taxbrain-dev/lib/python3.9/site-packages/dask/threaded.py:79: in get
    results = get_async(
../miniconda3/envs/taxbrain-dev/lib/python3.9/site-packages/dask/local.py:517: in get_async
    raise_exception(exc, tb)
../miniconda3/envs/taxbrain-dev/lib/python3.9/site-packages/dask/local.py:325: in reraise
    raise exc
../miniconda3/envs/taxbrain-dev/lib/python3.9/site-packages/dask/local.py:223: in execute_task
    result = _execute_task(task, data)
../miniconda3/envs/taxbrain-dev/lib/python3.9/site-packages/dask/core.py:121: in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
cs-config/cs_config/helpers.py:187: in nth_year_results
    agg1, agg2 = fuzzed(dv1, dv2, reform_affected, 'aggr')
cs-config/cs_config/helpers.py:158: in fuzzed
    group2.iloc[idx] = group1.iloc[idx]
../miniconda3/envs/taxbrain-dev/lib/python3.9/site-packages/pandas/core/indexing.py:723: in __setitem__
    iloc._setitem_with_indexer(indexer, value, self.name)
../miniconda3/envs/taxbrain-dev/lib/python3.9/site-packages/pandas/core/indexing.py:1730: in _setitem_with_indexer
    self._setitem_with_indexer_split_path(indexer, value, name)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <pandas.core.indexing._iLocIndexer object at 0x7fef9101f7c0>, indexer = (223671, slice(None, None, None))
value = c62100                  9009.470531
aftertax_income        10535.143557
payrolltax              1378.448991
benefit_va...          1.000000
benefit_cost_total         0.000000
s006                     231.800000
Name: 223671, dtype: float64
name = 'iloc'

    def _setitem_with_indexer_split_path(self, indexer, value, name: str):
        """
        Setitem column-wise.
        """
        # Above we only set take_split_path to True for 2D cases
        assert self.ndim == 2
    
        if not isinstance(indexer, tuple):
            indexer = _tuplify(self.ndim, indexer)
        if len(indexer) > self.ndim:
            raise IndexError("too many indices for array")
        if isinstance(indexer[0], np.ndarray) and indexer[0].ndim > 2:
            raise ValueError(r"Cannot set values with ndim > 2")
    
        if (isinstance(value, ABCSeries) and name != "iloc") or isinstance(value, dict):
            from pandas import Series
    
            value = self._align_series(indexer, Series(value))
    
        # Ensure we have something we can iterate over
        info_axis = indexer[1]
        ilocs = self._ensure_iterable_column_indexer(info_axis)
    
        pi = indexer[0]
        lplane_indexer = length_of_indexer(pi, self.obj.index)
        # lplane_indexer gives the expected length of obj[indexer[0]]
    
        # we need an iterable, with a ndim of at least 1
        # eg. don't pass through np.array(0)
        if is_list_like_indexer(value) and getattr(value, "ndim", 1) > 0:
    
            if isinstance(value, ABCDataFrame):
                self._setitem_with_indexer_frame_value(indexer, value, name)
    
            elif np.ndim(value) == 2:
                self._setitem_with_indexer_2d_value(indexer, value)
    
            elif len(ilocs) == 1 and lplane_indexer == len(value) and not is_scalar(pi):
                # We are setting multiple rows in a single column.
                self._setitem_single_column(ilocs[0], value, pi)
    
            elif len(ilocs) == 1 and 0 != lplane_indexer != len(value):
                # We are trying to set N values into M entries of a single
                #  column, which is invalid for N != M
                # Exclude zero-len for e.g. boolean masking that is all-false
    
                if len(value) == 1 and not is_integer(info_axis):
                    # This is a case like df.iloc[:3, [1]] = [0]
                    #  where we treat as df.iloc[:3, 1] = 0
                    return self._setitem_with_indexer((pi, info_axis[0]), value[0])
    
                raise ValueError(
                    "Must have equal len keys and value "
                    "when setting with an iterable"
                )
    
            elif lplane_indexer == 0 and len(value) == len(self.obj.index):
                # We get here in one case via .loc with a all-False mask
                pass
    
            elif len(ilocs) == len(value):
                # We are setting multiple columns in a single row.
                for loc, v in zip(ilocs, value):
                    self._setitem_single_column(loc, v, pi)
    
            elif len(ilocs) == 1 and com.is_null_slice(pi) and len(self.obj) == 0:
                # This is a setitem-with-expansion, see
                #  test_loc_setitem_empty_append_expands_rows_mixed_dtype
                # e.g. df = DataFrame(columns=["x", "y"])
                #  df["x"] = df["x"].astype(np.int64)
                #  df.loc[:, "x"] = [1, 2, 3]
                self._setitem_single_column(ilocs[0], value, pi)
    
            else:
>               raise ValueError(
                    "Must have equal len keys and value "
                    "when setting with an iterable"
                )
E               ValueError: Must have equal len keys and value when setting with an iterable

../miniconda3/envs/taxbrain-dev/lib/python3.9/site-packages/pandas/core/indexing.py:1808: ValueError

It looks like the error is thrown when assigning the values of one dataframe to another (helpers.py, line 158):

```python
for name, group2 in gdf2:
    group2 = copy.deepcopy(group2)
    indices = np.where(group2['reform_affected'])
    num = min(len(indices[0]), NUM_TO_FUZZ)
    if num > 0:
        choices = np.random.choice(indices[0], size=num, replace=False)
        group1 = gdf1.get_group(name)
        for idx in choices:
            group2.iloc[idx] = group1.iloc[idx]
    group_list.append(group2)
```
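For what it's worth, the mismatch is straightforward to reproduce outside of Tax-Brain. A minimal sketch (made-up miniature frames, not the real PUF data): assigning a row from a frame with fewer columns into a frame that has one extra column via `.iloc` raises the same ValueError:

```python
import pandas as pd

# hypothetical miniature stand-ins for group1 and group2
group1 = pd.DataFrame({"c62100": [1.0], "s006": [2.0]})
group2 = group1.copy()
group2["reform_affected"] = [True]  # extra column present only in group2

try:
    # a row of length 2 assigned into a row slot of length 3
    group2.iloc[0] = group1.iloc[0]
except ValueError as err:
    print(err)  # "Must have equal len keys and value when setting with an iterable"
```

Note that `.loc` would align the assigned Series on labels and fill the missing `reform_affected` entry with NaN; `.iloc` is purely positional, so the lengths must match exactly.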

I verified that all other tests are passing locally, too.

I verified that the correct PUF file is being downloaded by downloading a new copy of the PUF and then testing it against the copy in the S3 bucket:

[screenshot]

@hdoupe
Collaborator Author

hdoupe commented Nov 8, 2021

It looks like the group2 dataframe has an extra column that's not in group1: reform_affected. If we drop this, I think we should be good to go. @andersonfrailey thoughts?

[screenshot]

@andersonfrailey
Collaborator

> @hdoupe: If we drop this, I think we should be good to go

I agree. I went back a bit and it looks like that column was added three years ago. It's unclear why it's only now throwing an error, but dropping it should fix the problem.

@hdoupe
Collaborator Author

hdoupe commented Nov 9, 2021

Ok, cool. Thanks for taking a look. I'll open a PR.
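For reference, a minimal sketch of what the fix might look like (illustrative column names, not the actual helpers.py code): record the affected indices first, then drop `reform_affected` so both rows have equal length before the `.iloc` assignment:

```python
import numpy as np
import pandas as pd

# illustrative stand-ins for group1 and group2 (not the real Tax-Brain data)
group1 = pd.DataFrame({"c62100": [1.0, 2.0], "s006": [3.0, 4.0]})
group2 = pd.DataFrame({"c62100": [9.0, 9.0], "s006": [9.0, 9.0],
                       "reform_affected": [True, False]})

# grab the affected indices first, then drop the extra column so the
# row shapes match during the row-by-row assignment
indices = np.where(group2["reform_affected"])[0]
group2 = group2.drop(columns=["reform_affected"])
for idx in indices:
    group2.iloc[idx] = group1.iloc[idx]  # equal-length rows now

print(group2.iloc[0].tolist())  # [1.0, 3.0]
```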
