GroupBy with usage tech.tablesaw.api.Table#stream #1220

Open
pavel-hp opened this issue Jun 17, 2023 · 0 comments

pavel-hp commented Jun 17, 2023

I ran into a problem when applying a group-by operation on a specific column using the
tech.tablesaw.api.Table#stream API
https://javadoc.io/doc/tech.tablesaw/tablesaw-core/latest/tech/tablesaw/api/Table.html#stream--
(Returns the rows in table as a Stream)

Basically, grouping gives wrong results when I use the stream returned by Table#stream.
(I know I can group by a column in other ways; this is just an example to demonstrate a bug in the Tablesaw stream API.)

Tablesaw version: 0.43.1

Requires JDK 11

Here is a test that demonstrates the issue:

import lombok.AllArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.Test;
import tech.tablesaw.api.IntColumn;
import tech.tablesaw.api.Row;
import tech.tablesaw.api.StringColumn;
import tech.tablesaw.api.Table;

import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

@Slf4j
class TableSawGroupTest {
    static List<Holder> testData = List.of(
            new Holder("a", 1),
            new Holder("b", 2),
            new Holder("c", 3),
            new Holder("a", -1)
    );

    @Test
    void shouldGroupBy() {
        Map<String, Integer> tableSawRes = tableSawVersion();
        Map<String, Integer> javaRes = javaStreamVersion();
        Assertions.assertEquals(javaRes, tableSawRes);
    }

    Map<String, Integer> tableSawVersion() {
        StringColumn strColumn = StringColumn.create("A-str", testData.stream().map(p -> p.strValue).collect(Collectors.toList()));
        IntColumn intColumn = IntColumn.create("B-int", testData.stream().map(p -> p.intValue).toArray(Integer[]::new));
        Table table = Table.create(strColumn, intColumn);
        log.info("Table: {}", table.printAll());
        return table.stream()
            .collect(Collectors.groupingBy(p -> p.getString("A-str"), LinkedHashMap::new,
                Collectors.collectingAndThen(Collectors.toList(), rows -> {
                    int sum = 0;
                    for (Row row : rows) {
                        int bValue = row.getInt("B-int");
                        sum = sum + bValue;
                    }
                    return sum;
                })));
    }

    Map<String, Integer> javaStreamVersion() {
        return testData.stream()
            .collect(Collectors.groupingBy(p -> p.strValue, LinkedHashMap::new,
                Collectors.collectingAndThen(Collectors.toList(), rows -> {
                    int sum = 0;
                    for (Holder row : rows) {
                        int bValue = row.intValue;
                        sum = sum + bValue;
                    }
                    return sum;
                })));
    }

    @AllArgsConstructor
    private static class Holder {
        String strValue;
        int intValue;
    }
}

The test fails. Here is the output:

2023-06-17 08:50:09 INFO  TableSawGroupTest:35 - Table:  A-str  |  B-int  |
-------------------
     a  |      1  |
     b  |      2  |
     c  |      3  |
     a  |     -1  |

org.opentest4j.AssertionFailedError: 
Expected :{a=0, b=2, c=3}
Actual   :{a=-2, b=-1, c=-1}

See tablesaw-test.zip
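
A possible explanation, offered as an assumption rather than something confirmed from the Tablesaw source: the actual result is what you would get if Table#stream handed out one reusable Row cursor instead of a separate Row per record. Every Row collected into the list would then point at the last record (B-int = -1) by the time the sums are computed, giving a = 2 * -1, b = -1, c = -1. The sketch below reproduces that effect in plain Java with a hypothetical Cursor class standing in for Row (SharedCursorDemo, Cursor, collectRows and sumEagerly are made-up names, and no Tablesaw code is involved). It also shows that reading the value while the stream is still positioned on the record avoids the problem.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class SharedCursorDemo {
    // Same data as the test above: (a, 1), (b, 2), (c, 3), (a, -1).
    static final String[] KEYS = {"a", "b", "c", "a"};
    static final int[] VALUES = {1, 2, 3, -1};

    /** Hypothetical stand-in for Row: one mutable object that is moved, not copied. */
    static final class Cursor {
        private int index = -1;
        String key() { return KEYS[index]; }
        int value() { return VALUES[index]; }
    }

    /** ASSUMPTION: the stream yields the SAME Cursor instance for every record. */
    static Stream<Cursor> stream() {
        Cursor cursor = new Cursor();
        Iterator<Cursor> it = new Iterator<>() {
            @Override public boolean hasNext() { return cursor.index < KEYS.length - 1; }
            @Override public Cursor next() { cursor.index++; return cursor; }
        };
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false);
    }

    /** Mirrors the failing test: keeps Cursor references and reads them after the stream ends. */
    static Map<String, Integer> collectRows() {
        return stream().collect(Collectors.groupingBy(Cursor::key, LinkedHashMap::new,
                Collectors.collectingAndThen(Collectors.toList(), rows -> {
                    int sum = 0;
                    for (Cursor row : rows) {
                        sum += row.value(); // every reference now points at the last record
                    }
                    return sum;
                })));
    }

    /** Reads the value while the cursor is still on the record, so nothing stale is kept. */
    static Map<String, Integer> sumEagerly() {
        return stream().collect(Collectors.groupingBy(Cursor::key, LinkedHashMap::new,
                Collectors.summingInt(Cursor::value)));
    }

    public static void main(String[] args) {
        System.out.println(collectRows()); // {a=-2, b=-1, c=-1}
        System.out.println(sumEagerly());  // {a=0, b=2, c=3}
    }
}
```

If the assumption holds, the same change should apply to tableSawVersion() in the test above: replacing the collectingAndThen(Collectors.toList(), ...) downstream with Collectors.summingInt(row -> row.getInt("B-int")) means no Row reference outlives its record.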
