New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for dictionary encoded INT96 timestamp in parquet files #4680
base: main
Are you sure you want to change the base?
Conversation
✅ Deploy Preview for meta-velox canceled.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rui-mo the implementation looks good to me. Left a couple of comments. Can you add a Parquet File with timestamp type to velox/dwio/parquet/tests/examples
and add a test?
namespace facebook::velox::parquet { | ||
|
||
class TimestampColumnReader : public IntegerColumnReader { | ||
public: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we create a struct int96 {int val[3];};
in this class and use that instead of allocation int128_t
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just char[12]
(or a local using int96_t = char[12]
) would be good I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Yuhta @majetideepak, I tried above two ways, but I have to say a lot of compilation errors were encountered.
For char[12]
way, below error was found during compilation.
../.././velox/dwio/common/SelectiveColumnReaderInternal.h:55:39: required from ‘void facebook::velox::dwio::common::SelectiveColumnReader::ensureValuesCapacity(facebook::velox::vector_size_t) [with T = char [12]; facebook::velox::vector_size_t = int]’
../.././velox/dwio/common/SelectiveColumnReaderInternal.h:100:3: required from ‘void facebook::velox::dwio::common::SelectiveColumnReader::prepareRead(facebook::velox::vector_size_t, facebook::velox::dwio::common::RowSet, const uint64_t*) [with T = char [12]; facebook::velox::vector_size_t = int; facebook::velox::dwio::common::RowSet = folly::Range<const int*>; uint64_t = long unsigned int]’
../.././velox/dwio/parquet/reader/TimestampColumnReader.h:41:47: required from here
/usr/include/c++/9/optional:952:2: error: function returning an array
952 | value_or(_Up&& __u) const&
| ^~~~~~~~
/usr/include/c++/9/optional:963:2: error: function returning an array
963 | value_or(_Up&& __u) &&
| ^~~~~~~~
In file included from ../../velox/dwio/parquet/reader/ParquetColumnReader.cpp:22:
../.././velox/dwio/common/SelectiveColumnReaderInternal.h: In instantiation of ‘void facebook::velox::dwio::common::SelectiveColumnReader::ensureValuesCapacity(facebook::velox::vector_size_t) [with T = char [12]; facebook::velox::vector_size_t = int]’:
../.././velox/dwio/common/SelectiveColumnReaderInternal.h:100:3: required from ‘void facebook::velox::dwio::common::SelectiveColumnReader::prepareRead(facebook::velox::vector_size_t, facebook::velox::dwio::common::RowSet, const uint64_t*) [with T = char [12]; facebook::velox::vector_size_t = int; facebook::velox::dwio::common::RowSet = folly::Range<const int*>; uint64_t = long unsigned int]’
../.././velox/dwio/parquet/reader/TimestampColumnReader.h:41:47: required from here
../.././velox/dwio/common/SelectiveColumnReaderInternal.h:55:39: note: when instantiating default argument for call to ‘static facebook::velox::BufferPtr facebook::velox::AlignedBuffer::allocate(size_t, facebook::velox::memory::MemoryPool*, const std::optional<_Tp>&) [with T = char [12]; facebook::velox::BufferPtr = boost::intrusive_ptr<facebook::velox::Buffer>; size_t = long unsigned int]’
55 | values_ = AlignedBuffer::allocate<T>(
| ~~~~~~~~~~~~~~~~~~~~~~~~~~^
56 | numRows + (simd::kPadding / sizeof(T)), &memoryPool_);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For struct int96 {int val[3];};
way, I solved some compilation errors and ended up with an error of missing conversion from int64_t
to int96_t
at RLEv1.h. Because I create a columnar visitor in SelectiveIntegerColumnReader.h with type int96_t (the self-defined struct), while value
is of int64_t type.
case 12:
reinterpret_cast<Reader*>(this)->Reader::readWithVisitor(
rows,
ColumnVisitor<int96_t, TFilter, ExtractValues, isDense>(
*reinterpret_cast<TFilter*>(filter), this, rows, extractValues));
break;
Looks more work is needed here to support a self-defined int96 type, and some template functions for integer need to be changed.
Do you have any suggestion here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we consider using int128_t
as an alternative first? Because it helps simply the implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think using int128_t
will work here, the valueSize_
is different and you will end up reading different part of data and even read out of bound.
Maybe check #2822 as a reference for what need to be done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Yuhta Thank you. I see. The main problem here is __int128
is a built-in type in GCC while int96
is not there, so to use int96_t
requires more work. For example, the implicit conversions between int64_t
and int96_t
, and some operators like *
and +=
etc. are lacking. Therefore, a lot of compiling errors occur in the code path of IntegerColumnReader
.
Do we intend to implement a robust int96
type, or try other solution if there is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The actual underlying structure is not really a generic int96_t
but this:
struct __attribute__((__packed__)) Int96Timestamp {
int32_t days;
uint64_t nanos;
};
I would recommend you to make this type to work with IntegerColumnReader
, that would be the easiest way I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will continue to work on that. Convert this PR as draft for now, and will ping you for review when it's ready. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add some tests in https://github.com/facebookincubator/velox/blob/main/velox/dwio/parquet/tests/reader/E2EFilterTest.cpp
You probably need to make the writer to generate INT96 in writeToMemory
namespace facebook::velox::parquet { | ||
|
||
class TimestampColumnReader : public IntegerColumnReader { | ||
public: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just char[12]
(or a local using int96_t = char[12]
) would be good I think
@Yuhta I doubt if the Arrow Bridge supports int96 type. But worth checking. The alternative is to check in a file. |
@majetideepak Using a fixed file gives less coverage, but if the writer is not working then we have to do it this way for now. Either way we should make sure the result is correct with or without filters. |
@majetideepak @Yuhta Thanks for your review! Your comments are well received, and I'm working on them. |
f5c944e
to
a8dee34
Compare
@Yuhta I also tried that. Use |
velox/dwio/parquet/writer/Writer.cpp
Outdated
@@ -32,7 +32,9 @@ void Writer::write(const RowVectorPtr& data) { | |||
recordBatch->schema(), recordBatch->columns(), data->size()); | |||
if (!arrowWriter_) { | |||
stream_ = std::make_shared<DataBufferSink>(pool_); | |||
auto arrowProperties = ::parquet::ArrowWriterProperties::Builder().build(); | |||
auto arrowProperties = ::parquet::ArrowWriterProperties::Builder() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you pass in the ArrowWriterProperties
in the constructor and enable int96 in the corresponding unit tests only?
Timestamp(1197417600, 16076000000000)}); | ||
|
||
loadData( | ||
getExampleFilePath("timestamp-int96.parquet"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just use the writer to write the file in memory since you find a way to do it; also add some tests in https://github.com/facebookincubator/velox/blob/main/velox/dwio/parquet/tests/reader/E2EFilterTest.cpp for timestamp type
namespace facebook::velox::parquet { | ||
|
||
class TimestampColumnReader : public IntegerColumnReader { | ||
public: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think using int128_t
will work here, the valueSize_
is different and you will end up reading different part of data and even read out of bound.
Maybe check #2822 as a reference for what need to be done.
53d9408
to
1fe1694
Compare
hi @Yuhta, I spent more time on Below is the stack of current timestamp scan. facebook::velox::parquet::PageReader::prepareDictionary(facebook::velox::parquet::thrift::PageHeader const&) in ./velox_dwio_parquet_table_scan_test
1# facebook::velox::parquet::PageReader::seekToPage(long) in ./velox_dwio_parquet_table_scan_test
2# facebook::velox::parquet::PageReader::rowsForPage(facebook::velox::dwio::common::SelectiveColumnReader&, bool, bool, folly::Range<int const*>&, unsigned long const*&) in ./velox_dwio_parquet_table_scan_test
3# void facebook::velox::parquet::PageReader::readWithVisitor<facebook::velox::dwio::common::ColumnVisitor<__int128, facebook::velox::common::AlwaysTrue, facebook::velox::dwio::common::ExtractToReader<facebook::velox::dwio::common::SelectiveIntegerColumnReader>, true> >(facebook::velox::dwio::common::ColumnVisitor<__int128, facebook::velox::common::AlwaysTrue, facebook::velox::dwio::common::ExtractToReader<facebook::velox::dwio::common::SelectiveIntegerColumnReader>, true>&) in ./velox_dwio_parquet_table_scan_test Could you explain more about the possible risks? Thank you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think using int128_t will work here, the valueSize_ is different and you will end up reading different part of data and even read out of bound.
hi @Yuhta, I spent more time on
Int96Timestamp
type support but it is not easy to make it work through. We also made more tests on theint128_t
workaround, and found it could work for a pure scan. As posted in this PR, int96 in Parquet is converted to Velox Timestamp type (which is of 16-byte length) in PageReader (see link), and onlynumValues * sizeof(Int96Timestamp)
bytes of data was read in PageReader.Below is the stack of current timestamp scan.
facebook::velox::parquet::PageReader::prepareDictionary(facebook::velox::parquet::thrift::PageHeader const&) in ./velox_dwio_parquet_table_scan_test 1# facebook::velox::parquet::PageReader::seekToPage(long) in ./velox_dwio_parquet_table_scan_test 2# facebook::velox::parquet::PageReader::rowsForPage(facebook::velox::dwio::common::SelectiveColumnReader&, bool, bool, folly::Range<int const*>&, unsigned long const*&) in ./velox_dwio_parquet_table_scan_test 3# void facebook::velox::parquet::PageReader::readWithVisitor<facebook::velox::dwio::common::ColumnVisitor<__int128, facebook::velox::common::AlwaysTrue, facebook::velox::dwio::common::ExtractToReader<facebook::velox::dwio::common::SelectiveIntegerColumnReader>, true> >(facebook::velox::dwio::common::ColumnVisitor<__int128, facebook::velox::common::AlwaysTrue, facebook::velox::dwio::common::ExtractToReader<facebook::velox::dwio::common::SelectiveIntegerColumnReader>, true>&) in ./velox_dwio_parquet_table_scan_testCould you explain more about the possible risks? Thank you.
So the assumption here is it is always dictionary-encoded? If this assumption holds all the time, we can probably go this way. It's only a problem when we want to apply a filter on the column of flat values.
Make sure to beef up your E2E filter tests with filters on some primary keys (int64 is fine), and also put timestamp in complex types (array, map, struct) in addition to top-level column.
@Yuhta Thanks for your reply.
Understood the gap here. I guess RLEV1 and Plain encoding are also possible because the column encoding can be set during Parquet write, but we only tested the Parquet generated with default configs.
Got it, will do. |
828d05d
to
3990c66
Compare
107c963
to
15d35cc
Compare
Summary: Arrow bridge supports different timestamp units, including second, milli, micro and nano. This PR adds `TimestampUnit` in `ArrowOptions` to support these units in the process of exportToArrow. For importFromArrow, the unit extracted from arrow schema is followed. By default, the conversion unit is nano, and in Gluten, micro is configured to align with Spark. #4680 (comment) Arrow Reference: https://github.com/apache/arrow/blob/main/cpp/src/arrow/c/bridge.cc#L402-L421. Pull Request resolved: #7625 Reviewed By: mbasmanova Differential Revision: D54852534 Pulled By: Yuhta fbshipit-source-id: 0494102fedc73f7068424bc09e972e7deb297a6e
dde7df4
to
1c78db0
Compare
Related Presto issue prestodb/presto#22605 |
false, | ||
filters, | ||
sampleRate); | ||
auto remainingFilter = hiveTableHandle_->remainingFilter(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious why do we need this? isFilterPushdownEnabled
should be ignored in Prestissimo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is just a workaround to get test work. Without this change, filter pushdown happens and below exception is observed.
C++ exception with description "Exception: VeloxUserError
Error Source: USER
Error Code: UNSUPPORTED
Reason: TimestampRange: [-292275055-05-16T16:47:04.000000000, 2000-09-12T22:36:28.999999999] no nulls: testInt128() is not supported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This flag will be removed in the future. Will it work if we just implement TimestampRange::testInt128
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I will take a look. Thanks for the suggestion.
auto& data = formatData_->as<ParquetData>(); | ||
// Use int128_t as a workaroud. Timestamp in Velox is of 16-byte length. | ||
prepareRead<int128_t>(offset, rows, nullptr); | ||
readCommon<IntegerColumnReader>(rows); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rui-mo, I note an extra template arg kEncodingHasNulls is added by 0c4dad1.
I just use true for it in a downstream branch.
oap-project@a4039c2#diff-ae87451c1577f3b47d2863187de8bf30c7351484d39537419016487cc7b2f71cR44
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated, thanks.
cef49e2
to
8a3df69
Compare
Support timestamp reader for Parquet file format to read from INT96 timestamps.