Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for dictionary encoded INT96 timestamp in parquet files #4680

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

rui-mo
Copy link
Contributor

@rui-mo rui-mo commented Apr 20, 2023

Support timestamp reader for Parquet file format to read from INT96 timestamps.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 20, 2023
@netlify
Copy link

netlify bot commented Apr 20, 2023

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 8a3df69
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/6646c024b95934000801803a

Copy link
Collaborator

@majetideepak majetideepak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rui-mo the implementation looks good to me. Left a couple of comments. Can you add a Parquet File with timestamp type to velox/dwio/parquet/tests/examples and add a test?

velox/dwio/parquet/reader/TimestampColumnReader.h Outdated Show resolved Hide resolved
namespace facebook::velox::parquet {

class TimestampColumnReader : public IntegerColumnReader {
public:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we create a struct int96 {int val[3];}; in this class and use that instead of allocation int128_t?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just char[12] (or a local using int96_t = char[12]) would be good I think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Yuhta @majetideepak, I tried above two ways, but I have to say a lot of compilation errors were encountered.

For char[12] way, below error was found during compilation.

../.././velox/dwio/common/SelectiveColumnReaderInternal.h:55:39:   required from ‘void facebook::velox::dwio::common::SelectiveColumnReader::ensureValuesCapacity(facebook::velox::vector_size_t) [with T = char [12]; facebook::velox::vector_size_t = int]’
../.././velox/dwio/common/SelectiveColumnReaderInternal.h:100:3:   required from ‘void facebook::velox::dwio::common::SelectiveColumnReader::prepareRead(facebook::velox::vector_size_t, facebook::velox::dwio::common::RowSet, const uint64_t*) [with T = char [12]; facebook::velox::vector_size_t = int; facebook::velox::dwio::common::RowSet = folly::Range<const int*>; uint64_t = long unsigned int]’
../.././velox/dwio/parquet/reader/TimestampColumnReader.h:41:47:   required from here
/usr/include/c++/9/optional:952:2: error: function returning an array
  952 |  value_or(_Up&& __u) const&
      |  ^~~~~~~~
/usr/include/c++/9/optional:963:2: error: function returning an array
  963 |  value_or(_Up&& __u) &&
      |  ^~~~~~~~
In file included from ../../velox/dwio/parquet/reader/ParquetColumnReader.cpp:22:
../.././velox/dwio/common/SelectiveColumnReaderInternal.h: In instantiation of ‘void facebook::velox::dwio::common::SelectiveColumnReader::ensureValuesCapacity(facebook::velox::vector_size_t) [with T = char [12]; facebook::velox::vector_size_t = int]’:
../.././velox/dwio/common/SelectiveColumnReaderInternal.h:100:3:   required from ‘void facebook::velox::dwio::common::SelectiveColumnReader::prepareRead(facebook::velox::vector_size_t, facebook::velox::dwio::common::RowSet, const uint64_t*) [with T = char [12]; facebook::velox::vector_size_t = int; facebook::velox::dwio::common::RowSet = folly::Range<const int*>; uint64_t = long unsigned int]’
../.././velox/dwio/parquet/reader/TimestampColumnReader.h:41:47:   required from here
../.././velox/dwio/common/SelectiveColumnReaderInternal.h:55:39: note:   when instantiating default argument for call to ‘static facebook::velox::BufferPtr facebook::velox::AlignedBuffer::allocate(size_t, facebook::velox::memory::MemoryPool*, const std::optional<_Tp>&) [with T = char [12]; facebook::velox::BufferPtr = boost::intrusive_ptr<facebook::velox::Buffer>; size_t = long unsigned int]’
   55 |   values_ = AlignedBuffer::allocate<T>(
      |             ~~~~~~~~~~~~~~~~~~~~~~~~~~^
   56 |       numRows + (simd::kPadding / sizeof(T)), &memoryPool_);
      |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For struct int96 {int val[3];}; way, I solved some compilation errors and ended up with an error of missing conversion from int64_t to int96_t at RLEv1.h. Because I create a columnar visitor in SelectiveIntegerColumnReader.h with type int96_t (the self-defined struct), while value is of int64_t type.

    case 12:
      reinterpret_cast<Reader*>(this)->Reader::readWithVisitor(
          rows,
          ColumnVisitor<int96_t, TFilter, ExtractValues, isDense>(
              *reinterpret_cast<TFilter*>(filter), this, rows, extractValues));
      break;

Looks more work is needed here to support a self-defined int96 type, and some template functions for integer need to be changed.
Do you have any suggestion here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we consider using int128_t as an alternative first? Because it helps simply the implementation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think using int128_t will work here, the valueSize_ is different and you will end up reading different part of data and even read out of bound.

Maybe check #2822 as a reference for what need to be done.

Copy link
Contributor Author

@rui-mo rui-mo May 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Yuhta Thank you. I see. The main problem here is __int128 is a built-in type in GCC while int96 is not there, so to use int96_t requires more work. For example, the implicit conversions between int64_t and int96_t, and some operators like * and += etc. are lacking. Therefore, a lot of compiling errors occur in the code path of IntegerColumnReader.
Do we intend to implement a robust int96 type, or try other solution if there is?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The actual underlying structure is not really a generic int96_t but this:

struct __attribute__((__packed__)) Int96Timestamp {
  int32_t days;
  uint64_t nanos;
};

I would recommend you to make this type to work with IntegerColumnReader, that would be the easiest way I guess.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will continue to work on that. Convert this PR as draft for now, and will ping you for review when it's ready. Thanks.

Copy link
Contributor

@Yuhta Yuhta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some tests in https://github.com/facebookincubator/velox/blob/main/velox/dwio/parquet/tests/reader/E2EFilterTest.cpp

You probably need to make the writer to generate INT96 in writeToMemory

namespace facebook::velox::parquet {

class TimestampColumnReader : public IntegerColumnReader {
public:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just char[12] (or a local using int96_t = char[12]) would be good I think

velox/dwio/parquet/reader/TimestampColumnReader.h Outdated Show resolved Hide resolved
@majetideepak
Copy link
Collaborator

majetideepak commented Apr 21, 2023

You probably need to make the writer to generate INT96 in writeToMemory

@Yuhta I doubt if the Arrow Bridge supports int96 type. But worth checking. The alternative is to check in a file.
Arrow Bridge has a similar issue with Parquet Decimal types backed by int64.

@Yuhta
Copy link
Contributor

Yuhta commented Apr 21, 2023

@majetideepak Using a fixed file gives less coverage, but if the writer is not working then we have to do it this way for now. Either way we should make sure the result is correct with or without filters.

@rui-mo
Copy link
Contributor Author

rui-mo commented Apr 23, 2023

@majetideepak @Yuhta Thanks for your review! Your comments are well received, and I'm working on them.

@rui-mo rui-mo force-pushed the wip_ts_reader branch 3 times, most recently from f5c944e to a8dee34 Compare April 28, 2023 06:15
@rui-mo
Copy link
Contributor Author

rui-mo commented Apr 28, 2023

Please add some tests in https://github.com/facebookincubator/velox/blob/main/velox/dwio/parquet/tests/reader/E2EFilterTest.cpp

You probably need to make the writer to generate INT96 in writeToMemory

@Yuhta I also tried that. Use enable_deprecated_int96_timestamps can make the arrow writer generate INT96. Since int128_t is used in Timestamp reader for now, the decoder calls readInt128() but little endian is not support currently (see IntDecoder.h). I will continue to work on this after the type issue is decided.

velox/type/Timestamp.h Outdated Show resolved Hide resolved
@@ -32,7 +32,9 @@ void Writer::write(const RowVectorPtr& data) {
recordBatch->schema(), recordBatch->columns(), data->size());
if (!arrowWriter_) {
stream_ = std::make_shared<DataBufferSink>(pool_);
auto arrowProperties = ::parquet::ArrowWriterProperties::Builder().build();
auto arrowProperties = ::parquet::ArrowWriterProperties::Builder()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you pass in the ArrowWriterProperties in the constructor and enable int96 in the corresponding unit tests only?

Timestamp(1197417600, 16076000000000)});

loadData(
getExampleFilePath("timestamp-int96.parquet"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just use the writer to write the file in memory since you find a way to do it; also add some tests in https://github.com/facebookincubator/velox/blob/main/velox/dwio/parquet/tests/reader/E2EFilterTest.cpp for timestamp type

velox/dwio/parquet/reader/TimestampColumnReader.h Outdated Show resolved Hide resolved
namespace facebook::velox::parquet {

class TimestampColumnReader : public IntegerColumnReader {
public:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think using int128_t will work here, the valueSize_ is different and you will end up reading different part of data and even read out of bound.

Maybe check #2822 as a reference for what need to be done.

@rui-mo rui-mo marked this pull request as draft May 8, 2023 05:35
@rui-mo rui-mo force-pushed the wip_ts_reader branch 6 times, most recently from 53d9408 to 1fe1694 Compare May 18, 2023 07:03
@rui-mo
Copy link
Contributor Author

rui-mo commented May 18, 2023

I don't think using int128_t will work here, the valueSize_ is different and you will end up reading different part of data and even read out of bound.

hi @Yuhta, I spent more time on Int96Timestamp type support but it is not easy to make it work through.
We also made more tests on the int128_t workaround, and found it could work for a pure scan. As posted in this PR, int96 in Parquet is converted to Velox Timestamp type (which is of 16-byte length) in PageReader (see link), and only numValues * sizeof(Int96Timestamp) bytes of data was read in PageReader.

Below is the stack of current timestamp scan.

facebook::velox::parquet::PageReader::prepareDictionary(facebook::velox::parquet::thrift::PageHeader const&) in ./velox_dwio_parquet_table_scan_test
 1# facebook::velox::parquet::PageReader::seekToPage(long) in ./velox_dwio_parquet_table_scan_test
 2# facebook::velox::parquet::PageReader::rowsForPage(facebook::velox::dwio::common::SelectiveColumnReader&, bool, bool, folly::Range<int const*>&, unsigned long const*&) in ./velox_dwio_parquet_table_scan_test
 3# void facebook::velox::parquet::PageReader::readWithVisitor<facebook::velox::dwio::common::ColumnVisitor<__int128, facebook::velox::common::AlwaysTrue, facebook::velox::dwio::common::ExtractToReader<facebook::velox::dwio::common::SelectiveIntegerColumnReader>, true> >(facebook::velox::dwio::common::ColumnVisitor<__int128, facebook::velox::common::AlwaysTrue, facebook::velox::dwio::common::ExtractToReader<facebook::velox::dwio::common::SelectiveIntegerColumnReader>, true>&) in ./velox_dwio_parquet_table_scan_test

Could you explain more about the possible risks? Thank you.

@Yuhta Yuhta self-requested a review May 18, 2023 15:25
Copy link
Contributor

@Yuhta Yuhta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think using int128_t will work here, the valueSize_ is different and you will end up reading different part of data and even read out of bound.

hi @Yuhta, I spent more time on Int96Timestamp type support but it is not easy to make it work through. We also made more tests on the int128_t workaround, and found it could work for a pure scan. As posted in this PR, int96 in Parquet is converted to Velox Timestamp type (which is of 16-byte length) in PageReader (see link), and only numValues * sizeof(Int96Timestamp) bytes of data was read in PageReader.

Below is the stack of current timestamp scan.

facebook::velox::parquet::PageReader::prepareDictionary(facebook::velox::parquet::thrift::PageHeader const&) in ./velox_dwio_parquet_table_scan_test
 1# facebook::velox::parquet::PageReader::seekToPage(long) in ./velox_dwio_parquet_table_scan_test
 2# facebook::velox::parquet::PageReader::rowsForPage(facebook::velox::dwio::common::SelectiveColumnReader&, bool, bool, folly::Range<int const*>&, unsigned long const*&) in ./velox_dwio_parquet_table_scan_test
 3# void facebook::velox::parquet::PageReader::readWithVisitor<facebook::velox::dwio::common::ColumnVisitor<__int128, facebook::velox::common::AlwaysTrue, facebook::velox::dwio::common::ExtractToReader<facebook::velox::dwio::common::SelectiveIntegerColumnReader>, true> >(facebook::velox::dwio::common::ColumnVisitor<__int128, facebook::velox::common::AlwaysTrue, facebook::velox::dwio::common::ExtractToReader<facebook::velox::dwio::common::SelectiveIntegerColumnReader>, true>&) in ./velox_dwio_parquet_table_scan_test

Could you explain more about the possible risks? Thank you.

So the assumption here is it is always dictionary-encoded? If this assumption holds all the time, we can probably go this way. It's only a problem when we want to apply a filter on the column of flat values.

Make sure to beef up your E2E filter tests with filters on some primary keys (int64 is fine), and also put timestamp in complex types (array, map, struct) in addition to top-level column.

@rui-mo
Copy link
Contributor Author

rui-mo commented May 23, 2023

@Yuhta Thanks for your reply.

So the assumption here is it is always dictionary-encoded? If this assumption holds all the time, we can probably go this way. It's only a problem when we want to apply a filter on the column of flat values.

Understood the gap here. I guess RLEV1 and Plain encoding are also possible because the column encoding can be set during Parquet write, but we only tested the Parquet generated with default configs.

Make sure to beef up your E2E filter tests with filters on some primary keys (int64 is fine), and also put timestamp in complex types (array, map, struct) in addition to top-level column.

Got it, will do.

velox/vector/arrow/Bridge.cpp Outdated Show resolved Hide resolved
velox/dwio/parquet/reader/PageReader.cpp Outdated Show resolved Hide resolved
facebook-github-bot pushed a commit that referenced this pull request Mar 19, 2024
Summary:
Arrow bridge supports different timestamp units, including second, milli, micro
and nano. This PR adds `TimestampUnit` in  `ArrowOptions` to support these
units in the process of exportToArrow. For importFromArrow, the unit extracted
from arrow schema is followed. By default, the conversion unit is nano, and in
Gluten, micro is configured to align with Spark.
#4680 (comment)
Arrow Reference: https://github.com/apache/arrow/blob/main/cpp/src/arrow/c/bridge.cc#L402-L421.

Pull Request resolved: #7625

Reviewed By: mbasmanova

Differential Revision: D54852534

Pulled By: Yuhta

fbshipit-source-id: 0494102fedc73f7068424bc09e972e7deb297a6e
@rui-mo rui-mo force-pushed the wip_ts_reader branch 3 times, most recently from dde7df4 to 1c78db0 Compare March 26, 2024 12:09
@yingsu00
Copy link
Collaborator

Related Presto issue prestodb/presto#22605

velox/type/Type.h Outdated Show resolved Hide resolved
false,
filters,
sampleRate);
auto remainingFilter = hiveTableHandle_->remainingFilter();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious why do we need this? isFilterPushdownEnabled should be ignored in Prestissimo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is just a workaround to get test work. Without this change, filter pushdown happens and below exception is observed.

C++ exception with description "Exception: VeloxUserError
Error Source: USER
Error Code: UNSUPPORTED
Reason: TimestampRange: [-292275055-05-16T16:47:04.000000000, 2000-09-12T22:36:28.999999999] no nulls: testInt128() is not supported.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This flag will be removed in the future. Will it work if we just implement TimestampRange::testInt128?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I will take a look. Thanks for the suggestion.

@rui-mo rui-mo changed the title Support timestamp reader for Parquet file format Support INT96 timestamp read for dictionary-encoded Parquet Apr 26, 2024
@rui-mo rui-mo changed the title Support INT96 timestamp read for dictionary-encoded Parquet Support for dictionary encoded INT96 timestamp in parquet files Apr 26, 2024
auto& data = formatData_->as<ParquetData>();
// Use int128_t as a workaroud. Timestamp in Velox is of 16-byte length.
prepareRead<int128_t>(offset, rows, nullptr);
readCommon<IntegerColumnReader>(rows);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rui-mo, I note an extra template arg kEncodingHasNulls is added by 0c4dad1.

I just use true for it in a downstream branch.
oap-project@a4039c2#diff-ae87451c1577f3b47d2863187de8bf30c7351484d39537419016487cc7b2f71cR44

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks.

@rui-mo rui-mo force-pushed the wip_ts_reader branch 2 times, most recently from cef49e2 to 8a3df69 Compare May 17, 2024 02:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants