Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pdq] Constant initialize DCT buffer #1518

Merged
merged 2 commits into from Jan 22, 2024
Merged

Conversation

ianwal
Copy link
Contributor

@ianwal ianwal commented Jan 21, 2024

Summary

Changed DCT buffer to be constant initialized.

The function which initializes and returns the DCT static buffer was updated to be reentrant in #1338 in pdqhashing.cpp. The change was to protect filling the static buffer from multiple threads by using std::call_once. However, this is unnecessary because it can just be constant initialized because the buffer is never modified after initialization.

In this PR, the buffer is changed to a std::array, so it can be returned from the function. The convenience pointers to its offsets are changed to references in fillFloatLumaFromRGB.

All of the changes are supported by C++11, which is the default for pdq.

Other misc changes:

  • Updated some variable names in the function for readability.

  • Added venv/ to .gitignore for dev convenience

References:

cppreference.com initialization article (I don't own the standard):

All non-local variables with static storage duration are initialized as part of program startup, before the execution of the main function begins (unless deferred, see below). All non-local variables with thread-local storage duration are initialized as part of thread launch, sequenced-before the execution of the thread function begins. For both of these classes of variables, initialization occurs in two distinct stages:

Anecdote: If compiled with C++20 (ex. https://godbolt.org/z/jejqn6M4Y), this function actually turns into a lookup table. I thought that was interesting.

Test Plan

Tested vpdq and pdq. All tests pass. No concurrency issues when ran with sanitizers (vpdq CI does this).

Copy link
Contributor

@Dcallies Dcallies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not surprised it turns into a lookup table - I had something very similar to this in a previous project that the just made into an array filled with constant numbers.

namespace {

// ----------------------------------------------------------------
// Christoph Zauner 'Implementation and Benchmarking of Perceptual
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For fun, I found this paper and tried to figure out what this was referring to, and I can't find anything on the page that seems to match.

@@ -2,8 +2,6 @@
// Copyright (c) Meta Platforms, Inc. and affiliates.
// ================================================================

#include <mutex>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any ideas for in the long term how we can ensure pdq and vpdq say in sync? Should we write a test that checks that file contents of the vpdq directory match the cpp one?

@Dcallies Dcallies merged commit 649c1f3 into facebook:main Jan 22, 2024
11 checks passed
@ianwal ianwal deleted the pdq-improve-array branch January 28, 2024 05:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants