Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added more FAQs #1567

Open
wants to merge 19 commits into
base: master
Choose a base branch
from
Open

added more FAQs #1567

wants to merge 19 commits into from

Conversation

ms041223
Copy link
Collaborator

@ms041223 ms041223 commented May 10, 2024

I've added a few new points to the FAQs. Please find the link below; I would appreciate any feedback.

https://1a1ca458.arcticdb-docs.pages.dev/dev/faq/

vasil-pashov and others added 3 commits May 10, 2024 22:53
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
Closes: #1226
#### What does this implement or fix?
* Remove the environment variable which was used to enable the
consolidation phase
* Make our custom block manager to cast blocks for empty columns to
float64. To be removed after empty types become enabled.
#### Any other comments?

#### Checklist

<details>
  <summary>
   Checklist for code changes...
  </summary>
 
- [ ] Have you updated the relevant docstrings, documentation and
copyright notice?
- [ ] Is this contribution tested against [all ArcticDB's
features](../docs/mkdocs/docs/technical/contributing.md)?
- [ ] Do all exceptions introduced raise appropriate [error
messages](https://docs.arcticdb.io/error_messages/)?
 - [ ] Are API changes highlighted in the PR description?
- [ ] Is the PR labelled as enhancement or bug so it appears in
autogenerated release notes?
</details>

<!--
Thanks for contributing a Pull Request to ArcticDB! Please ensure you
have taken a look at:
- ArcticDB's Code of Conduct:
https://github.com/man-group/ArcticDB/blob/master/CODE_OF_CONDUCT.md
- ArcticDB's Contribution Licensing:
https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/contributing.md#contribution-licensing
-->

---------

Co-authored-by: Vasil Pashov <vasil.pashov@man.com>
#### Reference Issues/PRs

Fixes the 10th item of the folly replacement plan, issue #1412.

#### What does this implement or fix?

This removes the single use of `folly/ThreadCachedInt`. It is replaced
by a partial vendoring of the `folly` code plus use of
`boost::thread_specific_ptr`.

`ThreadCachedInt` is used to count the number of freed memory blocks. It
is (presumably) not just implemented as an atomic integer count as
thread locking would be too slow, so instead each thread has its own
count and when a single thread's count exceeds some threshold it is
added to the overall count. The original `folly` implementation has two
ways of reading the count which are slow (fully accurate) and fast (not
fully accurate). ArcticDB only uses the fast option, so the
implementation is much simpler than `folly`'s, requiring fewer
`atomic`s.

New class `ThreadCachedInt` in `thread_cached_int.hpp` is derived from
https://github.com/facebook/folly/blob/main/folly/ThreadCachedInt.h but
containing only the subset of functionality required. Instead of using
`folly'`s own `ThreadLocalPtr` this uses `boost::thread_specific_ptr`.
The `folly` source code claims that their implementation is 4x faster
than the `boost` one:


https://github.com/facebook/folly/blob/dbc9e565f54eabb40ad6600656ad9dea919f51c0/folly/ThreadLocal.h#L18-L20

but that claim dates from 12 years ago and this solution is simpler than
theirs. This does need to be benchmarked against `master` to confirm
that it is not measurably slower.

#### Any other comments?

The only place this is ultimately used is to control when `malloc_trim`
is called here


https://github.com/man-group/ArcticDB/blob/23b3b943b7c4a10889a563a063b2a616fe00d9fa/cpp/arcticdb/util/allocator.cpp#L286-L288

to release memory back to the system. This only occurs on Linux. Other
OSes could have all of this code removed but this would be a bigger
change with many `#ifdef` blocks, etc.

---------

Signed-off-by: Ian Thomas <ianthomas23@gmail.com>
The SlabAlloc tests were super slow on the build servers.
Investigation showed this was due to poor performance of slab allocator with many threads. More details:
https://manwiki.maninvestments.com/display/AlphaTech/Slab+Allocator+poor+multi-threaded+performance+with+aggressive+allocations

This commit does the following:
 - Place an upper limit on num_threads of 8 (which speeds up drastically the tests on the build servers)
 - Improve multi-threaded performance by ~20% by removing unneded atomic::load() calls (because atomic::compare_exchange_strong already does this for us)
 - Adds compiled away logging for log contention. Can be enabled with #define LOG_SLAB_ALLOC_INTERNALS.
docs/mkdocs/docs/faq.md Outdated Show resolved Hide resolved
docs/mkdocs/docs/faq.md Outdated Show resolved Hide resolved
docs/mkdocs/docs/faq.md Outdated Show resolved Hide resolved
poodlewars and others added 13 commits May 15, 2024 13:47
)

#### What does this implement or fix?

We should only write the version ref key once when we write with
`prune_previous_versions=True`. Currently we are writing it twice - once
after we write the tombstone all and once when we write the new version.
This means that there is a period of time where the symbol is
unreadable.

This was fixed a while ago with PR #1104 but regressed with PR #1355.
#### What does this implement or fix?

`SymbolDescription.last_update_time` is a
`pandas._libs.tslibs.timestamps.Timestamp` which is a
`datetime.datetime` subtype, not a `numpy.datetime64`.

#### Any other comments?

Should this be `Union[datetime.datetime, numpy.datetime64]`?

#### Checklist

<details>
  <summary>
   Checklist for code changes...
  </summary>
 
- [ ] Have you updated the relevant docstrings, documentation and
copyright notice?
- [ ] Is this contribution tested against [all ArcticDB's
features](../docs/mkdocs/docs/technical/contributing.md)?
- [ ] Do all exceptions introduced raise appropriate [error
messages](https://docs.arcticdb.io/error_messages/)?
 - [ ] Are API changes highlighted in the PR description?
- [ ] Is the PR labelled as enhancement or bug so it appears in
autogenerated release notes?
</details>

<!--
Thanks for contributing a Pull Request to ArcticDB! Please ensure you
have taken a look at:
- ArcticDB's Code of Conduct:
https://github.com/man-group/ArcticDB/blob/master/CODE_OF_CONDUCT.md
- ArcticDB's Contribution Licensing:
https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/contributing.md#contribution-licensing
-->
…ilder (#1557)

#### Reference Issues/PRs
Fixes #1148.
previously if a type was mismatched in the QueryBuilder as follows
```python
df1 = pd.DataFrame({"col1": [1, 2, 3], "col2": [2, 3, 4], "col3": [4, 5, 6], "col_str": ["1", "2", "3"]})
sym = "symbol"
lib.write(sym, df1)
q = QueryBuilder()
q = q[q["col1" + 1]  == "3"]
lib.read(sym, query_builder=q)
```
it would show an unclear message with `UTF_DYNAMIC64` type shown for
strings
```
arcticdb_ext.exceptions.InternalException: E_ASSERTION_FAILURE Cannot compare TD<type=UTF_DYNAMIC64, dim=0> to TD<type=UINT32, dim=0> (possible categorical?)
```
Now the error is much clearer where column names are generated according
to the query and STRING type is shown for strings
```
arcticdb_ext.exceptions.UserInputException: E_INVALID_USER_ARGUMENT Invalid comparison (col1 + 1) (TD<type=INT64, dim=0>) == "3" (TD<type=STRING, dim=0>)
```
For a more complex query like `q = q[1 + q["col1"] * q["col2"] -
q["col3"] == q["col_str"]]` it will show column name `((1 + (col1 *
col2)) - col3)` in the error message which allows the user to better
understand the error.

#### What does this implement or fix?

#### Any other comments?

#### Checklist

<details>
  <summary>
   Checklist for code changes...
  </summary>
 
- [x] Have you updated the relevant docstrings, documentation and
copyright notice?
- [ ] Is this contribution tested against [all ArcticDB's
features](../docs/mkdocs/docs/technical/contributing.md)?
- [ ] Do all exceptions introduced raise appropriate [error
messages](https://docs.arcticdb.io/error_messages/)?
 - [ ] Are API changes highlighted in the PR description?
- [x] Is the PR labelled as enhancement or bug so it appears in
autogenerated release notes?
</details>

<!--
Thanks for contributing a Pull Request to ArcticDB! Please ensure you
have taken a look at:
- ArcticDB's Code of Conduct:
https://github.com/man-group/ArcticDB/blob/master/CODE_OF_CONDUCT.md
- ArcticDB's Contribution Licensing:
https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/contributing.md#contribution-licensing
-->
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->

#### What does this implement or fix?
This PR is aiming to provide a way to get the name/identifier of a given
store.
[This is what the
identifier](https://github.com/man-group/ArcticDB/blob/f208433e44d17f03011a96db64ef25dba3fccaa4/cpp/arcticdb/storage/s3/s3_storage.cpp#L57)
looks like now -
s3_storage-us-east-1/test_bucket_1/local/lib-kOGPh-3-True-target-1
- this applies only to s3 really -
{storage_type}-{region}/{bucket}/{perfix or lib_name}
- for the other stores - {storage_type}-{perfix or lib_name}
Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>
Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>
@ms041223
Copy link
Collaborator Author

new FAQs with feedback updates

https://1a1ca458.arcticdb-docs.pages.dev/dev/faq/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet