[Documentation] How to actually use BitField / BitArray in schema definitions? #30

Open
vinerz opened this issue Jul 7, 2022 · 5 comments

vinerz commented Jul 7, 2022

Hello! First of all, thank you for the marvelous (and quite unique) way of handling binary data in JavaScript! This library is paving the way for an intra-worker communication layer I've been developing. Copying/serializing objects has been an absurd overhead given the application's communication throughput, so I started looking into SharedArrayBuffers to transfer data directly.

I already got the gist of BitArrays by themselves, but I'm really struggling to understand what view.create<T> expects when the schema uses one of these special data handlers.

I wrote some basic pseudo-code to explain what I am trying to achieve.
The ??? comments mark the parts I am specifically clueless about.

interface User {
  username: string
  attributes: number[] // ???
}

const UserView = view.create<User>({
  $id: 'User',
  type: 'object',
  
  properties: {
    username: {
      type: 'string',
      maxLength: 10,
    },

    attributes: {
      type: 'array',
      btype: 'uint32', // ???
      maxLength: 4,
    },
  },
})

const user = UserView.from({
   username: 'vinerz',
   attributes: [0, 1, 0, 1], // ???
 })

// work with the user data
user.getView('attributes').setBit(2, 1) 

// transfer control to worker
worker.postMessage({ user: user.buffer }, [user.buffer])

Could you please shed some light here?
Is it possible or am I thinking in a wrong way?

Author

vinerz commented Jul 7, 2022

I thought of doing something like this:

// ...

const UserView = view.create<User>({
  // ...

  properties: {
    // ...

    attributes: {
      type: 'array',
      btype: 'uint32', // ???
      maxLength: BitArray.getLength(4),
    },
  },
})

const attributes = new BitArray(4)
attributes.setBit(1, 1)
attributes.setBit(3, 1)

const user = UserView.from({
   username: 'vinerz',
   attributes: [...attributes.values()],
 })

const { buffer, byteOffset, byteLength } = user.getView('attributes')
const attrManager = new BitArray(buffer, byteOffset, byteLength)
attrManager.setBit(2, 1)

But it seems quite verbose and involves some undesirable copying / destructuring.

Owner

zandaqo commented Jul 7, 2022

Hi! That's an interesting use case, and one well-suited for view structures I believe.

> I already got the gist of BitArrays by themselves, but I'm really struggling to understand the view.create<T> expectations when the schema uses some kind of these special data handlers.

Bit structures (BitField, BitArray, etc.) are not view structures per se, but since both work with numbers and array buffers, one can easily be cast into the other. Your second example with BitArray is almost right; you just don't need to destructure it for encoding:

... 
    attributes: {
      type: 'array',
      items: { type: 'number', btype: 'uint32' },
      maxLength: BitArray.getLength(4),
    },
...

const attributes = new BitArray(4)
attributes.setBit(1, 1)
attributes.setBit(3, 1)

const user = UserView.from({
   username: 'vinerz',
   attributes,
 })

const { buffer, byteOffset, byteLength } = user.getView('attributes')
const attrManager = new BitArray(buffer, byteOffset, byteLength)
attrManager.setBit(2, 1)

This is because BitArray stores bits in a Uint32Array (4 bytes or 32 bits per value), and our view for attributes also holds 4-byte values. Your casting example is correct: we simply instantiate a BitArray with the ArrayBuffer we got from the view. This is far less costly than creating a new ArrayBuffer and should not be a cause for concern. In fact, with this casting all edits on the BitArray (like attrManager.setBit(2, 1)) are reflected in the view, since both use the same ArrayBuffer, and you don't need to re-encode the BitArray.
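
The shared-buffer behaviour can be illustrated without the library at all. Below is a minimal sketch using only standard typed arrays; `demoSharedBuffer`, `words`, and `bits` are names made up for this example, with `words` standing in for the view's attributes field and `bits` for the BitArray cast over it. The bit layout used here is an assumption and may differ from what BitArray does internally.

```typescript
// Sketch only: plain typed arrays stand in for the view and the BitArray.
function demoSharedBuffer(): { seenThroughView: number | undefined; sameBuffer: boolean } {
  // Stand-in for the buffer backing the encoded view (16 bytes = four uint32 values).
  const buffer = new ArrayBuffer(16);

  // Two typed arrays created over the same buffer: no bytes are copied.
  const words = new Uint32Array(buffer, 0, 4); // plays the role of the view's field
  const bits = new Uint32Array(buffer, 0, 4); // plays the role of the BitArray cast

  // Write a value with bit 2 set through one of the arrays...
  bits[0] = 1 << 2;

  // ...and read it back through the other; both alias the same memory.
  return { seenThroughView: words[0], sameBuffer: words.buffer === bits.buffer };
}

const sharedDemo = demoSharedBuffer();
console.log(sharedDemo.seenThroughView, sharedDemo.sameBuffer);
```

Since neither array owns a copy, there is nothing to re-encode after the write, which is the same reason the cast in the example above is cheap.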

The casting takes a few extra lines, and there is an option of creating a special binary type, but that is indeed not yet documented.

As an aside, BitArray is only useful when the number of bits is above BitField's limit of 31, that is, when we have to store them in multiple integers. Personally, I use BitFields more often and store them as simple integers in views, casting into BitFields when necessary.
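
One likely reason for a 31-bit ceiling on a single integer is that JavaScript bitwise operators work on signed 32-bit values, so the 32nd bit acts as the sign bit. The sketch below shows that, along with how a longer bit set spreads over several 32-bit words. The helpers `wordsNeeded` and `setBitAt` are hypothetical and written only for this illustration; they are not the library's implementation, and BitArray's internal bit ordering may differ.

```typescript
// Bitwise operators in JavaScript operate on signed 32-bit integers,
// so touching the 32nd bit flips the sign of the result.
const bit30Flag = 1 << 30; // 1073741824, still positive
const bit31Flag = 1 << 31; // -2147483648, sign bit set

// Hypothetical helper: number of 32-bit words needed to hold `bitCount` bits.
function wordsNeeded(bitCount: number): number {
  return Math.ceil(bitCount / 32);
}

// Hypothetical helper: set or clear bit `index` in a Uint32Array-backed bit set.
function setBitAt(store: Uint32Array, index: number, value: 0 | 1): void {
  const word = index >>> 5; // which 32-bit word holds the bit
  const mask = 1 << (index & 31); // position of the bit inside that word
  const current = store[word] ?? 0;
  store[word] = value ? current | mask : current & ~mask;
}

// Roughly 80 bits need three 32-bit words.
const bitStore = new Uint32Array(wordsNeeded(80));
setBitAt(bitStore, 70, 1); // lands in word 2, bit 6
console.log(bit30Flag, bit31Flag, bitStore.length, bitStore[2]);
```

A bit set of about 80 bits therefore maps onto an array of three uint32 items in the view schema.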

Author

vinerz commented Jul 8, 2022

Hey @zandaqo, thanks for such a detailed explanation!

I finally got the concept: We use primitive/abstract views as the document definition and then we just cast them with the desired util as needed. I believe you've chosen this behaviour to not add any unnecessary overhead upon instantiation / serialization, right? As I came from Sequelize / Mongoose worlds, I was thinking too high level about the schema.

Could you give me some tips about this special binary type?

Yeah, I'll have to use BitArray because the attributes are actually ~80 bits long.

Author

vinerz commented Jul 8, 2022

Update: In order for it to fully work, the User interface needs a tiny tweak:

interface User {
  username: string
  attributes: BitArray | number[]
}

Otherwise TypeScript will complain about a type mismatch between BitArray and number[].

Owner

zandaqo commented Jul 11, 2022

Hi, @vinerz, I have added some brief explanation and an example of creating a custom view type using BitArray. It's a bit crude, but I reason the code might be more helpful than descriptions alone. As you can notice, the view code itself is almost the same as BinaryView but for BitArrays instead of Uint8Arrays.

> I finally got the concept: We use primitive/abstract views as the document definition and then we just cast them with the desired util as needed.

Yes, that's the general idea. Although, we can add custom view types for often used structures to reduce the boilerplate.
