Use a vector of integers instead of a string for faster computation #24

faheel · 2018-03-12T20:11:08Z

Instead of having the number's magnitude stored in a string, it would be much more efficient, in terms of both time and space, to have it stored in a vector of unsigned 64-bit integers (std::vector<uint64_t>). That is, store a number's digits in base 2⁶⁴.

The following table lists how the number's magnitude will be represented in base 2⁶⁴:

Magnitude	Representation	Comment
0	`{0}`	0⋅(2⁶⁴)⁰
1	`{1}`	1⋅(2⁶⁴)⁰
2	`{2}`	2⋅(2⁶⁴)⁰
2⁶⁴ - 1	`{18446744073709551615}`	18446744073709551615⋅(2⁶⁴)⁰
2⁶⁴	`{0, 1}`	1⋅(2⁶⁴)¹
2⁶⁴ + 1	`{1, 1}`	1⋅(2⁶⁴)⁰ + 1⋅(2⁶⁴)¹
2⁶⁴ + 2	`{2, 1}`	2⋅(2⁶⁴)⁰ + 1⋅(2⁶⁴)¹
2⁶⁵	`{0, 2}`	2⋅(2⁶⁴)¹
5⁵⁰	`{11670147153572883801, 4814824860968089}`	11670147153572883801⋅(2⁶⁴)⁰ + 4814824860968089⋅(2⁶⁴)¹
2¹²⁸	`{0, 0, 1}`	1⋅(2⁶⁴)²
2¹⁹²	`{0, 0, 0, 1}`	1⋅(2⁶⁴)³
2²⁵⁶	`{0, 0, 0, 0, 1}`	1⋅(2⁶⁴)⁴

The way how it could speed up performance is that, for example, when adding two BigInts, instead of adding their corresponding decimal digits together, their corresponding base 2⁶⁴ digits will be added, each digit in a single operation. Moreover, this addition (or any other arithmetic operation) of vectors can be vectorised by using SSE or AVX, which would further improve performance by a few folds.

Say, for instance, that you were adding two 5000 digit numbers. In base 10 they have 5000 digits, which means that it would take 5000 operations to add them together. Instead, if they were in base 2⁶⁴, they would have 260 digits, and would therefore require only 260 operations. Using AVX-512, vectors with up to 8 64-bit digits can be added together in a single operation. Which means that by using vectorized operations, two 5000 digit numbers could be added using just 33 operations.

The text was updated successfully, but these errors were encountered:

surajumang · 2018-03-30T16:13:56Z

The idea of using numbers in base 2^64 seems interesting. But the representation you have used is quite opposite of what is used commonly i.e Leftmost digit is the most significant one and rightmost digit is the least significant. Is there any reason behind this choice?

faheel · 2018-04-02T11:49:44Z

@surajumang Since all arithmetic operations (except division) are done from right to left, it's simpler to store the right-most digit as the first digit in an array since it will be processed first. But this is just to make the code simpler. When written formally, the most significant digits are always written first, followed by less significant digits to the right.

surajumang · 2018-04-03T17:15:04Z

I would like to work on this but think it will require a complete overhaul of the existing codebase like redefining all the arithmetic operations. What do you say?

faheel · 2018-04-03T17:49:53Z

@surajumang Yes, it would require a complete rewrite of all the arithmetic operators and a few other methods.

I'll be creating a new branch and will implement some basic stuff like converting from base 10 to base 2⁶⁴. Once that's done, I'll notify you and you can contribute to that branch.

faheel · 2018-04-09T14:42:51Z

@SingleJourney English please! Google Translate isn't perfect :)

OmriHab · 2018-04-09T14:47:49Z

Could you let me know when you've added that branch?
I would like to help contribute to the math functions using the new representation and anything else.

faheel · 2018-04-10T00:46:12Z

@surajumang @OmriHab I've added the branch base-2-to-the-64. The changes made so far on this branch can be tracked by comparing the branch with master.

OmriHab · 2018-04-10T09:15:44Z

Can i recommend using uint64_t from cstdint, instead of unsigned long long?
unsigned long long is promised to be at least 64 bit but not certainly exactly 64.
I assume that could create some problems if you use calculations that assume base 2^64.

OmriHab · 2018-04-10T10:49:04Z

Alternatively unsigned long long can be used and just take into consideration that it isn't necessarily 64 bit.
E.g the maximum would be numeric_limits<unsigned long long>::max() instead of UINT_64_MAX

surajumang · 2018-04-10T11:30:19Z

if you are implementing any feature then probably you should create a issue and then implement that particular feature like first create issue Implement sum of BigInts in vectorized implementation and then sumbit PR for that particular issue. This way different people can work on it

faheel · 2018-04-10T12:20:24Z

@OmriHab Good suggestion! I'll change the type to uint64_t.

ankur54 · 2018-10-03T18:18:55Z

Is the issue still open? If it is, I will start working on it.

DarkenedOrigins · 2020-07-03T02:27:03Z

Hi, I see the issue as open but seeing as its been over a year I would like to touch base and ask what I should work on as I would like to help.

faheel · 2020-07-13T08:53:24Z

@DarkenedOrigins If you're interested in working on it then I would be glad to help you by reviewing the code and helping with the logic.

You can start by creating a draft PR in which you can implement a utility function that takes a string representation of an integer and converts it to its base 2⁶⁴ integer vector representation.

If anyone else is interested in working on this too then they can contribute in the same PR.

DarkenedOrigins · 2020-07-14T00:12:55Z

Awesome, would you prefer that I create this in its own class called BigInt64 or i bolt it on to the current implementation

faheel · 2020-07-14T16:04:57Z

@DarkenedOrigins The existing class can be renamed to BigIntString and then you can create a new BigInt class.

DarkenedOrigins · 2020-07-15T21:57:31Z

inorder to be less destructive I made a new class called BigInt64. My implementation actually uses the string based BigInt internally. The only problem I had was difficulty in compiling since I have never used cmake before nor have I worked with such fragmented code where each implementation almost has its own file. Some help would be greatly appreciated.

faheel · 2020-07-16T16:52:22Z

inorder to be less destructive I made a new class called BigInt64

Good call! We can do the renaming later.

The only problem I had was difficulty in compiling [...]

Yeah, the documentation related to development is not very good at the moment. I'll add details regarding how and when to create a new header file, how to write tests for it, and how to compile and run the tests using CMake and CTest. If you need help on anything more specific, you can leave a comment in PR #58 that you opened and @mention me in it. I'll get back to you whenever I can.

szdytom · 2021-02-10T05:59:37Z

use std:bitset<>, long long calculations are much slower than 2 ints on some CPU.

faheel added enhancement New feature or request suggestion labels Mar 12, 2018

faheel added priority: high High priority and removed suggestion labels May 14, 2018

faheel pinned this issue Jan 2, 2019

perlun mentioned this issue Nov 17, 2023

Very slow compared to the GNU MP Bignum Library #69

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use a vector of integers instead of a string for faster computation #24

Use a vector of integers instead of a string for faster computation #24

faheel commented Mar 12, 2018 •

edited

surajumang commented Mar 30, 2018

faheel commented Apr 2, 2018

surajumang commented Apr 3, 2018

faheel commented Apr 3, 2018

faheel commented Apr 9, 2018

OmriHab commented Apr 9, 2018

faheel commented Apr 10, 2018

OmriHab commented Apr 10, 2018 •

edited

OmriHab commented Apr 10, 2018

surajumang commented Apr 10, 2018

faheel commented Apr 10, 2018

ankur54 commented Oct 3, 2018

DarkenedOrigins commented Jul 3, 2020

faheel commented Jul 13, 2020

DarkenedOrigins commented Jul 14, 2020

faheel commented Jul 14, 2020

DarkenedOrigins commented Jul 15, 2020

faheel commented Jul 16, 2020

szdytom commented Feb 10, 2021

Use a vector of integers instead of a string for faster computation #24

Use a vector of integers instead of a string for faster computation #24

Comments

faheel commented Mar 12, 2018 • edited

surajumang commented Mar 30, 2018

faheel commented Apr 2, 2018

surajumang commented Apr 3, 2018

faheel commented Apr 3, 2018

faheel commented Apr 9, 2018

OmriHab commented Apr 9, 2018

faheel commented Apr 10, 2018

OmriHab commented Apr 10, 2018 • edited

OmriHab commented Apr 10, 2018

surajumang commented Apr 10, 2018

faheel commented Apr 10, 2018

ankur54 commented Oct 3, 2018

DarkenedOrigins commented Jul 3, 2020

faheel commented Jul 13, 2020

DarkenedOrigins commented Jul 14, 2020

faheel commented Jul 14, 2020

DarkenedOrigins commented Jul 15, 2020

faheel commented Jul 16, 2020

szdytom commented Feb 10, 2021

faheel commented Mar 12, 2018 •

edited

OmriHab commented Apr 10, 2018 •

edited