Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace shifts with enforced byte dropping in special cases #35

Open
Pharap opened this issue May 5, 2018 · 1 comment
Open

Replace shifts with enforced byte dropping in special cases #35

Pharap opened this issue May 5, 2018 · 1 comment
Assignees
Labels
Feature Request This issue is a request for a particular feature Minor This change is a minor addition

Comments

@Pharap
Copy link
Owner

Pharap commented May 5, 2018

Aparently cases of >> 8, >> 16 and >>32 aren't being optimised to simply drop the bytes as I was originally expecting.

This isn't much of an issue for more powerful processors,
but it's a bit of an issue for AVR chips.

The fix will be to manually enforce the byte dropping.
I will look into the best way to do this.

If all else fails, type punning may be the only option.

@Pharap Pharap self-assigned this May 5, 2018
@Pharap Pharap added Feature Request This issue is a request for a particular feature Minor This change is a minor addition labels Mar 25, 2021
@Pharap
Copy link
Owner Author

Pharap commented Mar 29, 2021

I started investigating this today to see if I could work out a way to make it viable.

Out of the techniques I've tried so far, the only one that both produced the correct result and resulted in a saving was:

template< unsigned Integer >
SFixed<Integer, 8> operator *(const SFixed<Integer, 8> & left, const SFixed<Integer, 8> & right)
{
	static_assert(((Integer + 1) * 2 + 8 * 2) <= FIXED_POINTS_DETAILS::BitSize<intmax_t>::Value, "Multiplication cannot be performed, the intermediary type would be too large");	
	
	using InternalType = typename SFixed<Integer, 8>::InternalType;
	using PrecisionType = typename SFixed<Integer * 2, 8 * 2>::InternalType;

	const auto intermediary = (static_cast<PrecisionType>(left.getInternal()) * static_cast<PrecisionType>(right.getInternal()));

	char buffer[sizeof(PrecisionType)];
	
	for(size_t index = 0; index < sizeof(PrecisionType); ++index)
		buffer[index] = reinterpret_cast<const char *>(&intermediary)[index];

	const InternalType result = *reinterpret_cast<const InternalType *>(&buffer[1]);

	return SFixed<Integer, 8>::fromInternal(result);
}

But this only appeared to result in an 8 byte saving in progmem, and I'm not 100% sure if it's 'legal'.

I may look into creating an AVR assembly implementation, and if that creates a significant enough saving then I'll consider adding it in, but most likely only when enabled using conditional compilation via an opt-in macro.

Regardless of which result I come up with, I won't be making this a general feature, it will be specifically for AVR because:

  • Creating a version that works with big endian CPUs will be double the work
  • Testing it on other CPUs (most of which I don't own) would be another hurdle

And I'd make it an opt-in feature because any solution is probably going to require the constexpr to be dropped because Arduino only targets C++11 and it's unlikely that the operations required to pull this off without creeping into the land of undefined behaviour will be usable under C++11's constexpr restrictions (which outlaw both variables and loops).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature Request This issue is a request for a particular feature Minor This change is a minor addition
Projects
None yet
Development

No branches or pull requests

1 participant