Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify Methods for Array64 and Arrayb #8

Open
pinpox opened this issue Nov 30, 2016 · 5 comments
Open

Unify Methods for Array64 and Arrayb #8

pinpox opened this issue Nov 30, 2016 · 5 comments

Comments

@pinpox
Copy link
Contributor

pinpox commented Nov 30, 2016

There is a lot of redundancy in the Array64 and Arrayb methods. Quite a few of them are exactly the same, except for the input/return type.

Consider e.g. the Reshape() method of Array64 and Arrayb: The are identical, the only difference beeing in the declaration:

func (a *Arrayb) Reshape(shape ...int) *Arrayb {
...
}

vs.

func (a *Array64) Reshape(shape ...int) *Array64 {
...
}

This is not specific to this example and applies for a lot of code.

To simplify the code it would be possible to add a general n-Dimensional structure independent to the underlying type. The type of data (bool for Arrayb and float64 for Array64) would be encapsulated in a interface.

This would be a possible implementation:

type nDimObject struct {
	shape        []int
	strides      []int
	err          error
	debug, stack string
	data         []nDimElement
}

type nDimElement interface {
}

// Array64 is an n-dimensional array of float64 data
type Array64 struct {
	nDimObject
}

// Arrayb is an n-dimensional array of boolean values
type Arrayb struct {
	nDimObject
}

Methods that are identical for both Array types could be implemented on nDimObject that way.
I will create a pull request on a separate branch. It sure removes a lot of code and simplifies it (in my opinion) a lot.

I understand this is a major change to the structure of the project and you me want to go in a different direction. Let me know what you think of the idea!

P.S. this approach would make it a lot easier to implement new types of array (maybe for ints?) in the future if needed.

@pinpox
Copy link
Contributor Author

pinpox commented Dec 3, 2016

Hi @Kunde21,

I started to implement this but am having problems to get everything to work and the tests to pass. I will submit a pull request, maybe you can help me out an request the appropiate changes to get it to work.

See pullrequest #9 for everything else.

@Kunde21
Copy link
Owner

Kunde21 commented Dec 6, 2016

I like the idea of combining the ndim meta-data. Shape, error, and the like will be the same no matter the underlying type.

The use of interface{} types is an absolute no-go, though, for a few reasons. The most important is that there's a big performance penalty for type-assertion and the runtime bugs that come with sacrificing native types. Additionally, we would need to completely sacrifice the 2x-4x performance gains of the assembly code, because []interface{} is not the same memory layout as []float64. This library is intended for lots of number-crunching, so any performance sacrifices need to be more than recouped in other gains. Changing to []interface{} data buffers doesn't come with any real gains, much less enough to justify performance losses.

Rather than trying to go all-or-nothing on combining, I'd suggest we look at combining the meta-data, while maintaining type-specific data buffers:

type nDimArray struct {
	shape        []int
	strides      []int
	err          error
	debug, stack string
}

// Array64 is an n-dimensional array of float64 data
type Array64 struct {
	nDimArray
	data         []float64
}

// Arrayb is an n-dimensional array of boolean values
type Arrayb struct {
	nDimArray
	data         []bool
}

With this design, we can completely hoist Shape, Reshape, Flatten, and all error handling into nDimArray. It could also provide helper functions for methods like C and Resize. At the same time, we can keep all the performance and type-safety of native types in the respective data slices.

@pinpox
Copy link
Contributor Author

pinpox commented Dec 9, 2016

Thanks for the input, I didn't know that!
I will create a new branch and start over again with your design.

EDIT: moved the rest of this comment to new issue

This was referenced Dec 9, 2016
@pinpox
Copy link
Contributor Author

pinpox commented Dec 9, 2016

I submitted a pull request that just implements the new layout and makes fixes the all methods to use it.
There is no generalization of methods yet.

How are the methods on nDimMetadata supposed to access data?
Take the GetErr as example:

func (a *nDimMetadata) GetErr() (err error) {
	if a == nil || (a.data == nil && a.err == nil) {
		return NilError
	}
	err = a.err
	a.err, a.debug, a.stack = nil, "", ""
	return
}

It now throws an error because a nDimMetadata obviously has no field data. What is the cleanest way to get around this? The data is also accessed in a lot of methods that could be generalized otherwise.

@Kunde21
Copy link
Owner

Kunde21 commented Dec 15, 2016

For those methods, I was using a.data as a canary. Since it's required to index into a.data, a.shape can be used there. We should add a tests to the creation functions, to ensure data is always initialized with shape to avoid bugs there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants