Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AnyParameterProperties refactor #4412

Merged
merged 7 commits into from
Nov 15, 2018
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
71 changes: 57 additions & 14 deletions src/shogun/base/AnyParameter.h
Original file line number Diff line number Diff line change
@@ -1,3 +1,9 @@
/*
* This software is distributed under BSD 3-clause license (see LICENSE file).
*
* Authors: Heiko Strathmann, Gil Hoben
*/

#ifndef __ANYPARAMETER_H__
#define __ANYPARAMETER_H__

Expand All @@ -21,49 +27,86 @@ namespace shogun
GRADIENT_NOT_AVAILABLE = 0,
GRADIENT_AVAILABLE = 1
};

/** @brief Class AnyParameterProperties keeps track of parameter properties.
* The parameter properties can be either true or false.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like!
Maybe also mention descriptions in the first sentence " ... of parameter meta information, such as properties and descriptions"

* These properties describe if a parameter is for example a hyperparameter
* or if it has a gradient.
*/
class AnyParameterProperties
{
public:
static const int32_t HYPER = 1u << 0;
static const int32_t GRADIENT = 1u << 1;
static const int32_t MODEL = 1u << 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah wait. the thing I had in mind was

static const int HYPER = 1;

and then

mask = (1 << HYPER) | (1 << MODEL);

This is now difference, as you are effectively storing masks themselves in these const members, not the number of bit shifts. Or am I wrong? Let me stare a bit more to understand

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean:

static const int HYPER = 0;
static const int GRADIENT = 1;
static const int MODEL = 2;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but start at 1

Copy link
Member Author

@gf712 gf712 Nov 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, if we have:

  static const int HYPER = 1;
  static const int GRADIENT = 2;
  static const int MODEL = 3;

  bool model_selection = true;
  bool gradient = true;

  int mask = ((1 << HYPER) & model_selection) | ((1 << GRADIENT) & gradient);

Then the above mask will always be false/0 no?

Copy link
Member Author

@gf712 gf712 Nov 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean more like this:

int mask = (model_selection << HYPER) | (gradient << GRADIENT)

which is the equivalent of: shift bit at position X (1,2 or 3 in this case) if the function parameter is true.

Copy link
Member Author

@gf712 gf712 Nov 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah now I remember why I did it this way.. Basically if you want to provide an interface to create your own mask like this

AnyParameterProperties("My description",  
                       AnyParameterProperties::MODEL |
                       AnyParameterProperties::GRADIENT)

you need to have the values of the static member with the shift. Unless I am missing something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean, but what I have seen (and been doing) would result in

AnyParameterProperties("My description",  
                       (1 << AnyParameterProperties::MODEL) |
                       (1 << AnyParameterProperties::GRADIENT))

It is not as neat, you are totally right!. BUT it can be hidden quite easily either.
I mean, it is your code, Ill leave it to you. Not sure whether it is possible to say what is better now, we will have to commit to something and then see if it works :) So go ahead with the above solution and then we see where we get

Copy link
Member Author

@gf712 gf712 Nov 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK! Btw why does it need to start at 1, instead of 0?

Copy link
Member

@karlnapf karlnapf Nov 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, no that was nonsense (just needed to make sure that you never get an all zero mask)

BTW I also realised why the micro-controller people store the number of bits rather than the full masks: it uses less memory (which is not really relevant here, but still good to think about it for a moment).

byte HYPER = 2
...
byte BLA = 255

so with an 8bit integer, you can do 255 attributes if you generate the masks on the fly a la (1 << HYPER). If you wanted to store the mask for each attribute instead (i.e. HYPER = 0x00000010 (each mask is 8bit), you would have to store 255 * 8bit, which is quite a difference , especially if you have only 4k memory as on an AVR ;)

How many parameters are registered during a shogun run? Let's say 100 is realistic. Then we have 100 byte usage if we store only the offset, and 2.5kB if we use the masks. In a data heavy environment, and since we won't register many many more parameters this doesn't matter. For 1000 parameters it is still ok, 10^6 will result in some pretty big differences.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets typedef the actual type so we can change later.


/** Default constructor where all parameter properties are false
*/
AnyParameterProperties()
: m_description(), m_model_selection(MS_NOT_AVAILABLE),
m_gradient(GRADIENT_NOT_AVAILABLE)
: m_description("No description given"), m_attribute_mask(0)
{
}
/** Constructor
* @param description parameter description
* @param hyperparameter set to true for parameters that determine
* how training is performed, e.g. regularisation parameters
* @param gradient set to true for parameters required for gradient
* updates
* @param model set to true for parameters used in inference, e.g.
* weights and bias
* */
AnyParameterProperties(
std::string description,
EModelSelectionAvailability model_selection = MS_NOT_AVAILABLE,
EGradientAvailability gradient = GRADIENT_NOT_AVAILABLE)
: m_description(description), m_model_selection(model_selection),
EModelSelectionAvailability hyperparameter = MS_NOT_AVAILABLE,
EGradientAvailability gradient = GRADIENT_NOT_AVAILABLE,
bool model = false)
: m_description(description), m_model_selection(hyperparameter),
m_gradient(gradient)
{
m_attribute_mask = (hyperparameter << 0 & HYPER) |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was more thinking
(1 << HYPER) & model_selection | (1 << GRADIENT) && gradient etc

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's too low-level. Can we introduce a constexpr function for that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah actually good point! :)

(gradient << 1 & GRADIENT) |
(model << 2 & MODEL);
}
/** Mask constructor
* @param description parameter description
* @param attribute_mask mask encoding parameter properties
* */
AnyParameterProperties(std::string description, int32_t attribute_mask)
: m_description(description)
{
m_attribute_mask = attribute_mask;
}
/** Copy contructor */
AnyParameterProperties(const AnyParameterProperties& other)
: m_description(other.m_description),
m_model_selection(other.m_model_selection),
m_gradient(other.m_gradient)
m_gradient(other.m_gradient),
m_attribute_mask(other.m_attribute_mask)
{
}

std::string get_description() const
const std::string& get_description() const
{
return m_description;
}

EModelSelectionAvailability get_model_selection() const
{
return m_model_selection;
return static_cast<EModelSelectionAvailability>(
(m_attribute_mask & HYPER) > 0);
karlnapf marked this conversation as resolved.
Show resolved Hide resolved
}

EGradientAvailability get_gradient() const
{
return m_gradient;
return static_cast<EGradientAvailability>(
(m_attribute_mask & GRADIENT) > 0);
}
bool get_model() const
{
return static_cast<bool>(m_attribute_mask & MODEL);
}

private:
std::string m_description;
EModelSelectionAvailability m_model_selection;
EGradientAvailability m_gradient;
int32_t m_attribute_mask;
};

class AnyParameter
Expand Down Expand Up @@ -116,6 +159,6 @@ namespace shogun
Any m_value;
AnyParameterProperties m_properties;
};
}
} // namespace shogun

#endif
16 changes: 10 additions & 6 deletions src/shogun/base/SGObject.h
Original file line number Diff line number Diff line change
Expand Up @@ -676,7 +676,10 @@ class CSGObject
void watch_param(
const std::string& name, T* value,
AnyParameterProperties properties = AnyParameterProperties(
"Unknown parameter", MS_NOT_AVAILABLE, GRADIENT_NOT_AVAILABLE))
"Unknown parameter",
AnyParameterProperties::HYPER |
AnyParameterProperties::GRADIENT |
AnyParameterProperties::MODEL))
{
BaseTag tag(name);
create_parameter(tag, AnyParameter(make_any_ref(value), properties));
Expand All @@ -693,8 +696,7 @@ class CSGObject
template <typename T, typename S>
void watch_param(
const std::string& name, T** value, S* len,
AnyParameterProperties properties = AnyParameterProperties(
"Unknown parameter", MS_NOT_AVAILABLE, GRADIENT_NOT_AVAILABLE))
AnyParameterProperties properties = AnyParameterProperties())
{
BaseTag tag(name);
create_parameter(
Expand All @@ -714,8 +716,7 @@ class CSGObject
template <typename T, typename S>
void watch_param(
const std::string& name, T** value, S* rows, S* cols,
AnyParameterProperties properties = AnyParameterProperties(
"Unknown parameter", MS_NOT_AVAILABLE, GRADIENT_NOT_AVAILABLE))
AnyParameterProperties properties = AnyParameterProperties())
{
BaseTag tag(name);
create_parameter(
Expand All @@ -733,7 +734,10 @@ class CSGObject
{
BaseTag tag(name);
AnyParameterProperties properties(
"Dynamic parameter", MS_NOT_AVAILABLE, GRADIENT_NOT_AVAILABLE);
"Dynamic parameter",
!AnyParameterProperties::HYPER |
!AnyParameterProperties::GRADIENT |
!AnyParameterProperties::MODEL);
std::function<T()> bind_method =
std::bind(method, dynamic_cast<const S*>(this));
create_parameter(tag, AnyParameter(make_any(bind_method), properties));
Expand Down