Giovanni De Toni edited this page Jun 19, 2017 · 2 revisions

Shogun Style Guidelines

Shogun uses an automatic style checker (Clang Format) to verify that code follows Shogun's coding guidelines. Please make sure that your patches conform to these guidelines; otherwise our Continuous Integration tools will flag the errors and mark your pull request as failed.

For instance, this is an example of the CI complaining about style errors: https://travis-ci.org/shogun-toolbox/shogun/jobs/243359288#L493

You can use our custom script, located at <your_shogun_source_dir>/scripts/check_format.sh, to check whether the code you have written follows our guidelines. The script also gives you instructions on how to fix style errors. Note that the script requires clang-format-3.8 to be installed on your system.

Formatting Guidelines

Function Parameters

Keep function parameters on a single line when they fit. If they do not fit, break after the opening parenthesis and continue on the next, indented line.

// Small function call
function("a", "few", "parameters");

// Long function call with many params
function(
    "too", "many", "parameters", "for", "this", "function");

Short Code Blocks

Do not collapse simple statements (or functions) onto a single line. This means that code like this:

if (a) return;

for (int i = 0; i < 10; i++) cout << i << endl;

while (true) cout << "true" << endl;

is forbidden and must be written this way:

if (a)
    return;

for (int i = 0; i < 10; i++)
    cout << i << endl;

while (true)
    cout << "true" << endl;

Brace wrapping

Braces must be placed on their own lines (this applies to while, for, class, etc.), for instance:

// If statements
if (a)
{
    // Code
}

// Function definitions
int fantastic_method(int a)
{
    // Code
}

Other General Rules

  • The column limit is 80;
  • Tab characters are used for indentation;
  • Tab size is equal to 4;
  • #include directives must be sorted alphabetically;
  • Don't put multiple assignments on a single line;
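
As a rough illustration, the formatting rules above could combine as in the sketch below. The function longest_length and its usage are hypothetical and not part of Shogun; the block only aims to show sorted #include directives, tab indentation, braces on their own lines, unbraced single-statement bodies on a separate line, and one assignment per line.

```cpp
// #include directives sorted alphabetically.
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of Shogun): length of the longest word.
int32_t longest_length(const std::vector<std::string>& words)
{
	int32_t longest = 0;

	// Braces sit on their own lines.
	for (const std::string& word : words)
	{
		// One assignment per line.
		int32_t length = static_cast<int32_t>(word.size());

		// A single-statement body still goes on its own line.
		if (length > longest)
			longest = length;
	}

	return longest;
}

// Usage sketch.
void example_usage()
{
	std::vector<std::string> words;
	words.push_back("too");
	words.push_back("many");
	assert(longest_length(words) == 4);
}
```

Running scripts/check_format.sh remains the authoritative check; the sketch is only meant to give a feel for the layout.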

Editor configurations

  • Indenting uses Stroustrup style with a tab size of 4. For emacs, add the following to your ~/.emacs:
   (add-hook 'c-mode-common-hook
      (lambda ()
         (show-paren-mode 1)
         (setq indent-tabs-mode t)
         (c-set-style "stroustrup")
         (setq tab-width 4)))

For vim, in ~/.vimrc:

   set cindent         " C style indenting
   set ts=4            " tabstop
   set sw=4            " shiftwidth
  • For newlines use LF only; avoid CRLF and CR. Git can be configured to convert all newlines to LF when source files are committed to the repo (for more information consult http://help.github.com/line-endings/):
   git config --global core.autocrlf input
  • Avoid trailing white-space (spaces & tabs) at end of lines and never use spaces for indentation. For emacs:
   (add-hook 'before-save-hook 'delete-trailing-whitespace)

For vim in ~/.vimrc (implemented as an autocmd, use wisely):

    autocmd BufWritePre * :%s/\s\+$//e

Shogun Coding Guidelines

Macros and #ifdef

  • Use macros sparingly;
  • Avoid defining constants using macros (bye bye type-checking); use enums (when defining several related constants) or:
  const int32_t FOO=5;
  • Use #ifdefs sparingly (really limit yourself to the ones necessary), as heavy use makes the code completely unreadable. If you need to use #ifdefs, always comment the corresponding #else / #endif in the following way:
#ifdef HAVE_LAPACK
// Code
#else //HAVE_LAPACK
// Code
#endif //HAVE_LAPACK
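
The points above might be put together as in the following minimal sketch. The names MAX_ITERATIONS, ESolverType, clamp_iterations and default_solver are made up for illustration; only HAVE_LAPACK is taken from the example above, and the code compiles whether or not it is defined.

```cpp
#include <cassert>
#include <cstdint>

// A typed constant instead of "#define MAX_ITERATIONS 100".
const int32_t MAX_ITERATIONS = 100;

// Several related constants grouped in an enum (hypothetical names).
enum ESolverType
{
	ST_NEWTON = 0,
	ST_GRADIENT = 1
};

// Hypothetical helper: limits a requested iteration count.
int32_t clamp_iterations(int32_t requested)
{
	assert(requested >= 0);

	if (requested > MAX_ITERATIONS)
		return MAX_ITERATIONS;

	return requested;
}

// Hypothetical helper: the only #ifdef, with #else / #endif commented.
ESolverType default_solver()
{
#ifdef HAVE_LAPACK
	return ST_NEWTON;
#else // HAVE_LAPACK
	return ST_GRADIENT;
#endif // HAVE_LAPACK
}
```

Because MAX_ITERATIONS is a real int32_t, the compiler can type-check every use of it, which a macro would not allow.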

Functions

  • Functions should be short and sweet, and do just one thing well. They should fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24, as we all know);
  • Another measure of the function is the number of local variables. They shouldn't exceed 5-10, or you're doing something wrong. Re-think the function, and split it into smaller pieces. A human brain can generally easily keep track of about 7 different things, anything more and it gets confused. You know you're brilliant, but maybe you'd like to understand what you did 2 weeks from now;

Naming Conventions

  • When naming variables:
    • In classes, member variables all start with m_, e.g. m_feature_vector (to avoid shadowing and the related bugs);
    • Parameters (in functions) shall be named e.g. feature_vector;
    • Don't use meaningless variable names; it is, however, fine to use short names like i, j, k etc. in loops;
    • Class names start with 'C', and each syllable/subword starts with a capital letter, e.g. CStringFeatures;
    • Constants/defined objects are UPPERCASE, e.g. REALVALUED;
  • Functions are named like get_feature_vector() and should be limited to as few arguments as possible (no monster functions with > 5 arguments please);
  • Objects which can deal with features of type DREAL and class SIMPLE don't need to contain Real/Dense in the class name. Others are required to state the class/type they can handle, e.g. CSparseByteLinearKernel, CSparseGaussianKernel;
  • Variable and function names are all lowercase (except for class constructors/destructors); syllables/subwords are separated by '_', e.g. compute_kernel_value(), my_local_variable;
  • Features and preprocessors are prefixed with featureclass (e.g. Dense/Sparse) followed by featuretype (Real/Byte/...);
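
A small sketch of how these naming rules could look in one place follows. CRunningSum and all of its members are hypothetical; a real Shogun class would also derive from CSGObject, which is omitted here so that the example compiles on its own.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical class: the name starts with 'C' and uses capitalized
// subwords. Deriving from CSGObject is left out of this sketch.
class CRunningSum
{
public:
	CRunningSum() : m_sum(0), m_num_values(0)
	{
	}

	// Function and parameter names are lowercase, separated by '_'.
	void add_value(int32_t new_value)
	{
		m_sum += new_value;
		m_num_values++;
	}

	int64_t get_sum() const
	{
		return m_sum;
	}

	int32_t get_num_values() const
	{
		return m_num_values;
	}

private:
	// Member variables start with m_ to avoid shadowing.
	int64_t m_sum;
	int32_t m_num_values;
};

// Usage sketch.
void example_usage()
{
	CRunningSum running_sum;
	running_sum.add_value(2);
	running_sum.add_value(3);
	assert(running_sum.get_sum() == 5);
}
```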

Types

  • Please use only these types:
char
uint8_t
uint16_t
uint32_t
int32_t
int64_t
float32_t
float64_t
floatmax_t
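
The integer types in this list come from the standard <cstdint> header, while float32_t, float64_t and floatmax_t are not standard C++ names. The sketch below assumes they are typedefs for float, double and long double and declares local stand-ins so that it compiles without Shogun; in Shogun code they would come from the library's own headers instead. The helper compute_mean is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: local stand-ins for Shogun's floating-point typedefs.
// Inside Shogun, include the library headers instead of redeclaring.
typedef float float32_t;
typedef double float64_t;
typedef long double floatmax_t;

// Hypothetical helper: mean of num_values entries, summed in float64_t.
float64_t compute_mean(const float32_t* values, int32_t num_values)
{
	assert(num_values > 0);

	float64_t sum = 0;
	for (int32_t i = 0; i < num_values; i++)
		sum += values[i];

	return sum / num_values;
}
```

Using fixed-width names such as int32_t rather than plain int keeps the size of each value explicit across platforms.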

Code Comments

  • When writing API docs, use detailed proper English (please use Doxygen style syntax);
  • Don't write redundant code comments;
  • Code should be self-explaining, otherwise naming/structure is bad;

Other general rules

  • Classes must be (directly or indirectly) derived from CSGObject;
  • Don't use fprintf/printf/cout; use SG_DEBUG/SG_INFO/SG_WARNING/SG_ERROR/SG_PRINT instead (inside an object derived from CSGObject) or the static SG_SDEBUG/... functions;