Code conversion guidelines
- It's recommended to start with a copy of the Java code and convert it line by line.
- The C* code we will convert from currently is http://git-wip-us.apache.org/repos/asf/cassandra.git bf599fb5b062cbcc652da78b7d699e7a01b949ad (from trunk, five months before 2.2.0-beta1, around the time 2.0.12 and 2.1.3 were released)
- It is mandatory to keep the Apache License header and to add a "Modified by Cloudius Systems" line (or your own name, if you prefer). This is required by the Apache license.
- As usual, add a "Copyright 2015 Cloudius Systems" line. Happy new year!
- Copy code comments where they make sense
- Keep unconverted Java code in `#if 0` blocks
- Don't forget to add an include guard to headers
- Filenames: `SomeFile.java` -> `some_file.hh`, possibly `some_file.cc`
- Packages/namespaces: strip `org.apache.cassandra`; the rest of the package name becomes the namespace.
- Keep the directory structure (but strip `org/apache/cassandra/`).
- Class and method names: `SomeClass` -> `some_class`, `someMethod()` -> `some_method()`.
- If the Java code had a class `SomeClass` and elsewhere a variable or method `someClass`, the above rule will yield the same name for both. While this is legal, it is confusing and can lead to hard-to-spot bugs, so please rename the variable or method.
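As a rough illustration of the rules above, a converted header might look like the sketch below. The Java class `org.apache.cassandra.utils.SomeCounter` and its members are hypothetical, invented only for this example, and the Apache License text is abbreviated to a placeholder comment.

```cpp
/*
 * (The Apache License header from the original Java file is kept here.)
 */

/*
 * Modified by Cloudius Systems
 * Copyright 2015 Cloudius Systems
 */

// Hypothetical file utils/some_counter.hh, converted from
// org/apache/cassandra/utils/SomeCounter.java.
#ifndef UTILS_SOME_COUNTER_HH
#define UTILS_SOME_COUNTER_HH

#include <cstdint>

namespace utils {   // org.apache.cassandra.utils -> utils

// SomeCounter -> some_counter
class some_counter {
    int64_t _value = 0;
public:
    // addAndGet() -> add_and_get()
    int64_t add_and_get(int64_t delta) {
        _value += delta;
        return _value;
    }
    int64_t value() const { return _value; }

#if 0
    // Not converted yet: the Java code is kept as-is inside the #if 0 block.
    public void reset() {
        value = 0;
    }
#endif
};

}

#endif
```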
Every Java object reference is a pointer. The object can be shared among multiple references, and is garbage collected when no references exist. The closest parallel in C++ is shared_ptr<>, but we don't want the code to be littered with them.
- When possible, use values instead of references: `List<Foo>` -> `std::vector<foo>`, not `std::vector<shared_ptr<foo>>`.
- If a method only looks at an argument, the signature should specify a `const` reference.
- If a method takes ownership of an argument, pass it by value (and the caller can use `std::move()`).
- If both caller and callee continue using the object, use a `shared_ptr<>` (or a raw pointer if the lifetime is otherwise taken care of).
- In the singleton pattern, use unique_ptr<>. For example, a singleton class A in Java might have a static (per-class) field instance: `public static final A instance = new A();`. Its C++ equivalent is `static std::unique_ptr<A> instance(new A());`
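The argument-passing rules above can be sketched as follows. This is only an illustration: `foo`, `registry`, `observer` and `total_name_length()` are hypothetical names, not part of the code being converted.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical value type standing in for a converted Java class.
struct foo {
    std::string name;
};

// Only looks at its argument: take a const reference.
inline std::size_t total_name_length(const std::vector<foo>& foos) {
    std::size_t n = 0;
    for (const foo& f : foos) {
        n += f.name.size();
    }
    return n;
}

// Takes ownership of its argument: take it by value and move it into place.
class registry {
    std::vector<foo> _foos;   // values, not shared_ptr<foo>
public:
    void add(foo f) {
        _foos.push_back(std::move(f));
    }
    const std::vector<foo>& foos() const { return _foos; }
};

// Caller and callee both keep using the object: share it.
class observer {
    std::shared_ptr<registry> _reg;
public:
    explicit observer(std::shared_ptr<registry> reg) : _reg(std::move(reg)) {}
    std::size_t seen() const { return _reg->foos().size(); }
};
```

A caller that no longer needs its `foo` would write `reg->add(std::move(f))`. A caller that passes its `shared_ptr<registry>` to `observer` can keep using the registry afterwards.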
Java uses an interface/implementation pair (`List` and `ArrayList`). C++ uses an implementation class (`std::vector<>`) and iterator interfaces. So both `List` and `ArrayList` should be converted to the implementation class.
Which implementation class is used depends on the usage. For lists, prefer `vector<>`, only using `list<>` if front or middle insertion is needed. For sets and maps, prefer the unordered variants unless sorting is needed.
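A minimal sketch of these container choices follows. The Java declarations in the comments and the function names are hypothetical, picked only to show which C++ container each case maps to.

```cpp
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Java: List<String> names = new ArrayList<>();
// Both the interface (List) and the implementation (ArrayList) become std::vector.
inline std::vector<std::string> make_names() {
    std::vector<std::string> names;
    names.push_back("b");
    names.push_back("a");
    names.push_back("b");
    return names;
}

// Java: Map<String, Integer> counts = new HashMap<>();
// No ordering is needed, so the unordered variant is preferred.
inline std::unordered_map<std::string, int> count_names(const std::vector<std::string>& names) {
    std::unordered_map<std::string, int> counts;
    for (const std::string& n : names) {
        ++counts[n];
    }
    return counts;
}

// Java: SortedMap<String, Integer> sorted = new TreeMap<>(counts);
// Iteration order matters here, so std::map is used.
inline std::map<std::string, int> sorted_counts(const std::unordered_map<std::string, int>& counts) {
    return std::map<std::string, int>(counts.begin(), counts.end());
}
```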
Java code depends on `Object` base methods such as `equals()` or `hashCode()`. For simple (non-polymorphic) types we can simply implement `operator==()` and `std::hash<>`. For type hierarchies these might need to delegate to a virtual method inside the base class.
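The sketch below shows one way both cases might look. The types `endpoint`, `shape` and `circle` are hypothetical, and the hash combination is deliberately simplistic; it is an assumption for illustration, not a recommended hash function.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_set>

// Hypothetical simple (non-polymorphic) type.
struct endpoint {
    std::string host;
    int32_t port = 0;
};

// Replaces Java's equals().
inline bool operator==(const endpoint& a, const endpoint& b) {
    return a.host == b.host && a.port == b.port;
}
inline bool operator!=(const endpoint& a, const endpoint& b) {
    return !(a == b);
}

// Replaces Java's hashCode(); lets endpoint be a key in unordered containers.
namespace std {
template <>
struct hash<endpoint> {
    size_t operator()(const endpoint& e) const {
        // Naive combination of the member hashes, for illustration only.
        return hash<string>()(e.host) * 31 + hash<int32_t>()(e.port);
    }
};
}

// Uses both operator== and std::hash<endpoint> through unordered_set.
inline std::size_t count_distinct(const endpoint& a, const endpoint& b) {
    std::unordered_set<endpoint> s;
    s.insert(a);
    s.insert(b);
    return s.size();
}

// For a type hierarchy, the operator delegates to a virtual method.
class shape {
public:
    virtual ~shape() = default;
    virtual bool equals(const shape& other) const = 0;
};
inline bool operator==(const shape& a, const shape& b) {
    return a.equals(b);
}

class circle : public shape {
    int32_t _radius;
public:
    explicit circle(int32_t radius) : _radius(radius) {}
    bool equals(const shape& other) const override {
        // Equal only if the other object is also a circle with the same radius.
        auto* c = dynamic_cast<const circle*>(&other);
        return c && c->_radius == _radius;
    }
};
```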
On 64-bit Linux, the C++ types `short`, `int`, `long` happen to be identical in length to the Java types of that name, but this will not necessarily be the case in other architectures. The C++ types guaranteed to be identical to the Java ones are `int16_t`, `int32_t`, `int64_t`. In many cases where the Java code explicitly specified the integer length by using short or long, we should keep the same length and use int16_t and int64_t. For Java int, its translation should depend on context: Where the specific 4-byte length is important, use int32_t. Where the length was not important, leaving "int" is enough.
Java does not have unsigned types, but you can use them (unsigned int, size_t, etc.) in C++ code if you understand the code in question. If Java code uses the non-sign-extended shift operator "a >>> b", convert it into "(unsigned ..) a >> b".
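For example, the `>>>` conversion might be written as below. The helper names `ushr32()` and `ushr64()` are hypothetical. Note one extra difference the sketch does not hide: Java masks the shift count to the width of the type, while in C++ a shift count that is negative or not smaller than the width is undefined, so the helpers assume `0 <= b < width`.

```cpp
#include <cstdint>

// Java: int r = a >>> b;   (a is an int)
// Cast to the unsigned type of the same width so that zero bits are shifted in.
// Assumes 0 <= b < 32.
inline int32_t ushr32(int32_t a, int b) {
    return static_cast<int32_t>(static_cast<uint32_t>(a) >> b);
}

// Java: long r = a >>> b;  (a is a long)
// Assumes 0 <= b < 64.
inline int64_t ushr64(int64_t a, int b) {
    return static_cast<int64_t>(static_cast<uint64_t>(a) >> b);
}

// ushr32(-8, 1) == 2147483644, the same result Java gives for -8 >>> 1.
```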
Please remember that in Java, class fields of primitive types, including the above integer types (and also boolean, float, etc.), are implicitly initialized to 0, but this is not the case in C++! So unless you are sure this is unnecessary (after understanding the code in question), please convert
private long updateTimestamp;
private boolean isAlive;
to
private:
int64_t update_timestamp = 0;
bool is_alive = false;
Convert uses of Java's `String` into uses of `sstring` (`#include "core/sstring.hh"`).
Java code often has a class that implements an interface. These do not always need to be translated to C++ inheritance, and often need to be converted differently:
A Java class which implements `Comparable<T>` can be used, for example, in Collections.sort(), and needs to implement a compareTo(T other) method. We can drop this interface, and probably want to define operator<, et al., instead of compareTo(). TODO: Finish this section.
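Since this section is unfinished, the following is only one possible reading of the operator< suggestion, not a settled rule. The type `token` is hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical type; in Java it would implement Comparable<Token>.
struct token {
    int64_t value;
    explicit token(int64_t v) : value(v) {}
};

// Replaces compareTo(): std::sort() and the ordered containers only need operator<.
inline bool operator<(const token& a, const token& b) { return a.value < b.value; }
// The remaining comparisons are derived from operator<.
inline bool operator>(const token& a, const token& b) { return b < a; }
inline bool operator<=(const token& a, const token& b) { return !(b < a); }
inline bool operator>=(const token& a, const token& b) { return !(a < b); }

// Java: Collections.sort(tokens);
inline void sort_tokens(std::vector<token>& tokens) {
    std::sort(tokens.begin(), tokens.end());
}
```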
In general, SMP is treated in the same way as clustering is treated in Cassandra:
- Global operations (schema changes) are broadcast across all cpus (each cpu has a local copy of the schema)
- Single row operations (insert, read) are unicast to the row's owner cpu
- Aggregate operations (multi-row select) use map/reduce to cpus that may contain the row
See the `distributed<>` class.
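To make the three patterns concrete, here is a toy, single-threaded model of them. It deliberately does not use the `distributed<>` class or any of its API; `shard`, `toy_database` and all their members are hypothetical, and choosing the owner cpu by hashing the key is an assumption made for the sketch.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// One entry per cpu; every name in this sketch is hypothetical.
struct shard {
    int schema_version = 0;                       // local copy of the schema
    std::unordered_map<std::string, int> rows;    // rows owned by this cpu
};

class toy_database {
    std::vector<shard> _shards;
    // Assumption: the owner cpu is picked by hashing the row key.
    std::size_t owner_of(const std::string& key) const {
        return std::hash<std::string>()(key) % _shards.size();
    }
public:
    // cpus must be greater than zero.
    explicit toy_database(std::size_t cpus) : _shards(cpus) {}

    // Global operation: broadcast to every cpu.
    void set_schema_version(int v) {
        for (shard& s : _shards) {
            s.schema_version = v;
        }
    }
    int schema_version_on(std::size_t cpu) const {
        return _shards[cpu].schema_version;
    }

    // Single-row operations: unicast to the row's owner cpu.
    void insert(const std::string& key, int value) {
        _shards[owner_of(key)].rows[key] = value;
    }
    int read(const std::string& key) const {
        const auto& rows = _shards[owner_of(key)].rows;
        auto it = rows.find(key);
        return it == rows.end() ? 0 : it->second;
    }

    // Aggregate operation: map over all cpus, then reduce the partial results.
    long sum_all() const {
        long total = 0;
        for (const shard& s : _shards) {
            long local = 0;                       // map: per-cpu partial sum
            for (const auto& kv : s.rows) {
                local += kv.second;
            }
            total += local;                       // reduce
        }
        return total;
    }
};
```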
In Java, all functions need to belong to some class. This often results in the "utility class" pattern, a class which cannot be instantiated, and has nothing but static methods.
In C++, we prefer to use a `namespace` instead of a `class` in that case. Only the public functions should be declared in the header file, inside the namespace. Private or protected functions, if any, and also static data, belong in the source (.cc) file and should be declared static (file-local).
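A minimal sketch of this layout follows. The Java utility class `ByteUtils` and its C++ counterpart are hypothetical, and the header and source file are shown in one block, separated by comments, only to keep the example short.

```cpp
// Java (hypothetical): public final class ByteUtils with a private
// constructor and nothing but static methods.

// ---- utils/byte_utils.hh (hypothetical) ----
#include <cstdint>

namespace utils {
namespace byte_utils {

// Only the public functions are declared in the header.
uint32_t low_nibble_sum(uint32_t x);

}
}

// ---- utils/byte_utils.cc (hypothetical) ----
namespace utils {
namespace byte_utils {

// Static data and private helpers stay in the .cc file and are file-local.
static const uint32_t nibble_mask = 0xf;

static uint32_t low_nibble(uint32_t x) {
    return x & nibble_mask;
}

// Adds up the 4-bit groups of x.
uint32_t low_nibble_sum(uint32_t x) {
    uint32_t sum = 0;
    while (x != 0) {
        sum += low_nibble(x);
        x >>= 4;
    }
    return sum;
}

}
}
```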
Watch out for one surprising difference between what the method visibility modifiers (private/public/protected) do in Java and in C++: in Java, methods declared with "protected" or no modifier at all are additionally visible to all other classes in the same package!
As C++ has no similar feature (making class methods visible only to code in the same namespace), the closest approximations are 1. to make all non-"private" methods public in the C++ code, or 2. to use C++'s `friend` feature. Option 1 is easier.
Of course, if you can verify that a certain protected or no-modifier method is not used by other classes in this package, then it can be converted to C++'s `protected` or `private` respectively.
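The two options might look like this for a package-private Java method. `Cache`, `CacheManager` and their C++ counterparts are hypothetical names used only for the sketch.

```cpp
// Java (hypothetical), both classes in the same package:
//   class Cache        { int size; void evict() { size = 0; } }   // no modifier
//   class CacheManager { void shrink(Cache c) { c.evict(); } }

// Option 1: make the non-private members public.
class cache_v1 {
public:
    int size = 3;
    void evict() { size = 0; }
};

// Option 2: keep them private and declare the same-package users as friends.
class cache_v2 {
    int _size = 3;
    void evict() { _size = 0; }
    friend class cache_manager;
public:
    int size() const { return _size; }
};

class cache_manager {
public:
    // Allowed to call the private evict() only because of the friend declaration.
    void shrink(cache_v2& c) { c.evict(); }
};
```

Option 2 keeps `evict()` hidden from unrelated code, at the cost of listing every same-package user explicitly.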