- Thy code shalt be the primary method of documentation. Part of this is to choose concise but descriptive names. Comments should be used to explain the concept and usage of classes (in the hpp file) and the structure of the algorithms (in the implementation).
- Thou shalt program defensively. Use
Assert
wherever it makes sense, useDebugAssert
in performance-critical parts (e.g., within hot loops). Also, we do not handle exceptions in Hyrise. If an invalid state is reached, we crash immediately. This makes debugging easier. An exception to this is user-facing code where we handle, e.g., typos in SQL queries. - Thou shalt reflect on existing code. Just as your code is not perfect, neither is the code that people wrote before you. Try to improve it as part of your PRs and do not hesitate to ask if anything is unclear. Chances are that it can be improved.
- Thou shalt properly test thy code. This includes unit and integration tests. Try to isolate parts that can be independently tested.
-
Use automatic memory management (RAII, smart pointers).
new
andmalloc
are evil words. -
Be mindful of ownership. Not everything needs to be a smart pointer. Consider passing around references to the object or references to a shared_ptr instead of copying the shared_ptr. Remember that this might not be safe when passing shared_ptrs into, e.g., JobTasks.
-
Use
const
whenever possible. Consider variables, methods, pointers, and their pointees. -
Header files
- Reduce the size of your hpp files, both in terms of the number of lines and the code complexity. This keeps the compilation duration low.
- Code in hpp files is compiled for every cpp file that includes the header. As such, move code to cpp files where possible. This often includes templated classes, where it is sometimes possible to implement their code in cpp files.
- Anonymous namespaces are a good way to define local helper methods.
- Use forward declarations instead of full header includes wherever possible.
-
Loops
- Use range-based
for
loops when possible:for (const auto& item : items) {...}
. - If you have to use old-style loops, keep in mind that the loop condition is evaluated every time: Instead of
for (auto offset = size_t{0}; offset < something.size(); ++offset)
, the size should be retrieved just once. See also this document.
- Use range-based
-
Data structures
- When creating a vector where you know the size beforehand, use
reserve
to avoid unnecessary resizes and allocations. - Hash-based data structures are usually faster than tree-based data structures. Unless you have a reason to use the
latter, prefer
unordered_(map|set)
overmap
andset
.
- When creating a vector where you know the size beforehand, use
-
Copies
- Avoid unnecessary copies, C++ makes it too easy to inadvertently copy objects.
- For larger elements (e.g., vectors), pass a (
const
) reference instead. - If your implemented class does not need to be copied (e.g., a
Table
should never exist more than once), inherit fromNoncopyable
to avoid these potentially expensive copies.
-
Miscellaneous
-
Prefer
if (object)
overif (object != nullptr)
orif (object.has_value())
. -
Don't write
this->
if you don't have to. -
Be explicit with types: Use
[u]int(8|16|32|64)_t
instead ofint, long, uint
etc. -
Use auto-to-stick:
auto x = 17;
orauto y = std::vector<size_t>{};
. -
Namespaces: Do not create nested namespaces, do not import namespaces.
-
Prefer pre-increment over post-increment. See the LLVM Coding Standards
-
Consider structured bindings:
const auto& [iterator, added] = unordered_map.emplace(...);
-
Use braced control statements, even for single-line blocks. Moreover, unless the block is empty (e.g.,
while (!ready) {}
), add line breaks. Instead ofif (...) x();
(orif (...) { x(); }
), write:if (...) { x(); }
-
-
Much of this is enforced by clang-tidy. However, clang-tidy does not yet cover hpp files (see #1901). Also, while clang-tidy is a great help, do not rely on it.
-
Call ./scripts/format.sh before committing your code.
-
Choose clear and concise names, and avoid, e.g.,
i
,j
,ch_ptr
. -
Formatting details: 2 spaces for indentation, 120 columns, comments above code.
-
Use empty lines to structure your code.
-
Naming conventions:
- Files: lowercase separated by underscores, e.g., abstract_operator.cpp, usually corresponding to a class, e.g., AbstractOperator.
- Types (classes, structs, enums, typedefs, using): PascalCase starting with uppercase letter, e.g.,
TableScan
. - Variables: lowercase separated by underscores, e.g.,
chunk_size
. - Functions: lowercase separated by underscores, e.g.,
append_mutable_chunk()
. - Private / protected members / methods: like variables / functions with leading underscore, e.g.,
_on_execute()
. - Classes that are used only to have a non-templated base class are named
BaseXY
(e.g.,BaseValueSegment
, while classes that have multiple differing implementations are namedAbstractXY
(e.g.,AbstractOperator
). - In cases where a constructor parameter would have the same name as the member it initializes, prefix it with
init
:C(int init_foo) : foo(init_foo) {}
. - If an identifier contains a verb or an adjective in addition to a noun, the schema [verb|adjective][noun] is
preferred, e.g., use
left_input
rather thanandinput_left
set_left_input()
rather than.set_input_left()
- Unused variables: instead of leaving out the parameter name, comment the parameter name out (e.g.,
void function(const size_t /* count */) { ... }
). If the context does not provide a good name (such as/* count */
), use/* unused */
.
-
Prefer declaring (member) variables of the same type in a new line each (isolated declaration):
class Foo { ... int a; int b; };
However, certain test classes have many member variables of the same type, e.g.,
std::shared_ptr<LQPColumnExpression>
for LQP node tests. Declarations with more than one variable can be used in such test classes if the number of variables is high (see below). Please use common sense to decide on which format you use.class FooTest : public BaseTest { ... std::shared_ptr<LQPColumnExpression> a, b, c, d, e, f, g, h, i, j, k; };
-
Maintain correct orthography and grammar. Comments should start with a capital letter, sentences should be finished with a full stop.
- Class names within comments are written in PascalCase - e.g., "As the input may be a ReferenceSegment, a valid RowID may point to a row that is NULL."
- When you submit a non-trivial PR, include the results of benchmark_all.sh.
- These results help in understanding potential performance changes as well as document potential changes to the compilation costs.
- We do not do this automatically as the CI server is not sufficiently isolated and the performance results would vary. Similarly, your personal laptop is likely to produce unreliable results.
- If your PR is related to an existing issue, reference it in the PR's description (e.g.,
fixes #123
orrefs #123
). - If you are not a member of the Hyrise organization, your PR will not be built by our CI server. Contact a maintainer for this. They can add you to the organization or manually trigger builds from within Jenkins.
- For your PR to be merged, it must pass a FullCI run. Set the FullCI tag in GitHub before committing to trigger the complete (but more expensive!) run.
- When merging your PR, copy your PR description (excluding the benchmark results) into the commit message. The commit message of the squash merge should NOT contain the individual commit messages from your branch.
- Keep the whole picture in mind. Often, it makes sense to make two passes: One for the code style and line-level modifications; one for checking how it fits into the overall picture.
- Check if the PR sufficiently adds tests both for happy and unhappy cases.
- Do not shy away from requesting changes on surrounding code that was not modified in the PR. Remember that after a PR, the code base should be better than before.
- Verify the CI results, including how the reported coverage changed, and check if the compile time or query performance have been negatively affected. For this, the author should have posted the results of benchmark_all.sh into the PR description.