C++ FAQ

This will become a collection of answers to questions that have come up.


  1. What C++ References are Recommended?
  2. C++11
  3. Where may I use: static const Type = value;
  4. Do not use Exception Specifications
  5. Compiler Generated Methods
  6. Comparison between Signed and Unsigned Types
  7. Unused Variables
  8. The Many Meanings of const
  9. Best Practices


What C++ References are Recommended?

There are four sorts of references you will need:

  1. A C++ Language Reference
  2. A reference that describes the Standard Library
  3. A Tutorial
  4. As your skills develop you will also want references that describe current ideas about best practices.
Most good tutorial books are, by design, incomplete language references and incomplete standard library references. If you are a beginner it is wise to borrow or buy each of the first two references at the same time you acquire your first tutorial. The section below recommends some specific books and some online resources.

C++ Language References

Standard Library References

Tutorials

Best Practices

Bob Bernstein has used "Thinking in C++" Volumes 1 and 2 from http://www.ibiblio.org/pub/docs/books/eckel/. I have not read these carefully but, at first glance, they appear to be a very formal introductory tutorial mixed with advanced tutorials.


C++11

In 2011 the international standards committee defined a new C++ standard, named C++11. The section below contains links to places that discuss what features are new in C++11 and how to use them.

Kevin Lynch's FAQ


Where may I use: static const Type = value;

The code fragment:
static const Type name = value;
defines a variable whose constant value is known at compile time; this allows, but does not require, the compiler to perform various optimizations at compile time. There are, however, some very non-intuitive restrictions on where this syntax can be used. The C++ standard states (I am paraphrasing): There are two places in which you may use the above syntax:

In both of these cases, the value of the variable is known at compile time so optimizations are possible but not required. Why "not required"? For example, the standard states that compilers are not required to be able to do floating point arithmetic at compile time; so compilers may defer all floating point arithmetic to run time, even if the information is available at compile time. Were compile-time floating point arithmetic required, it would be very difficult, even impossible, to write a standard-compliant cross-compiler.

In the first case above, the scope of the variable may be broader than the compilation unit so the compiler will definitely allocate memory for the variable. In the second case, the scope of the variable is a subset of the compilation unit so the compiler can choose to make the variable a true compile time constant.

The g++ compiler, with its default options, will let you initialize a static const double data member within a class declaration. We have, however, encountered a situation in which this produces incorrect code. This non-standard usage can be identified by using the -pedantic flag on the compiler. Unfortunately the -pedantic flag is, well, just too pedantic: code that Mu2e depends on, such as the framework, ROOT and G4, is not compliant with -pedantic, so we cannot use it. It looks like this is a case for which we will have to rely on standards and practices.

When you need to have a static const data member of non-integral type, the way to initialize it is as follows. In the header file:

struct MyClass{
  // ...
  static const double member_datum;
};

In the implementation file ( the .cc file ):


const double MyClass::member_datum=123.4;

You really should put the last line in a .cc file, outside of the class declaration, and not in the .hh file. If you put it in the header, there will be a separate definition of this object in each compilation unit that includes the header, which typically shows up as a multiple-definition error at link time.
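The pattern above can be collected into one small sketch. It is written as a single file only so that it can be compiled on its own; the comments mark which part would live in the header and which in the .cc file. The integral member nChannels, its value, and the helper scaled() are hypothetical additions for illustration and are not part of any Mu2e code.

```cpp
#include <cassert>

// ---- This part would live in MyClass.hh ----
struct MyClass {
  // Integral type: an in-class initializer is allowed by the standard.
  // (nChannels and its value are hypothetical, for contrast only.)
  static const int nChannels = 96;

  // Non-integral type: declare here, define in the .cc file.
  static const double member_datum;
};

// ---- This part would live in MyClass.cc ----
// The one and only definition of the static data member.
const double MyClass::member_datum = 123.4;

// Hypothetical helper that uses the constant.
double scaled(double factor) {
  return factor * MyClass::member_datum;
}

// Small self-check showing the values seen by client code.
void demo() {
  assert(MyClass::member_datum == 123.4);
  assert(MyClass::nChannels == 96);
  assert(scaled(1.0) == 123.4);
}
```

Client code sees only the declaration in the header; the value itself is supplied once, by the definition in the implementation file.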

Do not use Exception Specifications

In the following code fragment, the segment throw(art::Exception), which follows the const on the first line, is known as an exception specification:
 void ClassName::methodName(int idx) const throw(art::Exception) {
   // body of the function
   if ( badThingHappens ){
     throw art::Exception("category") << "Informational message.";
   }
 }
An exception specification is not required to be present when your code throws an exception and, when writing code for Mu2e, you should never write an exception specification. ( Actually there is only one very obscure situation in which you must use an exception specification; the obscure situation is described at the end of this section. )

This is in contrast to the Java programming language in which exception specifications are required. The confusion arises because exception specifications do very different things in C++ and Java; these are both described below.

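As a preamble to the explanation, you need to remember that code can throw in two ways:

  1. Directly, by executing a throw statement.
  2. Indirectly, by calling other code that throws an exception which is not caught before it propagates back up.

In the following, the phrase "if a method throws" is true if it does either of the above.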

In Java, if a method throws an exception without catching it, then the code must have an exception specification of the correct type or the code will not compile. Conversely, if the method has an exception specification, then the body of the method may throw an exception of the specified type that is not caught. A consequence of this rule is as follows: consider method A, which calls method B, which calls method C, which calls method D. If method D throws an exception and we want it to be caught by method A, all of methods B, C and D must have an exception specification and method A must not. Alternatively, if we expect method C to catch the exception, then only method D must have an exception specification and all others must not. All of these rules are enforced at compile time.

In C++, on the other hand, the behaviour is very different. If a method has an exception specification, then the compiler will insert code that checks, at run time, the type of any exception thrown by the code ( either directly or propagated up from code called by the method ). If this code detects that the thrown exception matches that in the exception specification then nothing special happens. If, on the other hand, it detects that a different exception has been thrown, then the program will call std::unexpected(), which by default calls std::terminate() and ends the program immediately.

The Mu2e framework is designed to catch most exceptions. When it does catch an exception its default behaviour is to shut down as gracefully as possible; under most circumstances it will properly close all of the histogram files and event-data output files and it will flush the log files. The framework can also be configured to do things like write the offending event to a separate output file and to continue with the next event. Using an exception specification is either useless or it will produce a hard termination that leaves corrupted and incomplete histogram files, event-data output files and log files.
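As a minimal sketch of the recommended style, the functions below throw and propagate an exception with no exception specification anywhere. Here std::runtime_error stands in for art::Exception, and methodA stands in for the framework's top-level handler; both substitutions, and all the function names, are assumptions made so that the example compiles without the framework.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Lowest level: throws directly. Note that there is no exception specification.
void methodD(bool badThingHappens) {
  if (badThingHappens) {
    throw std::runtime_error("Informational message.");
  }
}

// Intermediate levels: they throw only by propagation, again with no specification.
void methodC(bool bad) { methodD(bad); }
void methodB(bool bad) { methodC(bad); }

// Top level: plays the role of the framework, which catches and reports.
std::string methodA(bool bad) {
  try {
    methodB(bad);
  } catch (const std::exception& e) {
    return std::string("caught: ") + e.what();
  }
  return "no exception";
}

// Small self-check of both paths.
void demo() {
  assert(methodA(true)  == "caught: Informational message.");
  assert(methodA(false) == "no exception");
}
```

The exception travels from methodD through methodC and methodB to the handler in methodA without any of the intermediate functions mentioning it, which is the behaviour the framework relies on.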

The obscure circumstance in which an exception specification is required is this: if your code inherits from a Standard Library class, and that class has functions with nothrow specifications, then your derived class's functions must have them too.

The links below take you to discussions of this in the Computer Science literature.


Compiler Generated Methods

Consider the following class:

   // Hit.hh
   class Hit{
   public:
     Hit():_pos(), _dir(){}

     Hit( Hep3Vector pos, Hep3Vector dir):
        _pos(pos), _dir(dir){}

     // Accept compiler generated, d'tor, copy c'tor and assignment operator.

     const Hep3Vector& position()  const { return _pos; }
     const Hep3Vector& direction() const { return _dir; }

   private:
     Hep3Vector _pos, _dir;
   };
As the comment suggests, this class has three methods that are generated by the compiler:
  1. The destructor.
  2. A copy constructor.
  3. The assignment operator (= operator).
For this particular class these three methods will do the correct thing: the compiler generated destructor will call the destructor of each data member; the compiler written copy constructor and assignment operator will call the corresponding methods of each data member.

With these additional methods, the above class allows code like:

   Hit A, B( Hep3Vector(0.,0.,0.), Hep3Vector(0.,0.,1.) );

   Hit C(B);   // Copy constructor
   A = B;      // Assignment operator

If you know that the compiler generated methods will do the correct thing, then we recommend that you let the compiler generate these methods.

The compiler written methods will do the correct thing if the data members are "Plain Old Data" (POD); they will also do the correct thing if the data members are objects which themselves have only data members that are PODs; and so on, recursively. In the above example, Hep3Vector is a POD. So it is safe to have data members of type Hep3Vector. If a data member is a std::vector<T>, where T satisfies the previous constraints, the compiler written code will do a deep copy of the vector. If this is the required behaviour, then let the compiler write the code.

In general the compiler written methods will do the wrong thing if your data members include objects that manage external resources, for example most kinds of pointers, or objects that are file streams.

A useful rule of thumb is called the "Rule of Three": if you discover that you need to write any of these three functions (presumably because the compiler would do the wrong thing), then write all three of them.

With the release of C++11, classes can have move-aware behaviour. For such classes the compiler is also able to write two additional methods, a move-aware constructor and a move-aware assignment operator, and the rule of three becomes the rule of five: if you need to write any of the five functions, then write all five.
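To make the Rule of Three concrete, here is a sketch of a class for which the compiler written methods would do the wrong thing, because it manages a heap array through a bare pointer. The class Waveform and its interface are hypothetical and exist only for this illustration; in real code a std::vector<double> data member would remove the need to write any of the three functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical class that owns a heap array through a bare pointer.
class Waveform {
public:
  explicit Waveform(std::size_t n) : _n(n), _data(new double[n]()) {}

  // 1. Destructor: release the resource.
  ~Waveform() { delete[] _data; }

  // 2. Copy constructor: deep copy, so the two objects do not share storage.
  Waveform(const Waveform& rhs) : _n(rhs._n), _data(new double[rhs._n]) {
    std::copy(rhs._data, rhs._data + rhs._n, _data);
  }

  // 3. Assignment operator: copy first, then swap; safe for self-assignment.
  Waveform& operator=(const Waveform& rhs) {
    Waveform tmp(rhs);
    std::swap(_n, tmp._n);
    std::swap(_data, tmp._data);
    return *this;   // tmp's destructor frees the old storage
  }

  std::size_t   size() const { return _n; }
  double&       at(std::size_t i)       { return _data[i]; }
  const double& at(std::size_t i) const { return _data[i]; }

private:
  std::size_t _n;
  double*     _data;
};

// Small self-check: copies must be independent of the original.
void demo() {
  Waveform a(3);
  a.at(0) = 1.5;

  Waveform b(a);      // copy constructor
  b.at(0) = 2.5;      // must not change a
  assert(a.at(0) == 1.5);

  Waveform c(1);
  c = b;              // assignment operator
  assert(c.size() == 3 && c.at(0) == 2.5);
}
```

Had the compiler generated the copy constructor and assignment operator, a and b would share one array and both destructors would try to delete it.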


Comparison between Signed and Unsigned Types

When the appropriate compiler warnings are enabled, the compiler will warn about comparisons between signed and unsigned integer types. Such code will work correctly so long as the expression of signed type is guaranteed to never be negative and the expression of unsigned type is guaranteed to never exceed the maximum value of the signed type.

Loop Indices

The following code fragment,

  std::vector<T> v;
  // code  to fill v
  for ( int i=0; i<v.size(); ++i){
     // do something with each element
  }
will generate a compiler diagnostic of the form,
warning: comparison between signed and unsigned integer expressions
The issue is that v.size() returns an unsigned type while i is a signed type. The recommended solution is to change the type of i:
  for ( std::size_t i=0; i<v.size(); ++i){
  }
To be pedantic, the correct data type for i is std::vector<T>::size_type. But in all implementations we know of, this is just a typedef to std::size_t and the visual impact of the full type is sufficiently distracting that we recommend just std::size_t.

String Lengths in Indices

The following code fragment looks inside a string for the presence of a substring delimited by a pair of open and close braces.

      int iopen  = Value.find("{");
      int iclose = Value.find_last_of("}");
      if ( ( iopen  == string::npos ) ||
           ( iclose == string::npos ) ||
           ( iclose < (iopen+1) ) ){
      }
It will generate a compiler diagnostic of the form,
warning: comparison between signed and unsigned integer expressions
There are several issues here. The two find methods return an unsigned type but the compiler will correctly convert it, without diagnostic, to a signed type as requested in the first two lines. The diagnostic is generated by the comparison to string::npos, which is an unsigned type. The recommended form is,
      std::string::size_type iopen  = Value.find("{");
      std::string::size_type iclose = Value.find_last_of("}");
      if ( ( iopen  == string::npos ) ||
           ( iclose == string::npos ) ||
           ( iclose < (iopen+1) ) ){        // Correct.  See below!!
      }
It would also be acceptable to write the type of iopen and iclose as std::size_t.

This example illustrates another point. One might have written the last line as

   ( iclose-1 < iopen )   // Unsafe
but that would have been a mistake. Consider the case that iclose=0 and iopen is either zero or almost any positive value. This code will fail because, under the rules of arithmetic with unsigned variables, the expression (iclose-1) evaluates to a large positive value! One can always avoid subtraction of unsigned types by changing to addition on the other side of the comparison. If previous logic has ensured that the subtraction is safe, and if the code reads much more naturally with subtraction, there is a case for writing the code using the subtraction. I strongly prefer we not do this: what if someone unwittingly removes the safety checks? If you decide there is a good reason to write your code this way, add a comment to explain why it is safe and add a comment to the safety checks to say that downstream code depends on them. Having said this, I expect that I have old code around that violates this rule; I will fix these as I encounter them.
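The wrap-around described above can be checked directly. The sketch below wraps the recommended form in a function and places the unsafe and safe versions of the final comparison side by side; the function names are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <limits>
#include <string>

// True if value contains a '{' that is followed, somewhere later, by a '}'.
bool hasBracedSubstring(const std::string& value) {
  std::string::size_type iopen  = value.find("{");
  std::string::size_type iclose = value.find_last_of("}");
  if ( ( iopen  == std::string::npos ) ||
       ( iclose == std::string::npos ) ||
       ( iclose < (iopen+1) ) ) {
    return false;
  }
  return true;
}

// Unsafe form of the last test: the subtraction wraps around when iclose is 0.
bool unsafeCloseBeforeOpen(std::string::size_type iopen,
                           std::string::size_type iclose) {
  return iclose - 1 < iopen;
}

// Safe form: addition on the other side of the comparison.
// (Assumes iopen is not npos, as guaranteed by the earlier checks above.)
bool safeCloseBeforeOpen(std::string::size_type iopen,
                         std::string::size_type iclose) {
  return iclose < iopen + 1;
}

// Small self-check of the case discussed in the text: iclose=0, iopen=1.
void demo() {
  const std::string::size_type zero = 0;
  assert(zero - 1 == std::numeric_limits<std::string::size_type>::max());
  assert(!unsafeCloseBeforeOpen(1, 0));  // wrong answer
  assert( safeCloseBeforeOpen(1, 0));    // right answer
}
```

For the input "}{" the close brace is at index 0 and the open brace at index 1, so the unsafe comparison would wrongly conclude that the braces are in the correct order.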


Unused Variables

In order to get clean builds we must not have any unused variables in our code. This section discusses a few things we can do to avoid this. One large class of unused variables is those used for debugging; it is acceptable practice to simply comment these out. There are lots of other alternatives involving compile time flags. Whatever you choose, do it consistently and make sure that the production code compiles without diagnostics.

Another large class of warnings comes from code like:

G4Material* vacuum =  new G4Material( mat.name,
                                      1.,
                                      1.01 *g/mole,
                                      density,
                                      kStateGas,
                                      temperature,
                                      pressure);
where the variable vacuum is never used. In this example, when the G4Material object is created, the object registers itself with G4's material store. The material store then takes ownership of the object and manages its lifetime.

The recommended solution to this situation is to use a bare new and to comment the unusual choice:

 // G4 takes ownership of this object.
 new G4Material( mat.name,
                 1.,
                 1.01 *g/mole,
                 density,
                 kStateGas,
                 temperature,
                 pressure);
At a future date we may develop a different solution for documenting this behaviour; one option is to have a registry to hold pointers to objects that are really owned by G4. The registry would never do anything with the pointers it holds; it would not even have accessor methods. The act of registering would get rid of the compiler diagnostic and document the transfer of ownership to G4.


The Many Meanings of const

Under construction
Bernstein recommends: http://www.ibiblio.org/pub/docs/books/eckel/ Volume 1 Chapter 8

The short answer is that const is a contract between two pieces of code that one piece of code will not modify an object that is owned by another piece of code. If code breaks the contract then the result will usually be a compile-time error but the error may sometimes be delayed until link-time or load-time. The two pieces of code may even be in separate files. The long answer is below.

The Basics

The basic example of const is to show how it applies to objects:
   double x       = 5.;
   const double y = 6.;
   // ...
   x = 7.;       // OK; you may modify x.
   y  = 6.;      // Compiler error; you are not permitted to modify y.
   x = y + 1.;   // OK; you are only using y, not modifying it.

There is an alternate syntax for the second line; the position of the const has changed. Both syntaxes produce exactly the same code.
   double const y = 6.;
Const also applies to references to objects:
   double        x = 5.;
   double&       y = x;
   const double& z = x;

   y = 7;   // OK.  This changes the value of both x and y.
   z = 9.;  // Compiler error;  you are not permitted to modify z.

There is an alternate syntax for the third line; the position of the const has changed. Both syntaxes produce exactly the same code.
   double const& z = x;
With references, you may not make a non-const reference to a const object:
   const double  x = 5.;
   const double& y = x;   // OK.
   double&       z = x;   // Compiler error.
With pointers, there are four permutations of const:
   double x=42.;
   double       *        y1 = &x;   // y1 is a non-const pointer to a non-const pointee.
   double const *        y2 = &x;   // y2 is a non-const pointer to a     const pointee.
   double       * const  y3 = &x;   // y3 is a     const pointer to a non-const pointee.
   double const * const  y4 = &x;   // y4 is a     const pointer to a     const pointee.

   double z=23.;
   y1  = &z;      // Ok. y1 now points to z.
   *y1 = 13.;     // Ok. pointee now has the value 13.

   y2  = &z;      // Ok. y2 now points to z.
   *y2 = 13.;     // Compiler error.  pointee is const.

   y3  = &z;      // Compiler error. y3 is a const pointer.
   *y3 = 13.;     // OK. pointee now has the value 13.

   y4  = &z;      // Compiler error. y4 is a const pointer.
   *y4 = 13.;     // Compiler error. pointee is const.

   // As with references, pointers must obey the constness of their pointee.
   const double w = 99.;
   double       *        t1 = &w;   // Compiler error. Pointee is const and pointer must obey that.
   double const *        t2 = &w;   // OK
   double       * const  t3 = &w;   // Compiler error. Pointee is const and pointer must obey that.
   double const * const  t4 = &w;   // OK.
The way to parse pointer constness is to read right to left:
"y1 is a non-const pointer to a non-const double"
"y4 is a const pointer to a const double"

Finally, there is an alternate syntax for a const pointee:

   double x=42.;
   const double *        y2 = &x;   // y2 is a non-const pointer to a const pointee.
   const double * const  y4 = &x;   // y4 is a     const pointer to a const pointee.
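The claim that the two spellings of a const pointee produce exactly the same code can be checked at compile time. The sketch below assumes a C++11 compiler, since it uses static_assert and <type_traits>; the two helper functions are hypothetical and exist only to exercise the types.

```cpp
#include <cassert>
#include <type_traits>

// The two spellings of a const pointee name exactly the same type.
static_assert(std::is_same<const double*, double const*>::value,
              "const before or after the type: same type");
static_assert(std::is_same<const double* const, double const* const>::value,
              "const before or after the type: same type");

// A const pointer (const to the right of the *) is a different type
// from a pointer to a const pointee (const to the left of the *).
static_assert(!std::is_same<double* const, const double*>::value,
              "const pointer differs from pointer to const");

// Pointer to const pointee: the pointee may be read but not written.
double readThroughPointerToConst(const double* p) { return *p; }

// Const pointer to non-const pointee: the pointee may be written,
// but p itself may not be pointed at anything else.
void writeThroughConstPointer(double* const p, double v) { *p = v; }

// Small self-check.
void demo() {
  double x = 42.;
  writeThroughConstPointer(&x, 13.);
  assert(readThroughPointerToConst(&x) == 13.);
}
```

If any of the three static_assert lines were false the file would fail to compile, so a successful build is itself the demonstration.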

Why both "const double x" and "double const x"?

The two syntaxes are historical and come from backwards compatibility with C. To my mind the natural way to write values and references is with the const up front, while the natural way to write const-pointees is with the const in the second place:
   const double  x=42.;
   const double& y=x;
   double const * p = &x;
My thinking about pointers is that the "parse from the right rule" is easiest to use if the constness of the pointee is in the second position. In a quest for uniformity, I started to write values and references with the const in the second position. However I found that this confused too many people; because Mu2e standards and practices strongly discourage the use of bare pointers, this convention is a case of the tail wagging the dog. A secondary consideration is that the version of xemacs available on some of the Fermilab managed machines does not perform correct syntax highlighting if const is in the second place!

The Mu2e coding standard does not specify which position of const to use but does request that you be consistent within one file!

In some of the early Mu2e code, I always wrote const in the second place. I am now writing const in first place for values and references.

As an Argument to a Function

This is an example of legal code:
   // func.hh
   using CLHEP::Hep3Vector;
   void func ( Hep3Vector& v );

   // main.cc
   #include <iostream>
   #include "func.hh"
   int main(){
      Hep3Vector a(0.,0.,1.);
      func(a);
      std::cout << "a = " << a << std::endl;
   }

   // func.cc
   #include "func.hh"
   void func( Hep3Vector& v ){
     v = Hep3Vector(1.,0.,0.);
   }
In this example the function func receives its argument by reference, modifies the argument and returns. In the main program the variable a is created with some value, which is modified in func. The program will print out:
a = (1.,0.,0.)

Enforcing the contract: Case 1

Now consider modifying main.cc to make the variable "a" const:
      const Hep3Vector a(0.,0.,1.);
This will generate a compiler error that looks something like:
main.cc: In function `int main()':
main.cc:8: error: invalid initialization of reference of type 'Hep3Vector&' from expression of type 'const Hep3Vector'
func.hh:1: error: in passing argument 1 of `void func(Hep3Vector&)'
This error occurs when compiling main.cc. At this time the compiler only knows about main.cc and func.hh; it does not know whether or not func.cc actually modifies its argument; it only knows, from the header, that func is permitted to modify its argument. The compiler also knows that a, once created, may never be modified. So it will refuse to call func.

Breaking the contract: Case 2

   // func.hh
   using CLHEP::Hep3Vector;
   void func ( const Hep3Vector& v );

   // main.cc
   #include "func.hh"
   int main(){
      Hep3Vector a1(0.,0.,1.);
      func(a1);

      const Hep3Vector a2(0.,0.,1.);
      func(a2);
   }

   // func.cc
   #include "func.hh"
   void func( const Hep3Vector& v ){
     v = Hep3Vector(1.,0.,0.);             // Compile time error; you are not permitted to modify v.
   }

In this case, main.cc will compile without incident. The header func.hh tells the compiler that func does not modify its arguments. So the compiler may pass either a1 or a2 as arguments to func, even though one is const and one is not. On the other hand, when it tries to compile func.cc, the compiler will issue an error like the following:
func.cc: In function `void func(const Hep3Vector&)':
func.cc:4: error: assignment of read-only reference `v'
This says that the function func broke its contract by trying to modify a const argument.

As in the previous section, the const may come before or after the type,

   // func.hh
   using CLHEP::Hep3Vector;
   void func ( Hep3Vector const& v );

Similar comments apply if the argument is passed into func as a pointer to const Hep3Vector:

   // func.cc
   #include "func.hh"
   void func( Hep3Vector const *v ){
     *v = Hep3Vector(1.,0.,0.);             // Compile time error; you are not permitted to modify *v.
   }

Mismatched declaration and definition: Case 3

   // func.hh
   using CLHEP::Hep3Vector;
   void func ( const Hep3Vector& v );

   // main.cc
   #include "func.hh"
   int main(){
      Hep3Vector a(0.,0.,1.);
      func(a);
   }

   // func.cc
   #include "func.hh"
   void func( Hep3Vector& v ){
     v = Hep3Vector(1.,0.,0.);
   }
This will generate a link-time ( or possibly load-time ) error that looks something like,
In function `main':
: undefined reference to `func(Hep3Vector const&)'
This message is generated because the linker understands the following two functions to be distinct.
   void func ( const Hep3Vector& t );
   void func ( Hep3Vector& t );
In the example, the main program knows about the first function, because its header is included. It does not, however, know about the second function because that function is never declared within main.cc ( either directly or by being included ). The main program understands that the first function will satisfy its needs and the compiled file main.o will contain a request that the linker find the first function and link it in. When the linker looks within func.o it can only find the second function; therefore it reports an error.

Why did func.cc compile successfully? To answer this, consider the file,

   // func.cc
   void func( Hep3Vector& v ){
     v = Hep3Vector(1.,0.,0.);
   }
This will compile successfully. There is no rule that says that a function definition must find a preceding function declaration. If this file happens to include a header that declares functions that are not used by func, that is also OK; they are simply ignored.
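That the two signatures really are distinct functions can be seen by defining both in one file and letting the compiler choose between them. In this sketch Vec3 is a hypothetical stand-in for Hep3Vector, used so that the example compiles without CLHEP, and the functions return a string only so that the choice is visible.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for Hep3Vector.
struct Vec3 { double x, y, z; };

// Two distinct functions: the constness of the referenced type
// is part of the signature that the linker sees.
std::string func(Vec3& v)     { v.x = 1.; return "func(Vec3&)"; }
std::string func(const Vec3&) { return "func(const Vec3&)"; }

// Small self-check: the compiler picks the overload from the constness
// of the argument.
void demo() {
  Vec3       a = {0., 0., 1.};
  const Vec3 b = {0., 0., 1.};
  assert(func(a) == "func(Vec3&)");        // non-const argument
  assert(a.x == 1.);                        // and it was modified
  assert(func(b) == "func(const Vec3&)");  // const argument
}
```

In Case 3 only the second of these was declared to main.cc and only the first was defined in func.cc, so the linker had nothing to satisfy the request.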

Return types and const Member Functions: Case 4

   // Hit.hh
   class Hit{
   public:
     Hit():_channelNumber(-1),_pos(){}

     Hit( int chan, const Hep3Vector& pos):
        _channelNumber(chan), _pos(pos){}

     // Accept compiler generated, d'tor, copy c'tor and assignment operator.

     // Accessors.
     int channel() const { return _channelNumber;}
     const Hep3Vector& position()  const { return _pos; }

     void setPosition( const Hep3Vector& pos){
        _pos=pos;
     }

   private:
     int _channelNumber;
     Hep3Vector _pos;
   };

   // main.cc
   #include "Hit.hh"
   int main() {

      Hit hit1( 42, Hep3Vector(0.,0.,1.) );

      const Hep3Vector& v1 = hit1.position();  // OK
      Hep3Vector&       v2 = hit1.position();  // Compiler error

      Hep3Vector v3 = hit1.position();         // OK since it makes a copy.

      hit1.setPosition ( Hep3Vector(1.,0.,0.) );   // OK

      const Hit hit2( 42, Hep3Vector(0.,0.,1.) );

      const Hep3Vector& v4 = hit2.position();      // OK
      hit2.setPosition ( Hep3Vector(1.,0.,0.) );   // Compiler error

   }
This example contains a toy class that might represent a hit in some detector; it holds a channel number and a position in 3-space. This example illustrates two more uses of const:
  1. The accessor function position() returns its information by a const reference to a private data member.
  2. The two accessor functions have a const after the () and before the opening {.
Note that the channel number is returned by value, not by const reference. The short answer is that an int is a "small object" while a Hep3Vector is a "large" object for which we have to consider the overhead of copying it. This is discussed in more detail in the next section.

The first use of const is necessary because we want users of a Hit object to be able to see, but not modify, its internal state.
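The second use of const, the one after the (), can be sketched as follows: a member function marked const may be called on both const and non-const objects, while an unmarked one may be called only on non-const objects. Vec3 is a hypothetical stand-in for Hep3Vector so that the example compiles without CLHEP, and zOf is a hypothetical helper.

```cpp
#include <cassert>

// Hypothetical stand-in for Hep3Vector.
struct Vec3 { double x, y, z; };

class Hit {
public:
  Hit(int chan, const Vec3& pos) : _channelNumber(chan), _pos(pos) {}

  // const member functions: callable on const and non-const Hit objects.
  int         channel()  const { return _channelNumber; }
  const Vec3& position() const { return _pos; }

  // Non-const member function: callable only on non-const Hit objects.
  void setPosition(const Vec3& pos) { _pos = pos; }

private:
  int  _channelNumber;
  Vec3 _pos;
};

// Inside this function hit is const, so only the const members may be used.
double zOf(const Hit& hit) { return hit.position().z; }

// Small self-check.
void demo() {
  Vec3 p = {0., 0., 1.};
  Vec3 q = {0., 0., 2.};

  Hit hit(7, p);
  assert(zOf(hit) == 1.);
  hit.setPosition(q);          // OK: hit is not const
  assert(zOf(hit) == 2.);

  const Hit frozen(8, p);
  assert(frozen.channel() == 8);
  // frozen.setPosition(q);    // would be a compiler error: frozen is const
}
```

Without the trailing const on channel() and position(), the calls inside zOf and on frozen would not compile.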

More about return types: Case 5

   // Hit.hh
   class Hit{
   public:
     Hit():_channelNumber(-1),_pos(), _dir(){}

     Hit( int chan, Hep3Vector pos, Hep3Vector dir):
        _channelNumber(chan), _pos(pos), _dir(dir){}

     // Accept compiler generated, d'tor, copy c'tor and assignment operator.

     const Hep3Vector& position()  const { return _pos; }
     const Hep3Vector& direction() const { return _dir; }

     int channel() const { return _channelNumber;}

   private:

     int _channelNumber;
     Hep3Vector _pos, _dir;
   };

   // main.cc
   #include "Hit.hh"
   int main() {

      Hit hit( 42, Hep3Vector(0.,0.,0.), Hep3Vector(0.,0.,1.) );

      const Hep3Vector& v1 = hit.position();  // OK
      Hep3Vector&       v2 = hit.position();  // Compiler error

      Hep3Vector        v3 = hit.position();  // OK.  Makes a copy.
   }
In this example the class Hit allows users to look at its member data but not to modify them. This is a very common pattern found in Mu2e classes that are part of the event model, the geometry data or the conditions data. In the case of event data, once data has been added to the event it may never be modified; this is required so that the audit trail of who created which data product is not violated. In the case of geometry data, the geometry service is responsible for keeping the geometry up to date; users of the geometry information may only view the geometry data, not modify it. A similar situation is true for conditions data. The Mu2e classes have been designed so that the compiler will spot attempts to evade these rules and will give compile time errors for illegal operations.

There is a second use of const in this example, the const that follows the name of the two accessor functions, position() and direction(). This will be discussed in the next section.

There are a variety of choices for the return type of the accessor functions:

  1. Return by value ( make a copy ).
  2. Return by const reference.
  3. Return by pointer to const.
  4. Return by some sort of smart pointer to const.
This list excludes things like non-const reference and pointer to non-const that would allow the user to modify the data inside of the class Hit.

The first option was chosen for returning the channel number, while the second option was chosen for returning the position and direction. The reasoning is that, if an object is "small", then return it by value and, if an object is "large", then return it by one of the other three types. This is an efficiency argument: it can be expensive in both memory and CPU to make a copy of a large object; therefore, grant access via some sort of pointer type ( a reference qualifies as a pointer type in this sense ) and use const to ensure that the member data cannot be modified. In usual practice, small objects include the built in data types plus objects that use no more memory than does a pointer (4 bytes on a 32 bit machine and 8 bytes on a 64 bit machine).
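The difference between the first two options can be observed from the caller's side. In this sketch Vec3 is a hypothetical stand-in for Hep3Vector, and the two helper functions are hypothetical names chosen for the illustration: one checks that return by const reference gives access to the member itself, the other that a caller can still make an independent copy when one is wanted.

```cpp
#include <cassert>

// Hypothetical stand-in for Hep3Vector.
struct Vec3 { double x, y, z; };

class Hit {
public:
  Hit(int chan, const Vec3& pos) : _channelNumber(chan), _pos(pos) {}

  int         channel()  const { return _channelNumber; }  // small: by value
  const Vec3& position() const { return _pos; }             // large: by const reference

private:
  int  _channelNumber;
  Vec3 _pos;
};

// True if two calls to position() refer to the same object inside hit,
// that is, if no copy is made by the accessor.
bool positionIsNotCopied(const Hit& hit) {
  const Vec3& a = hit.position();
  const Vec3& b = hit.position();
  return &a == &b;
}

// True if a copy made by the caller can be changed without changing hit.
bool copyIsIndependent(const Hit& hit) {
  Vec3 copy = hit.position();   // the caller asks for a copy explicitly
  copy.x += 1.;
  return copy.x != hit.position().x;
}

// Small self-check.
void demo() {
  Vec3 p = {0., 0., 1.};
  Hit hit(7, p);
  assert(positionIsNotCopied(hit));
  assert(copyIsIndependent(hit));
  assert(hit.position().x == 0.);   // the member is unchanged
}
```

The cost of a copy is therefore paid only by callers that ask for one, while the const on the returned reference keeps the member data safe from modification.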

This brings us to the decision of which pointer type to use. If this Hit class were a top-level object, something that one gets directly either from a Service or from the Event Data Model (EDM), then it would be appropriate to use some sort of smart pointer type that protects the user from the object not being there at all. Instead I imagine that a class representing a single hit will be inside a top-level object.


Best Practices

Marc Paterno's talk from the August 2012 Workshop


Still to come:

  1. return argument constness
  2. Pointless-ness of const values as return types.
  3. const member functions
  4. const and stl containers. const vector can return only a const T& and const T*.
  5. const is not deep
  6. const is viral
  7. mutable - use sparingly.
  8. Advanced topic: reference to const temp as an argument or a return type. But not ref to non-const temp.
  9. Advanced topic: overloaded pairs of functions. Signature does not include the return type but does include method constness.
   // func.hh
   using CLHEP::Hep3Vector;
   void func ( Hep3Vector  v );
   void func ( Hep3Vector& v );
   void func ( const Hep3Vector& v );

   // main.cc
   #include "func.hh"
   int main(){
      Hep3Vector v(0.,0.,1.);
      func(v);
      func(v);
      func(v);
   }



For web related questions: Mu2eWebMaster@fnal.gov.
For content related questions: kutschke@fnal.gov
This file last modified Wednesday, 20-Mar-2013 21:58:32 CDT