
Save duplicate questions from disappearing from Google

Is any type convertible to void?

Question

All of the following expressions compile fine:

void();
void(5);
void("hello");
void(std::string("world"));
void(std::vector<int>{1, 2, 3});

Does this mean that any type can be converted to void? That is, is void constructible from an argument of any type?

However, the following function doesn't compile:

void func()
{
    return std::vector<int>{1, 2, 3};
}

But if I convert the return value to void explicitly, it compiles:

void func()
{
    return void(std::vector<int>{1, 2, 3});
}

Does it mean that all void constructors are explicit?

The answer

Yes, a value of any type can be explicitly converted to void.

Section 3.9.1/9, N3797:

The void type has an empty set of values. The void type is an incomplete type that cannot be completed. It is used as the return type for functions that do not return a value. Any expression can be EXPLICITLY converted to type cv void (5.4). An expression of type void shall be used only as an expression statement (6.2), as an operand of a comma expression (5.18), as a second or third operand of ?: (5.16), as the operand of typeid, noexcept, or decltype, as the expression in a return statement (6.6.3) for a function with the return type void, or as the operand of an explicit conversion to type cv void.
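The rule quoted above can be sketched as follows, assuming C++11 or later. The names g_calls, next_id, discard_results and forward_void are hypothetical and exist only for this illustration:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical counter and helper, used only to observe that operands are evaluated.
int g_calls = 0;
int next_id() { return ++g_calls; }

void discard_results()
{
    // Each operand is evaluated, then its value is explicitly discarded.
    void(42);                                      // functional-style cast
    (void)next_id();                               // C-style cast
    static_cast<void>(next_id());                  // static_cast
    static_cast<void>(std::string("world"));
    static_cast<void>(std::vector<int>{1, 2, 3});
}

void forward_void()
{
    // An expression of type void is allowed in the return statement
    // of a function whose return type is void.
    return void(next_id());

    // By contrast, `return next_id();` would not compile here:
    // there is no implicit conversion from int to void.
}
```

Calling discard_results() evaluates next_id() twice and forward_void() evaluates it once more, so the conversion discards the value but not the side effects.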

In-class initialization of vector with size

Question

struct Uct
{
    std::vector<int> vec{10};
};

The code above creates a vector that contains a single element with value 10. But I need to initialize the vector with size 10 instead. Just like this:

std::vector<int> vec(10);

How can I do this with in-class initialization?

Answer

I think there are 2 answers:

std::vector<int> vec = std::vector<int>(10);

as said in the comments and:

std::vector<int> vec{0, 0, 0, 0, 0, 0, 0, 0, 0, 0};

This is less preferable since it's less readable and harder to adjust later on, but I think it's faster (before C++17) because it doesn't invoke a move constructor, as said in the comments.

That being said, Uct() : vec(10) {} is also a perfectly viable option with the same properties (I think).
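A small sketch comparing the options side by side. The struct names BracedInit, CopyInit and CtorInit are hypothetical, chosen only to keep the three variants apart:

```cpp
#include <cassert>
#include <vector>

struct BracedInit
{
    // Braces select the initializer_list constructor: one element with value 10.
    std::vector<int> vec{10};
};

struct CopyInit
{
    // Parentheses select the size constructor: ten value-initialized ints.
    std::vector<int> vec = std::vector<int>(10);
};

struct CtorInit
{
    // The member initializer list also reaches the size constructor.
    CtorInit() : vec(10) {}
    std::vector<int> vec;
};
```

BracedInit().vec.size() is 1, while CopyInit().vec.size() and CtorInit().vec.size() are both 10.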

Why does C++ have such weird inheritance rules when using template base classes?

Question

template<typename T>
class B {
public:
  class Xyz { /*...*/ };  // Type nested in class B<T>
  typedef int Pqr;        // Type nested in class B<T>
};

template<typename T>
class D : public B<T> {
public:
  void g()
  {
    Xyz x;  // Bad (even though some compilers erroneously (temporarily?) accept it)
    Pqr y;  // Bad (even though some compilers erroneously (temporarily?) accept it)
  }
};

The result:

main.cpp: In member function 'void D<T>::g()':
main.cpp:13:5: error: 'Xyz' was not declared in this scope
   13 |     Xyz x;  // Bad (even though some compilers erroneously (temporarily?) accept it)
      |     ^~~
main.cpp:14:5: error: 'Pqr' was not declared in this scope
   14 |     Pqr y;  // Bad (even though some compilers erroneously (temporarily?) accept it)
      |     ^~~

Actually, I know why (isocpp.org/wiki/faq):

This might hurt your head; better if you sit down.

Within D<T>::g(), name Xyz and Pqr do not depend on template parameter T, so they are known as a nondependent names. On the other hand, B<T> is dependent on template parameter T so B<T> is called a dependent name.

Here’s the rule: the compiler does not look in dependent base classes (like B<T>) when looking up nondependent names (like Xyz or Pqr). As a result, the compiler does not know they even exist let alone are types.

It really hurts my head. And what I'm asking is: why do we have such a strange rule?

Answer

You can find a rather long answer at the link above.
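For reference, the usual workaround is to make the names dependent, so that lookup is postponed until the template is instantiated. A minimal sketch, assuming C++11; the members value and member are hypothetical additions to the classes from the question, and g() returns int here only so the result can be checked:

```cpp
#include <cassert>

template<typename T>
class B {
public:
  class Xyz { public: int value = 1; };  // Type nested in class B<T>
  typedef int Pqr;                       // Type nested in class B<T>
  int member = 7;                        // Hypothetical data member
};

template<typename T>
class D : public B<T> {
public:
  int g()
  {
    // Qualifying with B<T>:: makes the names dependent; typename tells
    // the compiler that they name types.
    typename B<T>::Xyz x;
    typename B<T>::Pqr y = 2;

    // this-> makes `member` a dependent name as well.
    return x.value + y + this->member;
  }
};
```

With these changes D<int>().g() compiles and returns 10.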

Wouldn't it be possible to use just dot (.) to access members of namespace and static members of a class

Question

In C++ we use double colon (::) to access members of namespace, use dot (.) to access members of class/structure and use arrow (->) to access members of class/structure via pointer.

Wouldn't it be possible to use just dot (.)? E.g. my_namespace.my_class.my_static_pointer.my_member. Why are separate lexemes used? Would there be any problems with syntax, if only dot (.) were used for all three cases?

Answers

Answer #1

As observed by Jules it's a fact that early C++ implementations (CFront pre-1.0) had a dot for scope identification.

A dot was also used in C with Classes (1980). Indeed, this is a simple snippet from Classes: An Abstract Data Type Facility for the C Language [1]:

class stack {
    char    s[SIZE];  /* array of characters */
    char *  min;      /* pointer to bottom of stack */
    char *  top;      /* pointer to top of stack */
    char *  max;      /* pointer to top of allocated space */
    void    new();    /* initialization function (constructor) */
public:
    void push(char);
    char pop();
};

char stack.pop()
{
    if (top <= min) error("stack underflow");
    return *(--top);
}

(the code was an example of how member functions were typically defined "elsewhere")

The :: was one of the additions to C with Classes introduced to produce C++.

The reason is given by Stroustrup himself:

In C with Classes, a dot was used to express membership of a class as well as expressing selection of a member of a particular object.

This had been the cause of some minor confusion and could also be used to construct ambiguous examples. To alleviate this, :: was introduced to mean membership of class and . was retained exclusively for membership of object

(A History of C++: 1979−1991 [2] page 21 - § 3.3.1)


  1. Bjarne Stroustrup: "Classes: An Abstract Data Type Facility for the C Language" - Bell Laboratories Computer Science Technical Report CSTR−84. April 1980.

  2. Bjarne Stroustrup: "A History of C++: 1979−1991" - AT&T Bell Laboratories Murray Hill, New Jersey 07974.

Comment

It is still unclear what that minor confusion and those ambiguous examples are.

Answer #2

Moreover it's true that

they do different things, so they might as well look different

indeed

In N::m neither N nor m are expressions with values; N and m are names known to the compiler and :: performs a (compile time) scope resolution rather than an expression evaluation. One could imagine allowing overloading of x::y where x is an object rather than a namespace or a class, but that would - contrary to first appearances - involve introducing new syntax (to allow expr::expr). It is not obvious what benefits such a complication would bring.

Operator . (dot) could in principle be overloaded using the same technique as used for ->.

(Bjarne Stroustrup's C++ Style and Technique FAQ)

Answer #3

Because someone in the C++ standards committee thought that it was a good idea to allow this code to work:

struct foo
{
  int blah;
};

struct thingy
{
  int data;
};

struct bar : public foo
{
  thingy foo;
};

int main()
{
  bar test;
  test.foo.data = 5;
  test.foo::blah = 10;
  return 0;
}

Basically, it allows a member variable and a base class type to have the same name. I have no idea what someone was smoking when they thought that this was important. But there it is.

When the compiler sees ., it knows that the thing to the left must be an object. When it sees ::, it must be a typename or namespace (or nothing, indicating the global namespace). That's how it resolves this ambiguity.

Comment

"test.(foo.blah) = 10" could be used in such ambiguous cases.

Get pointer to struct by pointer to its member

Question

Is there a portable way to get a pointer to a structure if you have a pointer to a member of that structure?

There is a linked list implementation in Linux kernel (1):

struct list_head {
    struct list_head *next, *prev;
};

static inline void __list_add(struct list_head *new,
                  struct list_head *prev,
                  struct list_head *next)
{
    next->prev = new;
    new->next = next;
    new->prev = prev;
    prev->next = new;
}

static inline void list_add(struct list_head *new, struct list_head *head)
{
     __list_add(new, head, head->next);
}

#define list_entry(ptr, type, member) \
    container_of(ptr, type, member)

The idea is that this implementation is generic. You can use it with any struct type:

struct my_struct {
    int my_data;

    struct list_head node;
};

void example()
{
    struct list_head head;
    struct my_struct element1 = { 1 };
    struct my_struct element2 = { 2 };

    head.next = head.prev = &head;   // head <-> head
    list_add(&element1.node, &head); // head <-> {1} <-> head
    list_add(&element2.node, &head); // head <-> {2} <-> {1} <-> head

    struct my_struct *front_element = list_entry(head.next, struct my_struct, node);

    printf("front element data: %d\n", front_element->my_data); // will print "2"
}

Elements of list_head are linked with each other, but there are only pointers to list_head and there are no pointers to my_struct (that contains list_head inside). However having a pointer to the node member of my_struct, you can convert this pointer to a pointer to my_struct itself using the list_entry macro. This is done using tricky pointer arithmetic (the offset of the member in the struct is subtracted from the address of the member).

But the implementation of the container_of macro is not portable, because it uses GCC extensions and dereferences a null pointer (which is generally UB):

#define container_of(ptr, type, member) ({            \
    const typeof( ((type *)0)->member ) *__mptr = (ptr);    \
    (type *)( (char *)__mptr - offsetof(type,member) );})

Is there a way to make a portable implementation of this macro?

Answer

Use the offsetof macro that comes with the compiler (declared in <stddef.h>; not UB).

#define container_of(ptr, type, member) \
                      ((type *) ((char *)(ptr) - offsetof(type, member)))

That looks clean to me. It's only spread across two lines for SO.
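A quick check of the macro, compiled as C++ here (the original context is C). This is a sketch that assumes my_struct is a standard-layout type, since offsetof is only guaranteed for such types in C++. The helper entry_from_node is hypothetical:

```cpp
#include <cassert>
#include <cstddef>   // offsetof

struct list_head {
    list_head *next, *prev;
};

struct my_struct {
    int my_data;
    list_head node;
};

// The portable variant from the answer: step back from the member's
// address by the member's offset within the enclosing struct.
#define container_of(ptr, type, member) \
    ((type *) ((char *)(ptr) - offsetof(type, member)))

// Hypothetical helper: recovers the enclosing my_struct from its node member.
inline my_struct *entry_from_node(list_head *n)
{
    return container_of(n, my_struct, node);
}
```

Given an object e of type my_struct, entry_from_node(&e.node) yields &e again.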

git fake merge (marking a commit as merged without a real merge)

Question

Assume I have the following history in my repository:

     E--F
    /
A--B--C---D

And I want to modify it to be this:

     E--F
    /    \
A--B--C---D

I don't want to modify file content of revisions, I just want to "draw a merge arrow". How can I do this?

I tried to do this by the following commands:

git checkout D
git merge F -s ours --no-ff --no-commit
git commit --amend

But I got the following error message:

fatal: You are in the middle of a merge -- cannot amend.

I want to leave the commit information unchanged (author, date, message). The only thing I want to change is to add pseudo merge-info to the commit.

Answer

This worked for me:

git checkout D~1
git merge F -s ours --no-ff --no-commit
git cherry-pick D -n
git commit -C D
git branch -f D
git checkout D

Member access operator and indirection operator difference

Question

When overloaded, the indirection operator (*obj) must return a reference, but the member access operator (obj->) must return a pointer.

struct my_ptr
{
    some_type & operator* () { return *ptr; } // returns reference
    some_type * operator->() { return  ptr; } // returns pointer

    some_type *ptr;
};

Why is there such difference?

Answers

You can find some answers at the link above. But they don't completely answer why C++ was designed in such a way that operator* and operator-> return different types.

https://stackoverflow.com/a/28435758/5447906

I guess this was just the only way they could think of to implement it and it turned out a bit hackish.

operator-> allows recursively traversing multiple objects until a real pointer is reached:

struct A {
    int foo, bar;
};

struct B {
    A a;
    A *operator->() { return &a; }
};

struct C {
    B b;
    B operator->() { return b; }
};

struct D {
    C c;
    C operator->() { return c; }
};
D d;
d->bar; // Provides access to A::bar.

Someone might consider this as a useful feature. But I personally do not. Consistency and obviousness would be a bigger advantage for me.

In which cases should "std::move" be used in "return" statements and in which shouldn't?

Question

There are many similar questions here. All of them ask about usage of std::move in return in specific cases. But I want to know when std::move should (or shouldn't) be used in a return statement in general.

Here I found the following answer:

All return values are already moved or else optimized out, so there is no need to explicitly move with return values.

Compilers are allowed to automatically move the return value (to optimize out the copy), and even optimize out the move!

So I expected that std::move must never be used in a return statement, because the compiler will optimize it in any case. But I decided to check this and wrote the following test code:

#include <iostream>

class A
{
public:
    A() {}
    A(const A  & a) { std::cout << "A(const A  &)" << std::endl; }
    A(      A && a) { std::cout << "A(      A &&)" << std::endl; }
};

A func1(bool first)
{
    A a1, a2;
    return (first ? a1 : a2);
}

int main()
{
    A a1(func1(true));
    return 0;
}

And here is what I got:

A(const A  &)

So the compiler did not automatically move the value in the return statement, and I had to move it manually:

return std::move(first ? a1 : a2);

This returned:

A(      A &&)

However, when I rewrote my code this way:

A func2(bool first)
{
    A a1, a2;
    if (first) return a1; else return a2;
}

int main()
{
      A a2(func2(true));
      return 0;
}

I found that automatic move is working:

A(      A &&)

Next test:

A func3(A &&a)
{
    return a;
}

int main()
{
    A a3(func3(A()));
    return 0;
}

Result:

A(const A  &)

So here return std::move(a) must be used.

My understanding is as follows. The compiler automatically moves (or even applies RVO to) only local variables and parameters passed by value (which are effectively local variables too). Rvalue references passed as parameters must be moved manually (probably because the compiler doesn't know whether the referenced object will be used anywhere later). Also, the compiler uses a move only in a “pure” return statement (return var;) and can’t move when returning an expression that uses local variables (return <some expression using var>, e.g. return somefunc(var);).

So std::move should not be used in pure "return local variable" statements (using std::move in this case prevents Return Value Optimization). In all other cases std::move should be used in return.

Is my understanding right?

Answers

My answer

When you return a local variable or a parameter passed to your function by value, you don't need to use std::move. It is even better not to use std::move, because without it the compiler may apply return value optimization, which is even more efficient than moving.

In this case:

return (first ? a1 : a2);

you do not return a local variable; you return an expression whose result is first computed by copying a1 or a2, and then this result is returned (using return value optimization). That is why the move does not happen automatically here.

In case you return a parameter passed by reference:

A func3(A &&a)
{
    return a;
}

A named rvalue reference is treated as an lvalue in expressions until you convert it back to an rvalue with std::move. That is why you have to use std::move. But this reportedly changed in C++20, where std::move is no longer needed in such a case.

Answer #1

If you're returning a local variable, don't use move(). This will allow the compiler to use NRVO, and failing that, the compiler will still be allowed to perform a move (local variables become R-values within a return statement). Using move() in that context would simply inhibit NRVO and force the compiler to use a move (or a copy if move is unavailable). If you're returning something other than a local variable, NRVO isn't an option anyway and you should use move() if (and only if) you intend to pilfer the object.

Answer #2

It's quite simple.

return buffer;

If you do this, then either NRVO will happen or it won't. If it doesn't happen then buffer will be moved from.

return std::move( buffer );

If you do this, then NRVO will not happen, and buffer will be moved from.

So there is nothing to gain by using std::move here, and much to lose.


There is one exception* to the above rule:

Buffer read(Buffer&& buffer) {
    //...
    return std::move( buffer );
}

If buffer is an rvalue reference, then you should use std::move. This is because references are not eligible for NRVO, so without std::move it would result in a copy from an lvalue.

This is just an instance of the rule "always move rvalue references and forward universal references", which takes precedence over the rule "never move a return value".

* As of C++20 this exception can be forgotten. Rvalue references in return statements are implicitly moved from, now.
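The three cases discussed above can be checked by counting constructor calls instead of printing. This is only a sketch; the type Counted and the function names are hypothetical. Move counts are deliberately not compared exactly, because they depend on the language version and on copy elision:

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts its copy and move constructions.
struct Counted
{
    static int copies;
    static int moves;
    Counted() {}
    Counted(const Counted &) { ++copies; }
    Counted(Counted &&) { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

Counted from_conditional(bool first)
{
    Counted a1, a2;
    // The operand is an lvalue expression, not the name of a local
    // variable, so the return value is copy-constructed.
    return (first ? a1 : a2);
}

Counted from_branches(bool first)
{
    Counted a1, a2;
    // Each return names a local variable directly: moved (or elided), never copied.
    if (first) return a1; else return a2;
}

Counted from_rvalue_ref(Counted &&a)
{
    // Required before C++20 to avoid a copy; still valid afterwards.
    return std::move(a);
}
```

After resetting the counters, from_conditional(true) performs exactly one copy, while from_branches(true) and from_rvalue_ref(Counted()) perform none.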

Is null pointer a valid iterator?

Question

I'm implementing std::vector-like class. Its iterators are just plain pointers. The problem is that if the vector is empty (there is no allocated buffer for elements) it returns nullptr as begin and end iterators. Are null pointers valid iterators?

I tried to learn std::vector source code in GNU C++ standard library. It looks like they use the same approach with null pointers.

But what about using std::distance with null pointers? It is defined as last - first for random access iterators. But it is not valid to subtract null pointers (at least in pure C). Ok, they can compare iterators before subtracting (comparing null pointers is valid), and if they are the same, return 0 without subtracting.

But anyway, last - first is assumed to be valid expression for random access iterators, but it is not because subtracting null pointers is undefined behavior.

If null pointers can't be used as iterators, what can I use instead?

Answer

It's valid to subtract null pointers in C++. It's still unclear whether there are other reasons why a null pointer can't be a valid iterator. There are probably none. If you are still interested, vote to reopen the question or write your opinion at meta.stackoverflow.com.
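A minimal sketch of an empty container whose begin and end are both null, assuming C++11. The type EmptyIntVec and the helpers element_count and sum are hypothetical. As far as I can tell from [expr.add], both adding 0 to a null pointer and subtracting two null pointers are well-defined in C++:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>

// Hypothetical minimal container whose iterators are raw pointers.
struct EmptyIntVec
{
    int *data = nullptr;       // no buffer is allocated while empty
    std::size_t count = 0;

    int *begin() const { return data; }
    int *end() const { return data + count; }   // nullptr + 0 stays null
};

inline std::ptrdiff_t element_count(const EmptyIntVec &v)
{
    // For pointers this evaluates last - first, i.e. null - null here.
    return std::distance(v.begin(), v.end());
}

inline int sum(const EmptyIntVec &v)
{
    // begin == end, so no element is ever dereferenced.
    return std::accumulate(v.begin(), v.end(), 0);
}
```

For a default-constructed EmptyIntVec, begin() == end(), element_count returns 0 and sum returns 0.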

Does range constructor of std::vector reserve required capacity before filling?

Question

Which of the two implementations is closer to the specification?

#1

template <typename Iter>
vector::vector(Iter first, Iter last)
{
    reserve(std::distance(first, last));
    for (auto it = first; it != last; it++)
        { push_back(*it); }
}

#2

template <typename Iter>
vector::vector(Iter first, Iter last)
{
    for (auto it = first; it != last; it++)
        { push_back(*it); }
}

The reserving implementation (#1) runs through the range twice if Iter is not a random access iterator. The first pass is in std::distance to determine the vector size; the second is in the constructor itself, filling the vector with elements.

The reserve-less implementation (#2) is always single-pass. But if the vector runs out of capacity, reallocations happen, which involve additional "pass through the range" actions. So in general, implementation #1 has better performance.

Therefore I'm almost sure that the correct answer is #1. Also, the GNU C++ standard library does it this way. But I'm still curious.

The question may be more relevant to the corresponding assign method that takes a range as a parameter because, when doing assign, the vector may already have enough capacity, so calling reserve before assigning is redundant. Therefore it is unclear whether vector assign reserves the required capacity before filling.

Answer

Yes, it's guaranteed that there will be no reallocations, because pointers are RandomAccessIterators. vector.cons/9

template <class InputIterator>
vector(InputIterator first, InputIterator last, const Allocator& = Allocator());

Effects: Constructs a vector equal to the range [first, last), using the specified allocator.

Complexity: Makes only N calls to the copy constructor of T (where N is the distance between first and last) and no reallocations if iterators first and last are of forward, bidirectional, or random access categories. It makes order N calls to the copy constructor of T and order log(N) reallocations if they are just input iterators.
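The "no reallocations" guarantee for forward iterators can be observed with a counting allocator. This is a sketch under stated assumptions: CountingAllocator and allocations_for_list_range are hypothetical names, and std::list is used as a source whose iterators are bidirectional but not random access:

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <vector>

// Hypothetical allocator that counts calls to allocate().
template <typename T>
struct CountingAllocator
{
    typedef T value_type;
    static int allocations;

    CountingAllocator() {}
    template <typename U>
    CountingAllocator(const CountingAllocator<U> &) {}

    T *allocate(std::size_t n)
    {
        ++allocations;
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T *p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};
template <typename T>
int CountingAllocator<T>::allocations = 0;

template <typename T, typename U>
bool operator==(const CountingAllocator<T> &, const CountingAllocator<U> &) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T> &, const CountingAllocator<U> &) { return false; }

// Builds a vector from a list range and reports how many times allocate() ran.
inline int allocations_for_list_range(std::size_t n)
{
    std::list<int> source(n, 1);
    CountingAllocator<int>::allocations = 0;
    std::vector<int, CountingAllocator<int> > v(source.begin(), source.end());
    assert(v.size() == n);
    return CountingAllocator<int>::allocations;
}
```

For a non-empty range, allocations_for_list_range should report a single allocation. With pure input iterators (for example std::istream_iterator) the count may be higher, in line with the order log(N) reallocations mentioned in the quoted complexity clause.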

Save duplicate questions from disappearing from Google

Duplicate questions at Stack Overflow are hidden from search engines. I save my questions here to make them indexable.

Read more info:
The author: anton_rh.