C2cpp Notes: Essential 6, Container, Inheritance
Published:
1. C to C++
Makefile
CC = g++
CXX = g++
CXXFLAGS = -g -Wall -std=c++20
CC: compiler for C source filesCXX: compiler for C++-std=c++20
hello.cpp
#include <iostream> // cpp headers don't have .h
int main() {
std::cout << "hello world c" << 2 << "cpp" << std::endl;
}
cout: console output C++ has namespace. In C, if two library have the same function names, linker can’t link
intandchar *are mixed- Left shift
<<redefined forchar *(can’t redefine forint)
Target
target: source
- Tab: recipe
CC = g++
CXX = g++
If CC = gcc, gcc will invoke linker ld with the C standard libraries, problematic!!!
- Or specify
ld hello.o libstdc++...in the recipe line
hello: hello.o
g++ hello.o -o hello
hello: hello.cpp
g++ -g -Wall -std=c++20 -c hello.c -o hello.o
.PHONY: clean
clean:
rm *.o hello
.PHONY: all
all: clean hello
Namespace
#include <iostream> // cpp headers don't have .h
int main() {
using namespace std;
cout << "hello world c" << 2 << "cpp" << endl;
}
Define your own namespace:
#include <iostream>
namespace c2cpp {
int main() {
using namespace std;
cout << "hello c" << 2 << "cpp!" << endl;
return 0;
}
} // namespace c2cpp
int main() { // global namespace
return c2cpp::main();
}
String
In standard library: string
string s;
s = "hello";
s = s + "world"
- Can’t concatenate by
s + t s = s + 100won’t work.operator+()undefineds += 100works.operator+=()overloaded
2. Basic 4
Class
Member Functions
Adding functions in a struct
struct Pt {
double x;
double y;
void print() {
std::cout << this->x << ",", y << std::endl
}
};
Refer object members
- Explicit:
y - Implicit:
this->x(thisis implicit pointer)
Refer member function
int main() {
struct Pt p1 = { 1.0, 2.0 }; // you can drop struct in cpp
p1.print();
(&p1)->print();
}
Public vs Private
Replace struct with class
class:privateby defaultstruct:publicby default
class Pt {
public:
double x;
double y;
void print();
};
Early cpp compiler translates member functions as:
void Pt_print(struct Pt *this);
int main() {
Pt_print(&p1);
}
Stack Allocation
Constructor
Same name as the class
class Pt {
public:
double x; double y;
Pt() {
x = 4.0; y = 5.0;
}
};
int main() {
Pt p1; // constructor automatically invoked
}
When declared, allocate space and call the constructor
Can overload with different parameters and default parameters
Pt();
Pt(double _x);
Pt(double _x = 4.0);
Construction syntax
Pt p1(6, 7); // old cpp style
Pt p1(); // does NOT compile, taken as a FUNCTION PROTOTYPE
Pt p1 = {6, 7}; // C style
Pt p1 {6, 7}; // modern cpp
Pt p1; // implied no argument
Destructor
Use ~class_name (negation)
~Pt() {
std::cout << "bye" << std::endl;
}
- Called when the object goes out of scope
- If enclose object in
{}, will go away early - Always called when you leave the scope (exception)
- If enclose object in
- Constructor:
malloc(),fopen() - Destructor:
free(),fclose()
Heap Allocation
In cpp, void * cannot be assigned as a regular pointer, unless you cast it
Pt *p2 = (Pt *)malloc(sizeof(Pt));
p2->print();
free(p2);
Constructor and Destructor did not involve, since malloc() and free() are for void*
new() and delete()
Wrapper around malloc() and free()
Pt* p2 = new Pt {6, 7};
p2->print();
delete p2;
- No
deletein Java
Array Allocation
Pt* p3 = new Pt[5] {{6, 7}, {8, 9}, {10}};
delete[] p3;
- Allocate for an array of 5 (80 bytes)
- On the heap, an 8-byte integer pads in front to hold
5 - Can access through
((int_64_t *)p3) - 1
- On the heap, an 8-byte integer pads in front to hold
- Each constructor is called
- At
delete[], each destructor called- Passing
p3(pointer to the first element) - If not
delete[], will leak 88 bytes!
- Passing
Why doesn’t cpp do this when allocating a single object? Efficiency.
Value vs Reference
Passing by Value
void transpose(Pt p) { // no hi
double t = p.x;
// something
p.print(); // (5, 4)
} // bye
int main() {
Pt p4 = {4, 5};
p4.print(); // (4, 5)
transpose(p4); // passing by value
p4.print(); // (4, 5), unchanged
}
When you pass by value, p is copied
- Our own constructor not called. An internal [[#Copy Constructor]] called!!!
- After
transpose()done, its destructor called
Passing by Reference in C
void transpose(Pt* p) {
double t = p->x;
// something
}
int main() {
Pt p4 = {4, 5};
p4.print(); // (4, 5)
transpose(&p4); // passing by pointer
p4.print(); // (5, 4)!
}
- No copying
When passing
Pt *, technically it’s still passing an address value, simulating reference
Passing by Reference in Cpp
CPP supports this natively. Change parameter from Pt to Pt&
void transpose(Pt& p) { // type Pt&
double t = p.x;
}
Can use reference as if using the value.
- Automatic dereferencing
- Think
pas a reference (alias) forp4 - Pointer, but without deref
When seeing
transpose(p4)in cpp, less readable. Can’t tell if value/reference
Pt p1 {100. 5}, p2 {200, 5};
Pt* p; // we declare a pointer variable, uninitialized
p = &p1; // p points to p1
p = &p2; // p now points to p2
Pt& r = p1; // we create a reference to p1 named r
// Pt& r; // error: references must be initialized when declared
r = p2; // this does NOT change r to refer to p2
// instead, it is equivalent to p1 = p2;
p1.print(); // will print "(200,5)"
- You have to bind a reference to an object at creation
- aka stuck pointer
- In reassignment, will reassign the referenced objects
- For all purposes of
r, it meansp1 - but avoids making a separate object
- For all purposes of
- Always stack-allocated
Copy Constructor
Regular constructor that takes another object reference ONLY CALLED AT INITIALIZATION
Pt(Pt& orig) { // copy constructor
x = orig.x;
y = orig.y;
std::cout << "copy" << std::endl;
}
When copying by value, copy constructor invoked
- If not defined, compiler will generate one for you that mimics the C behavior
cpp invented reference for copy constructor If you call
Pt(Pt other), the copy constructor has to be recursively called…
const
If you take something by const reference, the value won’t be changed
Pt(const Pt& orig);
If you declare const on p4, but:
void print();
const Pt p4; // p4 won't be modified
p4.print(); // WON'T COMPILE
this.print() will execute Pt* this = &p4;, but const Pt* &p4 can’t be converted to regular Pt* this!
print()suggests possible mutation
Need to declare as const member function!!!
void print() const;
- implies
thisas aconst Pt* - If member function not modifying objects, always declare as
const.
Direct Invocation
Directly make a copy
{
// copy construction semantics
Pt p5 {p4};
Pt p5(p4);
Pt p5 = {p4}; // C
Pt p5 = p4; // C init
// COPY ASSIGNMENT!
Pt p5; p5 = p4;
}
Pass values of p4 to p5
Return by Value
Compile of -fno-elide-constructors -std=c++14
Also invokes copy constructor
When expand() returns by value, it creates a temporary unnamed object, which is again copy constructed to be assigned
Pt expand(Pt p) { // copy construction by value
Pt q; // construction
q.x = p.x * 2;
q.y = p.y * 2;
return q; // copy construction: return value
// q destroyed
} // p destroyed
int main() {
const Pt p4;
cout << "*** p6: returning object by value" << endl;
{
Pt p6
{
// tmp created here // copy of q when expand() returns
expand(p4)
}; // copy construction on p6
// tmp destroyed
} // p6 destroyed
cout << "Thats all folks!" << endl;
} // p4 destroyed
g++ -fno-elide-constructors -std=c++14 # ...
hi # p4's constructor
*** p6: returning object by value
copy # call expand()
hi # q's constructor
copy # copy of tmp
bye # q's destructor
copy # p6 copy from tmp
bye # tmp or p
bye # tmp or p
bye # p6 destroyed
Thats all folks!
bye # p4's destructor
3 copy calls:
- Passing
p4toexpand(), copying top expand()returnq, creating atmpobject (different fromq)main()allocates space for suchtmp.
- Copy of
tmptop6
5 destructor calls
qdestroyed afterexpand()returnstmpdestroyed afterp6constructedpdestroyed afterexpand()returnsp6destroyedp4destroyed
Now compile without -no-elide-constructors
hi # p4's constructor
*** p6: returning object by value
copy # p construct
hi # q construct
bye
bye
Thats all folks!
bye # p4's destructor
p6 and tmp not constructed
- When returning object, it is typically constructed in the stack frame of the caller
main() - When compiler is allowed to
elide, this temporary (just used to copy into another object), is omitted p6,tmp,qget collapsed to one
By cpp 17, elide is forced
Copy Assignment
Assign one struct to another, so each member is assigned
int main() {
const Pt p4;
cout << "*** p7: copy assignment" << endl;
{
Pt p7 {8,9};
p7.print();
p7 = p4;
p7.print();
}
cout << "That's all folks!" << endl;
}
hi // p4's constructor
*** p7: copy assignment
hi // p7's constructor
(8,9) // p7.print() before the assignment
(4,5) // p7.print() after the assignment
bye // p7's destructor
That's all folks!
bye // p4's destructor
Same behavior as C
You can use operator=() to define the copy assignment operator
- Invoked from
b = a; - Assignment is an expression with a value
Pt& operator=(const Pt& rhs) {
x = rhs.x; // (this.)x = rhs.x
y = rhs.y;
std::cout << "op()" << std::endl;
return *this;
}
- LHS comes as this
thisobject - RHS comes as parameter
rhs - It you return by value
Pt, it will make another copy
Compiler Default
- Constructor: generated if no constructor or copy constructor. Invoke member constructors
- Destructor: invoke member destructor
- Copy constructor: generated if not present. Will copy by member (invoke their CC)
- Copy assignment: same
3. MyString Class
Header function: class definition with all member functions declared
- Except for short functions (let compiler
inlineit) - Define them in the
cppfile, prepended withMyString::
mystring.h
class MyString {
public:
MyString();
MyString(const char* p);
~MyString();
MyString(const MyString& s); // copy constructor
MyString& operator=(const MyString& s); // copy assignment
int length() const {return len;}
friend MyString operator+(const MyString& s1, const MyString& s2);
friend std::ostream& operator<<(std::ostream& os, const MyString& s);
friend std::istream& operator>>(std::istream& is, MyString& s);
char& operator[](int i);
const char& operator[](int i) const;
private:
char* data;
int len;
};
Makefile
executables = test1 test2 test3
objects = mystring.o test1.o test2.o test3.o
.PHONY: default
default: $(executables)
$(executables): mystring.o
$(objects): mystring.h
defaulthas all executables we want to build. Will get expanded totest1 test2 test3- For each
$(executables)and$(objects)gets expanded to
test1: test1.o mystring.o
g++ test1.o mystring.o -o test1
test1.o: test1.cpp mystring.h
g++ -g -Wall -std=c++14 -c test1.cpp
The variables will be expanded. *.o and *.cpp ingredients are implied
Basic 4
Don’t forget the
nullptrcase, as well as allocatinglen + 1
#include "mystring.h"
int main() {
using namespace std;
MyString s1;
MyString s2("hello");
MyString s3(s2); // copy construct
s1 = s2; // copy assignment
cout << s1 << "," << s2 << "," << s3 << endl;
}
Constructor
When you initialize "hello":
- Type:
char [6] - Value: pointer at
.rodata, betweencodeanddata- However, string there will be immutable.
- Need to allocate a copy
MyString::MyString(const char* p) {
if (p) {
len = strlen(p);
data = new char[len + 1];
strcpy(data, p);
} else {/* allocate a null string */}
}
new char[6]: allocate 6 elements ofcharon heap- If
newfails to allocate, throws exception
- If
- Can’t allocate on stack due to scope
If string is empty, allocating 1 byte is unfortunate. Why not make it
NULL?
- Creates an invariant property, better testing integrity
- real
std::stringhas a short (E. 32-byte) buffer to avoid heap allocation
Destructor
MyString::~MyString() {
delete[] data;
}
Copy Constructor
Consider this code
void f(MyString s2)
cout << s2;
int main() {
MyString s1("hello");
f(s1);
cout << s1;
}
If copy constructor not defined, s2 will be copied on the stack, member-wise
- When
freturns,s2goes away, and its destructor gets called!!! s1.datadeleted!!!- Shallow copy!
Need make every copy carry its own heap allocated string
MyString::MyString(const MyString& s) {
len = s.len;
data = new char[len+1];
strcpy(data, s.data);
}
Copy Assignment
Need do a little more,
MyString& MyString::operator=(const MyString& rhs) {
// if s1 = s1, DON'T change anything
if (this == &rhs)
return *this;
// first, deallocate memory that 'this' used to hold
delete[] data;
// same as copy constructor
len = rhs.len;
data = new char[len+1];
strcpy(data, rhs.data);
return *this;
}
What if I use
*this == rhs?
- Same result, but not efficient to compare all
structcontents- Depends on
operator==()definition
Rule of 3
If you implement any of destructor, copy constructor, and copy assignment, you probably need to do all 3 of them
Inline Functions
Entire body defined in mystring.h
- Tell compiler to
inlineit - Compiler may refuse (E. long/recursive/virtual functions)
Operators
operator+()
MyString operator+(const MyString& s1, const MyString& s2) {
MyString temp;
delete[] temp.data;
temp.len = s1.len + s2.len;
temp.data = new char[temp.len+1];
strcpy(temp.data, s1.data);
strcat(temp.data, s2.data);
return temp;
}
operator+()is not a member function (nothispointer, noMyString::). It’s global
How does it access
privatedata?
Declared asfriendin prototype
friend MyString operator+(const MyString& s1, const MyString& s2);
After return temp, a copy of temp is created, which goes into rhs for operator=()
Why is
operator=()member, butoperator+()global?
Either works syntax-wise
operator=()modifies LFS, so make it member- lhs and rhs of
operator+()are symmetrical, so make it global
(s1 + "world") also works
- Compiler first looks for overload of
operator+(MyString, char*) - Next, sees if can promote one of the mismatched argument into
MyString- By invoking constructor
MyString(char*)- Finds a constructor of
MyStringthat takeschar* - Constructs a temporary object from
char*
- Only works if
operator+()is a global function"hello" + "world"doesn’t work
- Finds a constructor of
- By invoking constructor
- Preserves C behavior
operator<<
coutisstd::ostreamthat representsstdout- Return it by reference
- So associativity not violated
friend std::ostream& operator<<(std::ostream& os, const MyString& s);
Why not make it a member function of
std::ostream? Why make it global?
- Don’t want to redefine the code in
std
std::ostream& operator<<(std::ostream& os, const MyString& s) {
os << s.data; // give os the char*
return os; // return BY REFERENCE
}
(cout << s1) << "something else"; // LHS is still cout
operator>>()
friend std::istream& operator>>(std::istream& is, MyString& s);
std::istream& operator>>(std::istream& is, MyString& s) {
std::string temp;
is >> temp;
delete[] s.data;
s.len = strlen(temp.c_str());
s.data = new char[s.len+1];
strcpy(s.data, temp.c_str());
return is;
}
Cheated with std::string
- Actually, need to read each character from
stdin, and get whitespace right .c_strgives the regularchar*
operator[]()
Gives the i-th character
char&return, because need to write into the string
If MyString is const, both of the following would work:
const char& operator[](int i) const;
char& operator[](int i) const;
char& MyString::operator[](int i) {
if (i < 0 || i >= len) {
throw std::out_of_range{"MyString::op[]"};
}
return data[i];
}
operator[]() const
If this is const, all its members will be const
- Cast away
constness *thishas typeconst MyString*- When you dereference
const MyString*, it becomesconst MyString& - Cast to regular
MyString& - Can invoke the regular
operator[](), which returns achar& - Assigning
char&intoconst char&is fineconst char& MyString::operator[](int i) const { // illustration of casting away constness return ((MyString&)*this)[i]; }
The cpp syntax is:
return const_cast<MyString&>(*this)[i];
- Also checks if
*thisisMyString&to begin with
Exception Handling
throw std::out_of_range{"MyString::op[]"};
// same as
std::out_of:range ex{"MyString::op[]"}; // construct a temp object
throw ex // return by value
std::out_of_rangeis a type, constructs a temporary objectex, and returns by value
If not catched, and the exception goes out of main(), a library function will catch it, and terminate the program
try {
f1(); // function call that may throw an exception
}
catch (const out_of_range& e) {
cout << e.what() << endl;
}
Catch with const reference (if by value, has a copy construct…)
- Then invoke the member function
what()
If throwing an exception makes a function exit in the middle, the local variables will be properly destructed
4. Function Template
int Max(int x, int y)
return x > y ? x : y;
std::string Max(std::string x, std::string y);
return x > y ? x : y;
CPP allows templates for the same implementation, so you don’t have to repeat
template <typename T>
T Max(T x, T y)
return x > y ? x : y;
const T& Max(const T& x, const T& y);
// better
Above is not actual cpp code. The compiler generates from the template
- Figure out the actual type, and replace
Twith it - Better write with
const T&, since not modifying
Need definition in header file, not in another cpp file, since it has to be seen at compile time
Template Overloading
But if use Max("AAA", "BBB"), will fail, since comparing two pointers, unpredictable
- Compiler generates a version of
Max()withchar*s
Want a template that specializes the type by giving a concrete definition for certain types
const char* Max(const char* x, const char* y)
return strcmp(x, y) > 0 ? x : y;
- Even if you declare this specialization, the compiler will stop using the
template
Make it
staticin header file, andinlineif short
Compilation
Build a wrapper func1() around Max() in func1.cpp
#include "max.h"
// wrappers on Max() template
int func1(int x, int y) // defined for int
return Max(x, y);
int func2(int x, int y)
return Max(x, y);
- When
func1.cpp, infunc1.o, the compiler puts the definition ofMax()forintDo the same atfunc2.cpp
When linking, how is it not a duplicate definition?
In the object dumps,
func1is renamed as_Z5func1iiin cpp due to overloading- Encodes its parameters
ii
- Encodes its parameters
- Linker is aware that the
Max()are duplicates, and throws one of them outWEAKbinding (may have multiple instances)GLOBALbinding in normal functionsLOCALbinding forstaticfunctions (only in current object)
5. Dynamic Array
Implementation of
std::vector(templatized class)
Dynamically growable integer array (add to back)
- Allocate
sz + 1elements on the heap - When you call
push_back(), add to the provision element, andsize++
Member initializer list
Preferred, since members already constructed and initialized simultaneously
- Avoids default construct, then assignment in the body
class IntArray {
private:
size_t sz;
size_t cap; // need cap declared before a
int* a;
public:
IntArray() : sz{0}, cap{1}, a{new int[cap]} // member initializer list
{
// sz = 0;
// cap = 1;
// a = new int[cap];
}
void push_back(int x) {
if (sz == cap) {
int* a2 = new int[cap * 2]; // if new fails, exits, but cap intact
cap *= 2; // strong exception guarantee
std::copy(a, a+sz, a2); // [first, last+1), dst
delete[] a;
a = a2;
}
a[sz++] = x;
}
// ...
- If have capacity, put the next element (\(O(1)\))
- If not, allocate double the capacity, copy over, and add (\(O(n)\))
- Overall complexity: \(O(2n) = O(n)\) for \(n\) elements
- Average: amortized \(O(1)\)
Strong exception guarantee
If an exception happens in the middle, it won’t mess up your object
Not planning to use copy constructor and copy assignment, so use delete
- Undefine the compiler generated one
- Copy attempts will fail to compile
- Use
defaultfor compiler generated
// ...
~IntArray() {
delete[] a;
}
IntArray(const IntArray&) = delete;
IntArray& operator=(const IntArray&) = delete;
int& operator[](int i) { return a[i]; }
const int& operator[](int i) const { return a[i]; }
size_t size() const { return sz; }
size_t capacity() const { return cap; }
};
Usage:
int main() {
using namespace std;
IntArray ia;
for (int i = 0; i < 20; i++) {
ia.push_back(i);
cout << ia << endl;
}
}
6. Move Operation
Continuation of 2. Basic 4
Motivation
Construct a local array and return by value
- Will invoke 2 copy constructions
IntArray createIntArray() {
IntArray tmp;
for (int i = 0; i < 20; i++) {
tmp.push_back(i);
std::cout << tmp << std::endl;
}
return tmp; // copy by value
}
int main() {
IntArray ia { createIntArray() }; // make another copy
}
Fail to compile
Earlier,deleted the copy constructor!
It’s very cumbersome to return an object by value by copying!
- Want to pass a reference, and allocate the structure outside the function
void createIntArray(IntArray* ia_ptr) {
for (int i = 0; i < 20; i++) {
ia_ptr->push_back(i);
std::cout << *ia_ptr << std::endl;
}
}
int main() {
IntArray ia; // allocate locally on the stack
createIntArray(&ia);
}
Move Constructor
Move constructor only kicks in when assigning to a temporary object that will go away
- More efficient than copy
- Do a shallow copy of each element from
tmpto the new objectthis - Sever the connection to the original pointer
.a
IntArray(IntArray&& tmp) : sz{tmp.sz}, cap{tmp.cap}, a{tmp.a} {
tmp.sz = tmp.cap = 0;
tmp.a = nullptr;
std::cout << "move ctor" << std::endl;
}
Instead of making a deep copy of original
tmp, “steals” whattmpused to have, and move to the return
Now, when tmp -> unnamed return -> ia, each old object gets destroyed
- Instead of making a brand object and making a copy, steal the internals
- Two copies becomes two movements
- Old object gets destroyed, and
delete nullptris okay
Move constructor can be elided as well
&& rvalue Reference
Will be on exam
- Regular variables are lvalue
int i = 5,ican be on the LHS, lvalue5can’t be lvalue, can’t do5 = i
MyString{"xyz"}is rvalue
struct X {
X() : d{100} {} // set X.d to 100
double d;
};
void f1(X& t) { t.d *= 2; cout << t.d << endl; }
void f2(const X& t) { cout << "can't change t" << endl; }
void f3(X&& t) { t.d *= 3; cout << t.d << endl; }
int main() {
X x; // x is an lvalue
f1(x); // passing an lvalue to X& --> ok
f2(x); // passing an lvalue to const X& --> ok
f3(x); // passing an lvalue to X&& --> not ok
// X{} creates a temp object (rvalue)
f1( X{} ); // passing an rvalue to X& --> not ok
f2( X{} ); // passing an rvalue to const X& --> ok
f3( X{} ); // passing an rvalue to X&& --> ok
}
- Copy constructor take
constreference, since old object not modified - Move operator take revalue reference to bind to rvalue
- Only allow stealing from temporary objects
- Not named!!!
Move Assignment
IntArray& operator=(IntArray&& tmp) {
if (this != &tmp) {
delete[] a;
sz = tmp.sz;
cap = tmp.cap;
a = tmp.a;
tmp.sz = tmp.cap = 0;
tmp.a = nullptr;
}
std::cout << "move assignment" << std::endl;
return *this;
}
Usage
IntArray ia = { createIntArray() }
IntArray ia2;
ia2 = ia; // we don't have copy assignment,
Won’t work since ia is an lvalue, not allowed to move!!
Need to cast to force a move assignment
ia2 = (IntArray&&) ia; // not recommended
// or
ia2 = std::move(ia);
Original ia loses its content
Summary
Compiler can generate a move constructor and move assignment
There are some edge cases for that
- When no destructor, copy constructor, or copy assignment declared
- Compatibility with copy rules, if legacy code write copy but not move
- Move will fall back to copy
- rvalues can be bound to
constlvalues
- rvalues can be bound to
- Member-wise move
- If either move constructor or move assignments are declared, compiler will implicitly delete the copy constructor and copy assignment
Rule of 5
If you declare any of destructor/copy/move (includingdefaultanddelete), very likely need to declare all of them
Rule of 0
If all class members already have their 5 defined correctly (E. library class), and you class has no special resource management, no need to declare in your class
- Some prefer to declare with
=default
7. vec Class
template <typename T>- Change
inttoT - Make functions template functions parameterized by
T
template <typename T>
std::ostream& operator<<(std::ostream& os, const vec<T>& ia);
Vec<int> CreateIntVec() {
Vec<int> tmp; // need to explicitly specify the type
// ...
}
void push_back(int x) {
if (sz == cap) {
int* a2 = new int[cap * 2];
cap *= 2;
std::copy(a, a+sz, a2); // uses copy assignment
delete[] a;
a = a2;
}
a[sz++] = x;
}
int main() {
Vec<MyString> v; // (1)
v.push_back("abc"); // (2)
v.push_back("def"); // (3)
MyString s{"xyz"}; // (4)
v.push_back(s); // (5)
}
- Constructs
von the stack{size_t size, size_t cap, MyString a[1]}sz = 0; cap = 1;ais an allocated 1-element array- Calls
MyString()- Makes a default
MyStringobject on the heap, with
- Makes a default
- Calls
const T&requires a promoted temporary object- Temp (rvalue)
MyString("abc")object constructed on stacklen = 3data = "abc\0"
- Temp comes into
push_back()asconst T& x(reference)- Must have
const, else won’t accept rvalue
- Must have
a[sz++] = x- Copy assignment
xdestroyed afterpush_back()return, and temp
- Temp (rvalue)
push_back("def")- Same temp creation
- Allocation needed! (two levels)
new MyString[2]allocated on heap- Both members are default constructed
MyString()
- Both members are default constructed
std::copy()- Copy assignment
deleteold data- Data pointer updated
- Copy assignment
delete[] adelete[]all string allocations
x(and its members) gets copy assigned to itsa[1], and destroyed
MyString s{"xyz"}screated on stack"xyz\0"on heap
v.push_back(s)spassed by reference- Need to reallocate again
- Every
MyStringgets default constructed - Copy assign the
MyStrings over - The vector entry
v.a[3].datahas a different copy withs.data- Last slot left “empty”, but aMyStringobject
- Every
STL containers hold its own copy of the data (by value)
This is NOT how
vectorworks. Default construction is wasteful!
std::vectorwill not initialize the memory (no default constructornew, justmalloc())
- But how would
a[sz++] = xwork?
- This is an assignment. The LHS object must be constructed
- Solution: Placement
new
Placement new
char s_buf[sizeof(MyString)];
MyString* sp = new (s_buf) MyString{"hello"};
Casting decouples memory allocation and constructor call!
- No memory allocation
- Only calls constructor Can’t directly invoke
delete, need to manually invoke the destructorsp->~MyString();
Strong Exception Guarantee
For push_back(Int), exception can only happen in new. std::copy() of integers is trivial
- Need to guarantee that
IntArraywon’t be changed
For push_back(MyString),
- If
newthrows, nothing created - If
copy()throws,operator=(MyString)usesnew, may throw
- Assignment
a[sz++] = xmay throw
void push_back(const T& x) {
if (sz == cap) {
T* a2 = new T[cap * 2]; // no issue
try {
std::copy(a, a+sz, a2);
} catch (...) { // catch any exception
delete[] a2; // cleanup before throw
throw;
}
cap *= 2;
delete[] a;
a = a2;
}
a[sz] = x; // if throw here, all did was increasing cap
sz++; // separate sz++
}
Still issues!
Ona[sz] = x,operator=(MyString)may leave the default constructor in a bad state - Assume if copy assignment fails, it still leaves a valid object there
Might be more efficient if the elements are moved, rather than copy
std::vectortries to move elements during reallocation, but
- Move may not easy to roll back on exception
std::vector uses the criteria:
- Moved if move constructor marked
noexcept - Copy else
- If copy constructor deleted, the fall back to move (forgo strong exception guarantee)
8. Inheritance
Not to useful in cpp, mostly on template and OOP mechanisms
Based class -> derived class
- Derived classes need to be constructed, if its base class constructor takes arguments
class B {
public:
uint64_t sum() const { return b; }
// protected
private:
uint64_t b = 100;
};
class D : public B {
public:
// return b + d;
uint64_t sum() const { return B::sum() + d; }
private:
uint64_t d = 200;
};
D has the derived sum() from B, and its own version
Don’t forget
public!
Scope:
privatemembers can’t be accessed by derived classes. NeedB::sum()protectedmembers are accessible, but not to general outside
sizeof(D) is 16 bytes,
- 8 for
.b = 100(inherited) - 8 for
.d = 200
Can cast to find it:
uint64_t* p = (uint64_t*) &x;
uint64_t*p = reinterpret_cast<uint64_t*>(&x);
Static Binding
By default how cpp works, translated to C
x is a D object, but the compiler picks the function to invoke based on the object type
int main() {
using namespace std;
D x;
w( x.sum() );
B *pb = &x;
D *pd = &x;
w( pb );
w( pd );
w( pb->sum() ); // compiler generates B_sum(pb)
w( pd->sum() ); // compiler generates D_sum(pb)
}
Dynamic Binding
Java behavior: call the underlying object type, not pointer type
Use keyword virtual in base class
- Optionally
overridein derived class
class B {
public:
virtual uint64_t sum() const { return b; }
}
// not required
class D : public B {
public:
uint64_t sum() const override { return B::sum() + d; }
}
How did compiler know that
pbis pointing to aDobject?
What if in a different cpp file that hasfoo(B* pb) { pb-> sum();}Anypbderived fromBcan be called here
- Without
virtual, compiler will translate everything toB_sum- How does
virtualwork?
Now, sizeof(x) = 24, object gets elongated
.vptris some address- Virtual Table Pointer, array of function pointers for each virtual function in the class
- One table for class
- For
D, one entry:D::sum()
- Virtual Table Pointer, array of function pointers for each virtual function in the class
.b = 100.d = 200
If a B object instantiated,
.vptrwill point to a different V table forBclassB::sum()
When pb -> sum();
D::sum()is always called, regardless the pointer type ofpb- When declared,
pbis pointing to the vtable ofD - Directly invoke
D::sum()from the vtable pointer entry - Compiler generates
pb->vptr[0] (pb)- Call
sum()by passingpb
- Call
Even if ptr is B type, B::sum() invoked through B’s vtable
Virtual Table
Manually invoke sum() manually
sumhas typeuint64_t (const D*)- vtable entry
uint64_t (const D*) - vtable pointer in
D xhas typeuint64_t (**) (const D*) B* pbhas typeuint64_t (***) (const D*)Reference 3 times, and deref twice
D x;
B* pb = &x;
(uint64_t (***) (const D*)) pb )[0][0](&x);
Multiple Inheritance
CPP class can derive from multiple base classes (not used much)
- Can’t do java. Only can memberless classes: interface
// similar class C
class D: public B, public C {
public:
uint64_t sum() const { return B::sum() + C::sum() + d; }
private:
uint64_t d = 200;
};
With static binding, initialize B and B*, C*, D*,
B*, D*point to the beginningC*point to theCportion, a different pointer!- When assign
C* pc = &x;, compiler silently changes the pointer - Since the base portion of
Cis in the middle
- When assign
static_cast
Downcast base pointer C* back to the original object pointer D*
- Opposite of
C* pc = &x;(upcast) - with
static_cast
D* pd2 = static_cast<D*> (pc); // cast C* to D*
w( pc );
w( pd2 );
Different pointers are printed
Not perfect
C y;
pd2 = static_cast<D*> (&y); // compiler allowed
pd2->sum(); // crash
yhas typeC, bad* pd2has typeD, originally, OK
Multiple Virtual Inheritance
- Add
virtualto base classesB::sum()andC::sum() - Add
overridetoD::sum()
Now,
sizeof(x) = 40 (5 slots)
- Without
virtual, was24, with.b, .c, .dTwo vtable pointers, to different places of the same vtable [0] .vptr1[1] .b = 100[2] .vptr2[3] .c = 150[4] .d = 200- 01:
Bportion - 23:
Cportion
D x;
C* pc = &x;
pc->sum(); // calls C portion, D[2], and invoke D::sum();
- When calling from
C,thisis not pointing to the beginning of the object - Need to adjust
thisby subtractingsizeof(B)
Translated to the C version in the object file:
uint64_t thunk_of_D_sum(const C* this_ptr) {
char* adjusted_this_ptr = (char*)this_ptr - sizeof(B); // B portion size
return D::sum( (const D*)adjusted_this_ptr ); // actual this
}
dynamic_cast
In vtable, there are information for crosscast and downcast with more checking
- Point to global
std::type_info(1 per class)
At compile, still checks
- Inputs needs to be pointers
- Classes have
virtualfunctions, so vtable can be accessed.
D x;
B* pb = &x;
C* pc = &x;
D* pd = &x;
assert(dynamic_cast<B*>(pc) == pb); // cross cast C (actual D) to B
assert(dynamic_cast<D*>(pc) == pd); // down cast C (actual D) to D
C y;
C* pc2 = &y;
assert(dynamic_cast<B*>(pc2) == nullptr); // fails, can't cast (actual) C to B
assert(dynamic_cast<D*>(pc2) == nullptr); // fails, can't upcast C to D
Virtual Inheritance
Diamond Problem
class A {
public:
virtual uint64_t sum() const { return a; }
private:
uint64_t a = 5;
};
class B : public A {
public:
virtual uint64_t sum() const { return b; }
private:
uint64_t b = 100;
};
class C : public A {
public:
virtual uint64_t sum() const { return c; }
private:
uint64_t c = 150;
};
class D : public B, public C {
public:
uint64_t sum() const override { return B::sum() + C::sum() + d; }
// A::sum() is ambiguous and does not compile:
// uint64_t sum() const override { return A::sum() + B::sum() + C::sum() + d; }
private:
uint64_t d = 200;
};
D is derived from both B and C, which is derived in D
D x;
A* pa = &x; // compilation error!
- Can’t
static_castordynamic_castto(D*)either
Diamond Problem
Compiler won’t know whichAportion to point to!
A::sum()can’t be called inDeither!
Solution
Derive classes B, C virtual from A
- Eliminates duplication of
A
class B : virtual public A {
// ...
};
Now all will compile properly and sum 455:
D x;
A* pa = &x;
B* pb = &x;
C* pc = &x;
D* pd = &x;
w( pa ); // 0de8, off=40 (16+24)
w( pb ); // 0dc0
w( pc ); // 0dd0, off=16
w( pd ); // 0dc0
w( pa->sum() );
w( pb->sum() );
w( pc->sum() );
w( pd->sum() ); // all 455
Will no longer copy A in each B C instance. Put it at the end
In vtable, new vbase_offset, vcall_offset
Misc
Construction Order
Sock and shoes (opposite for destruction)
- Base class constructed
- All members constructed
- Derived class itself constructed
Virtual Function & Abstract Class
Abstract class have purely virtual function
- aka interface in Java
virtual uint64_t sum() const = 0;No code, meant purely to be overwritten by derived classes
- The class itself cannot be instantiated
- Only its derived classes that overrides this method can
Virtual Destructor
int main() {
B* pb = new D;
delete pb; // only destructor of B called
}
The destructor is not virtual, (static binding), so the B destructor will be invoked
Need to virtual the destructor
struct B {
virtual ~B() { cout << "B::~B()" << endl; }
};
struct D : public B {
~D() override { cout << "D::~D()" << endl; }
};
All members, D, M, B, will be destroyed properly
9. iostream
int sum_ints(std::istream& is) {
int i, sum = 0;
while (is >> i) { // extract a number to int i
sum += i;
std::cout << "current sum: " << std::setw(4) << sum;// 4 space, right align
print_io_states(is);
}
return sum;
}
Keeps printing the sum and stats
echo "5 7 9 " | ./io1
istream
while (is >> i);
istream& istream::operator>>(int& value); // modifies istream (this)
ios::operator bool() const;
// same as !ios::fail
- Return
istreamso can keep chaining things operator <type>is a conversion operation- Invoked when
<type>is required
- Invoked when
istreamis a base class ofiosstd::setw()is a function that returns an objectWidthManipulatorostream& operator<<(ostream& os, const WidthManipulator& wm) { os.width(wm.get_width()); return os; }
Hierarchy
std::coutandstd::cinare instances ofostreamandistreamostingstreamprints into a stringofstreamprints into a file
iostate
Top level
ios_basecontains flags and constants, such asios_base::iostate
iostate is a set of flags telling the state of the IO, can be combined
failbit: formatting/extraction error/EOFbadbit: irrecoverable error (E. disk crashed)- `eofbit
In class ios, flags can be accessed by:
fail()(failbitORbadbit)operator bool()same as!fail()
bad()eof()echo "10 2 3 4" | ./io1Only after summing
4, and trying to read for one more time, hits EOFechoadds a\nat the end of “4”.ioshits the\nfirst and parses the 4 okay
Now suppress \n by echo -n
echo -n "10 2 3 4" | ./io1
- Now, when summing “4”,
eof()set, since hit EOF directly after “4”fail() = 0,eof() = 1is >> istill evaluated to1, so loop executes one more time
- Last time,
fail() = 1,eof() = 1(still)
Error Recovery
Keep recovering until eof()
int sum_ints(std::istream& is) {
int i, sum = 0;
while (!is.eof()) {
is >> i; // skip all leading whitespaces
if (is.bad()) {
throw std::runtime_error{"bad istream"};
}
else if (is.fail()) { // fail(), but not bad()
if (is.eof()) {
// failbit && eofbit: this is normal (trailing whitespace)
break;
}
// failbit && !eofbit: probably non-digit; just skip it
is.clear();
char c;
is.get(c); // consume a single char
std::cout << " skipped: " << c << '\n';
}
else { // read i successfully;
// we could have hit eof or not (depends on trailing whitespace)
sum += i;
std::cout << "current sum: " << std::setw(4) << sum;
print_io_states(is);
}
}
return sum;
}
- If get
char, consume one character at a time - If get
int, consume a chunk
C vs Cpp
By default: std::cin is sync with stdin, and for out and err
- Cpp streams are unbuffered
- Each IO operation on cpp stream is immediately applied to the corresponding C stream’s buffer
- Underneath, call the C functions (E.
printf(),scanf())- Thin wrapper
Can detach Cpp stream from C stream, automatically add its own buffer:
std::ios_base::sync_with_stdio(false);
- Cpp streams are allowed to buffer its IO independently
- Lose the following:
- No longer thread-safe
- No
\nmay not flushstdoutto terminal- Use
'\n'instead ofstd::endlfor terminal output
- Use
- No longer thread-safe
Polymorphic ios
pair<T1, T2> is a library template with two types
void f1(istream& is) {
while (!is.eof()) {
is >> std::ws; // discard leading whitespace
if (!is || is.eof()) // break if nothing else
break;
pair<string,double> e;
is >> e; // overload >> to read from istream
if (is.fail())
break;
cout << e << '\n';
}
if (is.bad())
throw runtime_error{"bad istream"};
else if (is.fail()) // fail but not just eof
throw invalid_argument{"bad grade format"};
}
int main(int argc, char **argv) {
try {
f1(cin);
} catch (const exception& x) {
cerr << x.what() << '\n';
}
}
Now parse pair from istream:
std::istream& getline(std::istream& is, std::string& str, char delim);
static istream& operator>>(istream& is, pair<string,double>& e) {
char c;
// 1. read opening quote, skipping whitespace
if (is >> c && c == '"') {
string student;
// 2. read all chars into student, stop at closing quote, then discard it
if (std::getline(is, student, '"')) {
// 3. read a comma, skipping whitespace
if (is >> c && c == ',') {
double grade;
// 4. read double grade
if (is >> grade) {
e = make_pair(student, grade);
return is;
}
}
}
}
// if fall here, fail :(
is.setstate(ios_base::failbit);
return is;
}
static ostream& operator<<(ostream& os, const pair<string,double>& e) {
return os << "[" << e.first << "] (" << e.second << ")";
}
make_pair()
String Stream
Issue with f1(): one failed record failed will fail break and throw
- Want recover and move on
f2() adds another layer of buffer
- Read an entire line
- Construct
istringstreamobject from the line, as backup - Call
f1(iss)issbecomes theistreamsource off1(), instead ofcin- If error, print
void f2(istream& is) {
string str;
while (std::getline(is, str)) {
istringstream iss(str);
try {
f1(iss);
} catch (const invalid_argument& x) {
cerr << x.what() << ": " << str << '\n';
}
}
}
File Stream
f3() constructs ifstream object
void f3(const char* filename) {
ifstream ifs { filename };
if (!ifs) {
if (filename != nullptr)
throw runtime_error{"can't open file: "s + filename};
else
throw runtime_error{"no file name provided"};
}
f2(ifs);
}
10. Sequence Containers
Standard Template Library (STL)
Array
int a[2] = {6, 7};
int a[2] {6, 7}; // preferred in cpp
Now wrap C array in class template Array
template <typename T, size_t N>
struct Array {
// typedef T value_type;
using value_type = T;
// T(&)[N] is a reference to array of N elements of type T
using TA = T(&)[N];
T& operator[](int i) { return a[i]; }
const T& operator[](int i) const { return a[i]; }
// member function, done in compile time
constexpr int size() const { return N; }
TA underlying() { return a; }
T a[N];
};
usingdeclaration: another way to dotypedef. Define newvalue_typemeansTTAreferences to the array
constexprcan be designated to any (member) function. Tells compiler that such call can be evaluated at compile timeunderlying()pass out the reference to the entire array- Avoids decaying array into a pointer, by using
T(&)[N]
- Avoids decaying array into a pointer, by using
int *p = a; // pa is a pointer to an array of 5 ints and points to the array a int (*pa)[5] = &a; // ra is a reference to an array of 5 ints and refers to the array a int (&ra)[5] = a;
pais a pointer. After dereferencingpa, you will getint[5]
pahas typeint (*)[5]pasame value asa, but different typepa + 1will point to 5 elements after
Array<int,5> v { 100, 101, 102, 103, 104 };
constexpr size_t n = sizeof(v.underlying()) /
sizeof(decltype(v)::value_type);
// sizeof(Array<int,5>::value_type);
static_assert(v.size() == n);
decltype()to get type of a variable
std::array
Same fixed size
std::array<int,5> v { 100, 101, 102, 103, 104 };
for (size_t i = 0; i < v.size(); ++i) { print_int(v[i]); }
std::vector
Initialization similar to array
- Also guarantees elements are contiguous
push_back()may invalidate pointers and refs!
std::vector<int> v { 100, 101, 102, 103, 104 };
v.push_back(105);
for (size_t i = 0; i < v.size(); ++i) { print_int(v[i]); }
Can use directly list-initialization. The compiler can deduce type T
std::vector v { 100, 101, 102, 103, 104 };
std::deque
Double-ended queue
- Vector with
push_front()in $O(1)$,- vs vector $O(n)$ (must shift everything)
- Existing elements never move, pointer and refs are preserved on pushing front/back
- Not preserved if
insert()ordelete()in the middle :(
- Not preserved if
- No contiguity guarantee
- Chunks (still good cache locality)
- Slower access
deque<int> v { 100, 101, 102, 103, 104 }; v.push_front(99); v.push_back(105);
Implementation
Deque maintains a chunk pointer array
- Allocate a new chunk if needed
- Allocate new chunk pointer array, same as vector
std::list
Doubly linked list
list<int> v { 100, 101, 102, 103, 104 };
v.push_front(99);
v.push_back(105);
while (!v.empty()) {
print_int(v.front());
v.pop_front();
}
No operator[]()!!! Need to traverse yourself
- Avoid providing an inefficient version
Push and insert() all in \(O(1)\)
- Slower than vector in practice
std::forward_list
Minimal singly linked list
- Just
push_back()forward_list<int> v { 100, 101, 102, 103, 104 }; v.push_front(99); while (!v.empty()) { print_int(v.front()); v.pop_front(); }
sizeof(forward_list<int>) = 8
- Header is 8 bytes (single head pointer)
- Middle nodes will have size 12.
