IA Algorithms Enrichment 2017

Scoping and style

Scoping and variable lifetimes

Every program needs to manage its memory efficiently. For example, if a section of code is neither running nor paused, it is usually a good idea for the program to remove from memory any variables that belonged to that section. So far, you have never told your program to remove these variables, and yet it does so all by itself. So the question stands: how does it know when a variable is no longer being used?

The answer is that all of the variables you have used so far have automatic storage duration. In other words, they are removed from memory at specific points in your program, determined by how you have written it. Understanding when and where your variables exist will help you both work out why your code doesn't behave as you expect and write better code.

Note: It is possible to create variables that do not have automatic storage duration; however, this is outside of the scope (pun intended) of this course, as managing these variables correctly is often difficult and easy to get wrong.

Scopes

Automatic storage duration is determined by scopes. A scope is essentially any block of your program that is surrounded by curly braces ({ and }). This includes the main function, any code blocks following if and else statements, and loop bodies (for loops additionally create a scope for one-line loop bodies). You can even create scopes that have no control structures attached to them, like so:

int main () {
	int a, b, c;
	
	{
		double d = 0.1;
		a = 5;
		
		d += static_cast<double>(a); //convert int to double
	}
	
	//d == ?
	
	return 0;
}

Whenever you declare a variable inside a scope, that variable exists inside of that scope only. If you try to access that variable outside of the scope (before or after), you will get a compiler error. So, in the above example, d would not even exist on line 11 (the //d == ? comment).

Nested scopes

As you have probably guessed from the above example (and from if-else statements and loops), you can nest scopes. Nested scopes have access to the variables declared in all of their parent scopes, and changes to those variables will persist after the nested scope stops executing. For example:

#include <iostream>
using namespace std;

int main () {
	int a = 0;
	int b = 15;
	{
		{
			a += b;
			a -= 2;
		}
	}
	cout << a << endl;
	return 0;
}

13

However, what if we declare a variable in the inner scope with the same name as a variable in the outer scope? Let's see:

#include <iostream>
using namespace std;

int main () {
	int a = 1;
	{
		int a = 2;
		cout << a << endl;
	}
	return 0;
}

2

Firstly, you may be surprised that it even compiles -- how come it lets us declare the variable a twice? Well, the first a -- the one that was assigned a value of 1 -- lives in main's outer scope, while the second a -- the one that was assigned a value of 2 -- lives in the inner scope. As such, the compiler has a way to distinguish the two.

Now, why does it output 2? This is where one of the most important rules of scoping comes into play -- if there are name conflicts between variables in nested scopes, the compiler always chooses the variable from the most specific/deepest scope. Since the second a was declared in a deeper scope, it takes precedence. If we were to move the cout statement after the end of the inner scope, we would see this program output 1, as the second a does not exist outside of the inner scope, so the first a is the only variable named a that exists at that particular point.

Generally, however, you should always try to avoid naming conflicts between scopes -- it makes your code much easier to read if all variables have unique names, even between scopes; plus you will be much less likely to encounter confusing bugs.

Loop scoping

Loops are a little peculiar when it comes to scoping. Let's try to declare a variable in a loop and do something with it:

#include <iostream>
using namespace std;

int main () {
	for (int i = 0; i < 5; ++i) {
		int a = 0;
		cout << a << endl;
		++a;
	}
	return 0;
}

0
0
0
0
0

OK, that's a little weird...shouldn't it have printed out the numbers 0 through 4? Well, there are a couple of factors that cause the above behavior:

  • You are setting a to 0 at the beginning of every iteration.
  • The scope ends after each iteration. In other words, all of the variables that existed in the last iteration no longer exist, and in their place fresh ones have been created.

The second point is the bigger one here -- variables you declare in a loop body are deleted after every iteration, only to be created anew on the next iteration. So, if you declare a variable in a loop, it will not hold data across loop iterations.

The scoping of the for loop's control variable is even weirder:

#include <iostream>
using namespace std;

int main () {
        for (int i = 0; i < 10; ++i) {
                cout << i << endl;
        }
        cout << i << endl;
        return 0;
}

scoping.cpp:8:10: error: name lookup of 'i' changed for ISO 'for' scoping [-fpermissive]

The error above is telling you that i exists only inside the for loop, so its scope is effectively the loop body (despite the fact that it was not declared between the curly braces). So, let's remove that last print statement:

#include <iostream>
using namespace std;

int main () {
        for (int i = 0; i < 10; ++i) {
                cout << i << endl;
        }
        return 0;
}

0
1
2
3
4
5
6
7
8
9

So, i belongs to the loop body scope, but it is retaining its value! Didn't we just say that the variables declared in a loop scope are deleted after each iteration?

The for loop's control variable is special. It is created once, when the loop starts, and belongs to the loop as a whole rather than to a single iteration. Because it needs to retain its value between iterations in order for the loop to function as expected, it is not deleted between iterations. However, because the variable still belongs to the loop, it is not accessible outside of the loop.

The global scope

So, now that you know how scopes work, let's examine the most bizarre scope of them all: the global scope. The global scope is simply the scope that exists from the start of your source file to the end of the source file. Unlike other scopes, it is not enclosed by curly braces -- the compiler always knows when it starts and ends based on the input file.

As of right now, we have not done much with the global scope, other than to #include <iostream>, use the std namespace, and define main (as we proceed with this class, you will see more things placed in the global scope). However, it is a scope, so you can declare variables in it, just like any other:

#include <iostream>
using namespace std;

int a = 5;

int main () {
	cout << a << endl;
	return 0;
}

5

Since main is a nested scope, it reads up to the global scope and gets the value 5 for a. Variables declared on the global scope are accessible in all scopes in a source file -- when we talk about functions, you will see that variables declared on the global scope are accessible in functions other than main as well.

Name resolution works just as we described earlier with regard to variables on the global scope:

#include <iostream>
using namespace std;

int a = 5;

int main () {
	int a = 6;
	cout << a << endl;
	return 0;
}

6

Now, the important thing you should know about the global scope is that you should never declare variables or write code on the global scope. This is considered incredibly poor practice.

Why? Because your code is not the only thing being placed on the global scope -- everything you #include is put on the global scope too (unless you #include it in a nested scope, but doing so will have unintended side effects; you should always #include on the global scope). This makes it much more likely that a variable you declare there has the same name as something provided by a header file you included. This is an actual problem -- without the scope difference, the compiler will think you are trying to redeclare something, which is not allowed. However, if you declare all of your variables in scopes nested in the global scope, you can avoid this naming conflict by taking advantage of the name resolution process.

In fact, this is the reason for the using namespace std; line at the beginning of the file -- the designers of C++ were concerned that putting all of the standard library on the global scope would "pollute" it, so they created a named scope (a namespace) called std that would not interfere with the global scope of any program (other than the name std). By placing using namespace std; at the beginning of the file, we are promising not to declare anything on the global scope that has the same name as anything in the standard library, so we can treat objects and functions provided by the standard library as if they were on the global scope (and avoid having to explicitly say that they come from the std scope).

Activity

You will not need repl.it for this activity. Consider the following code:

#include <iostream>
using namespace std;

int a = 5;

int main () {
	double b = 4.3;
	string str = "Hello World!";
	
	if (b < a) {
		string str2 = "Programming is cool";
		cout << str << ' ' << str2 << endl;
		b = static_cast<double>(a) + b;
	}
	
	//Call this scope "C"
	{
		bool c = (a < b);
		cout << str2 << "? " << boolalpha << c << endl;
		
		//Call this scope "D"
		{
			double c = b - static_cast<double>(a);
			int e = 42;
			cout << c << endl;
		}
		
		++e;
	}
	
	cout << e << endl;
	
	for (int i = 0; i < 10; ++i) {
		cout << i << endl;
	}
	
	cout << (i + a) << endl;
	
	return 0;
}

Answer the following:

  1. Identify the scope of each variable.
  2. What is the output from the cout statement on the highlighted line?
  3. Find all the scoping bugs in the code.

Code style

As has been mentioned, C++ compilers are very lenient about the amount of whitespace you have in your code, so long as it does not change the meaning of a statement (for example, int foo and intfoo do not mean the same thing). However, one of the most important aspects of learning to code is learning how to write it effectively so that someone else can sit down in front of it and understand what your code does. This not only helps with communication in team software projects, but it also helps anyone who looks at your code long after you have moved on.

Style is not the focus of this enrichment, and we will not be spending much time discussing style. However, it is very important that you learn and practice good coding style habits early, as you learn how to write code. This section will introduce you to some basic style recommendations, and the reasons for those recommendations.

If you look around online, you will quickly discover that there is no one definitive coding style guide. Different people follow different guidelines depending on what suits their needs. Since we are only just learning programming, I am not expecting you to follow any particular set of style rules; however, I will be inspecting your exercise submissions to provide you with feedback on your coding style.

But above all else, the number one coding style rule is: always be consistent. Do not change what style rules you're following halfway through the source (or, if you are working on a multi-file project, in the middle of the project). This will be the biggest thing I look for when I look at your coding exercises.

Indentation

Indentation is the most ubiquitous style rule. Proper indentation helps communicate the structure in your code, even if the indentation itself serves no structural purpose. Let's look at a badly indented example:

//*** BAD. DO NOT DO THIS. ***
#include <iostream>
using namespace std;

int main () {
int a;
int b;

cout << "Please input two numbers: ";
cin >> a >> b;

if ((a + b) < 0) {
cout << "The result is negative." << endl;
} else if ((a + b) > 0) {
cout << "The result is positive." << endl;
} else {
cout << "The result is 0." << endl;
}

return 0;
}

It may take you a moment to figure out what the structure is here: lines 6 - 20 belong to main, and lines 13, 15, and 17 belong to the branches of the if-else statement. However, with proper indentation, this structure becomes almost immediately obvious:

//Good example of indentation
#include <iostream>
using namespace std;

int main () {
	int a;
	int b;
	
	cout << "Please input two numbers: ";
	cin >> a >> b;
	
	if ((a + b) < 0) {
		cout << "The result is negative." << endl;
	} else if ((a + b) > 0) {
		cout << "The result is positive." << endl;
	} else {
		cout << "The result is 0." << endl;
	}
	
	return 0;
}

While this change may not seem to make much of a difference in this example, it makes a huge difference in large software projects, which can run to 100k lines of code or more, with 500 to 1000 lines of code per file and much more complex code.

So, with regards to indentation, here are some general guidelines:

  • Use equal spacing at all levels of indentation (for example, do not indent 4 spaces at the first level and then indent 2 spaces at the next level).
  • The contents of each scope should be indented one level further than its parent scope. One-liners for if-else statements and loops should also be indented by one level if wrapped to the next line.
    • Closing curly braces for scopes ('}') should have the same indentation as the parent scope.

Most of the variation in style with regards to indentation has to do with whether you should use a tab character or spaces for indentation (most modern code editors can insert either when you hit the Tab key), or how many spaces should be used to indent (usually 2, 4, or 8). Again, I do not care what choice you make for either of these, as long as you are consistent.

Code spacing

Inserting empty lines into your code also helps visually break your code apart into related "chunks". Again, let's do a bad example-good example analysis:

//*** BAD. DO NOT DO THIS. ***
#include <iostream>
using namespace std;
int main () {
	int a;
	int b;
	cout << "Please input two numbers: ";
	cin >> a >> b;
	if ((a + b) < 0) {
		cout << "The result is negative." << endl;
	} else if ((a + b) > 0) {
		cout << "The result is positive." << endl;
	} else {
		cout << "The result is 0." << endl;
	}
	return 0;
}

This really just looks like a giant blob of code, and is difficult to make sense of at first. This example does a good job of separating "chunks" of related code:

//Good example of code separation
#include <iostream>
using namespace std;

int main () {
	int a;
	int b;
	
	//Get user input
	cout << "Please input two numbers: ";
	cin >> a >> b;
	
	//Output sign of the sum of the two input numbers
	if ((a + b) < 0) {
		cout << "The result is negative." << endl;
	} else if ((a + b) > 0) {
		cout << "The result is positive." << endl;
	} else {
		cout << "The result is 0." << endl;
	}
	
	return 0;
}

As you can see, in addition to separating related code by whitespace, the author has also put comments at the beginning of some of the "chunks" to quickly describe their purpose. While this is not necessary, it is very helpful for others when they are trying to understand your code. Adding comments around especially confusing "chunks" can also be helpful for you to navigate your own code in the future, should you ever need to revisit it.

Here are some general guidelines for code spacing:

  • Separate "chunks" or "paragraphs" of code using a single empty line.
  • Function definitions (e.g. main) should be separated by single empty lines. If a function definition is at the end of a file, a trailing newline is not necessary, but it is recommended (some text editors may enforce this anyway).
  • Comments should appear alongside the "chunk" they refer to. If placed at the beginning, the empty line should appear above the comment, with the code "chunk" appearing immediately on the next line.

Operator spacing

Operators in C++ do not care much about spacing; however, how you space an operator from its operands can help communicate what operation you are performing.

First, let us quickly define two types of operators: binary and unary operators. A binary operator is any operator that has two operands/arguments. Addition (+), division (/), assignment (=), and inequality comparison (!=) are all examples of binary operators. On the other hand, a unary operator is any operator that has just one operand/argument. Increment/decrement (++ and --), arithmetic negation (- as in -5), and logical negation (! as in !that) are all examples of unary operators.

The style rules for operators are not as clear cut as the ones we have discussed so far, but here is my recommendation:

  • Binary operators should be separated from their operands by a single space on either side.
  • Unary operators should have no space between themselves and their operand.

Here's an example:

//Operator spacing example
#include <iostream>
using namespace std;

int main () {
	int a;
	int b;
	bool negative;
	
	//<< and >> are binary operators
	cout << "Please input two numbers: ";
	cin >> a >> b;
	
	//= and < are binary operators
	negative = ((a + b) < 0);
	
	if (negative) {
		cout << "The result is negative." << endl;
	} 
	//! is a unary operator. && and != are binary operators.
	else if (!negative && (a + b) != 0) {
		cout << "The result is positive." << endl;
	} else {
		cout << "The result is 0." << endl;
	}
	
	return 0;
}

Commas are not (usually) operators; however, whenever you use a comma, space it like you would in English (no space before it, one space after it):

//Comma spacing
#include <iostream>
using namespace std;

void doSomething (int x, int y);

int main () {
	int a, b;
	
	cout << "Please input two numbers: ";
	cin >> a >> b;
	
	doSomething(a, b);
	
	return 0;
}

void doSomething (int x, int y) {
	for (; x < y; ++x, --y) {
		//The comma below is a literal comma character, ignore it with respect to style
		cout << '(' << x << ',' << y << ')' << endl;
	}
}

Opening braces

This is one of the most contentious issues in coding style (the other being tabs vs. spaces). Basically, your two options are as follows:

int main () {
}

// ** OR **

int main ()
{
}

Neither is incorrect; just be consistent. This rule applies to every construct that is followed by a braced block (functions, if-else statements, loops, classes, etc.), so make sure to use the same choice for all opening braces. Note that this rule makes little sense when applied to scopes without any control structures attached to them.

Another closely related style aspect is whether to place a space before an open parenthesis (be it in a math formula, the condition for an if statement or a loop, or a function call). Again, there is no right or wrong way to do this, but here is my recommendation:

#include <iostream>
using namespace std;

//Put a space before the open parenthesis in function prototypes
int doSomething (int x, int y);

//Put a space before the open parenthesis in function definitions
int main () {
	int a, b;
	
	cout << "Please input two numbers: ";
	cin >> a >> b;
	
	//Do not put a space before the open parenthesis in a function *call*
	cout << "The result is " << doSomething(a, b) << '.' << endl;
	
	return 0;
}

//Put a space before the open parenthesis in function definitions
int doSomething (int x, int y) {
	/* Put a space before the open parenthesis in an if statement. Do not put
	   a space before open parentheses in arithmetic or logic statements */
	if ((x + y) > 0) {
		return x - y;
	} else {
		/* Put a space before the open parenthesis in a loop condition. Here,
		   we still follow the rule "no space before open parenthesis in 
		   arithmetic", but the spacing around the '=' operator takes
		   precedence. */
		for (int i = (y - x); i < 0; ++i) {
			y += i;
		}
		return y;
	}
}

Conclusion

There are more style rules that we could discuss; it would take a long time for me to enumerate them all here. The point of this section is not to teach and subsequently enforce a strict coding style, but rather to expose you to what considerations we make when we are writing code in good style. I recommend that you read this section, and make a mental note of what style works for you, and then stick to that style when you write code. As always, ensure that you are being consistent.