Secure Format String
- A function call can take a single argument or multiple arguments.
- An argument is simply the data that the function processes to produce an output.
- It is not recommended to call printf with a single argument when that argument is not a string literal under the programmer's control. The reason is that printf treats its first argument as the format string, which contains conversion specifications that the function replaces with values. These values are supplied by the subsequent arguments in the same call.
- Consider this example:
printf("%d is larger than %d", number1, number2);
- In the above example, the first conversion specification is replaced by the second argument, i.e., number1, while the second conversion specification is replaced by the third argument, i.e., number2.
- If a string supplied by the user is passed as the format string itself, the user can craft the input so that it contains more conversion specifications (for example, %s or %x) than there are arguments in the call. printf then fetches values that were never passed, which lets the user read memory without authorization. This is a format string vulnerability, and it can also crash the program.
- Instead of using printf with a trailing \n (newline) escape sequence to display a plain string, one can use the puts function, which prints the string and then moves the cursor to the next line. puts takes exactly one argument and does not interpret conversion specifications, so any % sequences in the string are printed as ordinary characters. This prevents the user from using an input string that contains conversion specifications to read memory. For example, suppose the string user_input holds text typed by the user and the program prints it with:
printf(user_input);
If the user types 7%s is larger than 7, then printf treats %s as a conversion specification and tries to read a string argument that was never passed, which is undefined behavior. If the same string is passed to the puts function instead:
puts(user_input);
//the puts output will be
7%s is larger than 7
- So, how should one print a plain string using the printf function?
- The answer is simple: use two arguments, with the first being the string conversion specification %s and the second being the string to display. The second argument is then printed as data and is never interpreted as a format string. For example:
//Instead of using
printf("Welcome to Synowledge by Synowl");
//use
printf("%s", "Welcome to Synowledge by Synowl");
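The advice above can be sketched as follows. This is a minimal illustration rather than a definitive implementation; the helper names print_user_text and copy_user_text are hypothetical and do not come from the notes. Both helpers pass the possibly user-supplied text as data for %s, so conversion specifications inside it are never interpreted.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: prints text that may come from a user.
   The text is passed as data for %s, never as the format string. */
void print_user_text(const char *text)
{
    printf("%s\n", text);
}

/* Hypothetical helper: copies user text into a buffer through snprintf.
   Any % sequences in text are copied literally.
   Returns the value returned by snprintf. */
int copy_user_text(char *out, size_t out_size, const char *text)
{
    return snprintf(out, out_size, "%s", text);
}
```

Calling copy_user_text with the input 7%s is larger than 7 leaves exactly that text in the buffer, because the %s inside the input is treated as ordinary characters.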
- Adding two integers can produce a value that is too large (or too small) to be stored in the sum variable. This is called arithmetic overflow, and it is an error that needs to be managed. In C, overflow of a signed integer is undefined behavior.
sum = int1 + int2; // Can result in arithmetic overflow if the sum is too large to be stored in the variable's memory space.
- Arithmetic overflow is a security vulnerability. To manage it, the operands are checked before the operation against the maximum and minimum values that the type can hold. For the int type, these limits are INT_MAX (maximum integer value) and INT_MIN (minimum integer value).
- INT_MAX and INT_MIN are constants because their values do not change during the execution of the program.
- The INT_MIN and INT_MAX constants are defined in the header file called the limits header or <limits.h>.
- The values assigned to the INT_MIN and INT_MAX constants are platform-specific.
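One way to manage overflow with INT_MAX and INT_MIN is to test the operands before adding them. The sketch below assumes a hypothetical helper named safe_add, which is not part of the notes; it reports failure instead of performing an addition that would overflow.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *result and returns true,
   or returns false without adding if the sum would overflow an int. */
bool safe_add(int a, int b, int *result)
{
    if (b > 0 && a > INT_MAX - b) {
        return false;   /* sum would exceed INT_MAX */
    }
    if (b < 0 && a < INT_MIN - b) {
        return false;   /* sum would fall below INT_MIN */
    }
    *result = a + b;    /* safe: the checks above rule out overflow */
    return true;
}
```

The checks are written as subtractions from the limits so that the test itself can never overflow.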
- The standard C library contains a function named rand that generates pseudorandom integer values between 0 and a maximum value designated RAND_MAX. RAND_MAX is a constant defined in the stdlib.h header; its value is platform-specific but is guaranteed to be at least 32767. The rand function is also declared in stdlib.h, so this header must be included in the program.
- The value returned by rand can be used to initialize a variable as follows:
int variable_identifier = rand();
- Ideally, every integer value between 0 and RAND_MAX has an equal probability of being returned by rand and assigned to the variable above.
- At times, the program does not need the integer value 0. For example, when rolling a die, the numbers needed are 1, 2, 3, 4, 5, and 6. This limits the integer values that a program, e.g., a dice-rolling program, needs to work with. The limit can be written as minimum_value (1) ≤ value ≤ maximum_value (6). The number of distinct values in this interval is called the width of the range. In this case, the width of the range is 6.
- If the variable cannot use zero, then the lowest value produced from rand needs to be shifted to a positive non-zero integer value.
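- Restricting the values from rand to the width of the range is described as scaling. Scaling is syntactically written as:
rand() % maximum_value
- The right operand of the % operator is called the scaling factor. The scaling factor is equal to the width of the range, which coincides with the maximum value when the range starts at 1, as in the die example. The scaled expression yields values from 0 to the scaling factor minus 1.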
- To ensure that 0 is excluded from the range, a positive non-zero integer value is added to the scaled range. The value added to the scaled range to set the minimum value of the range is called the shifting value, and it is written syntactically as follows:
shifting_value + (rand() % maximum_value)
- For example, a shift of 1 sets the minimum value of the range to 1, and it is written as: 1 + (rand() % maximum_value).
- The shifting value is equal to the minimum value of the range.
- The shift does not change the values produced by the rand function or the width of the scaled range; it only adds the shifting value to each scaled value. E.g., if the shifting value is 1 and the scaling factor is 6, then rand() % 6 yields one of 6 values: 0, 1, 2, 3, 4, or 5. The shifting value of 1 is then added to get 1, 2, 3, 4, 5, or 6, thus creating the desired range of 1 ≤ value ≤ 6.
- Because the above syntax is an expression, it can be used directly as an argument in a function call.
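Scaling and shifting can be put together in a short sketch. The names scale_and_shift and roll_die are hypothetical helpers chosen for illustration; they assume a non-negative raw value and a positive width.

```c
#include <stdlib.h>

/* Hypothetical helper: maps a non-negative raw value into the range
   min_value .. min_value + width - 1 by scaling, then shifting. */
int scale_and_shift(int raw, int min_value, int width)
{
    return min_value + (raw % width);
}

/* Rolls a six-sided die: scaling factor 6, shifting value 1. */
int roll_die(void)
{
    return scale_and_shift(rand(), 1, 6);
}
```

Note that the % operator gives a perfectly even spread only when RAND_MAX + 1 is a multiple of the scaling factor; otherwise the lower results are very slightly more likely. For a simple dice program this bias is usually negligible.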
- When a program calls rand without seeding it, the sequence of values it generates is the same every time the program is run. This repeatability is important as it allows the programmer to debug the program and test whether the calling function works as expected. Once the function has been proven to work as expected, the program can be changed so that rand produces a different sequence on each run.
- The rand function is considered a generator of pseudorandom numbers because its values are computed deterministically, which is what makes the sequence repeatable.
- If one needs a different sequence of random integer values every time the program is run, then the srand function is used together with rand.
- The srand function takes an unsigned integer parameter that sets the starting point of the sequence generated by rand. srand does not generate values itself; the values still come from rand.
- The parameter used by the srand function to change the sequence is called the seed.
- Seeding rand so that it generates a different sequence every time the program is executed is called randomizing.
- The syntax of the srand function is:
srand(seed);
- Because srand only sets the starting point of rand, supplying the same seed during two program executions produces the same values. For this reason, the seed value should be different during each program execution. This can be achieved by prompting the user to provide a seed value each time the program runs. A better approach is to automatically assign a different seed value during each execution using the time function.
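The effect of reusing a seed can be checked with a small sketch. The helper fill_sequence is a hypothetical name used for illustration: it seeds the generator and then records the next few values from rand.

```c
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical helper: seeds the generator with seed, then stores
   the next count values returned by rand in out. */
void fill_sequence(unsigned int seed, int *out, size_t count)
{
    srand(seed);
    for (size_t i = 0; i < count; i++) {
        out[i] = rand();
    }
}
```

Calling fill_sequence twice with the same seed fills both arrays with identical values, which is the repeatability described above.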
- The time function, declared in the time.h header, is called with the argument NULL so that it returns the current calendar time, which on most systems is the number of seconds that have passed since 00:00:00 UTC on January 1, 1970. This return value is used as the seed, and thus the call to time can be passed to the srand function as follows:
srand(time(NULL));
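Seeding from the clock might be wrapped as in the sketch below. The function name seed_from_clock is hypothetical; it returns the seed it used so that a run can be reproduced later while debugging.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: seeds rand from the current time and
   returns the seed so that it can be logged if needed. */
unsigned int seed_from_clock(void)
{
    unsigned int seed = (unsigned int)time(NULL);
    srand(seed);
    return seed;
}
```

The helper is normally called once, near the start of the program. Calling it again within the same second would restart the same sequence, because the clock value, and hence the seed, would not have changed.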
- A seed that is supplied by the user or derived from the clock allows an experienced programmer to predict the integer values that rand will generate after seeding. To mitigate this, operating systems provide random-number generation functions that are not predictable and are hence more secure than the srand and rand functions in the standard library.
- Among these secure platform-specific functions are the BCryptGenRandom function of Microsoft Windows, the arc4random function of macOS, and the getrandom function of Linux-based operating systems.
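As a rough sketch of how such a platform function might be called, the hypothetical wrapper secure_random_bytes below fills a buffer with unpredictable bytes. It assumes a Linux system whose C library provides getrandom in sys/random.h, or a macOS/BSD system that provides arc4random_buf in stdlib.h. A Windows branch using BCryptGenRandom is deliberately left out, so on other platforms the wrapper simply reports failure.

```c
#include <stddef.h>

#if defined(__linux__)
#include <errno.h>
#include <sys/types.h>
#include <sys/random.h>   /* getrandom */
#elif defined(__APPLE__) || defined(__FreeBSD__) || defined(__OpenBSD__)
#include <stdlib.h>       /* arc4random_buf */
#endif

/* Hypothetical wrapper: fills buffer with length unpredictable bytes.
   Returns 0 on success and -1 on failure or unsupported platforms. */
int secure_random_bytes(void *buffer, size_t length)
{
#if defined(__linux__)
    unsigned char *p = buffer;
    size_t filled = 0;
    while (filled < length) {
        ssize_t n = getrandom(p + filled, length - filled, 0);
        if (n < 0) {
            if (errno == EINTR) {
                continue;   /* interrupted by a signal: try again */
            }
            return -1;
        }
        filled += (size_t)n;
    }
    return 0;
#elif defined(__APPLE__) || defined(__FreeBSD__) || defined(__OpenBSD__)
    arc4random_buf(buffer, length);
    return 0;
#else
    (void)buffer;
    (void)length;
    return -1;              /* platform not covered by this sketch */
#endif
}
```

Unlike rand, these functions need no seed from the program, so there is no seed value for an attacker to guess.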