Arrays

6.1 Introduction

In previous chapters, we discussed various programming structures, namely, sequential, conditional, and loop, which can be employed in algorithms and programs. This chapter will increase the power of these structures by introducing a new data structure that can be employed with any of them, that is, the array. A data structure is any organized means of storage, an array being among the simplest of structures.

6.2 Array Types

6.2.1 One-Dimensional Arrays

An array is an ordered list of variables. To get the idea, imagine a row of compartments, each of the compartments can contain something or be empty, and they are numbered sequentially starting at 0, as you see on the left side of .

Arrays start counting their elements, or individual variables, at index 0, as the index is the offset from the beginning of the array. Index 0 is the position of the first element in any array that contains at least one element. Likewise, the n^th element can be found at index n – 1. Starting at 0 and ending at n – 1 may seem odd, but it is common among most programming languages.

Diagram of an array and array access in Ruby

Gem of Wisdom

Array indices are the offset from the first element. As a result, the first element is stored at index 0.

The arrays in are known as one-dimensional arrays because there is only one index or dimension. To access an element in an array in Ruby, the notation is array_name[index], where array_name indicates the name of the array and index indicates the element of the array being referenced.

Consider an array named arr that stores a list of five test scores for a student. The student received the scores 73, 98, 86, 61, and 96. The first step of creating an array is the statement: array_name = Array.new. The example code in shows how to initialize an array to store test scores.

Initializing an array

arr = Array.new()
arr[0] = 73 
arr[1] = 98 
arr[2] = 86 
arr[3] = 61 
arr[4] = 96
puts arr

The code shown is actual Ruby syntax for initializing array indices 0 through 4. The result of the code is much like the array in the righthand side of , an array of size 5 with every element initialized. However, there is a quicker way to initialize an array, as shown in .

Initializing an array

arr = [73, 98, 86, 61, 96]
puts arr

No matter which way the array is initialized, the result is the same. To use the array, you access array_name[index], as if it were a variable of the data type expected, as shown in .

Changing the value of an element

arr = [5,6]
arr[0] = arr[0] + 10
puts arr[0]

The key advantage of arrays is highlighted when used in conjunction with loops. Since the syntax for accessing an element in an array is array_name[index], we can use a variable for the index instead of literal numbers, as in the examples just shown. Thus, we can change the index in every loop iteration, and traverse or move through every element in the array. To know when to stop traversing the array, we can get the number of elements in an array by using the following statement:

arr.size

New programmers often make errors when dealing with the bounds of an array. These are the basic rules for array bounds:

The first element in an array is at index 0.
arr.size returns the number of elements in the array, not the highest indexed element.
The last element in an array is at index arr.size - 1.

If we want to use a while loop to traverse an array, we need to initialize the index to 0 and increment it for every loop iteration. It is important to note that the condition in the while loop is index < arr.size, not index <= arr.size, for the reasons just mentioned. The code in is an example of basic array traversal that prints out every element in the array.

Gem of Wisdom

Using arrays is a flexible and organized way of expressing multiple related items. Every programming language has them. As an abstraction, arrays are expanded variables. A variable holds one value; an array holds many. For example, to record names of multiple institutions using variables requires the use of variables such as institution1, institution2, institution3, and so on. With arrays, institutions[] stores all institutions.

Displaying array content

arr = [73, 98, 86, 61, 96]
index = 0
while (index < arr.size)
   puts arr[index]
   index = index + 1
end

Running the code (stored in a file called array_4.rb) gives the following output:

$ ruby array_4.rb
73
98
86
61
96

However, in Ruby it is possible to accomplish the same goal in one line of code:

puts arr

This example is meant to be merely an introduction to simple array traversal. More practical reasons to use loops and arrays together are illustrated later.

Example: Find the Max

The example in shows how to find the maximum of a list of five non-negative numbers. What would you need to change to support negative numbers?

Find the max

# Initialize array and loop values
arr = [73, 98, 86, 61, 96]
index = 0
max = 0

# Loop over each element in arr
while (index < arr.size)
   if (arr[index] > max)
      # Update max
      max = arr[index]
    end
    index = index + 1
end

# Output calculated max
puts "Max ==> " + max.to_s

Line 2 creates a new array named arr and initializes its variables.
Line 3 sets a counter named index that will serve as an index into the array.
Line 4 declares a variable called max that will be used to store the maximum number.
Lines 7–13 implement a loop that scans every element of the array.
Line 16 prints the maximum element at the end.

Each time the index variable index is incremented in line 12, the if statement in lines 8–11 tests to see whether the current value in the array indexed at index is higher than the current value in max. If the current value is higher, then the max variable is updated with the highest value.

Run this program, called find_the_max.rb. Your output should be:

$ ruby find_the_max.rb
Max ==> 98

As an exercise to make sure you understand the preceding code, change the example to output the lowest value in the array.

6.2.2 Multidimensional Arrays

Arrays that have more than one dimension are called multidimensional arrays. A common multidimensional array is the two-dimensional array, which can be used to represent matrices and coordinate systems. Unlike some other programming languages, Ruby does not provide built-in support for multidimensional arrays. The way to work around this is to put an array inside an array, which is essentially what a multidimensional array is anyway.

Consider the example in the previous section of an array that stores a list of five test scores for a student. Now, what if you had students with the following scores?

Geraldo: 73, 98, 86, 61, 96

Brittany: 60, 90, 96, 92, 77

Michael: 44, 50, 99, 65, 10

Multidimensional array
	[0]	[1]	[2]	[3]	[4]
[0]	73	98	86	61	96
[1]	60	90	96	92	77
[2]	44	50	99	65	10

The best way to represent this data is to create one array with three indices and to have each index contain five elements. This is basically an array with three rows and five columns. To create such an array in Ruby, type in the following:

arr = [[73, 98, 86, 61, 96],
       [60, 90, 96, 92, 77],
       [44, 50, 99, 65, 10]]

puts "Brittany's Third Exam: " + arr[1][2].to_s

By typing in this assignment statement, you have created a 3 × 5 table where the first, second, and third rows represent Geraldo’s, Brittany’s, and Michael’s test scores, respectively. To access an individual score, use the format array[row][column]. So, if you wanted to know what Brittany scored on her third exam (remember, each index starts with 0, not 1), type:

puts "Brittany's Third Exam: " + arr[1][2].to_s

The output should be:

Brittany's Third Exam: 96

The rules for traversing a multidimensional array are similar to traversing a one-dimensional array, except you have to add a nested loop for each extra dimension. In , we illustrate how to print out each value in the array arr.

Outputting multidimensional arrays

     1 # Initialize array and loop values
     2 arr = [[73, 98, 86, 61, 96], 
     3        [60, 90, 96, 92, 77], 
     4        [44, 50, 99, 65, 10]] 
     5 row = 0
     6 column = 0
     7 
     8 # Loop over each row
     9 while (row < arr.size)
    10 	puts "Row: " + row.to_s
    11 	# Loop over each column
    12 	while (column < arr[row].size)
    13 		# Print the item at position row x column
    14 		puts arr[row][column]
    15 		column = column + 1
    16 	end
    17 	# Reset column, advance row
    18 	column = 0
    19 	row = row + 1
    20 end

Gem of Wisdom

Step through and convince yourself that it generates the values you expect. What output values will be generated? What would happen if we added an if statement after line 13 that asked to skip values that were prime? Give that a try, as it uses our prime number work.

Like one-dimensional arrays, you can output everything using one line of code:

puts arr

The only problem with this statement is that Ruby will list all the values without any formatting. So it would be difficult to sort through all the data. We address means to format output in one particular case as part of our tic-tac-toe example in . We do not address formatting generally in this book.

Example: Find the Max—Modified

In the previous example, we created a program that finds the highest score in the array. Now we will create a program that stores the five scores for all three students. Using what you learned previously, modify the program to find out which student has the highest score. Print out the highest score.

When you are done with your program, compare your program to the code in .

Find the max, modified

     1 # initialize the array and index/score variables
     2 arr = [[73, 98, 86, 61, 96], 
     3        [60, 90, 96, 92, 77], 
     4        [44, 50, 99, 65, 10]] 
     5 
     6 row = 0
     7 column = 0
     8 maxscore = 0
     9 maxrow = 0
    10 
    11 # for each row
    12 while (row < arr.size)
    13    # for each column
    14    while (column < arr[row].size)
    15       # update score variables
    16       if (arr[row][column] > maxscore)
    17          maxrow = row
    18          maxscore = arr[row][column]
    19       end
    20       # increment column
    21       column = column + 1
    22    end
    23    # reset column, increment row
    24    column = 0
    25    row = row + 1
    26 end
    27 
    28 # output name and high score information
    29 if maxrow == 0
    30    puts "Geraldo has the highest score."
    31 elsif maxrow == 1
    32    puts "Brittany has the highest score."
    33 elsif maxrow == 2
    34    puts "Michael has the highest score."
    35 else
    36    puts "Something didn't work correctly."
    37 end
    38 puts "The high score was: " + maxscore.to_s

Lines 2–4 initialize a 3 × 5 array called arr.
Lines 6–7 initialize the initial location of the row and column to start traversing the array.
Lines 8–9 declare maxscore that will keep track of the highest score and maxrow that will keep track of who has the highest score.
Lines 12–26 implement a loop that scans each element of the array.
Lines 29–37 compare the value of maxrow and output the corresponding person’s name as the individual with the highest score.
Line 38 outputs the highest score.

Intuitively, like the previous example, there is a marker for the highest score. Whenever the program finds a score higher than the current value of maxscore, it updates maxrow to contain the value of the row in which the program found the high score (line 17) and maxscore to reflect the highest score (line 18). The program then uses if-else statements to find out who has the highest score (lines 29–37). Notice again how rows 0, 1, and 2 correspond with Geraldo, Brittany, and Michael, respectively. When you run the program, the output should read:

$ ruby find_the_max_modified.rb
Michael has the highest score.
The high score was: 99

6.3 Hashes

Unlike arrays, which strictly use integer indices, hashes can use any data type as their index. What Ruby calls a “hash” is really a clever way of using a string data type to map quickly to a specific element inside an array.

The string is referred to as a hash key. Some kind of function must exist to map a string to a number. For example, a simple hash function could add up the ASCII codes for each letter and implement a modulo for the number of keys we have. A hash collision occurs when our hash function returns the same number for two different keys, which can be handled with various collision resolution algorithms. A simple collision resolution algorithm simply places all keys that have a collision into a bucket, and the bucket is sequentially scanned for the specific key that is requested when a collision occurs. A detailed discussion of hashing is beyond the scope of this book, but we wanted to illustrate the differences between a hash table and an array.

In most cases, strings are used to associate keys to values. For example, instead of using a two-dimensional array, we can use a hash to store student test scores by name as seen in . As shown, similar to arrays, line 1 creates a new hash structure. Likewise, element assignment, lines 2–4, follow the same process done for arrays.

Example hash usage

    1 scores = Hash.new
    2 scores["Geraldo"] = [98, 95, 93, 96]
    3 scores["Brittany"] = [74, 90, 84, 92]
    4 scores["Michael"] = [72, 87, 68, 54, 10]

Example: Hash

To access Brittany’s score, we could simply call on scores["Brittany"]. Of course, the string "Brittany" can also be replaced by a variable that holds that string.

Gem of Wisdom

Arrays are accessed with a numerical index, as in array[5]. Hashes are accessed with a string as the index, as in scores["Brittany"].

Example: Accessing a Hash

Example hash accessor usage

    1 scores = Hash.new
    2 scores["Geraldo"] = [98, 95, 93, 96]
    3 scores["Brittany"] = [74, 90, 84, 92]
    4 scores["Michael"] = [72, 87, 68, 54, 10]
    5 name = "Brittany"
    6 puts name + " first score is: " + scores[name][0].to_s

In line 5 of , we assigned “Brittany” to the variable name; so, assuming that the code of is stored in file hash_2.rb, executing the code should display Brittany’s first score on the screen:

$ ruby hash_2.rb
Brittany first score is: 74

It is possible to get an array of all the keys by calling on scores.keys. We can then go through each key by using a for loop. We can now rewrite the maximum score example to work for any number of students, no matter what their names are or how many scores each student has.

Note that in our example, the number of individual scores varies among the students. That is, in , both “Geraldo” and “Brittany” have four scores each, while “Michael” has five. The ability to have varying numbers of entries provides great flexibility.

Example: Find the Max—Hash

Find the max—hash

     1 scores = Hash.new
     2 
     3 scores["Geraldo"] = [98, 95, 93, 96]
     4 scores["Brittany"] = [74, 90, 84, 92]
     5 scores["Michael"] = [72, 87, 68, 54, 10]
     6 
     7 maxscore = 0
     8 for name in scores.keys
     9 	column = 0
    10 	while (column < scores[name].size)
    11 
    12 		if (scores[name][column] > maxscore)
    13 			maxname = name
    14 			maxscore = scores[name][column]
    15 		end
    16 		column = column + 1
    17 	end
    18 end
    19 
    20 puts maxname + " has the highest score."
    21 puts "The highest score is: " + maxscore.to_s

We see that running the code from , stored in file find_max_hash.rb, will output the following result:

$ ruby find_max_hash.rb
Geraldo has the highest score.
The highest score is: 98

Note that the entries in this hash differ from the entries used in the array example.

Hashes cannot replace arrays outright. Due to the nature of their keys, they do not actually have any sensible sequence for their elements. Hashes and arrays serve separate but similar roles. Hashes excel at lookup. A hash keyed on name with a phone number as a value is much easier to work with than a multidimensional array of names and phone numbers.

Arrays refer to a sequence of variables where each variable does not have a name; instead, it is referenced by an integer index. That is, arr[i] refers to the i^th element in the sequence, remembering that indices start at 0. In contrast, a hash table uses a key-value pairing to identify the particular entry. In the earlier example, we wish to access test scores based on a person’s name. That is, the hash table arr['Geraldo'] identifies Geraldo’s test scores even though Geraldo is not an integer. Such referencing supports both efficient access and logical correlations.

6.4 Summary

We discussed one-dimensional arrays, arrays of arrays, and hashes. These are constructs that often take students time to learn, so we strongly suggest that you work through all the exercises in this chapter to ensure that you have a full understanding of these concepts.

6.4.1 Key Concepts

Arrays are structures that use a table format to store variables. The data stored in an array is accessed using numbers as an index starting at 0. They can be used in any programming structure, but they are most commonly associated with the loop structure.
One key concept when working with arrays is that they can have an infinite number of dimensions. This means that a memory location within an array can either store a single piece of data or store an entirely new array.
Hashes are much like arrays, except that rather than using only an integer to look up a memory location, any variable can be used as a key.

6.4.2 Key Definitions

Array: A consecutively numbered list of variables.
Element: A variable contained within an array.
Multidimensional array: An array whose elements are also arrays.
Index: The number associated with a certain element within an array.
Traverse: To move from one element to another within an array.
Hash: A data structure that can map any data type (key) to a value.

6.5 Exercises

Using the array arr with value a[0] = 9, a[1] = 2, a[2] = 5, a[3] = 4, a[4] = 3, determine the output of the code in .

Code for Exercise 1

    1 i = 0
    2 
    3 while (i < a.size)
    4 	puts a[i]
    5 	i = i + 1
    6 end

The code in looks for the first two elements that are out of order and swaps them; however, it is not producing the correct results. Fix the code so that it works correctly.

Code for Exercise 2

    1 arr = [5, 22, 29, 39, 19, 51, 78, 96, 84]
    2 i = 0 
    3 while (i < arr.size - 1 and arr[i] < arr[i + 1])
    4     i = i + 1 
    5 end 
    6 puts i 
    7 
    8 arr[i] = arr[i + 1]
    9 arr[i + 1] = arr[i]

Write a program that splits an array into two arrays where any element in one array is smaller than any element in the other array. Solutions are not unique, but equally sized splits are desirable. The input can be any size array less than 100.

Example input: [6, 45, 23, 65, 17, 48, 97, 32, 18, 9, 88]

Example output: [6, 23, 17, 18 , 9] < [45, 65, 48, 97, 32, 88]
There are many ways to store image data. One way is to store pixel data in a two-dimensional array. The pixel data is itself a three-element array that describes the amount of red, green, and blue in the pixel. The amount of red, green, or blue is a number from 0 to 255. Here are a few example RGB values:
```
red = [255, 0, 0]
green = [0, 255, 0]
blue = [0, 0, 255]
black = [0, 0, 0]
white = [255, 255, 255]
yellow = [255, 255, 0]
```
Suppose you have a picture and need to count red pixels. For a pixel to be red, it must be within the following RGB constraints:
1. The R value must be greater than 100.
2. The G and B values must each be less than the R value divided by 4.
Write this program. Use this sample data to test your program:
```
sample =
[[[ 65, 67, 23], [234,176,  0], [143,  0,  0]],
 [[255, 30, 51], [156, 41, 38], [  3,243,176]],
 [[255,255,255], [  0,  0,  0], [133, 28, 13]],
 [[ 26, 43,255], [ 48,  2,  2], [ 57, 89,202]]]
```
This sample has three red pixels.
Function-plotting software must calculate a function at many points to plot it. Given the function:

$f (x) = x 4 + 17 × x 3 − 416 × x 2 − 612 × x + 2500 500$
1. Write a program that calculates and stores 100,000 values for f (x) between x = –50 and x = 50.
2. Extend the program so that it searches for values for x that are very close to, or are, zero. How many x values between –50 and 50 make f(x) zero? What are they?
The three witches in Hamlet can brew any potion provided they have the right ingredients. Suppose that five ingredients are necessary in making a health potion: eye of newt (eon), toe of frog (tof), wool of bat (wob), adder’s fork (af), and tooth of wolf (tow). Four reactions can occur between these ingredients:
- 4 eon + 2 wob = 3 af + 4 tow
- 3 tow + 1 tof = 2 eon
- 1 wob + 2 af = 1 tof
- 4 tof + 7 tow + 2 af = 1 health potion
  
  Assuming you can control the order of reactions, write a program that can calculate the maximum number of health potions one can brew with a given amount of ingredients. Here is example output:
```
If I have 34 eon, 59 tof, 20 wob, 5 af, and 20 tow, I can make seven health potions.
```