In previous chapters, we discussed various programming structures, namely, sequential, conditional, and loop, which can be employed in algorithms and programs. This chapter will increase the power of these structures by introducing a new data structure that can be employed with any of them, that is, the array. A data structure is any organized means of storage, an array being among the simplest of structures.
An array is an ordered list of variables. To get the idea, imagine a row of compartments, each of the compartments can contain something or be empty, and they are numbered sequentially starting at 0, as you see on the left side of .
Arrays start counting their elements, or individual variables, at index 0, as the index is the offset from the beginning of the array. Index 0 is the position of the first element in any array that contains at least one element. Likewise, the nth element can be found at index n – 1. Starting at 0 and ending at n – 1 may seem odd, but it is common among most programming languages.
Array indices are the offset from the first element. As a result, the first element is stored at index 0.
The arrays in are known as one-dimensional arrays because there is only one index or dimension. To access an element in an array in Ruby, the notation is
, where array_name
[index
]array_name
indicates the name of the array and index
indicates the element of the array being referenced.
Consider an array named arr
that stores a list of five test scores for a student. The student received the scores 73, 98, 86, 61, and 96. The first step of creating an array is the statement: array_name = Array.new
. The example code in shows how to initialize an array to store test scores.
arr
=
Array
.
new
()
arr
[
0
]
=
73
arr
[
1
]
=
98
arr
[
2
]
=
86
arr
[
3
]
=
61
arr
[
4
]
=
96
puts
arr
The code shown is actual Ruby syntax for initializing array indices 0 through 4. The result of the code is much like the array in the righthand side of , an array of size 5 with every element initialized. However, there is a quicker way to initialize an array, as shown in .
arr
=
[
73
,
98
,
86
,
61
,
96
]
puts
arr
No matter which way the array is initialized, the result is the same. To use the array, you access array_name[index]
, as if it were a variable of the data type expected, as shown in .
arr
=
[
5
,
6
]
arr
[
0
]
=
arr
[
0
]
+
10
puts
arr
[
0
]
The key advantage of arrays is highlighted when used in conjunction with loops. Since the syntax for accessing an element in an array is array_name[index], we can use a variable for the index instead of literal numbers, as in the examples just shown. Thus, we can change the index in every loop iteration, and traverse or move through every element in the array. To know when to stop traversing the array, we can get the number of elements in an array by using the following statement:
arr
.
size
New programmers often make errors when dealing with the bounds of an array. These are the basic rules for array bounds:
The first element in an array is at index 0.
arr.size
returns the number of elements in the array, not the highest indexed element.
The last element in an array is at index arr.size - 1
.
If we want to use a while loop to traverse an array, we need to initialize the index to 0 and increment it for every loop iteration. It is important to note that the condition in the while loop is index
< arr.size
, not index
<= arr.size
, for the reasons just mentioned. The code in is an example of basic array traversal that prints out every element in the array.
Using arrays is a flexible and organized way of expressing multiple related items. Every programming language has them. As an abstraction, arrays are expanded variables. A variable holds one value; an array holds many. For example, to record names of multiple institutions using variables requires the use of variables such as institution1
, institution2
, institution3
, and so on. With arrays, institutions[]
stores all institutions.
arr
=
[
73
,
98
,
86
,
61
,
96
]
index
=
0
while
(
index
<
arr
.
size
)
puts
arr
[
index
]
index
=
index
+
1
end
Running the code (stored in a file called array_4.rb
) gives the following output:
$
ruby
array_4
.
rb
73
98
86
61
96
However, in Ruby it is possible to accomplish the same goal in one line of code:
puts
arr
This example is meant to be merely an introduction to simple array traversal. More practical reasons to use loops and arrays together are illustrated later.
The example in shows how to find the maximum of a list of five non-negative numbers. What would you need to change to support negative numbers?
# Initialize array and loop values
arr
=
[
73
,
98
,
86
,
61
,
96
]
index
=
0
max
=
0
# Loop over each element in arr
while
(
index
<
arr
.
size
)
if
(
arr
[
index
]
>
max
)
# Update max
max
=
arr
[
index
]
end
index
=
index
+
1
end
# Output calculated max
puts
"Max ==> "
+
max
.
to_s
Line 2 creates a new array named arr
and initializes its variables.
Line 3 sets a counter named index
that will serve as an index into the array.
Line 4 declares a variable called max
that will be used to store the maximum number.
Lines 7–13 implement a loop that scans every element of the array.
Line 16 prints the maximum element at the end.
Each time the index variable index
is incremented in line 12, the if
statement in lines 8–11 tests to see whether the current value in the array indexed at index
is higher than the current value in max
. If the current value is higher, then the max
variable is updated with the highest value.
Run this program, called find_the_max.rb
. Your output should be:
$
ruby
find_the_max
.
rb
Max
==>
98
As an exercise to make sure you understand the preceding code, change the example to output the lowest value in the array.
Arrays that have more than one dimension are called multidimensional arrays. A common multidimensional array is the two-dimensional array, which can be used to represent matrices and coordinate systems. Unlike some other programming languages, Ruby does not provide built-in support for multidimensional arrays. The way to work around this is to put an array inside an array, which is essentially what a multidimensional array is anyway.
Consider the example in the previous section of an array that stores a list of five test scores for a student. Now, what if you had students with the following scores?
Geraldo: 73, 98, 86, 61, 96
Brittany: 60, 90, 96, 92, 77
Michael: 44, 50, 99, 65, 10
[0] | [1] | [2] | [3] | [4] | |
---|---|---|---|---|---|
[0] |
73 |
98 |
86 |
61 |
96 |
[1] |
60 |
90 |
96 |
92 |
77 |
[2] |
44 |
50 |
99 |
65 |
10 |
The best way to represent this data is to create one array with three indices and to have each index contain five elements. This is basically an array with three rows and five columns. To create such an array in Ruby, type in the following:
arr
=
[[
73
,
98
,
86
,
61
,
96
]
,
[
60
,
90
,
96
,
92
,
77
]
,
[
44
,
50
,
99
,
65
,
10
]]
puts
"Brittany's Third Exam: "
+
arr
[
1
][
2
].
to_s
By typing in this assignment statement, you have created a 3 × 5 table where the first, second, and third rows represent Geraldo’s, Brittany’s, and Michael’s test scores, respectively. To access an individual score, use the format array[row][column]
. So, if you wanted to know what Brittany scored on her third exam (remember, each index starts with 0, not 1), type:
puts
"Brittany's Third Exam: "
+
arr
[
1
][
2
].
to_s
The output should be:
Brittany
'
s
Third
Exam
:
96
The rules for traversing a multidimensional array are similar to traversing a one-dimensional array, except you have to add a nested loop for each extra dimension. In , we illustrate how to print out each value in the array arr
.
1
# Initialize array and loop values
2
arr
=
[[
73
,
98
,
86
,
61
,
96
]
,
3
[
60
,
90
,
96
,
92
,
77
]
,
4
[
44
,
50
,
99
,
65
,
10
]]
5
row
=
0
6
column
=
0
7
8
# Loop over each row
9
while
(
row
<
arr
.
size
)
10
puts
"Row: "
+
row
.
to_s
11
# Loop over each column
12
while
(
column
<
arr
[
row
].
size
)
13
# Print the item at position row x column
14
puts
arr
[
row
][
column
]
15
column
=
column
+
1
16
end
17
# Reset column, advance row
18
column
=
0
19
row
=
row
+
1
20
end
Step through and convince yourself that it generates the values you expect. What output values will be generated? What would happen if we added an if
statement after line 13 that asked to skip values that were prime? Give that a try, as it uses our prime number work.
Like one-dimensional arrays, you can output everything using one line of code:
puts
arr
The only problem with this statement is that Ruby will list all the values without any formatting. So it would be difficult to sort through all the data. We address means to format output in one particular case as part of our tic-tac-toe example in . We do not address formatting generally in this book.
In the previous example, we created a program that finds the highest score in the array. Now we will create a program that stores the five scores for all three students. Using what you learned previously, modify the program to find out which student has the highest score. Print out the highest score.
When you are done with your program, compare your program to the code in .
1
# initialize the array and index/score variables
2
arr
=
[[
73
,
98
,
86
,
61
,
96
]
,
3
[
60
,
90
,
96
,
92
,
77
]
,
4
[
44
,
50
,
99
,
65
,
10
]]
5
6
row
=
0
7
column
=
0
8
maxscore
=
0
9
maxrow
=
0
10
11
# for each row
12
while
(
row
<
arr
.
size
)
13
# for each column
14
while
(
column
<
arr
[
row
].
size
)
15
# update score variables
16
if
(
arr
[
row
][
column
]
>
maxscore
)
17
maxrow
=
row
18
maxscore
=
arr
[
row
][
column
]
19
end
20
# increment column
21
column
=
column
+
1
22
end
23
# reset column, increment row
24
column
=
0
25
row
=
row
+
1
26
end
27
28
# output name and high score information
29
if
maxrow
==
0
30
puts
"Geraldo has the highest score."
31
elsif
maxrow
==
1
32
puts
"Brittany has the highest score."
33
elsif
maxrow
==
2
34
puts
"Michael has the highest score."
35
else
36
puts
"Something didn't work correctly."
37
end
38
puts
"The high score was: "
+
maxscore
.
to_s
Lines 2–4 initialize a 3 × 5 array called arr
.
Lines 6–7 initialize the initial location of the row and column to start traversing the array.
Lines 8–9 declare maxscore
that will keep track of the highest score and maxrow
that will keep track of who has the highest score.
Lines 12–26 implement a loop that scans each element of the array.
Lines 29–37 compare the value of maxrow
and output the corresponding person’s name as the individual with the highest score.
Line 38 outputs the highest score.
Intuitively, like the previous example, there is a marker for the highest score. Whenever the program finds a score higher than the current value of maxscore
, it updates maxrow
to contain the value of the row in which the program found the high score (line 17) and maxscore
to reflect the highest score (line 18). The program then uses if-else
statements to find out who has the highest score (lines 29–37). Notice again how rows 0, 1, and 2 correspond with Geraldo, Brittany, and Michael, respectively. When you run the program, the output should read:
$
ruby
find_the_max_modified
.
rb
Michael
has
the
highest
score
.
The
high
score
was
:
99
Unlike arrays, which strictly use integer indices, hashes can use any data type as their index. What Ruby calls a “hash” is really a clever way of using a string data type to map quickly to a specific element inside an array.
The string is referred to as a hash key. Some kind of function must exist to map a string to a number. For example, a simple hash function could add up the ASCII codes for each letter and implement a modulo for the number of keys we have. A hash collision occurs when our hash function returns the same number for two different keys, which can be handled with various collision resolution algorithms. A simple collision resolution algorithm simply places all keys that have a collision into a bucket, and the bucket is sequentially scanned for the specific key that is requested when a collision occurs. A detailed discussion of hashing is beyond the scope of this book, but we wanted to illustrate the differences between a hash table and an array.
In most cases, strings are used to associate keys to values. For example, instead of using a two-dimensional array, we can use a hash to store student test scores by name as seen in . As shown, similar to arrays, line 1 creates a new hash structure. Likewise, element assignment, lines 2–4, follow the same process done for arrays.
1
scores
=
Hash
.
new
2
scores
[
"Geraldo"
]
=
[
98
,
95
,
93
,
96
]
3
scores
[
"Brittany"
]
=
[
74
,
90
,
84
,
92
]
4
scores
[
"Michael"
]
=
[
72
,
87
,
68
,
54
,
10
]
To access Brittany’s score, we could simply call on scores["Brittany"]
. Of course, the string "Brittany"
can also be replaced by a variable that holds that string.
Arrays are accessed with a numerical index, as in array[5]
. Hashes are accessed with a string as the index, as in scores["Brittany"]
.
1
scores
=
Hash
.
new
2
scores
[
"Geraldo"
]
=
[
98
,
95
,
93
,
96
]
3
scores
[
"Brittany"
]
=
[
74
,
90
,
84
,
92
]
4
scores
[
"Michael"
]
=
[
72
,
87
,
68
,
54
,
10
]
5
name
=
"Brittany"
6
puts
name
+
" first score is: "
+
scores
[
name
][
0
].
to_s
In line 5 of , we assigned “Brittany” to the variable name
; so, assuming that the code of is stored in file hash_2.rb
, executing the code should display Brittany’s first score on the screen:
$
ruby
hash_2
.
rb
Brittany
first
score
is
:
74
It is possible to get an array of all the keys by calling on scores.keys
. We can then go through each key by using a for loop. We can now rewrite the maximum score example to work for any number of students, no matter what their names are or how many scores each student has.
Note that in our example, the number of individual scores varies among the students. That is, in , both “Geraldo” and “Brittany” have four scores each, while “Michael” has five. The ability to have varying numbers of entries provides great flexibility.
1
scores
=
Hash
.
new
2
3
scores
[
"Geraldo"
]
=
[
98
,
95
,
93
,
96
]
4
scores
[
"Brittany"
]
=
[
74
,
90
,
84
,
92
]
5
scores
[
"Michael"
]
=
[
72
,
87
,
68
,
54
,
10
]
6
7
maxscore
=
0
8
for
name
in
scores
.
keys
9
column
=
0
10
while
(
column
<
scores
[
name
].
size
)
11
12
if
(
scores
[
name
][
column
]
>
maxscore
)
13
maxname
=
name
14
maxscore
=
scores
[
name
][
column
]
15
end
16
column
=
column
+
1
17
end
18
end
19
20
puts
maxname
+
" has the highest score."
21
puts
"The highest score is: "
+
maxscore
.
to_s
We see that running the code from , stored in file find_max_hash.rb
, will output the following result:
$
ruby
find_max_hash
.
rb
Geraldo
has
the
highest
score
.
The
highest
score
is
:
98
Note that the entries in this hash differ from the entries used in the array example.
Hashes cannot replace arrays outright. Due to the nature of their keys, they do not actually have any sensible sequence for their elements. Hashes and arrays serve separate but similar roles. Hashes excel at lookup. A hash keyed on name with a phone number as a value is much easier to work with than a multidimensional array of names and phone numbers.
Arrays refer to a sequence of variables where each variable does not have a name; instead, it is referenced by an integer index. That is, arr[i]
refers to the ith element in the sequence, remembering that indices start at 0. In contrast, a hash table uses a key-value pairing to identify the particular entry. In the earlier example, we wish to access test scores based on a person’s name. That is, the hash table arr
['Geraldo'
] identifies Geraldo’s test scores even though Geraldo is not an integer. Such referencing supports both efficient access and logical correlations.
We discussed one-dimensional arrays, arrays of arrays, and hashes. These are constructs that often take students time to learn, so we strongly suggest that you work through all the exercises in this chapter to ensure that you have a full understanding of these concepts.
Arrays are structures that use a table format to store variables. The data stored in an array is accessed using numbers as an index starting at 0. They can be used in any programming structure, but they are most commonly associated with the loop structure.
One key concept when working with arrays is that they can have an infinite number of dimensions. This means that a memory location within an array can either store a single piece of data or store an entirely new array.
Hashes are much like arrays, except that rather than using only an integer to look up a memory location, any variable can be used as a key.
Array: A consecutively numbered list of variables.
Element: A variable contained within an array.
Multidimensional array: An array whose elements are also arrays.
Index: The number associated with a certain element within an array.
Traverse: To move from one element to another within an array.
Hash: A data structure that can map any data type (key) to a value.
Using the array arr
with value a[0] = 9, a[1] = 2, a[2] = 5, a[3] = 4, a[4] = 3,
determine the output of the code in .
1
i
=
0
2
3
while
(
i
<
a
.
size
)
4
puts
a
[
i
]
5
i
=
i
+
1
6
end
The code in looks for the first two elements that are out of order and swaps them; however, it is not producing the correct results. Fix the code so that it works correctly.
1
arr
=
[
5
,
22
,
29
,
39
,
19
,
51
,
78
,
96
,
84
]
2
i
=
0
3
while
(
i
<
arr
.
size
-
1
and
arr
[
i
]
<
arr
[
i
+
1
]
)
4
i
=
i
+
1
5
end
6
puts
i
7
8
arr
[
i
]
=
arr
[
i
+
1
]
9
arr
[
i
+
1
]
=
arr
[
i
]
Write a program that splits an array into two arrays where any element in one array is smaller than any element in the other array. Solutions are not unique, but equally sized splits are desirable. The input can be any size array less than 100.
Example input: [6, 45, 23, 65, 17, 48, 97, 32, 18, 9, 88]
Example output: [6, 23, 17, 18 , 9] < [45, 65, 48, 97, 32, 88]
There are many ways to store image data. One way is to store pixel data in a two-dimensional array. The pixel data is itself a three-element array that describes the amount of red, green, and blue in the pixel. The amount of red, green, or blue is a number from 0 to 255. Here are a few example RGB values:
red
=
[
255
,
0
,
0
]
green
=
[
0
,
255
,
0
]
blue
=
[
0
,
0
,
255
]
black
=
[
0
,
0
,
0
]
white
=
[
255
,
255
,
255
]
yellow
=
[
255
,
255
,
0
]
Suppose you have a picture and need to count red pixels. For a pixel to be red, it must be within the following RGB constraints:
The R
value must be greater than 100.
The G
and B
values must each be less than the R
value divided by 4.
Write this program. Use this sample data to test your program:
sample
=
[[[
65
,
67
,
23
]
,
[
234
,
176
,
0
]
,
[
143
,
0
,
0
]]
,
[[
255
,
30
,
51
]
,
[
156
,
41
,
38
]
,
[
3
,
243
,
176
]]
,
[[
255
,
255
,
255
]
,
[
0
,
0
,
0
]
,
[
133
,
28
,
13
]]
,
[[
26
,
43
,
255
]
,
[
48
,
2
,
2
]
,
[
57
,
89
,
202
]]]
This sample has three red pixels.
Function-plotting software must calculate a function at many points to plot it. Given the function:
Write a program that calculates and stores 100,000 values for f (x) between x = –50 and x = 50.
Extend the program so that it searches for values for x that are very close to, or are, zero. How many x values between –50 and 50 make f(x) zero? What are they?
The three witches in Hamlet can brew any potion provided they have the right ingredients. Suppose that five ingredients are necessary in making a health potion: eye of newt (eon), toe of frog (tof), wool of bat (wob), adder’s fork (af), and tooth of wolf (tow). Four reactions can occur between these ingredients:
4 eon + 2 wob = 3 af + 4 tow
3 tow + 1 tof = 2 eon
1 wob + 2 af = 1 tof
4 tof + 7 tow + 2 af = 1 health potion
Assuming you can control the order of reactions, write a program that can calculate the maximum number of health potions one can brew with a given amount of ingredients. Here is example output:
If
I
have
34
eon
,
59
tof
,
20
wob
,
5
af
,
and
20
tow
,
I
can
make
seven
health
potions
.