Variable References and Mutability of Ruby Objects

This article discusses references and variables in Ruby, along with the mutability or immutability of objects. It serves as a lead-in to separate articles that discuss mutating and non-mutating methods, and pass by reference/pass by value in Ruby.

Note: This article was originally published on the Launch School blog on 2016-07-23.

This is the first in a series of three articles that discuss how Ruby manipulates variables and objects, and, in particular, how objects are passed around in a Ruby program. You can find many such discussions in articles that attempt to answer the question “Is Ruby pass by reference or pass by value?” Our goal in this series isn’t necessarily to answer this question — though we will provide an answer, of sorts — but to put that question in the context of how Ruby actually works.

The answer to the question really isn’t that important. What is important is more fundamental: knowing how Ruby works. A proper understanding of how Ruby works will take you much further than being able to provide an answer to the question; it will help you learn to anticipate what Ruby will do, when it will do it, and why it will do it. This will help you become a better Ruby programmer, and reduce the number of bugs you encounter due to unexpected behaviors.

In this article, we start out by exploring how Ruby variables and objects are connected to each other, and the part that mutability or immutability of objects plays when manipulating variables. We will also introduce the concepts of pass by reference and pass by value, but a more detailed discussion will be postponed until the article Object Passing in Ruby — by Reference or by Value.

Introduction

It seems simple: you take an object in Ruby, and you assign it to a variable. Subsequently, that variable can be used to access and manipulate the object. It’s the same in dozens of different programming languages. Can Ruby be any different? What kind of complexity can lurk behind such a simple concept?

As simple as the idea of variables seems, there are differences between languages. Some languages, like C++ and Perl, make a copy of an object when you assign it to a variable; others, like JavaScript and Python, create a link (a reference or binding) between the variable and the object in question. These differences can lead to unexpected behavior if you don’t understand how your favorite language does things.

Variables and References

An object is a bit of data that has some sort of state — sometimes called a value — and associated behavior. It can be simple, like the Boolean object true, or it can be complex, like an object that represents a database connection.

Objects can be assigned to variables, like this:

>> greeting = 'Hello'
=> "Hello"

This tells Ruby to associate the name greeting with the String object whose value is 'Hello'. In Ruby, greeting is said to reference (or point to) the String object. We can also talk of the variable as being bound to the String object, or binding the variable to the String object. Internally, the relationship looks like this:

greeting is a reference to the String ‘Hello’

Here, the String object represented by the literal 'Hello' is assigned to a variable that has the name greeting. This causes the variable greeting to reference the String object whose value is 'Hello'. It does so by storing the object id of the String. Subsequently, we can simply use the value 'Hello' by using the variable greeting:

>> greeting
=> "Hello"
>> greeting.object_id
=> 70101471431160

We use `#object_id` frequently in this article. Every object in Ruby has a unique object id, and that object id can be retrieved simply by calling `#object_id` on the object in question. Even literals, such as numbers, booleans, `nil`, and Strings have object ids:

>> 5.object_id
=> 11
>> true.object_id
=> 20
>> nil.object_id
=> 8
>> "abc".object_id
=> 70101471581080

Get comfortable with using #object_id, both while reading this article, and whenever you have trouble understanding why an object has an unexpected value.
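As a small sketch of why #object_id is useful, the following compares two Strings that hold the same characters but are separate objects. The variable names first and second are ours, and String.new is used here only to guarantee two distinct, unfrozen String objects.

```ruby
# Two separate String objects that happen to hold the same characters.
first  = String.new('abc')
second = String.new('abc')

same_value  = (first == second)                      # compares contents
same_object = (first.object_id == second.object_id)  # compares identity

puts same_value   # prints true
puts same_object  # prints false

# Small Integers behave differently: every 5 is the same object.
puts 5.object_id == 5.object_id  # prints true
```

Equal values do not imply the same object; comparing object ids tells you which situation you are in.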

Let’s assign greeting to a new variable:

>> whazzup = greeting
=> "Hello"
>> greeting
=> "Hello"
>> whazzup
=> "Hello"
>> greeting.object_id
=> 70101471431160
>> whazzup.object_id
=> 70101471431160

Internally, the situation now looks like this:

greeting and whazzup reference the same String

While the exact value returned by #object_id on your computer may differ from what we show, you should see that both object ids are the same. This demonstrates that both greeting and whazzup not only reference a String with the same value, but are, in fact, references to the same String; greeting and whazzup are aliases for each other. We can show this by using one of the two variables to mutate the object (to change its state or value):

>> greeting.upcase!
=> "HELLO"
>> greeting
=> "HELLO"
>> whazzup
=> "HELLO"
>> whazzup.concat('!')
=> "HELLO!"
>> greeting
=> "HELLO!"
>> whazzup
=> "HELLO!"
>> greeting.object_id
=> 70101471431160
>> whazzup.object_id
=> 70101471431160

Since both variables are associated with the same object, using either variable to mutate the object is reflected in the other variable. We can also see that the object id does not change. Internally, we now have this:

greeting and whazzup reference the same mutated String

Reassignment

Let’s assign a new object to one of these variables:

>> greeting = 'Dude!'
=> "Dude!"
>> puts greeting
Dude!
=> nil
>> puts whazzup
HELLO!
=> nil
>> greeting.object_id
=> 70101479528400
>> whazzup.object_id
=> 70101471431160

Here, we see that greeting and whazzup no longer refer to the same object; they have different values and different object ids. Crazy, right? Internally, we now have:

greeting and whazzup reference different Strings

What this shows is that reassignment to a variable doesn’t mutate the object referenced by that variable; instead, the variable is bound to a different object. The original object is merely disconnected from the variable. In this example, greeting is bound to the String object whose value is Dude!, while whazzup continues to reference the String object whose value is HELLO!.
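The difference between mutation and reassignment can be sketched in a short script. This is only an illustration: it reuses the variable names from the irb session, and the helper names id_before, mutated_in_place, and rebound are ours.

```ruby
greeting  = String.new('Hello')
whazzup   = greeting             # both names reference one String
id_before = greeting.object_id

greeting << '!'                  # mutation: same object, new value
mutated_in_place = (greeting.object_id == id_before)

greeting += '?'                  # reassignment: greeting + '?' is a new String
rebound = (greeting.object_id != id_before)

puts mutated_in_place  # prints true
puts rebound           # prints true
puts greeting          # prints Hello!?
puts whazzup           # prints Hello!
```

After the reassignment, whazzup still references the original String, so it sees the mutation made by << but not the new object created by +=.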

Mutability

Now that we understand what references are and how they relate to variables, we can investigate mutability. Objects can be either mutable or immutable. Mutable objects can be mutated; that is, their values can be altered. Immutable objects cannot be mutated; a variable that references one can only be reassigned to a different object.

Some objects can be mutated, others can’t. Again, other languages may do something different. In C++ and Perl, for instance, string objects are mutable, but in Java and Python, string objects are immutable. Understanding mutability of an object is necessary to understanding how your language deals with those objects.

Immutable Objects

In Ruby, numbers and boolean values are immutable. Once we create an immutable object, we cannot change it. “But,” we hear you ask, “What about this code?”

>> number = 3
=> 3
>> number
=> 3
>> number = 2 * number
=> 6
>> number
=> 6

“Doesn’t this show that the object 3 was changed to 6?”

Nope. As we saw above, this is reassignment which, as we learned, doesn’t mutate the object. Instead, it binds a different object to the variable. In this case, we create a new Integer with a value of 6 and assign it to number. There are, in fact, no methods available that let you mutate the value of any immutable object. All you can do is reassign the variable so it references a different object. This disconnects the original object from the variable.

Internally, the reassignment looks like this:

number before and after assignment

Let’s demonstrate this in irb:

>> a = 5.2
=> 5.2
>> b = 7.3
=> 7.3
>> a
=> 5.2
>> b
=> 7.3
>> a.object_id
=> 46837436124653162
>> b.object_id
=> 65752554559609242

We start by assigning the Float values 5.2 and 7.3 to the variables a and b. We see these values and their object ids when we ask irb to display them. The object ids differ, so the variables reference different objects.

Let’s see what happens if b is assigned to a:

>> a = b
=> 7.3
>> a
=> 7.3
>> b
=> 7.3
>> a.object_id
=> 65752554559609242
>> b.object_id
=> 65752554559609242

irb now displays the same value for each variable. More interestingly, it shows that the object ids for both a and b are the same. The object that originally held the value 5.2 is no longer available through either a or b.

Let’s try to alter the object now:

>> b += 1.1
=> 8.4
>> a
=> 7.3
>> b
=> 8.4
>> a.object_id
=> 65752554559609242
>> b.object_id
=> 32425917317067566

On the first line, we try to alter the object referenced by b by incrementing b by 1.1. This yields 8.4 and, as we can see, b is also set to 8.4. a has not been changed, and still references the 7.3 object. But, b now references a completely new object. Though we changed the value associated with b, we didn't mutate the object -- the object is immutable. Instead, += created a brand-new Float object and bound b to the new object.

Simple assignment never mutates an immutable object:

>> a = 5.5
=> 5.5
>> a.object_id
=> 49539595901075458

Instead of mutating the original object, a new Float is created and a is bound to the new object.

Immutable objects aren’t limited to numbers and booleans. Some other objects, such as nil (the only instance of the NilClass class) and Range objects (e.g., 1..10), are immutable as well. Any class can establish itself as immutable by simply not providing any methods that alter its state.
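Here is a minimal sketch of a class that is immutable because it offers no state-altering methods. The Temperature class and its warmer_by method are hypothetical names invented for this illustration, and the call to freeze is an extra safeguard of ours rather than something the approach requires.

```ruby
# Hypothetical value class: it defines no setters and no mutating methods.
class Temperature
  attr_reader :degrees   # reader only; no degrees= setter is defined

  def initialize(degrees)
    @degrees = degrees
    freeze               # optional: Ruby now rejects later instance variable changes
  end

  # Returns a new Temperature instead of changing this one.
  def warmer_by(amount)
    Temperature.new(degrees + amount)
  end
end

cool = Temperature.new(18)
warm = cool.warmer_by(5)

puts cool.degrees                  # prints 18
puts warm.degrees                  # prints 23
puts cool.equal?(warm)             # prints false
puts cool.respond_to?(:degrees=)   # prints false
puts 3.frozen?                     # prints true
puts nil.frozen?                   # prints true
```

Just as with numbers, the only way to get a "changed" Temperature is to create a new object and bind a variable to it.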

Mutable Objects

Unlike numbers, booleans, and a few other types, most objects in Ruby are mutable; they are objects of a class that permits changes to the object’s state in some way. Whether mutation is permitted by setter methods or by calling methods that perform more complex operations is unimportant; as long as you can mutate an object, it is mutable.

A setter method (or simply, a setter) is a method defined by a Ruby object that allows a programmer to explicitly change the value of part of an object. Setters always use a name like something=. For our purposes in this series of articles, we're mostly interested in array element setters, e.g., the Array#[]= method, which is called like this:

>> a = [1, 2, 3, 4, 5]
>> a[3] = 0 # calls setter method
>> a # => [1, 2, 3, 0, 5]

Other setters show up in conjunction with classes, a topic we discuss in RB120. Here’s a simple example:

class Dog
  def name=(new_name)
    @name = new_name
  end
end
dog = Dog.new
dog.name = "Fido" # calls setter method for `name` attribute
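To see that a setter mutates the object rather than creating a new one, here is a slightly extended sketch of the Dog class. The attr_reader and the names same_dog and id_before are additions of ours so that the state can be inspected.

```ruby
class Dog
  attr_reader :name        # getter, added here so we can inspect the state

  def name=(new_name)      # setter, as in the example above
    @name = new_name
  end
end

dog       = Dog.new
same_dog  = dog            # second reference to the same Dog
id_before = dog.object_id

dog.name = 'Fido'          # mutates the Dog through its setter

puts same_dog.name                 # prints Fido
puts dog.object_id == id_before    # prints true
```

The change made through dog is visible through same_dog, and the object id is unchanged, which is exactly the behavior we saw with mutated Strings.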

Consider Ruby Array objects; you can use index assignment to alter what object is referenced by an element:

>> a = %w(a b c)
=> ["a", "b", "c"]
>> a.object_id
=> 70227178642840
>> a[1] = '-' # calls `Array#[]=` setter method
=> "-"
>> a
=> ["a", "-", "c"]
>> a.object_id
=> 70227178642840

This demonstrates that we can mutate the array that a refers to. However, it doesn't create a new array since the object id remains the same. We can see why this is by looking at how a is stored in memory:

Arrays in memory

We can see that a is a reference to an Array, and, in this case, that Array contains three elements; each element is a reference to a String object. When we assign '-' to a[1], we are binding a[1] to a new String. We're mutating the array given by a by assigning a new string to the element at index 1 (a[1]).

Strings and other collection classes are similar in the way they behave — variables reference the collection (or String), and the collection contains references to the actual objects in the collection. Strings are a little bit different — it’s not really necessary to have separate objects for each character — but they act in a similar way.
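One way to observe that a collection holds references is to make a second Array that references the same element objects. This sketch assumes Array#dup for that purpose (it copies the Array but not the elements); the names letters and copy are ours.

```ruby
letters = [String.new('a'), String.new('b'), String.new('c')]
copy    = letters.dup      # a new Array whose elements reference the same Strings

letters[0].upcase!         # mutates the String that both arrays reference
letters[1] = '-'           # rebinds one element of letters only

p letters  # prints ["A", "-", "c"]
p copy     # prints ["A", "b", "c"]
```

Mutating an element shows up in both arrays because they share that String, while index assignment only changes which object one array's element is bound to.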

Several Array methods, such as #delete, #fill, and #insert, mutate the original object without creating a new one.
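A brief sketch of this behavior follows, with Array#map included as a contrasting non-mutating method. The variable names are ours and the choice of methods is only illustrative.

```ruby
numbers   = [1, 2, 3, 4, 5]
id_before = numbers.object_id

numbers.delete(3)        # removes every element equal to 3
numbers.insert(1, 9)     # inserts 9 before index 1
p numbers                              # prints [1, 9, 2, 4, 5]
puts numbers.object_id == id_before    # prints true

doubled = numbers.map { |n| n * 2 }    # non-mutating: returns a new Array
p doubled                              # prints [2, 18, 4, 8, 10]
p numbers                              # prints [1, 9, 2, 4, 5]
puts doubled.equal?(numbers)           # prints false
```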

A Brief Introduction to Object Passing

When you pass an object as an argument to a method, the method can — in theory — either mutate the object or leave it unchanged. It’s easy enough to see that any method can avoid mutating its arguments. However, whether or not the method can mutate an argument is less clear; the ability to mutate arguments depends in part on the mutability or immutability of the object represented by the argument, but also on how the argument is passed to the method.

Some languages make copies of method arguments, and pass those copies to the method — since they are merely copies, the original objects can’t be mutated. Objects passed to methods in this way are said to be passed by value, and the language is said to be using a pass by value object passing strategy.

Other languages pass references to the method instead — a reference can be used to mutate the original object, provided that object is mutable. Objects passed to methods in this way are said to be passed by reference, and the language is said to be using a pass by reference object passing strategy.

Many languages employ both object passing strategies. One strategy is used by default; the other is used when a special syntax, keyword, or declaration is used. Some languages may even employ different defaults depending on the object type — for example, numbers may be passed using a pass by value strategy, while strings may be passed using a pass by reference strategy.

Regardless of which strategy a language employs for a given argument and method, it’s important to know which one is used so you can understand what happens if the method appears to mutate one of its arguments.

Developing A Mental Model

When learning new concepts, it often helps to develop a mental model of the concept, and then refine that model as additional information comes to light. We’re now in a position to begin formulating our mental model for object passing in Ruby.

Pass by value, as you’ll recall, means copying the original objects, so the original object cannot be mutated. Since immutable objects cannot be changed, they act like Ruby passes them around by value. This isn’t a completely accurate interpretation of how Ruby passes immutable objects, but it helps us determine why the following code works as it does:

def increment(a)
  a = a + 1
end
b = 3
puts increment(b) # prints 4
puts b # prints 3

Here, the numeric object 3 is immutable. You can reasonably say that b's value is not mutated by #increment since 3 is passed by value to #increment where it is bound to variable a. Even though a is reassigned to 4 inside the method, and that 4 is returned to the caller, the original object referenced by b is untouched.
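To hint at why this model "isn’t completely accurate", the following sketch records object ids inside the method. The method name increment_with_ids and its extra return values are hypothetical, added only so the ids can be inspected from outside.

```ruby
def increment_with_ids(a)
  id_on_entry = a.object_id   # a starts out bound to the caller's object
  a = a + 1                   # reassignment binds a to a different Integer
  [a, id_on_entry, a.object_id]
end

b = 3
result, id_on_entry, id_after = increment_with_ids(b)

puts result                        # prints 4
puts b                             # prints 3
puts id_on_entry == b.object_id    # prints true
puts id_after == b.object_id       # prints false
```

On entry, a references the very same object as b; no copy is made. The caller is unaffected only because the method reassigns a instead of mutating the object, and an Integer offers no way to mutate it.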

Mutable objects, on the other hand, can always be mutated simply by calling one of their mutating methods. They act like Ruby passes them around by reference; it isn’t necessary for a method to mutate an object that is passed by reference, only that it can mutate the object. As you’ll recall, pass by reference means that only a reference to an object is passed around; the variables used inside a method are bound to the original objects. This means that the method is free to mutate those objects. Once again, this isn’t completely accurate, but it is helpful. For instance:

def append(s)
  s << '*'
end
t = 'abc'
puts append(t) # prints abc*
puts t # prints abc*

Here, the String object abc is mutable. You can reasonably say that the object referenced by t is mutated by #append since t's value is passed by reference to #append where it is bound to variable s. When we apply the << operator to s, the change is reflected through t as well. Upon return from the method, the value of t has been mutated. However, t still points to the same object in memory; it merely has a different value.
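The same caveat applies to mutable objects: a method only affects the caller's object if it actually mutates it. In this sketch, the method names mutate and rebind are hypothetical, chosen to contrast << with reassignment inside a method.

```ruby
def mutate(s)
  s << '*'      # mutates the String the caller passed in
end

def rebind(s)
  s += '*'      # binds s to a new String; the caller's String is untouched
  s
end

t = String.new('abc')
u = String.new('abc')

mutate(t)
returned = rebind(u)

puts t         # prints abc*
puts u         # prints abc
puts returned  # prints abc*
```

Both methods receive a reference to a mutable String, but only mutate changes the object that the caller's variable references.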

Conclusion

In this article, we’ve seen that Ruby variables are merely references to objects in memory; that is, a variable is merely a name for some object. Multiple variables can reference the same object, so mutating an object using a given variable name will be reflected in every other variable that is bound to that object. We’ve also learned that assignment to a variable merely changes the binding; the object the variable originally referenced is not mutated. Instead, a different object is bound to the variable.

We’ve also learned that certain object types, primarily numbers and Booleans but also some other types, are immutable in Ruby — unchanging; many other objects are mutable — changeable. If you attempt to change an immutable object, you won’t succeed — at best, you can create a new object, and bind a variable to that object with assignment. Mutable objects, however, can be mutated without creating new objects.

Finally, we’ve learned a bit about what pass by value and pass by reference mean. We’ve established a mental model that says that Ruby is pass by value for immutable objects, pass by reference otherwise. This model isn’t perfect, but it can be used to help determine whether the object bound to an argument will be mutated.

We’re now equipped with the tools we need to explore the differences between mutating and non-mutating methods. Continue reading at Ruby’s Mutating and Non-Mutating Methods.
