6 Objects and data structures

There is a reason that we keep our variables private. We don’t want anyone else to depend on them. Why, then, do so many programmers automatically add getters and setters to their objects, exposing their private variables as if they were public?

Data abstraction

Hiding implementation is not just a matter of putting a layer of functions between the variables. Hiding implementation is about abstractions! A class should expose abstract interfaces that allow its users to manipulate the essence of the data without having to know its implementation. Serious thought needs to be put into the best way to represent the data that an object contains. The worst option is to blithely add getters and setters.
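
As a rough illustration of the difference, the sketch below contrasts an interface that leaks its representation with one that exposes only the essence of the data. The fuel-tank domain and all names (ConcreteVehicle, Vehicle, GasolineCar) are assumptions chosen for the example, not something this summary prescribes.

```java
// Concrete: the method names reveal that fuel is stored as gallons.
interface ConcreteVehicle {
    double getFuelTankCapacityInGallons();
    double getGallonsOfGasoline();
}

// Abstract: callers learn how full the tank is, not how that is stored.
interface Vehicle {
    double getPercentFuelRemaining();
}

// Hypothetical implementation. The gallon fields stay private, so the
// representation could change without touching any caller of Vehicle.
final class GasolineCar implements Vehicle {
    private final double tankCapacityInGallons;
    private final double gallonsRemaining;

    GasolineCar(double tankCapacityInGallons, double gallonsRemaining) {
        if (tankCapacityInGallons <= 0) {
            throw new IllegalArgumentException("tank capacity must be positive");
        }
        this.tankCapacityInGallons = tankCapacityInGallons;
        this.gallonsRemaining = gallonsRemaining;
    }

    @Override
    public double getPercentFuelRemaining() {
        return 100.0 * gallonsRemaining / tankCapacityInGallons;
    }
}
```

A caller holding a Vehicle cannot tell whether the value comes from gallons, litres, or a battery gauge, which is the point of the abstraction.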

Data/object anti-symmetry

Objects hide their data behind abstractions and expose functions that operate on that data. Data structures expose their data and have no meaningful functions. They are virtual opposites. This difference may seem trivial, but it has far-reaching implications.
Procedural code (code using data structures) makes it easy to add new functions without changing the existing data structures. OO code, on the other hand, makes it easy to add new classes without changing existing functions. Procedural code makes it hard to add new data structures because all the functions must change. OO code makes it hard to add new functions because all the classes must change.
Mature programmers know that the idea that everything is an object is a myth. Sometimes you really do want simple data structures with procedures operating on them.
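
The trade-off can be sketched with a small shapes example. Everything here (SquareData, CircleData, Geometry, Shape, Square, Circle) is a hypothetical name chosen for illustration. The first half is procedural, the second half is object-oriented.

```java
// Procedural style: plain data structures, behavior lives in a separate class.
final class SquareData {
    public double side;
}

final class CircleData {
    public double radius;
}

final class Geometry {
    // Adding perimeter() here touches no data structure, but adding a new
    // shape means every function in this class must learn about it.
    static double area(Object shape) {
        if (shape instanceof SquareData) {
            SquareData s = (SquareData) shape;
            return s.side * s.side;
        }
        if (shape instanceof CircleData) {
            CircleData c = (CircleData) shape;
            return Math.PI * c.radius * c.radius;
        }
        throw new IllegalArgumentException("unknown shape: " + shape);
    }
}

// Object-oriented style: data is hidden, behavior is polymorphic.
interface Shape {
    // Adding a new Shape class touches no existing code, but adding
    // perimeter() here means every implementing class must change.
    double area();
}

final class Square implements Shape {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    @Override
    public double area() {
        return side * side;
    }
}

final class Circle implements Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    @Override
    public double area() {
        return Math.PI * radius * radius;
    }
}
```

Neither half is better in general. Which one fits depends on whether new types or new operations are the more likely change.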

The law of Demeter

There is a well-known heuristic called the Law of Demeter that says a module should not know about the innards of the objects it manipulates. More precisely, the Law of Demeter says that a method f of a class C should only call the methods of these:

  • C
  • An object created by f
  • An object passed as an argument to f
  • An object held in an instance variable of C

The method should not invoke methods on objects that are returned by any of the allowed functions. In other words, talk to friends, not to strangers.
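
The four permitted targets can be seen in one small method. In this sketch Checkout plays the role of C and charge plays the role of f. The classes AuditLog and Order, and the flat shipping fee, are assumptions invented for the example.

```java
// Hypothetical collaborator that Checkout holds in an instance variable.
final class AuditLog {
    private String lastLine = "";

    void log(String line) {
        lastLine = line;
    }

    String lastLine() {
        return lastLine;
    }
}

// Hypothetical argument type.
final class Order {
    private final int quantity;
    private final int unitPriceCents;

    Order(int quantity, int unitPriceCents) {
        this.quantity = quantity;
        this.unitPriceCents = unitPriceCents;
    }

    int totalCents() {
        return quantity * unitPriceCents;
    }
}

final class Checkout {                                // plays the role of C
    private static final int SHIPPING_CENTS = 500;    // assumed flat fee
    private final AuditLog auditLog;

    Checkout(AuditLog auditLog) {
        this.auditLog = auditLog;
    }

    int charge(Order order) {                         // plays the role of f
        int orderTotal = order.totalCents();          // an argument of f
        int total = addShipping(orderTotal);          // a method of C itself
        StringBuilder message = new StringBuilder();  // an object created by f
        message.append("charged ");
        message.append(total);
        auditLog.log(message.toString());             // an instance variable of C
        return total;
    }

    private int addShipping(int cents) {
        return cents + SHIPPING_CENTS;
    }
}
```

Note that charge never calls a method on something another call handed back to it, apart from passing the resulting values along.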

Train wrecks

Chains of calls, in which each call is made on the value returned by the previous one, are often called train wrecks. They are generally considered sloppy style and should be avoided; it is usually best to split them up. Whether the split-up code still violates Demeter depends on whether the intermediate values are objects or data structures. If they are objects, then knowledge of their innards is a clear violation of the Law of Demeter. On the other hand, if they are just data structures with no behavior, then they naturally expose their internal structure, and so Demeter does not apply.
This issue would be a lot less confusing if data structures simply had public variables and no functions, whereas objects had private variables and public functions. However, there are frameworks and standards (e.g., “beans”) that demand that even simple data structures have accessors and mutators.
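
A minimal sketch of the three forms discussed above, using hypothetical classes (Context, Options, ScratchDir and their "Data" counterparts) that exist only for this example:

```java
// Accessor-based classes: it is unclear whether these are objects or data.
final class ScratchDir {
    private final String absolutePath;

    ScratchDir(String absolutePath) {
        this.absolutePath = absolutePath;
    }

    String getAbsolutePath() {
        return absolutePath;
    }
}

final class Options {
    private final ScratchDir scratchDir;

    Options(ScratchDir scratchDir) {
        this.scratchDir = scratchDir;
    }

    ScratchDir getScratchDir() {
        return scratchDir;
    }
}

final class Context {
    private final Options options;

    Context(Options options) {
        this.options = options;
    }

    Options getOptions() {
        return options;
    }
}

// Plain data structures: public variables, no functions.
final class ScratchDirData {
    public String absolutePath;
}

final class OptionsData {
    public ScratchDirData scratchDir;
}

final class ContextData {
    public OptionsData options;
}

final class TrainWreckExamples {
    // The train wreck: each call is made on the result of the previous one.
    static String chained(Context ctxt) {
        return ctxt.getOptions().getScratchDir().getAbsolutePath();
    }

    // Split up: easier to read, but it knows exactly as much about the innards.
    static String splitUp(Context ctxt) {
        Options opts = ctxt.getOptions();
        ScratchDir scratchDir = opts.getScratchDir();
        return scratchDir.getAbsolutePath();
    }

    // With plain data structures the navigation is unambiguous, and
    // Demeter does not apply.
    static String fromData(ContextData ctxt) {
        return ctxt.options.scratchDir.absolutePath;
    }
}
```

Splitting the chain improves readability only. Whether the code respects Demeter still depends on what Context, Options, and ScratchDir are meant to be.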

Hybrids

Hybrid structures are half object and half data structure. They have functions that do significant things, and they also have either public variables or public accessors and mutators. Such hybrids make it hard to add new functions but also make it hard to add new data structures. They are the worst of both worlds. Avoid creating them.

Hiding structure

Because objects are supposed to hide their internal structure, we should not be able to navigate through them. Neither of the obvious workarounds feels good: adding pass-through methods to the first object can lead to an explosion of methods, and presuming that the object returns a data structure only sidesteps the question. Instead, we should be telling the object to do something.
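
One way to picture "telling it to do something" is sketched below. Suppose a caller only wanted a directory path in order to create a scratch file there. BuildContext, ScratchFile, and createScratchFile are hypothetical names, and the in-memory ScratchFile is a stand-in for real file I/O so that the example stays self-contained.

```java
// Stand-in for a real file handle; it keeps its contents in memory.
final class ScratchFile {
    private final String path;  // no accessor: where the file lives stays hidden
    private final StringBuilder contents = new StringBuilder();

    ScratchFile(String path) {
        this.path = path;
    }

    void write(String text) {
        contents.append(text);
    }

    String contents() {
        return contents.toString();
    }
}

final class BuildContext {
    private final String scratchDirPath;  // internal detail, no getter exposes it

    BuildContext(String scratchDirPath) {
        this.scratchDirPath = scratchDirPath;
    }

    // Callers say what they want done. They never see the directory, so they
    // cannot come to depend on how the context organises its scratch space.
    ScratchFile createScratchFile(String fileName) {
        return new ScratchFile(scratchDirPath + "/" + fileName);
    }
}
```

The caller's code shrinks to a single request, and the context is free to change where and how scratch files are stored.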

Data transfer objects

The quintessential form of a data structure is a class with public variables and no functions. This is sometimes called a data transfer object, or DTO. DTOs are very useful structures, especially when communicating with databases or parsing messages from sockets, and so on. The quasi-encapsulation of beans (private variables manipulated by getters and setters) seems to make some OO purists feel better but usually provides no other benefit.
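
A small sketch of both forms follows. AddressDto, AddressBean, and the pipe-delimited message format read by AddressMessages.parse are assumptions made up for the example.

```java
// The plain form: public variables, no functions.
final class AddressDto {
    public String street;
    public String city;
    public String zip;
}

// The bean form of one field: same data, reached through accessors.
final class AddressBean {
    private String city;

    public String getCity() {
        return city;
    }

    public void setCity(String city) {
        this.city = city;
    }
}

// Typical DTO use: turn a raw message into structured data.
final class AddressMessages {
    // Assumed wire format for this sketch: "street|city|zip".
    static AddressDto parse(String line) {
        String[] fields = line.split("\\|", -1);
        if (fields.length != 3) {
            throw new IllegalArgumentException(
                "expected 3 fields, got " + fields.length);
        }
        AddressDto dto = new AddressDto();
        dto.street = fields[0];
        dto.city = fields[1];
        dto.zip = fields[2];
        return dto;
    }
}
```

The bean hides nothing a caller cannot read or overwrite through its accessors, which is why the encapsulation is only "quasi".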

Active Record

Active Records are special forms of DTOs. They are data structures with public (or bean-accessed) variables, but they typically have navigational methods like save and find. Treat the Active Record as a data structure, and create separate objects that contain the business rules and that hide their internal data (which are probably just instances of the Active Record).
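
The separation might look like the sketch below. EmployeeRecord, Payroll, and the overtime rule are hypothetical, and the static map is an in-memory stand-in for a database table so that save and find work without any external setup.

```java
// Active Record treated as a data structure: public variables plus
// navigational methods, and no business rules.
final class EmployeeRecord {
    // In-memory stand-in for a database table (assumption for this sketch).
    private static final java.util.Map<String, EmployeeRecord> TABLE =
        new java.util.HashMap<>();

    public String name;
    public int hourlyRateCents;
    public int hoursWorked;

    void save() {
        TABLE.put(name, this);
    }

    // Returns null when no record with that name has been saved.
    static EmployeeRecord find(String name) {
        return TABLE.get(name);
    }
}

// Separate object holding the business rules. It hides its internal data,
// which is just an instance of the Active Record.
final class Payroll {
    private static final int REGULAR_HOURS = 40;
    private final EmployeeRecord record;

    Payroll(EmployeeRecord record) {
        this.record = record;
    }

    // Assumed rule: hours beyond 40 are paid at one and a half times the rate.
    int grossPayCents() {
        int regular = Math.min(record.hoursWorked, REGULAR_HOURS);
        int overtime = Math.max(record.hoursWorked - REGULAR_HOURS, 0);
        return regular * record.hourlyRateCents
            + overtime * record.hourlyRateCents * 3 / 2;
    }
}
```

If the pay rule changes, only Payroll changes. The record stays a plain carrier of persisted data.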

Conclusion

Objects expose behavior and hide data. This makes it easy to add new kinds of objects without changing existing behaviors. It also makes it hard to add new behaviors to existing objects. Data structures expose data and have no significant behavior. This makes it easy to add new behaviors to existing data structures but makes it hard to add new data structures to existing functions.
