[1]Dawid Ciężarkiewicz aka `dpc`


The faster you unlearn OOP, the better for you and your software

  November 20, 2018

    Object-oriented programming is an exceptionally bad idea which could
    only have originated in California.

  — Edsger W. Dijkstra

  Maybe it's just my experience, but [6]Object-Oriented Programming seems
  like the default, most common paradigm of software engineering: the one
  typically taught to students, featured in online material and, for some
  reason, spontaneously applied even by people who didn't intend to.

  I know how seductive it is, and how great an idea it seems on the
  surface. It took me years to break its spell, and understand clearly
  how horrible it is and why. Because of this perspective, I have a
  strong belief that it's important that people understand what is wrong
  with OOP, and what they should do instead.

  Many people have discussed problems with OOP before, and I will provide
  a list of my favorite articles and videos at the end of this post.
  Before that, I'd like to give my own take on it.

Data is more important than code

  At its core, all software is about manipulating data to achieve a
  certain goal. The goal determines how the data should be structured,
  and the structure of the data determines what code is necessary.

  This part is very important, so I will repeat it: goal -> data
  architecture -> code. One must never change the order here! When
  designing a piece of software, always start by figuring out what you
  want to achieve, then at least roughly think about the data
  architecture: the data structures and infrastructure you need to
  achieve it efficiently. Only then write your code to work within that
  architecture. If the goal changes over time, alter the architecture,
  then change your code.

  In my experience, the biggest problem with OOP is that it encourages
  ignoring the data model architecture and applying a mindless pattern of
  storing everything in objects, promising some vague benefits. If it
  looks like a candidate for a class, it goes into a class. Do I have a
  Customer? It goes into class Customer. Do I have a rendering context?
  It goes into class RenderingContext.

  Instead of building a good data architecture, the developer's attention
  is moved toward inventing “good” classes, relations between them,
  taxonomies, inheritance hierarchies and so on. Not only is this a
  useless effort, it's actually deeply harmful.

Encouraging complexity

  When explicitly designing a data architecture, the result is typically
  a minimum viable set of data structures that support the goal of our
  software. When thinking in terms of abstract classes and objects, there
  is no upper bound to how grandiose and complex our abstractions can be.
  Just look at [7]FizzBuzz Enterprise Edition – the reason why such a
  simple problem can be implemented in so many lines of code, is because
  in OOP there's always room for more abstractions.

  OOP apologists will respond that it's a matter of developer skill, to
  keep abstractions in check. Maybe. But in practice, OOP programs tend
  to only grow and never shrink because OOP encourages it.

Graphs everywhere

  Because OOP requires scattering everything across many, many tiny
  encapsulated objects, the number of references to these objects
  explodes as well. OOP requires passing long lists of arguments
  everywhere or holding references to related objects directly to
  shortcut it.

  Your class Customer has a reference to class Order and vice versa.
  class OrderManager holds references to all Orders, and thus indirectly
  to Customers. Everything tends to point to everything else because, as
  time passes, there are more and more places in the code that require
  referring to a related object.

    [8]You wanted a banana but what you got was a gorilla holding the
    banana and the entire jungle.

  Instead of a well-designed data store, OOP projects tend to look like a
  huge spaghetti graph of objects pointing at each other and methods
  taking long argument lists. When you start to design Context objects
  just to cut down on the number of arguments passed around, you know you're
  writing real OOP Enterprise-level software.

Cross-cutting concerns

  The vast majority of essential code is not operating on just one object
  – it is actually implementing cross-cutting concerns. Example: when
  class Player hits() a class Monster, where exactly do we modify data?
  Monster's hp has to decrease by Player's attackPower, and Player's xps
  increase by Monster's level if Monster got killed. Does it happen in
  Player.hits(Monster m) or Monster.isHitBy(Player p)? What if there's a
  class Weapon involved? Do we pass it as an argument to isHitBy or does
  Player have a currentWeapon() getter?

  This oversimplified example with just 3 interacting classes is already
  becoming a typical OOP nightmare. A simple data transformation becomes
  a bunch of awkward, intertwined methods that call each other for no
  reason other than OOP dogma of encapsulation. Adding a bit of
  inheritance to the mix gets us a nice example of what stereotypical
  “Enterprise” software is about.
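
  For contrast, here is a minimal sketch of the same interaction written
  as one plain function over plain data. It is only an illustration, not
  code from any real project: the field names follow the example above
  (attackPower becomes attack_power in Python style), while resolve_hit,
  the damage_bonus field and the rule that dead monsters give no more xps
  are my assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Player:
    attack_power: int
    xps: int = 0


@dataclass
class Monster:
    hp: int
    level: int


@dataclass
class Weapon:
    damage_bonus: int  # hypothetical field, not from the post


def resolve_hit(player: Player, monster: Monster,
                weapon: Optional[Weapon] = None) -> bool:
    """Apply one hit and return True if this hit killed the monster."""
    if monster.hp <= 0:
        return False  # already dead: nothing to change (assumed rule)
    damage = player.attack_power + (weapon.damage_bonus if weapon else 0)
    monster.hp = max(0, monster.hp - damage)
    killed = monster.hp == 0
    if killed:
        player.xps += monster.level
    return killed
```

  The whole cross-cutting rule lives in one place, and none of the three
  data types has to know about the others.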

Object encapsulation is schizophrenic

  Let's look at the definition of [9]Encapsulation:

    Encapsulation is an object-oriented programming concept that binds
    together the data and functions that manipulate the data, and that
    keeps both safe from outside interference and misuse. Data
    encapsulation led to the important OOP concept of data hiding.

  The sentiment is good, but in practice, encapsulation at the granularity
  of an object or a class often leads to code trying to separate
  everything from everything else (from itself). It generates tons of
  boilerplate: getters, setters, multiple constructors, odd methods, all
  trying to protect from mistakes that are unlikely to happen, on a scale
  too small to matter. The metaphor that I give is putting a padlock on
  your left pocket, to make sure your right hand can't take anything from
  it.

  Don't get me wrong – enforcing constraints, especially on [10]ADTs is
  usually a great idea. But in OOP with all the inter-referencing of
  objects, encapsulation often doesn't achieve anything useful, and it's
  hard to address the constraints spanning across many classes.

  In my opinion classes and objects are just too granular, and the right
  places to focus on isolation, APIs etc. are the boundaries of
  “modules”/“components”/“libraries”. And in my experience,
  OOP (Java/Scala) codebases are usually the ones in which no
  modules/libraries are employed. Developers focus on putting boundaries
  around each class, without much thought about which groups of classes
  together form a standalone, reusable, consistent logical unit.

There are multiple ways to look at the same data

  OOP requires an inflexible data organization: splitting data into many
  logical objects, which defines a data architecture: a graph of objects
  with associated behavior (methods). However, it's often useful to have
  multiple ways of logically expressing data manipulations.

  If program data is stored e.g. in a tabular, data-oriented form, it's
  possible to have two or more modules each operating on the same data
  structure, but in a different way. If the data is split into objects
  with methods, it's no longer possible.
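
  A small sketch of what I mean, assuming a made-up table of orders. The
  OrderRow fields and both function names are hypothetical, picked only
  for this illustration: two independent pieces of logic read the same
  plain rows, each in its own way.

```python
from collections import defaultdict
from typing import NamedTuple


class OrderRow(NamedTuple):
    order_id: int
    customer_id: int
    amount_cents: int
    shipped: bool


# One plain table; the rows carry no behavior of their own.
ORDERS = [
    OrderRow(1, 10, 2500, True),
    OrderRow(2, 11, 1200, False),
    OrderRow(3, 10, 800, False),
]


def billing_totals(orders):
    """Billing's view: total amount per customer id."""
    totals = defaultdict(int)
    for row in orders:
        totals[row.customer_id] += row.amount_cents
    return dict(totals)


def pending_shipments(orders):
    """Shipping's view: ids of orders that still have to go out."""
    return [row.order_id for row in orders if not row.shipped]
```

  Neither view had to be anticipated when the table was designed, and
  adding a third one does not touch the other two.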

  That's also the main reason for [11]Object-relational impedance
  mismatch. While relational data architecture might not always be the
  best one, it is typically flexible enough to be able to operate on the
  data in many different ways, using different paradigms. However, the
  rigidness of OOP data organization causes incompatibility with any
  other data architecture.

Bad performance

  The combination of data scattered across many small objects, heavy use
  of indirection and pointers, and the lack of the right data architecture
  in the first place leads to poor runtime performance. Nuff said.

What to do instead?

  I don't think there's a silver bullet, so I'm going to just describe
  how it tends to work in my code nowadays.

  First, data considerations come first. I analyze what the inputs and
  outputs are going to be, their format and volume; how the data should
  be stored at runtime and how it should be persisted; what operations
  will have to be supported and how fast (throughput, latencies), etc.

  Typically the design is something close to a database for the data that
  has any significant volume. That is: there will be some object like a
  DataStore with an API exposing all the necessary operations for
  querying and storing the data. The data itself will be in the form of
  ADT/PoD structures, and any references between the data records will
  take the form of an ID (number, uuid, or a deterministic hash). Under
  the hood, it typically closely resembles, or actually is backed by, a
  relational database: Vectors or HashMaps storing the bulk of the data
  by Index or ID, some other ones for “indices” that are required for
  fast lookup, and so on. Other data structures like LRU caches etc. are
  also placed there.
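
  As a rough sketch of such a store (in Python, with dicts standing in
  for the HashMaps): the Customer and Order records, the method names and
  the sequential integer IDs are all assumptions made for this example,
  not a fixed recipe.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    id: int
    name: str


@dataclass(frozen=True)
class Order:
    id: int
    customer_id: int  # a reference by ID, not an object pointer
    amount_cents: int


class DataStore:
    def __init__(self):
        self._customers = {}                          # customer id -> Customer
        self._orders = {}                             # order id -> Order
        self._orders_by_customer = defaultdict(list)  # secondary "index"
        self._next_id = 1

    def _new_id(self):
        new_id = self._next_id
        self._next_id += 1
        return new_id

    def add_customer(self, name):
        customer = Customer(self._new_id(), name)
        self._customers[customer.id] = customer
        return customer.id

    def add_order(self, customer_id, amount_cents):
        if customer_id not in self._customers:
            raise KeyError(f"unknown customer id {customer_id}")
        order = Order(self._new_id(), customer_id, amount_cents)
        self._orders[order.id] = order
        self._orders_by_customer[customer_id].append(order.id)
        return order.id

    def customer(self, customer_id):
        return self._customers[customer_id]

    def orders_of(self, customer_id):
        order_ids = self._orders_by_customer.get(customer_id, [])
        return [self._orders[order_id] for order_id in order_ids]
```

  Constraints that span several record types (here: an order must point
  at an existing customer) are enforced in one place, at the store's API.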

  The bulk of actual program logic takes a reference to such DataStores,
  and performs the necessary operations on them. For concurrency and
  multi-threading, I typically glue different logical components via
  message passing, actor-style. Examples of actors: stdin reader, input
  data processor, trust manager, game state, etc. Such “actors” can be
  implemented as thread-pools, elements of pipelines etc. When required,
  they can have their own DataStore or share one with other “actors”.
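
  A minimal sketch of this kind of gluing, using Python's standard queue
  and threading modules. The two actors (a line parser and a running sum)
  and the STOP sentinel are invented for the example; a real system would
  have its own message types and shutdown protocol.

```python
import queue
import threading

STOP = object()  # sentinel message that tells an actor to shut down


def line_parser(inbox, outbox):
    """Actor: turns raw text lines into integers, skipping junk."""
    while True:
        msg = inbox.get()
        if msg is STOP:
            outbox.put(STOP)  # pass the shutdown request downstream
            return
        try:
            outbox.put(int(msg.strip()))
        except ValueError:
            pass  # ignore lines that are not numbers


def summer(inbox, results):
    """Actor: owns the running total; nobody else touches it."""
    total = 0
    while True:
        msg = inbox.get()
        if msg is STOP:
            results.put(total)
            return
        total += msg


def run_pipeline(lines):
    raw, parsed, results = queue.Queue(), queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=line_parser, args=(raw, parsed)),
        threading.Thread(target=summer, args=(parsed, results)),
    ]
    for thread in threads:
        thread.start()
    for line in lines:
        raw.put(line)
    raw.put(STOP)
    for thread in threads:
        thread.join()
    return results.get()
```

  Each actor only sees its inbox and its own state, so there is no shared
  mutable object graph to protect with locks.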

  Such architecture gives me nice testing points: DataStores can have
  multiple implementations via polymorphism, and actors communicating via
  messages can be instantiated separately and driven through a test
  sequence of messages.
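
  To make that concrete, here is a sketch under assumed names: ScoreStore
  is a hypothetical store interface, InMemoryScoreStore is the
  implementation a test would plug in, and the tuple-shaped messages are
  just one possible encoding. The actor is driven inline through a fixed
  sequence of messages, with no threads involved.

```python
import queue
from typing import Protocol


class ScoreStore(Protocol):
    """Interface the actor depends on (hypothetical, for illustration)."""

    def add_points(self, player_id: int, points: int) -> None: ...

    def points(self, player_id: int) -> int: ...


class InMemoryScoreStore:
    """Test implementation; a production one might wrap a real database."""

    def __init__(self):
        self._points = {}

    def add_points(self, player_id, points):
        self._points[player_id] = self._points.get(player_id, 0) + points

    def points(self, player_id):
        return self._points.get(player_id, 0)


def score_actor(inbox: queue.Queue, store: ScoreStore) -> None:
    """Handle ("score", player_id, points) messages until ("stop",)."""
    while True:
        msg = inbox.get()
        if msg[0] == "stop":
            return
        if msg[0] == "score":
            _, player_id, points = msg
            store.add_points(player_id, points)


def test_score_actor():
    inbox = queue.Queue()
    store = InMemoryScoreStore()
    for msg in [("score", 1, 10), ("score", 2, 5), ("score", 1, 7), ("stop",)]:
        inbox.put(msg)
    score_actor(inbox, store)  # run inline; the test needs no threads
    assert store.points(1) == 17
    assert store.points(2) == 5
```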

  The main point is: just because my software operates in a domain with
  concepts of e.g. Customers and Orders, doesn't mean there is any
  Customer class, with methods associated with it. Quite the opposite:
  the Customer concept is just a bunch of data in a tabular form in one
  or more DataStores, and “business logic” code manipulates the data
  directly.

Follow-up read

  As with many things in software engineering, critique of OOP is not a
  simple matter. I might have failed at clearly articulating my views
  and/or convincing you. If you're still interested, here are some links
  for you:
    * Two videos by Brian Will where he makes plenty of great points
      against OOP: [12]Object-Oriented Programming is Bad and
      [13]Object-Oriented Programming is Garbage: 3800 SLOC example
    * [14]CppCon 2018: Stoyan Nikolov “OOP Is Dead, Long Live
      Data-oriented Design” where the author beautifully goes through an
      example OOP codebase and points out problems with it.
    * [15]Arguments Against Oop on wiki.c2.com for a list of common
      arguments against OOP.
    * [16]Object Oriented Programming is an expensive disaster which must
      end by Lawrence Krubner – this one is long and goes in depth into
      many ideas

Feedback

  I've been receiving comments and more links, so I'm putting them here:
    * [17]Quora: Is C++ OOP slower than C? If yes, is the difference
      significant?

  [18]#programming [19]#oop [20]#opinion

References

  1. https://dpc.pw/
  2. https://dpc.pw/about
  3. https://dpc.pw/social
  4. https://dpc.pw/open-source
  5. https://dpc.pw/archive
  6. https://en.wikipedia.org/wiki/Object-oriented_programming
  7. https://github.com/EnterpriseQualityCoding/FizzBuzzEnterpriseEdition
  8. https://www.johndcook.com/blog/2011/07/19/you-wanted-banana/
  9. https://en.wikipedia.org/wiki/Object-oriented_programming#Encapsulation
 10. https://en.wikipedia.org/wiki/Abstract_data_type
 11. https://en.wikipedia.org/wiki/Object-relational_impedance_mismatch
 12. https://www.youtube.com/watch?v=QM1iUe6IofM
 13. https://www.youtube.com/watch?v=V6VP-2aIcSc
 14. https://www.youtube.com/watch?v=yy8jQgmhbAU
 15. http://wiki.c2.com/?ArgumentsAgainstOop=
 16. http://www.smashcompany.com/technology/object-oriented-programming-is-an-expensive-disaster-which-must-end
 17. https://www.quora.com/Is-C++-slower-than-C-If-yes-is-the-difference-significant/answer/Simon-Hardy-Francis
 18. https://dpc.pw/tag:programming
 19. https://dpc.pw/tag:oop
 20. https://dpc.pw/tag:opinion
 21. https://write.as/