                   ___________________________

                    WHEN SHALL I USE MONGODB?

                          Nicolas Herry
                   ___________________________


                           2017/12/02





1 When shall I use MongoDB?
===========================

 When to pick MongoDB over a relational database can be a tricky
 question. Luckily, I'm here to help with a quick FAQ. Guaranteed
 100% bias free.


1.1 Does MongoDB guarantee data persistence?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 MongoDB is not fully ACID-compliant: at the time of writing, it
 guarantees atomicity for a single document only, which means that
 data can be lost. You should only pick MongoDB when you intend to
 store data you don't mind losing.


1.2 Does MongoDB scale well?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 With MongoDB, data is stored denormalized, that is, duplicated
 across documents. This means that indexes, when they are used,
 cannot be as effective as those of a relational database. Also,
 since data is duplicated, databases tend to occupy more disk
 space than with a relational database. However, MongoDB scales
 through sharding,
 which means that data is spread across multiple servers to both
 balance the load and provide some redundancy. So if scaling is
 limited on one server, one can always add more servers to cope
 with the load. As for writing, MongoDB offers two approaches: with
 the MMAPv1 engine, it locks collections when a document is being
 saved, disallowing concurrent writing to different
 documents. This in part comes from the fact that MongoDB lacks
 transaction isolation (ACIDity). The second option involves the
 new default engine, WiredTiger, which relies on the optimistic
 concurrency model, assuming very low data contention. When a
 conflict occurs, transactions are rolled back and need to be run
 again. This is fine if transactions deal with different pieces of
 data; and very costly when multiple writes happen at the same
 time on the same document.
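
 To make the retry cost concrete, here is a minimal sketch of the
 optimistic model in plain Python. It is not how WiredTiger is
 implemented: `VersionedStore`, `ConflictError` and `increment` are
 hypothetical names made up for this illustration, and the
 "concurrent writer" is simulated by a callback.

```python
# Toy illustration of optimistic concurrency. This is NOT WiredTiger's
# implementation, only the general "check version, retry on conflict" idea.


class ConflictError(Exception):
    """Raised when a document changed between the read and the write."""


class VersionedStore:
    """Hypothetical in-memory store keeping a version number per document."""

    def __init__(self):
        self._docs = {}  # key -> (version, value)

    def read(self, key):
        return self._docs.get(key, (0, None))

    def write(self, key, expected_version, value):
        current_version, _ = self._docs.get(key, (0, None))
        if current_version != expected_version:
            raise ConflictError(key)  # somebody else wrote first
        self._docs[key] = (current_version + 1, value)


def increment(store, key, interfere=None):
    """Read-modify-write with retries; returns the number of attempts."""
    attempts = 0
    while True:
        attempts += 1
        version, value = store.read(key)
        if interfere is not None:
            interfere()       # simulate a concurrent writer, once
            interfere = None
        try:
            store.write(key, version, (value or 0) + 1)
            return attempts
        except ConflictError:
            continue          # our read is stale: start over


store = VersionedStore()
uncontended = increment(store, "counter")
contended = increment(
    store, "counter", interfere=lambda: increment(store, "counter")
)
print(uncontended, contended, store.read("counter"))
```

 The uncontended write goes through on the first attempt; the
 contended one has to redo its whole read-modify-write cycle. With
 many writers hammering the same document, those repeated cycles
 are where the cost comes from.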


1.3 Is MongoDB more flexible than old cranky relational databases?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 MongoDB gives developers the possibility to store data as
 documents, that is, without any defined schema. In a collection,
 documents can actually look very different. This means the shape
 of the data doesn't have to be anticipated from the beginning,
 and fields can be added later. However, as a result, there is no
 way to alter a collection as a whole in one go: each record has
 to be fetched, edited, and saved back in the database. On large
 datasets, this can take hours. One should also ask whether their
 scenario really calls for going to production without a clear idea
 of what the data looks like.
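
 The fetch, edit, save loop described above can be sketched as
 follows. A plain dict stands in for the collection, and
 `add_missing_field` is a hypothetical helper, not a MongoDB API;
 the point is only that the work grows with the number of
 documents.

```python
# Stand-in for a collection: a plain dict of _id -> document.
# Documents in the same collection may have different shapes.
collection = {
    1: {"name": "Ada"},
    2: {"name": "Linus", "email": "linus@example.org"},
    3: {"name": "Grace", "tags": ["navy"]},
}


def add_missing_field(collection, field, default):
    """Fetch, edit and save back every document that lacks `field`."""
    touched = 0
    for doc_id in list(collection):
        doc = dict(collection[doc_id])   # fetch
        if field not in doc:
            doc[field] = default         # edit
            collection[doc_id] = doc     # save back
            touched += 1
    return touched


touched = add_missing_field(collection, "email", None)
print(touched, collection)
```

 Every document is visited, including those that already have the
 field, because nothing tells us in advance which ones need the
 change.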


1.4 Does MongoDB makes the life easier for application developers?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 Since MongoDB allows application developers to start working
 without thinking about the schema or the data itself, things are
 indeed easy. However, since the approach revolves only around
 data storage and not data itself (shape, type, constraints,
 etc.), developers have to write the validation part themselves,
 on the application side. This means more code to write, maintain,
 and debug.
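
 As an illustration of what ends up on the application side, here
 is a minimal sketch of hand-written validation. `USER_SCHEMA` and
 `validate` are hypothetical, invented for this example; a
 relational database would express the same rules as column types
 and constraints.

```python
# Hypothetical schema: field name -> (expected type, required?)
USER_SCHEMA = {
    "name": (str, True),
    "age": (int, True),
    "email": (str, False),
}


def validate(doc, schema):
    """Return a list of problems found in `doc`; empty means valid."""
    errors = []
    for field, (expected_type, required) in schema.items():
        if field not in doc:
            if required:
                errors.append(f"missing field: {field}")
            continue
        if not isinstance(doc[field], expected_type):
            errors.append(
                f"wrong type for {field}: expected {expected_type.__name__}"
            )
    for field in doc:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return errors


good = validate({"name": "Ada", "age": 36}, USER_SCHEMA)
bad = validate({"name": "Ada", "age": "36", "nickname": "A"}, USER_SCHEMA)
print(good)
print(bad)
```

 And this only covers shape and type. Uniqueness, references
 between documents and the like would each need more code of the
 same kind.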


1.5 Is MongoDB the quickest engine with flexible, modern JSON data?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 MongoDB makes JSON and binary JSON data its sole data format and
 tries to be effective at storing and reading it back. However,
 relational databases can also manage JSON and binary JSON data,
 and can be pretty efficient with it. Actually, PostgreSQL
 outperformed MongoDB in a series of benchmarks. Benchmarks being
 benchmarks, one can consider that at the very least, it's not
 clear which of MongoDB and PostgreSQL is the more
 efficient. However, it is clear that with PostgreSQL, one also
 gets to benefit from all the rest it has to offer: ACIDity, data
 validation, indexes, etc.
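
 To show what JSON inside a relational database can look like,
 here is a small sketch. SQLite is swapped in for PostgreSQL so
 that it runs without a server, and it assumes a SQLite build that
 ships the JSON functions (`json_valid`, `json_extract`). The
 `events` table is made up for the example. In PostgreSQL, the
 rough equivalent would be a `jsonb` column queried with the `->>`
 operator.

```python
import json
import sqlite3

# SQLite stands in for PostgreSQL here; assumes the JSON functions
# (json_valid, json_extract) are available in this SQLite build.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    " id INTEGER PRIMARY KEY,"
    " body TEXT NOT NULL CHECK (json_valid(body))"  # validation in the DB
    ")"
)
conn.executemany(
    "INSERT INTO events (body) VALUES (?)",
    [
        (json.dumps({"kind": "login", "user": "ada"}),),
        (json.dumps({"kind": "logout", "user": "ada"}),),
        (json.dumps({"kind": "login", "user": "grace"}),),
    ],
)
conn.commit()

# Query inside the JSON documents with plain SQL.
logins = [
    row[0]
    for row in conn.execute(
        "SELECT json_extract(body, '$.user') FROM events"
        " WHERE json_extract(body, '$.kind') = 'login' ORDER BY id"
    )
]

# Malformed JSON is refused by the CHECK constraint.
try:
    conn.execute("INSERT INTO events (body) VALUES (?)", ("not json",))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(logins, rejected)
```

 The documents stay schemaless, yet the database still refuses
 garbage at the door, and the table can sit next to ordinary
 relational tables in the same transaction.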


1.6 To sum up
~~~~~~~~~~~~~

 One should use MongoDB when they want to store a moderate
 quantity of data they don't mind losing, don't know much about
 and don't mind locking for basic maintenance tasks. It's stunning
 how much the cutting edge of 2017 resembles that of 1997,
 when startups were trying to circumvent the shortcomings of
 MyISAM by cobbling together piles of Perl code, ending up with
 half-working data processing pipelines.