PEP: 416
Title: Add a frozendict builtin type
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <[email protected]>
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Created: 29-Feb-2012
Python-Version: 3.3


Rejection Notice
================

I'm rejecting this PEP.  A number of reasons (not exhaustive):

* According to Raymond Hettinger, use of frozendict is low.  Those
 that do use it tend to use it as a hint only, such as declaring
 global or class-level "constants": they aren't really immutable,
 since anyone can still assign to the name.
* There are existing idioms for avoiding mutable default values.
* The potential of optimizing code using frozendict in PyPy is
 unsure; a lot of other things would have to change first.  The same
 holds for compile-time lookups in general.
* Multiple threads can agree by convention not to mutate a shared
 dict; there's no great need for enforcement.  Multiple processes
 can't share dicts.
* Adding a security sandbox written in Python, even with a limited
 scope, is frowned upon by many, due to the inherent difficulty with
 ever proving that the sandbox is actually secure.  Because of this
 we won't be adding one to the stdlib any time soon, so this use
 case falls outside the scope of a PEP.

On the other hand, exposing the existing read-only dict proxy as a
built-in type sounds good to me.  (It would need to be changed to
allow calling the constructor.)  GvR.

**Update** (2012-04-15): A new ``MappingProxyType`` type was added to the types
module of Python 3.3.


Abstract
========

Add a new frozendict builtin type.


Rationale
=========

A frozendict is a read-only mapping: a key cannot be added or removed, and a
key is always mapped to the same value. However, frozendict values are not
required to be hashable. A frozendict is hashable if and only if all values
are hashable.

Use cases:

* Immutable global variable, such as a default configuration.
* Default value of a function parameter, avoiding the issue of mutable default
 arguments.
* Implement a cache: frozendict can be used to store function keywords.
 frozendict can be used as a key of a mapping or as a member of a set.
* frozendict avoids the need for a lock when the frozendict is shared
 by multiple threads or processes, especially a hashable frozendict. It would
 also help to prevent coroutines (generators + greenlets) from modifying the
 global state.
* frozendict lookup can be done at compile time instead of runtime because the
 mapping is read-only. frozendict can be used instead of a preprocessor to
 remove conditional code at compilation, like code specific to a debug build.
* frozendict helps to implement read-only object proxies for security modules.
 For example, it would be possible to use the frozendict type for the
 __builtins__ mapping or type.__dict__. This is possible because frozendict is
 compatible with the PyDict C API.
* frozendict avoids the need for a read-only proxy in some cases. frozendict is
 faster than a proxy because getting an item in a frozendict is a fast lookup
 whereas a proxy requires a function call.
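
The mutable default argument use case can be sketched as follows. Because
frozendict does not exist, this sketch assumes ``types.MappingProxyType``
(Python 3.3 and later) as a stand-in read-only mapping; ``add_option``,
``DEFAULTS`` and ``set_debug`` are hypothetical names::

   import types

   def add_option(key, value, options={}):
       # Pitfall: the default dict is created once and shared by all calls.
       options[key] = value
       return options

   first = add_option("debug", True)
   second = add_option("verbose", True)
   # first and second are the same dict and hold both keys.

   # Stand-in for a frozendict default (assumption: MappingProxyType).
   DEFAULTS = types.MappingProxyType({"debug": False})

   def set_debug(options=DEFAULTS):
       # Raises TypeError: the read-only default cannot be modified.
       options["debug"] = True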


Constraints
===========

* frozendict has to implement the Mapping abstract base class
* frozendict keys and values can be unorderable
* a frozendict is hashable if all keys and values are hashable
* the hash of a frozendict does not depend on the order in which its items
 were created


Implementation
==============

* Add a PyFrozenDictObject structure based on PyDictObject with an extra
 "Py_hash_t hash;" field
* frozendict.__hash__() is implemented using hash(frozenset(self.items())) and
 caches the result in its private hash attribute
* Register frozendict as a collections.abc.Mapping
* frozendict can be used with PyDict_GetItem(), but PyDict_SetItem() and
 PyDict_DelItem() raise a TypeError
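
The hashing scheme described above can be sketched in pure Python. This is
only an illustration under stated assumptions, not the proposed C
implementation: ``FrozenDictSketch`` is a hypothetical name, and its internal
dict remains accessible, so it is not truly read-only::

   from collections.abc import Mapping

   class FrozenDictSketch(Mapping):
       """Hypothetical pure-Python stand-in for the proposed type."""

       __slots__ = ("_data", "_hash")

       def __init__(self, *args, **kw):
           self._data = dict(*args, **kw)
           self._hash = None

       def __getitem__(self, key):
           return self._data[key]

       def __iter__(self):
           return iter(self._data)

       def __len__(self):
           return len(self._data)

       def __hash__(self):
           # Raises TypeError if a value is not hashable.
           if self._hash is None:
               # frozenset ignores the order in which items were created.
               self._hash = hash(frozenset(self._data.items()))
           return self._hash

   a = FrozenDictSketch([("x", 1), ("y", 2)])
   b = FrozenDictSketch([("y", 2), ("x", 1)])
   # a == b and hash(a) == hash(b), despite the different creation order.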


Recipe: hashable dict
======================

To ensure that a frozendict is hashable, values can be checked
before creating the frozendict::

   def hashabledict(*args, **kw):
       # ensure that all values are hashable
       for value in dict(*args, **kw).values():
           if isinstance(value, (int, str, bytes, float, frozenset, complex)):
               # avoid computing the hash (which may be slow) for builtin
               # types known to be hashable for any value
               continue
           hash(value)
           # don't check the key: frozendict already checks the key
       return frozendict(*args, **kw)


Objections
==========

*namedtuple may fit the requirements of a frozendict.*

A namedtuple is not a mapping, it does not implement the Mapping abstract base
class.
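
A minimal check of this point, where ``Point`` is a hypothetical example
type::

   from collections import namedtuple
   from collections.abc import Mapping

   Point = namedtuple("Point", ["x", "y"])
   p = Point(x=1, y=2)

   is_mapping = isinstance(p, Mapping)   # False
   by_index = p[0]                       # 1: indexing is positional
   # p["x"] raises TypeError: lookup by key is not supported.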

*"frozendict can be implemented in Python using descriptors" and "frozendict
just needs to be practically constant."*

If frozendict is used to harden Python (for security purposes), it must be
implemented in C. A type implemented in C is also faster.

:pep:`351` *was rejected.*

:pep:`351` tries to freeze an object and so may convert a mutable object to an
immutable object (using a different type). frozendict doesn't convert anything:
hash(frozendict) raises a TypeError if a value is not hashable. Freezing an
object is not the purpose of this PEP.


Alternative: dictproxy
======================

Python has a builtin dictproxy type used by the type.__dict__ getter
descriptor. This type is not public. dictproxy is a read-only view of a
dictionary, but it is not a read-only mapping.  If the dictionary is modified,
the dictproxy reflects the change.

dictproxy can be created using ctypes and the Python C API; see for example the
`make dictproxy object via ctypes.pythonapi and type() (Python recipe 576540)`_
by Ikkei Shimomura. The recipe contains a test checking that a dictproxy is
"mutable" (by modifying the dictionary linked to the dictproxy).

However, dictproxy can be useful in some cases where its mutability is not an
issue, because it avoids copying the dictionary.
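
The view behaviour can be sketched with ``types.MappingProxyType``, the
public form of this proxy in Python 3.3 and later (see the update in the
rejection notice). The variable names are only illustrative::

   import types

   data = {"a": 1}
   proxy = types.MappingProxyType(data)

   # proxy["b"] = 2 would raise TypeError: the proxy itself is read-only.
   data["b"] = 2           # but the underlying dict can still change
   visible = proxy["b"]    # 2: the proxy reflects the modification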


Existing implementations
========================

Whitelist approach.

* `Implementing an Immutable Dictionary (Python recipe 498072)
 <http://code.activestate.com/recipes/498072/>`_ by Aristotelis Mikropoulos.
 Similar to frozendict except that it is not truly read-only: it is possible
 to access its private internal dict.  It does not implement __hash__ and
 has an implementation issue: it is possible to call __init__() again to
 modify the mapping.
* PyWebmail contains an ImmutableDict type: `webmail.utils.ImmutableDict
 <http://pywebmail.cvs.sourceforge.net/viewvc/pywebmail/webmail/webmail/utils/ImmutableDict.py?revision=1.2&view=markup>`_.
 It is hashable if keys and values are hashable. It is not truly read-only:
 its internal dict is a public attribute.
* remember project: `remember.dicts.FrozenDict
 <https://bitbucket.org/mikegraham/remember/src/tip/remember/dicts.py>`_.
 It is used to implement a cache: FrozenDict is used to store function callbacks.
 FrozenDict may be hashable. It has an extra supply_dict() class method to
 create a FrozenDict from a dict without copying the dict: store the dict as
 the internal dict. Implementation issue: __init__() can be called to modify
 the mapping and the hash may differ depending on item creation order. The
 mapping is not truly read-only: the internal dict is accessible in Python.


Blacklist approach: inherit from dict and override write methods to raise an
exception. It is not truly read-only: it is still possible to call dict methods
on such "frozen dictionary" to modify it.

* brownie: `brownie.datastructures.ImmutableDict
 <https://github.com/DasIch/brownie/blob/HEAD/brownie/datastructures/mappings.py>`_.
 It is hashable if keys and values are hashable. werkzeug project has the
 same code: `werkzeug.datastructures.ImmutableDict
 <https://github.com/mitsuhiko/werkzeug/blob/master/werkzeug/datastructures.py>`_.
 ImmutableDict is used for global constants (configuration options). The Flask
 project uses the ImmutableDict of werkzeug for its default configuration.
* SQLAlchemy project: `sqlalchemy.util.immutabledict
 <http://hg.sqlalchemy.org/sqlalchemy/file/tip/lib/sqlalchemy/util/_collections.py>`_.
 It is not hashable and has an extra method: union(). immutabledict is used
 as the default value of a parameter of some functions expecting a mapping.
 Example: mapper_args=immutabledict() in SqlSoup.map().
* `Frozen dictionaries (Python recipe 414283) <http://code.activestate.com/recipes/414283/>`_
 by Oren Tirosh. It is hashable if keys and values are hashable. Included in
 the following projects:

 * lingospot: `frozendict/frozendict.py
   <http://code.google.com/p/lingospot/source/browse/trunk/frozendict/frozendict.py>`_
 * factor-graphics: frozendict type in `python/fglib/util_ext_frozendict.py
   <https://github.com/ih/factor-graphics/blob/41006fb71a09377445cc140489da5ce8eeb9c8b1/python/fglib/util_ext_frozendict.py>`_

* The gsakkis-utils project written by George Sakkis includes a frozendict
 type: `datastructs.frozendict
 <http://code.google.com/p/gsakkis-utils/source/browse/trunk/datastructs/frozendict.py>`_
* characters: `scripts/python/frozendict.py
 <https://github.com/JasonGross/characters/blob/15a2af5f7861cd33a0dbce70f1569cda74e9a1e3/scripts/python/frozendict.py#L1>`_.
 It is hashable. __init__() sets __init__ to None.
* Old NLTK (1.x): `nltk.util.frozendict
 <http://nltk.googlecode.com/svn/trunk/nltk-old/src/nltk/util.py>`_. Keys and
 values must be hashable. __init__() can be called twice to modify the
 mapping. frozendict is used to "freeze" an object.
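
The weakness of the blacklist approach can be shown with a minimal sketch.
``BlacklistDict`` is a hypothetical class written for this illustration, not
taken from any of the projects listed above::

   class BlacklistDict(dict):
       """Hypothetical dict subclass whose write methods raise."""

       def _readonly(self, *args, **kw):
           raise TypeError("%s is read-only" % type(self).__name__)

       __setitem__ = __delitem__ = _readonly
       clear = pop = popitem = setdefault = update = _readonly

   d = BlacklistDict(a=1)
   # d["b"] = 2 raises TypeError, but the base class method still works:
   dict.__setitem__(d, "b", 2)
   # d now contains both "a" and "b".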

Hashable dict: inherit from dict and just add a __hash__ method.

* `pypy.rpython.lltypesystem.lltype.frozendict
 <https://bitbucket.org/pypy/pypy/src/1f49987cc2fe/pypy/rpython/lltypesystem/lltype.py#cl-86>`_.
 It is hashable but does not prevent modification of the mapping.
* factor-graphics: hashabledict type in `python/fglib/util_ext_frozendict.py
 <https://github.com/ih/factor-graphics/blob/41006fb71a09377445cc140489da5ce8eeb9c8b1/python/fglib/util_ext_frozendict.py>`_
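
The risk of a hashable dict that still allows modification can be illustrated
as follows. ``HashableDict`` is a hypothetical class written for this sketch;
its hash formula is assumed to match the one given in the Implementation
section::

   class HashableDict(dict):
       """Hypothetical dict subclass that only adds __hash__."""

       def __hash__(self):
           return hash(frozenset(self.items()))

   key = HashableDict(a=1)
   cache = {key: "value"}

   # Nothing prevents the mutation. The stored key now compares equal to
   # something else, while the cache keeps the hash from insertion time.
   key["b"] = 2

   # A fresh key equal to the original no longer finds the entry.
   found = HashableDict(a=1) in cache    # False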


Links
=====

* `Issue #14162: PEP 416: Add a builtin frozendict type
 <http://bugs.python.org/issue14162>`_
* PEP 412: Key-Sharing Dictionary
 (`issue #13903 <http://bugs.python.org/issue13903>`_)
* :pep:`351`: The freeze protocol
* `The case for immutable dictionaries; and the central misunderstanding of
 PEP 351 <http://www.cs.toronto.edu/~tijmen/programming/immutableDictionaries.html>`_
* `make dictproxy object via ctypes.pythonapi and type() (Python recipe
 576540) <http://code.activestate.com/recipes/576540/>`_ by Ikkei Shimomura.
* Python security modules implementing read-only object proxies using a C
 extension:

 * `pysandbox <https://github.com/vstinner/pysandbox/>`_
 * `mxProxy <http://www.egenix.com/products/python/mxBase/mxProxy/>`_
 * `zope.proxy <http://pypi.python.org/pypi/zope.proxy>`_
 * `zope.security <http://pypi.python.org/pypi/zope.security>`_


Copyright
=========

This document has been placed in the public domain.