NAME

   JSON::Relaxed -- An extension of JSON that allows for better
   human-readability.

SYNOPSIS

    use JSON::Relaxed;

    my ($rjson, $hash, $parser);

    # raw RJSON code
    $rjson = <<'(RAW)';
    /* Javascript-like comments are allowed */
    {
      // single or double quotes allowed
      a : 'Larry',
      b : "Curly",

      // nested structures allowed like in JSON
      c: [
         {a:1, b:2},
      ],

      // like Perl, trailing commas are allowed
      d: "more stuff",
    }
    (RAW)

    # subroutine parsing
    $hash = from_rjson($rjson);

    # object-oriented parsing
    $parser = JSON::Relaxed::Parser->new();
    $hash = $parser->parse($rjson);

INSTALLATION

   JSON::Relaxed can be installed with the usual routine:

    perl Makefile.PL
    make
    make test
    make install

DESCRIPTION

   JSON::Relaxed is a lightweight parser for an extension
   of JSON called Relaxed JSON (RJSON). The intent of RJSON is to provide
   a format that is more human-readable and human-editable than JSON. Most
   notably, RJSON allows the use of JavaScript-like comments. By doing so,
   configuration files and other human-edited files can include comments
   to indicate the intention of each configuration.

   JSON::Relaxed is currently only a parser that reads in RJSON code and
   produces a data structure. JSON::Relaxed does not currently encode data
   structures into JSON/RJSON. That feature is planned.

Why Relaxed JSON?

   There's been increasing support for the idea of expanding JSON to
   improve human-readability. "Relaxed" JSON is a term that has been used
   to describe a JSON-ish format that has some features that JSON doesn't.
   Although there isn't yet any kind of official specification,
   descriptions of Relaxed JSON generally include the following extensions
   to JSON:

     * comments

     RJSON supports JavaScript-like comments:

      /* inline comments */
      // line-based comments

     * trailing commas

     Like Perl, RJSON treats commas as separators. If nothing is
     before, after, or between commas, those commas are just ignored:

      [
         , // nothing before this comma
         "data",
         , // nothing after this comma
      ]

     * single quotes, double quotes, no quotes

     Strings can be quoted with either single or double quotes. Space-less
     strings are also parsed as strings. So, the following data items are
     equivalent:

      [
         "Starflower",
         'Starflower',
         Starflower
      ]

     Note that unquoted boolean values are still treated as boolean
     values, so the following are NOT the same:

      [
         "true",  // string
         true,    // boolean true

         "false", // string
         false,   // boolean false

         "null", // string
         null, // what Perl programmers call undef
      ]

     Because of this ambiguity, unquoted non-boolean strings should be
     considered sloppy and not something you do in polite company.

     * documents that are just a single string

     Early versions of JSON required that a JSON document contain either a
     single hash or a single array. Later versions also allow a single
     string. RJSON follows that later rule, so the following is a valid
     RJSON document:

      "Hello world"

     * hash keys without values

     A hash in RJSON can have a key that is followed by a comma or a
     closing } without a specified value. In that case the hash element is
     simply assigned the undefined value. So, in the following example, a
     is assigned 1, b is assigned 2, and c is assigned undef:

      {
         a: 1,
         b: 2,
         c
      }

from_rjson()

   from_rjson() is the simple way to quickly parse an RJSON string.
   Currently from_rjson() only takes a single parameter, the string
   itself. So in the following example, from_rjson() parses and returns
   the structure defined in $rjson.

    $structure = from_rjson($rjson);

Object-oriented parsing

   To parse using an object, create a JSON::Relaxed::Parser object, like
   this:

    $parser = JSON::Relaxed::Parser->new();

   Then call the parser's parse() method, passing in the RJSON
   string:

    $structure = $parser->parse($rjson);

   Methods

     * $parser->extra_tokens_ok()

     extra_tokens_ok() sets/gets the extra_tokens_ok property. By default,
     extra_tokens_ok is false. If extra_tokens_ok is true then the
     multiple-structures error isn't triggered and the parser returns the
     first structure it finds. So, for example, the following code would
     return undef and set the multiple-structures error:

      $parser = JSON::Relaxed::Parser->new();
      $structure = $parser->parse('{"x":1} []');

     However, by setting extra_tokens_ok to true, a hash structure is
     returned, the extra code after that first hash is ignored, and no
     error is set:

      $parser = JSON::Relaxed::Parser->new();
      $parser->extra_tokens_ok(1);
      $structure = $parser->parse('{"x":1} []');

Error codes

   When JSON::Relaxed encounters a parsing error it returns undef and sets
   two global variables:

     * $JSON::Relaxed::err_id

     $err_id is a unique code for a specific error. Every code is set in
     only one place in JSON::Relaxed.

     * $JSON::Relaxed::err_msg

     $err_msg is an English description of the code. It would be cool to
     migrate towards multi-language support for $err_msg.
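   As a minimal sketch of how the two variables might be used, the
   following wraps the documented parse() call and reports the error
   when undef comes back. parse_or_warn() is a hypothetical helper
   written for this example; it is not part of JSON::Relaxed.

```perl
use strict;
use warnings;
use JSON::Relaxed;

# parse_or_warn() is a hypothetical helper, not part of JSON::Relaxed.
# It returns the parsed structure, or nothing after printing a warning.
sub parse_or_warn {
    my ($rjson) = @_;
    my $parser = JSON::Relaxed::Parser->new();
    my $structure = $parser->parse($rjson);

    # parse() returns undef on a parsing error and sets the globals
    unless (defined $structure) {
        warn "RJSON error [$JSON::Relaxed::err_id]: "
           . "$JSON::Relaxed::err_msg\n";
        return;
    }

    return $structure;
}

my $config = parse_or_warn('{ name: "Larry", }');
```

   Because $err_id is a stable code, it is probably the better choice for
   program logic, with $err_msg kept for messages shown to people.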

   Following is a list of all error codes in JSON::Relaxed:

     * missing-parameter

     The string to be parsed was not sent to $parser->parse(). For
     example:

      $parser->parse()

     * undefined-input

     The string to be parsed is undefined. For example:

      $parser->parse(undef)

     * zero-length-input

     The string to be parsed is zero-length. For example:

      $parser->parse('')

     * space-only-input

     The string to be parsed has no content besides space characters. For
     example:

      $parser->parse('   ')

     * no-content

     The string to be parsed has no content. This error is slightly
     different than space-only-input in that it is triggered when the
     input contains only comments, like this:

      $parser->parse('/* whatever */')

     * unclosed-inline-comment

     A comment was started with /* but was never closed. For example:

      $parser->parse('/*')

     * invalid-structure-opening-character

     The document opens with an invalid structural character like a comma
     or colon. The following examples would trigger this error.

      $parser->parse(':')
      $parser->parse(',')
      $parser->parse('}')
      $parser->parse(']')

     * multiple-structures

     The document has multiple structures. JSON and RJSON only allow a
     document to consist of a single hash, a single array, or a single
     string. The following examples would trigger this error.

      $parser->parse('{}[]')
      $parser->parse('{} "whatever"')
      $parser->parse('"abc" "def"')

     * unknown-token-after-key

     A hash key may only be followed by a colon, a comma, or the closing
     hash brace. Anything else triggers unknown-token-after-key. So, the
     following examples would trigger this error.

      $parser->parse("{a [ }")
      $parser->parse("{a b")

     * unknown-token-for-hash-key

     The parser encountered something besides a string where a hash key
     should be. The following are examples of code that would trigger this
     error.

      $parser->parse('{{}}')
      $parser->parse('{[]}')
      $parser->parse('{]}')
      $parser->parse('{:}')

     * unclosed-hash-brace

     A hash has an opening brace but no closing brace. For example:

      $parser->parse('{x:1')

     * unclosed-array-brace

     An array has an opening brace but not a closing brace. For example:

      $parser->parse('["x", "y"')

     * unexpected-token-after-colon

     In a hash, a colon must be followed by a value. Anything else
     triggers this error. For example:

      $parser->parse('{"a":,}')
      $parser->parse('{"a":}')

     * missing-comma-between-array-elements

     In an array, a value must be followed by a comma or the closing
     array brace. Anything else triggers this error. For example:

      $parser->parse('[ "x" "y" ]')
      $parser->parse('[ "x" : ]')

     * unknown-array-token

     This error exists just in case there's an invalid token in an array
     that somehow wasn't caught by missing-comma-between-array-elements.
     This error shouldn't ever be triggered. If it is please let me know.

     * unclosed-quote

     This error is triggered when a quote isn't closed. For example:

      $parser->parse("'whatever")
      $parser->parse('"whatever')

INTERNALS

   The following documentation is for those who want to edit the code of
   JSON::Relaxed itself.

JSON::Relaxed

   JSON::Relaxed is the parent package. Not a lot actually happens in
   JSON::Relaxed; it mostly contains from_rjson() and definitions of
   various structures.

   Special character and string definitions

     The following hashes provide information about characters and strings
     that have special meaning in RJSON.

       * Escape characters

       The %esc hash defines the six escape characters in RJSON that are
       changed to single characters. %esc is defined as follows.

        our %esc = (
          'b'   => "\b",    #  Backspace
          'f'   => "\f",    #  Form feed
          'n'   => "\n",    #  New line
          'r'   => "\r",    #  Carriage return
          't'   => "\t",    #  Tab
          'v'   => chr(11), #  Vertical tab
        );

       * Structural characters

       The %structural hash defines the six characters in RJSON that
       define the structure of the data object. The structural characters
       are defined as follows.

        our %structural = (
          '[' => 1, # beginning of array
          ']' => 1, # end of array
          '{' => 1, # beginning of hash
          '}' => 1, # end of hash
          ':' => 1, # delimiter between name and value of hash element
          ',' => 1, # separator between elements in hashes and arrays
        );

       * Quotes

       The %quotes hash defines the two types of quotes recognized by
       RJSON: single and double quotes. JSON only allows the use of double
       quotes to define strings. Relaxed also allows single quotes.
       %quotes is defined as follows.

        our %quotes = (
          '"' => 1,
          "'" => 1,
        );

       * End of line characters

       The %newlines hash defines the three ways a line can end in an RJSON
       document. Lines in Windows text files end with carriage-return
       newline ("\r\n"). Lines in Unixish text files end with newline
       ("\n"). Lines in some operating systems end with just carriage
       returns ("\r"). %newlines is defined as follows.

        our %newlines = (
          "\r\n" => 1,
          "\r" => 1,
          "\n" => 1,
        );

       * Boolean

       The %boolean hash defines strings that are boolean values: true,
       false, and null. (OK, 'null' isn't just a boolean value, but I
       couldn't think of what else to call this hash.) %boolean is defined
       as follows.

        our %boolean = (
          'null' => 1,
          'true' => 1,
          'false' => 1,
        );

JSON::Relaxed::Parser

   A JSON::Relaxed::Parser object parses the raw RJSON string. You don't
   need to instantiate a parser if you just want to use the default
   settings. In that case just use from_rjson().

   You would create a JSON::Relaxed::Parser object if you want to
   customize how the string is parsed. I say "would" because there isn't
   actually any customization in these early releases. When there is
   you'll use a parser object.

   To parse in an object oriented manner, create the parser, then parse.

    $parser = JSON::Relaxed::Parser->new();
    $structure = $parser->parse($string);

   new

     JSON::Relaxed::Parser->new() creates a parser object. Its simplest
     and most common use is without any parameters.

      my $parser = JSON::Relaxed::Parser->new();

     option: unknown

       The unknown option sets the character which creates the unknown
       object. The unknown object exists only for testing JSON::Relaxed.
       It has no purpose in production use.

        my $parser = JSON::Relaxed::Parser->new(unknown=>'~');

   Parser "is" methods

     The following methods indicate if a token has some specific property,
     such as being a string object or a structural character.

       * is_string()

       Returns true if the token is a string object, i.e. in the class
       JSON::Relaxed::Parser::Token::String.

       * is_struct_char()

       Returns true if the token is one of the structural characters of
       JSON, i.e. one of the following:

        { } [ ] : ,

       * is_unknown_char()

       Returns true if the token is the unknown character.

       * is_list_opener()

       Returns true if the token is the opening character for a hash or an
       array, i.e. it is one of the following two characters:

        { [

       * is_comment_opener()

       Returns true if the token is the opening character for a comment,
       i.e. it is one of the following two couplets:

        /*
        //
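     The checks above amount to simple lookups. The following sketch shows
     one way three of them could be written as plain functions rather
     than parser methods. It assumes tokens are either plain scalars or
     object references, and the function bodies are illustrative guesses,
     not the module's actual code.

```perl
use strict;
use warnings;

# Lookup table mirroring the documented %structural hash.
my %structural = map { $_ => 1 } ('[', ']', '{', '}', ':', ',');

# True if the token is a plain scalar naming a structural character.
sub is_struct_char {
    my ($token) = @_;
    return 0 if !defined($token) || ref($token);
    return exists $structural{$token} ? 1 : 0;
}

# True if the token opens a hash or an array.
sub is_list_opener {
    my ($token) = @_;
    return 0 if !defined($token) || ref($token);
    return ($token eq '{' || $token eq '[') ? 1 : 0;
}

# True if the token is one of the two comment-opening couplets.
sub is_comment_opener {
    my ($token) = @_;
    return 0 if !defined($token) || ref($token);
    return ($token eq '/*' || $token eq '//') ? 1 : 0;
}
```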

   parse()

     parse() is the method that does the work of parsing the RJSON string.
     It returns the data structure that is defined in the RJSON string. A
     typical usage would be as follows.

      my $parser = JSON::Relaxed::Parser->new();
      my $structure = $parser->parse('["hello world"]');

     parse() does not take any options.

   parse_chars()

     parse_chars() parses the RJSON string into either individual
     characters or two-character couplets. This method returns an array.
     The only input is the raw RJSON string. So, for example, the
     following string:

      $raw = qq|/*x*/["y"]|;
      @chars = $parser->parse_chars($raw);

     would be parsed into the following array:

      ( "/*", "x", "*/", "[", "\"", "y", "\"", "]" )

     Most of the elements in the array are single characters. However,
     comment delimiters, escaped characters, and Windows-style newlines
     are parsed as two-character couplets:

       * \ followed by any character

       * \r\n

       * //

       * /*

       * */

     parse_chars() should not produce any fatal errors.
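     A splitting rule like this can be sketched with a single regular
     expression that tries the two-character couplets before falling back
     to a single character. split_chars() is a hypothetical stand-in
     written for this example; the module's own implementation may work
     differently.

```perl
use strict;
use warnings;

# split_chars() is a hypothetical stand-in for parse_chars().
# Couplets are listed first so that they win over the single-character
# fallback at the end of the alternation.
sub split_chars {
    my ($raw) = @_;
    my @chars = $raw =~ m{( \\. | \r\n | // | /\* | \*/ | . )}gsx;
    return @chars;
}

my @chars = split_chars(qq|/*x*/["y"]|);
# @chars now holds: /*  x  */  [  "  y  "  ]
```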

   tokenize()

     tokenize() organizes the characters from parse_chars() into tokens.
     Those tokens can then be organized into a data structure with
     structure().

     Each token represents an item that is recognized by JSON. Those items
     include structural characters such as { or }, or strings such as
     "hello world". Comments and insignificant whitespace are filtered out
     by tokenize().

     For example, this code:

      $parser = JSON::Relaxed::Parser->new();
      $raw = qq|/*x*/ ["y"]|;
      @chars = $parser->parse_chars($raw);
      @tokens = $parser->tokenize(\@chars);

     would produce an array like this:

      (
        '[',
        JSON::Relaxed::Parser::Token::String::Quoted=HASH(0x20bf0e8),
        ']'
      )

     Strings are tokenized into string objects. When the parsing is
     complete they are returned as scalar variables, not objects.

     tokenize() should not produce any fatal errors.
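     The filtering step can be sketched as a loop over the character
     array. The simplified, hypothetical simple_tokenize() below drops
     comments and whitespace and joins quoted strings. Unlike the real
     tokenize(), it returns quoted strings as plain scalars, does not
     process escapes, and does not group unquoted strings, so treat it
     as an outline of the idea only.

```perl
use strict;
use warnings;

# simple_tokenize() is a hypothetical, simplified stand-in for tokenize().
sub simple_tokenize {
    my @chars = @_;
    my @tokens;

    while (@chars) {
        my $char = shift @chars;

        if ($char eq '/*') {
            # inline comment: discard everything up to the closing */
            while (@chars) { last if shift(@chars) eq '*/' }
        }
        elsif ($char eq '//') {
            # line comment: discard everything up to the end of the line
            while (@chars) {
                last if shift(@chars) =~ /\A(?:\r\n|\r|\n)\z/;
            }
        }
        elsif ($char eq '"' || $char eq "'") {
            # quoted string: collect characters up to the matching quote
            my $str = '';
            while (@chars) {
                my $next = shift @chars;
                last if $next eq $char;
                $str .= $next;
            }
            push @tokens, $str;
        }
        elsif ($char =~ /\A\s+\z/) {
            next;    # insignificant whitespace
        }
        else {
            push @tokens, $char;
        }
    }

    return @tokens;
}

my @tokens = simple_tokenize('/*', 'x', '*/', ' ', '[', '"', 'y', '"', ']');
# @tokens now holds: [  y  ]
```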

   structure()

     $parser->structure() organizes the tokens from tokenize() into a data
     structure. $parser->structure() returns a single string, a single
     array reference, a single hash reference, or (if there are errors)
     undef.

JSON::Relaxed::Parser::Structure::Hash

   This package parses Relaxed into hash structures. It is a static
   package, i.e. it is not instantiated.

   build()

     This static method accepts the array of tokens and works through them
     building the hash reference that they represent. When build() reaches
     the closing curly brace (}) it returns the hash reference.

   get_value

     This static method gets the value of a hash element. This method is
     called after a hash key is followed by a colon. A colon must be
     followed by a value. It may not be followed by the end of the tokens,
     a comma, or a closing brace.

JSON::Relaxed::Parser::Structure::Array

   This package parses Relaxed into array structures. It is a static
   package, i.e. it is not instantiated.

   build()

     This static method accepts the array of tokens and works through them
     building the array reference that they represent. When build()
     reaches the closing square brace (]) it returns the array reference.
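     As a rough sketch of that loop, the hypothetical build_array() below
     consumes tokens up to the closing brace, ignores extra commas, and
     rejects two values with no comma between them. It handles only
     scalar elements and signals errors with die, whereas JSON::Relaxed
     returns undef and sets its error variables; both simplifications are
     assumptions made to keep the example short.

```perl
use strict;
use warnings;

# build_array() is a hypothetical, simplified stand-in for build().
# $tokens is an array reference holding the tokens after the opening [.
sub build_array {
    my ($tokens) = @_;
    my @rv;
    my $need_comma = 0;

    while (@$tokens) {
        my $token = shift @$tokens;

        if ($token eq ']') {
            return \@rv;
        }
        elsif ($token eq ',') {
            # commas only separate; empty slots are ignored
            $need_comma = 0;
        }
        elsif ($need_comma) {
            die "missing-comma-between-array-elements\n";
        }
        else {
            push @rv, $token;
            $need_comma = 1;
        }
    }

    # ran out of tokens without seeing the closing brace
    die "unclosed-array-brace\n";
}

my $array = build_array([',', 'data', ',', ',', ']']);
# $array is a reference to a one-element array containing 'data'
```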

   missing_comma()

     This static method builds the missing-comma-between-array-elements
     error message.

   invalid_array_token()

     This static method builds the unknown-array-token error message.

JSON::Relaxed::Parser::Token::String

   Base class. Nothing actually happens in this package; it's just a base
   class for JSON::Relaxed::Parser::Token::String::Quoted and
   JSON::Relaxed::Parser::Token::String::Unquoted.

JSON::Relaxed::Parser::Token::String::Quoted

   A JSON::Relaxed::Parser::Token::String::Quoted object represents a
   string in the document that is delimited with single or double quotes.
   In the following example, Larry and Curly would be represented by
   Quoted objects but Moe would not.

    [
       "Larry",
       'Curly',
       Moe
    ]

   Quoted objects are created by $parser->tokenize() when it works through
   the array of characters in the document.

     * new()

     new() instantiates a JSON::Relaxed::Parser::Token::String::Quoted
     object and slurps in all the characters in the characters array until
     it gets to the closing quote. Then it returns the new Quoted object.

     A Quoted object has the following two properties:

     raw: the string that is inside the quotes. If the string contained
     any escape characters then the escapes are processed and the
     unescaped characters are in raw. So, for example, \n would become an
     actual newline.

     quote: the delimiting quote, i.e. either a single quote or a double
     quote.

     * as_perl()

     as_perl() returns the string that was in quotes (without the quotes).

JSON::Relaxed::Parser::Token::String::Unquoted

   A JSON::Relaxed::Parser::Token::String::Unquoted object represents a
   string in the document that was not delimited by quotes. In the following
   example, Moe would be represented by an Unquoted object, but Larry and
   Curly would not.

    [
       "Larry",
       'Curly',
       Moe
    ]

   Unquoted objects are created by $parser->tokenize() when it works
   through the array of characters in the document.

   An Unquoted object has one property, raw, which is the string. Escaped
   characters are resolved in raw.

     * new()

     new() instantiates a JSON::Relaxed::Parser::Token::String::Unquoted
     object and slurps in all the characters in the characters array until
     it gets to a space character, a comment, or one of the structural
     characters such as { or :.

     * as_perl()

     as_perl() returns the unquoted string or a boolean value, depending
     on how it is called.

     If the string is a boolean value, i.e. true, false, or null, then
     as_perl() returns 1 (for true), 0 (for false) or undef (for null),
     unless the always_string option is sent, in which case the string
     itself is returned. If the string does not represent a boolean value
     then it is returned as-is.

     $parser->structure() sends the always_string option when the token is
     a key
     in a hash. The following example should clarify how always_string is
     used:

      {
         // key: the literal string "larry"
         // value: 1
         larry : true,

         // key: the literal string "true"
         // value: 'x'
         true : 'x',

         // key: the literal string "null"
         // value: 'y'
         null : 'y',

         // key: the literal string "z"
         // value: undef
         z : null,
      }
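     The rule can be summarized in a few lines of Perl. In this sketch,
     unquoted_as_perl() is a hypothetical stand-in for as_perl() that
     takes the raw string directly instead of a token object. The
     %value_for table and the exact, case-sensitive matching are
     assumptions made for the example.

```perl
use strict;
use warnings;

# Values assumed for the three special unquoted words.
my %value_for = (true => 1, false => 0, null => undef);

# unquoted_as_perl() is a hypothetical stand-in for as_perl().
sub unquoted_as_perl {
    my ($raw, %opts) = @_;

    # hash keys are always taken literally
    return $raw if $opts{always_string};

    return exists $value_for{$raw} ? $value_for{$raw} : $raw;
}

# Keys are read with always_string, values without it.
my %hash;
$hash{ unquoted_as_perl('true', always_string => 1) } = unquoted_as_perl('x');
$hash{ unquoted_as_perl('z', always_string => 1) } = unquoted_as_perl('null');
```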

JSON::Relaxed::Parser::Token::Unknown

   This class is just used for development of JSON::Relaxed. It has no use
   in production. This class allows testing for when a token is an unknown
   object.

   To implement this class, add the 'unknown' option to
   JSON::Relaxed::Parser->new(). The value of the option should be the
   character
   that creates an unknown object. For example, the following option sets
   the tilde (~) as an unknown object.

    my $parser = JSON::Relaxed::Parser->new(unknown=>'~');

   The "unknown" character must not be inside quotes or inside an unquoted
   string.

TERMS AND CONDITIONS

   Copyright (c) 2014 by Miko O'Sullivan. All rights reserved. This
   program is free software; you can redistribute it and/or modify it
   under the same terms as Perl itself. This software comes with NO
   WARRANTY of any kind.

AUTHOR

   Miko O'Sullivan

VERSION

   Version: 0.05

HISTORY

   Version 0.01 Nov 30, 2014

     Initial version.

   Version 0.02 Dec 3, 2014

     Fixed test.t so that it can load lib.pm when it runs.

     Added $parser->extra_tokens_ok(). Removed error code
     invalid-structure-opening-string and allowed that error to fall
     through to multiple-structures.

     Cleaned up documentation.

   Version 0.03 Dec 6, 2014

     Modified test for parse_chars to normalize newlines. Apparently the
     way Perl on Windows handles newline is different than what I
     expected, but as long as it's recognizing newlines and/or carriage
     returns then the test should pass.

   Version 0.04 Apr 28, 2016

     Fixed bug in which end of line did not terminate some line comments.

     Minor cleanups of documentation.

     Cleaned up test.pl.

   Version 0.05 Apr 30, 2016

     Fixed bug: Test::Most was not added to the prerequisite list. No
     changes to the functionality of the module itself.