* * * * *

     It shouldn't be this hard to support another syndication feed format

A few days ago I came across a new syndication feed format, along the lines
of RSS (Really Simple Syndication) [1] or Atom [2], called JSON Feed [3]:

> We — Manton Reece and Brent Simmons — have noticed that JSON (JavaScript
> Object Notation) has become the developers’ choice for APIs (Application
> Programming Interfaces), and that developers will often go out of their way
> to avoid XML (eXtensible Markup Language). JSON is simpler to read and
> write, and it’s less prone to bugs.
>
> So we developed JSON Feed, a format similar to RSS (Really Simple
> Syndication) and Atom but in JSON. It reflects the lessons learned from our
> years of work reading and publishing feeds.
>
> See the spec. It’s at version 1, which may be the only version ever needed.
> If future versions are needed, version 1 feeds will still be valid feeds.

“JSON Feed: Home [4]”

It's not like I need another syndication format, and it's still unclear just
how popular JSON Feed really is, but hey, I thought, it should be pretty easy
to add this. It looks simple enough:

-----[ Javascript ]-----
{
   "version": "https://jsonfeed.org/version/1",
   "title": "My Example Feed",
   "home_page_url": "https://example.org/",
   "feed_url": "https://example.org/feed.json",
   "items": [
       {
           "id": "2",
           "content_text": "This is a second item.",
           "url": "https://example.org/second-item"
       },
       {
           "id": "1",
           "content_html": "<p>Hello, world!</p>",
           "url": "https://example.org/initial-post"
       }
   ]
}
-----[ END OF LINE ]-----

I just need to add another entry to the template section of the configuration
file [5], create a few template files, and as they say in England, “the
brother of your mother is Robert [6]” (how they know my mother's brother is
Robert, I don't know; the English are weird [7] like that).

But the hard part is filling in the content_text or content_html field. The
first problem is that JSON is encoded using UTF-8 [8]. For me, that's fine, as
I'm using UTF-8 (and even before I switched to UTF-8, I was using ASCII
(American Standard Code for Information Interchange) [9], which is valid UTF-8
by design). But in theory, someone could be using mod_blog [10] with some
other encoding, which would produce an invalid JSON Feed unless the content is
fed through a character set conversion routine, which I don't support in
mod_blog.
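
For a single-byte encoding such as ISO 8859-1, such a routine is at least
mechanical, since every byte maps directly to the Unicode code point of the
same value. A minimal sketch in C, assuming the input really is ISO 8859-1
and NUL-terminated; latin1_to_utf8 is a name made up for illustration and is
not a function in mod_blog:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of mod_blog): convert a NUL-terminated
 * ISO 8859-1 string into a freshly allocated UTF-8 string.
 * Returns NULL if allocation fails; the caller frees the result. */
char *latin1_to_utf8(const char *src)
{
  assert(src != NULL);

  /* Worst case: every input byte becomes two output bytes. */
  size_t len = strlen(src);
  char  *dst = malloc(len * 2 + 1);
  if (dst == NULL)
    return NULL;

  char *out = dst;
  for (size_t i = 0; i < len; i++)
  {
    unsigned char c = (unsigned char)src[i];
    if (c < 0x80)
      *out++ = (char)c;                    /* ASCII passes through unchanged */
    else
    {
      *out++ = (char)(0xC0 | (c >> 6));    /* lead byte:         110000xx */
      *out++ = (char)(0x80 | (c & 0x3F));  /* continuation byte: 10xxxxxx */
    }
  }
  *out = '\0';
  return dst;
}
```

Given the four bytes 63 61 66 E9 ("café" in ISO 8859-1), this returns the five
bytes 63 61 66 C3 A9. Any other source encoding would need its own mapping
table, which is where the work piles up.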

But even assuming I did, that still doesn't mean I'm out of the woods.

Suppose this was my content:

-----[ HTML ]-----
<p>"Hello," said the politician, lying.</p>

<p>"Back up!" I said, using my left hand to quickly cover my wallet in my back pocket.
"You aren't getting any money from me!"</p>
-----[ END OF LINE ]-----

If you check the syntax of JSON [11], you'll see that the double quote
character " needs to be converted to \". A similar transformation is required
for every line break (the blank line included), each being converted to \n.
And I have no code written in mod_blog for such conversions.
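
The conversion itself might look something like the following. This is a
minimal sketch in C, assuming the input is already valid UTF-8 and
NUL-terminated; json_escape is a hypothetical name, not a function in
mod_blog. It handles the double quote, the backslash, the common whitespace
escapes, and falls back to \u00XX for any other control character:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of mod_blog): escape a NUL-terminated
 * UTF-8 string so it can sit between the quotes of a JSON string.
 * Returns NULL if allocation fails; the caller frees the result. */
char *json_escape(const char *src)
{
  assert(src != NULL);

  /* Worst case: a control character becomes the six bytes \u00XX. */
  char *dst = malloc(strlen(src) * 6 + 1);
  if (dst == NULL)
    return NULL;

  char *out = dst;
  for ( ; *src != '\0'; src++)
  {
    unsigned char c = (unsigned char)*src;
    switch (c)
    {
      case '"':  *out++ = '\\'; *out++ = '"';  break;
      case '\\': *out++ = '\\'; *out++ = '\\'; break;
      case '\n': *out++ = '\\'; *out++ = 'n';  break;
      case '\r': *out++ = '\\'; *out++ = 'r';  break;
      case '\t': *out++ = '\\'; *out++ = 't';  break;
      default:
        if (c < 0x20)
          out += sprintf(out, "\\u%04x", (unsigned)c); /* other controls */
        else
          *out++ = (char)c;  /* everything else, UTF-8 bytes included */
        break;
    }
  }
  *out = '\0';
  return dst;
}
```

Run over the two paragraphs above, every " comes out as \" and the blank
line between the paragraphs comes out as \n\n.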

It's not like it would be that much code to write. When I added support for
RSS and Atom, I had to write code. But it irks me that I have to special case
a lot of string processing.

Yes, yes, I know: mod_blog is written in C, which is a horrible choice for
string processing. But even if I picked a language better suited to the task,
I would still have to write code to manually transform strings from, say, ISO
8859-1 [12] to UTF-8 and code to convert HTML (HyperText Markup Language) to
a form of non-HTML:

-----[ HTML ]-----
&lt;p&gt;&quot;Hello,&quot; said the politician, lying.&lt;/p&gt;

&lt;p&gt;&quot;Back up!&quot; I said, using my left hand to quickly cover my wallet in my back pocket.
&quot;You aren't getting any money from me!&quot;&lt;/p&gt;
-----[ END OF LINE ]-----

(Not to get all meta, but to display the first example HTML, I had to encode
it into the non-HTML you see above, and to display the non-HTML you see
above, I have to encode the non-HTML into non-non-HTML, or in other words,
convert the output yet again. So, to show a simple & in this page, I have to
encode it as &amp;, and to show that, I have to encode it as &amp;amp;, in
ever deepening layers of Inception [13]-like encoding. By the way, that was
encoded as &amp;amp;amp;, just for your information.)
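
Those layers fall out of running the same escaping pass more than once. A
minimal sketch in C, assuming only &, <, > and " need escaping (which matches
the non-HTML example above, where the apostrophe is left alone); html_escape
is a hypothetical name, not a function in mod_blog:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of mod_blog): replace the characters
 * & < > " with their HTML entities in a NUL-terminated string.
 * Returns NULL if allocation fails; the caller frees the result. */
char *html_escape(const char *src)
{
  assert(src != NULL);

  /* Worst case: every byte becomes the six bytes of &quot; */
  char *dst = malloc(strlen(src) * 6 + 1);
  if (dst == NULL)
    return NULL;

  char *out = dst;
  for ( ; *src != '\0'; src++)
  {
    const char *rep;
    switch (*src)
    {
      case '&': rep = "&amp;";  break;
      case '<': rep = "&lt;";   break;
      case '>': rep = "&gt;";   break;
      case '"': rep = "&quot;"; break;
      default:  rep = NULL;     break;
    }

    if (rep != NULL)
    {
      size_t n = strlen(rep);
      memcpy(out, rep, n);     /* copy the entity in place of the character */
      out += n;
    }
    else
      *out++ = *src;           /* anything else is copied as-is */
  }
  *out = '\0';
  return dst;
}
```

Feeding & through html_escape() once gives &amp;, twice gives &amp;amp;, and
three times gives &amp;amp;amp;, one extra layer per pass.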

I spent way too much time trying to generalize a solution, only to ultimately
reject the code. I'll probably just add the code I need to support JSON Feed
and call it a day, because solving the issue once and for all is just too
much work.

[1] https://en.wikipedia.org/wiki/RSS
[2] https://en.wikipedia.org/wiki/Atom_(standard)
[3] https://jsonfeed.org/
[4] https://jsonfeed.org/
[5] https://github.com/spc476/mod_blog/blob/6a2143322bccccee7e137c1e02874bbef103686d/journal/blog.conf#L29
[6] https://www.phrases.org.uk/meanings/bobs-your-uncle.html
[7] http://www.montypython.com/
[8] https://en.wikipedia.org/wiki/UTF-8
[9] https://en.wikipedia.org/wiki/ASCII
[10] https://github.com/spc476/mod_blog/
[11] http://json.org/
[12] https://en.wikipedia.org/wiki/ISO/IEC_8859-1
[13] https://en.wikipedia.org/wiki/Inception
