* * * * *
It shouldn't be this hard to support another syndication feed format
A few days ago I came across a new syndication feed format (like RSS (Really
Simple Syndication) [1] or Atom [2])—JSON Feed [3]:
> We — Manton Reece and Brent Simmons — have noticed that JSON (JavaScript
> Object Notation) has become the developers’ choice for APIs (Application
> Programming Interfaces), and that developers will often go out of their way
> to avoid XML (eXtensible Markup Language). JSON is simpler to read and
> write, and it’s less prone to bugs.
>
> So we developed JSON Feed, a format similar to RSS (Really Simple
> Syndication) and Atom but in JSON. It reflects the lessons learned from our
> years of work reading and publishing feeds.
>
> See the spec. It’s at version 1, which may be the only version ever needed.
> If future versions are needed, version 1 feeds will still be valid feeds.
>
“JSON Feed: Home [4]”
It's not like I need another syndication format, and it's still unclear just
how popular JSON Feed really is, but hey, I thought, it should be pretty easy
to add this. It looks simple enough:
-----[ Javascript ]-----
{
    "version": "https://jsonfeed.org/version/1",
    "title": "My Example Feed",
    "home_page_url": "https://example.org/",
    "feed_url": "https://example.org/feed.json",
    "items": [
        {
            "id": "2",
            "content_text": "This is a second item.",
            "url": "https://example.org/second-item"
        },
        {
            "id": "1",
            "content_html": "<p>Hello, world!</p>",
            "url": "https://example.org/initial-post"
        }
    ]
}
-----[ END OF LINE ]-----
I just need to add another entry to the template section of the configuration
file [5], create a few template files, and as they say in England, “the
brother of your mother is Robert [6]” (how they know my mother's brother is
Robert, I don't know—the English are weird [7] like that).
But the issue is filling in the content_text field. The first issue—JSON is
encoded using UTF-8 [8]. For me, that's not an issue, as I'm using UTF-8 (and
even before I switched to using UTF-8, I was using ASCII (American Standard
Code for Information Interchange) [9], which is valid UTF-8 by design). But
in theory, someone could be using mod_blog [10] with some other encoding
scheme, which means an invalid JSON Feed unless fed through a character set
conversion routine, which I don't support in mod_blog.
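For illustration only, here is roughly what such a conversion routine looks like in a language that ships with codecs. This is a minimal Python sketch, not mod_blog code: the function name `to_utf8` is hypothetical, and it assumes the source encoding is known ahead of time (the hard part in practice is often knowing which encoding the input really uses).

```python
def to_utf8(raw: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Re-encode bytes from a known source encoding into UTF-8.

    Hypothetical helper for illustration; the default encoding is an
    assumption, not something mod_blog defines.
    """
    # Decode with the declared encoding, then encode the text as UTF-8.
    return raw.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    latin1 = b"caf\xe9"        # "café" in ISO 8859-1
    print(to_utf8(latin1))     # b'caf\xc3\xa9'
```

Pure ASCII input passes through unchanged, which is why an ASCII-only blog never notices the problem.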
But even assuming I did, that still doesn't mean I'm out of the woods.
Suppose this was my content:
-----[ HTML ]-----
<p>"Hello," said the politician, lying.</p>
<p>"Back up!" I said, using my left hand to quickly cover my wallet in my back pocket.
"You aren't getting any money from me!"</p>
-----[ END OF LINE ]-----
If you check the syntax of JSON [11], you'll see that the double quote
character " needs to be converted to \". A similar transformation is required
for each line break, which must be converted to \n. And I have no code written
in mod_blog for such conversions.
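To make the required transformation concrete, here is a minimal Python sketch of escaping a string for use inside a JSON document. It is not mod_blog code; the function name `json_escape` is hypothetical, and the sample text (including its blank line) is an assumption based on the example above. It follows the string grammar on json.org: the double quote, the backslash, and control characters below U+0020 must be escaped.

```python
import json


def json_escape(text: str) -> str:
    """Escape text so it can sit between double quotes in JSON.

    Hypothetical helper for illustration only.
    """
    short = {
        '"': '\\"', '\\': '\\\\', '\n': '\\n', '\r': '\\r',
        '\t': '\\t', '\b': '\\b', '\f': '\\f',
    }
    out = []
    for ch in text:
        if ch in short:
            out.append(short[ch])                  # two-character escapes
        elif ord(ch) < 0x20:
            out.append('\\u%04x' % ord(ch))        # other control characters
        else:
            out.append(ch)                         # everything else as-is
    return ''.join(out)


if __name__ == "__main__":
    sample = '<p>"Hello," said the politician, lying.</p>\n\n<p>"Back up!"</p>'
    escaped = json_escape(sample)
    print(escaped)
    # A JSON parser should recover the original text from the escaped form.
    assert json.loads('"' + escaped + '"') == sample
```

In Python the standard `json.dumps` already does this; the hand-written loop is only there to show what a C implementation would have to spell out.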
It's not like it would be that much code to write. When I added support for
RSS and Atom, I had to write code. But it irks me that I have to special case
a lot of string processing.
Yes, yes, I know—mod_blog is written in C, which is a horrible choice for
string processing. But even if I picked a better language suited to the task,
I would still have to write code to manually transform strings from, say, ISO
8859-1 [12] to UTF-8 and code to convert HTML (HyperText Markup Language) to
a form of non-HTML:
-----[ HTML ]-----
&lt;p&gt;"Hello," said the politician, lying.&lt;/p&gt;
&lt;p&gt;"Back up!" I said, using my left hand to quickly cover my wallet in my back pocket.
"You aren't getting any money from me!"&lt;/p&gt;
-----[ END OF LINE ]-----
(Not to get all meta, but to display the first example HTML, I had to encode
it into the non-HTML you see above, and to display the non-HTML you see
above, I have to encode the non-HTML into non-non-HTML, or in other words,
convert the output yet again. So, to show a simple & in this page, I have to
encode it as &amp;, and to show that, I have to encode it as &amp;amp;, in
ever deepening layers of Inception [13]-like encoding. By the way, that was
encoded as &amp;amp;amp;, just for your information.)
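The layering is easier to see in code. The following is a minimal Python sketch, not anything from mod_blog: `encode_layers` is a hypothetical helper that applies the standard library's `html.escape` a given number of times, and it assumes that only &, < and > need escaping (hence `quote=False`).

```python
import html


def encode_layers(text: str, depth: int) -> str:
    """Apply HTML escaping `depth` times (hypothetical helper)."""
    for _ in range(depth):
        # quote=False escapes only &, < and >, leaving " and ' alone.
        text = html.escape(text, quote=False)
    return text


if __name__ == "__main__":
    # One layer turns markup into text that displays as markup.
    print(encode_layers('<p>"Hello,"</p>', 1))  # &lt;p&gt;"Hello,"&lt;/p&gt;

    # Each extra layer is what you must write to display the previous one.
    for depth in range(4):
        print(depth, encode_layers("&", depth))
    # 0 &
    # 1 &amp;
    # 2 &amp;amp;
    # 3 &amp;amp;amp;
```

Each layer only ever touches the & it finds, so the text grows by one "amp;" per level.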
I spent way too much time trying to generalize a solution, only to ultimately
reject the code. I'll probably just add the code I need to support JSON Feed
and call it a day, because solving the issue once and for all is just too
much work.
[1] https://en.wikipedia.org/wiki/RSS
[2] https://en.wikipedia.org/wiki/Atom_(standard)
[3] https://jsonfeed.org/
[4] https://jsonfeed.org/
[5] https://github.com/spc476/mod_blog/blob/6a2143322bccccee7e137c1e02874bbef103686d/journal/blog.conf#L29
[6] https://www.phrases.org.uk/meanings/bobs-your-uncle.html
[7] http://www.montypython.com/
[8] https://en.wikipedia.org/wiki/UTF-8
[9] https://en.wikipedia.org/wiki/ASCII
[10] https://github.com/spc476/mod_blog/
[11] http://json.org/
[12] https://en.wikipedia.org/wiki/ISO/IEC_8859-1
[13] https://en.wikipedia.org/wiki/Inception