It's probably a good thing some malformed URLs are considered “valid”

* * * * *

It seems it's all too easy to generate double slashes in the path component
[1] of a URL (Uniform Resource Locator): I received a report via email that
my current [2] feed [3] files [4] all had that issue.

Sigh.

I made a change a few months ago in how I internally store the base URL of my
blog. It used to be that I did not store the trailing slash (so that
"https://boston.conman.org/" would be stored as "https://boston.conman.org"),
so I had code to keep adding it back in when generating links. I changed the
code to store the trailing slash, but missed one section of code, and because
I don't subscribe to any of my feed files, I didn't notice the issue.
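
To make the trailing-slash mix-up above concrete, here is a minimal sketch in
Python. It is not the blog's actual code; the helper name join_url and the
variable names are hypothetical, chosen only for illustration. It shows how
leftover "add the slash back" code produces a double slash once the stored
base URL already ends in one, and one way to join the two parts so the result
is the same either way.

```python
# Hypothetical sketch; not the blog engine's actual code.

# The base URL as stored after the change (trailing slash included).
BASE_URL = "https://boston.conman.org/"


def join_url(base: str, path: str) -> str:
    """Join base and path with exactly one slash between them.

    Works whether or not base ends with a slash, and whether or not
    path starts with one.
    """
    return base.rstrip("/") + "/" + path.lstrip("/")


# Leftover code that still adds the slash itself reproduces the bug:
naive = BASE_URL + "/" + "index.atom"

# Normalizing at the join point avoids it:
fixed = join_url(BASE_URL, "index.atom")

print(naive)
print(fixed)
```

Normalizing where the pieces are joined means a missed call site can no
longer disagree with how the base URL happens to be stored.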

I also fixed an actual crashing bug. All I have to say about that is that web
robots are quite good at generating really garbage requests [5] using a
variety of methods [6]—it's like free fuzz testing [7]! Woo hoo! Sob!

[1] gopher://gopher.conman.org/0Phlog:2023/01/11.1
[2] https://boston.conman.org/bostondiaries.rss
[3] https://boston.conman.org/index.atom
[4] https://boston.conman.org/index.json
[5] gopher://gopher.conman.org/0Phlog:2019/07/09.1
[6] https://www.iana.org/assignments/http-methods/http-methods.xhtml
[7] https://en.wikipedia.org/wiki/Fuzzing

Email author at [email protected]