* * * * *

                         Yet more character building

Back from hacking [1].

And it's been an interesting session. Learned quite a bit, and picked up some
new tricks as well.

I was doing some testing when I copied and pasted the following:

> sensor—and? How did **that** happen?  Let's look at the O'Reilly source
> [2] from which Cory copy and pasted:
>
> > The phone has become a platform, moving beyond mere voice to smart mobile
> > sensor—and back to phone again, by way of voice-over-IP.
> >
>

“Sam Ruby: Copy and Paste [3]”

The program I wrote would classify the text as UTF-8, and then iconv() would
return an error. I rewrote the conversion routine so that when it failed
(iconv() reports where in the input the conversion failed) I would
reclassify the remaining text and continue.
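
The retry loop described above might look roughly like the sketch below. It
is a reconstruction under assumptions, not the actual routine: guess_charset()
is a hypothetical classifier that only looks at the first non-ASCII byte,
convert_mixed() is a made-up name, and the output is assumed to be UTF-8. The
one thing it relies on is that a failing iconv() leaves the input pointer at
the byte it could not convert.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical classifier: decides from the first non-ASCII byte only. */
static const char *guess_charset(const char *s, size_t n)
{
  for (size_t i = 0; i < n; i++)
  {
    unsigned char c = (unsigned char)s[i];
    if (c < 0x80)
      continue;

    /* number of continuation bytes a UTF-8 lead byte asks for */
    size_t need = (c >= 0xC2 && c <= 0xDF) ? 1
                : (c >= 0xE0 && c <= 0xEF) ? 2
                : (c >= 0xF0 && c <= 0xF4) ? 3
                : 0;

    if (need == 0 || n - i <= need)
      return "WINDOWS-1252";
    for (size_t k = 1; k <= need; k++)
      if (((unsigned char)s[i + k] & 0xC0) != 0x80)
        return "WINDOWS-1252";
    return "UTF-8";
  }
  return "UTF-8"; /* pure ASCII is already valid UTF-8 */
}

/* Convert in[0..inlen) to UTF-8 (assumed target) in out.
   Returns the number of bytes written, or (size_t)-1 on error. */
size_t convert_mixed(const char *in, size_t inlen, char *out, size_t outlen)
{
  char   *src  = (char *)in; /* iconv() takes char **, input is not modified */
  char   *dst  = out;
  size_t  left = outlen;

  while (inlen > 0)
  {
    /* (re)classify whatever text is still unconverted */
    iconv_t cd = iconv_open("UTF-8", guess_charset(src, inlen));
    if (cd == (iconv_t)-1)
      return (size_t)-1;

    char   *start = src;
    size_t  rc    = iconv(cd, &src, &inlen, &dst, &left);
    int     err   = errno;
    iconv_close(cd);

    if (rc != (size_t)-1)
      break;                 /* the rest converted cleanly */
    if (err == E2BIG)
      return (size_t)-1;     /* output buffer too small */

    /* EILSEQ or EINVAL: src now points at the offending byte.  If this
       pass made no progress at all, substitute '?' and skip one byte so
       the loop cannot spin forever on the same input. */
    if (src == start)
    {
      if (left == 0)
        return (size_t)-1;
      *dst++ = '?';
      left--;
      src++;
      inlen--;
    }
  }
  return outlen - left;
}
```

The "no progress" guard is a design choice of this sketch: without it, a byte
that the reclassified charset also rejects would be retried forever.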

Doing that, the text fragment above was tagged first as UTF-8, then as
WINDOWS-1252, and was displayed correctly:

> sensor—and? How did that happen? Let's look at the O'Reilly source from
> which Cory copy and pasted:
>
> The phone has become a platform, moving beyond mere voice to smart mobile
> sensor—and back to phone again, by way of voice-over-IP.
>

But if I copied the text twice, it would still be tagged first as UTF-8 and
then as WINDOWS-1252, but the second copy would be incorrect:

> sensor—and? How did that happen? Let's look at the O'Reilly source from
> which Cory copy and pasted:
>
> The phone has become a platform, moving beyond mere voice to smart mobile
> sensor—and back to phone again, by way of voice-over-IP.
>
> sensor—and? How did that happen? Let's look at the O'Reilly source from
> which Cory copy and pasted:
>
> The phone has become a platform, moving beyond mere voice to smart mobile
> sensor—and back to phone again, by way of voice-over-IP.
>

Not really sure how to handle that (“garbage in, garbage out” and all that)
but it's a lot better than things were before. All that was left was to add
some more code to allow plain text or HTML formatted text and a preview
mode[DELETED-; I put it online so those of you who are curious can play
around with it-DELETED].

The trick I learned (an epiphany if you will): I added the following to the
code:

> volatile int g_debug = 1;  /* volatile: reread on every pass */
>
> while(g_debug)             /* spin until a debugger clears g_debug */
>   ;
>

That will cause the program to just sit there, doing vast amounts of nothing
really fast. The reason for such a weird thing is that debugging a CGI
(Common Gateway Interface) program (and yes, this is written in C—don't ask)
is not easy (I used to go through quite a bit of rigamarole to simulate the
webserver environment so I could use a debugger). This trick allows the
webserver to run the program (which will just sit there), and then I can use
gdb to attach to the running process and debug it (once in, I set my
breakpoint, then do set g_debug=0 and resume execution of the program; wish
I had known about this eight years ago).
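
A gentler variation on the same trick, sketched under assumptions rather than
taken from the program itself: wait_for_debugger() is a hypothetical helper
that writes the process ID to stderr (which many webservers send to their
error log for CGI programs) and sleeps between checks instead of spinning
flat out.

```c
#include <stdio.h>
#include <unistd.h>

/* Cleared from inside the debugger to let the program carry on. */
volatile int g_debug = 1;

/* Hypothetical helper: block until g_debug is cleared.
   Returns the number of one-second naps taken while waiting. */
unsigned wait_for_debugger(void)
{
  unsigned waited = 0;

  /* report the PID so it is easy to find the process to attach to */
  fprintf(stderr, "pid %ld waiting for debugger\n", (long)getpid());

  while (g_debug)
  {
    sleep(1); /* yield the CPU instead of busy-waiting */
    waited++;
  }
  return waited;
}
```

With that in place, something like "gdb -p PID" attaches to the waiting
process; after setting a breakpoint, "set var g_debug = 0" followed by
"continue" lets it run on.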

Another amusing thing I learned: the “/” character in Firefox will bring up
a search box. It's not a bad thing, until you try typing a “/” in a
<TEXTAREA> field. Then it gets downright annoying.

Now to take what I have and integrate it.

[1] gopher://gopher.conman.org/0Phlog:2004/12/05.1
[2] http://conferences.oreillynet.com/cs/et2005/create/e_sess
[3] http://www.intertwingly.net/blog/2004/09/23/Copy-and-Paste
