* * * * *

               Notes on an ideal integrated development system

I'm not a fan of IDEs (Integrated Development Environments). I grew up with
the “edit-compile-run” cycle of development, and while I didn't always have a
choice in the “compile” portion of things, I did in the “edit” portion, and
over time became very picky about which editor I use. Because of that,
whenever I did try an IDE, I invariably found the “edit” portion to be very
painful, stuck in an editor that I wasn't used to; being forced to use an
unfamiliar editor resulted in a vast loss of productivity and thus, I've
never liked IDEs. So I stuck with the “edit-compile-run” cycle.

But the recent bout of programming I've done has made me wish for something
better than the “edit-compile-run” cycle. And while IDEs have probably
evolved since I last tried them in the late 80s, I don't think they've
evolved enough to suit me.

What I'm about to describe is definitely “pie-in-the-sky” stuff. I'm not
saying that IDEs must be this way—I'm just saying that this is what I would
like in an IDE. Who knows? Maybe this won't work. Maybe it's unworkable. But
I wouldn't mind seeing these features (as long as the editing could be
configured to my liking).

> A database I used in the early 80s that ran on a twin floppy PC (Personal
> Computer). Written by Brian Berkowitz and Richard Ilson.
>
> Wonderful features were:
>
> * It stored user-defined names separately from internal IDs, so you could
>   change the names of tables and fields without worrying.
> * Fantastic date handling—you could enter “Next Wednesday” and it would
>   work out the date.
> * No Table/View separation, you could define a field on a table as being
>   calculated on fields from a related table and that definition became part
>   of the original table.
>

“Cornerstone [1]”

The one feature of Cornerstone [2] that still strikes me as innovative is
the separation of variables (or in this case, fields and tables) from their
name. One could change the name of a variable without having to edit every
other occurrence of that name. That's a very powerful feature, but to
implement it in an IDE, that IDE would have to have intimate knowledge of the
computer language being used.

A few years ago, I cleaned up the code in mod_blog. I had a bunch of global
variables used throughout the codebase, all starting with “g_” (such as
g_rssfile) but they weren't variables in the traditional sense, they were
more or less “run-time settable constants” (to the rest of the codebase, the
declaration for g_rssfile was extern const char *const g_rssfile). I decided
that they needed a renaming to better reflect how I actually use them, and
changed the majority of global variables to start with “c_”.

Talk about pain.

Each one required at minimum three edits—the declaration in a header file,
the actual declaration, and the setting of said variable when the program
starts up. If I had this feature, something that took maybe an hour could
have been finished in a few minutes.
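
To make the idea concrete, here's a minimal sketch in C of names kept apart
from identity. Everything in it (the Symbol type, sym_define(), sym_rename(),
sym_name(), the fixed table sizes) is made up for illustration and isn't how
Cornerstone or any real IDE does it. The point is only that the code would
refer to a variable by ID, with the name looked up just for display:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 64
#define MAX_NAME    32

/* Hypothetical symbol table: the stored program refers to a symbol by */
/* its index (the ID); the name exists only for display.               */
typedef struct
{
  char name[MAX_NAME];
} Symbol;

static Symbol symbols[MAX_SYMBOLS];
static size_t symcount = 0;

/* Add a name and return its ID, or (size_t)-1 if the table is full */
/* or the name is too long to store.                                */
static size_t sym_define(const char *name)
{
  if (symcount == MAX_SYMBOLS || strlen(name) >= MAX_NAME)
    return (size_t)-1;
  strcpy(symbols[symcount].name,name);
  return symcount++;
}

/* Rename in exactly one place; every reference holds the ID, so   */
/* nothing else is edited. Returns 0 on success, -1 if the new     */
/* name is too long.                                               */
static int sym_rename(size_t id,const char *newname)
{
  assert(id < symcount);
  if (strlen(newname) >= MAX_NAME)
    return -1;
  strcpy(symbols[id].name,newname);
  return 0;
}

/* Look up the current display name for an ID. */
static const char *sym_name(size_t id)
{
  assert(id < symcount);
  return symbols[id].name;
}
```

With something like that, renaming g_rssfile to c_rssfile becomes a single
sym_rename() call; every reference holds the ID and picks up the new name the
next time it's displayed.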

But mod_blog is a very small codebase—some 14,000 lines of code. Could such a
feature scale to something like the Linux kernel? Or Firefox? Or even Windows
Vista? I don't know. And how would you even implement something like that?

My guess—if you even hope to do something like this on multimillion line
codebases, you may have to give up on storing the code as text and move on to
some other internal format.

It's not like it's a new idea. Most forms of BASIC (Beginner's All-purpose
Symbolic Instruction Code) [3] (you know, that horrible language made popular
on 8-bit microprocessors of the 70s and 80s) were not stored as text but in a
mixture of binary and text form (although you could get a pure text version
of the code if you wanted it).
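
A rough sketch of that kind of storage, in C. This isn't the format of any
particular BASIC; the keyword table and token values are invented, and it
assumes 7-bit ASCII input. Keywords get squeezed down to a single byte with
the high bit set, everything else stays as text, and the text version can be
regenerated on demand:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Made-up keyword table; the token byte is 0x80 plus the index. */
static const char *const keywords[] = { "PRINT", "GOTO", "IF", "THEN" };
#define NKEYWORDS (sizeof(keywords) / sizeof(keywords[0]))

/* Convert a line of text to the mixed binary/text form. out must   */
/* hold at least strlen(text) bytes. Returns the bytes written.     */
static size_t tokenize(const char *text,unsigned char *out)
{
  size_t n = 0;

  assert(text != NULL && out != NULL);
  while (*text != '\0')
  {
    size_t k;

    /* does a keyword start at this position? */
    for (k = 0 ; k < NKEYWORDS ; k++)
      if (strncmp(text,keywords[k],strlen(keywords[k])) == 0)
        break;

    if (k < NKEYWORDS)
    {
      out[n++]  = (unsigned char)(0x80 + k);
      text     += strlen(keywords[k]);
    }
    else
      out[n++] = (unsigned char)*text++;
  }
  return n;
}

/* Expand the stored form back into text. out must be large enough  */
/* for the expanded line plus a NUL. Returns the text length.       */
static size_t detokenize(const unsigned char *in,size_t n,char *out)
{
  size_t len = 0;
  size_t i;

  for (i = 0 ; i < n ; i++)
  {
    if (in[i] >= 0x80 && (size_t)(in[i] - 0x80) < NKEYWORDS)
    {
      const char *kw = keywords[in[i] - 0x80];
      memcpy(&out[len],kw,strlen(kw));
      len += strlen(kw);
    }
    else
      out[len++] = (char)in[i];
  }
  out[len] = '\0';
  return len;
}
```

A real interpreter would at least skip over quoted strings and comments when
tokenizing; this one doesn't.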

So, what happens if we get away from distinct text files? And hey, why not
design (or redesign) a language while we're at it?

A common complaint about statically typechecked languages is the requirement
to declare all your variables. But if we're using an ideal IDE, one that
understands the language we're programming in, why not take the work on type
inference and use it during the editing phase?

Something like:

[Example 1] [4]

The editing takes place on the right-hand side, whereas the IDE will track
your variables and types on the left-hand side. In this simple example, we
see that the IDE has determined that the function nth() takes an integer, and
returns a constant string.

In this example:

[Example 2] [5]

The IDE inferred that the function foo() will return either a constant string
or a number, which is highlighted in red to indicate the conflict (not that
it won't run depending upon the language—it's just highlighting the fact that
this function will return one of two types). It also inferred that the
parameters are of type “number” (doubles, floats, integers, what have you).
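
One way the IDE might spot that kind of conflict, sketched in C under some
big assumptions: the types are hypothetical one-bit tags, and something else
has already worked out the type of each return statement. Union them
together, and more than one bit set means a conflict worth highlighting:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type tags, one bit each, so a set of types fits in */
/* an unsigned.                                                    */
enum
{
  T_NUMBER = 1u << 0,
  T_STRING = 1u << 1
};

/* Union the type of every return statement in a function. */
static unsigned infer_return(const unsigned *returns,size_t n)
{
  unsigned set = 0;
  size_t   i;

  assert(returns != NULL || n == 0);
  for (i = 0 ; i < n ; i++)
    set |= returns[i];
  return set;
}

/* More than one bit set means the function can return more than */
/* one type, which is what the IDE would highlight.              */
static int is_conflict(unsigned set)
{
  return (set & (set - 1)) != 0;
}
```

For nth(), where every return is a constant string, the set ends up with one
bit and nothing is flagged; for foo(), a string return plus a number return
sets two bits.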

So, the IDE could be doing these type annotations for you, but why not the
ability to further annotate the annotations? I don't see why you couldn't
edit the left-hand side to, say, change the type the IDE detected, or even
annotate further conditions:

[Example 3] [6]

Here, we annotated that b is not to be 0, and the IDE then highlighted the
code to say “hey, this can't happen.” The assumption here is, the compiler
can then use the annotations to statically check the code, and if it can
determine at compile time that b is 0, then flag a compilation error—
otherwise it can insert the runtime code for us to check and raise an
exception (or do the equivalent of assert()) at runtime.
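
Written out by hand in C, the runtime half of that might look something like
this. The divide() function is a hypothetical stand-in, not the code from
Example 3, and assert() is standing in for whatever check the compiler would
insert:

```c
#include <assert.h>

/* Annotation (from the left-hand side): b != 0 */
static double divide(double a,double b)
{
  /* inserted because the compiler can't prove b != 0 at every call */
  assert(b != 0.0);
  return a / b;
}
```

A call like divide(x,0) is the case the compiler could reject outright, while
divide(x,y) with y only known at runtime gets the check.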

(And if we have all this syntax and typechecking stuff going on, along with
the ability to change variable and function names at will without having to
re-edit a bunch of code, we might as well have the IDE compile the code as we
write it, although on a huge codebase this may be impractical. Just a thought.)

I'm still not entirely sure how to present the source code though. Since this
“pie-in-the-sky” IDE stores the source code in some internal format, the
minimum “working unit” isn't a file. I want to say that the minimum “working
unit” is a function (that's how the examples are presented), or maybe a group
of related functions. Heck, at this stage, we could probably incorporate
Literate Programming principles.

Another feature that I don't think any existing IDE has is revision control
as part of the system. And like the editing portion (“I want my editor, not
the crap one the IDE provides”), revision control is another area of
contention (not only over, say, CVS [7] vs. SVN [8], but centralized vs.
decentralized, file-based vs. content-based, commenting every change vs.
commenting over a series of changes, etc.). But since I'm taking a “pie-in-
the-sky” approach to IDEs, I'll include revision control from within it as
well.

It would probably also help with managing slightly different versions of the
code base. For instance, the original version of the graylist daemon had the
following bit of code to generate a report (more or less pulled from another
daemon I had written):

-----[ C ]-----
static void handle_sigusr1(void)
{
 Stream out;
 pid_t  child;
 size_t i;

 (*cv_report)(LOG_DEBUG,"","User 1 Signal");
 mf_sigusr1 = 0;

 child = fork();
 if (child == (pid_t)-1)
 {
   (*cv_report)(LOG_CRIT,"$","fork() = %a",strerror(errno));
   return;
 }
 else if (child > 0)
   return; /* parent carries on; only the child writes the report */

 out = FileStreamWrite(c_dumpfile,FILE_CREATE | FILE_TRUNCATE);
 if (out == NULL)
 {
   (*cv_report)(LOG_ERR,"$","could not open %a",c_dumpfile);
   _exit(0);
 }

 for (i = 0 ; i < g_poolnum ; i++)
 {
   LineSFormat(
       out,
       "$ $ $ $ $ $ $ $ L L",
       "%a %b %c %d%e%f%g%h %i %j\n",
       ipv4(g_tuplespace[i]->ip),
       g_tuplespace[i]->from,
       g_tuplespace[i]->to,
       (g_tuplespace[i]->f & F_WHITELIST) ? "W" : "-",
       (g_tuplespace[i]->f & F_GRAYLIST)  ? "G" : "-",
       (g_tuplespace[i]->f & F_TRUNCFROM) ? "F" : "-",
       (g_tuplespace[i]->f & F_TRUNCTO)   ? "T" : "-",
       (g_tuplespace[i]->f & F_IPv6)      ? "6" : "-",
       (unsigned long)g_tuplespace[i]->ctime,
       (unsigned long)g_tuplespace[i]->atime
   );
 }

 StreamFree(out);
 _exit(0);
}
-----[ END OF LINE ]-----

It works on all the development servers, but not the actual server.

Sigh.

Next version:

-----[ C ]-----
static void handle_sigusr1(void)
{
 Stream out;
#ifdef CAN_DO_FORK
 pid_t  child;
#endif
 size_t i;

 (*cv_report)(LOG_DEBUG,"","User 1 Signal");
 mf_sigusr1 = 0;

#ifdef CAN_DO_FORK
 child = fork();
 if (child == (pid_t)-1)
 {
   (*cv_report)(LOG_CRIT,"$","fork() = %a",strerror(errno));
   return;
 }
 else if (child > 0)
   return; /* parent carries on; only the child writes the report */
#endif

 out = FileStreamWrite(c_dumpfile,FILE_CREATE | FILE_TRUNCATE);
 if (out == NULL)
 {
   (*cv_report)(LOG_ERR,"$","could not open %a",c_dumpfile);
#ifdef CAN_DO_FORK
   _exit(0);
#else
   return;
#endif
 }

 for (i = 0 ; i < g_poolnum ; i++)
 {
   LineSFormat(
       out,
       "$ $ $ $ $ $ $ $ L L",
       "%a %b %c %d%e%f%g%h %i %j\n",
       ipv4(g_tuplespace[i]->ip),
       g_tuplespace[i]->from,
       g_tuplespace[i]->to,
       (g_tuplespace[i]->f & F_WHITELIST) ? "W" : "-",
       (g_tuplespace[i]->f & F_GRAYLIST)  ? "G" : "-",
       (g_tuplespace[i]->f & F_TRUNCFROM) ? "F" : "-",
       (g_tuplespace[i]->f & F_TRUNCTO)   ? "T" : "-",
       (g_tuplespace[i]->f & F_IPv6)      ? "6" : "-",
       (unsigned long)g_tuplespace[i]->ctime,
       (unsigned long)g_tuplespace[i]->atime
   );
 }

 StreamFree(out);
#ifdef CAN_DO_FORK
 _exit(0);
#endif
}
-----[ END OF LINE ]-----

Ugly as hell. But typical of “portable” C code. If, however, one could easily
make alternative versions (or branches) to the code, then I could, say,
branch the previous version into the “Can do fork” and the “Not a forking
chance” versions, then drop all this #ifdef crap. And removing all that #ifdef
crap makes it easier to follow the code.

And if you need to see all the current versions?

I guess something like FileMerge [9] could be used to view the different
revisions (and if the minimum “working unit” is the function, we get very
fine-grained revision control).

And I suppose, while I'm at it, the ability to not only debug from the IDE,
but edit a running instance of the program (Smalltalk In Action) [10]
wouldn't be asking too much, although doing so for any arbitrary language may
be difficult to darn near impossible.

[1] http://c2.com/cgi/wiki?CornerStone
[2] http://web.mit.edu/6.933/www/Fall2000/infocom/csdemo/index.html
[3] http://en.wikipedia.org/wiki/BASIC
[4] gopher://gopher.conman.org/gPhlog:2007/09/08/samp1.gif
[5] gopher://gopher.conman.org/gPhlog:2007/09/08/samp2.gif
[6] gopher://gopher.conman.org/gPhlog:2007/09/08/samp3.gif
[7] http://www.nongnu.org/cvs/
[8] http://subversion.tigris.org/
[9] http://en.wikipedia.org/wiki/Image:FileMerge_Screenshot.jpg
[10] http://onsmalltalk.com/programming/smalltalk/smalltalk-in-action/
