* * * * *
Stripping strips from a website
I started reading a new on-line strip, Player Versus Player [1]. It seems
promising, but I'd like to read the archive, which reaches back to May of
1998, making it two full years of archives to go through.
It's a simple enough matter to write a program that downloads the entire
archive of strips:
-----[ C ]-----
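/* filename, url and cmd are char buffers; year, month and day are ints */
/* set to the date of the first strip. All are declared earlier.        */
while(!isthistoday(year,month,day))
{
  sprintf(filename,"%d%02d%02d.gif",year,month,day);
  sprintf(url,"http://www.pvponline.com/archive/%d/pvp%s",year,filename);
  sprintf(cmd,"lynx -source %s >%s",url,filename);
  system(cmd);
  sleep(10);	/* be nice on their server */

  /* advance to the next calendar day; the loop test stops us at today */
  day++;
  if (day > daysinmonth(year,month))
  {
    day = 1;
    month++;
    if (month > 12)
    {
      month = 1;
      year++;
    }
  }
}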
-----[ END OF LINE ]-----
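The loop above calls daysinmonth() and isthistoday() without showing them.
What follows is only a sketch of how they might look, assuming the Gregorian
leap-year rule and the machine's local time zone; isleapyear() is a helper
name made up for this example.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: Gregorian leap-year rule (every 4th year, */
/* except century years that are not divisible by 400).           */
static int isleapyear(int year)
{
  return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

/* Number of days in the given month (1 = January ... 12 = December). */
int daysinmonth(int year,int month)
{
  static const int days[] = { 31,28,31,30,31,30,31,31,30,31,30,31 };

  assert(month >= 1 && month <= 12);
  if (month == 2 && isleapyear(year))
    return 29;
  return days[month - 1];
}

/* Nonzero if year/month/day is today's date in the local time zone */
/* (an assumption; the strip's server may well use a different one). */
int isthistoday(int year,int month,int day)
{
  time_t     now = time(NULL);
  struct tm *tm  = localtime(&now);

  if (tm == NULL)
    return 0;
  return (year  == tm->tm_year + 1900)	/* tm_year counts from 1900 */
      && (month == tm->tm_mon  + 1)	/* tm_mon is 0-based        */
      && (day   == tm->tm_mday);
}
```

With these, daysinmonth(2000,2) is 29 (2000 is divisible by 400) while
daysinmonth(1900,2) is 28.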
I feel somewhat odd about doing that, though, seeing how they get their
revenue through advertising (not that I agree that's the best way to make
money, but that's beside the point). There's also the matter of their logs:
if they check them and see a bunch of requests for just the strips, one
every 10 seconds, I could end up banned from their server, and I don't want
that in case I do end up liking the strip.
[1] http://www.pvponline.com/