[futurebasic] re: Re: notes for laurent about the list archives

From: lcs@...
Date: Wed, 29 May 2002 19:54:19 +0200 (MEST)

Hi bowerbird, and others

You're scoring some direct hits bb. Thanks for the
extra criticism (Date: Tue, 28 May 2002 03:51:37 EDT).

(1) digests 2058 and 2059 beginning March are indeed
accidentally duplicated as flotsam at end Feb.

(2) You write:

 > one post had the dreaded "=hex" charset ugliness,
 > number 29681 from jay with "droplist" as the header.
 > it's probably best for one person to take the time to
 > clean that up right, so everyone doesn't have to do it.

Normally I clean this up, partly by hand, partly by macro.
Jay's piece was somehow neglected, and in any case that
sort of cleanup is hazardous at the speed I have to go. I'll
send you and Jay this one backchannel for verification.

Robotic code erosion is widespread. And it can interact with
our own errors to make a deadly cocktail.  Authors, you must
protect your postings by *anticipating* the hazards of data
transport. Everything will ultimately be reduced to strict
ASCII text with hard line breaks, just like the RFC
Internet standards documents, which likewise aim chiefly at
durability and clarity.

One important "hex-precaution" in this regard:

Immediately after an equals sign, never type any of

     0123456789ABCDEF<CR>
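
A minimal sketch of this precaution in Python, assuming the "=hex"
damage comes from quoted-printable encoding (the usual cause). The
helper name find_hex_hazards is hypothetical; quopri is the standard
Python module for quoted-printable text. The checker flags every "="
that is followed by one of the characters above or that ends a line,
and the last lines show what a decoder does to such text.

```python
import quopri
import re

# "=" followed by an uppercase hex digit, or "=" as the last character
# of a line: the positions the precaution warns about.
HAZARD = re.compile(r"=(?=[0-9A-F]|$)")

def find_hex_hazards(text):
    """Return 1-based (line_number, column) pairs for each risky '='."""
    hazards = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in HAZARD.finditer(line):
            hazards.append((line_no, match.start() + 1))
    return hazards

# Illustrative input only, not taken from any real posting.
sample = "x=A0+1\ntotal=\n5\n"
print(find_hex_hazards(sample))

# What a quoted-printable decoder makes of innocent-looking code:
print(quopri.decodestring(b"IF x=41 THEN"))   # "=41" is read as "A"
print(quopri.decodestring(b"total=\n5"))      # "=" + newline joins the lines
```

Putting a space after the equals sign (x = 41) is enough to keep the
checker quiet and the decoder harmless.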

(3)

 > not exactly "errors", but there are probably some things
 > that glenn could change that would make the automatic
 > transparent formatting go much easier and be more clear.
 > 
 > for example, some list digests have a "separator line"
 > that clearly delineates the break between the messages.
 > without such separators, this task becomes very difficult,
 > especially since the header-lines are often jumbled (e.g., 
 > sometimes the date comes first, sometimes something else).
 > 
 > i'm not sure of the reasons for it, but there doesn't seem 
 > to be a lot of consistency in the structure of these 
 > digests.

The beginning of a message is reliably signalled by material that
I normally condense into a line like

         Content: futurebasic_31236.ezm

Any occasional exceptions are my slipups! Please report
them!
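
Splitting a digest on that marker line could be sketched in Python as
follows. This assumes every marker looks like the single example
above, and that the number in the file name can serve as the message
number; split_digest is a hypothetical name, not an existing tool.

```python
import re

# Assumed shape of the condensed marker line, taken from the one
# example shown above; leading indentation is allowed.
MARKER = re.compile(r"^\s*Content:\s*futurebasic_(\d+)\.ezm\s*$")

def split_digest(digest_text):
    """Split a digest into (message_number, lines) pairs."""
    messages = []
    current = None
    for line in digest_text.splitlines():
        match = MARKER.match(line)
        if match:
            # A marker line opens a new message.
            current = (int(match.group(1)), [])
            messages.append(current)
        elif current is not None:
            current[1].append(line)
        # Lines before the first marker (digest preamble) are skipped.
    return messages
```

Note that this only finds where each message starts. It makes no
attempt to find where the header material ends.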

On the other hand, for the reasons bb mentions, it's next to
impossible to detect the end of the message header material. So it
would be helpful if Glen would insert something that signals that;
his robots know how.

Header structure comes to a large extent from the mailer of the
message author, whence the variability.

It seems to me that the one bad thing you are doing, bb, is to
delete Glen's message serial number, which is potentially very
useful.  Instead I would suggest that Glen put that serial number
line on even the first message-by-message list distribution!

(4) 

 > in "give", you can inactivate any of the 10 search terms,
 > and then reactivate them later.  so that might help you.

Worth a look. I have become something of a speed freak and a
power junky. QUED/M (sorry not QED/M) wins for speed; and
'perl' (the pERVERSELY eCLECTIC rUBBISH lISTER) wins for
power.  So far...

Incidentally, I have just located a great load of perl doc 
at

       http://www.blindprogramming.com/web.htm

This seems a good source for HTML and CGI too.

 > i also want to work up some routines to write search results
 > out to a file, and then reload them.  that might also help you.

 > i'm looking to build some auto-indexing tools.
 > give me some ideas and i'll try and run with 'em.

For message selection the preferred custom tool seems to be
"procmail" (see yahoo & google).  That plus a CGI script might let
bb and collaborators autoextract an archive subset for any
user-specified subject on some future FB web page.
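
The selection step itself can be sketched in a few lines of Python.
This is not procmail, just a plain stand-in for the idea, and it
assumes each message carries a "Subject:" line somewhere in its first
lines. The names subject_of and select_by_subject, and the limit of
30 header lines, are hypothetical.

```python
def subject_of(message_text, max_header_lines=30):
    """Return the text of the first 'Subject:' line near the top, or ''."""
    for line in message_text.splitlines()[:max_header_lines]:
        stripped = line.strip()
        if stripped.lower().startswith("subject:"):
            return stripped[len("subject:"):].strip()
    return ""

def select_by_subject(messages, wanted):
    """Keep the messages whose subject contains `wanted`, ignoring case."""
    wanted = wanted.lower()
    return [m for m in messages if wanted in subject_of(m).lower()]
```

Because header order varies with the author's mailer, the sketch
scans the first lines for the subject rather than expecting it at a
fixed position.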

I have my plate overfull just cleaning up the raw material!

(5)

 > yes, "postcard" was a neat little tool.
 > futurebasic can print2pict as well, though.
 > just print to a window, and turn it into a pict.

>>   Cannot extract text but maybe I could learn...
>
> it depends on how the text was put into the pict.

OK.  Is the inaccessible stuff heavily locked?

> edoc turns text into picts, and can then search them.
> but why put the text into picts in the first place?

Every Mac Sys allows handsome pict viewing with minimal 
programming.

Have you a code scrap or digest ref to prove that <<futurebasic
can print2pict as well, though. just print to a window, and turn
it into a pict.>>? This technology is still competitive for
Mac-only doc such as we have for our apps.  Additionally the PICTs can
convert to PS (pop them into Write Now for example and print to
PS) and then to PDF.  I am willing to help in the last step for
list member public domain & GPL programming doc.

(6)

 > and once again, as i've been saying for years,
 > thanks for taking on the task of curating these archives...

It's very kind of you to say so.  (And kinder still to help by
spotting the slipups;=)

         Cheers

             Laurent S.


PS. You should all be watching for updated digests when
May gets posted next week.