Hi bowerbird, and others

You're scoring some direct hits, bb. Thanks for the extra criticism (Date: Tue, 28 May 2002 03:51:37 EDT).

(1) Digests 2058 and 2059, beginning March, are indeed accidentally duplicated as flotsam at the end of Feb.

(2) You write:

> one post had the dreaded "=hex" charset ugliness,
> number 29681 from jay with "droplist" as the header.
> it's probably best for one person to take the time to
> clean that up right, so everyone doesn't have to do it.

Normally I clean this up, partly by hand, partly by macro. Jay's piece was somehow neglected, and in any case, that sort of cleanup is hazardous at the speed I have to go. I'll send you and Jay this one backchannel for verification.

Robotic code erosion is widespread. And it can interact with our own errors to make a deadly cocktail.

Authors, you must protect your postings by *anticipating* the hazards of data transport. Everything will ultimately be reduced to strict ASCII text with hard line breaks, just like the main RFC internet standard postings that similarly aim chiefly at durability and clarity.

One important "hex-precaution" in this regard:

    Immediately after an equal sign, never type any of
    0123456789ABCDEF<CR>

(3)

> not exactly "errors", but there are probably some things
> that glenn could change that would make the automatic
> transparent formatting go much easier and be more clear.
>
> for example, some list digests have a "separator line"
> that clearly delineates the break between the messages.
> without such separators, this task becomes very difficult,
> especially since the header-lines are often jumbled (e.g.,
> sometimes the date comes first, sometimes something else).
>
> i'm not sure of the reasons for it, but there doesn't seem
> to be a lot of consistency in the structure of these
> digests.

The beginning of a message is reliably signalled by material that I normally condense into a line like

    Content: futurebasic_31236.ezm

And occasional exceptions are my slipups!
Please report them!

On the other hand, for the reasons bb mentions, it's next to impossible to detect the end of the message header material. So it would be helpful if Glen would insert something that signals that; his robots know how. Header structure comes to a large extent from the mailer of the message author, whence the variability.

It seems to me that the one bad thing you are doing, bb, is to delete Glen's message serial number, which is potentially very useful. Instead I would suggest that Glen put that serial number line on even the first message-by-message list distribution!

(4)

> in "give", you can inactivate any of the 10 search terms,
> and then reactivate them later. so that might help you.

Worth a look. I have become something of a speed freak and a power junky. QUED/M (sorry not QED/M) wins for speed; and 'perl' (the pERVERSELY eCLECTIC rUBBISH lISTER) wins for power. So far...

Incidentally, I have just located a great load of perl doc at

    http://www.blindprogramming.com/web.htm

This seems a good source for HTML and CGI too.

> i also want to work up some routines to write search results
> out to a file, and then reload them. that might also help you.
> i'm looking to build some auto-indexing tools.
> give me some ideas and i'll try and run with 'em.

For message selection the preferred custom tool seems to be "procmail" (see yahoo & google). That plus a CGI script might let bb and collaborators autoextract an archive subset for any user-specified subject on some future FB web page. I have my plate overfull just cleaning up the raw material!

(5)

> yes, "postcard" was a neat little tool.
> futurebasic can print2pict as well, though.
> just print to a window, and turn it into a pict.
>> Cannot extract text but maybe I could learn...
>
> it depends on how the text was put into the pict.

OK. Is the inaccessible stuff heavily locked?

> edoc turns text into picts, and can then search them.
> but why put the text into picts in the first place?
Every Mac Sys allows handsome pict viewing with minimal programming.

Have you a code scrap or digest ref to prove that
<<futurebasic can print2pict as well, though. just print to a window, and turn it into a pict.>> ?

This technology is still competitive for Mac-only doc as we have for our apps. Additionally the PICTs can convert to PS (pop them into Write Now for example and print to PS) and then to PDF. I am willing to help in the last step for list member public domain & GPL programming doc.

(6)

> and once again, as i've been saying for years,
> thanks for taking on the task of curating these archives...

It's very kind of you to say so. (And kinder still to help by spotting the slipups ;=)

Cheers

Laurent S.

PS. You should all be watching for updated digests when May gets posted next week.
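
A footnote on the "hex-precaution" in (2), for anyone who wants to see the hazard in action. This is only a minimal sketch in Python, assuming the "=hex" ugliness is MIME quoted-printable encoding. The helper names qp_decode and risky_lines are made up for illustration and are not part of any list tool. The first shows what a quoted-printable decoder does to text; the second flags lines that break the rule as stated above (an equal sign followed directly by one of 0123456789ABCDEF or by the end of the line).

```python
import quopri
import re

# The rule from (2): "=" followed directly by 0-9, A-F, or the line end.
RISKY = re.compile(rb"=(?:[0-9A-F]|$)")


def qp_decode(raw):
    """Decode quoted-printable bytes with the standard-library decoder."""
    return quopri.decodestring(raw)


def risky_lines(text):
    """Return 1-based numbers of the lines that break the hex-precaution."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if RISKY.search(line):
            hits.append(number)
    return hits


if __name__ == "__main__":
    # A properly encoded "=" comes back intact.
    print(qp_decode(b"droplist=3D1"))
    # A bare "=" typed before two hex digits is swallowed by the decoder.
    print(qp_decode(b"n=50 items"))
    # A bare "=" at the end of a line is read as a soft line break.
    print(qp_decode(b"long line=\ncontinues"))
    print(risky_lines(b"x = 1\nn=50\nok == fine\nend="))
```

The rule as stated is deliberately stricter than the decoder, which only acts on two hex digits in a row or on a line end, so a flagged line is a warning and not proof of damage.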