[stella] New home for TheDig

Subject: [stella] New home for TheDig
From: Glenn Saunders <mos6507@xxxxxxxxxxx>
Date: Sat, 06 Dec 2003 03:27:51 -0800
This is an announcement to let everyone know that Garon has agreed to pass the torch of The Dig over to me.

I would like to help enhance The Dig and make it a more robust, useful, and current application for homebrewers who are looking to mine for data from the list.

This is meant as an alternative to the browse and search functionality of biglist.

Now, you may all know that Google itself has indexed Stellalist, and through advanced search you can scan through the archives that way.

However, I've also worked out a way to import the Stellalist archive including the full message bodies into a SQL server at homebrewgames.net.

In the import process I am separating out the message body from the header fields and the attachments. Everything is properly deconstructed into its fundamental components, where it can be searched on, filtered, ordered, and redisplayed in new ways. I also have an internal lookup table of member names, with all the email addresses each of us has used over the years properly assigned to them. This can be used, for instance, to display just _my_ messages in the archive even though I've posted from several different email addresses. All of them will be identified as being mine.
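
To make the idea concrete, here is a minimal sketch of that kind of split, assuming Python and its standard email package. The function name, the record fields, and the alias table contents are all hypothetical; this is not the actual importer, which may work quite differently.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical alias table: every address a member has used -> one member name.
ALIASES = {
    "member.a@example.com": "Member A",
    "ma@example.org": "Member A",
}

def deconstruct(raw):
    """Split one raw message into header fields, plain-text body, and attachments."""
    msg = message_from_string(raw)
    body_parts, attachments = [], []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold no content of their own
        filename = part.get_filename()
        if filename:
            # Anything carrying a filename is treated as an attachment.
            attachments.append((filename, part.get_payload(decode=True)))
        elif part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "latin-1"
            body_parts.append(payload.decode(charset, errors="replace"))
    # Normalize the sender address so every alias maps to the same member.
    address = parseaddr(msg.get("From", ""))[1].lower()
    return {
        "subject": msg.get("Subject", ""),
        "date": msg.get("Date", ""),
        "address": address,
        "member": ALIASES.get(address),  # None if the address is unknown
        "body": "\n".join(body_parts),
        "attachments": attachments,
    }
```

With records shaped like this, showing one member's messages is just a filter on the "member" field, no matter which address a given message was sent from.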

(The only thing it really doesn't do is attempt to thread the discussions. I don't know how Omar is doing it but I didn't really want to even try.)

For those of you familiar enough with the biglist archive, yes, I came up with a way to decrypt the email addresses Omar puts in the HTML comments. It helps that I used to run Stellalist and therefore have memorized most of them. I don't have to redisplay the email addresses (so we can protect everyone from spammers) but it helps to have the real emails internally.

I have asked Russ how he feels about my offering a full browsable interface on my site, but I haven't heard back from him yet. I would imagine Omar might have some issues with this as well. I can just offer links back to biglist, but I need all the data there for me to properly index it for the search engine.

Ever since I started using Egroups I have been wanting a fancier, prettier way to browse through the Stellalist archive and it can definitely be done now.

I would also like to see some of the messages here categorized, which would really be a manual process requiring a few of the members here to volunteer their help.

If we really wanted to start using this system collectively then we could all start to categorize our own messages by using smart-tags in the subject or body of our messages.

For instance, something like this:

[CATEGORY]GAME[/CATEGORY]

When I pull the data in I'd parse these, and I could also remove them in the process, at least from my copy of the body.
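
A minimal sketch of that parse-and-strip step, assuming Python and the tag syntax shown above. The function name and the choice to upper-case the category names are assumptions made for illustration only.

```python
import re

# Matches [CATEGORY]...[/CATEGORY]; case-insensitive so [category] also works.
TAG = re.compile(r"\[CATEGORY\](.*?)\[/CATEGORY\]", re.IGNORECASE | re.DOTALL)

def extract_categories(text):
    """Return (list of category names, text with the tags removed)."""
    categories = [m.strip().upper() for m in TAG.findall(text) if m.strip()]
    cleaned = TAG.sub("", text)
    return categories, cleaned
```

The same function could be run over both the subject and the body, with the two category lists merged before storing them alongside the message.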

These categories would be most useful as an alternative to keyword searching. For instance, you could look for all instances of the word GAME in messages, but there may be some cases where the message is talking about someone's game project but the word 'game' doesn't happen to appear. So it's more for generic groupings similar to what Garon was doing with The Dig, snippets, disassembly, etc...

Stellalist has covered so much valuable territory over the years that I see the archive as being of enormous use as a general reference source for us, if we can find ways to properly work with it.

Also, if for whatever reason this list shuts down, we will have a completely functional online read-only mirror of all the data for posterity.

Opinions?

P.S. For the record, I have 14,776 messages and 1,079 attachments indexed, which is current as of a few days ago. I noticed that the earliest messages have attachments embedded in the plaintext body as UUENCODEd text.
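
For anyone curious how such embedded attachments might be pulled out, here is a minimal sketch in Python using the standard binascii module. It assumes well-formed "begin <mode> <filename>" ... "end" blocks; the function name is hypothetical, and real archive data would likely need more forgiving handling.

```python
import binascii
import re

# A uuencoded block starts with "begin <octal mode> <filename>".
BEGIN = re.compile(r"^begin [0-7]{3,4} (.+)$")

def split_uuencoded(body):
    """Return (body text without uuencoded blocks, list of (filename, bytes))."""
    text_lines, attachments = [], []
    name, data = None, None
    for line in body.splitlines():
        if name is None:
            m = BEGIN.match(line)
            if m:
                name, data = m.group(1), bytearray()
            else:
                text_lines.append(line)
        elif line.strip() == "end":
            attachments.append((name, bytes(data)))
            name = None
        elif line.strip() in ("", "`"):
            continue  # zero-length data line that precedes "end"
        else:
            data += binascii.a2b_uu(line)
    # Note: a block with no closing "end" line is silently dropped here.
    return "\n".join(text_lines), attachments
```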





----------------------------------------------------------------------------------------------
Archives (includes files) at http://www.biglist.com/lists/stella/archives/
Unsub & more at http://www.biglist.com/lists/stella/

