This is an announcement to let everyone know that Garon has agreed to pass
the torch of The Dig over to me.
I would like to help enhance The Dig and make it a more robust, useful, and
current application for homebrewers who are looking to mine for data from
the list.
This is meant as an alternative to the browse and search functionality of biglist.
Now, you may all know that Google itself has indexed Stellalist, and through
advanced search you can scan through the archives that way.
However, I've also worked out a way to import the Stellalist archive
including the full message bodies into a SQL server at homebrewgames.net.
In the import process I am separating out the message body from the header
fields and the attachments. Each message is properly deconstructed into
its fundamental components, which can then be searched on, filtered,
ordered, and redisplayed in new ways. I also have an internal lookup table of
member names and all the email addresses each of us has used over the years
properly assigned to them. This can be used to, for instance, just display
_my_ messages in the archive even though I've posted from several different
email addresses. All of them will be identified as being mine.
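As a rough illustration of that deconstruction step, here is a minimal Python sketch using only the standard library `email` package. It is not the actual import code: the `ALIASES` table, the `deconstruct` function, the example addresses, and the shape of the returned record are all hypothetical, and a real import would write the pieces into database tables rather than return a dict.

```python
from email import message_from_string, policy
from email.utils import parseaddr

# Hypothetical alias table: every address a member has posted from
# maps to one canonical member name.
ALIASES = {
    "alice@example.com": "Alice Example",
    "alice@old-isp.example": "Alice Example",
}


def deconstruct(raw):
    """Split one raw message into header fields, body, and attachments."""
    msg = message_from_string(raw, policy=policy.default)

    # Plain-text body, if the message has one.
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""

    # Attachments as (filename, raw bytes) pairs; empty for simple messages.
    attachments = [
        (part.get_filename(), part.get_payload(decode=True))
        for part in msg.iter_attachments()
    ]

    # Resolve the sender through the alias table; fall back to the
    # bare address when the sender is not in the table.
    _, addr = parseaddr(str(msg.get("From", "")))
    addr = addr.lower()

    return {
        "member": ALIASES.get(addr, addr),
        "subject": str(msg.get("Subject", "")),
        "date": str(msg.get("Date", "")),
        "body": body,
        "attachments": attachments,
    }
```

Because the lookup goes through the alias table, filtering on `member` rather than on the raw From address is what lets one person's posts be grouped together across several addresses.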
(The only thing it really doesn't do is attempt to thread the
discussions. I don't know how Omar is doing it but I didn't really want to
even try.)
For those of you familiar enough with the biglist archive, yes, I came up
with a way to decrypt the email addresses Omar puts in the HTML
comments. It helps that I used to run Stellalist and therefore have
memorized most of them. I don't have to redisplay the email addresses (so
we can protect everyone from spammers) but it helps to have the real emails
internally.
I have asked Russ how he feels about my offering a full browsable
interface on my site, but I haven't heard back from him yet. I would
imagine Omar might have some issues with this as well. I can just offer
links back to biglist but I need all the data there for me to properly
index it for the search engine.
Ever since I started using Egroups I have been wanting a fancier, prettier
way to browse through the Stellalist archive and it can definitely be done now.
I would also like to see some of the messages here categorized, which
would really be a manual process requiring a few of the members here to
volunteer their help.
If we really wanted to start using this system collectively then we could
all start to categorize our own messages by using smart-tags in the subject
or body of our messages.
For instance, something like this:
[CATEGORY]GAME[/CATEGORY]
When I pull the data in, I'd parse these tags, and I could also remove them
in the process, at least from my copy of the body.
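A minimal Python sketch of that parse-and-strip step might look like the following. The `[CATEGORY]...[/CATEGORY]` form comes from the example above; the function name `extract_categories`, the case-insensitive matching, and the normalization to upper case are assumptions made for illustration only.

```python
import re

# Matches [CATEGORY]...[/CATEGORY]; case-insensitive matching is an assumption.
TAG_RE = re.compile(r"\[CATEGORY\](.*?)\[/CATEGORY\]", re.IGNORECASE | re.DOTALL)


def extract_categories(text):
    """Return (categories, text with the tags removed)."""
    found = (name.strip().upper() for name in TAG_RE.findall(text))
    categories = [name for name in found if name]  # drop empty tags
    cleaned = TAG_RE.sub("", text)
    return categories, cleaned
```

The same function could be run over both the subject and the body, with the two category lists merged before storing them.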
These categories would be most useful as an alternative to keyword
searching. For instance, you could look for all instances of the word GAME
in messages, but there may be some cases where the message is talking about
someone's game project but the word 'game' doesn't happen to appear. So
it's more for generic groupings, similar to what Garon was doing with The
Dig, snippets, disassembly, etc.
Stellalist has covered so much valuable territory over the years that I see
the archive as being of enormous use as a general reference source for us,
if we can find ways to properly work with it.
Also, if for whatever reason this list shuts down, we will have a
completely functional online read-only mirror of all the data for posterity.
Opinions?
P.S. For the record, I have 14,776 messages and 1,079 attachments indexed,
which is current as of a few days ago. I noticed that the earliest
messages have attachments embedded in the plaintext body as uuencoded text.
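For those inline attachments, one possible approach is to scan the body for `begin <mode> <filename>` ... `end` blocks and decode them line by line. The Python sketch below assumes well-formed uuencoded blocks and uses the standard library `binascii.a2b_uu`; the function name `extract_uuencoded` is hypothetical, and output from sloppy encoders is not handled.

```python
import binascii


def extract_uuencoded(body):
    """Pull uuencoded blocks out of a plain-text body.

    Returns (body text without the blocks, list of (filename, bytes)).
    """
    kept, files = [], []
    name, chunks = None, []
    for line in body.splitlines():
        if name is None:
            # Look for a header line such as "begin 644 demo.bin".
            parts = line.split(" ", 2)
            if len(parts) == 3 and parts[0] == "begin" and parts[1].isdigit():
                name, chunks = parts[2].strip(), []
            else:
                kept.append(line)
        elif line.strip() == "end":
            files.append((name, b"".join(chunks)))
            name = None
        elif line.strip():
            # Each encoded line carries its own length in its first character.
            chunks.append(binascii.a2b_uu(line))
    return "\n".join(kept), files
```

A block that is never closed by an `end` line is silently dropped in this sketch; a real import would probably want to log those messages for manual review.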
----------------------------------------------------------------------------------------------
Archives (includes files) at http://www.biglist.com/lists/stella/archives/
Unsub & more at http://www.biglist.com/lists/stella/