Subject: Re: [xsl] Is letting the browser transform XML to XHTML using XSLT a good choice?
From: "M. David Peterson" <xmlhacker@xxxxxxxxx>
Date: Sun, 5 Mar 2006 13:37:12 -0700

Hmmm... we're talking about two different things. Sure, Google will locate the XML file, parse it, and run it through its text-processing algorithms, extracting and sorting the information it deems appropriate. However, raw XML has no real type per se. A defined XML format such as XHTML carries a level of understood document structure about which assumptions can be accurately made: for example the title, the section headers (tags h1-h6), and keywords if they are included and correctly labeled in a meta tag. The same can be said of Atom data feeds, the Open Document Format, or any number of XML-based formats from which human-understandable information can be extracted, stored in the search engine's internal database, and used by its query engine to determine relevancy to a particular search phrase.

But raw XML that follows no specification other than the XML 1.x specification it was built against makes it MUCH MORE difficult to reliably extract qualified information. That doesn't mean Google can't parse the text of the document and pull out what seems to be relevant information, but the chances of that information ever making it to the eyeballs of a human performing a search are as close to nil as you can get and still not be nil.

Of course, if you do a specific document type search for XML documents you'll find LOTS of them. There just isn't any real presence of human-understandable data elements that can accurately be displayed. Maybe they will get lucky here and there, but you can't build a high-quality search engine whose foundation rests on the chance that logical data could be extracted. Therefore you're not going to find all that many documents of type XML that are not in an understood XML format anywhere near the first X pages that the average human performing a search on Google will look at before giving up.
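
As a rough illustration of the "understood document structure" point, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The function name `extract_known_fields` and both sample documents are made up for this example, and nothing here claims to reflect how Google or any other search engine actually processes XML. It only shows that a consumer who recognizes the XHTML namespace knows where to look for a title, headings, and keywords, while a generic XML document offers no such landmarks.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Qualified names of the six XHTML heading elements, h1 through h6.
HEADING_TAGS = {f"{{{XHTML_NS}}}h{n}" for n in range(1, 7)}


def extract_known_fields(xml_text):
    """Return title, headings and keywords for XHTML input, else None.

    Hypothetical helper: a document whose root is not an XHTML <html>
    element gives us no agreed-upon places to look, so we give up.
    """
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{XHTML_NS}}}html":
        return None

    ns = {"h": XHTML_NS}
    title = root.findtext("h:head/h:title", default="", namespaces=ns).strip()

    # Walk the tree in document order and keep the text of every heading.
    headings = [
        "".join(el.itertext()).strip()
        for el in root.iter()
        if el.tag in HEADING_TAGS
    ]

    # Keywords are only usable if they sit in a correctly labeled meta tag.
    keywords = []
    for meta in root.findall("h:head/h:meta[@name='keywords']", ns):
        content = meta.get("content", "")
        keywords.extend(k.strip() for k in content.split(",") if k.strip())

    return {"title": title, "headings": headings, "keywords": keywords}


# Made-up sample documents, for illustration only.
xhtml_doc = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Example page</title>
    <meta name="keywords" content="xml, xslt" />
  </head>
  <body>
    <h1>Family tree</h1>
    <p>Some text.</p>
    <h2>Sources</h2>
  </body>
</html>"""

generic_doc = "<records><rec a='1'>Family tree</rec></records>"

if __name__ == "__main__":
    print(extract_known_fields(xhtml_doc))
    print(extract_known_fields(generic_doc))
```

The generic document contains the same words ("Family tree"), but without a known vocabulary there is no way to tell whether they are a heading, a label, or a data value.
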
I'm not absolutely certain what that number of result pages is at this moment in time, but I'm sure someone knows, and I'm guessing it's probably less than 100.

Of course, if you do a site-specific search, and that particular site ONLY has XML documents, then obviously all you will find in return is XML documents. Then again, the only way Google is going to find those documents in the first place is if it can extract links from at least one known document to begin the spidering process. And to be honest, I can't say for sure whether they even bother to search for links inside generic XML documents.

Q: Without turning this into a conversation on "Theories of Google Search Algorithms" (Please don't... I'm already in enough trouble with Tommie from this weekend's adventures as it is ;) :D ) -- does anybody know for sure what Google, Yahoo!, MSN Search, and/or any of the data-feed-specific search companies such as PubSub, Syndic8, and Technorati will parse for links and what they will not? (NOTE: I am almost certain that the data-feed-specific search engines are just that, data-feed specific. But that too is a guess.)

On 3/5/06, Didier PH Martin <martind@xxxxxxxxxxxxx> wrote:
> Hello Manfred,
>
> Manfred said:
> I'm stunned that most of you seem to believe that Google ignores XML pages
> and you have to transform the XML server-side to feed the search engine.
> For evidence of the contrary try the search:
> staudinger site:free.pages.at filetype:xml
>
> Didier replies:
> Thanks Manfred, this is clear evidence that Google processes XML documents.
> Do you have any other clues that Google associates these XML documents with
> keywords or key phrases? This would help us to discover some truth in this
> world full of myths and preconceived ideas.
>
> I have another question as well: do some of you use parameter passing to
> inject a particular context into XSLT templates?
>
> Because all of our processing occurs on the client side, we make massive use
> of parameter passing to XSLT stylesheets, either through a URL, through <frame>
> or <iframe> elements, or through ECMAScript function calls. We have found this
> parameter-passing mechanism tremendously useful. I wonder whether others doing
> server-side transforms also use parameter passing to inject a particular
> context. If so, what do they use? We can probably learn a bit more about other
> ways to do things and expand our horizons.
>
> One other thing I would like to know: we use XML encoding and XSLT processing
> to create model-driven applications. We have found XSLT an excellent language
> for performing model-to-model transformation. Is anybody else doing work on
> model-driven applications using XSLT as a model-to-model transformation
> language? Are the models more sophisticated than just labeled data, or do they
> also include frame-based (like RDF) or object-based (like OO) encapsulated
> data?
>
> Cheers
> Didier PH Martin
>

--
<M:D/>
M. David Peterson
http://www.xsltblog.com/