Subject: [xsl] Streaming and mapping plain text
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxx>
Date: Tue, 17 Sep 2013 15:25:09 -0400
Hi,

Like Roger, I have some questions about streaming in XSLT 3.0.

Consider the classic problem of mapping CSV into XML. Assume we have files over 1GB in size, so we wish to stream. Assume also that the lines in the CSV input need to be grouped -- the outputs will be XML documents containing data sets from adjacent lines, based on common values in a designated field in those lines. But which field this is has to be parameterized, because not every CSV input will have this cell in the same place in the row.

We can easily map each line to a sequence of cell elements:

  <line><cell>1</cell><cell>2</cell><cell>3</cell></line>

Since we know the mapping we wish to use, we can also mark the cell we wish to group on:

  <line><keycell>1</keycell><cell>2</cell><cell>3</cell></line>

(Maybe next time the second cell, not the first, will be keycell.)

Then we can use group-adjacent="keycell" over the lines to collect our sequences of lines.

My question is: how can this be streamed most effectively?

If I can stream to a stream, maybe the best way is first to stream the lines, applying the mapping I need to generate XML, and then stream those lines into the sequences of grouped lines.

If, however, I can only stream the plain text input through, and cannot stream the lines I generate in my first pass (with cells marked) into the second pass (to group the lines), then I need to collect the lines first. In that case the grouping is based not on the value of 'keycell' (which isn't known yet) but on (say) tokenize(.,$delimiter)[$pos], where $pos is the position among the cells of 'keycell' for this mapping.

Any advice or ideas would be welcome.

Thanks!
Wendell

Wendell Piez | http://www.wendellpiez.com
XML | XSLT | electronic publishing
Eat Your Vegetables
_____oo_________o_o___ooooo____ooooooo_^
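
For illustration, the single-pass alternative mentioned in the message (grouping the raw text lines on tokenize(.,$delimiter)[$pos] and only then marking the cells) might look roughly like the sketch below. It is not a tested answer to the streaming question. The parameter name csv-uri and the wrapper elements data and group are hypothetical; the input is assumed to be naive CSV (no quoted fields containing the delimiter); and $delimiter is treated as a regular expression, as tokenize() requires. Whether a given processor reads unparsed-text-lines() lazily rather than loading the whole file is implementation-dependent and would need to be verified.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Hypothetical parameters: location of the CSV file, cell delimiter
       (a regex), and the 1-based position of the grouping cell. -->
  <xsl:param name="csv-uri" as="xs:string"/>
  <xsl:param name="delimiter" as="xs:string" select="','"/>
  <xsl:param name="pos" as="xs:integer" select="1"/>

  <xsl:output indent="yes"/>

  <!-- Entry point when the transformation is started without a source document. -->
  <xsl:template name="xsl:initial-template">
    <data>
      <!-- Group adjacent non-blank lines on the raw text of the key cell.
           string() yields '' when a line has fewer than $pos cells, so the
           grouping key is always exactly one atomic value. -->
      <xsl:for-each-group
          select="unparsed-text-lines($csv-uri)[normalize-space()]"
          group-adjacent="string(tokenize(., $delimiter)[$pos])">
        <group key="{current-grouping-key()}">
          <xsl:for-each select="current-group()">
            <line>
              <!-- Map each cell, naming the cell at position $pos 'keycell'. -->
              <xsl:for-each select="tokenize(., $delimiter)">
                <xsl:element name="{if (position() eq $pos) then 'keycell' else 'cell'}">
                  <xsl:value-of select="."/>
                </xsl:element>
              </xsl:for-each>
            </line>
          </xsl:for-each>
        </group>
      </xsl:for-each-group>
    </data>
  </xsl:template>

</xsl:stylesheet>
```

With pos set to 2, the second cell of each line would become keycell instead of the first. Writing each group to its own output file (for example with xsl:result-document) is left out of the sketch.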