Re: [xsl] text replacement with mixed content

Thanks, yes, that's good. I did have a look at this, but Test 7 has
thrown me. I don't want to say it's impossible, but it's definitely
beyond what I could do in the time available.
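
For the purely textual half of the problem (Tests 1, 4 and 5 with the
markup stripped out), a minimal Python sketch of the matching might look
like the one below. It is only an illustration under stated assumptions,
not a solution to the mixed-content cases such as Test 7. The names
build_pattern and revise_text are hypothetical; matching is assumed to be
case-insensitive, any run of whitespace is assumed to match one space in
@original, and both U+00AD and the '-' stand-in used in Test 5 are assumed
to be ignorable inside a word. No word-boundary check is attempted.

```python
import re

# Characters that may interrupt a word without breaking the match.
# Assumption: the soft hyphen U+00AD plus the '-' stand-in from Test 5.
IGNORABLE = r"[\u00ad\-]*"


def build_pattern(original: str) -> re.Pattern:
    """Turn the @original string into a tolerant regular expression."""
    words = original.split()
    if not words:
        raise ValueError("original must contain at least one word")
    # Allow ignorable characters between the letters of each word.
    word_res = [IGNORABLE.join(re.escape(ch) for ch in word) for word in words]
    # Any run of whitespace (including U+00A0) may separate the words.
    return re.compile(r"\s+".join(word_res), re.IGNORECASE)


def revise_text(text: str, original: str, revision: str) -> str:
    """Wrap every match of original in plain text with <rev>revision</rev>."""
    # A function replacement keeps backslashes in revision literal.
    return build_pattern(original).sub(lambda m: f"<rev>{revision}</rev>", text)
```

This works on plain strings only, so it says nothing about which elements
to close, drop or reopen around a match; that part is what the later tests
are really exercising.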



On 31 August 2011 16:01, Geert Bormans <geert@xxxxxxxxxxxxxxxxxxx> wrote:
> Hi Andrew,
>
> Thanks for your patience and continuing attention.
>
> I tried to show in words that the test XML already provided is a
> good collection of tests and that its results are correct (since you
> were doubting that).
>
> I have now made a test set.
> For each test in the test set, there is an <in> saying what the task is
> (the pattern to be replaced is @original, the replacement is @revision)
> and an <out> saying what it should become.
>
> I think this is complete now;
> I hope this is what you were after.
>
> cheers
>
> Geert
>
> <?xml version="1.0" encoding="UTF-8"?>
> <test-set>
>    <test id="1">
>        <in original="foo" revision="bar" >
>            <p>This old foo is breaking down my foo</p>
>        </in>
>        <out>
>            <p>This old <rev>bar</rev> is breaking down my <rev>bar</rev></p>
>        </out>
>    </test>
>    <test id="2">
>        <in original="foo" revision="bar" >
>            <p>This old <b>foo</b> is breaking down <i>my foo</i></p>
>        </in>
>        <out>
>            <p>This old <b><rev>bar</rev></b> is breaking down <i>my <rev>bar</rev></i></p>
>        </out>
>    </test>
>    <test id="3">
>        <in original="old foo" revision="new bar" >
>            <p>This old <b>foo</b> is breaking down <i>my old foo</i></p>
>        </in>
>        <out>
>            <p>This <rev>new bar</rev> is breaking down <i>my <rev>new bar</rev></i></p>
>        </out>
>    </test>
>    <test id="4">
>        <in original="old foo" revision="new bar" >
>            <p><b>This old foo</b> is breaking down <i>my old frog</i></p>
>        </in>
>        <out>
>            <p><b>This <rev>new bar</rev></b> is breaking down <i>my old frog</i></p>
>        </out>
>    </test>
>    <test id="5">
>        <in original="old foo" revision="new bar" >
>            <p><b>This old    fo-o</b> is breaking down <i>my OLD FOO</i></p>
>        </in>
>        <out>
>            <p><b>This <rev>new bar</rev></b> is breaking down <i>my <rev>new bar</rev></i></p>
>        </out>
>    </test>
>    <test id="6">
>        <in original="this old foo is breaking" revision="a new bar is building" >
>            <p>This old foo did not realize that <b>this </b>old <i><inner/><old-rev rev="1">foo</old-rev></i> is <empty/> breaking <i>this old foo</i></p>
>        </in>
>        <out>
>            <p>This old foo did not realize that <rev>a new bar is building</rev> <i>this old foo</i></p>
>        </out>
>    </test>
>    <test id="7">
>        <in original="this old foo is breaking" revision="a new bar is building" >
>            <p><b type="stronger">I <i>did not realize that this </i></b>old foo is breaking <i>this old foo</i></p>
>        </in>
>        <out>
>            <p><b type="stronger">I <i>did not realize that </i></b><rev>a new bar is building</rev> <i>this old foo</i></p>
>        </out>
>    </test>
> </test-set>
>
> At 16:24 31/08/2011, you wrote:
>>
>> OK, what you've done there is write a load of words... 5 people
>> writing test cases from that could have 5 different interpretations.
>> For example, this just sounds like a riddle:
>>
>> "markup that starts and ends in A can be dropped
>> markup that starts in A and ends outside A (with the exception of
>> markup ending right after closing A) must be forced to reopen"
>>
>> What's needed is an unambiguous set of input and output combinations
>> of just XML, one for each variation of markup, with a short
>> description if it's not obvious.  That does 2 things: it explains the
>> problem to other people in the best way, and, provided the tests
>> cover all the areas, you know you are done when all the tests pass.
>>
>>
>>
>> On 31 August 2011 14:58, Geert Bormans <geert@xxxxxxxxxxxxxxxxxxx> wrote:
>> > Let me summarize some rules for markup, to show I am not throwing away
>> > all markup (but do indeed throw away some of it).
>> >
>> > There is a pattern A that needs to be replaced with a replacement B.
>> > Pattern A is given as a string value that triggers replacement, but
>> > interfering markup can break up that string in the actual document.
>> > A and B can contain multiple words.
>> > A can contain markup; B is purely text.
>> > B will be wrapped in an element to indicate it is a replacement
>> > (revision).
>> > Markup that ends in A but started before A (with the exception of
>> > markup that started right before A) will be forced to close prior to
>> > A's replacement.
>> > E.g., given that "my foo" needs to become "your bar",
>> > <p><b>here is some bolded text that my </b> foo</p>
>> > will become
>> > <p><b>here is some bolded text that </b><rev>your bar</rev></p>
>> >
>> > but
>> > <p>here is some bolded text that <b>my </b> foo</p>
>> > will become
>> > <p>here is some bolded text that <rev>your bar</rev></p>
>> >
>> > this is all about revisions, and the tricky part is whether or not to
>> > maintain earlier revisions
>> >
>> > markup that starts and ends in A can be dropped
>> > markup that starts in A and ends outside A (with the exception of
>> > markup ending right after closing A) must be forced to reopen
>> > there is a predictable boundary (p in this example); an A should not
>> > cross that boundary
>> > markup in A does not break words
>> > soft hyphens and non-breaking spaces (indicated by '-' in the example)
>> > can break "words"
>> >
>> > Hope this helps. I am pretty confident that the example covers most of
>> > this and that the result is what I need.
>> >
>> >> OK, I'm not totally convinced by that expected output. If you are
>> >> dropping that level of markup (such as the <j>), then you are heading
>> >> towards just stripping all the markup...
>> >>
>> >> If possible, do some actual real-world examples, with solid expected
>> >> results.  When the task is "non-trivial" like this one, you really
>> >> don't want mistakes in the expected results.
>> >>
>> >>
>> >> --
>> >> Andrew Welch
>> >> http://andrewjwelch.com
>> >
>> >
>>
>>
>>
>> --
>> Andrew Welch
>> http://andrewjwelch.com
>
>



--
Andrew Welch
http://andrewjwelch.com