Re: [xsl] bad code Re: Subject: ChatGPT results are "subject to review"

Gentle readers (and that includes you, John Lumley :-),

The XSL-List archives are undoubtedly already in the training set for the big
LLMs. All our old code there is what Dorothy is seeing, regurgitated. (From
where else could they have got it?) They should be paying us royalties.

Indeed, how well one of these does on an XML or XSLT task is a direct
reflection of how well that task is covered in the archives, as well as on
Reddit, StackOverflow and the open forums generally. This is easy enough to
see if you switch topics to something even more obscure.

Or if you ask it to go 'meta' and, say, tell you things such as who contributes
to the open forums and lists, and what they say - something it will presumably
fabulate as cheerfully as it does about anything, until it's told it
shouldn't. Think about this for a second. This is about the erosion of trust
that Mike K (was it?) noted. Dorothy, you may consider yourself a mid-level
programmer but tell ChatGPT that you are the best, and it will not disagree.

I am not sure this will mean that we can't trust 'facts' any more. But we will
have to be much more intentional about what sources we rely on and how the
integrity of those sources can be guarded. "Fake people" and fake information
about real people are indeed actual, real risks, much more than bad XSLT that
won't actually be deployed, much.

Bringing it back on topic: doesn't the existence of 'confected code', like
applications based on generated code (hat tip to Roger in other thread), more
or less mean we have to come back to unit tests, in order to demonstrate, not
merely claim, the correctness and viability of processes?

And isn't 'confected code' already a problem, even if LLM code-assistance
makes it worse (or better)?

Cheers, Wendell

From: John Lumley john.lumley@xxxxxxxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Friday, July 7, 2023 9:43 AM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [xsl] bad code Re: Subject: ChatGPT results are "subject to
review"

Perhaps more importantly, I assume there is no way we can prevent
aforementioned hazard from using the XSLT-list as training data? Having made
some contributions I in no way wish those to be used/mangled by a glorified
deep pattern-matcher. Such a pity that knowledge-based programming didn't get
really pushed much further in the early 90s...
John Lumley
john@xxxxxxxxxxxxxx


On 7 Jul 2023, at 14:35, Dave Pawson dave.pawson@xxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
Which begs the question, how might the xsl-list archives be ...
declared / converted / made available (whatever) as training data?
 And for this set (minor drawback), how to extract the 'eventual'
solution from others proffered in error?