How can I quickly find duplicate id entries in an XML file? For example, given:

<entry id="A">...
<entry id="B">...
<entry id="A">...

I would like to output them like this:

id="A" dup 2 times

Just to let you know, I'm a complete beginner: I don't even know how to run code. If you have code for this problem, please at least tell me the name of the software I need to run it, and I'll look it up from there.

Glorfindel
    Please [edit] your question and expand on what it is you are exactly trying to do (including sample input and expected output). In addition, please detail the environment you are using (OS, etc.). Lastly, asking for help without showing any research you have undertaken to solve the problem is generally considered off-topic on Super User, as is directly inviting open-ended suggestions regarding software to accomplish a specific task. – Anaksunaman Nov 29 '19 at 22:14

1 Answer

Here's an XSLT 2.0 stylesheet that does it:

<xsl:transform version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
  <out>
    <xsl:for-each-group select="//entry" group-by="@id">
      <xsl:if test="count(current-group()) > 1">
        <duplicate id="{current-grouping-key()}" count="{count(current-group())}"/>
      </xsl:if>
    </xsl:for-each-group>
  </out>
</xsl:template>

</xsl:transform>

You can run this by, for example, downloading Saxon-HE from SourceForge and running the following from the command line (Java must be installed):

java -jar saxon9he.jar -s:input.xml -xsl:count-dupes.xsl

where input.xml is your XML input and count-dupes.xsl is the stylesheet.

I have formatted the output as XML but of course you can change the output format if you like.
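As one possible variation, here is a minimal sketch of the same stylesheet producing plain text in the shape the question asks for. It assumes the same input structure (`entry` elements with an `id` attribute) and only swaps the XML output for an `xsl:output method="text"` declaration plus a `concat()` call; the exact wording of each line is taken from the question and can be adjusted.

```xml
<xsl:transform version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<!-- Emit plain text rather than XML -->
<xsl:output method="text"/>

<xsl:template match="/">
  <!-- Group every entry element in the document by its id attribute -->
  <xsl:for-each-group select="//entry" group-by="@id">
    <!-- Report only ids that occur more than once -->
    <xsl:if test="count(current-group()) > 1">
      <!-- &quot; is a literal double quote; &#10; is a newline -->
      <xsl:value-of select="concat('id=&quot;', current-grouping-key(), '&quot; dup ', count(current-group()), ' times&#10;')"/>
    </xsl:if>
  </xsl:for-each-group>
</xsl:template>

</xsl:transform>
```

It runs with the same command as above. For a well-formed input containing the three sample entries from the question, this should print one line of the form `id="A" dup 2 times`.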

Michael Kay
  • I got an error running the XSLT example above: `Error at xsl:for-each-group on line 5 column 51 of count-dupes.xsl.orig: XTSE1080: Exactly one of the attributes group-by, group-adjacent, group-starting-with, and group-ending-with must be specified Failed to compile stylesheet. 1 error detected.` The error disappeared after I changed line 5 to: `` – Kirill Mikhailov Jul 07 '23 at 12:23
  • Thanks, corrected. – Michael Kay Jul 10 '23 at 07:31