java - How to flatten an XML file into a set of XPath expressions?
Consider the following example XML file:
<ns1:create xmlns:ns1='http://predic8.com/wsdl/material/articleservice/1/'>
  <article xmlns:ns1='http://predic8.com/material/1/'>
    <name xmlns:ns1='http://predic8.com/material/1/'>foo</name>
    <description xmlns:ns1='http://predic8.com/material/1/'>bar</description>
    <price xmlns:ns1='http://predic8.com/common/1/'>
      <amount xmlns:ns1='http://predic8.com/common/1/'>00.00</amount>
      <currency xmlns:ns1='http://predic8.com/common/1/'>usd</currency>
    </price>
    <id xmlns:ns1='http://predic8.com/material/1/'>1</id>
  </article>
</ns1:create>
What is the best (most efficient) way to flatten this into a set of XPath expressions? Note also: I want to ignore any namespace and attribute information. (If needed, this can be done as a pre-processing step.)
So I want this output:
/create/article/name
/create/article/description
/create/article/price/amount
/create/article/price/currency
/create/article/id
I'm implementing this in Java.
EDIT: PS, I might also need this to work when there is no data at the text node. For example, the following should generate the same output as above:
<ns1:create xmlns:ns1='http://predic8.com/wsdl/material/articleservice/1/'>
  <article xmlns:ns1='http://predic8.com/material/1/'>
    <name />
    <description />
    <price xmlns:ns1='http://predic8.com/common/1/'>
      <amount />
      <currency xmlns:ns1='http://predic8.com/common/1/'></currency>
    </price>
    <id xmlns:ns1='http://predic8.com/material/1/'></id>
  </article>
</ns1:create>
You can do this pretty easily with XSLT. Looking at your examples, it seems you only want the XPath of elements that contain text. If that's not the case, let me know and I can update the XSLT.
I created a new input example to show how it handles siblings with the same name. In this case, <article>.
XML Input
<ns1:create xmlns:ns1='http://predic8.com/wsdl/material/articleservice/1/'>
  <article xmlns:ns1='http://predic8.com/material/1/'>
    <name xmlns:ns1='http://predic8.com/material/1/'>foo</name>
    <description xmlns:ns1='http://predic8.com/material/1/'>bar</description>
    <price xmlns:ns1='http://predic8.com/common/1/'>
      <amount xmlns:ns1='http://predic8.com/common/1/'>00.00</amount>
      <currency xmlns:ns1='http://predic8.com/common/1/'>usd</currency>
    </price>
    <id xmlns:ns1='http://predic8.com/material/1/'>1</id>
  </article>
  <article xmlns:ns1='http://predic8.com/material/2/'>
    <name xmlns:ns1='http://predic8.com/material/2/'>some name</name>
    <description xmlns:ns1='http://predic8.com/material/2/'>some description</description>
    <price xmlns:ns1='http://predic8.com/common/2/'>
      <amount xmlns:ns1='http://predic8.com/common/2/'>00.01</amount>
      <currency xmlns:ns1='http://predic8.com/common/2/'>usd</currency>
    </price>
    <id xmlns:ns1='http://predic8.com/material/2/'>2</id>
  </article>
</ns1:create>
XSLT 1.0
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <!-- Suppress the built-in rule that copies text nodes to the output. -->
  <xsl:template match="text()"/>

  <!-- Emit a path for every element that has a text node child. -->
  <xsl:template match="*[text()]">
    <xsl:call-template name="genpath"/>
    <xsl:apply-templates select="node()|@*"/>
  </xsl:template>

  <!-- Walk up through the parents, prepending one indexed step per level. -->
  <xsl:template name="genpath">
    <xsl:param name="prevpath"/>
    <xsl:variable name="currpath" select="concat('/',local-name(),'[', count(preceding-sibling::*[name() = name(current())])+1,']',$prevpath)"/>
    <xsl:for-each select="parent::*">
      <xsl:call-template name="genpath">
        <xsl:with-param name="prevpath" select="$currpath"/>
      </xsl:call-template>
    </xsl:for-each>
    <!-- Only the outermost element prints the finished path. -->
    <xsl:if test="not(parent::*)">
      <xsl:value-of select="$currpath"/>
      <xsl:text>&#xA;</xsl:text>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
Output
/create[1]/article[1]/name[1]
/create[1]/article[1]/description[1]
/create[1]/article[1]/price[1]/amount[1]
/create[1]/article[1]/price[1]/currency[1]
/create[1]/article[1]/id[1]
/create[1]/article[2]/name[1]
/create[1]/article[2]/description[1]
/create[1]/article[2]/price[1]/amount[1]
/create[1]/article[2]/price[1]/currency[1]
/create[1]/article[2]/id[1]
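Since the question mentions Java: a minimal sketch of running the stylesheet above through the JDK's built-in JAXP transformer. The class name XsltPathRunner, the transform method and the tiny test document are made up for illustration; only the stylesheet text comes from this answer, inlined as a string so the example is self-contained.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltPathRunner {

    // The stylesheet from the answer, inlined (illustrative; it could also be loaded from a file).
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:strip-space elements='*'/>"
      + "<xsl:template match='text()'/>"
      + "<xsl:template match='*[text()]'>"
      + "<xsl:call-template name='genpath'/>"
      + "<xsl:apply-templates select='node()|@*'/>"
      + "</xsl:template>"
      + "<xsl:template name='genpath'>"
      + "<xsl:param name='prevpath'/>"
      + "<xsl:variable name='currpath' select=\"concat('/',local-name(),'[',"
      + "count(preceding-sibling::*[name() = name(current())])+1,']',$prevpath)\"/>"
      + "<xsl:for-each select='parent::*'>"
      + "<xsl:call-template name='genpath'>"
      + "<xsl:with-param name='prevpath' select='$currpath'/>"
      + "</xsl:call-template>"
      + "</xsl:for-each>"
      + "<xsl:if test='not(parent::*)'>"
      + "<xsl:value-of select='$currpath'/>"
      + "<xsl:text>&#xA;</xsl:text>"
      + "</xsl:if>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML text and returns the paths, one per line.
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical input: two same-named siblings plus one nested leaf.
        System.out.print(transform("<a><b>x</b><b>y</b><c><d>z</d></c></a>"));
    }
}
```

If the same stylesheet is applied to many documents, compiling it once with TransformerFactory.newTemplates and creating a Transformer per run is usually the cheaper route.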
Update
For the XSLT to work on all elements, remove the [text()] predicate from match="*[text()]". This will output a path for every element. If you don't want a path output for elements that contain other elements (like create, article, and price), add the predicate [not(*)]. Here's an updated example:
New XML Input
<ns1:create xmlns:ns1='http://predic8.com/wsdl/material/articleservice/1/'>
  <article xmlns:ns1='http://predic8.com/material/1/'>
    <name />
    <description />
    <price xmlns:ns1='http://predic8.com/common/1/'>
      <amount />
      <currency xmlns:ns1='http://predic8.com/common/1/'></currency>
    </price>
    <id xmlns:ns1='http://predic8.com/material/1/'></id>
  </article>
  <article xmlns:ns1='http://predic8.com/material/2/'>
    <name xmlns:ns1='http://predic8.com/material/2/'>some name</name>
    <description xmlns:ns1='http://predic8.com/material/2/'>some description</description>
    <price xmlns:ns1='http://predic8.com/common/2/'>
      <amount xmlns:ns1='http://predic8.com/common/2/'>00.01</amount>
      <currency xmlns:ns1='http://predic8.com/common/2/'>usd</currency>
    </price>
    <id xmlns:ns1='http://predic8.com/material/2/'>2</id>
  </article>
</ns1:create>
XSLT 1.0
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="text()"/>

  <!-- Emit a path for every element that has no child elements (empty or not). -->
  <xsl:template match="*[not(*)]">
    <xsl:call-template name="genpath"/>
    <xsl:apply-templates select="node()"/>
  </xsl:template>

  <xsl:template name="genpath">
    <xsl:param name="prevpath"/>
    <xsl:variable name="currpath" select="concat('/',local-name(),'[', count(preceding-sibling::*[name() = name(current())])+1,']',$prevpath)"/>
    <xsl:for-each select="parent::*">
      <xsl:call-template name="genpath">
        <xsl:with-param name="prevpath" select="$currpath"/>
      </xsl:call-template>
    </xsl:for-each>
    <xsl:if test="not(parent::*)">
      <xsl:value-of select="$currpath"/>
      <xsl:text>&#xA;</xsl:text>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
Output
/create[1]/article[1]/name[1]
/create[1]/article[1]/description[1]
/create[1]/article[1]/price[1]/amount[1]
/create[1]/article[1]/price[1]/currency[1]
/create[1]/article[1]/id[1]
/create[1]/article[2]/name[1]
/create[1]/article[2]/description[1]
/create[1]/article[2]/price[1]/amount[1]
/create[1]/article[2]/price[1]/currency[1]
/create[1]/article[2]/id[1]
If you remove the [not(*)] predicate, the output looks like this (a path is output for every element):
/create[1]
/create[1]/article[1]
/create[1]/article[1]/name[1]
/create[1]/article[1]/description[1]
/create[1]/article[1]/price[1]
/create[1]/article[1]/price[1]/amount[1]
/create[1]/article[1]/price[1]/currency[1]
/create[1]/article[1]/id[1]
/create[1]/article[2]
/create[1]/article[2]/name[1]
/create[1]/article[2]/description[1]
/create[1]/article[2]/price[1]
/create[1]/article[2]/price[1]/amount[1]
/create[1]/article[2]/price[1]/currency[1]
/create[1]/article[2]/id[1]
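To see what the three match variants select, here is a small sketch that evaluates the same predicates with the JDK's javax.xml.xpath API. The class name PredicateDemo, the select helper and the sample document are hypothetical. One caveat: plain XPath has no equivalent of xsl:strip-space, so the sample has no whitespace between elements; with indentation, //*[text()] would also match the container elements.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PredicateDemo {

    // Returns the names of the elements selected by the expression, in document order.
    static List<String> select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeName());
            }
            return names;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample: one leaf with text, two empty leaves, three containers.
        String xml = "<create><article><name>foo</name><description/>"
                   + "<price><amount/></price></article></create>";
        System.out.println(select(xml, "//*[text()]")); // [name]
        System.out.println(select(xml, "//*[not(*)]")); // [name, description, amount]
        System.out.println(select(xml, "//*"));         // [create, article, name, description, price, amount]
    }
}
```

The empty leaves (description, amount) only show up with [not(*)], which is why that predicate covers the "no data at the text node" case from the question.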
Here's another version of the XSLT that is 65% faster:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="text()"/>

  <!-- For each leaf element, print one indexed step per ancestor-or-self element. -->
  <xsl:template match="*[not(*)]">
    <xsl:for-each select="ancestor-or-self::*">
      <xsl:value-of select="concat('/',local-name(),'[',count(preceding-sibling::*[local-name()=local-name(current())])+1,']')"/>
    </xsl:for-each>
    <xsl:text>&#xA;</xsl:text>
    <xsl:apply-templates select="node()"/>
  </xsl:template>
</xsl:stylesheet>
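If you would rather skip XSLT entirely, the same leaf paths can be produced with a plain DOM walk in Java. This is only a sketch under a few assumptions: the class name LeafPaths and its methods are made up, siblings are counted by local name (as in the stylesheet above), and it has not been benchmarked against any of the XSLT versions.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class LeafPaths {

    // Parses the XML text and returns one indexed path per element without child elements.
    static List<String> leafPaths(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // so getLocalName() drops the ns1: prefix
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> paths = new ArrayList<>();
            Element root = doc.getDocumentElement();
            walk(root, "/" + root.getLocalName() + "[1]", paths);
            return paths;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Depth-first walk; 'path' is the already-built path of 'element'.
    private static void walk(Element element, String path, List<String> paths) {
        Map<String, Integer> seen = new HashMap<>(); // occurrences of each child name so far
        boolean hasChildElement = false;
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // text, comments and so on do not contribute path steps
            }
            hasChildElement = true;
            String name = child.getLocalName();
            int index = seen.merge(name, 1, Integer::sum); // 1-based position among same-named siblings
            walk((Element) child, path + "/" + name + "[" + index + "]", paths);
        }
        if (!hasChildElement) {
            paths.add(path); // leaf element, with or without text
        }
    }

    public static void main(String[] args) {
        // Hypothetical, shortened variant of the input above.
        String xml = "<ns1:create xmlns:ns1='urn:example'><article><name>foo</name>"
                   + "<price><amount/></price></article><article><name/></article></ns1:create>";
        leafPaths(xml).forEach(System.out::println);
    }
}
```

Attributes and namespace declarations are never visited here, so they are ignored without a pre-processing step. To get the unindexed paths from the question, drop the "[" + index + "]" part and collect the results into a LinkedHashSet to remove duplicates.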