XACT
Type-Safe XML Transformations in Java
Copyright © 2009 Anders Møller <[email protected]>

What is XACT?
XACT is an API for writing XML transformations in Java
Building on Java, we get
- the strength of a general-purpose programming language
- a rich and well-known standard library
XACT adds the following features:
- XML data as first-class Java values, with high-level operations for data manipulation
- efficient runtime model
- compile-time validation of transformed XML documents

XML templates
XML data is represented as templates
- well-formed XML fragments
- contain gaps (named / Java expressions)
- first-class values
- immutable

    XML x = [[
      <h:html>
        <h:head>
          <h:title><[TITLE]></h:title>
        </h:head>
        <h:body bgcolor={color}>
          <h:h1><[TITLE]></h:h1>
          <[NAME]>
        </h:body>
      </h:html> ]];

Operations on XML templates
- parseTemplate – constructs template from constant string (syntactic sugar: [[ ]])
- parseDocument – imports XML data
- toTemplate/toDocument – exports XML data
- plug – inserts templates or strings into gaps
- get – selects subtemplates (using XPath)
- gapify – converts subtrees to gaps
- validate – runtime check of validity (like type cast)
- analyze – compile-time check for validity
- ...
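
To make the idea of immutable templates with named gaps concrete, here is a minimal sketch in plain Java. It is not the XACT implementation: `ToyTemplate`, `parse` and its string-based representation are hypothetical names invented for this illustration, and the sketch does not check well-formedness or handle namespaces. It only shows that `plug` returns a new value and leaves the original untouched.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for an immutable template; NOT the XACT implementation. */
final class ToyTemplate {
    /** One piece of a template: literal text, or a named gap. */
    private static final class Part {
        final String value;  // the text, or the gap name
        final boolean gap;
        Part(String value, boolean gap) { this.value = value; this.gap = gap; }
    }

    private final List<Part> parts;

    private ToyTemplate(List<Part> parts) {
        this.parts = List.copyOf(parts); // unmodifiable snapshot
    }

    /** Splits the source at gaps written as <[NAME]>. */
    static ToyTemplate parse(String source) {
        List<Part> parts = new ArrayList<>();
        int pos = 0;
        while (true) {
            int open = source.indexOf("<[", pos);
            int close = open < 0 ? -1 : source.indexOf("]>", open);
            if (close < 0) { // no further complete gap: the rest is plain text
                parts.add(new Part(source.substring(pos), false));
                return new ToyTemplate(parts);
            }
            parts.add(new Part(source.substring(pos, open), false));
            parts.add(new Part(source.substring(open + 2, close), true));
            pos = close + 2;
        }
    }

    /** Returns a new template with every gap called name replaced by literal text. */
    ToyTemplate plug(String name, String text) {
        List<Part> single = new ArrayList<>();
        single.add(new Part(text, false));
        return plug(name, new ToyTemplate(single));
    }

    /** Returns a new template with every gap called name replaced by another template. */
    ToyTemplate plug(String name, ToyTemplate value) {
        List<Part> result = new ArrayList<>();
        for (Part p : parts) {
            if (p.gap && p.value.equals(name)) result.addAll(value.parts);
            else result.add(p);
        }
        return new ToyTemplate(result);
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        for (Part p : parts) sb.append(p.gap ? "<[" + p.value + "]>" : p.value);
        return sb.toString();
    }

    public static void main(String[] args) {
        ToyTemplate wrapper = ToyTemplate.parse("<h1><[TITLE]></h1><[MAIN]>");
        ToyTemplate page = wrapper.plug("TITLE", "My Phone List");
        System.out.println(page);    // TITLE is filled, MAIN is still a gap
        System.out.println(wrapper); // the original template is unchanged
    }
}
```

Because a plugged template may itself contain gaps, the same operation also covers the incremental list-building pattern, where each step plugs a value that re-introduces the gap it just filled.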

Operations on XML templates
- append, prepend, has, remove, set, ... – DOM-like operations (but still immutable!)
XACT unifies the template approach and the DOM approach

Example: PhoneList

    <cardlist xmlns="http://businesscard.org">
      <card>
        <name>John Doe</name>
        <email>[email protected]</email>
        <phone>(202) 555-1414</phone>
      </card>
      <card>
        <name>Zacharias Doe</name>
        <email>zach@notmail.com</email>
      </card>
      <card>
        <name>Jack Doe</name>
        <email>jack@mailorder.edu</email>
        <email>jack@geemail.com</email>
        <phone>(202) 456-1414</phone>
      </card>
    </cardlist>

The following solution isn't the simplest possible, but it shows a lot of XACT features…

Example: PhoneList (1/4)

    import java.io.*;
    import java.net.URL;   // needed for new URL(...) in main
    import dk.brics.xact.*;

    public class PhoneList {
      static {
        XML.getNamespaceMap().put("h", "http://www.w3.org/1999/xhtml");
        XML.getNamespaceMap().put("b", "http://businesscard.org");
        XML.getNamespaceMap().put("s", "http://www.w3.org/2001/XMLSchema");
        XML.loadXMLSchema("file:xhtml1-transitional.dtd");
        XML.loadXMLSchema("file:bcard.xsd");
      }
      ...
    }

Namespaces and schemas are specified declaratively

Example: PhoneList (2/4)

    public static void main(String[] args)
        throws XMLException, IOException {
      PhoneList pp = new PhoneList();
      pp.setDefaultWrapper("white");
      XML cardlist = XML.parseDocument(new URL("file:bcard.xml"))
                        .validate("b:cardlist");
      XML xhtml = pp.transform(cardlist);
      xhtml = xhtml.analyze("h:html");
      System.out.println(xhtml.toDocument());
    }

Example: PhoneList (3/4)

    XML wrapper;

    private void setDefaultWrapper(String color) {
      wrapper = [[
        <h:html>
          <h:head>
            <h:title><[TITLE]></h:title>
          </h:head>
          <h:body bgcolor={color}>
            <h:h1><[TITLE]></h:h1>
            <[MAIN]>
          </h:body>
        </h:html>]];
    }

Example: PhoneList (4/4)

    public XML transform(XML cardlist) {
      return wrapper.plug("TITLE", "My Phone List")
                    .plug("MAIN", makeList(cardlist));
    }

    public XML makeList(XML cardlist) {
      XML r = [[<h:ul><[CARDS]></h:ul>]];
      // select only the cards that have at least one phone element
      for (Element c : cardlist.getElements("b:card[b:phone]")) {
        // fill CARDS with one list item followed by a fresh CARDS gap
        r = r.plug("CARDS", [[
          <h:li>
            <h:b><{ c.getString("b:name") }></h:b>,
            phone: <{ c.getString("b:phone") }>
          </h:li>
          <[CARDS]>
        ]]);
      }
      return r.close();
    }
    }
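
For readers who want to try the XPath predicate `b:card[b:phone]` without XACT, the following sketch evaluates the same selection with the standard JDK DOM and XPath APIs. The class and method names (`CardsWithPhone`, `namesWithPhone`) are made up for this illustration, and the input is an abbreviated copy of the card list with the e-mail addresses left out.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class CardsWithPhone {
    /** Abbreviated copy of the business card list (e-mail addresses left out). */
    static final String CARDLIST =
        "<cardlist xmlns=\"http://businesscard.org\">"
        + "<card><name>John Doe</name><phone>(202) 555-1414</phone></card>"
        + "<card><name>Zacharias Doe</name></card>"
        + "<card><name>Jack Doe</name><phone>(202) 456-1414</phone></card>"
        + "</cardlist>";

    /** Maps the prefix b to the business card namespace, as in the slides. */
    private static final NamespaceContext PREFIXES = new NamespaceContext() {
        @Override public String getNamespaceURI(String prefix) {
            return "b".equals(prefix) ? "http://businesscard.org" : XMLConstants.NULL_NS_URI;
        }
        @Override public String getPrefix(String uri) { return null; }
        @Override public Iterator<String> getPrefixes(String uri) {
            return Collections.emptyIterator();
        }
    };

    /** Returns the names of all cards that contain at least one phone element. */
    static List<String> namesWithPhone(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for prefixed XPath steps
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(PREFIXES);
            NodeList names = (NodeList) xpath.evaluate(
                "/b:cardlist/b:card[b:phone]/b:name", doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < names.getLength(); i++) {
                result.add(names.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(namesWithPhone(CARDLIST)); // prints [John Doe, Jack Doe]
    }
}
```

The card for Zacharias Doe has no phone element, so the predicate filters it out, which is why it would not appear in the generated phone list either.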

Catching errors with the program analyzer

    ...
    XML r = [[<h:ul><[CARDS]></h:ul>]];
    ...

becomes

    ...
    XML r = [[<[CARDS]>]];
    ...

    *** Validation error
    Source: element {http://www.w3.org/1999/xhtml}body at PhoneList line 41 column 31
    Schema: file:xhtml1-transitional.dtd line 913 column 26
    Error: invalid child: {http://www.w3.org/1999/xhtml}li

Typed gaps

    <h:html>
      <h:head>
        <h:title><[s:string TITLE]></h:title>
      </h:head>
      <h:body bgcolor={color}>
        <h:h1><[s:string TITLE]></h:h1>
        <[h:Flow MAIN]>
      </h:body>
    </h:html>

A typed gap can only be plugged with a valid value

Optional type annotations
Declares a variable holding any XML template:

    XML foo;

Annotated type:

    @Type("S") XML foo;

– the value of foo must have schema type S
Annotated type with gaps:

    @Type("S[T1 g1,...,Tn gn]") XML foo;

– the value of foo must have schema type S if every gap gi is plugged with a value of type Ti
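
As background on the Java mechanism behind such annotations, here is a small sketch of how a string-valued annotation can be declared and read. `SchemaType`, `AnnotationDemo` and `Xml` are hypothetical names for this illustration; the actual declaration of `@Type` in XACT may differ. The sketch reads the annotation reflectively at runtime purely for demonstration, while in XACT the type annotations are consumed by validate and analyze checks.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

final class AnnotationDemo {
    /** Stand-in for an XML template value; the real XACT class is not used here. */
    static final class Xml {}

    @SchemaType("h:html[s:string TITLE, h:Flow MAIN]")
    Xml wrapper;   // annotated: schema type plus typed gaps

    Xml untyped;   // not annotated: may hold any template

    /** Returns the annotation string of the named field, or null if there is none. */
    static String declaredType(String fieldName) {
        try {
            Field field = AnnotationDemo.class.getDeclaredField(fieldName);
            SchemaType type = field.getAnnotation(SchemaType.class);
            return type == null ? null : type.value();
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("no such field: " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(declaredType("wrapper"));
        System.out.println(declaredType("untyped"));
    }
}

/** Hypothetical annotation carrying a schema type expression as a string. */
@Retention(RetentionPolicy.RUNTIME) // RUNTIME only so that this demo can read it reflectively
@Target({ElementType.FIELD, ElementType.METHOD, ElementType.PARAMETER, ElementType.LOCAL_VARIABLE})
@interface SchemaType {
    String value();
}
```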

Example: PhoneList2

    public @Type("h:html[s:string TITLE, h:Flow MAIN]") XML wrapper;

Now using an annotated XML type

Example: PhoneList2

    public @Type("h:html") XML transform(@Type("b:cardlist") XML cardlist) {
      return wrapper.plug("TITLE", "My Phone List")
                    .plug("MAIN", makeList(cardlist));
    }

Annotated XML types are implicit validate & analyze instructions
– allows modular reasoning!

Runtime representation of XML templates
Obtaining reasonable runtime efficiency of the XML template operations is not trivial (because of immutability)
... but it's possible
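
The slides do not say how XACT achieves this efficiency. One standard technique for making updates on immutable trees cheap is structural sharing: an update copies only the nodes on the path to the change and reuses every other subtree. The sketch below illustrates that general idea under this assumption; `SharedNode` and `withChild` are hypothetical names, and this is not a description of XACT's runtime model.

```java
import java.util.ArrayList;
import java.util.List;

/** Immutable tree node; updates share all untouched subtrees with the original. */
final class SharedNode {
    final String label;
    final List<SharedNode> children; // unmodifiable

    SharedNode(String label, SharedNode... children) {
        this.label = label;
        this.children = List.of(children);
    }

    private SharedNode(String label, List<SharedNode> children) {
        this.label = label;
        this.children = List.copyOf(children);
    }

    /** Returns a new node with one child replaced; the other children are reused as-is. */
    SharedNode withChild(int index, SharedNode replacement) {
        List<SharedNode> copy = new ArrayList<>(children); // copies references only
        copy.set(index, replacement);
        return new SharedNode(label, copy);
    }

    public static void main(String[] args) {
        SharedNode head = new SharedNode("head", new SharedNode("title"));
        SharedNode gap = new SharedNode("MAIN");
        SharedNode body = new SharedNode("body", new SharedNode("h1"), gap);
        SharedNode html = new SharedNode("html", head, body);

        // "Plug" the gap: only html and body are rebuilt, head is shared.
        SharedNode plugged = html.withChild(1, body.withChild(1, new SharedNode("ul")));

        System.out.println(plugged.children.get(0) == head);                  // true: shared
        System.out.println(html.children.get(1).children.get(1) == gap);      // true: original intact
    }
}
```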

The XACT program analyzer
- Checks expressions marked with analyze
- Checks plug operations for gap types
- Checks assignments and method input/output for type annotations
Exact answers are impossible (Rice's theorem), so we settle for conservative approximations!

The main challenges
1. Extract the control-flow from the Java program
2. Define a suitable abstraction of XML templates
3. Define data-flow equations modeling the operations
– immutability of XML templates is crucial here!

A suitable abstraction
For each XML expression, construct an XML graph that represents the possible values

XML graphs
An XML graph represents a set of XML templates (like an "XML tree" with loops, choices, named gaps, and regexp text)
Example: ul lists with zero or more li items that each contain an integer as text
[Figure: XML graph for the example; recoverable labels: ul, sequence, choice, sequence, li, 1, 2, [0-9]+]
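
The figure itself is not recoverable here, so the following is only a toy reading of the example: a tiny graph with element, text, sequence and choice nodes, where a choice node that refers back to itself gives the loop. All names (`ToyXmlGraph`, `GraphNode`, `Item`, `integerList`) are invented for this sketch, and the matching procedure is a simple assumption that works for loops which consume input on every round. It is not the XML graph formalism or algorithm used by the XACT analyzer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

final class ToyXmlGraph {
    /** Concrete XML content: an element with children, or a text chunk. */
    static final class Item {
        final String name;   // element name, or null for text
        final String chars;  // text content, or null for elements
        final List<Item> children;
        private Item(String name, String chars, List<Item> children) {
            this.name = name; this.chars = chars; this.children = children;
        }
        static Item element(String name, Item... children) {
            return new Item(name, null, List.of(children));
        }
        static Item text(String chars) { return new Item(null, chars, List.of()); }
    }

    /** A graph node reports every position where a match starting at 'from' can end. */
    interface GraphNode {
        Set<Integer> match(List<Item> items, int from);
    }

    static GraphNode element(String name, GraphNode content) {
        return (items, from) -> {
            Set<Integer> ends = new TreeSet<>();
            if (from < items.size()) {
                Item item = items.get(from);
                // the element matches if its name fits and its content is fully consumed
                if (name.equals(item.name)
                        && content.match(item.children, 0).contains(item.children.size())) {
                    ends.add(from + 1);
                }
            }
            return ends;
        };
    }

    static GraphNode text(String regex) {
        Pattern pattern = Pattern.compile(regex);
        return (items, from) -> {
            Set<Integer> ends = new TreeSet<>();
            if (from < items.size() && items.get(from).chars != null
                    && pattern.matcher(items.get(from).chars).matches()) {
                ends.add(from + 1);
            }
            return ends;
        };
    }

    static GraphNode sequence(GraphNode... parts) {
        return (items, from) -> {
            Set<Integer> current = new TreeSet<>();
            current.add(from);
            for (GraphNode part : parts) {
                Set<Integer> next = new TreeSet<>();
                for (int pos : current) next.addAll(part.match(items, pos));
                current = next;
            }
            return current;
        };
    }

    /** Choice node; alternatives are added after creation, which makes loops possible. */
    static final class Choice implements GraphNode {
        private final List<GraphNode> alternatives = new ArrayList<>();
        Choice add(GraphNode alternative) { alternatives.add(alternative); return this; }
        @Override public Set<Integer> match(List<Item> items, int from) {
            Set<Integer> ends = new TreeSet<>();
            for (GraphNode alt : alternatives) ends.addAll(alt.match(items, from));
            return ends;
        }
    }

    /** ul lists with zero or more li items that each contain an integer as text. */
    static GraphNode integerList() {
        GraphNode li = element("li", text("[0-9]+"));
        Choice rest = new Choice();
        rest.add(sequence());         // alternative 1: no more items
        rest.add(sequence(li, rest)); // alternative 2: one li, then loop back
        return element("ul", rest);
    }

    static boolean accepts(GraphNode root, Item document) {
        return root.match(List.of(document), 0).contains(1);
    }

    public static void main(String[] args) {
        Item good = Item.element("ul", Item.element("li", Item.text("42")));
        Item bad = Item.element("ul", Item.element("li", Item.text("forty-two")));
        System.out.println(accepts(integerList(), good)); // true
        System.out.println(accepts(integerList(), bad));  // false
    }
}
```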

From XACT programs to XML graphs
- Constant XML template → XML graph
- XML Schema type → XML graph
- Model all XML template operations and XPath expressions on XML graphs
- Check inclusion between XML graph and XML Schema type
– we omit the (many!) technical details...
http://www.brics.dk/schematools/

Summary
XACT is a Java API for writing XML transformations
Key ideas:
- immutable XML templates
- XPath for navigation
- DOM-like operations
- optional XML Schema type annotations
- static program analysis for transformation validity
http://www.brics.dk/Xact/