.TH XML2TSV 1 "04/01/2020" "" ""
.SH NAME
xml2tsv \- simple xml-to-tsv converter
.SH SYNOPSIS
.PP
xml2tsv
.PP
.SH DESCRIPTION
.PP
xml2tsv is a simple tool that converts XML read from standard input into
a list of tab-separated values (TSV) records, one per line. Each XML
entry is converted to a line of the form:
.EX
/full/path/to/current/entry[TAB]attr1=value1[TAB]attr2=value2[TAB]...[TAB]data[NEWLINE]
.EE
where
.I "/full/path/to/current/entry"
represents the full hierarchy of entries down to the current one. For
instance, the XML snippet:
.EX
<html>
<head>
<title>This is a title</title>
</head>
<body>
<h1>It works!</h1>
<a href="https://my.wonderful.website.net">Click here</a>
</body>
</html>
.EE
will produce the output:
.EX
/html
/html/head
/html/head/title This is a title
/html/body
/html/body/h1 It works!
/html/body/a href=https://my.wonderful.website.net Click here
.EE
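.PP
The path column above can be pictured as a stack of entry names that
grows when an entry opens and shrinks when it closes. The following is
a minimal sketch of that idea, not the actual xml2tsv code: the
structure, the function names, and the numeric limits are all
hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STR_MAX   128  /* hypothetical value, for illustration only */
#define DEPTH_MAX 32   /* hypothetical value, for illustration only */

/* Stack of the entry names that are currently open. */
struct path {
    char names[DEPTH_MAX][STR_MAX];
    int depth;
};

/* Push an entry name; returns 0 on success, -1 if the path is already
 * DEPTH_MAX deep or the name does not fit in STR_MAX bytes. */
static int path_push(struct path *p, const char *name)
{
    assert(p != NULL && name != NULL);
    if (p->depth >= DEPTH_MAX || strlen(name) >= STR_MAX)
        return -1;
    strcpy(p->names[p->depth++], name);
    return 0;
}

/* Pop the innermost entry name, if any. */
static void path_pop(struct path *p)
{
    assert(p != NULL);
    if (p->depth > 0)
        p->depth--;
}

/* Write "/a/b/c" into buf (capacity n, including the NUL).
 * Returns the string length, or -1 if buf is too small. */
static int path_format(const struct path *p, char *buf, size_t n)
{
    size_t used = 0;
    int i;

    assert(p != NULL && buf != NULL);
    if (n == 0)
        return -1;
    buf[0] = '\0';
    for (i = 0; i < p->depth; i++) {
        size_t len = strlen(p->names[i]);

        if (used + 1 + len + 1 > n)
            return -1;
        buf[used++] = '/';
        memcpy(buf + used, p->names[i], len);
        used += len;
        buf[used] = '\0';
    }
    return (int)used;
}
```

In this sketch, pushing "html", "body", and "h1" in that order and then
formatting the stack yields the string "/html/body/h1".
.PP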
xml2tsv escapes newline ('\\n'), tab ('\\t'), and backslash ('\\')
characters on output, and strips all other control characters.
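.PP
The handling of special characters described above could look roughly
like the sketch below. It is an illustration under stated assumptions,
not the code shipped with xml2tsv: the function name is hypothetical,
each escaped character is assumed to be written as a backslash followed
by one letter (or a second backslash), and "control character" is
assumed to mean the ASCII range below 0x20 plus DEL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy src into dst (capacity n, including the
 * terminating NUL), escaping newline, tab, and backslash and dropping
 * every other control character. Returns the number of bytes written
 * (excluding the NUL), or -1 if dst is too small. */
static int escape_field(char *dst, size_t n, const char *src)
{
    size_t used = 0;

    assert(dst != NULL && src != NULL);
    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;
        char esc = 0;

        if (c == '\n')
            esc = 'n';
        else if (c == '\t')
            esc = 't';
        else if (c == '\\')
            esc = '\\';
        else if (c < 0x20 || c == 0x7f)
            continue;               /* assumed set of stripped characters */

        if (esc) {
            if (used + 2 >= n)      /* keep room for the NUL */
                return -1;
            dst[used++] = '\\';
            dst[used++] = esc;
        } else {
            if (used + 1 >= n)
                return -1;
            dst[used++] = (char)c;
        }
    }
    if (used >= n)
        return -1;
    dst[used] = '\0';
    return (int)used;
}
```

Escaping keeps every record on a single line and keeps the field
separator unambiguous, which is what makes the output safe to split on
tabs and newlines.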
.SH CONFIGURATION
The maximum length of an entry name and the maximum depth of an entry
are fixed at compile time by STR_MAX and DEPTH_MAX, respectively. Both
can be changed by editing the file
.BI config.h
and rebuilding xml2tsv. The same file also sets the separator
used on output (SEP, by default set to '\\t'), and the character used to
separate the name of an attribute from its value (SATTR, by default set
to '=').
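.PP
As a rough illustration, the four settings might appear in config.h as
preprocessor definitions like the ones below. Only the names and the
defaults for SEP and SATTR come from this page; the use of #define and
the two numeric values are assumptions, so check the file shipped with
the sources before editing.

```c
/* config.h: illustrative sketch only, the shipped file may differ */

#define STR_MAX   128   /* hypothetical value: max length of an entry name */
#define DEPTH_MAX 32    /* hypothetical value: max nesting depth of an entry */

#define SEP       '\t'  /* output field separator (documented default) */
#define SATTR     '='   /* attribute name/value separator (documented default) */
```

Changing SEP to a character that can also occur in the data would make
the output ambiguous, since only newline, tab, and backslash are
described as being escaped.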
.SH BUGS
xml2tsv currently lacks an option to force printing control characters
on output, if desired.
.SH AUTHORS
xml2tsv is written and maintained by Vincenzo "KatolaZ" Nicosia
<katolaz@freaknet.org>. The code is based on
.BI xmlparser
by Hiltjo Posthuma <hiltjo@codemadness.org>. You can use, distribute,
modify, and redistribute xml2tsv under the terms of the ISC License.