Software:HaXml

HaXml
Stable release: v1.25.13[1] / July 13, 2023
Repository: https://github.com/HaXml/HaXml
Written in: Haskell
Type: Computer library
License: LGPL-2.1

HaXml is a collection of utilities for parsing, filtering, transforming, and generating XML documents using Haskell.[2]

Overview

HaXml utilities include:[2][3]

  • XML parser
  • XML validator
  • a separate error-correcting parser for HTML
  • pretty-printers for XML and HTML
  • stream parser for XML events
  • translator from DTD to Haskell
  • translator from XML Schema definitions to Haskell data types

HaXml provides a combinator library: a set of higher-order functions that process XML documents once they are represented as native Haskell data types.[4] The basic data type is Content, which represents the document subset of XML.[5]

HaXml can convert XML to Haskell data and back, and it can also convert XML to XML by transforming or filtering. A common way to use HaXml is to define how the XML data is traversed by writing a content filter of type CFilter, where type CFilter = Content -> [Content]. Such a user-defined function takes a fragment of XML data and returns zero or more fragments. This makes it possible to select the XML elements that satisfy certain conditions (e.g. tags with a certain name, or all children of a specified tag).[6][7]
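The idea of a content filter can be sketched without the library itself. The following is a minimal, self-contained model, not HaXml's actual code: the two-constructor Content type is a deliberate simplification, and keep, tag, children, txt and /> are local re-implementations that only borrow the names of HaXml combinators. The andThen helper and the sample document are hypothetical.

```haskell
module Main where

-- Simplified stand-in for HaXml's Content type (an assumption made for
-- illustration; the real type has more constructors and fields).
data Content
  = Elem String [Content] -- element name and its children
  | Text String           -- character data
  deriving (Eq, Show)

-- A content filter: one fragment in, zero or more fragments out.
type CFilter = Content -> [Content]

-- Return the fragment unchanged.
keep :: CFilter
keep c = [c]

-- Select the fragment only if it is an element with the given name.
tag :: String -> CFilter
tag name c@(Elem n _) | n == name = [c]
tag _ _ = []

-- Select the direct children of an element.
children :: CFilter
children (Elem _ cs) = cs
children _ = []

-- Select text fragments only.
txt :: CFilter
txt c@(Text _) = [c]
txt _ = []

-- Hypothetical helper: run f, then run g on every result of f.
andThen :: CFilter -> CFilter -> CFilter
andThen f g = concatMap g . f

-- "Interior" composition: apply f, descend to the children, apply g.
infixl 5 />
(/>) :: CFilter -> CFilter -> CFilter
f /> g = f `andThen` children `andThen` g

-- Hypothetical sample document.
sample :: Content
sample =
  Elem "library"
    [ Elem "book" [Elem "title" [Text "First"]]
    , Elem "book" [Elem "title" [Text "Second"]]
    , Elem "dvd" [Elem "title" [Text "Third"]]
    ]

-- Titles of all books directly inside the top-level library element.
bookTitles :: [String]
bookTitles =
  [s | Text s <- (tag "library" /> tag "book" /> tag "title" /> txt) sample]

main :: IO ()
main = do
  print bookTitles
  -- A filter that does not match returns no fragments at all.
  print (tag "shelf" sample)
```

Because every filter has the same type, small filters compose into larger ones, and "no match" is simply the empty list rather than an error.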

Example

Chapter 22, "Extended Example: Web Client Programming", of Real World Haskell by Bryan O'Sullivan, Don Stewart, and John Goerzen considers the following example.[6] The XML file looks like this (simplified version):

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:itunes="http://www.itunes.com/DTDs/Podcast-1.0.dtd" version="2.0">
  <channel>
    <title>Haskell Radio</title>
    <link>http://www.example.com/radio/</link>
    <description>Description of this podcast</description>
    <item>First item</item>
    <item>Second item</item>
  </channel>
</rss>

The following content filter is constructed. It selects the channel element nested directly inside the top-level rss element:

channel :: CFilter
channel = tag "rss" /> tag "channel"

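This filter is later used to get the title of the channel. The contentToStringDefault function is a helper defined in the book rather than part of HaXml; it returns the given default string when the filter yields no content: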

getTitle :: Content -> String
getTitle doc = contentToStringDefault "Untitled Podcast" (channel /> tag "title" /> txt $ doc)
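The book's code targets an older HaXml release in which Content and CFilter take no type parameter. As a hedged sketch, the same example might look as follows against the 1.25 series, where these types carry a position parameter such as Posn. The module names and signatures here reflect an understanding of that series and should be checked against the installed version. The textOr helper is a hypothetical, simplified stand-in for the book's contentToStringDefault, and titleOf and the "feed.xml" label are likewise introduced only for illustration.

```haskell
module Main where

import Text.XML.HaXml.Combinators (CFilter, tag, txt, (/>))
import Text.XML.HaXml.Parse (xmlParse)
import Text.XML.HaXml.Posn (Posn, noPos)
import Text.XML.HaXml.Types (Content (..), Document (..))

-- The simplified feed from the example, embedded as a string.
feed :: String
feed =
  unlines
    [ "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
    , "<rss xmlns:itunes=\"http://www.itunes.com/DTDs/Podcast-1.0.dtd\" version=\"2.0\">"
    , "  <channel>"
    , "    <title>Haskell Radio</title>"
    , "    <link>http://www.example.com/radio/</link>"
    , "    <description>Description of this podcast</description>"
    , "    <item>First item</item>"
    , "    <item>Second item</item>"
    , "  </channel>"
    , "</rss>"
    ]

-- Select the channel element directly inside the top-level rss element.
channel :: CFilter Posn
channel = tag "rss" /> tag "channel"

-- Hypothetical stand-in for the book's contentToStringDefault:
-- concatenate the character data, or fall back to a default if there is none.
textOr :: String -> [Content Posn] -> String
textOr def cs =
  case concat [s | CString _ s _ <- cs] of
    "" -> def
    s -> s

getTitle :: Content Posn -> String
getTitle doc = textOr "Untitled Podcast" (channel /> tag "title" /> txt $ doc)

-- Parse a document and return its channel title.
-- The first argument of xmlParse is a name used in error messages.
titleOf :: String -> String
titleOf src =
  let Document _ _ root _ = xmlParse "feed.xml" src
   in getTitle (CElem root noPos)

main :: IO ()
main = putStrLn (titleOf feed)
```

The root element is wrapped in CElem because filters operate on Content values, not on the Document returned by the parser.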

References
