XML Basics for Technical Writers

It’s no secret that there is demand for technical writers with XML experience, but what exactly are you expected to know? As a technical writer working with XML, you will be expected to author and publish XML documents. You will also need at least a basic understanding of how DTDs and style sheets work with XML documents.

How does XML differ from authoring in Word?

The primary difference between conventional authoring (as in MS Word) and authoring in a markup language like XML is that the author of an XML document must mark up the content with elements. An element consists of a start tag, some content, and an end tag, and XML uses elements to define the structure of the content. For example, a paragraph might be placed between a <p> start tag and a </p> end tag to state that the content is a paragraph. You must define structural units such as paragraphs explicitly with tags, rather than merely pressing ‘Enter’ to start the next paragraph. The content itself, the words and sentences within the paragraphs, is exactly the same as in a corresponding MS Word file. The only addition is that the author must place each piece of content inside the appropriate XML element.

The small XML file below shows how a message is contained within XML elements that describe the contents.

Sample XML
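
As a stand-in for such a sample, here is a minimal sketch in Python that holds a small memo as XML and lists each element together with the text it contains. The element names (memo, to, from, subject, body, p) and the message text are assumptions made up for illustration; your own documents will use whatever elements your DTD defines.

```python
import xml.etree.ElementTree as ET

# Hypothetical memo: element names and text are illustrative only.
SAMPLE_XML = """<memo>
  <to>Documentation Team</to>
  <from>Project Lead</from>
  <subject>Style guide update</subject>
  <body>
    <p>The new style guide is ready for review.</p>
    <p>Please send comments by Friday.</p>
  </body>
</memo>"""


def describe(xml_text):
    """Return (element name, text) pairs in document order."""
    root = ET.fromstring(xml_text)
    pairs = []
    for element in root.iter():
        # Container elements such as <memo> hold only whitespace as text.
        text = (element.text or "").strip()
        pairs.append((element.tag, text))
    return pairs


if __name__ == "__main__":
    for name, text in describe(SAMPLE_XML):
        print(f"{name}: {text}")
```

Each tag name describes what the text inside it is (a recipient, a subject, a paragraph) and says nothing about how that text should look.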

What are DTDs?

As a technical writer who works with XML, you will have to understand and work with DTDs. A DTD (Document Type Definition) describes all the elements that are permitted in any document based on that DTD. It declares the elements and sets the rules for which elements may appear inside which others. In a DTD for a simple memo, one would expect titles, paragraphs, and subjects, but not glossaries, tables of contents, numbered lists, and the like. In this way, a DTD corresponds to the kind of document a reader or user would expect.

The example below shows a small DTD for a memo. The element declarations, the lines that start with “<!ELEMENT”, each declare one element and the type of content allowed in that element. Several of the element declarations give the content as “#PCDATA” (parsed character data), which means that the element can contain plain text.

Memo DTD
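
As a stand-in for such a DTD, the sketch below embeds a hypothetical memo DTD in an XML file (as an internal subset) and uses Python's standard expat parser to list what each element declaration allows. The element names and content models are assumptions chosen for illustration. Note that expat reads the declarations but does not validate the document against them.

```python
import xml.parsers.expat
from xml.parsers.expat import model as content_model

# Hypothetical memo DTD, embedded in the document as an internal subset.
MEMO_XML = """<?xml version="1.0"?>
<!DOCTYPE memo [
  <!ELEMENT memo (to, from, subject, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT subject (#PCDATA)>
  <!ELEMENT body (p+)>
  <!ELEMENT p (#PCDATA)>
]>
<memo>
  <to>Documentation Team</to>
  <from>Project Lead</from>
  <subject>Style guide update</subject>
  <body><p>The new style guide is ready for review.</p></body>
</memo>"""

# Occurrence indicators as they are written in a DTD.
QUANTIFIERS = {
    content_model.XML_CQUANT_NONE: "",
    content_model.XML_CQUANT_OPT: "?",
    content_model.XML_CQUANT_REP: "*",
    content_model.XML_CQUANT_PLUS: "+",
}


def read_declarations(xml_text):
    """Map each declared element name to a short description of its content."""
    declared = {}

    def on_element_decl(name, model):
        kind, _quantifier, _name, children = model
        if kind == content_model.XML_CTYPE_MIXED and not children:
            declared[name] = "text only (#PCDATA)"
        else:
            # Nested groups have no name of their own; show a placeholder.
            parts = [(c[2] or "(group)") + QUANTIFIERS[c[1]] for c in children]
            declared[name] = "child elements: " + ", ".join(parts)

    parser = xml.parsers.expat.ParserCreate()
    parser.ElementDeclHandler = on_element_decl
    parser.Parse(xml_text, True)
    return declared


if __name__ == "__main__":
    for name, description in read_declarations(MEMO_XML).items():
        print(f"{name}: {description}")
```

In this sketch, memo must contain to, from, subject, and body in that order, body must contain one or more p elements, and the remaining elements hold plain text.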

DTDs define the structural components of a document. You select a DTD (e.g. memo, report, manual, catalog) at the outset of your authoring, and all of the elements of that document type are then available at the appropriate points in the document’s structure. The DTD is not a template: it does not necessarily require you to place a paragraph at one particular location. Instead, it permits you to use as many paragraphs as you wish at any point in the document where paragraphs are permitted.

XML editors make it quite easy to use DTDs. They usually show you only the elements that are available at the current location in the XML file. If you do manage to use an element incorrectly, the XML editor will flag it as an error that needs to be fixed when you validate your XML file. If you are using existing DTDs such as DITA or DocBook, you probably won’t need to know how to edit or create a DTD, but you should have a general understanding of how the DTD works with the XML file.
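
To give a feel for what validation checks, here is a deliberately simplified sketch in Python. It is not a real DTD validator: it uses a hand-written table that mirrors a hypothetical memo DTD and only checks whether each element is allowed inside its parent, ignoring element order and required elements. A real XML editor does this with a full validating parser.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a hypothetical memo DTD: the child elements
# each element may contain. An empty set means "text only".
ALLOWED_CHILDREN = {
    "memo": {"to", "from", "subject", "body"},
    "to": set(),
    "from": set(),
    "subject": set(),
    "body": {"p"},
    "p": set(),
}


def find_misplaced_elements(xml_text):
    """Return one message per element that is not permitted where it appears."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag not in ALLOWED_CHILDREN:
        problems.append(f"<{root.tag}> is not declared")
    for parent in root.iter():
        allowed = ALLOWED_CHILDREN.get(parent.tag, set())
        for child in parent:
            if child.tag not in allowed:
                problems.append(
                    f"<{child.tag}> is not allowed inside <{parent.tag}>"
                )
    return problems


VALID_MEMO = (
    "<memo><to>Team</to><from>Lead</from><subject>Update</subject>"
    "<body><p>Ready for review.</p></body></memo>"
)
# A glossary is not part of this hypothetical memo structure.
INVALID_MEMO = (
    "<memo><to>Team</to><glossary>Terms</glossary>"
    "<body><p>Ready.</p></body></memo>"
)

if __name__ == "__main__":
    print(find_misplaced_elements(VALID_MEMO))
    print(find_misplaced_elements(INVALID_MEMO))
```

The first memo produces no messages, while the second is reported because it places a glossary element where the table does not allow one.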

What are style sheets?

An XML document imposes no conditions of style or formatting. By separating the content from the formatting, XML allows the same document to be displayed in whatever style best suits the device or application that displays it. In effect, by stating only the structural hierarchy of the document, XML supports output in whatever form is needed, such as PDF, Web Help, an HTML web page, or EPUB.

The feature that governs the display characteristics of an XML document is called a style sheet. Style sheets describe how the document is displayed: layout, fonts, point sizes, and many of the other features you can set yourself in MS Word. Beyond controlling how you see the document, the style sheet helps ensure that the next user, with a different device and different software, sees a version that incorporates as many of the author’s design intentions as that medium can convey. This is how XML supports such a wide range of display features.

Common style sheets for XML include:

  • CSS – CSS can style simple XML documents, but it largely formats the elements as they appear in the source and cannot reorganize the content.
  • XSLT – If your XML content will be displayed in a browser, you will probably need an XSLT style sheet. An XSLT style sheet is itself written in XML and transforms the source XML elements into other elements, usually XHTML or HTML. XSLT allows you to add, remove, and reorganize the content of the source XML file.
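  • XSL-FO – Like XSLT, XSL-FO is written in XML. It describes page layout, such as pages, margins, and fonts. Typically an XSLT style sheet transforms your XML into XSL-FO, and a formatting processor then renders the result. It is used most frequently to output PDF documents.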
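
To make the idea of a transformation concrete, the sketch below mimics in plain Python the kind of work an XSLT style sheet does. It is not XSLT and does not use an XSLT processor. The memo elements and the mapping from source elements to HTML elements are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from source elements to HTML elements.
# Source elements without an entry are left out of the output.
TAG_MAP = {"subject": "h1", "p": "p"}

MEMO = (
    "<memo><to>Team</to><subject>Update</subject>"
    "<body><p>Ready for review.</p></body></memo>"
)


def memo_to_html(xml_text):
    """Build an HTML string from the mapped elements of the source XML."""
    source = ET.fromstring(xml_text)
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    for element in source.iter():
        target_tag = TAG_MAP.get(element.tag)
        if target_tag is None:
            continue  # no mapping: this element is dropped from the output
        target = ET.SubElement(body, target_tag)
        target.text = (element.text or "").strip()
    return ET.tostring(html, encoding="unicode")


if __name__ == "__main__":
    print(memo_to_html(MEMO))
```

The source file is left unchanged. Mapping the same elements differently would give a different output from the same content, which is the point of keeping content and formatting separate.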

XSLT and XSL-FO style sheets can be quite complex to create and modify. However, popular XML formats such as DocBook and DITA come with a set of style sheets that you can use to transform your XML into common output formats.

Getting started with XML

There is more to XML than this, but it is a good start. You can learn to read and work with a DTD in your editor fairly quickly, especially if you have already worked with markup such as HTML. Learning DTDs in depth requires systematic study, some of which you can do online and from books. You will probably also want to exchange ideas with others about options and good habits. The Web has a lot of information on style sheets, and if you face a particular challenge with a style sheet, a web search will often turn up others who have faced the same challenge.

If you want to learn how to author in XML, write DTDs, and write style sheets with the guidance of an instructor, you might also consider taking a course such as our Professional XML Authoring course or our Intro to XML Authoring course. Our courses will help you to learn how to work with an XML editor and how to publish XML documents to other document types.