XML

From Canonica AI

Introduction

XML (Extensible Markup Language) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. The design goals of XML emphasize simplicity, generality, and usability across the Internet. It is a textual data format with strong support via Unicode for different human languages. Although the design of XML focuses on documents, it is widely used for the representation of arbitrary data structures such as those used in web services.

A screenshot of an XML document with various tags and attributes.

History

XML began as a simplified subset of the Standard Generalized Markup Language (SGML) and was designed to be relatively human-legible. The World Wide Web Consortium (W3C) published XML 1.0 as a Recommendation in 1998. Application languages are defined on top of XML by adding semantic constraints; examples include XHTML, RSS, Atom, SOAP, and the RDF/XML serialization of RDF.

Syntax

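XML syntax refers to the rules that determine how an XML document must be written. The syntax is simple but strict. Every XML document must satisfy the well-formedness constraints, which are the basic syntax rules: for example, there is exactly one root element, every start-tag has a matching end-tag, and elements are properly nested. A document may additionally be valid, meaning that it also satisfies the rules declared in a Document Type Definition (DTD) or an XML Schema. Validity is optional; well-formedness is not.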
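As a rough illustration of well-formedness checking, the sketch below uses Python's standard `xml.etree.ElementTree` module, which rejects input that is not well-formed. The helper name `is_well_formed` and the sample documents are assumptions made for this example. This parser does not validate against a DTD or XML Schema, so it demonstrates well-formedness only.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


good = "<note><to>Ada</to></note>"
bad = "<note><to>Ada</note></to>"  # end-tags appear in the wrong order

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False
```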

Elements

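Elements are the basic building blocks of an XML document. An element consists of a start-tag and a matching end-tag, each containing the element name in angle brackets (for example, <title> and </title>), with the element's content between them. An element with no content can be written as a single empty-element tag such as <br/>. Element names are case-sensitive. An element can also carry attributes, which are written inside the start-tag as name/value pairs.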
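The following minimal sketch shows how elements look once parsed, again assuming Python's `xml.etree.ElementTree`. The `book` document and its element names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# One root element with two children; the second is an empty-element tag.
doc = "<book><title>Dune</title><out-of-print/></book>"

book = ET.fromstring(doc)
title = book.find("title")
flag = book.find("out-of-print")

print(book.tag)               # book
print(title.tag, title.text)  # title Dune
print(flag.text)              # None, because the element has no content
print(len(book))              # 2 child elements
```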

Attributes

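Attributes provide additional information about an element. They appear in the start-tag (or the empty-element tag) as name/value pairs such as name="value". The value must always be quoted, with either single or double quotes, and a given attribute name may appear only once on an element.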
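A short sketch of reading attributes, assuming Python's `xml.etree.ElementTree`. The `image` element and its attribute names are hypothetical. The last part shows that repeating an attribute name on one element makes the document ill-formed, so the parser rejects it.

```python
import xml.etree.ElementTree as ET

img = ET.fromstring('<image src="cover.png" width="200"/>')

# Attributes are exposed as a dictionary of strings.
print(img.attrib)                    # {'src': 'cover.png', 'width': '200'}
print(img.get("width"))              # 200
print(img.get("height", "unknown"))  # unknown (attribute is absent)

# The same attribute name twice on one element is not well-formed XML.
try:
    ET.fromstring('<image src="a.png" src="b.png"/>')
    duplicate_accepted = True
except ET.ParseError:
    duplicate_accepted = False

print(duplicate_accepted)            # False
```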

Entities

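Entities let a document refer to a piece of data by name instead of including the data directly. XML predefines five entities for characters that would otherwise be read as markup: &lt; (<), &gt; (>), &amp; (&), &apos; ('), and &quot; ("). Any Unicode character can also be written as a numeric character reference such as &#169;. Additional entities, for example for commonly used strings of text, can be declared in a DTD.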
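To make the escaping round trip concrete, here is a small sketch using Python's standard library: `xml.sax.saxutils.escape` replaces `&`, `<`, and `>` with the corresponding predefined entities, and the parser turns them back into the original characters. The sample strings are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "if a < b && c > d"

# Replace markup-significant characters with predefined entities.
escaped = escape(raw)
print(escaped)  # if a &lt; b &amp;&amp; c &gt; d

# Parsing resolves the entities back to the original characters.
elem = ET.fromstring(f"<code>{escaped}</code>")
print(elem.text == raw)  # True

# A numeric character reference names a character by its code point.
notice = ET.fromstring("<p>&#169; Example Press</p>")
print(notice.text)  # © Example Press
```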

Structure

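XML documents form a tree structure that starts at a single root element and branches out to the leaves. This is referred to as the logical structure of the document. Every other element is nested inside the root, and each element's content, whether text or child elements, is enclosed between its start-tag and end-tag.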
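The tree shape can be made visible with a short recursive walk. This is a minimal sketch assuming Python's `xml.etree.ElementTree`; the `library` document and the `outline` helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

doc = """<library>
  <shelf>
    <book>Dune</book>
    <book>Emma</book>
  </shelf>
  <shelf/>
</library>"""


def outline(elem, depth=0):
    """Return one indented line per element, in document order."""
    lines = ["  " * depth + elem.tag]
    for child in elem:  # iterating an element yields its direct children
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(doc)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

The root (`library`) is printed first with no indentation, and each level of nesting adds two spaces, so the leaves (`book` and the empty `shelf`) appear furthest down their branches.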

Processing

Processing an XML document involves reading it, interpreting its structure, and acting on the elements and data it contains. This is done with an XML processor (commonly called a parser), a piece of software that reads XML documents and gives an application access to their content and structure.
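As one possible illustration of an XML processor at work, the sketch below reads a document incrementally with `xml.etree.ElementTree.iterparse` from Python's standard library, handling each element as soon as its end-tag has been read. The `log` document, its `level` attribute, and the filtering rule are assumptions made for this example; other processors expose different interfaces.

```python
import io
import xml.etree.ElementTree as ET

doc = (
    b"<log>"
    b"<entry level='info'>started</entry>"
    b"<entry level='error'>disk full</entry>"
    b"</log>"
)

errors = []
# An "end" event fires once an element has been read completely.
for event, elem in ET.iterparse(io.BytesIO(doc), events=("end",)):
    if elem.tag == "entry" and elem.get("level") == "error":
        errors.append(elem.text)
    elem.clear()  # drop content already handled to keep memory use low

print(errors)  # ['disk full']
```

Reading incrementally like this suits large inputs. For small documents, parsing the whole tree at once with `ET.fromstring` or `ET.parse` is usually simpler.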

Uses

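XML underpins many publishing and document formats, including XHTML, office document formats such as Office Open XML and OpenDocument, and the syndication formats RSS and Atom. It is also widely used in scientific research, for example in MathML for mathematical notation and the Chemical Markup Language (CML), and for data exchange between web services.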

See Also