In computer text processing, a markup language is a system for annotating a document in a way that is syntactically distinguishable from the text. The idea and terminology evolved from the "marking up" of paper manuscripts, that is, the revision instructions that editors traditionally wrote in blue pencil on authors' manuscripts. In digital media, this "blue pencil instruction text" is replaced by tags: the instruction is expressed by the tag itself or by instruction text enclosed in the tag. The point of a markup language is to spare the author manual formatting work, since the tags tell the processing software how to treat the text they mark (as a heading, as the start of a new paragraph, and so on). Every tag in a markup language carries a property that determines how the marked text is formatted.
Examples include typesetting instructions such as those found in troff, TeX and LaTeX, or structural markers such as XML tags. Markup instructs the software that displays the text to carry out the appropriate actions, but is omitted from the version of the text that users see.
Some markup languages, such as the widely used HTML, have predefined presentation semantics, meaning that their specification prescribes how the structured data is to be presented. Others, such as XML, do not and are general-purpose.
HyperText Markup Language (HTML), one of the document formats of the World Wide Web, is derived from Standard Generalized Markup Language (SGML) and follows many of the markup conventions used in the publishing industry for communicating printed work among authors, editors, and printers.
Etymology
The term markup is derived from the traditional publishing practice of "marking up" a manuscript, which involves adding handwritten annotations, in the form of conventional symbolic printer's instructions, in the margins and text of a paper or printed manuscript. The term later carried over into computing. For centuries, this task was done primarily by skilled typographers known as "markup men" or "copy markers", who marked up text to indicate which typeface, style, and size should be applied to each part, and then passed the manuscript on to others for typesetting by hand. Markup was also commonly applied by editors, proofreaders, publishers, and graphic designers, and indeed by document authors.
Types of markup language
There are three main categories of electronic markup:
- Presentational markup: the kind of markup used by traditional word-processing systems, in which binary codes embedded in the document text produce the WYSIWYG ("what you see is what you get") effect. Such markup is usually hidden from human users, even authors and editors.
- Procedural markup: markup embedded in the text that provides instructions for the programs that process it. Well-known examples include troff, TeX, and PostScript. The processor is expected to run through the text from beginning to end, following the instructions as it encounters them. Text with such markup is often edited with the markup visible and manipulated directly by the author. Popular procedural markup systems usually include programming constructs, so that macros or subroutines can be defined and invoked by name.
- Descriptive markup: markup used to label parts of the document rather than to give specific instructions about how they should be processed. Well-known examples include LaTeX, HTML, and XML. The objective is to separate the inherent structure of the document from any particular treatment or rendition of it. Such markup is often described as "semantic". An example of descriptive markup is HTML's <cite> tag, which is used to label a citation. Descriptive markup, sometimes called logical markup or conceptual markup, encourages authors to write in a way that describes the material conceptually rather than visually.
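The separation of structure from rendition that descriptive markup aims for can be illustrated with a minimal Python sketch. The element names (doc, title, para, emph) and the two style tables are invented for this illustration and do not come from any standard; XML syntax is used only because the Python standard library can parse it.

```python
import xml.etree.ElementTree as ET

# Hypothetical descriptive document: the tags say what each part is,
# not how it should look.
DOC = "<doc><title>Ducks</title><para>Ducks are <emph>not</emph> geese.</para></doc>"

# Two independent (invented) style tables: element name -> (prefix, suffix).
PLAIN_STYLE = {"title": ("", "\n"), "emph": ("*", "*")}
HTML_STYLE = {
    "title": ("<h1>", "</h1>"),
    "para": ("<p>", "</p>"),
    "emph": ("<em>", "</em>"),
}

def render(elem, style):
    """Recursively render an element, wrapping it as the style table dictates."""
    before, after = style.get(elem.tag, ("", ""))
    parts = [before, elem.text or ""]
    for child in elem:
        parts.append(render(child, style))
        parts.append(child.tail or "")  # text that follows the child element
    parts.append(after)
    return "".join(parts)

root = ET.fromstring(DOC)
plain = render(root, PLAIN_STYLE)
html = render(root, HTML_STYLE)
```

The same marked-up source yields a plain-text rendition and an HTML rendition without being edited, which is the point of labeling parts instead of prescribing their appearance.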
The lines between the types of markup are often blurred. In modern word-processing systems, presentational markup is often stored in a descriptive-markup-oriented system such as XML and then processed procedurally by the implementation. The programming constructs in procedural markup systems such as TeX can be used to create higher-level, more descriptive markup systems, such as LaTeX.
In recent years, a number of small and largely unstandardized markup languages have been developed to allow authors to create formatted text through a web browser, for use in wikis and web forums. These are sometimes called lightweight markup languages. Markdown and the wiki markup used by Wikipedia are examples.
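As a rough illustration of how a lightweight markup language maps onto HTML, the following Python sketch converts two Markdown-like constructs (a "# " heading line and *emphasis*). The function name is hypothetical, the sketch does no HTML escaping, and it is not an implementation of Markdown or of any wiki markup.

```python
import re

def tiny_markup_to_html(text):
    """Convert a two-construct, Markdown-like subset to HTML (illustrative only)."""
    out = []
    for line in text.splitlines():
        # Inline emphasis: *word* becomes <em>word</em>.
        line = re.sub(r"\*([^*]+)\*", r"<em>\1</em>", line)
        if line.startswith("# "):
            # A line starting with "# " is treated as a first-level heading.
            out.append("<h1>" + line[2:] + "</h1>")
        elif line:
            # Any other non-empty line is treated as a paragraph.
            out.append("<p>" + line + "</p>")
    return "\n".join(out)

result = tiny_markup_to_html("# Ducks\nDucks are *not* geese.")
```

The source stays readable as plain text, while the conversion step supplies the heavier HTML tags.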
History of markup languages
GenCode
The first well-known public presentation of markup languages in computer text processing was made by William W. Tunnicliffe at a conference in 1967, although he preferred to call it generic coding. It can be seen as a response to the emergence of programs such as RUNOFF, each of which used its own control notation, often specific to the target typesetting device. In the 1970s, Tunnicliffe led the development of a standard called GenCode for the publishing industry and later became the first chair of the International Organization for Standardization committee that created SGML, the first standard descriptive markup language. The book designer Stanley Rice published speculation along similar lines in 1970. Brian Reid, in his 1980 dissertation at Carnegie Mellon University, developed the theory and a working implementation of descriptive markup in actual use.
However, IBM researcher Charles Goldfarb is more commonly seen today as the "father" of markup languages. Goldfarb hit upon the basic idea in 1969 while working on a primitive document management system intended for law firms, and helped create IBM GML later that same year. GML was first publicly disclosed in 1973.
In 1975, Goldfarb moved from Cambridge, Massachusetts to Silicon Valley and became a product planner at the IBM Almaden Research Center. There, he convinced IBM executives to deploy GML commercially in 1978 as part of IBM's Document Composition Facility product, and it was widely used in business within a few years.
SGML, which was based on both GML and GenCode, was an ISO project on which Goldfarb worked from 1974. Goldfarb eventually became chair of the SGML committee. SGML was first released by ISO as the ISO 8879 standard in October 1986.
troff and nroff
Some early examples of computer markup languages available outside the publishing industry can be found in typesetting tools on Unix systems, such as troff and nroff. In these systems, formatting commands were inserted into the document text so that the typesetting software could format the text according to the editor's specifications. Getting a document printed correctly was a trial-and-error, iterative process. The availability of WYSIWYG ("what you see is what you get") publishing software superseded much of the use of these languages among casual users, although serious publishing work still uses markup to specify the non-visual structure of texts, and WYSIWYG editors now usually save documents in a markup-language-based format.
TeX
Another major publishing standard is TeX, created and refined by Donald Knuth in the 1970s and '80s. TeX concentrates on the detailed layout of text and on font descriptions for typesetting mathematical books. This required Knuth to spend considerable time investigating the art of typesetting. TeX is mainly used in academia, where it is a de facto standard in many disciplines. A TeX macro package known as LaTeX provides a descriptive markup system on top of TeX, and is widely used.
Scribe, GML and SGML
The first language to make a clear distinction between structure and presentation was Scribe, developed by Brian Reid and described in his doctoral thesis in 1980. Scribe was revolutionary in a number of ways, not least in introducing the idea of styles kept separate from the marked-up document, and of a grammar that controls the use of descriptive elements. Scribe influenced the development of Generalized Markup Language (later SGML) and is a direct ancestor of HTML and LaTeX.
In the early 1980s, the idea that markup should focus on the structural aspects of a document and leave the visual presentation of that structure to the interpreter led to the creation of SGML. The language was developed by a committee chaired by Goldfarb. It incorporated ideas from many sources, including Tunnicliffe's project, GenCode. Sharon Adler, Anders Berglund, and James A. Marke were also key members of the SGML committee.
SGML specifies a syntax for including markup in documents, as well as a separate syntax for describing which tags are allowed, and where (the Document Type Definition (DTD) or schema). This allows authors to create and use whatever markup they want, choosing the tags that make the most sense to them and naming them in their own natural languages. SGML is thus properly a meta-language, and many particular markup languages are derived from it. From the late 1980s onward, most substantial new markup languages have been based on SGML, including for example TEI and DocBook. SGML was promulgated as an International Standard by the International Organization for Standardization, ISO 8879, in 1986.
SGML found wide acceptance and was used in fields with very large-scale documentation requirements. However, many found it cumbersome and difficult to learn, a side effect of a design that tried to do too much and to be too flexible. For example, SGML made end tags (or start tags, or even both) optional in certain contexts, because its developers thought markup would be entered manually by overworked support staff who would appreciate saving keystrokes.
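The idea of declaring separately which tags are allowed, and where, can be sketched in Python with a toy content model. This is only a stand-in for a DTD: the element names and the ALLOWED_CHILDREN table are invented, XML syntax is used because the Python standard library has no SGML parser, and a real DTD also constrains order, repetition, and attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: each element maps to the set of child
# elements it may directly contain.
ALLOWED_CHILDREN = {
    "article": {"title", "para"},
    "title": set(),
    "para": {"emph"},
    "emph": set(),
}

def violations(elem):
    """Return (parent, child) pairs that the content model does not permit."""
    found = []
    for child in elem:
        if child.tag not in ALLOWED_CHILDREN.get(elem.tag, set()):
            found.append((elem.tag, child.tag))
        found.extend(violations(child))  # keep checking deeper levels
    return found

good = ET.fromstring("<article><title>T</title><para>a <emph>b</emph></para></article>")
bad = ET.fromstring("<article><para><title>T</title></para></article>")

good_report = violations(good)
bad_report = violations(bad)
```

Because the rules live in a table outside the document, the same checker can validate any document type for which such a table is supplied.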
HTML
In 1989, computer scientist Sir Tim Berners-Lee wrote a memo proposing an Internet-based hypertext system, then specified HTML and wrote the browser and server software in the last part of 1990. The first publicly available description of HTML was a document called "HTML Tags", first mentioned on the Internet by Berners-Lee in late 1991. It describes 18 elements that make up the initial, relatively simple design of HTML. Except for the hyperlink tag, these were strongly influenced by SGMLguid, an in-house SGML-based documentation format at CERN. Eleven of these elements still exist in HTML 4.
Berners-Lee considered HTML to be an SGML application. The Internet Engineering Task Force (IETF) formally defined it as such with the mid-1993 publication of the first proposal for an HTML specification: the Internet-Draft "Hypertext Markup Language (HTML)" by Berners-Lee and Dan Connolly, which included an SGML Document Type Definition to define the grammar. Many of the HTML text elements are found in the 1988 ISO technical report TR 9537, Techniques for using SGML, which in turn covers features of early text formatting languages such as that used by the RUNOFF command, developed in the early 1960s for the CTSS (Compatible Time-Sharing System) operating system. These formatting commands were derived from those used by typesetters to format documents manually. Steven DeRose argues that HTML's use of descriptive markup (and the influence of SGML in particular) was a major factor in the success of the Web, because of the flexibility and extensibility it enabled. HTML became the primary markup language for creating web pages and other information that can be displayed in a web browser, and is quite likely the most used markup language in the world today.
XML
XML (Extensible Markup Language) is a widely used meta-markup language. XML was developed by the World Wide Web Consortium, in a committee created and chaired by Jon Bosak. The primary purpose of XML was to simplify SGML by focusing on a particular problem: documents on the Internet. XML remains a meta-language like SGML, allowing users to create any tags they need (hence "extensible") and then to describe those tags and their permitted uses.
Adoption of XML was helped by the fact that every XML document can be written in such a way that it is also an SGML document, so existing SGML users and software could switch to XML fairly easily. At the same time, XML eliminated many of the more complex and human-oriented features of SGML in order to simplify implementation in environments such as documents and publications. It appeared to strike a happy medium between simplicity and flexibility, and was rapidly adopted for many other uses. XML is now widely used for communicating data between applications.
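The use of XML with user-defined tags to pass data between applications can be sketched with Python's standard xml.etree.ElementTree module. The tag and attribute names (order, item, id, quantity) are invented for this illustration and do not belong to any published vocabulary.

```python
import xml.etree.ElementTree as ET

# Sending side: build a small document using invented tag names.
order = ET.Element("order", {"id": "42"})
item = ET.SubElement(order, "item", {"quantity": "3"})
item.text = "duck feed"
wire = ET.tostring(order, encoding="unicode")  # serialized text to transmit

# Receiving side: parse the text and pull the data back out.
parsed = ET.fromstring(wire)
order_id = parsed.get("id")
received = [(i.text, int(i.get("quantity"))) for i in parsed.findall("item")]
```

Both sides only need to agree on the tag names and their meaning; the parsing itself is generic, which is much of what makes XML convenient for data exchange.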
XHTML
Since January 2000, all W3C Recommendations for HTML have been based on XML rather than SGML, using the abbreviation XHTML (Extensible HyperText Markup Language). The language specification requires that XHTML Web documents be well-formed XML documents. This allows for more rigorous and robust documents while using tags familiar from HTML.
One of the most noticeable differences between HTML and XHTML is the rule that all tags must be closed: empty HTML tags such as <br> must either be closed with a regular end tag or be replaced by a special form, <br /> (the space before the '/' in the tag is optional, but is often used because it allows some pre-XML Web browsers and SGML parsers to accept the tag). Another is that all attribute values in tags must be quoted. Finally, all tag and attribute names in the XHTML namespace must be lowercase to be valid. HTML, on the other hand, is case-insensitive.
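The closing and quoting rules can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree; the helper name is_well_formed is hypothetical, and the check covers XML well-formedness only, not XHTML validity (it would not, for example, reject uppercase tag names).

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment):
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

html_style = "<p>line one<br>line two</p>"      # <br> is never closed
xhtml_style = "<p>line one<br />line two</p>"   # self-closing form
unquoted = "<p class=note>text</p>"             # attribute value not quoted
quoted = '<p class="note">text</p>'             # attribute value quoted

results = {
    "html_style": is_well_formed(html_style),
    "xhtml_style": is_well_formed(xhtml_style),
    "unquoted": is_well_formed(unquoted),
    "quoted": is_well_formed(quoted),
}
```

Only the fragments that follow the XHTML conventions pass, while the HTML-style fragments are rejected by the XML parser.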
Other XML-based applications
Many XML-based applications now exist, including the Resource Description Framework as RDF/XML, XForms, DocBook, SOAP, and the Web Ontology Language (OWL). For a partial list of these, see List of XML markup languages.
Features of markup languages
A common feature of many markup languages is that they intermix the text of a document with markup instructions in the same data stream or file. This is not necessary; it is possible to isolate the markup from the text content, using pointers, offsets, IDs, or other methods to coordinate the two. Such "standoff markup" is typical of the internal representations that programs use to work with marked-up documents. However, embedded or "inline" markup is much more common elsewhere. Consider, for example, a short piece of text marked up in HTML, consisting of a heading tagged h1 followed by a paragraph tagged p that contains a phrase tagged em.
The codes enclosed in angle brackets <like this> are markup instructions (known as tags), while the text between these instructions is the actual text of the document. The codes h1, p, and em are examples of semantic markup, because they describe the intended purpose or meaning of the text they enclose. Specifically, h1 means "this is a first-level heading", p means "this is a paragraph", and em means "this is an emphasized word or phrase". A program interpreting such structural markup may apply its own rules or styles for presenting the various pieces of text, using different typefaces, boldness, font size, indentation, color, or other styles, as desired. Text tagged "h1" (heading level 1) might, for example, be presented in a large bold sans-serif typeface, or in a monospaced (typewriter-style) document might be underlined, or might not change the presentation at all.
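How a program tells tags apart from document text can be sketched with Python's standard html.parser module. The HTML fragment and the class name TagTextSplitter are made up for this illustration.

```python
from html.parser import HTMLParser

# Hypothetical fragment: a heading, a paragraph, and an emphasized word.
FRAGMENT = "<h1>Ducks</h1><p>Ducks are <em>not</em> geese.</p>"

class TagTextSplitter(HTMLParser):
    """Collect start-tag names and document text separately."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)   # markup instruction

    def handle_data(self, data):
        self.text.append(data)  # actual document text

splitter = TagTextSplitter()
splitter.feed(FRAGMENT)
splitter.close()

start_tags = splitter.tags
plain_text = "".join(splitter.text)
```

The tag names describe what each piece of text is; deciding how a heading or an emphasized word should look is left entirely to whatever program consumes them.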
By contrast, the i tag in HTML is an example of presentational markup; it is generally used to specify a particular characteristic of the text (in this case, the use of an italic typeface) without specifying the reason for that appearance.
The Text Encoding Initiative (TEI) has published extensive guidelines for encoding texts of interest in the humanities and social sciences, developed through years of international cooperative work. These guidelines are used by projects that encode historical documents, the works of particular scholars, periods, or genres, and so on.
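The standoff alternative mentioned at the start of this section, where markup is kept apart from the text and tied to it by offsets, can be sketched in Python as follows. The annotation format (start, end, tag) with half-open character offsets is an assumption made for this illustration, and the function assumes the annotated spans are properly nested and non-empty.

```python
TEXT = "Ducks are not geese."

# Standoff annotations stored apart from the text:
# (start offset, end offset, tag name), end offset exclusive.
ANNOTATIONS = [(0, 20, "p"), (10, 13, "em")]

def to_inline(text, annotations):
    """Merge standoff annotations into inline tags (assumes nested spans)."""
    opens, closes = {}, {}
    # At a shared start offset, the longer (outer) span opens first.
    for start, end, tag in sorted(annotations, key=lambda a: (a[0], -a[1])):
        opens.setdefault(start, []).append("<" + tag + ">")
    # At a shared end offset, the shorter (inner) span closes first.
    for start, end, tag in sorted(annotations, key=lambda a: (a[1], -a[0])):
        closes.setdefault(end, []).append("</" + tag + ">")
    out = []
    for i in range(len(text) + 1):
        out.extend(closes.get(i, []))  # close spans that end here
        out.extend(opens.get(i, []))   # then open spans that start here
        if i < len(text):
            out.append(text[i])
    return "".join(out)

inline = to_inline(TEXT, ANNOTATIONS)
```

The text itself is never modified by the annotations, so several independent annotation sets can refer to the same text; the inline form is produced only when needed.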
Alternative usage
While the idea of markup languages originated with text documents, markup languages are increasingly used to present other types of information, including playlists, vector graphics, web services, content syndication, and user interfaces. Most of these are XML applications, because XML is a well-defined and extensible language.
The use of XML has also made it possible to combine multiple markup languages into a single profile, such as XHTML+SMIL and XHTML+MathML+SVG.
Because markup languages, and more generally data description languages (not necessarily textual markup), are not programming languages (they are data without instructions), they are easier to manipulate than programming languages. For example, web pages are presented as HTML documents rather than C code, and thus can be embedded within other web pages, displayed when only partially received, and so on. This leads to the web design principle of the rule of least power, which advocates using the least (computationally) powerful language that satisfies a task, in order to facilitate such manipulation and reuse.
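One consequence of a page being data instead of a program is that other software can inspect it without executing anything. As a minimal sketch, the following Python code lists the hyperlinks in a page using the standard html.parser module; the page content, the example.org URLs, and the class name LinkCollector are made up for this illustration.

```python
from html.parser import HTMLParser

# Hypothetical page content.
PAGE = (
    '<p>See <a href="https://example.org/a">A</a> and '
    '<a href="https://example.org/b">B</a>.</p>'
)

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

collector = LinkCollector()
collector.feed(PAGE)
collector.close()
links = collector.links
```

Extracting the same information from a page expressed as general-purpose program code would require running or analyzing that code, which is the kind of difficulty the rule of least power is meant to avoid.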
See also
- Comparison of document markup languages
- Curl (programming language)
- List of markup languages
- Markdown
- ReStructuredText
- Programming language
- Style language
Source of the article : Wikipedia