What Is the SGML Language?

SGML stands for Standard Generalized Markup Language. It is an ISO standard, "ISO 8879:1986 Information processing – Text and office systems – Standard Generalized Markup Language (SGML)", of which there are three versions.
It can be defined as a standard for defining generalized markup languages for documents. It was developed by the International Organization for Standardization (ISO). HTML was theoretically an example of an SGML-based language until HTML5, which browsers cannot parse as SGML for compatibility reasons. SGML descends from GML, and HTML and XML were in turn derived from it.

SGML (Standard Generalized Markup Language) is a standard for how to specify a document markup language or tag set. Such a specification is itself a document type definition (DTD). SGML is not in itself a document language, but a description of how to specify one. It is metadata.

SGML is based on the idea that documents have structural and other semantic elements that can be described without reference to how such elements should be displayed. The actual display of such a document may vary, depending on the output medium and style preferences. Some advantages of documents based on SGML are:



  • They can be created by thinking in terms of document structure rather than appearance characteristics (which may change over time).
  • They are more portable, because an SGML compiler can interpret any document by reference to its document type definition (DTD).
  • Documents originally intended for the print medium can easily be re-adapted for other media, such as the computer display screen.

The language that Web browsers use, Hypertext Markup Language (HTML), is an example of an SGML-based language. There is a document type definition for HTML (and reading the HTML specification is effectively reading an expanded version of the document type definition). In today's distributed networking environment, many documents are being described with the Extensible Markup Language (XML), which is a data description language (a document can be viewed as a collection of data) that uses SGML principles.

An SGML document may be composed from many entities (discrete pieces of text). In SGML, the entities and element types used in the document may be specified with a DTD, while the different character sets, features, delimiter sets, and keywords are specified in the SGML Declaration to create the concrete syntax of the document.

Although full SGML allows implicit markup and some other kinds of tags, the XML specification states: Each XML document has both a logical and a physical structure. Physically, the document is composed of units called entities. An entity may refer to other entities to cause their inclusion in the document. A document begins in a "root" or document entity. Logically, the document is composed of declarations, elements, comments, character references, and processing instructions, all of which are indicated in the document by explicit markup.
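The logical parts named above can be seen by parsing a small XML document. This is only a rough sketch using Python's standard xml.dom.minidom module; the sample document and the helper name logical_parts are made up for illustration and are not taken from any specification.

```python
from xml.dom import minidom, Node

# Hypothetical sample document, invented for this illustration.
SAMPLE = (
    '<?xml version="1.0"?>'
    '<!-- a comment -->'
    '<?render mode="print"?>'
    '<note id="n1">Tom &amp; Jerry &#33;</note>'
)

NODE_KINDS = {
    Node.ELEMENT_NODE: "element",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
}

def logical_parts(xml_text):
    """Return the kinds of top-level nodes and the root element's text."""
    document = minidom.parseString(xml_text)
    kinds = [NODE_KINDS.get(n.nodeType, "other") for n in document.childNodes]
    root = document.documentElement
    # The entity reference (&amp;) and character reference (&#33;) are
    # already resolved by the parser when the text is read back.
    text = "".join(
        c.data for c in root.childNodes if c.nodeType == Node.TEXT_NODE
    )
    return kinds, text

kinds, text = logical_parts(SAMPLE)
print(kinds)
print(text)
```

Every piece here is marked explicitly in the document text, which is the point of the quoted passage: nothing has to be inferred by the parser.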



For introductory information on a basic, modern SGML syntax, see XML. The following material concentrates on features not in XML and is not a comprehensive summary of SGML syntax.

Optional features

SGML generalizes and supports a wide range of markup languages as found in the mid 1980s. These ranged from terse Wiki-like syntaxes to RTF-like bracketed languages to HTML-like matching-tag languages. SGML did this with a relatively simple default reference concrete syntax, augmented with a large number of optional features that could be enabled in the SGML Declaration.

Not every SGML parser can necessarily process every SGML document. Because each processor's System Declaration can be compared to the document's SGML Declaration, it is always possible to know whether a document is supported by a particular processor.

Many SGML features relate to markup minimization. Other features relate to concurrent (parallel) markup (CONCUR), to linking processing attributes (LINK), and to embedding SGML documents within SGML documents (SUBDOC).
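The comparison between a System Declaration and an SGML Declaration can be pictured as a set comparison. The sketch below is a simplification under stated assumptions: real declarations carry much more than feature names, and the two feature sets and the function name unsupported_features are hypothetical. OMITTAG and SHORTTAG are the names of two markup minimization features.

```python
# Features a (hypothetical) processor supports, as its System Declaration
# might list them, and features a (hypothetical) document enables.
PROCESSOR_FEATURES = {"OMITTAG", "SHORTTAG", "LINK"}
DOCUMENT_FEATURES = {"OMITTAG", "SUBDOC"}

def unsupported_features(document_features, processor_features):
    """Return the features the document needs but the processor lacks."""
    return sorted(set(document_features) - set(processor_features))

missing = unsupported_features(DOCUMENT_FEATURES, PROCESSOR_FEATURES)
if missing:
    print("document not supported, missing features:", missing)
else:
    print("document supported")
```

Because both declarations are explicit, the check can be made before any parsing starts.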

The notion of customizable features was not appropriate for Web use, so one goal of XML was to minimize optional features. However, XML's well-formedness rules cannot support Wiki-like languages, leaving them unstandardized and difficult to integrate with non-text information systems.



Document validity


SGML (ENR+WWW) defines two kinds of validity. According to the revised Terms and Definitions of ISO 8879:

A conforming SGML document must be either a type-valid SGML document, a tag-valid SGML document, or both. Note: A user may wish to enforce additional constraints on a document, such as whether a document instance is integrally-stored or free of entity references.

A type-valid SGML document is defined by the standard as:

An SGML document in which, for each document instance, there is an associated document type declaration (DTD) to whose DTD that instance conforms.

A tag-valid SGML document is defined by the standard as:

An SGML document, all of whose document instances are fully tagged. There need not be a document type declaration associated with any of the instances. Note: If there is a document type declaration, the instance can be parsed with or without reference to it.
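A rough way to get a feel for "fully tagged" is XML well-formedness, used here as a stand-in for SGML tag-validity. The sketch assumes Python's standard xml.etree.ElementTree module, which parses without consulting a DTD; the function name is_fully_tagged and the sample strings are made up for illustration. Checking type-validity would need a validating parser, which this module is not.

```python
import xml.etree.ElementTree as ET

def is_fully_tagged(xml_text):
    """Return True if the text parses with every tag written out explicitly."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

# Every start tag has its end tag, so no grammar is needed to parse this.
print(is_fully_tagged("<memo><to>Ann</to><body>Hi</body></memo>"))  # True

# The <to> element is never closed; without a DTD nothing can imply the tag.
print(is_fully_tagged("<memo><to>Ann<body>Hi</body></memo>"))  # False
```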

Terminology

Tag-validity was introduced in SGML (ENR+WWW) to support XML, which allows documents with no DOCTYPE declaration but which can be parsed without a grammar, or documents which have a DOCTYPE declaration that makes no XML Infoset contributions to the document. The standard calls this fully tagged.

Integrally stored reflects the XML requirement that elements end in the same entity in which they started. Reference-free reflects the HTML requirement that entity references are for special characters and do not contain markup. SGML validity commentary, especially commentary that was made before 1997 or that is unaware of SGML (ENR+WWW), covers type-validity only.

The SGML emphasis on validity supports the requirement for generalized markup that markup should be rigorous.


SGML files commonly use the .sgml extension.

Components of an SGML Document

There are mainly three components of an SGML document. They are:

  • The SGML Declaration.
  • The Prolog (Prologue), containing a DOCTYPE declaration with the various markup declarations that together make a DTD, i.e. a Document Type Definition.
  • The document instance itself, containing one top-most element and its contents.

Advantages

  • It can encode the full structure of a document and can support any media type.
  • It is more useful than HTML, which provides capabilities to code the visual representation rather than the structure of the actual information.
  • It separates content from appearance.
  • SGML encoding allows more complex formatting than HTML.
  • Stylesheets let the same SGML content be used for different purposes.
  • It is extremely flexible.
  • It is well supported, with many tools available because it is an ISO standard.

Disadvantages

  • Writing software that handles SGML can be difficult.
  • SGML tools are expensive.
  • It is not widely used.
  • Special software is required to process or display an SGML document.

Standard versions

  • Original SGML, which was accepted in October 1986, followed by a minor Technical Corrigendum.
  • SGML (ENR), in 1996, resulted from a Technical Corrigendum to add extended naming rules allowing arbitrary-language and -script markup.
  • SGML (ENR+WWW or WebSGML), in 1998, resulted from a Technical Corrigendum to better support XML and WWW requirements.

SGML is part of a trio of enabling ISO standards for electronic documents developed by ISO/IEC:

  • SGML (ISO 8879)
    – Generalized markup language.
    – SGML was reworked in 1998 into XML, a successful profile of SGML. Full SGML is rarely found or used in new projects.
  • DSSSL (ISO/IEC 10179)
    – Document processing and styling language based on Scheme.
    – DSSSL was reworked into W3C XSLT and XSL-FO, which use an XML syntax. Nowadays, DSSSL is rarely used in new projects apart from Linux documentation.
  • HyTime
    – Generalized hypertext and scheduling.
    – HyTime was partially reworked into W3C XLink. HyTime is rarely used in new projects.

SGML is supported by various technical reports, in particular:

  • ISO/IEC TR 9573 – Information processing – SGML support facilities – Techniques for using SGML
    – Part 13: Public entity sets for mathematics and science. In 2007, the W3C MathML working group agreed to assume the maintenance of these entity sets.


History


SGML descended from IBM's Generalized Markup Language (GML), which Charles Goldfarb, Edward Mosher, and Raymond Lorie developed in the 1960s. Goldfarb, editor of the international standard, coined the "GML" term using their surname initials.[5] Goldfarb also wrote the definitive work on SGML syntax in "The SGML Handbook".[6] The syntax of SGML is closer to the COCOA format.

As a document markup language, SGML was originally designed to enable the sharing of machine-readable large-project documents in government, law, and industry. Many such documents must remain readable for several decades, a long time in the information technology field. SGML also was extensively applied by the military, and the aerospace, technical reference, and industrial publishing industries. The advent of the XML profile has made SGML suitable for widespread application for small-scale, general-purpose use.


Syntax

An SGML document may have three parts:

  • the SGML Declaration,
  • the Prologue, containing a DOCTYPE declaration with the various markup declarations that together make a Document Type Definition (DTD), and
  • the instance itself, containing one top-most element and its contents.

Concrete and abstract syntaxes


The usual (default) SGML concrete syntax resembles this example, which is the default HTML concrete syntax:


<QUOTE TYPE="example">  typically something like <ITALICS>this</ITALICS></QUOTE>



SGML provides an abstract syntax that can be implemented in many different types of concrete syntax. Although the markup norm is using angle brackets as start- and end-tag delimiters in an SGML document (per the standard-defined reference concrete syntax), it is possible to use other characters, provided a suitable concrete syntax is defined in the document's SGML declaration. For example, an SGML interpreter might be programmed to parse GML, wherein the tags are delimited with a left colon and a right full stop; thus, an :e prefix denotes an end tag: :xmp.Hello, world:exmp.

According to the reference syntax, letter-case (upper- or lower-) is not distinguished in tag names, thus the three tags (i) <quote>, (ii) <QUOTE>, and (iii) <quOtE> are equivalent. (NOTE: A concrete syntax might change this rule via the NAMECASE NAMING declarations.)
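The idea that the same abstract markup can be written with different delimiters can be sketched in a few lines. This is a toy illustration, not an SGML parser: the tokenize function is hypothetical, it only recognizes bare tags for a known list of element names, and its parameters are loosely named after the delimiter roles for start-tag open, end-tag open, and tag close (STAGO, ETAGO, TAGC). Names are folded to upper case to mimic the default case-insensitivity of tag names.

```python
import re

def tokenize(text, stago, etago, tagc, names):
    """Split text into (kind, value) events using the given delimiters."""
    name_alt = "|".join(re.escape(n) for n in names)
    # Try the end-tag opener first, since it may begin with the start-tag opener.
    pattern = re.compile(
        "(?:(?P<end>{e})|(?P<start>{s}))(?P<name>{n}){c}".format(
            e=re.escape(etago), s=re.escape(stago),
            n=name_alt, c=re.escape(tagc),
        ),
        re.IGNORECASE,
    )
    events, pos = [], 0
    for m in pattern.finditer(text):
        if m.start() > pos:
            events.append(("text", text[pos:m.start()]))
        kind = "end" if m.group("end") is not None else "start"
        events.append((kind, m.group("name").upper()))  # fold case
        pos = m.end()
    if pos < len(text):
        events.append(("text", text[pos:]))
    return events

# Reference-style delimiters, with mixed-case tag names.
reference = tokenize("<xMp>Hello, world</XMP>", "<", "</", ">", ["xmp"])
# GML-style delimiters: left colon, right full stop, ":e" for end tags.
gml = tokenize(":xmp.Hello, world:exmp.", ":", ":e", ".", ["xmp"])
print(reference)
print(gml)
```

Both calls yield the same sequence of start, text, and end events, which is the sense in which the two concrete syntaxes share one abstract syntax.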

Formal characterization


SGML has many features that defied convenient description with the popular formal automata theory and the contemporary parser technology of the 1980s and the 1990s. The standard warns in Annex H:

The SGML model group notation was deliberately designed to resemble the regular expression notation of automata theory, because automata theory provides a theoretical foundation for some aspects of the notion of conformance to a content model. No assumption should be made about the general applicability of automata to content models.

A report on an early implementation of a parser for basic SGML, the Amsterdam SGML Parser, notes that the DTD-grammar in SGML must conform to a notion of unambiguity which closely resembles the LL(1) conditions, and specifies various differences.

There appears to be no definitive classification of full SGML against a known class of formal grammar. Plausible classes may include tree-adjoining grammars and adaptive grammars. XML is described as being generally parsable like a two-level grammar for non-validated XML and a Conway-style pipeline of coroutines (lexer, parser, validator) for valid XML.

The class of documents that conform to a given SGML document grammar forms an LL(1) language. The SGML document grammars by themselves are, however, not LL(1) grammars.

The SGML standard does not define SGML with formal data structures, such as parse trees; however, an SGML document is constructed of a rooted directed acyclic graph (RDAG) of physical storage units known as "entities", which is parsed into an RDAG of structural units known as "elements". The physical graph is loosely characterized as an entity tree, but entities might appear multiple times. Moreover, the structure graph is also loosely characterized as an element tree, but the ID/IDREF markup allows arbitrary arcs.
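The effect of ID/IDREF markup on the element structure can be illustrated with a small XML example. This is a sketch under assumptions: the sample document, the attribute names id and ref, and the element names are invented, and because no DTD is involved the parser itself does not know that these attributes act as ID and IDREF; the code simply treats them that way.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two sections that refer to each other.
doc = ET.fromstring(
    '<doc>'
    '<sec id="s1"><p>See <xref ref="s2"/></p></sec>'
    '<sec id="s2"><p>Back to <xref ref="s1"/></p></sec>'
    '</doc>'
)

# Index every element that carries an identifier.
by_id = {el.get("id"): el for el in doc.iter() if el.get("id")}

# Collect the extra arcs that the references add on top of the element tree.
arcs = []
for sec in doc.iter("sec"):
    for xref in sec.iter("xref"):
        arcs.append((sec.get("id"), xref.get("ref")))

print(arcs)
print(all(target in by_id for _, target in arcs))
```

The two arcs point at each other, forming a cycle, which a pure tree could never contain.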

The results of parsing can also be understood as a data tree in different notations, where the document is the root node, and entities in other notations (text, graphics) are child nodes. SGML provides apparatus for linking to and annotating external non-SGML entities.

The SGML standard describes parsing in terms of maps and recognition modes. Each entity, and each element, can have an associated notation or declared content type, which determines the kinds of references and tags which will be recognized in that entity and element. Also, each element can have an associated delimiter map (and short reference map), which determines which characters are treated as delimiters in context. The SGML standard characterizes parsing as a state machine switching between recognition modes. During parsing, there is a stack of maps that configure the scanner, while the tokenizer relates to the recognition modes.

Parsing involves traversing the dynamically-retrieved entity graph, finding/implying tags and the element structure, and validating those tags against the grammar. An unusual aspect of SGML is that the grammar (DTD) is used both passively, to recognize lexical structures, and actively, to generate missing structures and tags that the DTD has declared optional. End- and start-tags can be omitted, because they can be inferred. Loosely, a series of tags can be omitted only if there is a single, possible path in the grammar to imply them. It was this active use of grammars that made concrete SGML parsing difficult to formally characterize.
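The "active" use of the grammar can be sketched with a toy example that implies omitted end tags. Everything here is an assumption made for illustration: the two-element grammar, the table of omissible end tags, and the function imply_end_tags are hypothetical, and real SGML content models and omission rules are far richer than a set of allowed children.

```python
# Hypothetical toy grammar: which children each element may contain,
# and which elements may have their end tag omitted.
CONTENT = {"LIST": {"ITEM"}, "ITEM": set()}  # ITEM holds only text
OMIT_END = {"ITEM"}

def imply_end_tags(events):
    """Return the event stream with omitted end tags made explicit."""
    out, stack = [], []

    def close_top():
        # An open element may be closed implicitly only if the grammar allows it.
        if stack[-1] not in OMIT_END:
            raise ValueError("end tag for %s may not be omitted" % stack[-1])
        out.append(("end", stack.pop()))

    for kind, value in events:
        if kind == "start":
            # Close open elements that cannot contain the new element.
            while stack and value not in CONTENT[stack[-1]]:
                close_top()
            stack.append(value)
        elif kind == "end":
            # Close anything still open inside the element being ended.
            while stack and stack[-1] != value:
                close_top()
            if not stack:
                raise ValueError("unmatched end tag for %s" % value)
            stack.pop()
        out.append((kind, value))
    while stack:
        close_top()
    return out

# Corresponds to: <LIST><ITEM>one<ITEM>two</LIST>
events = [
    ("start", "LIST"), ("start", "ITEM"), ("text", "one"),
    ("start", "ITEM"), ("text", "two"), ("end", "LIST"),
]
completed = imply_end_tags(events)
for event in completed:
    print(event)
```

In this toy grammar there is only one way to make the input conform, so both missing ITEM end tags can be generated; a grammar offering several possible completions would not allow the omission.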

SGML uses the term validation for both recognition and generation. XML does not use the grammar (DTD) to change delimiter maps or to inform the parse modes, and does not allow tag omission; consequently, XML validation of elements is not active in the sense that SGML validation is active.

SGML without a DTD (e.g. simple XML) is a grammar or a language; SGML with a DTD is a metalanguage. SGML with an SGML declaration is, perhaps, a meta-metalanguage, since it is a metalanguage whose declaration mechanism is a metalanguage.

SGML has an abstract syntax implemented by many possible concrete syntaxes; however, this is not the same usage as in an abstract syntax tree and as in a concrete syntax tree. In the SGML usage, a concrete syntax is a set of specific delimiters, while the abstract syntax is the set of names for the delimiters. The XML Infoset corresponds more to the programming language notion of abstract syntax.

Derivatives

XML

The W3C XML (Extensible Markup Language) is a profile (subset) of SGML designed to ease the implementation of the parser compared to a full SGML parser, primarily for use on the World Wide Web. In addition to disabling many SGML options present in the reference syntax (such as omitting tags and nested subdocuments), XML adds a number of additional restrictions on the kinds of SGML syntax. For example, despite enabling SGML shortened tag forms, XML does not allow unclosed start or end tags. It also relied on many of the additions made by the WebSGML Annex.

XML currently is more widely used than full SGML. XML has lightweight internationalization based on Unicode. Applications of XML include XHTML, XQuery, XSLT, XForms, XPointer, JSP, SVG, RSS, Atom, XML-RPC, RDF/XML, and SOAP.

HTML

While HTML was developed partially independently and in parallel with SGML, its creator, Tim Berners-Lee, intended it to be an application of SGML. The design of HTML (Hyper Text Markup Language) was therefore inspired by SGML tagging, but, since no clear expansion and parsing guidelines were established, most actual HTML documents are not valid SGML documents. Later, HTML was reformulated (version 2.0) to be more of an SGML application; however, the HTML markup language has many legacy- and exception-handling features that differ from SGML's requirements. HTML 4 is an SGML application that fully conforms to ISO 8879 – SGML.

The charter for the 2006 revival of the World Wide Web Consortium HTML Working Group says, "the Group will not assume that an SGML parser is used for 'classic HTML'". Although HTML syntax closely resembles SGML syntax with the default reference concrete syntax, HTML5 abandons any attempt to define HTML as an SGML application, explicitly defining its own parsing rules, which more closely match existing implementations and documents. It does, however, define an alternative XHTML serialization, which conforms to XML and therefore to SGML as well.


OED

The second edition of the Oxford English Dictionary (OED) is entirely marked up with an SGML-based markup language using the LEXX text editor.

The third edition is marked up as XML.

Several modern programming languages support tags as primitive token types, or now support Unicode and regular expression pattern-matching. An example is the Scala programming language.

Applications

Document markup languages defined using SGML are called "applications" by the standard; many pre-XML SGML applications were the proprietary property of the organizations which developed them, and thus unavailable on the World Wide Web. The following list is of pre-XML SGML applications.

Last Words 


I hope you liked the information in this post, and that you will stay connected with us for some time. If you liked the post, share it with your friends and family so that they can benefit along with you. If you want to know or ask something, you can reach us by comment or by mail.


Thanks for giving your time.
