- \chapter{Introduction}
- Nearly eight years after the appearance of the World Wide Web, it remains a difficult medium for transmitting
- mathematics and scientific material, in spite of its success in other areas. Sending mathematics via e-mail or reading
- mathematics into a software package from a web page is not a simple task, which deprives the scientific community of the
- powerful communications tool that the Internet could be. Likewise, displaying mathematics on the Internet in a way that allows
- editing and reuse has until now been impossible.
- As the Internet continues to grow it is becoming ever more important to facilitate the exchange of mathematics amongst
- users and computer algebra software packages, offering automatic processing of expressions, searching, editing and reuse.
- To overcome these difficulties, various companies and societies have joined together to produce standards for representing mathematics whilst
- preserving mathematical meaning. The World Wide Web Consortium\index{World Wide Web Consortium}~\cite{w3c} and the OpenMath\index{OpenMath
- Society} society~\cite{openmath} have developed the two leading standards currently receiving most attention. These are MathML\index{MathML}
- \cite{mathml} and OpenMath\index{OpenMath} \cite{openmathspec} respectively.
- The chief purpose of OpenMath\index{OpenMath} is to facilitate consistent communication of mathematics between
- mathematical applications. MathML\index{MathML}, however, concentrates on displaying mathematics on the web whilst
- maintaining its meaning. The two standards are complementary and, used together, can expand our
- ability to represent, encode and successfully communicate mathematical ideas across the Internet.
- The primary aim of this project is to understand the differences and similarities between OpenMath\index{OpenMath} and
- MathML\index{MathML}, to assess their exchangeability and develop a way of mapping one standard to the other. The main
- objective will be to ultimately design and implement an interface running on REDUCE\index{REDUCE} which will translate
- OpenMath\index{OpenMath} into MathML\index{MathML} and vice versa. This interface will provide REDUCE\index{REDUCE} with
- the capability of exchanging mathematics with other applications as well as displaying output on the World Wide Web and
- reading from it, allowing REDUCE to join the MathML/OpenMath trend.
- \chapter{Literature Review}
- The notation of mathematics has constantly evolved with the appearance of new concepts and ideas. Modern mathematical
- notation is the result of centuries of refinement. As a result, the sophisticated symbols with which we write
- mathematics pose certain problems when they are brought onto the printed page. Publishing mathematics is difficult simply
- because mathematical notation does not lend itself easily to typesetting.
- Recently, the advances in Internet publishing, following the Internet expansion, have added a new dimension to
- mathematical publishing. New problems as well as new requirements must be dealt with. We want the Internet not only to be
- a medium for displaying mathematics around the world, but also a communications tool for transmitting it.
- How can we ensure that mathematics published on a web page is reusable and editable? The output of one application should
- be displayed on the Internet in a way humans can understand and other applications can reuse. But because there is a
- distinction between presenting mathematical objects, and transmitting their content, merging both into one notation to
- achieve this duality is a non-trivial task.
- In order to fully understand the motivations of this project, and to appreciate its outcome, it is important to
- set out the related issues carefully. We will look into the development of mathematical publishing and how it has
- evolved with the growth of the Internet. This will permit us to better understand the need for mathematical representation
- standards such as MathML\index{MathML} and OpenMath\index{OpenMath} which we shall introduce. Finally we will talk about
- the relation between these standards, the existing software supporting them, and their future.
- With such an overview of the current situation, the necessity of a MathML\index{MathML} to OpenMath\index{OpenMath}
- interface for REDUCE\index{REDUCE} will become clear.
- \section{Mathematical Publishing}
- Before the foundation of the World Wide Web, encoding of mathematical documents was already a widespread practice. Back in
- the days when computers were starting to become popular, the ASCII\index{ASCII} character set (and encodings based on it)
- was the only widely available encoding scheme. The restrictions of such a limited symbol set were soon apparent.
- In the late 1970s, Donald Knuth developed \TeX\index{\TeX}, from which variants such as \LaTeX\index{\LaTeX} stemmed. Layout and
- typesetting of mathematics is extremely demanding, and \TeX\index{\TeX} has addressed
- these difficulties successfully, appealing to the scientific community, which has made it a standard in scientific
- publishing. \TeX\index{\TeX} has become the tool of choice for producing scientific and mathematical documents.
- Despite its widespread use and the ease with which it is authored, \TeX\index{\TeX} does not preserve mathematical semantic
- value, making it impractical for use in web documents and useless for transmission between applications. \TeX\index{\TeX}
- is only concerned with describing the presentation of mathematics, not the content. Because people are interested in
- transmitting their ideas and research via e-mail or web pages it is fundamental that semantic value is kept.
- While \TeX\index{\TeX} is mainly a UNIX-based application, PC applications dealing with mathematical encoding have also emerged. Generally these
- are equipped with a graphical user interface, making them easier to use: Design Science\index{Design Science}'s MS Word Equation Editor,
- FrameMaker\index{FrameMaker}, WordPerfect\index{WordPerfect} and ScientificWord\index{ScientificWord}, to name a few examples. All these
- applications\footnote{It is worth noting that PC applications have not had the same success as \TeX\index{\TeX}.} deal only with displaying
- mathematics and ignore semantic value. They are usually vendor-specific, making them impractical for use in mathematical web publishing.
- \section{Mathematics and the Internet Challenge}
- \subsection{Html and Mathematics}
- In the early 1990s, the World Wide Web Consortium\index{World Wide Web Consortium}'s Html\index{Html} became the
- standard markup language for publishing on the World Wide Web. It has since evolved and has become an extensible and very
- powerful means of representing interactive Internet documents. For representing mathematics, however, Html offers
- little support.
- In the first versions of Html\index{Html}, no support for mathematics was included. It was not until 1993 that the first
- attempt at embedding mathematics within Internet documents was made, in the Html+\index{Html!Html+} draft \cite{htmlp}
- presented by the World Wide Web Consortium\index{World Wide Web Consortium}. Equations were represented directly in
- Html+\index{Html!Html+} using an SGML\index{SGML} \cite{sgml} based notation, inspired by \LaTeX's\index{\LaTeX} approach.
- In 1994, the World Wide Web Consortium\index{World Wide Web Consortium} went further in the Internet publishing of mathematics by
- presenting the Html 3.0\index{Html!Html 3.0} draft \cite{html3} (which was later officially published as the Html
- 3.2\index{Html!Html 3.2} \cite{html3.2} specification with a few modifications), which offered more comprehensive support.
- They claimed {\it ``Html math is powerful enough to describe the range of math expressions you can create in common word
- processing packages, as well as being suitable for rendering to speech.''}
- Nonetheless, both drafts failed because of a lack of interest from popular browser vendors. But even though the mathematical
- ideas in the Html 3.2\index{Html!Html 3.2} specification were never fully deployed, people started thinking more carefully
- about mathematics and how it could be represented on the WWW.
- In the meantime, while the World Wide Web Consortium\index{World Wide Web Consortium} and other societies continued
- working on developing mathematical support for Internet documents, other solutions to transmitting mathematics on the web
- arose. The lack of a standard approach to uniformly represent mathematics on the Internet pushed mathematicians and
- scientists to use a variety of different techniques to achieve this purpose. Let us give a brief overview of the main
- ones.
-
- \subsection{Embedded Graphics}
- One way of displaying mathematics on the web is to embed graphics inside Html documents. Mathematical
- equations are represented by graphical images (e.g. GIFs), which all browsers display without difficulty. Formulae can be
- viewed in their original rendering, without the browser requiring additional fonts or external viewing programs.
- Nevertheless, these images have a low resolution, and printing them results in poor quality documents. There are also
- problems with alignment and sizing. Because graphical images are generally slow to download, documents might take more
- time than desired to be rendered. Since we are only dealing with images, the equations cannot be edited or
- modified. For the same reason they are not reusable, because semantic value is completely lost.
- This method is widespread but not well regarded. In the Html 3.0\index{Html!Html 3.0} draft, the World Wide Web
- Consortium\index{World Wide Web Consortium} specifically states its intention of helping users avoid the use of inline
- images to display equations.
- This is the approach used by programs such as \LaTeX\index{\LaTeX}2Html \cite{latex2html} or \TeX\index{\TeX}4ht
- \cite{tex4ht}, which convert \LaTeX\index{\LaTeX} and \TeX\index{\TeX} documents to Html\index{Html} format for direct publication on the
- Internet. \LaTeX\index{\LaTeX} markup is translated into Html while mathematical equations are converted into graphical
- images. It is worth noting, however, that there exist programs such as TtM\index{TtM} \cite{TtM} which translate the
- mathematical sections directly into MathML\index{MathML} presentation markup\index{MathML!presentation markup}.
- \subsection{Graphical Page Display}
- Another way of approaching the problem is by using graphical page displays. The page is rendered into a page-description
- language such as PostScript\index{postscript} or PDF\index{PDF}. Internet browsers, aided by an external viewer or plug-in,
- can then display the page in its entirety, including any mathematical formulae within it. When using this method,
- documents are displayed with exactly the same layout as the original documents, which could be \TeX\index{\TeX} documents
- for instance. The printing resolution is also maintained at a high quality level.
- But using an external viewer or plug-in requires every reader to have a copy installed. A viewer also requires a verbose and large
- file format including all the non-standard fonts used. Just as with the embedded graphics display, any
- mathematics contained within these documents loses its semantic value, as well as the possibility of editing or modifying
- it.
- \section{OpenMath\index{OpenMath} and MathML\index{MathML}}
- These interim solutions have only highlighted the need for a consistent, standardized methodology for the
- transmission of mathematics via the World Wide Web. In view of the failure of existing methods, the significance of MathML and OpenMath\footnote{Describing these
- standards in detail is not in the scope of this report. We do encourage the reader to have a careful read through both standard specifications
- \cite{openmath}\cite{mathml} in order to better understand this report and its implications.} has increased. The two standards
- are complementary yet serve different purposes.
- The primary aim of OpenMath\index{OpenMath} is to facilitate reliable communication of mathematical objects between mathematical applications. It
- ensures semantic content is preserved within the notation. The semantic scope of OpenMath\index{OpenMath} is defined within its content
- dictionaries\index{content dictionaries} (CDs), which describe every symbol used and define its semantic value. Related symbols and functions are
- grouped into CD groups. It is expected that applications using OpenMath\index{OpenMath} declare which CD groups they understand.
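- As a minimal sketch of what such an object looks like (an illustrative example of our own, assuming the XML encoding of
- OpenMath and the \texttt{power} symbol of the \texttt{arith1} content dictionary), the expression $x^2$ could be written as:
- \begin{verbatim}
- <OMOBJ>
-   <OMA>
-     <OMS cd="arith1" name="power"/>
-     <OMV name="x"/>
-     <OMI>2</OMI>
-   </OMA>
- </OMOBJ>
- \end{verbatim}
- Here the \texttt{cd} attribute names the content dictionary in which the meaning of the symbol is defined, so a receiving
- application only needs to understand that dictionary in order to interpret the object.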
- MathML\index{MathML}, however, is World Wide Web oriented in that it seeks to display mathematics on web pages.
- MathML\index{MathML} has two kinds of markup that can be combined, one encoding mathematical notation (presentation
- markup\index{MathML!presentation markup}) and the other encoding mathematical meaning (content markup\index{content
- markup}). Together they allow authors to encode both the notation which represents a mathematical object and the
- mathematical structure of the object itself. Moreover, authors can mix both kinds of encoding in order to specify both the
- presentation and content of a mathematical idea.
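- To make the distinction concrete, the following sketch (an illustrative example of our own, using standard MathML elements)
- encodes $x^2$ first in presentation markup, which only states that a 2 is typeset as a superscript of $x$, and then in
- content markup, which states that the power operator is applied to $x$ and 2:
- \begin{verbatim}
- <!-- presentation markup: how the expression looks -->
- <msup>
-   <mi>x</mi>
-   <mn>2</mn>
- </msup>
-
- <!-- content markup: what the expression means -->
- <apply>
-   <power/>
-   <ci>x</ci>
-   <cn>2</cn>
- </apply>
- \end{verbatim}
- A renderer can display either form, but only the second tells a computer algebra system unambiguously which operation is
- intended.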
- In fact there are strong links between both recommendations. The communities developing both standards are closely
- related, with some members belonging to both groups. This has resulted in both standards superseding each other in some
- areas.
- The {\it core} OpenMath\index{OpenMath} CD group is the principal CD group. The {\it core} CD group was designed based on
- MathML\index{MathML!MathML 1.0} 1.0, extending the set of symbols covered by MathML\index{MathML!MathML 1.0} 1.0. Its
- intention is not to be very specific, only covering everyday and K-12 (kindergarten to high school level) mathematics just
- as MathML\index{MathML} does.
- For completeness, a MathML\index{MathML} CD group was introduced in the OpenMath\index{OpenMath} standard. It is a subset
- of the {\it core} CD group and has the same semantic scope as do the content elements of MathML\index{MathML}. It is
- expected that most applications will understand the {\it core} CD group, automatically understanding the
- MathML\index{MathML} CD group.
- The recently published MathML\index{MathML!MathML 2.0} 2.0 version has incorporated elements of the {\it core}
- OpenMath\index{OpenMath} CD group which were not previously in MathML\index{MathML!MathML 1.0} 1.0. But in order to keep the
- scope of content markup\index{content markup} down to a reasonable size, the designers of MathML\index{MathML} have
- restricted the mathematics that it attempts to cover to high school level mathematics, limiting MathML\index{MathML}'s
- ability to convey mathematical meaning. Because OpenMath\index{OpenMath} is more powerful in this respect, the designers
- of MathML\index{MathML} have introduced means allowing for extensibility. It is possible to encode semantic information
- inside MathML by embedding OpenMath\index{OpenMath} objects within MathML\index{MathML} code.
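- A minimal sketch of such an embedding, assuming the \texttt{semantics} and \texttt{annotation-xml} elements of MathML 2.0
- and, for illustration only, the \texttt{arith1} content dictionary, might attach an OpenMath object to the MathML
- encoding of $x^2$ as follows:
- \begin{verbatim}
- <semantics>
-   <apply>
-     <power/>
-     <ci>x</ci>
-     <cn>2</cn>
-   </apply>
-   <annotation-xml encoding="OpenMath">
-     <OMOBJ>
-       <OMA>
-         <OMS cd="arith1" name="power"/>
-         <OMV name="x"/>
-         <OMI>2</OMI>
-       </OMA>
-     </OMOBJ>
-   </annotation-xml>
- </semantics>
- \end{verbatim}
- The first child of \texttt{semantics} is the MathML expression itself, and the annotation carries the OpenMath object that
- fixes its meaning. The same mechanism can be used for symbols that have no MathML content element at all.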
- This demonstrates the close ties existing between the World Wide Web Consortium\index{World Wide Web Consortium} and
- the OpenMath\index{OpenMath Society} society. In the MathML\index{MathML!MathML 2.0} 2.0 specification one can read: {\it
- ``The MathML\index{MathML} content elements are heavily indebted to the OpenMath\index{OpenMath} project \ldots''}
- \section{Current Support}
-
- Both standards have received considerable attention, and have mobilized many developers. Support for MathML\footnote{For a comprehensive list of software supporting MathML, see the W3C web site~\cite{w3c}.}
- \index{MathML}
- and OpenMath\index{OpenMath} is being introduced in many areas now that a future for them seems to be taking shape.
- The dominance of Java\index{Java} on the Internet today has made it a good candidate for offering a solution to the
- problem of publishing mathematics. The flexibility and power of Java\index{Java} applets can be used in conjunction with
- MathML or OpenMath to display mathematical formulae.
- This approach is currently best represented by WebEQ\index{WebEQ} \cite{webeq}. WebEQ\index{WebEQ} is a collection of programs and Java\index{Java}
- programming libraries dealing with all aspects of putting math on the Web. Because WebEQ\index{WebEQ} is based on MathML\index{MathML},
- WebEQ\index{WebEQ} tools can easily be combined with each other and with other MathML\index{MathML} software to accomplish a wide range of tasks.
- The applet takes a representation of an equation as input, and displays it. The representation has to be some markup language which the applet
- supports (MathML\index{MathML} or Web\TeX\index{WebTeX}). Another Java\index{Java} application is ICEBrowser \cite{ice}, a browser component
- written in Java\index{Java} which renders MathML\index{MathML}.
- By using a Java\index{Java} applet we encounter the same difficulties as when using embedded graphics. In addition to
- this, Java\index{Java} applets have a larger initial download overhead, which can be disturbing to some users.
- Java\index{Java} applets usually offer good equation displays, but different vendors supply different solutions and markup
- languages.
- Another set of applications currently offering MathML support are plug-ins. The main distinction in principle between
- using plug-ins or Java\index{Java} applets is that plug-ins need to be pre-installed on the Internet browser for any
- rendering to take place. IBM\index{IBM} Techexplorer\index{TechExplorer} \cite{ibm} is a representative example under
- development. It currently supports MathML\index{MathML} encodings. IBM\index{IBM}'s approach to the problem is definitely
- close to the solution the scientific community is hoping to see. Techexplorer can display MathML\index{MathML} and the
- quality of display is acceptable. Hopefully, IBM\index{IBM}'s Techexplorer initiative will push other browser vendors and
- companies to adopt MathML\index{MathML} as the leading standard.
- But as with the other temporary solutions, plug-ins also have their limitations.
- Plug-ins have trouble getting the current HTML document font size, changing the size of the window to fit the display, or getting the current HTML document background color. Plug-ins such as IBM\index{IBM}'s are not
- yet widespread, and most people are not familiar with plug-in download and installation.
- In the area of computer algebra, many packages should soon have interfaces to both standards. Examples
- are the MathML\index{MathML} to REDUCE\index{REDUCE} interface available in REDUCE\index{REDUCE} 3.7 and the MathML
- interface built into Mathematica Version 4.
- Various programs convert \LaTeX~documents into MathML. This is important because of the large number of documents written
- in \LaTeX\index{\LaTeX} until now. An example of a program accomplishing this task is TtM\index{TtM} \cite{TtM}.
- Various equation editors such as MathType or Design Science\index{Design Science}'s MS equation editor also support
- MathML\index{MathML}. They manipulate expressions and offer easy to use graphical user interfaces. It is possible to
- export equations to MathML format.
- Until now, however, neither Explorer\index{Explorer} nor Netscape\index{Netscape} has incorporated support for
- MathML\index{MathML}, although both have committed themselves to doing so in the near future. Because these are the most
- popular browsers, it is important that they soon provide MathML\index{MathML} facilities in order to boost the use of
- MathML\index{MathML}.
- \newpage
- \section{The Future}
- \begin{quotation}
- \emph{``While many in the mathematical and scientific community have already adopted \LaTeX~as the standard for writing
- papers, it appears that MathML\index{MathML} is the future of scientific and mathematical notation on the Web.''} Bob
- Henshaw, UNC.
- \end{quotation}
- Regardless of how efficient MathML\index{MathML} and OpenMath are in transmitting and displaying mathematics, it is clear
- that they will only be of any use if all communities adopt them. It is expected, however, that most popular software companies
- working on the Internet or on computer algebra packages will soon support MathML and OpenMath. It seems that MathML and
- OpenMath will receive the necessary support, given the commitment that various big companies have already shown
- (IBM\index{IBM}, Netscape\index{Netscape}, Microsoft\index{Microsoft}, Wolfram\index{Wolfram}, Design Science\index{Design
- Science}, and many others).
- At the moment some browsers have already implemented MathML\index{MathML} rendering facilities (Amaya\index{Amaya} for
- instance), and soon other bigger browser vendors will join the trend. Mozilla has recently released its latest browser
- which does render MathML. Netscape should follow soon with Navigator5\index{Netscape!Navigator 5}. Design
- Science\index{Design Science} has released a new version of MathType incorporating various tools for dealing with MathML and OpenMath.
- For those not familiar with Design Science\index{Design Science}, they also make MS Word's equation editor. Other
- companies (mainly Stilo) are developing equation editors with MathML and OpenMath facilities which will soon hit the
- market.
- While substantial progress has been made, there are still areas in which more work is required before MathML can be
- incorporated easily into the Internet. Further improvement in coordination between browsers and embedded elements will be
- necessary. Furthermore, higher printing resolution must be achieved.
- MathML and OpenMath are the first XML\index{XML} based markup languages to appear on the Internet. They will show the power and limitations of XML.
- An example has been set for other specialist areas which also want to benefit from the Internet; areas such as Chemical Engineering or Music are
- using XML to develop representation standards. Both standards have been received enthusiastically and it will surely not take long before they are
- used widely by the scientific community.