ADVANCING THE IDEAS OF WORLD-WIDE-WEB : HYPER-G
H. Maurer
Institute for Information Processing and
Computer Supported New Media,
Graz University of Technology, Graz/Austria
EXTENDED ABSTRACT
WWW (World Wide Web) [Berners-Lee 92] has become the most successful networked
multimedia system employing the hypertext paradigm over the last few years:
Documents consisting of textual information can have embedded pictures, movies
or audioclips and reside on a s e r v e r , accessible e.g. via internet using
suitable v i e w e r s. The only structuring mechanism for sets of documents is
the facility to place -- in hypertext fashion -- anchors in documents leading
to (linking them to) other documents. Although this mechanism can be used to
create menu-like hierarchical structures, WWW databases are basically "flat"
(stratified) collections of documents linked together. Thus, a WWW database can
be seen as a graph whose nodes are the documents and whose edges are the links
between them. WWW has become an
easy to use tool, mainly for small and medium-scale multimedia presentations
that are accessible world-wide due to the excellent M o s a i c - viewer that
is available on all major platforms: X-Windows, Mac and MS-Windows.
However, WWW has a number of limitations that become apparent once tasks more
complex than "a few hundred multimedia screen applications" are considered. No
full-text search is provided as part of the WWW server, let alone the
possibility to search across boundaries of WWW servers; authorisation features
are lacking, hence the installation of a number of independent WWW servers
within an organisation is not uncommon, for the simple reason of preventing
access by unauthorised groups. This fragmentation prevents more global
searches: thus, one of the main aims - to tie information together - gets lost
due to the lack of authorisation features and the boundaries imposed by each
WWW server.
To overcome such weaknesses, WWW offers an ingenious way out: it allows
arbitrary application programs to be started, thus letting users link into
other databases,
employ complex search algorithms or activate any other program desirable on top
of WWW servers. This great flexibility is achieved, however, at a big cost: the
uniformity of the interface disappears and different WWW servers start to behave
differently. The whole "jungle" of scattered databases, each with a different
feel, as we all know it from the Internet, starts to reappear, now on the level
of WWW.
Realizing this dilemma, a group of some 30 researchers and developers at the
Graz University of Technology has started to systematically examine the ideas,
structure and underlying features of large distributed multimedia servers,
leading, eventually, to a concept embracing WWW yet more general than WWW:
Hyper-G. Hyper-G has been developed carefully to ensure cross-operability with
WWW. Hyper-G databases have gateways to WWW (and Gopher [Alberti 92]), and
conversely; the Hyper-G viewers A m a d e u s (for MS-Windows) and H a r m o n y
(for X-Windows) will allow the perusal of WWW databases; and the Mosaic Viewers
of WWW can be used to access Hyper-G servers. (For more details on Hyper-G see
[Andrews 94a], [Andrews 94b], [Hyper-G 94], [Kappe 93a], and [Kappe 93b].)
The main differences between WWW and Hyper-G servers are that Hyper-G has much
functionality integrated into it (and hence uniform in nature) that has to
be implemented on top of WWW (and hence potentially differs from site to site),
and that Hyper-G servers work on a truly distributed platform: a user can
activate a number of Hyper-G servers (that may or may not be arbitrarily
geographically dispersed) in such a fashion that the union of all the databases
involved appears as if it were one single database. Indeed, the Hyper-G concept
is a bit deeper and more general:
The basic item of a Hyper-G database is a document cluster rather than a single
document: this is a convenient tool to handle such diverse features as multiple
languages, multiple windows or multiple representations. A typical example of
the latter is the treatment of LaTeX documents in WWW (Mosaic) and Hyper-G: in
Hyper-G the basic idea is to store LaTeX documents as a cluster of two
documents: one of them is a textual document with in-line pictures for all
formulae (this is the approach taken in WWW/Mosaic), the other document is the
DVI File corresponding to the LaTeX document containing links to e.g. pictures,
i.e. a file retaining all the precision and beauty of the original LaTeX
version. For casual browsing on a medium-resolution screen the first
alternative is the only viable one; for serious viewing (or printing on a laser
printer) the second one (not available in WWW/Mosaic without some contortions)
is the only one that makes sense.
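The document-cluster idea can be pictured with a small sketch. The following
Python is only an illustration, not Hyper-G's actual data model: the class
names, the media-type labels and the selection policy in `pick` are all
assumptions made for this example, chosen to mirror the LaTeX case above.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str
    media_type: str  # hypothetical labels: "text+inline-images" or "dvi"


@dataclass
class DocumentCluster:
    name: str
    members: list = field(default_factory=list)

    def pick(self, purpose):
        """Return the representation suited to the purpose.

        Assumed policy, following the LaTeX example in the text:
        "browse" prefers the text version with in-line formula images,
        "print" prefers the DVI file.
        """
        preferred = {"browse": "text+inline-images", "print": "dvi"}[purpose]
        for doc in self.members:
            if doc.media_type == preferred:
                return doc
        # Fall back to the first member if the preferred form is missing.
        return self.members[0]


paper = DocumentCluster("paper", [
    Document("Paper (text with formula images)", "text+inline-images"),
    Document("Paper (DVI)", "dvi"),
])
```

Here `paper.pick("browse")` yields the text representation and
`paper.pick("print")` the DVI file; the same pattern would extend to clusters
holding one document per language.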
Document clusters in Hyper-G are put together in so-called
c o l l e c t i o n s,
and a collection can be part of one (or more) parent collections. Thus,
Hyper-G structures its documents into a kind of hierarchy (actually not a tree,
but a DAG). This is useful for many reasons: documents can be inserted without
defining links (not possible in WWW: a document without a link to it is not
(properly) accessible in WWW; it is accessible in Hyper-G, however, due to the
collection structure); collections (and document clusters) can have attributes,
allowing Boolean searches on those attributes; and although Hyper-G provides
the full anchor-link hypertext paradigm it also allows (Boolean or WAIS-like)
full text searches within the scope of any number of user-selected collections.
Since each Hyper-G database is a collection, users can activate even
geographically remote Hyper-G databases (better still: arbitrary
sub-collections within them) and perform powerful searches across all of them.
Note that such a facility is rather hard to implement in WWW: although full
text search can be added on top of WWW databases, scope definitions are very
difficult and automatic searches across various WWW databases are next to
impossible. Thus, Hyper-G avoids the danger of independent "WWW-empires", the
"Balkanisation" of databases as Ted Nelson has so aptly called it!
However, it must be clearly recognized that Hyper-G generalises WWW but still
makes full use of WWW and Mosaic facilities.
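The collection hierarchy and the scoped searches described above can be
sketched as follows. This is a minimal illustration under stated assumptions,
not the Hyper-G implementation: the `Collection` class, the in-memory document
store and the simple substring matching are hypothetical, and a real server
would of course be distributed and use a proper full-text index.

```python
class Collection:
    def __init__(self, name):
        self.name = name
        self.children = []   # sub-collections
        self.documents = {}  # title -> full text (assumed in-memory store)

    def add(self, child):
        self.children.append(child)
        return child


def documents_in_scope(scopes):
    """Yield (title, text) for every document reachable from the scopes.

    The hierarchy is a DAG, so a sub-collection with several parents is
    visited only once.
    """
    seen = set()
    stack = list(scopes)
    while stack:
        coll = stack.pop()
        if id(coll) in seen:
            continue
        seen.add(id(coll))
        yield from coll.documents.items()
        stack.extend(coll.children)


def search(scopes, *words):
    """Boolean AND search over the user-selected collections."""
    words = [w.lower() for w in words]
    return sorted(title for title, text in documents_in_scope(scopes)
                  if all(w in text.lower() for w in words))


root = Collection("root")
dept_a = root.add(Collection("Department A"))
dept_b = root.add(Collection("Department B"))
shared = Collection("Shared seminars")
dept_a.add(shared)
dept_b.add(shared)  # two parents: the structure is a DAG, not a tree

dept_a.documents["staff-a"] = "Dr. Example teaches hypermedia"
dept_b.documents["staff-b"] = "Prof. Sample teaches graphics"
shared.documents["seminar"] = "Joint seminar on hypermedia and graphics"
```

Calling `search([root], "hypermedia")` covers everything, while
`search([dept_b], "hypermedia")` restricts the scope to one department and its
sub-collections; passing several collections searches their union.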
It is worthwhile to look at a specific example. Suppose a university has five
different WWW servers operated by five different departments, each department
authorized to modify only its own database, and departments unable or unwilling
to combine the data because of exactly the authorization problems mentioned.
Although this is a workable arrangement, it is not perfect: to find information
on person xxx within that university, each of the five databases has to be
queried. Perhaps not all of them even support full text searching, or they
support it using different mechanisms: thus, the problem "where do I have to
look and how do I do it"
(well-known from the world of Internet and international databases) can arise
using WWW even within a single institution (and even more so if the databases
are spread over various institutions). Using Hyper-G, the five WWW databases
mentioned can be converted into five Hyper-G collections, all of
them belonging to the collection "University yyy". Authorisation to modify data
remains where it is desirable, yet a single full-text search for "Person xxx"
in the collection "University yyy" will reveal all information on person xxx,
if any such information is available. Observe that no manual changes in any of
the WWW databases are necessary; nor is it necessary to abandon the viewer
Mosaic, if users have started to like this beautiful piece of software. On the
other hand, the Harmony viewer (see below) does provide all of Mosaic's
features and a few additional ones (and is available free of charge like
Mosaic), so it may become a welcome addition at some stage ...
To take a larger and maybe more pertinent example: suppose a number of
universities in Germany use Hyper-G, each with a collection "Mathematics". By
defining a collection "Mathematics in Germany" a single search will examine all
sub-collections "Mathematics" automatically, independent of where they are
located geographically.
We believe that the above notion of unions of collections defining scopes for
searches (or other actions!) is essential to prevent a new kind of
fragmentation from occurring on a new level.
Hyper-G was also developed using knowledge of and experience with WWW;
it is thus influenced by WWW and has systematically tried
to stay consistent with WWW without giving up the insights gained in the
meantime:
- Big hypermedia systems must have a structure: a "flat" graph with no
"semantic" meaning of links will not do for large systems.
- Activities in large networked multi-media systems have to be restricted to
scopes definable by the users.
- Activities that are considered central (like searching, structuring,
non-private annotations, etc.) have to be integrated into the basic
system so as to avoid unsystematic proliferation of "unorthogonal" features.
Hyper-G is based on the above premises. More on those and other points (like
mechanisms for gathering statistics, billing and "active" mail) will be
contained in a full version of this paper. However, a few more specific
aspects should be mentioned:
The annotation concept in WWW (actually part of the Mosaic client) allows
"private" annotations that are stored locally. Hyper-G allows authorisation
classes to be defined for annotations, permitting "private", "group" or
"public" annotations.
Links in WWW are restricted to textual anchors, while Hyper-G supports anchors
in arbitrary data-types like pictures or movies. Links in Hyper-G are
bi-directional. Hyper-G introduces a sophisticated authorisation mechanism
defining for each user the rights to read, create links, modify and annotate.
This provides the basis for sophisticated customisation and even CSCW within
Hyper-G. In WWW such facilities, like all other more sophisticated features,
have to be built on top of the system (potentially creating confusion and
incompatibility).
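To make bi-directional links and per-user rights more concrete, here is a
small sketch. It assumes that links are kept in a store separate from the
documents, which the text does not state; the `LinkStore` class, the anchor
strings and the right names are hypothetical and serve only as illustration.

```python
from collections import defaultdict

# The four rights named in the text, under assumed short names.
RIGHTS = {"read", "link", "modify", "annotate"}


class LinkStore:
    def __init__(self, rights_by_user):
        self._rights = rights_by_user        # user -> set of rights
        self._outgoing = defaultdict(list)   # source -> [(anchor, target)]
        self._incoming = defaultdict(list)   # target -> [(source, anchor)]

    def add_link(self, user, source, anchor, target):
        """Record a link in both directions if the user may create links."""
        if "link" not in self._rights.get(user, set()):
            raise PermissionError(f"{user} may not create links")
        self._outgoing[source].append((anchor, target))
        self._incoming[target].append((source, anchor))

    def outgoing(self, doc):
        return list(self._outgoing.get(doc, []))

    def incoming(self, doc):
        return list(self._incoming.get(doc, []))


links = LinkStore({"editor": set(RIGHTS), "guest": {"read"}})
# Anchors are plain descriptions here; they may sit in text, images or movies.
links.add_link("editor", "intro.txt", "text: chars 120-135", "map.gif")
links.add_link("editor", "map.gif", "image: region (40, 30)-(80, 60)", "graz.txt")
links.add_link("editor", "tour.mpg", "movie: frames 200-260", "graz.txt")
```

Because every link is indexed from both ends, `links.incoming("graz.txt")`
lists the documents pointing at a node without scanning all other documents,
while an attempt by `"guest"` to add a link raises `PermissionError`.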
Hyper-G is being used as an information system at a number of universities (Graz
University of Technology and the University of Auckland are two examples); it
has been selected as an information system by major organisations such as ESA
(European Space Agency); it is the basis of a multi-media guidance system used
by large museums or exhibition operators (such as the new Museum of New Zealand,
or the Images of Austria Presentation at EXPO '92 in Sevilla); it is the
platform of one of the most ambitious (30 GByte data) multimedia projects
anywhere (the millennium celebration of Austria); and it is the basis of the
first serious attempt at electronically publishing a high-quality journal in
computer science, J.UCS (Journal of Universal Computer Science): J.UCS is
supported by Springer Pub.Co., has an editorial board of over 100 prominent
computer scientists, and more than 25 universities world-wide have agreed to
act as servers.
Hyper-G, as a late-comer in the field, has been able to profit from and
incorporate experience from earlier projects such as Gopher and WWW. And
despite the fact that Hyper-G will not be officially released before June 30,
94, the above list does show a fairly wide acceptance even of its pre-beta-release
version.
Since Mosaic has been the main driving force for WWW it is worthwhile to
mention that the current X-Windows viewer of Hyper-G will be replaced by
Harmony. (The current Harmony version is available as a development prototype
for functionality tests if specifically requested for such tests; it must
not be considered an operational tool before June 30, 94 [Hyper-G 94].)
Harmony includes all features of Mosaic, plus a graphical browser giving
document-type, history, in-and-out links, dynamic and static environments, and
incorporates a viewer of 3D objects and scenes (including navigation within
them) plus a first attempt at producing 3D "information landscapes".
The MS-Windows Viewer A m a d e u s is available as of May 94. It implements a
subset of Harmony's features on a PC-Windows platform.
Summarizing, WWW has been successful in establishing networked multimedia as a
major option for information systems of the future.
Hyper-G has been built using experiences with WWW and other large-scale
networked multimedia systems, preserving full interoperability with WWW, yet
incorporating all those features into the basic system that have been
universally accepted as indispensable. In this sense, Hyper-G tries to
contribute to a more uniform and controlled environment of the world opened by
WWW.
REFERENCES:
[Alberti 92] Alberti B., Anklesaria F., Lindner P., McCahill M., Torrey D.,
"The Internet Gopher Protocol: A distributed Document Search and
Retrieval Protocol", Available by anonymous ftp from
boombox.micro.umn.edu in directory: pub/gopher/gopher_protocol.
[Andrews 94a] Andrews, K., Kappe, F.: Soaring Through Hyperspace: A Snapshot of
Hyper-G and its Harmony Client; Proc. Eurographics-Multimedia 94, Graz,
June 94; FTP iicm.tu-graz.ac.at in: pub/Hyper-G/papers.
[Andrews 94b] Andrews, K., Kappe, F.: Hyper-G: A New Tool for Distributed
Hypermedia; submitted to ISMM Int. Conf. on Distributed Multimedia
Systems and Applications, Hawaii (1994); anonymous FTP iicm.tu-graz.ac.at
in: pub/Hyper-G/papers
[Berners-Lee 92] Berners-Lee T., Cailliau R., Groff J., Pollermann B.,
"WorldWideWeb: The Information Universe", Electronic Networking:
Research, Applications and Policy 1, 2 (1992), 52-58.
[Hyper-G 94] Reports, Information and SW concerning Hyper-G; anonymous FTP
iicm.tu-graz.ac.at in: pub/Hyper-G.
[Kappe 93a] Kappe, F., Maurer, H., Scherbakov, N.: Hyper-G --
A Universal Hypermedia System; J.EMH (Journal of Educational Multimedia
and Hypermedia) 2, 1 (1993), 39-66
[Kappe 93b] Kappe, F., Maurer, H.: Hyper-G: A Large Universal Hypermedia
System and Some Spin-offs; anonymous FTP siggraph.org, in:
publications/May-93-online/Kappe.Maurer