Path: senator-bedfellow.mit.edu!dreaderd!not-for-mail
Message-ID: <internet/info-research-faq/[email protected]>
Supersedes: <internet/info-research-faq/[email protected]>
Expires: 20 Sep 2000 07:49:19 GMT
References: <internet/info-research-faq/[email protected]>
X-Last-Updated: 2000/03/04
Organization: none
From: [email protected] (David Novak)
Newsgroups: alt.internet.research,sci.research,alt.answers,sci.answers,news.answers
Subject: Information Research FAQ v.4.1 (Part 9/9)
Followup-To: poster
Approved: [email protected]
Summary: Information Research FAQ: Resources, Tools & Training
Originator: [email protected]
Date: 07 Aug 2000 07:57:01 GMT
Lines: 1386
NNTP-Posting-Host: penguin-lust.mit.edu
X-Trace: dreaderd 965635021 9446 18.181.0.29
Xref: senator-bedfellow.mit.edu sci.research:20502 alt.answers:50553 sci.answers:11943 news.answers:189427

Archive-name: internet/info-research-faq/part9
Posting-Frequency: monthly
Last-modified: Mar 02 2000
URL: http://spireproject.com
Copyright: (c) 2000 David Novak
Maintainer: David Novak <[email protected]>

                       Information Research FAQ     (Part 9/9)

   This part of the FAQ highlights other aspects of information research.
   Please note the disclaimer statement on Part 1 of this FAQ. The full faq
   rests at SpireProject.com/faq.txt, SpireProject.co.uk/faq.txt and
   Cn.net.au/faq.txt.

   Please forward leads and comments to David (david@cn.net.au).

   Enjoy,
   David Novak - david@cn.net.au
   The Spire Project: SpireProject.com, SpireProject.co.uk and Cn.net.au

                               Contents

               ----- Part 9 -----
   32. Internet Information Theory
         32.1 three definitions of the internet
         32.2 information, transaction, entertainment
         32.3 information formats
         32.4 information preparation
         32.5 publishing motivation
         32.6 promoting information
         32.7 information clumps
         32.8 bringing this together
   33. More on the Commercial Information Sphere
         33.1 structure of the database industry
         33.2 advanced search technologies
         33.3 the art of searching
   34. More on the Information Service Industry
         34.1 the extended information market
         34.2 judging information value
   35. Emerging Trends in the information sphere
         35.1 developing an informative internet?
   36. Question and Answer Section
         36.1 How do I find information on the internet?
         36.2 The squeeze is on the Info-Broker
   37. Acknowledgements
   ___________________________________________________

32. Internet Information Theory

   Let's agree the internet is great fun to surf, but less valuable when you
   have a specific question in mind.

   To improve our search skills, we begin by understanding how information
   is arranged on the internet. Contrary to myth, information is not
   disorganized but rather organized very carefully along clear patterns.
   Many patterns are specific to the information format (text document,
   webpage, email message, printed article). Further patterns match the way
   we become aware of information, or are specific to the information
   systems (mailing list, faq, peer-reviewed journal). Your understanding
   of the strengths and weaknesses of each pattern, each format, each
   system, guides your search for information. We shall start by shattering
   the internet, and commenting on the many pieces.

__ 32.1 three definitions of the internet

   Let us be careful when we use the word 'internet'.

   1_ The internet is a physical network; more than a million computers
   continuously exchanging information. The internet allows us to transfer
   information around the world.

   2_ The internet is a landscape of information available on almost every
   topic imaginable. This information appears almost chaotically
   distributed to the world, but holds clear patterns. For instance,
   structures such as government web links, search engines and FAQ
   documents link information together.

   3_ The internet is a community of 100+ million individuals. These are
   real people who choose to interact, discuss and share information online.

   What we learn here is not so important as the technique - break the
   large seemingly chaotic system into smaller pieces: pieces that
   hopefully make more sense. Eventually, when we've made sense of the
   little bits, perhaps we can comment astutely on the big-picture.

   In this example, let me just draw your attention to the way most of our
   research effort focuses on the second definition: a landscape of
   information. Much of the best information originates in the third
   definition: the internet is a community. Sometimes it is far more
   effective to ask real people than search the information cyberspace.

   Let us now illuminate more important facets of the internet.

__ 32.2 information, transaction, entertainment

   There is a triad of functions to all online activity:

   Function      -  Activity  -      Unit
   ----------------------------------------
   Information   -  Research  -  The Fact or Conclusion
   Exchange      -  Business  -  The Transaction
   Entertainment -  Play      -  The Experience

   Each internet function grows at a different rate and moves in a
   different direction. The development of forums is firmly in the smallest
   segment dealing with information. This segment is quite poorly organized
   and confusing. The entertainment function in contrast is well financed
   and graphically innovative with clear, profitable opportunities.

   Much of the web is prepared with Exchange or Entertainment in mind.
   "Brochureware" (purely promotional webpages) is rarely required for
   research, but is critical to securing a transaction. Entertainment
   related, or just entertaining, websites abound. Let us recognize just
   how few webpages are information & research related.

   My own experience suggests we are just beginning to see movement
   towards profiting from providing information. Direct selling of
   information is still chaotic and unrewarding.

__ 32.3 information formats

   The way information is packaged has a great bearing on the content,
   quality and use of the information. This theme is evident throughout the
   work of The Spire Project, and is particularly applicable to internet
   information. Webpages, text files, software, email and database entries
   each have particular qualities. Each shapes, constrains and restricts
   the informative content. These particular qualities apply irrespective
   of the information involved.

   Books are dense, factual, a little old. Articles are short, sharp, more
   recent. News is puff, introductory, immediate. Each way the information
   is packaged, each format, presents the information to set standards.

   Information formats on the internet are the same. Webpages are
   graphical, technical to produce, and not easily updated. FAQs are easier
   to maintain, text only, and attract more peer review. Mailing lists are
   simpler still, text, short, immediate, very peer-reviewed, characterized
   by discussion and resource discovery. Newsgroups are characterized by
   extremely low costs, but are vulnerable to trashing and poorly managed.
   Email is simple to use and suited to one-to-one discussion.

   Let's look at books more closely. Books are created by authors who have
   something to write. Books are printed and marketed by publishers to the
   bookstores, which then provide them to readers. Each facet of this
   process defines the resource. Books have quality, editorial vetting but
   minimal peer-review, marketable value and a potentially lengthy
   preparation time.

   When it comes to research, why look for a book when investigating
   digital money? Books would just have the wrong qualities - would present
   the information poorly. We need a more current format (digital money is
   a fast moving topic), and a more peer-reviewed format (books have
   editorial vetting, but not intrinsic peer-review). Why not search for a
   mailing list, an FAQ, or an association website? These formats have
   qualities more appropriate to our question.

__ 32.4 information preparation

   Information flows also impress patterns on internet information. Most
   information is transplanted to the web - first created elsewhere. The
   source of information imparts as much pattern as the eventual format the
   information takes.

   Information may appear as a webpage, and conform to our expectations for
   all webpages, but the information may have been prepared from the
   discussion on a mailing list - and thus enjoy a more topical, specific,
   timely and peer-reviewed quality.

   Let's look at FAQs. The best resource in the world on copyright law is
   the musings of a group of copyright lawyers who form the copyright
   mailing list. The copyright FAQ supported by this group is a logical
   document summarizing much of the discussion of this mailing list. FAQs
   are vetted by the news.answers team, then automatically mirrored around
   the world. From its origins in the mailing list, the FAQ is a
   peer-reviewed document, often full of links to further resources,
   topical, knowledgeable and factual. As an FAQ, the document is not
   immediate, graphical or financially rewarding (some FAQs stagnate).

   Only some internet information is created within the internet
   environment. The concept of 'brochureware' describes the traits common
   to promotional webpages prepared directly from paper promotional
   brochures.

   One of the more exciting trends is the movement of information from the
   dusty shelves of government offices and association libraries to their
   more accessible websites. The quality of information retained in your
   average government agency, from quality research reports, to detailed
   studies, to current industry monitoring is very high. These qualities
   are then brought over to the web format. Such web-documents tend to be
   isolated (not linked to other related resources) and perhaps a little
   behind the time line, but of a generally high quality.

   An exciting holistic view of the internet information landscape is based
   on these descriptions. Imagine, for a moment, information flowing
   through a collection of systems. At certain points, information groups
   together, and generates new, perhaps higher quality information, which
   then flows in a different system, a different direction, to different
   people.

   The flow of information from one person to another, from one format to
   another, imprints qualities to the information along the way. Each
   organization, or subsequent re-organization, imparts specific styles and
   conventions and quality to the result.

__ 32.5 publishing motivation

   Let us proceed to a third set of patterns. Information appears on the
   internet for one very specific reason. Someone Publishes (DUH). The
   motivation behind publishing colours the information, creating patterns
   we will use to better search for answers on the web.

   Ask yourself who is publishing, and why.

   One of the biggest publishing segments a year ago was individuals
   publishing documents derived from their personal expertise. A typical
   document would be one with minimal peer review, a list of aging links to
   further resources, simple graphics, variable to short length, prone to
   bias, but moderately reliable because the publisher knows their topic
   well. These pages are often located in private sub-directories on a web
   server (usually starting /~name/).

   Commercial sites publish mainly for the promotional value. Their
   secondary purpose is to provide sales information to prospective
   clients. Rarely do commercial sites go beyond this. Commercial webpages
   often reside on their own domain name, as a .com, or in sub-directories
   - without the tilde symbol. Commercial sites also tend to age badly.
   They are very noticeable from their front page.

   Government agencies are emerging as valued publishers. Slowly their
   dormant information becomes available through this new medium. Currently
   almost all government documents on the internet also appear in print,
   meaning they are factual, exhaustively reviewed, tend to be a little old
   (but age well), and come from highly paid knowledgeable people who
   believe it is their duty to inform others. Such documents are lengthy
   and appear on .gov domains.

   These patterns are simple to see.

   Grant-funded projects create brilliant research resources and hold much
   promise in pushing the limits of this technology. I am eager to see the
   results of the US Patents project, and appreciate the value of having
   Supreme Court rulings on the internet. Often such projects are short on
   money but deeply focused on content. Most projects reside on educational
   servers and are widely discussed within knowledgeable groups.

   Associations publish association-kind-of-things. Most are initially
   just like the commercial webpages, but with time become much more
   factual and research-worthy. Most associations are dedicated to
   developing awareness of their chosen topic, albeit coloured by their
   chosen bias. Few associations are significant publishers yet, but this
   segment will begin to liberate dormant information within associations.

   Let's summarize. The key is to always watch who is the publisher. We can
   assume a great deal, quickly. We are unlikely to find the latest changes
   to patent law from government or commercial publishers. Such
   organizations are simply not motivated to present such information.

__ 32.6 promoting information

   Publishing is one achievement, but you and I will never read any
   information until we learn it exists. This simple fact creates even more
   patterns to internet information. Knowledge of information moves through
   set routes on its way from writer to reader.

   Promotion is not simple. It is a process that takes time, effort and
   perhaps money. Information without serious promotion tends not to be
   promoted far from the source. Another way to phrase this; you must
   search close to the source to find poorly promoted information.

   A search engine indexes pages relatively indiscriminately. This also
   means a site of quality is not likely to reach your attention. The odds
   are not good, and from a promotion point of view, search engines
   generate minimal traffic to your webpage. Search engines drop you rather
   randomly into a website. It is often necessary to move up a directory to
   understand the purpose and motivation of a site you find interesting.
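
   The "move up a directory" habit described above can be sketched in a
   few lines of Python. This is only an illustration: the function name
   parent_urls and the example.org address are made up for the example,
   and it simply trims one path segment at a time from whatever URL a
   search engine dropped you on.

```python
from urllib.parse import urlsplit, urlunsplit


def parent_urls(url):
    """Return the parent directory URLs of `url`, nearest parent first."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    parents = []
    # Drop one trailing path segment at a time until only the site root is left.
    for depth in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:depth])
        if depth:
            path += "/"
        parents.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return parents


# Hypothetical deep link, as a search engine might return it.
for candidate in parent_urls("http://example.org/~jane/papers/patents.htm"):
    print(candidate)
```

   Visiting each address in turn, from the nearest directory up to the
   front page, usually reveals who publishes the site and why.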

   Information published through advertising tends to have a financial
   payoff for the promoter. This kind of information tends to be
   promotional information. Brochureware.

   The alternatives are to promote a webpage or website through one of the
   referral tools. Each such tool accepts links on some criterion. Each
   tool you use to locate information also selects particular types of
   information for your attention.

   If you arrive at a document by recommendation through a mailing list,
   the document is likely to be recent, on-topic, and specific to the
   purpose of the mailing list. Alternatively, (for poor mailing lists) it
   will be wildly off topic and trash. You are unlikely to see referrals to
   old documents or documents of historical importance. These are the
   qualities most acceptable to the mailing list environment.

   Directory trees, FAQs, guidebooks and related promotion tools all work
   as historically important documents. In the past, such resources list,
   describe and alert people to relevant information for the field. Slowly,
   over time, this function becomes acknowledged, reinforced and promoted.
   Time is the essence of this fame.

   Webpages or websites found through historically important documents, by
   their nature, tend to be long lasting websites with lasting importance
   in the field. Such documents point to other similar documents or
   websites that have achieved a long-lasting importance. You are unlikely
   to find specific documents, but rather sites that focus or bring
   together information. In short, there is little motivation to link to
   specific webpages, when a link to important websites is considered just
   as good.

   Similar generalizations can be made of each type of promotional tool.
   They become important in rapidly seeking out information which matches
   our intention, as well as in summarizing the likely motivation - and
   bias - of webpages we are interested in.

__ 32.7 information clumps

   Information Clumps. Information is created, nurtured, develops, gets
   transplanted, gets arranged and then becomes visible through a process
   which brings similar information together.

   As we have discussed, there are factors deeply affecting all information
   on the internet. Motivation, Preparation, Format and Promotion define
   the quality and content of any given item of information. With so many
   influences, we should not be surprised to learn information naturally
   groups together. In reality, there is nothing natural involved - it is a
   social phenomenon reinforced each time you and I visit or read one
   resource but not another.

   History can explain some aspects of internet development. As a small
   collection of sites become dominant in particular fields, by collecting
   and delivering better content to more people, new sites find it
   progressively more difficult to capture attention. This dynamic works
   for websites reaching out for visitors, and discussion groups reaching
   out for subscribers. In each case, seniority counts.

   Seniority counts in several ways too. Promotion is directly related to
   quality, interest, traffic and time. The longer a site is active, the
   better the footpath develops, the more people visit. Secondly, quality
   content is directly related to access to quality content, peer review,
   and time/money. Important existing sites gain in every way.

   This results in a grand system where the first-in, best-dressed, can
   capture the high ground and secure a grand lead in awareness and
   footpath over competitors who follow. Yahoo is a prime example of a
   directory tree, not even the best in most areas, which has achieved
   unparalleled traffic & awareness.

   This competition is equally evident where no money is involved. Perhaps
   your association wishes to create a new referral website, or an open
   mailing list, or an informative guide. All sound concepts, effective
   projects. However, if older, established resources exist, the work will
   be long and arduous.

   Despite the marketing message, the internet is not a world where the
   best information floats to the top. The internet will not let you
   reach millions. You must compete for attention, participation, devotion
   and assistance in a manner very similar to building a business.

   In concrete terms, information clumps on the internet. The best resource
   could appear on any internet system (webpages, email mailing lists,
   ftp-archives, faqs, online databases, newsgroups...) but we can be
   fairly certain the best information will congregate in just one or two.

   Consider our article "Searching the Web" (http://cn.net.au/webpage.htm).
   We progressively search different web tools, looking for the most
   worthy. Searching the internet is the same. You must touch each system
   to see which system is dominant, where the information is congregating
   for your topic.

__ 32.8 bringing this together

   In summary, we have broken down and discussed various qualities of
   published information and promoted information. We have made sweeping
   generalizations and educated guesses about information on the internet.
   Now what?

   When a painter begins to paint, they have already visualized some of the
   image. They already have a concept of the finished result. Internet
   research is no different. We start by building a vision of the
   information we seek. Who would publish it? Where would I find it? What
   is its motivation? How would we find it? We now have a practical vision.

   The address is the key. The url for any item of information gives us a
   surprising amount of information - particularly now we are making
   generalizations about information patterns. We can guess if information
   resides on a personal webpage, a funded university project, or a
   commercial project. The information resides on a .gov website? - the
   quality is likely to be higher and conform to our expectations of
   government resources.
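
   As a rough illustration of reading the address, the Python sketch below
   guesses the type of publisher from a url using only the clues mentioned
   in this FAQ: a /~name/ directory, a .gov domain, an educational server,
   or a .com domain. The function name guess_publisher, the example
   addresses, and the choice of "edu" and "ac" as educational labels and
   "co" as a commercial label are assumptions for the example. Treat the
   result as a first guess, not a verdict.

```python
from urllib.parse import urlsplit


def guess_publisher(url):
    """Guess the kind of publisher behind `url` from its address alone."""
    parts = urlsplit(url)
    labels = (parts.hostname or "").lower().split(".")
    # A tilde directory usually marks an individual's private pages.
    if "/~" in parts.path:
        return "personal"
    if "gov" in labels:
        return "government"
    # Assumption: "edu" and "ac" (as in .ac.uk) mark educational servers.
    if "edu" in labels or "ac" in labels:
        return "educational"
    # Assumption: "com" and "co" (as in .co.uk) mark commercial sites.
    if "com" in labels or "co" in labels:
        return "commercial"
    return "unknown"


# Hypothetical addresses, for illustration only.
for address in (
    "http://www.example.edu/~jane/links.html",
    "http://www.example.gov/reports/",
    "http://www.example.com.au/products/",
):
    print(address, "->", guess_publisher(address))
```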

   We use this new-found experience in three ways. First, we restrict our
   searches to the most likely sources. Second, we quickly jump through
   lists of resources (such as those generated by search engines) to the
   sources that match our expectations. Third, your understanding of the
   relative qualities of information guides your judgement of information
   value.

   Internet newcomers often expect to have instant access to the latest
   information at the touch of the button in beautiful colour and peer
   reviewed quality prose. Who is publishing this? Where is this
   information coming from? Who would help us find this? Such a vision is
   fantasy. If we were instead to look for an association website,
   dedicated to a certain type of research, or an informed newsgroup,
   maintained by people passionate about sharing this technology, then we
   have made four steps forward. We are clear about where to look for the
   answers we seek, and we will know quickly if the answers are online.
   ___________________________________________________

33. More on the Commercial Information Sphere

__ 33.1 structure of the database industry

   The commercial information sphere existed in the 1970's and earlier. It
   is far more developed, far better organized, far better funded, almost
   always far more valuable and expensive than every other research
   resource.

   For the most part, commercial information is arranged reasonably
   uniformly in large databases of full-text or bibliographic information.
   Some databases are small, single source documents, while others are vast
   unfocused collections of, for example, all the news from the last 15
   years.

   Most directories and journals can be made into a database, but
   single-source databases do not enjoy much financial success. The market
   is too limited and the cost of promotion too high (except in a local
   market with newspapers). To overcome this difficulty, single sources are
   grouped together into larger collections of databases on a particular
   topic. These large database groups have become primary tools in
   commercial research.

   Developing these databases requires considerable expertise and expense.
   Sometimes data requires abstracting, interpreting, and as with some
   Lexis-Nexis and WestLaw databases, even expert legal interpretation.
   Sometimes firms develop a portfolio of databases. Sometimes firms build
   just one.

   The marketing and consumer billing of such databases is then provided by
   a relatively small collection of large database retailers. A list can be
   found in our "Commercial Databases" article. As an indication of the
   size of this market, Knight-Ridder sold Dialog & Datastar for a figure
   approaching half a billion dollars.

   This industry consists of a wide collection of players, each improving
   and developing the information from individual periodicals, journals,
   news items... All very confusing for the end user.

   This is elegantly illustrated by the database descriptions for
   Lexis-Nexis databases (their preferred term is libraries). See
   http://www.lexis-nexis.com/lncc/sources/ as an example of specific
   databases. In particular, see their library on patents.

   Many single-sources appear in different commercial databases. Further,
   different databases sometimes include different information from the
   same single-source. One database may include just abstracts, another may
   include fulltext, chemical indexing and more.

   As a result, most researchers are unfamiliar with what exactly is being
   searched.

   This state of affairs is not unproductive. Searching a 'Database about
   Patents', is uncomplicated. You receive information on Patents. It is
   simple, informative and incomplete. Of course, researchers are busy
   people. Time is critical. Results matter. This system also gives rise to
   great customer loyalty to database retailers. Comparative information is
   dropped in favour of simplicity. (There is too much complexity for
   researchers anyway.) Unfortunately, I am hard pressed to compare prices
   let alone describe the differences between information products.

   Prices actually model many a developed industry, remarkably similar to
   the telephone or banking industry. As one friend commented, "bullshit
   baffles the brains". The prices are complex on purpose. It becomes very
   unrewarding to compare prices, and any conclusions are only valid in
   specific circumstances - and will not hold in others. This trend,
   familiar to us as a multitude of banking changes and telephone pricing
   schedules, reinforces our need to stop price hunting and trust our
   favoured information retailers.

   This is not to say we should not try to compare prices - but for the
   most part, you will find comparing prices a most unrewarding experience.
   It really requires you to search and retrieve the same information on
   different systems - and this does not even begin to touch different
   databases, or database groupings, or variables that change over time
   like download speeds.

   Optimistically, there are actually very few important databases in each
   field. It may be simple to browse each of the databases in your field
   and compare directly. You may never need to know more than a few
   databases intimately.

   Realistically, you will yearn for a simpler solution.

   The commercial information industry has distributed information this way
   for several decades. It is both sophisticated and quite difficult. You
   will need to become experienced with inverted indexes, search techniques
   (Boolean, truncation, proximity, field limits ...) and properly phrasing
   the question in a way that will be answered by a database search. I have
   always found the value of a database search directly proportional to the
   length of the search query.
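
   For readers who have not met an inverted index, here is a minimal
   Python sketch of the idea: instead of scanning every document for a
   word, the database keeps a list of documents for each word. The sample
   documents and the function name build_index are invented for the
   example; commercial systems are far more elaborate.

```python
from collections import defaultdict

PUNCTUATION = ".,;:!?\"'()[]"


def build_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for raw in text.split():
            word = raw.strip(PUNCTUATION).lower()
            if word:
                index[word].add(doc_id)
    return index


# Invented sample records.
documents = {
    1: "Patent law changes",
    2: "Research on patent filings.",
    3: "Pasta recipes",
}
index = build_index(documents)
print(sorted(index["patent"]))
```

   A word search then becomes a single lookup, which is why these systems
   answer quickly even over years of news.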

   If you are incompletely skilled at database research, you will take
   longer, pay more and locate far more information (or unwisely discard
   more) than desired.

   This is very different from searching Altavista and Webcrawler.

   Doing your own research offers an opportunity to more closely influence
   the research process. Sometimes only you understand the topic and
   sometimes you can more quickly discard unimportant details. Certainly it
   is becoming simpler to undertake some work yourself.

   Many of the commercial databases are also available in a CD format.
   Substantial subscription costs limit their availability to large
   research institutions and libraries, but exceptions exist. I believe
   world books in print costs AU$5000+. Provided you can find casual
   access, it will cost you far less. Keep an eye on the age, though.
   Sometimes (and only sometimes) online information is more recent.

   The decision between undertaking research on your own or seeking
   external help is really a decision based on your research expertise,
   your budget, your access to information, your time, and the importance
   of finding all the information available. It also depends on your access
   to some decent research assistance. I will soon be able to help with
   this.

   What I do know is a newcomer to the commercial information sphere will
   seriously underestimate the difficulty involved in searching, and
   underestimate both the cost of research and the cost of research
   assistance. Keep in mind this same system serves the needs of large
   commercial conglomerates, professional legal research, and well financed
   government studies. The commercial information sphere contains far more
   valuable information than you need. Often the internet is just an
   interesting sneeze in comparison.

 #  Article:  The State of Databases Today: 2000 by Martha E Williams
   tracks the development of this industry with survey results. Found in
   the foreword of the Gale Directory of Databases.

__ 33.2 advanced search technologies

   Searching is both science and art. The science is a range of
   improvements to the blunt system of simply asking for a word. The good
   news is an experienced searcher can accomplish wonders - collecting
   articles of 70%+ interest regularly on expensive databases. The bad news
   is most of the best of search technology is not implemented on all the
   databases you will search and only occasionally on databases free on the
   internet.

   The art is a kind of magic, of choosing just the right words at the
   right times, and in phrasing your request for information in a way that
   tightly describes your interest without removing information that should
   interest you. The art of searching relies heavily on an understanding of
   what is possible within a given system. Much of this, you guessed it,
   involves creative visualizing. (See section 3.1)

   Current search technology allows us several ways to refine our search:

   Straight Word Searches:
   All search situations allow you to ask for the presence of words in a
   block of text. If you ask for the right words, then you will quickly
   locate the information you desire. For best results, you obviously
   search the desired text several times with different terms, and you
   consider the possibility of different spellings for the same words. I
   use this frequently to locate information in web pages, in large
   documents like online directories or the archives of past discussion on
   forums.

   Text Fragments:
   The simplest refinement to straight searching involves searching for
   parts of a word - if you are interested in surfing, search for surf;
   better yet, search for " surf" with the space in front of the word.
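
   The difference the leading space makes can be seen in a small Python
   sketch. The sample sentence and the function name find_fragment are
   made up for the example.

```python
def find_fragment(text, fragment):
    """Return the start position of every case-insensitive match."""
    haystack, needle = text.lower(), fragment.lower()
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        start = haystack.find(needle, start + 1)
    return positions


sample = "Windsurf gear and surfing lessons for every surfer"

# The bare fragment also hits the middle of "Windsurf".
print(find_fragment(sample, "surf"))
# The leading space keeps only words that begin with "surf".
print(find_fragment(sample, " surf"))
```

   One caution: the leading space will miss a matching word that happens
   to be the very first word of the text.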

   Truncation:
   Some search engines don't allow searches for text fragments, and you
   must explain your intention by adding a truncation mark (usually * or ?)
   to the ends of words. On most professional research systems, alga? will
   include both algae and algal. I was once badly lost because of the
   spelling difference between aging and ageing. There are a number of
   improvements on this concept too. Sometimes there are special symbols
   for a single non-space character (car?a), sometimes there is automatic
   awareness of multiple spellings (colour & color). Sometimes there is
   even automatic awareness of synonyms. Often you are initially unaware
   that important information is indexed under a slightly different
   spelling, so truncation is strongly suggested for most searching.
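
   To make truncation concrete, the Python sketch below translates a term
   with a trailing truncation mark into a regular expression. It is an
   illustration only: real database systems each have their own syntax,
   and the function names truncation_pattern and matches are invented
   here. A term without a mark matches that exact word only.

```python
import re


def truncation_pattern(term, marks="*?"):
    """Compile a whole-word pattern; a trailing mark means 'any ending'."""
    if term and term[-1] in marks:
        body = re.escape(term[:-1]) + r"\w*"
    else:
        body = re.escape(term)
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)


def matches(term, text):
    """Return every word in `text` that the (possibly truncated) term hits."""
    return [m.group(0) for m in truncation_pattern(term).finditer(text)]


print(matches("alga?", "Algae blooms and algal mats; an alga sample"))
# Without truncation, the variant spelling "ageing" is silently missed.
print(matches("aging", "ageing and aging"))
```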

   Thesaurus:
   An improvement on truncation is the opportunity to look directly at a
   list of words, either keywords, or descriptors. This allows you to see
   the range of spellings before you search. This is also ideal for
   searches of company names or proper places so you can select only the
   words you are interested in. In a simple way, some library catalogues
   present subject searches in this way: a list of subject categories
   arranged alphabetically.

   Boolean operators:
   Changing tack, searching for multiple words calls for "and, or, not"
   concepts. I want this word and that word, but not another word. It is
   simple enough. Many of the search engines allow for this with the - sign,
   and commercial databases often add brackets. Use of the not symbol is
   frowned upon in textbooks (it is said to be too easy to dismiss
   information you are interested in), but 'and' and 'or' are absolutely
   necessary for complex questions like: I want [(spaghetti or noodle) and
   pasta] or (Italian and cuisine). With most internet search engines, but
   not all commercial searches, you will find 'and' is assumed.
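
   The bracketed query above can be sketched with Python sets, where "or"
   is a union, "and" an intersection and "not" a difference. The five
   documents and the helper name docs_with are invented for the example;
   no real search engine is being described.

```python
# A made-up collection of five short documents, keyed by document number.
documents = {
    1: "spaghetti is a long thin pasta",
    2: "noodle soup recipes",
    3: "italian cuisine by region",
    4: "pasta shapes and sauces",
    5: "italian language lessons",
}

def docs_with(term):
    """Return the set of document numbers containing the term as a word."""
    return {doc_id for doc_id, text in documents.items()
            if term in text.split()}

# [(spaghetti or noodle) and pasta] or (italian and cuisine)
hits = (((docs_with("spaghetti") | docs_with("noodle")) & docs_with("pasta"))
        | (docs_with("italian") & docs_with("cuisine")))

# "pasta not spaghetti": note how 'not' silently drops document 1.
pasta_not_spaghetti = docs_with("pasta") - docs_with("spaghetti")

print(sorted(hits))
print(sorted(pasta_not_spaghetti))
```

   The brackets matter: without them, the order in which a system applies
   'and' and 'or' decides the result, and that order is not the same
   everywhere.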

   Proximity operators:
   The next dramatic improvement fixes the position of words relative to
   one another. In this category we have adjacent (often written as adj,
   next, or "inserted in quotes"), near (by how many words), or in the same
   sentence. Often it is wise to stretch the distance a little (within
   two), but where available, proximity is the best way to remove the dross
   without affecting the value of information. "Patent near Research" is
   much more precise than "Patent and Research".
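
   To see why "near" is more precise than "and", consider this small
   Python sketch. The function near and both sample sentences are
   hypothetical; it simply checks whether two words fall within a given
   number of word positions of each other.

```python
def near(text, first, second, max_gap):
    """True if the two words occur within max_gap word positions."""
    words = text.lower().split()
    first_pos = [i for i, w in enumerate(words) if w == first]
    second_pos = [i for i, w in enumerate(words) if w == second]
    return any(abs(i - j) <= max_gap
               for i in first_pos for j in second_pos)

def both(text, first, second):
    """Plain 'and': both words appear somewhere in the text."""
    words = text.lower().split()
    return first in words and second in words

relevant = "patent research is a specialised skill"
dross = ("the patent was granted in 1990 after years of "
         "unrelated medical research")

# 'and' accepts both sentences; 'near' (within two words) keeps only one.
print(both(relevant, "patent", "research"), both(dross, "patent", "research"))
print(near(relevant, "patent", "research", 2), near(dross, "patent", "research", 2))
```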

   Fields:
   By separating information into different fields, we can selectively
   search different portions of the information. I want the title to show
   the word "Patent" and the abstract to include the words "Patent
   Research". Field searching is a common way to refine a search, but be
   aware searching titles is very likely to remove some desired
   information, whereas searching descriptors and not abstracts may
   dramatically improve the content.
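
   A field search can be pictured as follows. This is only a sketch: the
   three records, the field names and the helper field_has are made up
   for the example, and commercial systems each have their own field
   codes.

```python
# Three invented bibliographic records with separate fields.
records = [
    {"title": "Patent law basics",
     "abstract": "An overview of filing procedures.",
     "descriptors": ["patents", "law"]},
    {"title": "Finding prior art",
     "abstract": "Patent research methods for inventors.",
     "descriptors": ["patents", "research"]},
    {"title": "Patent searching online",
     "abstract": "Patent research using commercial databases.",
     "descriptors": ["patents", "research", "databases"]},
]

def field_has(record, field, phrase):
    """True if the phrase occurs in the named field (case-insensitive).

    Text fields are matched as substrings; list fields such as
    descriptors must contain the phrase as a whole entry.
    """
    value = record[field]
    if isinstance(value, list):
        return phrase.lower() in [v.lower() for v in value]
    return phrase.lower() in value.lower()

# Title must show "patent" AND abstract must include "patent research".
title_and_abstract = [r["title"] for r in records
                      if field_has(r, "title", "patent")
                      and field_has(r, "abstract", "patent research")]

# Abstract only: the title restriction above lost a desired record.
abstract_only = [r["title"] for r in records
                 if field_has(r, "abstract", "patent research")]

print(title_and_abstract)
print(abstract_only)
```

   The second list includes "Finding prior art", a relevant record the
   title restriction removed.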

   Date Field:
   Are you really interested in information more than 15 years old? Library
   catalogues frequently have many aging books, and date limiting is very
   wise.

   Further Enhancements:
   There are some special techniques available on a few systems that bear
   discussing. Sorting allows you to shape the presentation of the
   information. When applied to financial information, this is particularly
   valuable. Alerts allow you to automatically repeat a previous search and
   have the information sent to you. Multiple database searching allows you
   to search a collection of databases concurrently. Ranking positions
   certain information at the top and is valuable when your search is not
   time or price limited.

__ 33.3 the art of searching

   The artistic side to this deals with two fields. Firstly, the selection
   of accurate words is not automated. The searcher needs to approach the
   information beast fully recognizing he or she is likely to get either
   tons of information... or far too little. When to expand, when to get
   more in-depth and how to handle fields which you may be poorly
   experienced in are talents. The search technology itself is simple.

   The trouble lies in retrieving from databases with far too much
   information for simple word selection. It also flares when you are
   dealing with databases charging up from $2 a minute and an additional
   cost per item retrieved. You decide very quickly to get good at
   searching once you receive a bill for $200 of irrelevant information.

   The simplest solution to this difficulty is to practice. You will find
   all Research Libraries provide access to slightly older articles through
   CD-rom databases. Search these to hone your skills.

   I saw a small book on search techniques from an early course in my state
   library - but it is very basic. Most librarians build experience in
   using search systems either internally, or through a series of courses
   given by travelling database officers like the periodic training by
   Dialog-Insearch. These are expensive, but include some free time
   searching the expensive databases (no, they don't let you take
   information back with you).

   Now, there must be something else I can share with you on this topic.
   First, learn something about how the databases are built in the first
   place. It helps if you know what an inverted text database looks like.
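
   For the curious, an inverted text database is roughly this: instead of
   storing "document -> words", it stores "word -> documents". The Python
   sketch below is a toy version with three invented documents; real
   systems also store word positions and field information, and the
   function name is mine.

```python
from collections import defaultdict

def build_inverted_index(documents):
    """Map each word to the set of document numbers that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

# Three made-up documents.
documents = {
    1: "patent research methods",
    2: "research libraries",
    3: "patent law",
}

index = build_inverted_index(documents)

# Looking up a word is a single step, however large the collection.
print(sorted(index.get("patent", set())))

# An 'and' search is just the overlap of two lookups.
print(sorted(index.get("patent", set()) & index.get("research", set())))
```

   This is why word searches are quick but literal: a spelling that was
   never indexed simply has no entry to look up.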

   Second, something personal about technique... I always find the uglier
   the search query, the better the result. Honestly. A search combining
   numerous elements improves your chances of getting it right.

   Third, I always try to change my search techniques to match the medium.
   I am likely to be more careful of broad searches of expensive databases,
   whereas free databases often lead me to gather 50 articles, then weed
   them out by hand (most CD-roms allow you to select only the ones you
   want). Always bring a 3.5'' floppy with you when visiting a library on
   the off-chance you want to download and look at results another time.

   Fourth, I almost always find the initial challenge is in locating those
   specific terms that appear in 80% of the documents that interest you.
   When searching the internet for information about government use of the
   web, the specific terms required were government and publishing (not
   even government publish was close). All other search terms gave far too
   much garbage. Yes, of course, being an expert in a particular field is
   an edge in already knowing these special terms.

   There are two escape hatches here. If you can find one or two articles
   that interest you, often you can browse these articles for those special
   words. Sometimes even, the descriptors of an interesting article will
   give you a specific subject heading. I've heard this technique called
   the "Pearl Development Technique" but I just think of it as a good idea.
   The second escape hatch is the use of free databases to prepare you for
   going online. If you have ready access to a CD-rom database, search this
   first - get the right search words on the free databases, then go
   online.

   Oh, of course, there is also the issue of just asking someone involved
   for the proper words. I like to ask my clients if they know what words
   are likely to be used. It's not a mark of an amateur to be asked, by the
   way.

   A couple of side issues

   1) Keep an eye on the type of document you are searching. If you want
   full text - don't go looking in bibliography databases. More to the
   point, don't start word searching databases with really big files
   without using the proximity indicators and descriptive fields. I hated
   paying for that 20-page document which included all the words I was
   interested in - but on different pages.

   2) Also, keep an eye on the quality of the documents you are retrieving.
   I know a search of newspapers sounds impressive, but they are rarely
   capable of explaining anything in depth and are notorious for carrying
   advertorials. I try to keep newsprint for locating experts - not for
   information. I have also been trapped by obscure magazines with
   appealing articles, only to learn the magazine is one of a large number
   of very basic business mags which likes to use fillers, or just doesn't
   like to pay for good journalism. A single article of 5 pages from
   Scientific American blows 20 small fillers out of the water. In fact the
   length of an article is a hint of depth.

   Oh, if you are looking for some really good books on this issue, try the
   manuals Dialog sends you to start, look for text databases in your
   library, then proceed to one of the search books recommended at the end
   of our 'research as a discipline' article.
   ___________________________________________________

34. More on the Information Service Industry

   Private Detectives, Professional Database Researchers, Library
   Researchers, Legal Researchers, Commercial Database Producers,
   Commercial Database Retailers, Magazines, News Organizations, Libraries:
   this is a big industry. Information Research is just a process linking
   together people seeking information with people who provide it.

__ 34.1 the extended information market

   It seems in vogue to reconsider all businesses as being in the
   information business. My accountant and your stockbroker both provide
   information services. While I agree these two professions are intensive
   users of information, I purchase their interpretation of information. It
   is not a subtle difference, but nonetheless, it serves to cloud the true
   size of the industry just involved in selling you access to information.

   From university days, I was aware of the large commercial database
   retail giants (Dialog, Dun&Bradstreet) and the database producers. I
   also met with some of the firms distributing largely to the library
   market (like SilverPlatter). Little further information about these
   businesses leaks beyond the research industry.

   Some of the businesses are aimed primarily towards the library
   community. Database subscriptions are unlikely to interest an
   individual, and few are appropriate to businesses. I cover some of these
   in the article "Research as a Discipline". Let's scan the products and
   services intended for a consumer.

   Commercial Database Retailers - These organizations devote their effort
   to bringing commercial database information to individuals. Dialog,
   Datastar, Infomart, Lexis-Nexis and others will assist you to access
   information only available through commercial databases. (See our
   article, "Commercial Databases".)

   Current News and Current Awareness - If you want to know of new articles
   and news important to you as it is reported, then there is a selection
   of services available: news by email, news by newsgroup, news by
   periodic automated database search, and other novel approaches. Costs
   for this service have fallen dramatically: effective solutions start at
   about US$10/month and are not strictly dependent on range & quality of
   information. (See our article, "Newswires & News Databases".)

   Information Brokers - There is a whole industry of specialized
   researchers who will try to locate and compile research to your
   specifications. The backbone of this industry is payment for access to
   commercial databases, but different information brokers will gladly
   enter into any effort required to locate information. Information
   brokers, business librarians, legal researchers and others all use the
   tools described in this website, as a service for their clientele. (See
   our article, "Research as a Discipline".)

   Patent Assistance - Patent searching is one of the more difficult
   branches of serious research. Some of the resources are free on the
   Internet, and commercial patent databases are readily available through
   the database retailers. If there is serious money at stake, you should
   consider legal assistance. Certainly use lawyers for patent applications
   (beyond the scope of this website). But patents can also be a research
   tool. Patent research can provide you with what is often the first
   appearance of costly commercial research. This is both a source of
   cutting edge solutions and competitive intelligence.

   Media Monitoring - Certain firms solely focus on monitoring TV, radio &
   newspapers. These firms typically run teams who page through newspapers
   looking for matching articles, then post or fax to the client. New
   technologies are also advancing into this field.

   Document Delivery - Most local bookstores will gladly help you locate a
   book from their directories, but if you want a book from abroad, or an
   article from a journal or magazine, you will need the assistance of
   another set of information workers. Many of the document delivery firms
   are closely tied to information organizations. Little information is
   available about these organizations.

__ 34.2 judging information value

   Information has value. It also has other qualities that will assist you
   to judge information you may consider buying.

   Accuracy: the factual nature of the information presented. If the
   statistics purport to show a particular trend - how large is the margin
   of error? How large is the sample size? How likely is it that factual
   errors crept in during their development? The measurement of statistical
   error is now a refined science in some fields. A statistical result can
   be inaccurate when the sample size is too small, the margin of error is
   too large, the sample collection procedure is incorrect, or in a number
   of other situations.
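
   To give a feel for how sample size drives the margin of error, here is
   a small Python sketch. It uses one common textbook approximation for a
   surveyed proportion (a simple random sample, 95% confidence, normal
   approximation); that formula is my assumption for the example and is
   not the only way such errors are measured.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a proportion p from n responses.

    Assumes a simple random sample and the normal approximation;
    z=1.96 corresponds to roughly 95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)

# The same 50% survey result with different sample sizes.
for n in (100, 1000, 10000):
    print(n, round(margin_of_error(0.5, n) * 100, 1), "percentage points")
```

   With 100 responses a "50%" result is only good to within about ten
   percentage points either way, so a reported trend of a few points
   means very little. Ten times the sample only cuts the error to about a
   third.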

   Reliability: the support for trusting the solutions, both from
   additional resources and from being able to duplicate the conclusions.
   This includes the reputation of the researchers. No matter how
   inaccurate and biased you may believe certain facts to be, successful
   independent support of a suggested fact does improve its value.

   Bias: conscious or subconscious influences that affect information. Bias
   can occur in collection, preparation and presentation of information.
   Most information you find will be tainted. Secondary information is
   deeply affected. Statistics are not necessarily less biased.

   We counter bias in several ways. Firstly, we try to be aware of bias.
   Where is bias likely? Which direction would the bias affect the
   information? Secondly, we try to collect information with different
   bias. This is why research based solely on government research, no
   matter how accurate and reliable, is less valuable. Often information
   from different countries can counter bias. Thirdly, we need to accept
   bias is likely to exist. This is why primary sources are often more
   valuable than secondary sources. This is why tertiary sources, like
   experts, can rarely stand alone.

   Age: The date information was created or compiled will feature
   prominently in the value of information. Dates given sometimes mean the
   date information was created, or the date information was compiled. How
   old is a book compiled in 1995, which took the author 10 years to
   finish? I find statistical forecasts often prominently display recent
   compilation dates but still use old census data or the like to draw
   their conclusions. Information on the internet typically
   has no date, and can be severely challenged because of this.

   Purpose: purpose merits further discussion. When you are uncertain about
   potential bias, you can look for reasons to distrust the information
   instead. Suspicion is not equivalent to bias, but it can be thought
   provoking. Privately, I have heard repeated rumours important national
   statistics have been fudged in different countries. A government
   research report investigating the price of books in Australia would have
   a political purpose, a purpose that provides the climate for some
   potentially significant bias. A tell-all book by industry experts often
   includes a tremendous quality of insider experience difficult to find
   elsewhere. While there may be a purpose of self-aggrandizement, the
   purpose is less a climate for significant bias. Medical research has
   perhaps the greatest climate for significant bias, and this suggests the
   greatest standard of proof and external, reliable support.

   Accuracy, reliability, bias, age and purpose are very important in
   research. This is what leads us to an appraisal of value. For years, the
   tobacco industry funded 'independent' research finding smoking minimally
   harmful to health. It now seems likely there were errors of both
   accuracy and bias. Certainly, purpose was in doubt. As new
   studies show smoking is harmful, we can also say the original research
   lacked reliability. In some topics, like the internet, research is
   perpetually suspect because it also ages so quickly.

   I have seen further discussions that add 'Coverage' and 'Authority' to
   this checklist. Both have bearing on the value of the information
   contained. By coverage, we mean how much detail is invested in covering
   a specific topic. Sparse or shallow coverage is closely tied to missing
   critical aspects of information. News stories frequently have limited
   coverage.

   Once you are acclimatized to these elements, you begin to see potential
   for error in a whole range of information. Real-estate association
   figures, expert opinions, toothpaste advertisements and national GDP
   figures all occasionally display some degree of warping and
   manipulation, clouding the truth. The solution is awareness, comparison
   and careful analysis. As a personal aside, this is part of the reason
   for my dislike of market research: it is often taken far more
   seriously than warranted and means far less than suggested.
   ___________________________________________________

35. Emerging Trends in the information sphere

   For the past few years, individual database owners/maintainers have been
   flirting with the idea of making paid access available through the
   internet, rather than the existing system of allowing database retailing
   firms to promote and market their databases. I have heard rumours most
   database producers earn up to 30% of retail price when delivered through
   database retailers - 70% being retained by the database retailer.

   The internet is not a commercially viable alternative...yet, but some
   databases have emerged with alternative funding despite this (Library of
   Congress, ERIC, see section 13). Others are creeping in around the edges
   by offering subscribers access at a much reduced flat annual fee
   (Computer Select at one time). I expect most database producers are
   waiting for a meaningful way to charge. Digital money holds the key but
   despite the hype, practical use appears to be a medium to long-term
   reality.

   A second trend is internet publishing itself. Gradually, the information
   is getting easier to locate (don't laugh please - it's undignified). We
   are also getting better at using the internet as a tool to disseminate
   information. We have the very visible, if perhaps short-lived, search
   engines, but also other efforts like archives of FAQs, archives of
   guidebooks, applying the Dewey decimal system to the internet,
   specialist directories, subject guides, specialist search engines. This
   will be a lively field for several years to come. As it gets easier to
   locate the good information, perhaps the lines between commercial
   quality and internet quality will begin to merge in places.

   The third trend is the very promising prospect of paying for information
   by the page through the internet - viewing the results in a web page
   immediately. There are some technical hurdles yet, but certain elements
   are already appearing in ventures like DialogWeb. This step may prove
   profitable for ATM vendors and owners of internet cafes, pubs and
   kiosks. It will also herald a dramatic drop in the cost of information.

__ 35.1 developing an informative internet?

   Several serious glitches have delayed the further improvement of the
   internet as an effective information resource. Oh, sure it is the
   world's largest library and thousands of new webpages are published
   every hour. But this trite statement disguises how slowly the
   informative value of the internet is developing.

   Vision
   The internet holds so very much promise. Marketing mantras tell us so,
   but few of us grasp this technology will completely rewrite the rules of
   community, government and the exchange of intellectually valuable
   information.

   One of the hurdles is vision. We are not yet delivering the information
   pertaining to community, government and the exchange of intellectually
   valuable (improved) information. We are only proceeding quickly with
   market information and computer-related information. We are still toying
   with further ways the internet can transform other areas of our life.

   We should have achieved more by now.

   Organization
   Let's look at information itself. Information passes from producer, to
   organizer, to consumer. It travels many paths in this journey.
   Superficially, we can observe internet communication travels via email,
   newsgroups, and webpages (and others). Let's call these tools.

   Looking deeper, we observe information emerges from just a few
   generalized sources: knowledgeable individuals, informed government
   employees, grant funded educational projects, commercial organizations
   and a few others. Each source produces a particular type of information,
   distributes (publishes & promotes) in particular channels, and hopes to
   pay for (or justify) their effort in a particular way.

   Efficient internet research is infused with an understanding of who
   publishes, where and why.

   Before information reaches the consumer, it passes through a vetting
   which organizes and filters both the quality and the presentation style
   of the information. Let us call these systems. The FAQ is a pivotal
   piece of a system that may start with a post to a mailing list or
   newsgroup, involves the vetting of the faq maintainer, then proceeds to
   an faq archive then to the end consumer. The webpage is published by
   someone who has justified their time and expense, is indexed by a search
   engine or definitive-topic-website or webring or what have you, and then
   is found and read by the end consumer. The internet has many such
   systems.

   Each system again defines many of the traits of the resulting
   information. Faqs are semi-authoritative, collaborative pieces, often
   dense and factual. Private mailing lists are sometimes more informative,
   discursive, as well as serving as a notice board. Newsgroups involve far
   less natural vetting and quality control, but excel in distributing
   popular volume resources like graphics. Search engines don't vet, but
   can be searched.

   Each system reinforces the uniqueness it brings to the whole internet.
   When I blindly declare "Information Clumps" in section 32.7 of this
   faq, I am really describing a trend whereby certain information
   accumulates in a particular location, others out of self-interest add to
   the pile, and further information reinforces both the logic and
   uniqueness of that pile of information.

   It is just a short jump from this to understanding how faq archives grow
   but maintain a good quality, how the grand internet search engines began
   to lose value about 15 months ago, and how ftp archives still exist for
   many computer topics.

   The internal logic to the organization of information is based on simple
   principles. It defines the environment within which we strive to improve
   the internet as an effective information resource.

   Further Reading:  Searching the Web: Strategy
   (http://cn.net.au/webpage.htm#5)

   Publishing
   As mentioned, thinking about who is publishing assists research.
   Applying this to where information is emerging, we learn much of
   the best information is not reaching the internet. Certainly, the
   commercially generated information is not reaching the internet (covered
   below). The large research studies paid for by public funds and slowly
   aging on the shelves of government and non-government organizations are
   also not coming online. Government, institutional and commercial
   organizations primarily publish brochure-ware - as befitting the
   presentation of market information. (Even offering to publish such
   documents freely does not appreciably affect this trend as the
   restrictions are not financial, but a matter of mindset. See our past
   work.)

   We should recognize few of the more valuable documents emerge online.

   Further Reading:   Socially Responsible Publishing on the Internet ('97)
   (http://cn.net.au/cn/past/docs/publish.html)
   A Census of Regionally Important Documents on the Web ('96)
   (http://cn.net.au/cn/past/docs/webscan4.html)

   Discussion
   The internet excites me with the promise of a real community rebirth
   arising from this technology. For the first time in history we should be
   able to discuss in an informed manner any number of issues from crime to
   taxation. Tied into this are issues of government transparency,
   international assistance, anti-corporate market reform, and community
   involvement.

   Unfortunately, my experience with mailing lists and more recently with a
   newsgroup confirms the difficulties in developing discussion. Discussion
   groups function as noticeboards, but the difficulties in developing
   participation, and in moderation, make them just a little too cumbersome
   to be successful. For many discussion groups, the chaff overwhelms the
   wheat, and the information content is far from considerable.

   The financial rewards are also minimal for establishing and maintaining
   discussion groups. Dramatic improvement to the informative value of the
   internet is unlikely to emerge here.

   Further Reading:  How to build a discussion on the Internet
   (http://cn.net.au/cn/past/docs/forums.html)

   Rewards
   We have alluded to the importance of editorial and organization on the
   internet. There are several severe limitations to this - first and
   foremost the difficulty in gathering financial rewards for meaningful
   work improving and organizing information.

   I am being circumspect here. There is money available - but not where it
   is needed. The most important resources in professional research are the
   contents of the commercial information sphere. This sphere existed
   decades before the internet, is far better funded, and is far larger. To
   compare commercial and internet information is almost heresy. The bridge
   between these two, internet and commercial, is emerging slowly.

   Digital money should grease the exchange of information by dropping the
   cost of exchange considerably. Today, credit cards provide this service.
   This works, at times, but digital money would allow for small amounts of
   money to change hands. This appears to be a critical threshold for
   bringing much of the commercial information onto the net.

   About 5 years ago I was introduced to the Theseus Model - an economic
   model to pay the intellectual investment in publishing and organizing
   interactive multimedia. Years before there was Xanadu. While I have
   serious reservations about both, they do illustrate the intellectual
   foundations for effective use of a tool for exchanging small amounts of
   money. It opens the doors to direct delivery of copyright work - which
   in turn opens an effective economic model for publishing improved
   information on the internet.

   Without digital money, proprietary information can only be exchanged
   digitally by gift (that is, free - the initial driving force of the
   internet information sphere), or by credit-card purchase of access to
   passwords to external networks - the current method of accessing
   database retailers.

   This has the unfortunate effect of limiting the interest both of
   internet users in the commercial information sphere and the commercial
   information retailers in the internet. Oh, there is movement in both
   directions, but not at the scale experienced in other industries.

   Further Reading:   The UWA Theseus Project
   (http://www.arts.uwa.edu.au/TheseusWWW/)
   The Xanadu project (http://www.xanadu.com  or  concise summary -
   http://www.sfc.keio.ac.jp/~ted/XU/XuPageKeio.html)

   Understanding
   Finding information on the internet is a skill. Finding information on
   the commercial information sphere is also a skill. There is a great
   degree of overlap. The awareness of the general public as measured by
   use of commercial resources is very limited. This is further seen from
   the simple use of search engines & the abundance of simple web search.

   To hammer this point in, let's take a momentary look at search engines.
   Most searches end in "1000's of results - here are the first 10". Do you
   really think the first 10 or 20 or 100 sites listed are particularly
   better than the next? No - you have a random selection of resources. A
   selection generated by computer based on the simplest of criteria.
   (We should also mention how some search engines sell placement in search
   results.)

   Remarkably, the search engine is the much-vaunted entryway to the world
   of information!?! Clearly search engines will not dramatically improve
   the informative value of the net - not by themselves.

   Multiplication of Information
   One complication of poor information organization is an inflation of
   overlapping nuggets of information. Information on the internet is so
   difficult to locate we have almost a continual need for more publishing.
   Information must exist in numerous locations to reach an intended
   audience. Promotion of the simplest nature - recognition for the best
   for a given topic - becomes exceedingly difficult. Only when 20 sites
   publish or report a given fact does it become accessible.

   Curiously, this is the state of affairs in the wider community.
   Promotion is an expensive speciality. Numerous copies, distributors and
   references are required to generate any kind of significant awareness.
   Why should the internet be different?

   Actually, why should the internet be the same? Definitive sources like
   the US Census Bureau have no need to duplicate their information, or to
   have alternative presentation sites. Yet such sites appear the exception.
   Consider a search for the best resources for patent research: we are
   greeted with 954 websites (Altavista search for "patent research"
   Jan-19-2000). Presumably, most of these sites discuss patent research -
   right? There is no technical or theoretical need for such confusion. I
   wonder if such duplication may be more of an affliction than natural
   tendency.

   Justification
   It is relatively difficult to earn money from publishing improved
   information, or organizing information already on the internet. Given
   the intense interest in this technology, a collection of models has
   emerged. A brief tour of these models will highlight the financial
   limitations to improving the internet as an informative resource.

   - - - Working for fame (but not payment)
   This model works well in open source software programming, and some of
   this ethic certainly extends to publishing information.
   - - - Simple altruism/complete lack of justification
   School students and internet novices in particular may not need to
   justify anything. Unfortunately, such work is usually neither consistent
   nor persistent.
   - - - Commercial promotion
   Promotional funds can be used to publish information. Most promotion is
   short-sighted, limited to presenting market information (like product
   information), but in time government and associations will fund
   publishing in-house information for purely promotional reasons.
   - - - Invested commercial businesses
   There are certain commercial opportunities to earn money through banner
   advertising and sponsorship.

   Direct payment for improved information (perhaps with digital money),
   direct payment to authors (Theseus model, royalty systems), and direct
   state sponsorship need not be necessary to fundamentally improve the
   internet as an information resource. Academic peer-reviewed journals do
   not pay for articles. Commercial periodicals are supported by
   advertising, and the token subscription costs of magazines usually just
   cover distribution costs. Fame motivates many efforts, not just online,
   and we do not feel the need to habitually justify everything we do.

   In no small way, as more people become adept at publishing quickly,
   important information will move on the net faster. Similarly,
   information will also gradually become better organized. The models
   above will not improve the informative value of the internet the way
   direct payment would. Most current limitations have economic solutions.
   Unfortunately, my reasoned opinion is no economic system will arrive in
   time to make a difference.

   Conclusion
   We know something of how information gets published, and how many
   important documents do not reach the internet. We have described how
   information is organized on the internet and how limited editorial
   vetting and organization have given rise to traits like superficial
   indexing, information duplication, and a need for research skills.

   Financial rewards and financial tools are unlikely to solve these
   difficulties. We can only hope for a gradual growing out of our current
   difficulties. We will have more of the same for several years to come.
   It is simply the nature of the internet (as currently constructed).

   For you, a greater understanding of the internet will assist you to
   judge the worth, likely source and likely venues of the information you
   seek. The same is true in the larger world of databases, books and articles.
   Each has different traits and qualities, reinforced over time. Your
   understanding of these traits and qualities in part defines your skill
   as a researcher.

   As to the future of the internet, on the positive side, there are
   certain qualities to internet communication that make it uniquely
   valuable. Internet communication is inexpensive, relatively rapid, and
   increasingly accessible. On the negative side, the internet is badly
   vetted, potentially very time consuming, and up against very well
   entrenched systems that have been running for decades (databases) or
   millennia (books). Elements like a promised
   but functionally absent digital money, and the lack of a meaningful way
   to recoup the costs of vetting online information, make matters worse.
   Despite this, despite ALL the teething and fundamental difficulties, the
   internet is sufficiently superior to ensure considerable continued
   effort to improve the informative value of the net.


   Further Reading: Theory and Past Projects of Community Networking
   (http://cn.net.au/cn/past/)
   ___________________________________________________

36. Question and Answer Section

__ 36.1 How do I find information on the internet?

   A search for information on the internet is not essentially different
   from the standard information search process. You still need to start by
   outlining carefully just what you are hoping to locate. You also need to
   be aware of the peculiarities of the internet as a researchable resource
   (or rather a collection of resources). If you expect instant delivery of
   exactly what you require, free, then you need a reality check (and I am
   sure you will get one real soon). Sadly, the printed media tends to
   overlook this.

   As with all resources, the more familiar you are with a given resource,
   the more efficiently you will work. Get to know the internet for a time
   first. Understand how it works. Then re-adjust your expectations and
   file it as just another collection of resources, perhaps preferable in
   certain circumstances.

   A more complete answer to this question starts with a great deal of
   reading and is a primary purpose of The Spire Project.

__ 36.2 The Squeeze is on the Info-Broker

   I was reading an interesting article by Anthea Stratigos in ONLINE [1]
   which stirred me to thinking about the future of Information Brokerage.
   The article in question outlined the shift of information brokers into
   the marketing department, towards new roles in negotiating information
   access licenses, helping people understand and select appropriate
   resources - and oddly, in overseeing the intranet development process so
   as to deliver the information people need.

   The article's premise is rather accurate - as far as it goes. But I wonder
   if the true message behind this shift is the decline and death of
   information brokering as a profession? If information brokers (also
   known as information professionals) are moving to new roles, are they
   vacating the old roles, the traditional roles in the research process?

   In my library, I reach for The Information Broker's Handbook [2] for a
   relevant quote:

   "The heart and soul of the information broker's job is information
   retrieval. But many individuals offer information organization services
   as well."

   So, Information Retrieval, and Information Organization.

   Anyone who has seen the simple information retrieval options
   incorporated in recent information packages can be in no doubt that the
   information retailing industry is minimizing the need to reach
   for an intermediary. Technology is certainly closing the gap - but this
   development has always been in the cards.

   A central difficulty for information brokers is a simple maxim: provide
   better results than clients doing the search themselves. Often working
   in unfamiliar territory, a researcher may find it very difficult to
   excel. There are two dilemmas here. Firstly, while we may pride
   ourselves on accomplishing unique requests, we face high costs
   associated with one-off searches. There is little likelihood someone
   else will ask a similar question. There are simply no possible economies
   of scale.

   Secondly, our search difficulty is not shared by the client. The client
   has difficulty with the technology - certainly. The client does not have
   difficulty separating the wheat from the chaff, spotting the gold
   embedded in the articles and, at a basic level, knowing the search words
   needed to get to the right stuff.

   There is a very good reason why university students are pushed to learn
   basic and sophisticated search technologies.

   There is another take on this story.

   Creating Value in the Network Economy [3] includes a chapter by Philip
   Evans and Thomas Wurster.

   "emerging open standards and the explosion in the number of people and
   organizations connected by networks are freeing information from the
   channels that have been required to exchange it, making those channels
   unnecessary or uneconomical."

   "Newspapers and banking are not special cases. The value chains of
   scores of other industries will become ripe for unbundling. The logic is
   most compelling - and therefore likely to strike soonest - in
   information businesses ... All it will take to deconstruct a business is
   a competitor that focuses on the vulnerable sliver of information in its
   value chain."

   And in the back of my mind comes the thought that maybe the information
   retrieval function we have been providing is just one such information
   business. This business, attempting to be the pinnacle of the research
   process, is ripe for unbundling. Not only can our function be
   incorporated directly into the advertising and technology of the
   information resources we use, but our skill can also be coded into
   simpler and simpler guides and resources like my work on The Spire
   Project.

   Perhaps as an industry we never managed to secure our captive market.

   Initially, this will affect that mainstay of information brokerage:
   commercial database retrieval. And like the newspapers that will begin
   to lose the profit center of classified advertising (ripe for unbundling
   and delivered electronically), additional pressure will be applied to
   the business of providing information research services.

   Eventually, we retreat to other areas as information professionals:
   Information Organization, Research Education and Training.

   Somewhere amidst this story lies a new role for researchers. The need
   for research certainly exists and is forecast to grow dramatically as
   the information age develops. What is lost, sadly, is an understanding
   of the ease with which this work will be done. This is certainly destined
   to move away from being an industry for professionals working at $50/hr
   to $150/hr + costs! Others can provide this work, more easily than now:
   people we will most likely call researchers - and not information
   brokers.

   This is more than a push towards specialization. There is another way to
   see this transformation. The information broker was a retail point for
   wholesalers who are now firmly selling directly to the consumer. There
   is much less of a need for an intermediary between database retailers
   and information consumers - and there is a firm trend in this direction.

   Information brokers defined their role in the information industry as
   masters of the difficult technology of research, capable of finding most
   anything. Come to us when you are lost and we will find the answers -
   for a price. We know the technology, the meta-resources, the tricks used
   to find information. We routinely retrieve a higher quality of
   information, far faster, than you can yourself. The standard model: a
   library-run service offering primarily database search & retrieval for
   their patrons.

   This business model is coming to an end.

   Yes, perhaps the information broker is dead. Soon to be replaced with
   low-wage researchers and research assistants, and high-end information
   executives and research trainers. Like it or not, most of us will
   incorporate a little more research into our current work, and reach for
   slightly more intelligible research resources. Everything else will be
   accomplished by true specialists.

   [1] Online (a periodical with some coverage of library & information
   research). July/August 1999, p71-73, by Anthea Stratigos of Outsell Inc.

   [2] The Information Broker's Handbook, p21, by Sue Rugge and Alfred
   Glossbrenner. Windcrest/McGraw-Hill. 1992.

   [3] Creating Value in the Network Economy, edited by Don Tapscott.
   Chapter 2: Strategy and the New Economics of Information by Philip Evans
   & Thomas Wurster. p18 & 25. A Harvard Business Review Book.
   ___________________________________________________

37. Acknowledgements

   I would like to thank my wife Fiona, whom I love and cherish dearly.

   The Spire Project is the culmination of several years bridging
   information research and internet development. The information research
   industry is on the verge of a radical transformation set to add meaning
   to the oft-used saying "Information Revolution". The development of the
   internet is currently delayed by many factors, but to grow further, we
   need to radically improve the middle ground of content-rich
   resource-linked webpages. I feel this is the most beautiful form
   information can take in this emerging information landscape. It is also
   a most effortful area to work in.

   The Spire Project is the most advanced information guide today. Thanks
   to the many readers who assist in building and refining this
   information. Your help is appreciated.
   ___________________________________________________
   Copyright (c) 1998 by David Novak, all rights reserved.
   This FAQ may be posted to any USENET newsgroup, on-line service,
   website, or BBS as long as it is posted unaltered in its entirety
   including this copyright statement. This FAQ may not be included in
   commercial collections or compilations without express permission from
   the author. Please post permission requests to david\@cn.net.au
   -----------------------------------
   David Novak - david\@cn.net.au