NAME
UMLS-Association README
SYNOPSIS
This package consists of Perl modules along with supporting Perl
programs that calculate the association between CUI pairs using
frequency information from the Metamapped Medline baseline.
UMLS::Association requires the Text::NSP module to calculate the
association measures; the measures available are those that Text::NSP
implements for bigrams.
UMLS::Association requires the UMLS::Interface module to access
the Unified Medical Language System (UMLS) to map input terms
to Concept Unique Identifiers (CUIs) and provide additional
information.
The following sections describe the organization of this software
package and how to use it. A few typical examples illustrate the use of
the modules and the supporting utilities.
INSTALL
To install the module, run the following commands:
    perl Makefile.PL
    make
    make test
    make install
This will install the module in the standard location. You will most
likely need root privileges to install in standard system directories.
To install in a non-standard directory, specify a prefix during the
'perl Makefile.PL' stage:
    perl Makefile.PL PREFIX=/home/programs
Other parameters can also be modified during installation; the details
are in the ExtUtils::MakeMaker documentation. However, we strongly
recommend leaving the other parameters alone unless you know what you
are doing.
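When the package is installed under a non-standard prefix, Perl will
generally not search that location unless told to. The sketch below is
one way to check this from a script. It assumes the modules ended up
under /home/programs/lib/perl5; that path is a guess, since the exact
layout below PREFIX depends on the Perl version and the
ExtUtils::MakeMaker settings, so adjust it to the directory that
'make install' reports. The module_available sub is a hypothetical
helper, not part of the package.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ASSUMPTION: hypothetical install location under PREFIX=/home/programs.
# Check the output of 'make install' for the real directory.
use lib '/home/programs/lib/perl5';

# Try to load a module by name without aborting the script if it is missing.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # UMLS::Association -> UMLS/Association.pm
    return eval { require $file; 1 } ? 1 : 0;
}

print "UMLS::Association is ",
    (module_available('UMLS::Association') ? "" : "NOT "),
    "visible to this Perl\n";
```

Setting the PERL5LIB environment variable to the same directory has the
same effect as the 'use lib' line, without editing each script.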
DATABASE SETUP
UMLS-Association assumes that the CUI bigrams extracted from the
Metamapped Medline baseline are stored in a MySQL database. The name of
this database can be passed as a configuration option at
initialization. If no name is provided, the default is used: the
database is called CUI_BIGRAMS and contains four tables:
    1. N_11
    2. N_1P
    3. N_P1
    4. N_PP
Directions for installing the CUI_BIGRAMS database are in the INSTALL
file. All other tables in the database are ignored; if any of these four
tables is missing, an error is raised.
The MySQL server can run on the same machine as the module or on a
remotely accessible machine. The location of the server can be
provided when the module is initialized.
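The four default table names mirror the notation Text::NSP uses for the
2x2 contingency table of a bigram: n11 (how often the two CUIs occur
together as a pair), n1p (how often the first CUI occurs in the first
position), np1 (how often the second CUI occurs in the second position)
and npp (the total number of bigrams). As a rough illustration of how
such counts turn into an association score, the sketch below computes
two standard measures, the Dice coefficient and pointwise mutual
information, from hand-picked numbers. That the tables hold exactly
these counts is an assumption, the sub names are hypothetical, and this
is not the code that UMLS::Association or Text::NSP runs.

```perl
use strict;
use warnings;

# Dice coefficient: 2 * n11 / (n1p + np1).
sub dice {
    my ($n11, $n1p, $np1) = @_;
    my $denominator = $n1p + $np1;
    return $denominator > 0 ? 2 * $n11 / $denominator : 0;
}

# Pointwise mutual information: log2(n11 / m11), where
# m11 = n1p * np1 / npp is the count expected if the two CUIs
# were independent. Returns undef when any count is not positive.
sub pmi {
    my ($n11, $n1p, $np1, $npp) = @_;
    return undef if $n11 <= 0 || $n1p <= 0 || $np1 <= 0 || $npp <= 0;
    my $m11 = $n1p * $np1 / $npp;
    return log($n11 / $m11) / log(2);
}

# Hand-picked example counts (not taken from any real database).
my %counts = (n11 => 8, n1p => 16, np1 => 32, npp => 1024);

printf "dice = %.4f\n", dice(@counts{qw(n11 n1p np1)});
printf "pmi  = %.4f\n", pmi(@counts{qw(n11 n1p np1 npp)});
```

With the counts shown, Dice is 16/48 and PMI is log2(8 / 0.5) = 4.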
INITIALIZING THE MODULE
To create a UMLS::Association object using default values for all
configuration options:
    use UMLS::Association;
    my $association = UMLS::Association->new();
The following configuration options are also available:
    'driver'   -> Default value 'mysql'. This option specifies the
                  Perl DBD driver that should be used to access the
                  database. Another DBMS (such as PostgreSQL) could
                  therefore also be used, as long as a Perl DBD
                  driver exists to access it.
    'database' -> Default value 'CUI_BIGRAMS'. This option specifies
                  the name of the UMLS-Association database.
    'hostname' -> Default value 'localhost'. The name or the IP
                  address of the machine on which the database
                  server is running.
    'socket'   -> Default value '/tmp/mysql.sock'. The socket that
                  the database server is using.
    'port'     -> The port number on which the database server
                  accepts connections.
    'username' -> Username to use to connect to the database server.
                  If not provided, the module attempts to connect as
                  an anonymous user.
    'password' -> Password for access to the database server. If not
                  provided, the module attempts to access the server
                  without a password.
These options are passed to new() as a reference to a hash. For example:
    # Options that are not set here keep their default values.
    my %options = ();
    $options{'config'} = $config;
    $options{'database'} = 'CUI_BIGRAM_V1';
    my $association = UMLS::Association->new(\%options);
Keep in mind that the database configuration options can also be placed
in the MySQL my.cnf file, which is the preferred approach. The
directions for this are in the INSTALL file.
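To connect to a database on another machine, the connection options
listed above can be combined in the same hash. The following is a
minimal sketch under stated assumptions: the host name, port, user name
and environment variable are placeholders, build_options is a
hypothetical helper rather than part of the package, and the
constructor call is only attempted when UMLS::Association is actually
installed.

```perl
use strict;
use warnings;

# Hypothetical helper: drop options whose value is undefined, so that
# those options fall back to the module defaults.
sub build_options {
    my (%args) = @_;
    my %options = map { $_ => $args{$_} } grep { defined $args{$_} } keys %args;
    return \%options;
}

my $options = build_options(
    database => 'CUI_BIGRAMS',
    hostname => 'db.example.org',   # placeholder host
    port     => 3306,               # placeholder; the usual MySQL port
    username => 'umls_reader',      # placeholder account
    password => $ENV{UMLS_ASSOCIATION_PASSWORD},   # hypothetical variable, may be unset
);

# Only attempt to connect when the module is installed. A failure to
# initialize is reported instead of aborting the script.
if (eval { require UMLS::Association; 1 }) {
    my $association = eval { UMLS::Association->new($options) };
    print defined $association
        ? "connected\n"
        : "could not initialize: $@\n";
}
else {
    print "UMLS::Association is not installed\n";
}
```

Reading the password from the environment keeps it out of the script
itself; the my.cnf approach described above serves the same purpose.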
CONTENTS
All the modules that will be installed in the Perl system directory are
present in the '/lib' directory tree of the package.
The package contains a utils/ directory that contains Perl utility
programs. These utilities use the modules or provide some supporting
functionality.
    umls-association.pl -- returns the association score of two
                           terms or UMLS CUIs given a specified
                           measure (and view of the UMLS).
    CUICollector.pl     -- script to create the CUI_BIGRAMS database
                           from the Metamapped Medline baseline.
The package also contains an Apache Hadoop MapReduce Java implementation
of CUICollector.pl, named CUICollectorMapReduce, in the '/Hadoop'
directory. Refer to the README file in the '/Hadoop' directory for
installation and usage instructions.
REFERENCING
If you write a paper that has used UMLS-Association in some way, we'd
certainly be grateful if you sent us a copy.
CONTACT US
If you have any trouble installing or using UMLS-Association, please
contact us via the users mailing list:
[email protected]
You can join this group by going to:
<http://tech.groups.yahoo.com/group/umls-association/>
You may also contact us directly if you prefer:
Bridget T. McInnes: btmcinnes at vcu.edu
SOFTWARE COPYRIGHT AND LICENSE
Copyright (C) 2015 Bridget T McInnes
This suite of programs is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as published
by the Free Software Foundation; either version 2 of the License, or (at
your option) any later version.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
Note: The text of the GNU General Public License is provided in the file
'GPL.txt' that you should have received with this distribution.