mirror of
https://github.com/DSpace/DSpace.git
synced 2025-10-14 05:23:14 +00:00
Move docs into dspace CVS
git-svn-id: http://scm.dspace.org/svn/repo/trunk@1241 9c30dcfa-912a-0410-8fc2-9e0234be79fd
This commit is contained in:
1170
dspace/docs/application.html
Normal file
File diff suppressed because it is too large
Load Diff
137
dspace/docs/architecture.html
Normal file
@@ -0,0 +1,137 @@
|
|||||||
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">
|
||||||
|
<html>
|
||||||
|
<head>
|
||||||
|
<title>DSpace System Documentation: Architecture</title>
|
||||||
|
<link rel="StyleSheet" href="style.css" type="text/css">
|
||||||
|
<meta http-equiv="Content-Type"
|
||||||
|
content="text/html; charset=iso-8859-1">
|
||||||
|
</head>
|
||||||
|
<body>
|
||||||
|
<h1>DSpace System Documentation: Architecture</h1>
|
||||||
|
<p><a href="index.html">Back to contents</a></p>
|
||||||
|
<h2><a name="overview">Overview</a></h2>
|
||||||
|
<p>The DSpace system is organized into three layers, each of which consists of a number of components.</p>
|
||||||
|
<p class="figure"><img src="image/architecture-600x450.gif" alt="Application Layer, Business Logic Layer, Storage Layer"></p>
|
||||||
|
<p class="caption">DSpace System Architecture</p>
|
||||||
|
<p>The storage layer is responsible for physical storage of metadata and content. The business logic layer deals with managing the content of the archive, users of the archive (e-people), authorization, and workflow. The application layer contains components that communicate with the world outside of the individual DSpace installation, for example the Web user interface and the <a href="http://www.openarchives.org/">Open Archives Initiative</a> Protocol for Metadata Harvesting service.</p>
|
||||||
|
<p>Each layer only invokes the layer below it; for example, the application layer may not use the storage layer directly. Each component in the storage and business logic layers has a defined public API. The union of the APIs of those components is referred to as the Storage API (in the case of the storage layer) and the DSpace Public API (in the case of the business logic layer). These APIs are in-process Java classes, objects and methods.</p>
|
||||||
|
<p>It is important to note that each layer is <em>trusted</em>. Although the logic for <em>authorising actions</em> is in the business logic layer, the system relies on individual applications in the application layer to correctly and securely <em>authenticate</em> e-people. If a 'hostile' or insecure application were allowed to invoke the Public API directly, it could very easily perform actions as any e-person in the system.</p>
|
||||||
|
<p>The reason for this design choice is that authentication methods will vary widely between different applications, so it makes sense to leave the logic and responsibility for that in these applications.</p>
|
||||||
|
<p>The source code is organized to adhere very strictly to this three-layer architecture. Also, only methods in a component's public API are given the <code>public</code> access level. This means that the Java compiler helps ensure that the source code conforms to the architecture.</p>
|
||||||
|
<table>
<caption>Source Code Packages</caption> <tbody>
<tr>
<th>Packages within</th>
<th>Correspond to components in</th>
</tr>
<tr>
<td><code>org.dspace.app</code></td>
<td>Application layer</td>
</tr>
<tr>
<td><code>org.dspace</code></td>
<td>Business logic layer (except <code>storage</code> and <code>app</code>)</td>
</tr>
<tr>
<td><code>org.dspace.storage</code></td>
<td>Storage layer</td>
</tr>
</tbody>
</table>
|
||||||
|
<p>The storage and business logic layer APIs are extensively documented with Javadoc-style comments. Generate the HTML version of these by entering the source directory and running:</p>
|
||||||
|
<pre>ant public_api</pre>
|
||||||
|
<p>The package-level documentation of each package usually contains an overview of the package and some example usage. This information is not repeated in this architecture document; this and the Javadoc APIs are intended to be used in parallel.</p>
|
||||||
|
<p>Each layer is described in a separate section:</p>
|
||||||
|
<ul>
<li><a href="storage.html">Storage Layer</a>
<ul>
<li><a href="storage.html#rdbms">RDBMS</a></li>
<li><a href="storage.html#bitstreams">Bitstream Store</a></li>
</ul>
</li>
<li><a href="business.html">Business Logic Layer</a>
<ul>
<li><a href="business.html#core">Core Classes</a></li>
<li><a href="business.html#content">Content Management API</a></li>
<li><a href="business.html#workflow">Workflow System</a></li>
<li><a href="business.html#administer">Administration Toolkit</a></li>
<li><a href="business.html#eperson">E-person/Group Manager</a></li>
<li><a href="business.html#authorize">Authorisation</a></li>
<li><a href="business.html#handle">Handle Manager/Handle Plugin</a></li>
<li><a href="business.html#search">Search</a></li>
<li><a href="business.html#browse">Browse API</a></li>
<li><a href="business.html#history">History Recorder</a></li>
</ul>
</li>
<li><a href="application.html">Application Layer</a>
<ul>
<li><a href="application.html#webui">Web User Interface</a></li>
<li><a href="application.html#oai">OAI-PMH Data Provider</a></li>
<li><a href="application.html#itemimporter">Item Importer and Exporter</a></li>
<li><a href="application.html#transferitem">Transferring Items Between DSpace Instances</a></li>
<li><a href="application.html#registration">Registration</a></li>
<li><a href="application.html#mets">METS Tools</a></li>
<li><a href="application.html#mediafilters">Media Filters</a></li>
<li><a href="application.html#filiator">Sub-Community Management</a></li>
</ul>
</li>
</ul>
|
||||||
|
<hr>
|
||||||
|
<address> Copyright © 2002-2004 MIT and Hewlett Packard </address>
|
||||||
|
</body>
|
||||||
|
</html>
|
773
dspace/docs/business.html
Normal file
@@ -0,0 +1,773 @@
|
|||||||
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">
|
||||||
|
<HTML>
|
||||||
|
<HEAD>
|
||||||
|
<TITLE>DSpace System Documentation: Business Logic Layer</TITLE>
|
||||||
|
<LINK REL=StyleSheet HREF="style.css" TYPE="text/css">
|
||||||
|
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" >
|
||||||
|
</HEAD>
|
||||||
|
<BODY>
|
||||||
|
<H1>DSpace System Documentation: Business Logic Layer</H1>
|
||||||
|
|
||||||
|
<P><A HREF="index.html">Back to contents</A><BR><A HREF="architecture.html">Back to architecture overview</A></P>
|
||||||
|
|
||||||
|
<H2><A NAME="core">Core Classes</A></H2>
|
||||||
|
|
||||||
|
<P>The <code>org.dspace.core</code> package provides some basic classes that are used throughout the DSpace code.</P>
|
||||||
|
|
||||||
|
<H3>The Configuration Manager (<code>ConfigurationManager</code>)</H3>
|
||||||
|
|
||||||
|
<P>The configuration manager is responsible for reading the main <code>dspace.cfg</code> properties file, managing the 'template' configuration files for other applications such as Apache, and for obtaining the text for e-mail messages.</P>
|
||||||
|
|
||||||
|
<P>The system is configured by editing the relevant files in <code>/dspace/config</code>, as described in the <a href="configure.html">configuration section</a>.</p>
|
||||||
|
|
||||||
|
<P><strong>When editing configuration files for applications that DSpace uses, such as Apache, remember to edit the file in <code>/dspace/config/templates</code> and then run <code>/dspace/bin/install-configs</code> rather than editing the 'live' version directly!</strong></P>
|
||||||
|
|
||||||
|
<P>The <code>ConfigurationManager</code> class can also be invoked as a command line tool, with two possible uses:</P>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI>
|
||||||
|
<PRE><code>/dspace/bin/install-configs</code></PRE>
|
||||||
|
<P>This processes and installs configuration files for other applications, as described in the <A HREF="configure.html#templates">configuration section</A>.</P>
|
||||||
|
</LI>
|
||||||
|
<LI>
|
||||||
|
<PRE><code>/dspace/bin/dsrun org.dspace.core.ConfigurationManager -property property.name</code></PRE>
|
||||||
|
<P>This writes the value of <code>property.name</code> from <code>dspace.cfg</code> to the standard output, so that shell scripts can access the DSpace configuration. For an example, see <code>/dspace/bin/start-handle-server</code>. If the property has no value, nothing is written.</P>
|
||||||
|
</LI>
|
||||||
|
</UL>
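<P>The property lookup described above can be sketched in plain Java. This is only an illustration, assuming that <code>dspace.cfg</code> can be read as a standard Java properties file; the class <code>PropertyLookupSketch</code>, the property name <code>example.dir</code> and its value are hypothetical and are not part of DSpace.</P>

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Stand-alone sketch; this is NOT the DSpace ConfigurationManager.
class PropertyLookupSketch {

    // Parses properties-file text (a hypothetical stand-in for dspace.cfg).
    static Properties parse(String text) {
        Properties config = new Properties();
        try {
            config.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return config;
    }

    // Returns the text to print: the property value, or an empty string
    // when the property has no value, so that nothing is written.
    static String lookup(Properties config, String name) {
        String value = config.getProperty(name);
        return value == null ? "" : value;
    }

    public static void main(String[] args) {
        // Hypothetical property name and value, for illustration only.
        Properties config = parse("example.dir = /dspace/example\n");
        System.out.print(lookup(config, "example.dir"));
        System.out.print(lookup(config, "no.such.property"));
    }
}
```

<P>Printing nothing for an unset property lets a calling shell script test for an empty string rather than parse an error message.</P>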
|
||||||
|
|
||||||
|
|
||||||
|
<H3>Constants</H3>
|
||||||
|
|
||||||
|
<P>This class contains constants that are used to represent types of object and actions in the database. For example, authorization policies can relate to objects of different types, so the <code>resourcepolicy</code> table has columns <code>resource_id</code>, which is the internal ID of the object, and <code>resource_type_id</code>, which indicates whether the object is an item, collection, bitstream etc. The value of <code>resource_type_id</code> is taken from the <code>Constants</code> class, for example <code>Constants.ITEM</code>.</P>
|
||||||
|
|
||||||
|
|
||||||
|
<H3>Context</H3>
|
||||||
|
|
||||||
|
<P>The <code>Context</code> class is central to the operation of DSpace. Any code that wishes to use any API in the business logic layer must first create a <code>Context</code> object. This is akin to opening a connection to a database (which is in fact one of the things that happens).</P>
|
||||||
|
|
||||||
|
<P>A context object is involved in most method calls and object constructors, so that the method or object has access to information about the current operation. When the context object is constructed, the following information is automatically initialized:</P>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI><P>A connection to the database. This is a transaction-safe connection, i.e. the 'auto-commit' flag is set to false.</P></LI>
|
||||||
|
<LI><P>A cache of content management API objects. Each time a content object is created (for example <code>Item</code> or <code>Bitstream</code>) it is stored in the <code>Context</code> object. If the object is then requested again, the cached copy is used. Apart from reducing database use, this addresses the problem of having two copies of the same object in memory in different states.</P></LI>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<P>The following information is also held in a context object, though it is the responsibility of the application creating the context object to fill it out correctly:</P>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI><P>The current authenticated user, if any</P></LI>
|
||||||
|
<LI><P>Any 'special groups' the user is a member of. For example, a user might automatically be part of a particular group based on the IP address they are accessing DSpace from, even though they don't have an e-person record. Such a group is called a 'special group'.</P></LI>
|
||||||
|
<LI><P>Any extra information from the application layer that should be added to log messages that are written within this context. For example, the Web UI adds a session ID, so that when the logs are analysed the actions of a particular user in a particular session can be tracked.</P></LI>
|
||||||
|
<LI>
|
||||||
|
<P>A flag indicating whether authorization should be circumvented. This should only be used in rare, specific circumstances. For example, when first installing the system, there are no authorized administrators who would be able to create an administrator account!</P>
|
||||||
|
<P>As noted above, the public API is <em>trusted</em>, so it is up to applications in the application layer to use this flag responsibly.</P>
|
||||||
|
</LI>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<P>Typical use of the context object will involve constructing one, and setting the current user if one is authenticated. Several operations may be performed using the context object. If all goes well, <code>complete</code> is called to commit the changes and free up any resources used by the context. If anything has gone wrong, <code>abort</code> is called to roll back any changes and free up the resources.</P>
|
||||||
|
|
||||||
|
<P>You should always <code>abort</code> a context if <em>any</em> error happens during its lifespan; otherwise the data in the system may be left in an inconsistent state. You can also <code>commit</code> a context, which means that any changes are written to the database, and the context is kept active for further use.</P>
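<P>The complete-or-abort pattern described above can be sketched as follows. The class <code>ContextSketch</code> is a hypothetical stand-in that models only the commit and roll-back behaviour; it is not <code>org.dspace.core.Context</code>, and the real class's constructor and method signatures should be taken from the Javadoc.</P>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a context; models only commit/complete/abort.
class ContextSketch {
    private final List<String> pending = new ArrayList<>();
    private final List<String> stored;   // simulated database
    private boolean valid = true;

    ContextSketch(List<String> stored) { this.stored = stored; }

    void write(String change) {
        if (!valid) throw new IllegalStateException("context no longer valid");
        pending.add(change);
    }

    // Writes pending changes and keeps the context active.
    void commit() { stored.addAll(pending); pending.clear(); }

    // Writes pending changes and releases the context.
    void complete() { commit(); valid = false; }

    // Discards pending changes and releases the context.
    void abort() { pending.clear(); valid = false; }
}

class ContextUsageSketch {
    // Completes the context on success and aborts it on any error.
    static boolean run(List<String> database, boolean fail) {
        ContextSketch context = new ContextSketch(database);
        try {
            context.write("first change");
            if (fail) throw new RuntimeException("simulated failure");
            context.write("second change");
            context.complete();
            return true;
        } catch (RuntimeException e) {
            context.abort();   // nothing from this context reaches storage
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> database = new ArrayList<>();
        System.out.println(run(database, true) + " " + database);
        System.out.println(run(database, false) + " " + database);
    }
}
```

<P>Because the failing run aborts before anything is written, the simulated database never holds the first change without the second.</P>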
|
||||||
|
|
||||||
|
|
||||||
|
<H3>Email</H3>
|
||||||
|
|
||||||
|
<P>Sending e-mail is straightforward: use the configuration manager's <code>getEmail</code> method, set the arguments and recipients, and send.</P>
|
||||||
|
|
||||||
|
<P>The e-mail texts are stored in <code>/dspace/config/emails</code>. They are processed by the standard <code>java.text.MessageFormat</code>. At the top of each e-mail are listed the appropriate arguments that should be filled out by the sender. Example usage is shown in the <code>org.dspace.core.Email</code> Javadoc API documentation.</P>
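<P>As an illustration of how <code>java.text.MessageFormat</code> fills in numbered arguments, here is a minimal sketch. The template text and the class <code>EmailTemplateSketch</code> are hypothetical; the real templates are the files in <code>/dspace/config/emails</code>, and the real sending API is documented in the <code>org.dspace.core.Email</code> Javadoc.</P>

```java
import java.text.MessageFormat;

// Stand-alone sketch of argument substitution; not the DSpace Email class.
class EmailTemplateSketch {
    // Hypothetical template; {0} and {1} are the arguments the sender fills in.
    static final String TEMPLATE =
        "A new submission titled \"{0}\" is waiting in collection {1}.";

    static String fill(String title, String collection) {
        return MessageFormat.format(TEMPLATE, title, collection);
    }

    public static void main(String[] args) {
        System.out.println(fill("My Thesis", "Theses"));
    }
}
```

<P>Note that <code>MessageFormat</code> treats a single quote as a quoting character, so an apostrophe in a template has to be written as two single quotes.</P>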
|
||||||
|
|
||||||
|
|
||||||
|
<H3>LogManager</H3>
|
||||||
|
|
||||||
|
<P>The log manager consists of a method that creates a standard log header, and returns it as a string suitable for logging. Note that this class does not actually write anything to the logs; the log header returned should be logged directly by the sender using an appropriate Log4J call, so that information about where the logging is taking place is also stored.</P>
|
||||||
|
|
||||||
|
<P>The level of logging can be configured on a per-package or per-class basis by editing <code>/dspace/config/templates/log4j.properties</code> and then executing <code>/dspace/bin/install-configs</code>. You will need to stop and restart Tomcat for the changes to take effect.</P>
|
||||||
|
|
||||||
|
<P>A typical log entry looks like this:</P>
|
||||||
|
|
||||||
|
<P><code>2002-11-11 08:11:32,903 INFO org.dspace.app.webui.servlet.DSpaceServlet @ anonymous:session_id=BD84E7C194C2CF4BD0EC3A6CAD0142BB:view_item:handle=1721.1/1686</code></P>
|
||||||
|
|
||||||
|
<P>This breaks down as follows:</P>
|
||||||
|
|
||||||
|
<table>
|
||||||
|
<tr>
|
||||||
|
<td>Date and time, milliseconds</td>
|
||||||
|
<td><code>2002-11-11 08:11:32,903</code></td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>Level (<code>FATAL</code>, <code>WARN</code>, <code>INFO</code> or <code>DEBUG</code>)</td>
|
||||||
|
<td><code>INFO</code></td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>Java class</td>
|
||||||
|
<td><code>org.dspace.app.webui.servlet.DSpaceServlet</code></td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td></td>
|
||||||
|
<td><code>@</code></td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>User email or <code>anonymous</code></td>
|
||||||
|
<td><code>anonymous</code></td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td></td>
|
||||||
|
<td><code>:</code></td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>Extra log info from context</td>
|
||||||
|
<td><code>session_id=BD84E7C194C2CF4BD0EC3A6CAD0142BB</code></td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td></td>
|
||||||
|
<td><code>:</code></td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>Action</td>
|
||||||
|
<td><code>view_item</code></td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td></td>
|
||||||
|
<td><code>:</code></td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>Extra info</td>
|
||||||
|
<td><code>handle=1721.1/1686</code></td>
|
||||||
|
</tr>
|
||||||
|
</table>
|
||||||
|
|
||||||
|
<P>The above format allows the logs to be easily parsed and analysed. The <code>/dspace/bin/log-reporter</code> script is a simple tool for analysing logs. Try:</P>
|
||||||
|
|
||||||
|
<PRE>/dspace/bin/log-reporter --help</PRE>
|
||||||
|
|
||||||
|
<P>It's a good idea to 'nice' this log reporter to avoid an impact on server performance.</P>
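<P>To show how the log format lends itself to parsing, the following sketch splits the sample entry above into its fields with a regular expression. It is not the <code>log-reporter</code> script; the class name is hypothetical, and the pattern assumes that only the final field may contain a colon.</P>

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-alone sketch; not part of DSpace.
class LogLineSketch {
    // timestamp, level, class, "@", user, extra log info, action, extra info
    private static final Pattern LINE = Pattern.compile(
        "^(\\S+ \\S+) (\\S+)\\s+(\\S+) @ ([^:]*):([^:]*):([^:]*):(.*)$");

    // Returns the seven fields in order, or null if the line does not match.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) return null;
        String[] fields = new String[m.groupCount()];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = m.group(i + 1);
        }
        return fields;
    }

    public static void main(String[] args) {
        String sample = "2002-11-11 08:11:32,903 INFO "
            + "org.dspace.app.webui.servlet.DSpaceServlet @ "
            + "anonymous:session_id=BD84E7C194C2CF4BD0EC3A6CAD0142BB:"
            + "view_item:handle=1721.1/1686";
        for (String field : parse(sample)) {
            System.out.println(field);
        }
    }
}
```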
|
||||||
|
|
||||||
|
|
||||||
|
<H3>Utils</H3>
|
||||||
|
|
||||||
|
<P><code>Utils</code> contains miscellaneous utility methods that are required in a variety of places throughout the code, and thus have no particular 'home' in a subsystem.</P>
|
||||||
|
|
||||||
|
|
||||||
|
<H2><A NAME="content">Content Management API</A></H2>
|
||||||
|
|
||||||
|
<P>The content management API package <code>org.dspace.content</code> contains Java classes for reading and manipulating content stored in the DSpace system. This is the API that components in the application layer will probably use most.</P>
|
||||||
|
|
||||||
|
<P>Classes corresponding to the main elements in the <A HREF="functional.html#data_model">DSpace data model</A> (<code>Community</code>, <code>Collection</code>, <code>Item</code>, <code>Bundle</code> and <code>Bitstream</code>) are sub-classes of the abstract class <code>DSpaceObject</code>. The <code>Item</code> object handles the Dublin Core metadata record.</P>
|
||||||
|
|
||||||
|
<P>Each class generally has one or more static <code>find</code> methods, which are used to instantiate content objects. Constructors do not have public access and are just used internally. The reasons for this are:</P>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI><P>"Constructing" an object may be misconstrued as the action of creating an object in the DSpace system, for example one might expect something like:</P>
|
||||||
|
|
||||||
|
<PRE>Context context = new Context();
Item myItem = new Item(context, id);</PRE>
|
||||||
|
|
||||||
|
<P>to construct a brand new item in the system, rather than simply instantiating an in-memory instance of an object in the system.</P></LI>
|
||||||
|
|
||||||
|
<LI><P><code>find</code> methods may often be called with invalid IDs, and return <code>null</code> in such a case. A constructor would have to throw an exception in this case. A <code>null</code> return value from a static method can in general be dealt with more simply in code.</P></LI>
|
||||||
|
|
||||||
|
<LI><P>If an instantiation representing the same underlying archival entity already exists, the <code>find</code> method can simply return that same instantiation to avoid multiple copies and any inconsistencies which might result.</P></LI>
|
||||||
|
</UL>
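<P>The reasons above can be illustrated with a small sketch of a <code>find</code> method backed by a cache. Everything here is a hypothetical stand-in: <code>FindSketch</code> plays the role of a context with its object cache, and <code>Thing</code> plays the role of a content object. In the real API the <code>find</code> methods are static and take a <code>Context</code>, as shown elsewhere in this section.</P>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a context holding a cache of content objects.
class FindSketch {
    // Hypothetical content object; its constructor is not publicly usable.
    static final class Thing {
        final int id;
        final String name;
        private Thing(int id, String name) { this.id = id; this.name = name; }
    }

    private final Map<Integer, String> storage = new HashMap<>(); // simulated database
    private final Map<Integer, Thing> cache = new HashMap<>();

    FindSketch() { storage.put(123, "Existing thing"); }

    // Returns the cached instance if there is one, loads it otherwise,
    // and returns null (rather than throwing) for an unknown ID.
    Thing find(int id) {
        Thing cached = cache.get(id);
        if (cached != null) return cached;
        String name = storage.get(id);
        if (name == null) return null;
        Thing loaded = new Thing(id, name);
        cache.put(id, loaded);
        return loaded;
    }

    public static void main(String[] args) {
        FindSketch context = new FindSketch();
        System.out.println(context.find(999) == null);
        System.out.println(context.find(123) == context.find(123));
    }
}
```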
|
||||||
|
|
||||||
|
<P><code>Collection</code>, <code>Bundle</code> and <code>Bitstream</code> do not have <code>create</code> methods; rather, one has to create an object using the relevant method on the container. For example, to create a collection, one must invoke <code>createCollection</code> on the community that the collection is to appear in:</P>
|
||||||
|
|
||||||
|
<PRE>Context context = new Context();
|
||||||
|
Community existingCommunity = Community.find(context, 123);
|
||||||
|
Collection myNewCollection = existingCommunity.createCollection();</PRE>
|
||||||
|
|
||||||
|
<P>The primary reason for this is for determining authorization. In order to know whether an e-person may create an object, the system must know which container the object is to be added to. It makes no sense to create a collection outside of a community, and the authorization system does not have a policy for that.</P>
|
||||||
|
|
||||||
|
<P><code>Item</code>s are first created in the form of an implementation of <code>InProgressSubmission</code>. An <code>InProgressSubmission</code> represents an item under construction; once it is complete, it is installed into the main archive and added to the relevant collection by the <code>InstallItem</code> class. The <code>org.dspace.content</code> package provides an implementation of <code>InProgressSubmission</code> called <code>WorkspaceItem</code>; this is a simple implementation that contains some fields used by the Web submission UI. The <code>org.dspace.workflow</code> package also contains an implementation called <code>WorkflowItem</code>, which represents a submission undergoing a workflow process.</P>
|
||||||
|
|
||||||
|
<P>In the previous chapter there is an <A HREF="functional.html#ingest">overview of the item ingest process</A> which should clarify the previous paragraph. Also see the section on <A HREF="#workflow">the workflow system</A>.</P>
|
||||||
|
|
||||||
|
<P><code>Community</code> and <code>BitstreamFormat</code> do have static <code>create</code> methods; one must be a site administrator to have authorization to invoke these.</P>
|
||||||
|
|
||||||
|
|
||||||
|
<H3>Other Classes</H3>
|
||||||
|
|
||||||
|
<P>Classes whose name begins <code>DC</code> are for manipulating Dublin Core metadata, as <A HREF="#dublincore">explained below</A>.</P>
|
||||||
|
|
||||||
|
<P>The <code>FormatIdentifier</code> class attempts to guess the bitstream format of a particular bitstream. Presently, it does this simply by looking at any file extension in the bitstream name and matching it up with the file extensions associated with bitstream formats. Hopefully this can be greatly improved in the future!</P>
|
||||||
|
|
||||||
|
<P>The <code>ItemIterator</code> class allows items to be retrieved from storage one at a time, and is returned by methods that may return a large number of items, more than would be desirable to have in memory at once.</P>
|
||||||
|
|
||||||
|
<P>The <code>ItemComparator</code> class is an implementation of the standard <code>java.util.Comparator</code> that can be used to compare and order items based on a particular Dublin Core metadata field.</P>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<H3>Modifications</H3>
|
||||||
|
|
||||||
|
<P>When creating, modifying or removing data with the content management API, it is important to know when changes happen in-memory, and when they occur in the physical DSpace storage.</P>
|
||||||
|
|
||||||
|
<P>Primarily, one should note that no change made using a particular <code>org.dspace.core.Context</code> object will actually be made in the underlying storage unless <code>complete</code> or <code>commit</code> is invoked on that <code>Context</code>. If anything should go wrong during an operation, the context should always be aborted by invoking <code>abort</code>, to ensure that no inconsistent state is written to the storage.</P>
|
||||||
|
|
||||||
|
<P>Additionally, some changes made to objects only happen in-memory. In these cases, invoking the <code>update</code> method lines up the in-memory changes to occur in storage when the <code>Context</code> is committed or completed. In general, methods that change any [meta]data field only make the change in-memory; methods that involve relationships with other objects in the system line up the changes to be committed with the context. See individual methods in the API Javadoc.</P>
|
||||||
|
|
||||||
|
<P>Some examples to illustrate this are shown below:</P>
|
||||||
|
|
||||||
|
<TABLE>
|
||||||
|
<TR>
|
||||||
|
<TD><PRE>Context context = new Context();
|
||||||
|
Bitstream b = Bitstream.find(context, 1234);
|
||||||
|
b.setName("newfile.txt");
|
||||||
|
b.update();
|
||||||
|
context.complete();</PRE>
|
||||||
|
</TD>
|
||||||
|
<TD><strong>Will</strong> change storage</TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><PRE>Context context = new Context();
|
||||||
|
Bitstream b = Bitstream.find(context, 1234);
|
||||||
|
b.setName("newfile.txt");
|
||||||
|
b.update();
|
||||||
|
context.abort();</PRE>
|
||||||
|
</TD>
|
||||||
|
<TD><strong>Will not</strong> change storage (context aborted)</TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><PRE>Context context = new Context();
|
||||||
|
Bitstream b = Bitstream.find(context, 1234);
|
||||||
|
b.setName("newfile.txt");
|
||||||
|
context.complete();</PRE>
|
||||||
|
</TD>
|
||||||
|
<TD>The new name <strong>will not</strong> be stored since <code>update</code> was not invoked</TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><PRE>Context context = new Context();
|
||||||
|
Bitstream bs = Bitstream.find(context, 1234);
|
||||||
|
Bundle bnd = Bundle.find(context, 5678);
|
||||||
|
bnd.add(bs);
|
||||||
|
context.complete();</PRE>
|
||||||
|
</TD>
|
||||||
|
<TD>The bitstream <strong>will</strong> be included in the bundle, since <code>update</code> doesn't need to be called</TD>
|
||||||
|
</TR>
|
||||||
|
</TABLE>
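<P>The first three rows of the table can be reproduced with a small stand-alone model. This is a sketch under stated assumptions, not DSpace code: <code>UpdateSketch</code> stands in for a context with simulated storage, <code>BitstreamSketch</code> stands in for a content object, and the original name <code>oldfile.txt</code> is invented for the example.</P>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of in-memory changes, update() and complete()/abort().
class UpdateSketch {
    final Map<Integer, String> storage = new HashMap<>();        // simulated storage
    private final Map<Integer, String> queued = new HashMap<>(); // lined-up changes

    class BitstreamSketch {
        final int id;
        private String name;
        BitstreamSketch(int id) { this.id = id; this.name = storage.get(id); }
        void setName(String name) { this.name = name; }  // in-memory only
        void update() { queued.put(id, name); }           // lines up the change
    }

    void complete() { storage.putAll(queued); queued.clear(); }
    void abort() { queued.clear(); }

    // Renames a bitstream and returns the name that ends up in storage.
    static String scenario(boolean callUpdate, boolean complete) {
        UpdateSketch context = new UpdateSketch();
        context.storage.put(1234, "oldfile.txt");
        BitstreamSketch b = context.new BitstreamSketch(1234);
        b.setName("newfile.txt");
        if (callUpdate) b.update();
        if (complete) context.complete(); else context.abort();
        return context.storage.get(1234);
    }

    public static void main(String[] args) {
        System.out.println(scenario(true, true));   // update + complete
        System.out.println(scenario(true, false));  // update + abort
        System.out.println(scenario(false, true));  // no update + complete
    }
}
```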
|
||||||
|
|
||||||
|
|
||||||
|
<H3>What's In Memory?</H3>
|
||||||
|
|
||||||
|
<P>Instantiating some content objects also causes other content objects to be loaded into memory.</P>
|
||||||
|
|
||||||
|
<P>Instantiating a <code>Bitstream</code> object causes the appropriate <code>BitstreamFormat</code> object to be instantiated. Of course the <code>Bitstream</code> object does not load the underlying bits from the bitstream store into memory!</P>
|
||||||
|
|
||||||
|
<P>Instantiating a <code>Bundle</code> object causes the appropriate <code>Bitstream</code> objects (and hence <code>BitstreamFormat</code>s) to be instantiated.</P>
|
||||||
|
|
||||||
|
<P>Instantiating an <code>Item</code> object causes the appropriate <code>Bundle</code> objects, and hence their <code>Bitstream</code>s and <code>BitstreamFormat</code>s, to be instantiated. All the Dublin Core metadata associated with that item are also loaded into memory.</P>
|
||||||
|
|
||||||
|
<P>The reasoning behind this is that for the vast majority of cases, anyone instantiating an item object is going to need information about the bundles and bitstreams within it, and this methodology allows that to be done in the most efficient way and is simple for the caller. For example, in the Web UI, the servlet (controller) needs to pass information about an item to the viewer (JSP), which needs to have all the information in-memory to display the item without further accesses to the database which may cause errors mid-display.</P>
|
||||||
|
|
||||||
|
<P>You do not need to worry about multiple in-memory instantiations of the same object, or any inconsistencies that may result; the <code>Context</code> object keeps a cache of the instantiated objects. The <code>find</code> methods of classes in <code>org.dspace.content</code> will use a cached object if one exists.</P>
|
||||||
|
|
||||||
|
<P>It may turn out that this automatic instantiation of contained objects reduces performance in situations where performance matters. If so, the API may be changed in the future to include something like a <code>loadContents</code> method, or a Boolean parameter may be added to the <code>find</code> methods to indicate whether contained objects should be loaded.</P>
|
||||||
|
|
||||||
|
<P>When a <code>Context</code> object is completed, aborted or garbage-collected, any objects instantiated using that context are invalidated and should not be used (in much the same way an AWT button is invalid if the window containing it is destroyed).</P>
|
||||||
|
|
||||||
|
|
||||||
|
<H3><A NAME="dublincore">Dublin Core Metadata</A></H3>
|
||||||
|
|
||||||
|
<P>The <code>DCValue</code> class is a simple container that represents a single Dublin Core element, optional qualifier, value and language. The other classes starting with <code>DC</code> are utility classes for handling types of data in Dublin Core, such as people's names and dates. As supplied, the DSpace registry of elements and qualifiers corresponds to the <A HREF="http://www.dublincore.org/documents/2002/09/24/library-application-profile/">Library Application Profile</A> for Dublin Core. It should be noted that these utility classes assume that the values will be in a certain syntax, which will be true for all data generated within the DSpace system, but since Dublin Core does not always define strict syntax, this may not be true for Dublin Core originating outside DSpace.</P>
|
||||||
|
|
||||||
|
<P>Below is the specific syntax that DSpace expects various fields to adhere to:</P>
|
||||||
|
|
||||||
|
<TABLE>
|
||||||
|
<TR>
|
||||||
|
<TH>Element</TH>
|
||||||
|
<TH>Qualifier</TH>
|
||||||
|
<TH>Syntax</TH>
|
||||||
|
<TH>Helper Class</TH>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><code>date</code></TD>
|
||||||
|
<TD>Any or unqualified</TD>
|
||||||
|
<TD>
|
||||||
|
<P>ISO 8601 in the UTC time zone, with either year, month, day, or second precision. Examples:</P>
|
||||||
|
<PRE>2000
|
||||||
|
2002-10
|
||||||
|
2002-08-14
|
||||||
|
1999-01-01T14:35:23Z</PRE>
|
||||||
|
</TD>
|
||||||
|
<TD><code>DCDate</code></TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><code>contributor</code></TD>
|
||||||
|
<TD>Any or unqualified</TD>
|
||||||
|
<TD>
|
||||||
|
<P>In general last name, then a comma, then first names, then any additional information like "Jr.". If the contributor is an organization, then simply the name. Examples:</P>
|
||||||
|
<PRE>Doe, John
|
||||||
|
Smith, John Jr.
|
||||||
|
van Dyke, Dick
|
||||||
|
Massachusetts Institute of Technology</PRE>
|
||||||
|
</TD>
|
||||||
|
<TD><code>DCPersonName</code></TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><code>language</code></TD>
|
||||||
|
<TD><code>iso</code></TD>
|
||||||
|
<TD>
|
||||||
|
<P>A two-letter code taken from ISO 639, followed optionally by a two-letter country code taken from ISO 3166. Examples:</P>
|
||||||
|
<PRE>en
|
||||||
|
fr
|
||||||
|
en_US</PRE>
|
||||||
|
</TD>
|
||||||
|
<TD><code>DCLanguage</code></TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><code>relation</code></TD>
|
||||||
|
<TD><code>ispartofseries</code></TD>
|
||||||
|
<TD>
|
||||||
|
<P>The series name, followed by a semicolon, followed by the number in that series. Alternatively, just free text. Examples:</P>
|
||||||
|
<PRE>MIT-TR; 1234
|
||||||
|
My Report Series; ABC-1234
|
||||||
|
NS1234</PRE>
|
||||||
|
</TD>
|
||||||
|
<TD><code>DCSeriesNumber</code></TD>
|
||||||
|
</TR>
|
||||||
|
</TABLE>
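<P>Two of the syntaxes in the table can be checked or split with a few lines of plain Java. The sketch below is not the <code>DCDate</code> or <code>DCSeriesNumber</code> helper class; the class and method names are hypothetical. The date pattern checks only the shape of the value at year, month, day or second precision and does not validate calendar ranges.</P>

```java
import java.util.regex.Pattern;

// Stand-alone sketch; not the DSpace DC helper classes.
class DcSyntaxSketch {
    // Year, optionally followed by month, day, and a UTC time to the second.
    private static final Pattern DATE = Pattern.compile(
        "\\d{4}(-\\d{2}(-\\d{2}(T\\d{2}:\\d{2}:\\d{2}Z)?)?)?");

    static boolean hasExpectedDateShape(String value) {
        return DATE.matcher(value).matches();
    }

    // Splits "series name; number" at the first semicolon.
    // Free text without a semicolon comes back as a single element.
    static String[] splitSeries(String value) {
        int semicolon = value.indexOf(';');
        if (semicolon < 0) {
            return new String[] { value.trim() };
        }
        return new String[] {
            value.substring(0, semicolon).trim(),
            value.substring(semicolon + 1).trim()
        };
    }

    public static void main(String[] args) {
        System.out.println(hasExpectedDateShape("1999-01-01T14:35:23Z"));
        System.out.println(String.join(" | ", splitSeries("MIT-TR; 1234")));
    }
}
```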
|
||||||
|
|
||||||
|
|
||||||
|
<H2><A NAME="workflow">Workflow System</A></H2>
|
||||||
|
|
||||||
|
<P>The primary classes are:</P>
|
||||||
|
|
||||||
|
<TABLE>
|
||||||
|
<TR>
|
||||||
|
<TD><code>org.dspace.content.WorkspaceItem</code></TD>
|
||||||
|
<TD>contains an Item before it enters a workflow</TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><code>org.dspace.workflow.WorkflowItem</code></TD>
|
||||||
|
<TD>contains an Item while in a workflow</TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><code>org.dspace.workflow.WorkflowManager</code></TD>
|
||||||
|
<TD>responds to events, manages the WorkflowItem states</TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><code>org.dspace.content.Collection</code></TD>
|
||||||
|
<TD>contains List of defined workflow steps</TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><code>org.dspace.eperson.Group</code></TD>
|
||||||
|
<TD>people who can perform workflow tasks are defined in EPerson Groups</TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><code>org.dspace.core.Email</code></TD>
|
||||||
|
<TD>used to email messages to Group members and submitters</TD>
|
||||||
|
</TR>
|
||||||
|
</TABLE>
|
||||||
|
|
||||||
|
<P>The workflow system models the states of an Item in a state machine with five primary states (SUBMIT, STEP_1, STEP_2, STEP_3 and ARCHIVE). STEP_1, STEP_2 and STEP_3 are the three optional steps in which the item can be viewed and corrected by different groups of people. In practice there are eight states, because each step also has a pooled state (STEP_1_POOL, STEP_2_POOL and STEP_3_POOL) in which an item waits to enter the corresponding primary state.</P>
|
||||||
|
|
||||||
|
<P>The WorkflowManager is invoked by events. While an Item is being submitted, it is held by a WorkspaceItem. Calling the start() method in the WorkflowManager converts a WorkspaceItem to a WorkflowItem, and begins processing the WorkflowItem's state. Since all three steps of the workflow are optional, if no steps are defined, then the Item is simply archived.</P>
|
||||||
|
|
||||||
|
<P>Workflows are set per Collection, and steps are defined by creating corresponding entries in the List named workflowGroup. To give the workflow a step 1, use the administration tools for Collections to create a workflow Group whose members should be able to view and approve the Item; workflowGroup[0] is then set to the ID of that Group.</P>
|
||||||
|
|
||||||
|
<P>If a step is defined in a Collection's workflow, the WorkflowItem's state is set to that step's pooled state (STEP_x_POOL, where x is the number of the step). In this state the WorkflowItem waits for an EPerson in that step's Group to claim the task. The WorkflowManager emails the members of the Group to notify them that there is a task to be performed (the text is defined in config/emails). When an EPerson goes to their 'My DSpace' page to claim the task, the WorkflowManager is invoked with a claim event, and the WorkflowItem's state advances from STEP_x_POOL to STEP_x. The EPerson can also generate an 'unclaim' event, returning the WorkflowItem to STEP_x_POOL.</P>
|
||||||
|
|
||||||
|
<P>The WorkflowManager also handles advance(), which advances the WorkflowItem to the next state; if there are no further states, the WorkflowItem is removed and the Item is archived. An EPerson performing one of the tasks can reject the Item, which stops the workflow, rebuilds the WorkspaceItem for it and sends a rejection note to the submitter. More drastically, the admin tools can generate an abort() event to cancel a workflow outright.</P>
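<P>The state transitions described in this section can be sketched as a small state machine. The following Java class is a hypothetical model for illustration, not the real <code>WorkflowManager</code>: the class name, the boolean flags standing in for the <code>workflowGroup</code> entries, and the rule that only a claimed step can be advanced are all assumptions. Reject and abort are left out.</P>

```java
// Hypothetical model of the states described above; not the real WorkflowManager.
final class WorkflowSketch {
    enum State { SUBMIT, STEP_1_POOL, STEP_1, STEP_2_POOL, STEP_2, STEP_3_POOL, STEP_3, ARCHIVE }

    // stepDefined[0] stands for workflowGroup[0] being set, and so on.
    private final boolean[] stepDefined;
    State state = State.SUBMIT;

    WorkflowSketch(boolean step1, boolean step2, boolean step3) {
        stepDefined = new boolean[] {step1, step2, step3};
    }

    // Pool state of the first defined step after step 'after' (0 = none yet), else ARCHIVE.
    private State nextAfter(int after) {
        for (int n = after + 1; n <= 3; n++) {
            if (stepDefined[n - 1]) {
                return State.values()[2 * n - 1]; // STEP_n_POOL
            }
        }
        return State.ARCHIVE;
    }

    private boolean inPool() {
        int o = state.ordinal();
        return o % 2 == 1 && o < State.ARCHIVE.ordinal();
    }

    private boolean inStep() {
        int o = state.ordinal();
        return o % 2 == 0 && o > 0;
    }

    // SUBMIT -> first defined STEP_x_POOL, or straight to ARCHIVE if no steps are defined.
    void start() {
        if (state != State.SUBMIT) throw new IllegalStateException("already started");
        state = nextAfter(0);
    }

    // STEP_x_POOL -> STEP_x
    void claim() {
        if (!inPool()) throw new IllegalStateException("nothing to claim in " + state);
        state = State.values()[state.ordinal() + 1];
    }

    // STEP_x -> STEP_x_POOL
    void unclaim() {
        if (!inStep()) throw new IllegalStateException("nothing to unclaim in " + state);
        state = State.values()[state.ordinal() - 1];
    }

    // STEP_x -> next defined STEP_y_POOL, or ARCHIVE if there are no further steps.
    void advance() {
        if (!inStep()) throw new IllegalStateException("cannot advance from " + state);
        state = nextAfter(state.ordinal() / 2);
    }

    public static void main(String[] args) {
        WorkflowSketch w = new WorkflowSketch(true, false, true); // steps 1 and 3 defined
        w.start();
        System.out.println(w.state);
        w.claim();
        w.advance();
        System.out.println(w.state);
        w.claim();
        w.advance();
        System.out.println(w.state);
    }
}
```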
|
||||||
|
|
||||||
|
|
||||||
|
<H2><A NAME="administer">Administration Toolkit</A></H2>
|
||||||
|
|
||||||
|
<P>The <code>org.dspace.administer</code> package contains some classes for administering a DSpace system that are not generally needed by most applications.</P>
|
||||||
|
|
||||||
|
<P>The <code>CreateAdministrator</code> class is a simple command-line tool, executed via <code>/dspace/bin/create-administrator</code>, that creates an administrator e-person with information entered from standard input. This is generally used only once when a DSpace system is initially installed, to create an initial administrator who can then use the Web administration UI to further set up the system. This script does not check for authorization, since it is typically run before there are any e-people to authorize! Since it must be run as a command-line tool on the server machine, generally this shouldn't cause a problem. A possibility is to have the script only operate when there are no e-people in the system already, though in general, someone with access to command-line scripts on your server is probably in a position to do what they want anyway!</P>
|
||||||
|
|
||||||
|
<P>The <code>DCType</code> class is similar to the <code>org.dspace.content.BitstreamFormat</code> class. It represents an entry in the Dublin Core type registry, that is, a particular element and qualifier, or unqualified element. It is in the <code>administer</code> package because it is only generally required when manipulating the registry itself. Elements and qualifiers are specified as literals in <code>org.dspace.content.Item</code> methods and the <code>org.dspace.content.DCValue</code> class. Only administrators may modify the Dublin Core type registry.</P>
|
||||||
|
|
||||||
|
<P>The <code>org.dspace.administer.RegistryLoader</code> class contains methods for initialising the Dublin Core type registry and bitstream format registry with entries in an XML file. Typically this is executed via the command line during the build process (see <code>build.xml</code> in the source). To see examples of the XML formats, see the files in <code>config/registries</code> in the source directory. There is no XML schema, so the files are not strictly validated when they are loaded.</P>
|
||||||
|
|
||||||
|
|
||||||
|
<H2><A NAME="eperson">E-person/Group Manager</A></H2>
|
||||||
|
|
||||||
|
<P>DSpace keeps track of registered users with the <code>org.dspace.eperson.EPerson</code> class. The class has methods to create and manipulate an <code>EPerson</code>, such as get and set methods for first and last names, email, and password. (In fact there is no <code>getPassword()</code> method: an MD5 hash of the password is stored, and can only be verified with the <code>checkPassword()</code> method.) There are find methods to find an EPerson by email (which is assumed to be unique), or to find all EPeople in the system.</P>
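<P>To illustrate why a stored hash can be checked but not read back, here is a minimal Java sketch using the standard <code>java.security.MessageDigest</code> class. <code>PasswordSketch</code> is a hypothetical stand-in, not the real <code>EPerson</code>; the hex encoding and the use of UTF-8 bytes are assumptions about how the hash might be stored.</P>

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical stand-in for the password handling described above; not the real EPerson.
final class PasswordSketch {
    private String passwordHash; // only the hash is kept, so there is nothing to "get"

    // Assumption: the hash is stored as 32 lower-case hex digits of the UTF-8 bytes.
    static String md5Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    void setPassword(String password) {
        passwordHash = md5Hex(password);
    }

    // Verification hashes the attempt and compares it with the stored hash.
    boolean checkPassword(String attempt) {
        return passwordHash != null && passwordHash.equals(md5Hex(attempt));
    }

    public static void main(String[] args) {
        PasswordSketch person = new PasswordSketch();
        person.setPassword("secret");
        System.out.println(person.checkPassword("secret"));
        System.out.println(person.checkPassword("guess"));
    }
}
```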
|
||||||
|
|
||||||
|
<P>The <code>EPerson</code> object should probably be reworked to allow for easy expansion; the current EPerson object tracks pretty much only what MIT was interested in tracking - first and last names, email, phone. The access methods are hardcoded and should probably be replaced with methods to access arbitrary name/value pairs for institutions that wish to customize what EPerson information is stored.</P>
|
||||||
|
|
||||||
|
<P>Groups are simply lists of <code>EPerson</code> objects. Other than membership, <code>Group</code> objects have only one other attribute: a name. Group names must be unique, so we have adopted naming conventions where the role of the group is its name, such as <code>COLLECTION_100_ADD</code>. Groups add and remove EPerson objects with <code>addMember()</code> and <code>removeMember()</code> methods. One important thing to know about groups is that they store their membership in memory until the <code>update()</code> method is called - so when modifying a group's membership don't forget to invoke <code>update()</code> or your changes will be lost! Since group membership is used heavily by the authorization system a fast <code>isMember()</code> method is also provided.</P>
|
||||||
|
|
||||||
|
<P>Another kind of Group is also implemented in DSpace--special Groups. The <code>Context</code> object for each session carries around a List of Group IDs that the user is also a member of--currently the MITUser Group ID is added to the list of a user's special groups if certain IP address or certificate criteria are met.</P>
|
||||||
|
|
||||||
|
|
||||||
|
<H2><A NAME="authorize">Authorization</A></H2>
|
||||||
|
|
||||||
|
<P>The primary classes are:</P>
|
||||||
|
|
||||||
|
<TABLE>
|
||||||
|
<TR>
|
||||||
|
<TD><code>org.dspace.authorize.AuthorizeManager</code></TD>
|
||||||
|
<TD>does all authorization, checking policies against Groups</TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><code>org.dspace.authorize.ResourcePolicy</code></TD>
|
||||||
|
<TD>defines all allowable actions for an object</TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><code>org.dspace.eperson.Group</code></TD>
|
||||||
|
<TD>all policies are defined in terms of EPerson Groups</TD>
|
||||||
|
</TR>
|
||||||
|
</table>
|
||||||
|
|
||||||
|
<P>The authorization system is based on the classic 'police state' model of security; no action is allowed unless it is expressed in a policy. The policies are attached to resources (hence the name <code>ResourcePolicy</code>,) and detail who can perform that action. The resource can be any of the DSpace object types, listed in <code>org.dspace.core.Constants</code> (<code>BITSTREAM</code>, <code>ITEM</code>, <code>COLLECTION</code>, etc.) The 'who' is made up of EPerson groups. The actions are also in <code>Constants.java</code> (<code>READ</code>, <code>WRITE</code>, <code>ADD</code>, etc.) The only non-obvious actions are <code>ADD</code> and <code>REMOVE</code>, which are authorizations for container objects. To be able to create an Item, you must have <code>ADD</code> permission in a Collection, which contains Items. (Communities, Collections, Items, and Bundles are all container objects.)</P>
|
||||||
|
|
||||||
|
<P>Currently most of the read policy checking is done with items: communities and collections are assumed to be openly readable, but items and their bitstreams are checked. Separate policy checks for items and their bitstreams enable policies under which an item is publicly readable but parts of its content are restricted to certain groups.</P>
|
||||||
|
|
||||||
|
<P>The <code>AuthorizeManager</code> class's <code>authorizeAction(Context, object, action)</code> method is the primary source of all authorization in the system. It gets a list of all of the ResourcePolicies in the system that match the object and action. It then iterates through the policies, extracting the EPerson Group from each policy, and checks to see if the EPersonID from the Context is a member of any of those groups. If all of the policies are queried and no permission is found, then an <code>AuthorizeException</code> is thrown. An <code>authorizeActionBoolean()</code> method is also supplied that returns a boolean for applications that require higher performance.</P>
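<P>The policy check described above amounts to a loop over matching policies. The Java sketch below is a hypothetical, in-memory model and not the real <code>AuthorizeManager</code>: the class and method names, the exception type and the numeric constant values are assumptions made for illustration, and the user's group memberships are passed in as a plain set rather than looked up through a <code>Context</code>.</P>

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the policy check described above; not the real AuthorizeManager.
final class AuthorizeSketch {
    // Illustrative values only; the real ones live in org.dspace.core.Constants.
    static final int ITEM = 1;
    static final int READ = 10;
    static final int WRITE = 11;

    // A policy names a single object, a single action and a single group.
    static final class Policy {
        final int resourceType;
        final int resourceId;
        final int action;
        final int groupId;

        Policy(int resourceType, int resourceId, int action, int groupId) {
            this.resourceType = resourceType;
            this.resourceId = resourceId;
            this.action = action;
            this.groupId = groupId;
        }
    }

    static final class DeniedException extends Exception {
        DeniedException(String message) {
            super(message);
        }
    }

    // True if any policy matching the object and action names one of the user's groups.
    static boolean isAllowed(List<Policy> policies, Set<Integer> userGroups,
                             int resourceType, int resourceId, int action) {
        for (Policy p : policies) {
            boolean matches = p.resourceType == resourceType
                    && p.resourceId == resourceId
                    && p.action == action;
            if (matches && userGroups.contains(p.groupId)) {
                return true;
            }
        }
        return false; // no action is allowed unless a policy expresses it
    }

    // Exception-throwing variant of the same check.
    static void authorize(List<Policy> policies, Set<Integer> userGroups,
                          int resourceType, int resourceId, int action) throws DeniedException {
        if (!isAllowed(policies, userGroups, resourceType, resourceId, action)) {
            throw new DeniedException("no policy allows action " + action);
        }
    }

    public static void main(String[] args) {
        List<Policy> policies = Arrays.asList(
                new Policy(ITEM, 42, READ, 0),   // group 0 may read item 42
                new Policy(ITEM, 42, WRITE, 7)); // group 7 may write item 42
        Set<Integer> groups = new HashSet<Integer>(Arrays.asList(0));
        System.out.println(isAllowed(policies, groups, ITEM, 42, READ));
        System.out.println(isAllowed(policies, groups, ITEM, 42, WRITE));
    }
}
```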
|
||||||
|
|
||||||
|
<P>ResourcePolicies are very simple, and there are quite a lot of them. Each can only list a single group, a single action, and a single object. So each object will likely have several policies, and if multiple groups share permissions for actions on an object, each group will get its own policy. (It's a good thing they're small.)</P>
|
||||||
|
|
||||||
|
|
||||||
|
<H3>Special Groups</H3>
|
||||||
|
|
||||||
|
<P>All users are assumed to be part of the public group (ID=0.) DSpace admins (ID=1) are automatically part of all groups, much like super-users in the Unix OS. The Context object also carries around a List of special groups, which are also first checked for membership. These special groups are used at MIT to indicate membership in the MIT community, something that is very difficult to enumerate in the database! When a user logs in with an MIT certificate or with an MIT IP address, the login code adds this MIT user group to the user's Context.</P>
|
||||||
|
|
||||||
|
|
||||||
|
<H3>Miscellaneous Authorization Notes</H3>
|
||||||
|
|
||||||
|
<P>Where do items get their read policies? From their collection's read policy. There once was a separate item read default policy in each collection, and perhaps there will be again, since it appears that administrators are notoriously bad at defining collections' read policies. There is also code in place to enable timed policies, that is, policies that have a start and end date. However, the admin tools to enable these sorts of policies have not been written.</P>
|
||||||
|
|
||||||
|
|
||||||
|
<H2><A NAME="handle">Handle Manager/Handle Plugin</A></H2>
|
||||||
|
|
||||||
|
<P>The <code>org.dspace.handle</code> package contains two classes: <code>HandleManager</code> is used to create and look up Handles, and <code>HandlePlugin</code> is used to expose and resolve DSpace Handles for the outside world via the CNRI Handle Server code.</P>
|
||||||
|
|
||||||
|
<P>Handles are stored internally in the <code>handle</code> database table in the form:</P>
|
||||||
|
|
||||||
|
<PRE>1721.123/4567</PRE>
|
||||||
|
|
||||||
|
<P>Typically when they are used outside of the system they are displayed in either URI or "URL proxy" forms:</P>
|
||||||
|
|
||||||
|
<PRE>hdl:1721.123/4567
|
||||||
|
http://hdl.handle.net/1721.123/4567</PRE>
|
||||||
|
|
||||||
|
<P>It is the responsibility of the caller to extract the basic form from whichever displayed form is used.</P>
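<P>A caller might extract the basic form with a helper along these lines. This is a sketch only: <code>HandleFormSketch</code> and <code>toBasicForm</code> are hypothetical names, not part of <code>HandleManager</code>, and only the two displayed forms shown above are recognised.</P>

```java
// Hypothetical caller-side helper; not part of org.dspace.handle.
final class HandleFormSketch {
    private static final String URI_PREFIX = "hdl:";
    private static final String PROXY_PREFIX = "http://hdl.handle.net/";

    // Strips the URI or URL proxy prefix, if present, leaving the basic form.
    static String toBasicForm(String displayed) {
        String handle = displayed.trim();
        if (handle.startsWith(URI_PREFIX)) {
            return handle.substring(URI_PREFIX.length());
        }
        if (handle.startsWith(PROXY_PREFIX)) {
            return handle.substring(PROXY_PREFIX.length());
        }
        return handle; // assumed to be in basic form already
    }

    public static void main(String[] args) {
        System.out.println(toBasicForm("hdl:1721.123/4567"));
        System.out.println(toBasicForm("http://hdl.handle.net/1721.123/4567"));
        System.out.println(toBasicForm("1721.123/4567"));
    }
}
```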
|
||||||
|
|
||||||
|
<P>The <code>handle</code> table maps these Handles to resource type/resource ID pairs, where resource type is a value from <code>org.dspace.core.Constants</code> and resource ID is the internal identifier (database primary key) of the object. This allows Handles to be assigned to any type of object in the system, though as <A HREF="functional.html#handles">explained in the functional overview</A>, only communities, collections and items are presently assigned Handles.</P>
|
||||||
|
|
||||||
|
<P><code>HandleManager</code> contains static methods for:</P>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI>Creating a Handle</LI>
|
||||||
|
<LI>Finding the Handle for a <code>DSpaceObject</code>, though this is usually only invoked by the object itself, since <code>DSpaceObject</code> has a <code>getHandle</code> method</LI>
|
||||||
|
<LI>Retrieving the <code>DSpaceObject</code> identified by a particular Handle</LI>
|
||||||
|
<LI>Obtaining displayable forms of the Handle (URI or "proxy URL").</LI>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<P><code>HandlePlugin</code> is a simple implementation of the Handle Server's <code>net.handle.hdllib.HandleStorage</code> interface. It only implements the basic Handle retrieval methods, which get information from the <code>handle</code> database table. The CNRI Handle Server is configured to use this plug-in via its <code>config.dct</code> file.</P>
|
||||||
|
|
||||||
|
<P>Note that since the Handle server runs in a separate JVM from the DSpace Web applications, it uses a separate 'Log4J' configuration, because Log4J does not support multiple JVMs using the same daily rolling logs. This alternative configuration is held as a template in <code>/dspace/config/templates/log4j-handle-plugin.properties</code>, written to <code>/dspace/config/log4j-handle-plugin.properties</code> by the <code>install-configs</code> script. The <code>/dspace/bin/start-handle-server</code> script passes in the appropriate command line parameters so that the Handle server uses this configuration.</P>
|
||||||
|
|
||||||
|
|
||||||
|
<H2><A NAME="search">Search</A></H2>
|
||||||
|
|
||||||
|
<P>DSpace's search code is a simple API which currently wraps the Lucene search engine. The first half of the search task is indexing. <code>org.dspace.search.DSIndexer</code> is the indexing class; its <code>indexContent()</code> method, if passed an <code>Item</code>, <code>Community</code>, or <code>Collection</code>, adds that content's fields to the index. The methods <code>unIndexContent()</code> and <code>reIndexContent()</code> remove and update content's index information. The <code>DSIndexer</code> class also has a <code>main()</code> method which rebuilds the index completely. This is invoked by the <code>dspace/bin/index-all</code> script. The intent was for the <code>main()</code> method to be invoked on a regular basis to avoid index corruption, but we have had no problem with that so far. The fields indexed by <code>DSIndexer</code> are currently hardcoded in the <code>indexItemContent()</code>, <code>indexCollectionContent()</code> and <code>indexCommunityContent()</code> methods.</P>
|
||||||
|
|
||||||
|
<P>The query class <code>DSQuery</code> contains three flavors of <code>doQuery()</code> method: one searches the whole DSpace site, and the other two restrict searches to Collections and Communities. The results from a query are returned as three lists of handles; each list represents a type of result. One list is a list of Items with matches, and the other two are Collections and Communities that match. This separation allows the UI to handle the types of results gracefully without resolving all of the handles first to see what kind of content the handle points to. The <code>DSQuery</code> class also has a <code>main()</code> method for debugging via command-line searches.</P>
|
||||||
|
|
||||||
|
<H3>Our Lucene Implementation</H3>
|
||||||
|
|
||||||
|
<P>Currently we have our own Analyzer and Tokenizer classes (<code>DSAnalyzer</code> and <code>DSTokenizer</code>) to customize our indexing. They invoke the stemming and stop word features within Lucene. We create an <code>IndexReader</code> for each query, which we now realize isn't the most efficient use of resources: we seem to run out of filehandles under really heavy loads. (A wildcard query can open many filehandles!) Since Lucene is thread-safe, a better future implementation would be to have a single Lucene IndexReader shared by all queries, which is invalidated and re-opened when the index changes. Future API growth could include relevance scores (Lucene generates them, but we ignore them), and abstractions for more advanced search concepts such as booleans.</P>
|
||||||
|
|
||||||
|
<H3>Indexed Fields</H3>
|
||||||
|
|
||||||
|
<P>The <code>DSIndexer</code> class shipped with DSpace indexes the Dublin Core metadata in the following way:</P>
|
||||||
|
|
||||||
|
<TABLE>
|
||||||
|
<TR>
|
||||||
|
<TH>Search Field</TH>
|
||||||
|
<TH>Taken from Dublin Core Fields</TH>
|
||||||
|
</TR>
|
||||||
|
|
||||||
|
<TR>
|
||||||
|
<TD>Authors</TD>
|
||||||
|
<TD><code>contributor.*</code><br>
|
||||||
|
<code>creator.*</code><br>
|
||||||
|
<code>description.statementofresponsibility</code></TD>
|
||||||
|
</TR>
|
||||||
|
|
||||||
|
<TR>
|
||||||
|
<TD>Titles</TD>
|
||||||
|
<TD><CODE>title.*</code></td>
|
||||||
|
</tr>
|
||||||
|
|
||||||
|
<tr>
|
||||||
|
<td>Keywords</tD>
|
||||||
|
<td><code>subject.*</code></td>
|
||||||
|
</tr>
|
||||||
|
|
||||||
|
<tr>
|
||||||
|
<td>Abstracts</td>
|
||||||
|
<td><code>description.abstract</code><br>
|
||||||
|
<code>description.tableofcontents</code></td>
|
||||||
|
</tr>
|
||||||
|
|
||||||
|
<tr>
|
||||||
|
<td>Series</td>
|
||||||
|
<td><code>relation.ispartofseries</code></td>
|
||||||
|
</tr>
|
||||||
|
|
||||||
|
<tr>
|
||||||
|
<td>MIME types</td>
|
||||||
|
<td><code>format.mimetype</code></td>
|
||||||
|
</tr>
|
||||||
|
|
||||||
|
<tr>
|
||||||
|
<td>Sponsors</td>
|
||||||
|
<td><code>description.sponsorship</code></td>
|
||||||
|
</tr>
|
||||||
|
|
||||||
|
<tr>
|
||||||
|
<td>Identifiers</td>
|
||||||
|
<td><code>identifier.*</code></td>
|
||||||
|
</tr>
|
||||||
|
</table>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<H3>Harvesting API</H3>
|
||||||
|
|
||||||
|
<P>The <code>org.dspace.search</code> package also provides a 'harvesting' API. This allows callers to extract information about items modified within a particular timeframe, and within a particular scope (all of DSpace, or a community or collection.) Currently this is used by the Open Archives Initiative metadata harvesting protocol application, and the e-mail subscription code.</P>
|
||||||
|
|
||||||
|
<P>The <code>Harvest.harvest</code> method is invoked with the required scope and start and end dates. Either date can be omitted. The dates should be in the ISO 8601, UTC time zone format used elsewhere in the DSpace system.</P>
|
||||||
|
|
||||||
|
<P><code>HarvestedItemInfo</code> objects are returned. These objects are simple containers with basic information about the items falling within the given scope and date range. Depending on parameters passed to the <code>harvest</code> method, the <code>containers</code> and <code>item</code> fields may have been filled out with the IDs of communities and collections containing an item, and the corresponding <code>Item</code> object respectively. Electing not to have these fields filled out means the harvest operation executes considerably faster.</P>
|
||||||
|
|
||||||
|
<P>In case it is required, <code>Harvest</code> also offers a method for creating a single <code>HarvestedItemInfo</code> object, which might make things easier for the caller.</P>
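<P>The date-range behaviour of a harvest, where either bound may be omitted, can be sketched as follows. This is not the real <code>Harvest</code> class: <code>HarvestRangeSketch</code>, its method and the map of identifiers to last-modified timestamps are hypothetical, and treating both bounds as inclusive is an assumption. Timestamps are full ISO 8601 UTC strings parsed with the standard <code>java.time.Instant</code>.</P>

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a date-bounded harvest; not the real Harvest class.
final class HarvestRangeSketch {
    // Returns the keys whose timestamp lies in [start, end]; a null bound is left open.
    static List<String> modifiedBetween(Map<String, String> lastModified,
                                        String start, String end) {
        Instant from = (start == null) ? Instant.MIN : Instant.parse(start);
        Instant to = (end == null) ? Instant.MAX : Instant.parse(end);
        List<String> hits = new ArrayList<String>();
        for (Map.Entry<String, String> entry : lastModified.entrySet()) {
            Instant modified = Instant.parse(entry.getValue());
            if (!modified.isBefore(from) && !modified.isAfter(to)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, String> items = new LinkedHashMap<String, String>();
        items.put("item-a", "2002-04-05T00:00:00Z");
        items.put("item-b", "2002-04-09T15:34:12Z");
        items.put("item-c", "2002-04-10T00:00:00Z");
        // No end date: everything modified from 9 April 2002 onwards.
        System.out.println(modifiedBetween(items, "2002-04-09T00:00:00Z", null));
        // No start date: everything modified up to and including the end instant.
        System.out.println(modifiedBetween(items, null, "2002-04-09T15:34:12Z"));
    }
}
```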
|
||||||
|
|
||||||
|
|
||||||
|
<H2><A NAME="browse">Browse API</A></H2>
|
||||||
|
|
||||||
|
<P>The browse API maintains indices of dates, authors and titles, and allows callers to extract parts of these:</P>
|
||||||
|
|
||||||
|
<DL>
|
||||||
|
<DT>Title</DT>
|
||||||
|
|
||||||
|
<DD><P>Values of the Dublin Core element <strong><code>title</code></strong> (unqualified) are indexed. These are sorted in a case-insensitive fashion, with any leading article removed. For example:</P>
|
||||||
|
|
||||||
|
<PRE>The DSpace System</PRE>
|
||||||
|
|
||||||
|
<P>Appears under 'D' rather than 'T'.</P></DD>
|
||||||
|
|
||||||
|
<DT>Author</DT>
|
||||||
|
|
||||||
|
<DD><P>Values of the <strong><code>contributor</code></strong> (any qualifier or unqualified) element are indexed. Since <code>contributor</code> values typically are in the form 'last name, first name', a simple case-insensitive alphanumeric sort is used which orders authors in last name order.</P>
|
||||||
|
|
||||||
|
<P>Note that this is an index of <em>authors</em>, and not <em>items by author</em>. If four items have the same author, that author will appear in the index only once. Hence, the index of authors may be greater or smaller than the index of titles; items often have more than one author, though the same author may have authored several items.</P>
|
||||||
|
|
||||||
|
<P>The author indexing in the browse API does have limitations:</P>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI><P>Ideally, a name that appears as an author for more than one item would appear in the author index only once. For example, 'Doe, John' may be the author of tens of items. However, in practice, authors' names often appear in slightly different forms, for example:</P>
|
||||||
|
|
||||||
|
<PRE>Doe, John
|
||||||
|
Doe, John Stewart
|
||||||
|
Doe, John S.</PRE>
|
||||||
|
|
||||||
|
<P>Currently, the above three names would all appear as separate entries in the author index even though they may refer to the same author. In order for an author of several papers to appear exactly once in the index, each item must specify <em>exactly</em> the same form of their name, which doesn't always happen in practice.</P></LI>
|
||||||
|
|
||||||
|
<LI><P>Another issue is that two authors may have the same name, even within a single institution. If this is the case they may appear as one author in the index.</P></LI>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<P>These issues are typically resolved in libraries with <em>authority control records</em>, which hold a 'preferred' form of the author's name, together with extra information (such as dates of birth and death) to distinguish between authors of the same name. Maintaining such records is a huge task with many issues, particularly when metadata is received from faculty directly rather than from trained library cataloguers. For these reasons, DSpace does not yet feature 'authority control' functionality.</P></DD>
|
||||||
|
|
||||||
|
<DT>Date of Issue</DT>
|
||||||
|
|
||||||
|
<DD><P>Items are indexed by date of issue. This may be different from the date that an item appeared in DSpace; many items may have been originally published elsewhere beforehand. The Dublin Core field used is <strong><code>date.issued</code></strong>. The ordering of this index may be reversed so 'earliest first' and 'most recent first' orderings are possible.</P>
|
||||||
|
|
||||||
|
<P>Note that the index is of <em>items by date</em>, as opposed to an index of <em>dates</em>. If 30 items have the same issue date (say 2002), then those 30 items all appear in the index adjacent to each other, as opposed to a single 2002 entry.</P>
|
||||||
|
|
||||||
|
<P>Since dates in DSpace Dublin Core are in ISO8601, all in the UTC time zone, a simple alphanumeric sort is sufficient to sort by date, including dealing with varying granularities of date reasonably. For example:</P>
|
||||||
|
|
||||||
|
<PRE>2001-12-10
|
||||||
|
2002
|
||||||
|
2002-04
|
||||||
|
2002-04-05
|
||||||
|
2002-04-09T15:34:12Z
|
||||||
|
2002-04-09T19:21:12Z
|
||||||
|
2002-04-10</PRE></DD>
|
||||||
|
|
||||||
|
<DT>Date Accessioned</DT>
|
||||||
|
|
||||||
|
<DD><P>In order to determine which items most recently appeared, rather than using the date of issue, an item's accession date is used. This is the Dublin Core field <strong><code>date.accessioned</code></strong>. In other aspects this index is identical to the date of issue index.</P></DD>
|
||||||
|
|
||||||
|
|
||||||
|
<DT>Items by a Particular Author</DT>
|
||||||
|
|
||||||
|
<DD><P>One last operation the browse API can perform is to extract items by a particular author. The author does not have to be the primary author of an item for that item to be extracted. You can specify a scope, too; that is, you can ask for items by author X in collection Y, for example.</P>
|
||||||
|
|
||||||
|
<P>This particular flavour of browse is slightly simpler than the others. You cannot presently specify a particular subset of results to be returned. The API call will simply return all of the items by a particular author within a certain scope.</P>
|
||||||
|
|
||||||
|
<P>Note that the author of the item must <em>exactly</em> match the author passed in to the API; see the explanation about the caveats of the author index browsing to see why this is the case.</P></DD>
|
||||||
|
|
||||||
|
</DL>
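<P>The two sorting rules described in the list above, title normalisation and plain string ordering of ISO 8601 dates, can be sketched in Java as follows. <code>BrowseKeySketch</code> is a hypothetical class, not part of the browse API, and the list of articles that are stripped is an assumption.</P>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration of the browse sort rules; not part of the browse API.
final class BrowseKeySketch {
    // Assumption: only these English articles are treated as leading articles.
    private static final String[] ARTICLES = {"the ", "a ", "an "};

    // Lower-cases the title and removes one leading article, if any.
    static String titleSortKey(String title) {
        String key = title.trim().toLowerCase(Locale.ROOT);
        for (String article : ARTICLES) {
            if (key.startsWith(article)) {
                return key.substring(article.length()).trim();
            }
        }
        return key;
    }

    // Plain string order is enough for ISO 8601 UTC dates of mixed granularity.
    static List<String> sortDates(List<String> dates) {
        List<String> sorted = new ArrayList<String>(dates);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(titleSortKey("The DSpace System"));
        System.out.println(sortDates(Arrays.asList(
                "2002-04-10", "2002", "2002-04-09T19:21:12Z", "2001-12-10",
                "2002-04", "2002-04-09T15:34:12Z", "2002-04-05")));
    }
}
```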
|
||||||
|
|
||||||
|
<P>The API is generally invoked by creating a <code>BrowseScope</code> object, and setting the parameters for which particular part of an index you want to extract. This is then passed to the relevant <code>Browse</code> method call, which returns a <code>BrowseInfo</code> object which contains the results of the operation. The parameters set in the <code>BrowseScope</code> object are:</P>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI>How many entries from the index you want</LI>
|
||||||
|
<LI>Whether you only want entries from a particular community or collection, or from the whole of DSpace</LI>
|
||||||
|
<LI>Which part of the index to start from (called the <em>focus</em> of the browse). If you don't specify this, the start of the index is used</LI>
|
||||||
|
<LI>How many entries to include before the <em>focus</em> entry</LI>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<P>To illustrate, here is an example:</P>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI>We want <strong>7</strong> entries in total</LI>
|
||||||
|
<LI>We want entries from collection <em>x</em></LI>
|
||||||
|
<LI>We want the focus to be 'Really'</LI>
|
||||||
|
<LI>We want <strong>2</strong> entries included before the focus.</LI>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<P>The results of invoking <code>Browse.getItemsByTitle</code> with the above parameters might look like this:</P>
|
||||||
|
|
||||||
|
<PRE> Rabble-Rousing Rabbis From Sardinia
|
||||||
|
Reality TV: Love It or Hate It?
|
||||||
|
FOCUS> The Really Exciting Research Video
|
||||||
|
Recreational Housework Addicts: Please Visit My House
|
||||||
|
Regional Television Variation Studies
|
||||||
|
Revenue Streams
|
||||||
|
Ridiculous Example Titles: I'm Out of Ideas</PRE>
|
||||||
|
|
||||||
|
<P>Note that in the case of title and date browses, <code>Item</code> objects are returned as opposed to actual titles. In these cases, you can specify the 'focus' to be a specific item, or a partial or full literal value. In the case of a literal value, if no entry in the index matches exactly, the closest match is used as the focus. It's quite reasonable to specify a focus of a single letter, for example.</P>
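<P>The way the focus, the total number of entries and the number of entries before the focus combine can be sketched as a window over a sorted list. The Java below is a hypothetical illustration and does not use the real <code>BrowseScope</code> or <code>Browse</code> classes. It assumes the index holds normalised sort keys, and that the 'closest match' for a literal focus is the first entry that sorts at or after it.</P>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of a focused browse; not the real Browse API.
final class BrowseFocusSketch {
    // 'index' must already be sorted. A null focus means the start of the index.
    // Assumption: the closest match is the first entry that is >= the focus.
    static List<String> window(List<String> index, String focus, int total, int before) {
        int focusPos = 0;
        if (focus != null) {
            while (focusPos < index.size() && index.get(focusPos).compareTo(focus) < 0) {
                focusPos++;
            }
        }
        int start = Math.max(0, focusPos - before);
        int end = Math.min(index.size(), start + total);
        return new ArrayList<String>(index.subList(start, end));
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList(
                "quality control",
                "rabble-rousing rabbis from sardinia",
                "reality tv: love it or hate it?",
                "really exciting research video",
                "recreational housework addicts",
                "regional television variation studies",
                "revenue streams",
                "ridiculous example titles",
                "zebra studies");
        // Seven entries in total, focus 'really', two entries before the focus.
        for (String title : window(titles, "really", 7, 2)) {
            System.out.println(title);
        }
    }
}
```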
|
||||||
|
|
||||||
|
<P>Being able to specify a specific item to start at is particularly important with dates, since many items may have the same issue date. Say 30 items in a collection have the issue date 2002. To be able to page through the index 20 items at a time, you need to be able to specify exactly which item's 2002 is the focus of the browse; otherwise, each time you invoke the browse code, the results would start at the first item with the issue date 2002.</P>
|
||||||
|
|
||||||
|
<P>Author browses return <code>String</code> objects with the actual author names. You can only specify the focus as a full or partial literal <code>String</code>.</P>
|
||||||
|
|
||||||
|
<P>Another important point to note is that presently, the browse indices contain metadata for all items in the main archive, regardless of authorization policies. This means that all items in the archive will appear to all users when browsing. Of course, should the user attempt to access a non-public item, the usual authorization mechanism will apply. Whether this approach is ideal is under review; implementing the browse API such that the results retrieved reflect a user's level of authorization may be possible, but rather tricky.</P>
|
||||||
|
|
||||||
|
|
||||||
|
<H3>Index Maintenance</H3>
|
||||||
|
|
||||||
|
<P>The browse API contains calls to add and remove items from the index, and to regenerate the indices from scratch. In general the content management API invokes the necessary browse API calls to keep the browse indices in sync with what is in the archive, so most applications will not need to invoke those methods.</P>
|
||||||
|
|
||||||
|
<P>If the browse index becomes inconsistent for some reason, the <code>InitializeBrowse</code> class is a command line tool (generally invoked using the <code>/dspace/bin/index-all</code> shell script) that causes the indices to be regenerated from scratch.</P>
|
||||||
|
|
||||||
|
|
||||||
|
<H3>Caveats</H3>
|
||||||
|
|
||||||
|
<P>Presently, the browse API is not tremendously efficient. 'Indexing' takes the form of simply extracting the relevant Dublin Core value, normalising it (lower-casing and removing any leading article in the case of titles), and inserting that normalized value with the corresponding item ID in the appropriate browse database table. Database views of this table include collection and community IDs for browse operations with a limited scope. When a browse operation is performed, a simple <code>SELECT</code> query is performed, along the lines of:</P>
|
||||||
|
|
||||||
|
<PRE>SELECT item_id FROM ItemsByTitle ORDER BY sort_title OFFSET 40 LIMIT 20</PRE>
|
||||||
|
|
||||||
|
<P>There are two main drawbacks to this. Firstly, <code>LIMIT</code> and <code>OFFSET</code> are PostgreSQL-specific keywords. Secondly, the database is still actually performing dynamic sorting of the titles, so the browse code as it stands will not scale particularly well. The code does cache <code>BrowseInfo</code> objects, so that common browse operations are performed quickly, but this is not an ideal solution.</P>
|
||||||
|
|
||||||
|
|
||||||
|
<H2><A NAME="history">History Recorder</A></H2>
|
||||||
|
|
||||||
|
<P>The purpose of the history subsystem is to capture a time-based record of significant changes in DSpace, in a manner suitable for later refactoring or repurposing. Note that the history data is not expected to provide current information about the archive; it simply records what has happened in the past.</P>
|
||||||
|
|
||||||
|
<P>The <A HREF="http://www.metadata.net/harmony/">Harmony project</A> describes a simple and powerful approach for modeling temporal data. The DSpace history framework adopts this model. The Harmony model is used by the serialization mechanism (and ultimately by agents who interpret the serializations); users of the History API need not be aware of it. The content management API handles invocations of the history system. Users of the DSpace public API do not generally need to use the history API.</P>
|
||||||
|
|
||||||
|
<P>When anything of archival interest occurs in DSpace, the <code>saveHistory</code> method of the <code>HistoryManager</code> is invoked. The parameters contain a reference to the object of archival interest. On receiving the object, the <code>HistoryManager</code> serializes the state of all archive objects referred to by it, and creates Harmony-style objects and associations to describe the relationships between the objects. (A simple example is given below.) Note that each archive object must have a unique identifier to allow linkage between discrete events; this is discussed under "Unique IDs" below.</P>
|
||||||
|
|
||||||
|
<P>The serializations (including the Harmony objects and associations) are persisted as files in the <code>/dspace/history</code> (or other configured) directory. The <code>history</code> and <code>historystate</code> tables contain simple indices into the serializations in the file system.</P>
|
||||||
|
|
||||||
|
<H3>Archival Events</H3>
|
||||||
|
|
||||||
|
<P>The following events are significant enough to warrant history records:</P>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI>Communities
|
||||||
|
<UL>
|
||||||
|
<LI>create/modify/delete</LI>
|
||||||
|
<LI>add/remove Collection to/from Community</LI>
|
||||||
|
</UL>
|
||||||
|
</LI>
|
||||||
|
<LI>Collections
|
||||||
|
<UL>
|
||||||
|
<LI>create/modify/delete</LI>
|
||||||
|
<LI>add/remove Item to/from Collection</LI>
|
||||||
|
</UL>
|
||||||
|
</LI>
|
||||||
|
<LI>Items
|
||||||
|
<UL>
|
||||||
|
<LI>create/modify/delete</LI>
|
||||||
|
<LI>assign Handle to Item</LI>
|
||||||
|
<LI>modify Item contents (Bundles, Bitstreams, metadata fields, etc)</LI>
|
||||||
|
</UL>
|
||||||
|
</LI>
|
||||||
|
<LI>EPerson
|
||||||
|
<UL>
|
||||||
|
<LI>create/modify/delete</LI>
|
||||||
|
</UL>
|
||||||
|
</LI>
|
||||||
|
<LI>Workflow
|
||||||
|
<UL>
|
||||||
|
<LI>Workflow completed</LI>
|
||||||
|
</UL>
|
||||||
|
</LI>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
|
||||||
|
<H3>Serializations</H3>
|
||||||
|
|
||||||
|
<P>The serialization of an archival object consists of:</P>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI>Its instance fields (i.e., non-static, non-transient fields)</LI>
|
||||||
|
<LI>The serializations of associated objects (or references to these serializations).</LI>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
|
||||||
|
<H3>Unique Ids</H3>
|
||||||
|
|
||||||
|
<P>To be able to trace the history of an object, it is essential that the object have a unique identifier. Since not all objects in the system have Handles, the unique identifiers are only weakly tied to the Handle system. Instead, the identifier consists of:</P>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI>an identifier for the project</LI>
|
||||||
|
<LI>a site id (using the handle prefix)</LI>
|
||||||
|
<LI>an RDBMS-based id for objects</LI>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
|
||||||
|
<H3>Storage</H3>
|
||||||
|
|
||||||
|
<P>When an archive object is serialized, an object ID and MD5 checksum are recorded. When another object is serialized, the checksum for the serialization is matched against existing checksums for that object. If the checksum already exists, the object is not stored; a reference to the object is used instead. Note that since none of the serializations are deleted, reference counting is unnecessary.</P>
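<P>The checksum comparison described above can be sketched as follows. This is only an illustration under stated assumptions, not the actual <code>HistoryManager</code> code: the class name <code>SerializationStore</code>, the method names, and the in-memory map (standing in for the <code>history</code> table) are all hypothetical.</P>

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; names and storage are assumptions. */
final class SerializationStore {
    // Object ID -> checksums of the serializations already stored for that object.
    // A real implementation would consult the database instead of a map.
    private final Map<String, Set<String>> checksumsByObject = new HashMap<>();

    /** Returns the MD5 checksum of the serialization as lowercase hex. */
    static String md5Hex(String serialization) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(serialization.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /**
     * Returns true if this serialization has not been seen for the object
     * and should be written; false if an identical one is already stored,
     * in which case a reference to the existing copy is enough.
     */
    boolean shouldStore(String objectId, String serialization) {
        String checksum = md5Hex(serialization);
        return checksumsByObject
                .computeIfAbsent(objectId, id -> new HashSet<>())
                .add(checksum);
    }
}
```

<P>Because serializations are never deleted, the sketch only ever adds checksums; nothing needs to be counted or removed.</P>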
|
||||||
|
|
||||||
|
<P>The history data is not initially stored in a queryable form. Two simple RDBMS tables give basic indications of what is stored, and where. The <code>history</code> table is an index of serializations with checksums and dates. The <code>history_id</code> column corresponds to the file in which a serialization is stored. For example, if the history ID is 123456, it will be stored in the file:</P>
|
||||||
|
|
||||||
|
<PRE>/dspace/history/00/12/34/123456</PRE>
|
||||||
|
|
||||||
|
<P>The table also contains the date the serialization was written and the MD5 checksum of the serialization.</P>
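<P>The mapping from history ID to file name can be sketched as below. The scheme is inferred from the single example given (ID 123456 stored under <code>00/12/34</code>): it assumes the ID is zero-padded to eight digits and the first three digit pairs become directory levels. The class and method names are hypothetical, and the real code may differ for IDs outside that range.</P>

```java
/** Illustrative sketch only; the padding scheme is an assumption. */
final class HistoryPaths {
    /**
     * Builds the file path for a history ID, assuming eight-digit zero
     * padding and three two-digit directory levels.
     */
    static String pathFor(String historyDir, int historyId) {
        String padded = String.format("%08d", historyId);
        return historyDir
                + "/" + padded.substring(0, 2)
                + "/" + padded.substring(2, 4)
                + "/" + padded.substring(4, 6)
                + "/" + historyId;
    }
}
```

<P>With these assumptions, <code>HistoryPaths.pathFor("/dspace/history", 123456)</code> yields the path shown in the example above.</P>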
|
||||||
|
|
||||||
|
<P>The <code>historystate</code> table is supposed to indicate the most recent serialization of any given object.</P>
|
||||||
|
|
||||||
|
|
||||||
|
<H3>Example</H3>
|
||||||
|
|
||||||
|
<P>An item is submitted to a collection via bulk upload. When (and if) the item is eventually added to the collection, the history method is called, with references to the item, its collection, the e-person who performed the bulk upload, and some indication of the fact that it was submitted via a bulk upload.</P>
|
||||||
|
|
||||||
|
<P>When called, the <code>HistoryManager</code> creates the following new resources (all with unique ids):</P>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI>An event</LI>
|
||||||
|
<LI>A state</LI>
|
||||||
|
<LI>An action</LI>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<P>It also generates the following relationships:</P>
|
||||||
|
|
||||||
|
<PRE>event --atTime--> time
|
||||||
|
event --hasOutput--> state
|
||||||
|
Item --inState--> state
|
||||||
|
state --contains--> Item
|
||||||
|
action --creates--> Item
|
||||||
|
event --hasAction--> action
|
||||||
|
action --usesTool--> DSpace Upload
|
||||||
|
action --hasAgent--> User</PRE>
|
||||||
|
|
||||||
|
<P>The history component serializes the state of all archival objects involved (in this case, the item, the e-person, and the collection). It creates entries in the history database tables which associate the archival objects with the generated serializations.</P>
|
||||||
|
|
||||||
|
<H3>Caveats</H3>
|
||||||
|
|
||||||
|
<P>This history system is a largely untested experiment. It also needs further documentation. There have been no serious efforts to determine whether the information written by the history system, either to files or the database tables, is accurate. In particular, the <code>historystate</code> table does not seem to be correctly written.</P>
|
||||||
|
|
||||||
|
<HR>
|
||||||
|
|
||||||
|
<ADDRESS>
|
||||||
|
Copyright © 2002-2004 MIT and Hewlett Packard
|
||||||
|
</ADDRESS>
|
||||||
|
</BODY>
|
||||||
|
</HTML>
|
847
dspace/docs/configure.html
Normal file
@@ -0,0 +1,847 @@
|
|||||||
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">
|
||||||
|
<html>
|
||||||
|
<head>
|
||||||
|
<title>DSpace System Documentation: Configuration and Customization</title>
|
||||||
|
<link rel="StyleSheet" href="style.css" type="text/css">
|
||||||
|
<meta http-equiv="Content-Type"
|
||||||
|
content="text/html; charset=iso-8859-1">
|
||||||
|
</head>
|
||||||
|
<body>
|
||||||
|
<h1>DSpace System Documentation: Configuration and Customization</h1>
|
||||||
|
<p><a href="index.html">Back to contents</a></p>
|
||||||
|
<p>There are a number of ways in which DSpace can be configured and/or
|
||||||
|
customized:</p>
|
||||||
|
<ul>
|
||||||
|
<li>Altering the configuration files in <code><i>[dspace]</i>/config</code></li>
|
||||||
|
<li>Creating modified versions of the JSPs; these can be placed
|
||||||
|
separately from and override the default installed JSPs, so that future
|
||||||
|
updates of the code won't overwrite your changes</li>
|
||||||
|
<li>Implementing a custom 'authenticator' class, so that user
|
||||||
|
authentication in the Web UI can be adapted and integrated with any
|
||||||
|
existing mechanisms your organization might use</li>
|
||||||
|
<li>Editing the source code</li>
|
||||||
|
</ul>
|
||||||
|
<p>Of these methods, only the last is likely to cause any headaches; if
|
||||||
|
you update the DSpace source code directly, it may make applying future
|
||||||
|
updates difficult. However, DSpace is open source, of course, and if
|
||||||
|
you make any modifications that might be helpful to other institutions
|
||||||
|
or organizations, feel free to send them to the DSpace team at MIT.</p>
|
||||||
|
<h2><a name="dspacecfg">The <code>dspace.cfg</code> Configuration
|
||||||
|
Properties File</a></h2>
|
||||||
|
<p>The primary way of configuring DSpace is to edit <code>dspace.cfg</code>.
|
||||||
|
You'll definitely have to do this before you can operate DSpace
|
||||||
|
properly. <code>dspace.cfg</code> contains basic information about a
|
||||||
|
DSpace installation, including system path information, network host
|
||||||
|
information, and other things like site name.</p>
|
||||||
|
<p>The default <code>dspace.cfg</code> is a good source of
|
||||||
|
information, and contains comments for all properties. It's a basic
|
||||||
|
Java properties file, where lines are either comments, starting with a '<code>#</code>',
|
||||||
|
blank lines, or property/value pairs of the form:</p>
|
||||||
|
<pre>property.name = property value</pre>
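<p>Since the file follows the standard Java properties format, it can be read with <code>java.util.Properties</code>. The following is a minimal sketch of that, not the way DSpace itself loads its configuration; the class name <code>ConfigDemo</code> is hypothetical.</p>

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Illustrative sketch only; not DSpace's own configuration loader. */
final class ConfigDemo {
    /** Parses properties-file text: '#' comments, blank lines and name = value pairs. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return props;
    }
}
```

<p>Whitespace around the <code>=</code> is ignored by the parser, so <code>dspace.dir = /dspace</code> yields the value <code>/dspace</code>, while spaces inside a value such as <code>DSpace at My University</code> are preserved.</p>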
|
||||||
|
<p>Due to time constraints, this document does not contain an
|
||||||
|
exhaustive list of properties; they are all listed in the supplied <code>dspace.cfg</code>.
|
||||||
|
Below are some particularly relevant properties with notes for their
|
||||||
|
use:</p>
|
||||||
|
<table>
|
||||||
|
<caption><code>dspace.cfg</code> Main Properties (Not Complete)</caption>
|
||||||
|
<tbody>
|
||||||
|
<tr>
|
||||||
|
<th>Property</th>
|
||||||
|
<th>Example Values</th>
|
||||||
|
<th>Notes</th>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td><code>dspace.dir</code></td>
|
||||||
|
<td><code>/dspace</code></td>
|
||||||
|
<td>Root directory of DSpace installation. Omit the trailing '/'.
|
||||||
|
Note that if you change this, there are several other parameters you
|
||||||
|
will probably want to change to match, e.g. <code>assetstore.dir</code>.</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td><code>dspace.url</code></td>
|
||||||
|
<td><code>http://dspace.myu.edu</code><br>
|
||||||
|
<code>http://dspacetest.myu.edu:8080</code></td>
|
||||||
|
<td>Main URL at which DSpace Web UI webapp is deployed. Include
|
||||||
|
any port number, but do not include a trailing '/'</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td><code>dspace.hostname</code></td>
|
||||||
|
<td><code>dspace.myu.edu</code></td>
|
||||||
|
<td>Fully qualified hostname; do not include port number</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td><code>dspace.name</code></td>
|
||||||
|
<td><code>DSpace at My University</code></td>
|
||||||
|
<td>Short and sweet site name, used throughout Web UI, e-mails
|
||||||
|
and elsewhere (such as OAI protocol)</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td><code>config.template.foo</code></td>
|
||||||
|
<td><code>/opt/othertool/cfg/foo</code></td>
|
||||||
|
<td>When <code>install-configs</code> is run, the file <code><i>[dspace]</i>/config/templates/foo</code>
|
||||||
|
will be filled out with values from <code>dspace.cfg</code> and
|
||||||
|
copied to the value of this property, in this example <code>/opt/othertool/cfg/foo</code>.
|
||||||
|
<a href="#templates">See here for more information.</a></td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td><code>webui.site.authenticator</code></td>
|
||||||
|
<td><code>edu.myu.MyAuthenticator</code></td>
|
||||||
|
<td>The Java class name of a class implementing the <code>org.dspace.app.webui.SiteAuthenticator</code>
|
||||||
|
interface.</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td><code>handle.prefix</code></td>
|
||||||
|
<td><code>1721.1234</code></td>
|
||||||
|
<td>The Handle prefix for your site, <a
|
||||||
|
href="install.html#handles">see the Handle section</a></td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td><code>assetstore.dir</code></td>
|
||||||
|
<td><code>/bigdisk/store</code></td>
|
||||||
|
<td>The location in the file system for asset (bitstream) store
|
||||||
|
number zero. This should be a directory for the sole use of DSpace.</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td><code>assetstore.dir.n</code></td>
|
||||||
|
<td><code>/anotherdisk/store1</code></td>
|
||||||
|
<td>The location in the file system of asset (bitstream) store
|
||||||
|
number <code>n</code>. When adding additional stores, start with 1 (<code>assetstore.dir.1</code>)
|
||||||
|
and count upwards. Always leave asset store zero (<code>assetstore.dir</code>) in place.
|
||||||
|
For more details, see <a href="storage.html#bitstreams">the Bitstream
|
||||||
|
Storage section</a>.</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td><code>assetstore.incoming</code></td>
|
||||||
|
<td><code>1</code></td>
|
||||||
|
<td>The asset store number to use for storing new bitstreams. For
|
||||||
|
example, if <code>assetstore.dir.1</code> is <code>/anotherdisk/store1</code>,
|
||||||
|
and <code>assetstore.incoming</code> is <code>1</code>, new
|
||||||
|
bitstreams will be stored under <code>/anotherdisk/store1</code>. A
|
||||||
|
value of <code>0</code> (zero) corresponds to <code>assetstore.dir</code>.
|
||||||
|
For more details, see <a href="storage.html#bitstreams">the Bitstream
|
||||||
|
Storage section</a>.</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td><span style="font-family: monospace;">srb.xxx</span><br
|
||||||
|
style="font-family: monospace;">
|
||||||
|
<span style="font-family: monospace;">srb.xxx.n</span><br>
|
||||||
|
</td>
|
||||||
|
<td><span style="font-family: monospace;"></span><span
|
||||||
|
style="font-family: monospace;">/zone/home/user.domain</span><br>
|
||||||
|
</td>
|
||||||
|
<td>The sets of SRB access parameters (see <span
|
||||||
|
style="font-family: monospace;">dspace.cfg</span>) if one or more SRB
|
||||||
|
accounts are used. The <span style="font-family: monospace;">srb.xxx</span>
|
||||||
|
set would correspond to asset (bitstream) store number zero. The <span
|
||||||
|
style="font-family: monospace;">srb.xxx.n</span> set would correspond
|
||||||
|
to asset (bitstream) store number <span style="font-family: monospace;">n</span>.
|
||||||
|
For more details, see <a href="storage.html#bitstreams">the Bitstream
|
||||||
|
Storage section</a>.</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td><code>webui.submit.enable-cc</code></td>
|
||||||
|
<td><code>true</code></td>
|
||||||
|
<td>Enable the Creative Commons license step in the submission
|
||||||
|
process. Submitters are given an opportunity to select a Creative
|
||||||
|
Commons license to accompany the Item. Creative Commons licenses govern
|
||||||
|
the use of the content. For more details, see <a
|
||||||
|
href="http://creativecommons.org">the Creative Commons website</a>.</td>
|
||||||
|
</tr>
|
||||||
|
</tbody>
|
||||||
|
</table>
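<p>The relationship between <code>assetstore.dir</code>, <code>assetstore.dir.n</code> and <code>assetstore.incoming</code> in the table above can be sketched as follows. This is an illustration of the rule as described, not the bitstream storage code itself: the class <code>AssetStores</code> is hypothetical, and treating a missing <code>assetstore.incoming</code> as zero is an assumption.</p>

```java
import java.util.Properties;

/** Illustrative sketch only; names and the default of 0 are assumptions. */
final class AssetStores {
    /** Store zero uses the bare property name; store n uses the ".n" suffix. */
    static String propertyNameFor(int storeNumber) {
        return storeNumber == 0 ? "assetstore.dir" : "assetstore.dir." + storeNumber;
    }

    /** Returns the directory in which new bitstreams would be stored. */
    static String incomingDirectory(Properties config) {
        // Assumption: fall back to store zero when assetstore.incoming is absent.
        int incoming = Integer.parseInt(
                config.getProperty("assetstore.incoming", "0").trim());
        String name = propertyNameFor(incoming);
        String dir = config.getProperty(name);
        if (dir == null) {
            throw new IllegalStateException("No directory configured for " + name);
        }
        return dir;
    }
}
```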
|
||||||
|
<p>Whenever you edit <code>dspace.cfg</code>, you should then run <code><i>[dspace]</i>/bin/install-configs</code>
|
||||||
|
so that any changes you may have made are reflected in the
|
||||||
|
configuration files of other applications, for example Apache. You may
|
||||||
|
then need to restart those applications, depending on what you changed.</p>
|
||||||
|
<h2><a name="email">Wording of E-mail Messages</a></h2>
|
||||||
|
<p>Sometimes DSpace automatically sends e-mail messages to users, for
|
||||||
|
example to inform them of a new workflow task, or as a subscription
|
||||||
|
e-mail alert. The wording of emails can be changed by editing the
|
||||||
|
relevant file in <code><i>[dspace]</i>/config/emails</code>. Each file
|
||||||
|
is commented. Be careful to keep the right number of 'placeholders' (e.g. <code>{2}</code>).</p>
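<p>To see why the placeholder count matters, here is a small sketch using <code>java.text.MessageFormat</code>, which uses the same <code>{n}</code> notation. That the e-mail templates are filled in this way is an assumption made for illustration; the class name and the sample wording are hypothetical.</p>

```java
import java.text.MessageFormat;

/** Illustrative sketch only; assumes MessageFormat-style placeholders. */
final class EmailTemplateDemo {
    /** Replaces {0}, {1}, {2}, ... in the template with the given arguments. */
    static String fill(String template, Object... args) {
        return MessageFormat.format(template, args);
    }
}
```

<p>If a placeholder is removed from the wording, the corresponding argument is silently dropped from the message; if a placeholder refers to an argument that is not supplied, the text <code>{n}</code> is left in the output. Note also that <code>MessageFormat</code> treats the apostrophe as a quoting character, so it needs care in template text.</p>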
|
||||||
|
<h2><a name="email">Local DSpace Administrator Contact Information</a></h2>
|
||||||
|
<p>There are several places in DSpace in which the user will be shown
|
||||||
|
contact information for the local DSpace Administrator: for instance,
|
||||||
|
when an error occurs, or in the on-line help when the user is looking
|
||||||
|
for more information. The contact information is displayed by <code><i>[dspace-source]</i>/jsp/components/contact-info.jsp</code>.
|
||||||
|
This JSP retrieves the help e-mail address from <code>dspace.cfg</code>, but the phone number
|
||||||
|
in the JSP is a dummy phone number that needs to be edited directly in
|
||||||
|
the JSP. You should be sure to edit this file (adding any additional
|
||||||
|
information you feel might be useful) so that users know who to contact
|
||||||
|
for further information.</p>
|
||||||
|
<h2><a name="registries">The Dublin Core and Bitstream Format Registries</a></h2>
|
||||||
|
<p>The <code><i>[dspace]</i>/config/registries</code> directory
|
||||||
|
contains two XML files. These are used to load the <em>initial</em>
|
||||||
|
contents of the Dublin Core type registry and Bitstream Format
|
||||||
|
registry. After the initial loading (performed by <code>ant
|
||||||
|
fresh_install</code> during installation), the registries reside in the database; the
|
||||||
|
XML files are not updated.</p>
|
||||||
|
<p>Currently, the system requires that every item have a Dublin Core
|
||||||
|
record. The exact Dublin Core elements and qualifiers that are used can
|
||||||
|
be configured by editing the Dublin Core registry. This can either be
|
||||||
|
done at install-time, by editing <code><i>[dspace]</i>/config/registries/dublin-core-types.xml</code>,
|
||||||
|
or at run-time using the administration Web UI. However, note that some
|
||||||
|
elements and qualifiers must be present for DSpace to function
|
||||||
|
correctly since they are used for various purposes by the code. Details
|
||||||
|
are in the relevant <code>.xml</code> file.</p>
|
||||||
|
<p>Also note that altering the Dublin Core registry does not, at the
|
||||||
|
current time, cause corresponding changes in the Web UI (e.g. the
|
||||||
|
submission interface or search indices).</p>
|
||||||
|
<p>The bitstream formats recognized by the system and levels of support
|
||||||
|
are similarly stored in the bitstream format registry. This can also be
|
||||||
|
edited at install-time via <code><i>[dspace]</i>/config/registries/bitstream-formats.xml</code>
|
||||||
|
or by the administration Web UI. The contents of the bitstream format
|
||||||
|
registry are entirely up to you, though the system requires that the
|
||||||
|
following two formats are present:</p>
|
||||||
|
<ul>
|
||||||
|
<li><code>Unknown</code></li>
|
||||||
|
<li><code>License</code></li>
|
||||||
|
</ul>
|
||||||
|
<h2><a name="templates">Configuration Files for Other Applications</a></h2>
|
||||||
|
<p>To ease the hassle of keeping configuration files for other
|
||||||
|
applications involved in running a DSpace site, for example Apache, in
|
||||||
|
sync, the DSpace system can automatically update them for you when the
|
||||||
|
main DSpace configuration is changed. This feature of the DSpace system
|
||||||
|
is entirely optional, but we found it useful.</p>
|
||||||
|
<p>The way this is done is by placing the configuration files for those
|
||||||
|
applications in <code><i>[dspace]</i>/config/templates</code>, and
|
||||||
|
inserting special values in the configuration file that will be filled
|
||||||
|
out with appropriate DSpace configuration properties. Then, tell DSpace
|
||||||
|
where to put the filled-out, 'live' version of the configuration by adding
|
||||||
|
an appropriate property to <code>dspace.cfg</code>, and run <code><i>[dspace]</i>/bin/install-configs</code>.</p>
|
||||||
|
<p>Take the <code>apache13.conf</code> file as an example. This
|
||||||
|
contains plenty of Apache-specific stuff, but where it uses a value
|
||||||
|
that should be kept in sync across DSpace and associated applications,
|
||||||
|
a 'placeholder' value is written. For example, the host name:</p>
|
||||||
|
<pre>ServerName @@dspace.hostname@@</pre>
|
||||||
|
<p>The text <code>@@dspace.hostname@@</code> will be filled out with
|
||||||
|
the value of the <code>dspace.hostname</code> property in <code>dspace.cfg</code>.
|
||||||
|
Then we decide where the 'live' version, that is, the version
|
||||||
|
actually read in by Apache when it starts up, will go.</p>
|
||||||
|
<p>Let's say we want the live version to be located at <code>/opt/apache/conf/dspace-httpd.conf</code>.
|
||||||
|
To do this, we add the following property to <code>dspace.cfg</code>
|
||||||
|
so DSpace knows where to put it:</p>
|
||||||
|
<pre>config.template.apache13.conf = /opt/apache/conf/dspace-httpd.conf</pre>
|
||||||
|
<p>Now, we run <code><i>[dspace]</i>/bin/install-configs</code>. This
|
||||||
|
reads in <code><i>[dspace]</i>/config/templates/apache13.conf</code>,
|
||||||
|
and places a copy at <code>/opt/apache/conf/dspace-httpd.conf</code>
|
||||||
|
with the placeholders filled out.</p>
|
||||||
|
<p>So, in <code>/opt/apache/conf/dspace-httpd.conf</code>, there will
|
||||||
|
be a line like:</p>
|
||||||
|
<pre>ServerName dspace.myu.edu</pre>
|
||||||
|
<p>The advantage of this approach is that if a property like the
|
||||||
|
hostname changes, you can just change it in <code>dspace.cfg</code>
|
||||||
|
and run <code>install-configs</code>, and all of your tools'
|
||||||
|
configuration files will be updated.</p>
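<p>The placeholder substitution described above can be sketched in a few lines. This is not the <code>install-configs</code> implementation, only an illustration of the idea; the class <code>TemplateFiller</code> is hypothetical, and leaving unknown placeholders untouched is an assumption about what a sensible tool would do.</p>

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; not the install-configs implementation. */
final class TemplateFiller {
    // Matches @@property.name@@ and captures the property name.
    private static final Pattern PLACEHOLDER = Pattern.compile("@@([^@\\s]+)@@");

    /** Returns the template text with each placeholder replaced by its configured value. */
    static String fill(String template, Properties config) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = config.getProperty(m.group(1));
            // Assumption: a placeholder with no matching property is left as-is.
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement stops '$' and '\' in values being treated specially.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

<p>Given a configuration in which <code>dspace.hostname</code> is <code>dspace.myu.edu</code>, filling the template line <code>ServerName @@dspace.hostname@@</code> produces <code>ServerName dspace.myu.edu</code>.</p>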
|
||||||
|
<p>However, take care to make all your edits to the versions in <code><i>[dspace]</i>/config/templates</code>!
|
||||||
|
It's a wise idea to put a big reminder at the top of each file, since
|
||||||
|
someone might unwittingly edit a 'live' configuration file which would
|
||||||
|
later be overwritten.</p>
|
||||||
|
<h2><a name="customui">Customizing the Web User Interface</a></h2>
|
||||||
|
<p>The Web UI is implemented using Java Servlets which handle the
|
||||||
|
business logic, and JavaServer Pages (JSPs) which produce the HTML
|
||||||
|
pages sent to an end-user. Since the JSPs are much closer to HTML than
|
||||||
|
Java code, altering the look and feel of DSpace is relatively easy.</p>
|
||||||
|
<p>To make it even easier, DSpace allows you to 'override' the JSPs
|
||||||
|
included in the source distribution with modified versions, that are
|
||||||
|
stored in a separate place, so when it comes to updating your site with
|
||||||
|
a new DSpace release, your modified versions will not be overwritten.</p>
|
||||||
|
<p>However, note that the data (attributes) passed from an underlying
|
||||||
|
Servlet to the JSP may change between versions, so you may have to
|
||||||
|
modify your customized JSP to deal with the new data.</p>
|
||||||
|
<p>The JSPs are stored in <code><i>[dspace-source]</i>/jsp</code>.
|
||||||
|
Place your edited version of a JSP in the <code><i>[dspace-source]</i>/jsp/local</code>
|
||||||
|
directory, with the same path as the original. If they exist, these
|
||||||
|
will be used in preference to the distributed versions in <code><i>[dspace-source]</i>/jsp</code>.
|
||||||
|
For example:</p>
|
||||||
|
<table>
|
||||||
|
<tbody>
|
||||||
|
<tr>
|
||||||
|
<th>DSpace default</th>
|
||||||
|
<th>Locally-modified version</th>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td><code><i>[dspace-source]</i>/jsp/community-list.jsp</code></td>
|
||||||
|
<td><code><i>[dspace-source]</i>/jsp/local/community-list.jsp</code></td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td><code><i>[dspace-source]</i>/jsp/mydspace/main.jsp</code></td>
|
||||||
|
<td><code><i>[dspace-source]</i>/jsp/local/mydspace/main.jsp</code></td>
|
||||||
|
</tr>
|
||||||
|
</tbody>
|
||||||
|
</table>
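<p>The precedence rule in the table above amounts to "use the copy under <code>local</code> if there is one". The sketch below illustrates only that rule; in DSpace the choice takes effect when the Web application is built, and the class <code>JspResolver</code> is hypothetical.</p>

```java
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative sketch only; shows the precedence rule, not the build process. */
final class JspResolver {
    /**
     * Returns the locally-modified version of a JSP if one exists under
     * jspRoot/local with the same relative path, otherwise the default.
     */
    static Path resolve(Path jspRoot, String relativeJsp) {
        Path local = jspRoot.resolve("local").resolve(relativeJsp);
        return Files.isRegularFile(local) ? local : jspRoot.resolve(relativeJsp);
    }
}
```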
|
||||||
|
<p>Heavy use is made of a style sheet, in <code><i>[dspace-source]</i>/jsp/styles.css.jsp</code>.
|
||||||
|
If you make edits, call the local version <code><i>[dspace-source]</i>/jsp/local/styles.css.jsp</code>,
|
||||||
|
and it will be used automatically in preference to the default, as
|
||||||
|
described above.</p>
|
||||||
|
<p>Fonts and colors can be easily changed using the stylesheet. The
|
||||||
|
stylesheet is a JSP so that the user's browser version can be detected
|
||||||
|
and the stylesheet tweaked accordingly.</p>
|
||||||
|
<p>The 'layout' of each page, that is, the top and bottom banners and
|
||||||
|
the navigation bar, is determined by the JSPs <code><i>[dspace-source]</i>/jsp/layout/header-*.jsp</code>
|
||||||
|
and <code><i>[dspace-source]</i>/jsp/layout/footer-*.jsp</code>. You
|
||||||
|
can provide modified versions of these (in <code><i>[dspace-source]</i>/jsp/local/layout</code>),
|
||||||
|
or define more styles and apply them to pages by using the "style"
|
||||||
|
attribute of the <code>dspace:layout</code> tag.</p>
|
||||||
|
<p>After you've customized your JSPs, <strong>you must rebuild the
|
||||||
|
DSpace Web application</strong>. If you haven't already built and
|
||||||
|
installed it, follow the <a href="install.html">install</a>
|
||||||
|
directions. Otherwise, follow the steps below:</p>
|
||||||
|
<ol>
|
||||||
|
<li>
|
||||||
|
<p>Rebuild the <code>dspace.war</code> file by running the
|
||||||
|
following command from your <code><i>[dspace-source]</i></code>
|
||||||
|
directory:</p>
|
||||||
|
<pre>ant -Dconfig=<i>[dspace]</i>/config/dspace.cfg build_wars</pre>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Shut down Tomcat, and delete the existing <i>[tomcat]</i>/webapps/dspace
|
||||||
|
directory.</p>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Copy the new .war file to the Tomcat webapps directory:</p>
|
||||||
|
<pre>cp <i>[dspace-source]</i>/build/dspace.war <i>[tomcat]</i>/webapps</pre>
|
||||||
|
</li>
|
||||||
|
</ol>
|
||||||
|
<p>When you restart the web server you should see your customized JSPs.</p>
|
||||||
|
<h2><a name="authenticate">Custom Authentication Code</a></h2>
|
||||||
|
<p>Since many institutions and organizations have existing
|
||||||
|
authentication systems, DSpace has been designed to allow these to be
|
||||||
|
easily integrated. To do this, you can provide a custom class
|
||||||
|
implementing the Java interface <code>org.dspace.app.webui.SiteAuthenticator</code>.
|
||||||
|
Its methods are invoked when various authentication-related events
|
||||||
|
occur in the Web user interface.</p>
|
||||||
|
<p>The basic authentication procedure in the DSpace Web UI is this:</p>
|
||||||
|
<ol>
|
||||||
|
<li>A request is received from an end-user's browser that, if
|
||||||
|
fulfilled, would lead to an action requiring authorization taking place.</li>
|
||||||
|
<li>If the end-user is already authenticated:
|
||||||
|
<ul>
|
||||||
|
<li>If the end-user is allowed to perform the action, the action
|
||||||
|
proceeds</li>
|
||||||
|
<li>If the end-user is NOT allowed to perform the action, an
|
||||||
|
authorization error is displayed.</li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
</ol>
|
||||||
|
<h2><a name="ldap">LDAP Authentication</a></h2>
|
||||||
|
<p>If LDAP is enabled in the dspace.cfg file, then new users
|
||||||
|
will be able to register by entering their username and
|
||||||
|
password without being sent the registration token. If
|
||||||
|
users do not have a username and password, then they
|
||||||
|
can still register and login with just their email address
|
||||||
|
the same way they do now.
|
||||||
|
</p>
|
||||||
|
<p>If you want to give any special privileges to LDAP users,
|
||||||
|
you will still need to extend the SiteAuthenticator class to
|
||||||
|
automatically put people who have a netid into a special
|
||||||
|
group. You might also want to give certain email addresses
|
||||||
|
special privileges. Refer to the <a href="#authenticate">Custom
|
||||||
|
Authentication Code</a> section above for more information about
|
||||||
|
how to do this.
|
||||||
|
</p>
|
||||||
|
<p>Here is an explanation of what the different
|
||||||
|
configuration parameters are for:
|
||||||
|
<ul>
|
||||||
|
<li><b>ldap.enable</b><br />
|
||||||
|
This setting will enable or disable LDAP authentication in DSpace.
|
||||||
|
With the setting off, users will be required to register and login with
|
||||||
|
their email address. With this setting on, users will be able to login
|
||||||
|
and register with their LDAP user ids and passwords.
|
||||||
|
</li>
|
||||||
|
<li><b>webui.ldap.autoregister</b> <br />
|
||||||
|
This will turn LDAP autoregistration on or off. With this
|
||||||
|
on, a new EPerson object will be created for any user who
|
||||||
|
successfully authenticates against the LDAP server when they
|
||||||
|
first login. With this setting off, the user
|
||||||
|
must first register to get an EPerson object by
|
||||||
|
entering their ldap username and password and filling out
|
||||||
|
the forms.</li>
|
||||||
|
<li><b>ldap.provider_url = ldap://ldap.myu.edu/o=myu.edu</b><br />
|
||||||
|
This is the url to your institution's ldap server. You may or may
|
||||||
|
not need the /o=myu.edu part at the end. Your server may
|
||||||
|
also require the ldaps:// protocol.
|
||||||
|
</li>
|
||||||
|
<li><b>ldap.id_field = uid</b><br />
|
||||||
|
This is the unique identifier field in the LDAP directory
|
||||||
|
where the username is stored.
|
||||||
|
</li>
|
||||||
|
<li><b>ldap.object_context = ou=people,o=myu.edu</b><br />
|
||||||
|
This is the object context used when authenticating the
|
||||||
|
user. It is appended to the ldap.id_field and username.
|
||||||
|
For example uid=username,ou=people,o=myu.edu. You will need
|
||||||
|
to modify this to match your ldap configuration.
|
||||||
|
</li>
|
||||||
|
<li><b>ldap.search_context = ou=people</b><br />
|
||||||
|
This is the search context used when looking up a user's
|
||||||
|
ldap object to retrieve their data for autoregistering.
|
||||||
|
With webui.ldap.autoregister turned on, when a user authenticates
|
||||||
|
without an EPerson object we search the ldap directory to
|
||||||
|
get their name and email address so that we can create one
|
||||||
|
for them. So after we have authenticated against
|
||||||
|
uid=username,ou=people,o=myu.edu we now search in ou=people
|
||||||
|
filtering on [uid=username]. Often the
|
||||||
|
ldap.search_context is the same as the ldap.object_context
|
||||||
|
parameter. But again this depends on your ldap server
|
||||||
|
configuration.
|
||||||
|
</li>
|
||||||
|
<li><b>ldap.email_field = mail</b><br />
|
||||||
|
This is the ldap object field where the user's email address
|
||||||
|
is stored. "mail" is the default and the most common for
|
||||||
|
ldap servers. If the mail field is not found the username
|
||||||
|
will be used as the email address when creating the eperson
|
||||||
|
object.
|
||||||
|
</li>
|
||||||
|
<li><b>ldap.surname_field = sn</b><br />
|
||||||
|
This is the ldap object field where the user's last name is
|
||||||
|
stored. "sn" is the default and is the most common for ldap
|
||||||
|
servers. If the field is not found the field will be left
|
||||||
|
blank in the new eperson object.
|
||||||
|
</li>
|
||||||
|
<li><b>ldap.givenname_field = givenName</b><br />
|
||||||
|
This is the ldap object field where the user's given names
|
||||||
|
are stored. I'm not sure how common the givenName field is
|
||||||
|
in different ldap instances. If the field is not found the
|
||||||
|
field will be left blank in the new eperson object.
|
||||||
|
</li>
|
||||||
|
<li><b>ldap.phone_field = telephoneNumber</b><br />
|
||||||
|
This is the field where the user's phone number is stored in
|
||||||
|
the ldap directory. If the field is not found the field
|
||||||
|
will be left blank in the new eperson object.
|
||||||
|
</li>
|
||||||
|
</ul>
|
||||||
|
</p>
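<p>The way <code>ldap.id_field</code>, the username and <code>ldap.object_context</code> combine into the name used for authentication can be sketched as below. The class and method names are hypothetical, and the sketch deliberately omits the escaping of special characters that real code handling user input would need.</p>

```java
/** Illustrative sketch only; no escaping of special characters is performed. */
final class LdapNames {
    /**
     * Joins the ID field, username and object context into a distinguished
     * name, e.g. uid=username,ou=people,o=myu.edu.
     */
    static String userDn(String idField, String username, String objectContext) {
        return idField + "=" + username + "," + objectContext;
    }
}
```

<p>With <code>ldap.id_field = uid</code> and <code>ldap.object_context = ou=people,o=myu.edu</code>, a user entering <code>username</code> would be authenticated as <code>uid=username,ou=people,o=myu.edu</code>.</p>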
|
||||||
|
<h2><a name="statistics">System Statistical Reports</a></h2>
|
||||||
|
|
||||||
|
<p>Statistics for the system can be made available at <code>http://www.mydspaceinstance.edu/statistics</code>. To use the system statistics you will have to initialise them as per the installation documentation, but before you do so you need to perform the customisations discussed here in order to ensure that the reports are generated correctly.</p>
|
||||||
|
|
||||||
|
<h3>Configuration File</h3>
|
||||||
|
|
||||||
|
<p>Configuration for the statistics system is in <code>[dspace]/config/dstat.cfg</code> and the file should guide you to correctly filling in the details required. For the most part you will not need to change this file.</p>
|
||||||
|
|
||||||
|
<h3>Customising Shell Scripts</h3>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
To customise the supplied Perl scripts to do monthly and general report
|
||||||
|
generation it is necessary to modify the scripts themselves slightly. This
|
||||||
|
is because these scripts were developed to speed up the process of using
|
||||||
|
DStat at Edinburgh University Library and were not particularly intended for
|
||||||
|
external use. They appear here for the convenience of others and in order to
|
||||||
|
bridge the gap between the report generation and the inclusion of those reports
|
||||||
|
into the DSpace UI, which is currently a clunky process.
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
In order to get these scripts to work for you, open each of the following in
|
||||||
|
turn:
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
dstat-general
|
||||||
|
dstat-initial
|
||||||
|
dstat-monthly
|
||||||
|
dstat-report-general
|
||||||
|
dstat-report-initial
|
||||||
|
dstat-report-monthly
|
||||||
|
</pre>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
Scripts ending with <code>-general</code> do the work for building reports spanning the
|
||||||
|
entire history of the archive; scripts ending <code>-initial</code> are to initialise the
|
||||||
|
reports by doing monthly reports from some start date up to the present;
|
||||||
|
scripts ending <code>-monthly</code> generate a single monthly report <em>for the current
|
||||||
|
month</em>. These scripts are just designed to make life easier, and are not
|
||||||
|
particularly clever or elegant.
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
In each file you will find a section:
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
# Details used
|
||||||
|
################################################
|
||||||
|
|
||||||
|
... some perl ...
|
||||||
|
|
||||||
|
################################################
|
||||||
|
</pre>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
The Perl between the lines of hashes defines the variables which will be used
|
||||||
|
to do all of the processing in the report. The following explains what the
|
||||||
|
variables mean and what they should be set to for each of the scripts.
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
<strong>dstat-initial:</strong><br />
|
||||||
|
|
||||||
|
<code>$out_prefix</code>: prefix to place in front of each output file.<br />
|
||||||
|
<code>$out_suffix</code>: suffix for output file. A date will be inserted between the
|
||||||
|
prefix and suffix<br />
|
||||||
|
<code>$start_year</code>: year to start back-analysing monthly logs from<br />
|
||||||
|
<code>$start_month</code>: month to start back-analysing monthly logs from<br />
|
||||||
|
<code>$dsrun</code>: path to your dsrun script, usually <code>[dspace]/bin/dsrun</code><br />
|
||||||
|
<code>$out_directory</code>: directory into which to place analysis files, for example
|
||||||
|
<code>[dspace]/bin/log/</code><br />
|
||||||
|
</p>
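<p>The way <code>dstat-initial</code> produces one analysis file per month, with a date placed between <code>$out_prefix</code> and <code>$out_suffix</code>, can be sketched roughly as follows. This is an illustration in Java rather than the Perl of the supplied scripts; the class name, the example prefix and suffix, and the <code>yyyy-M</code> date layout are all assumptions made for the sketch, so check the scripts themselves for the real format.</p>

```java
import java.time.YearMonth;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not part of DSpace or of the dstat scripts.
final class MonthlyReportNames {

    // Builds one file name per month from start to end, inclusive.
    // The "yyyy-M" date layout is an assumption for illustration only.
    static List<String> names(String prefix, String suffix,
                              YearMonth start, YearMonth end) {
        List<String> result = new ArrayList<>();
        for (YearMonth m = start; !m.isAfter(end); m = m.plusMonths(1)) {
            result.add(prefix + m.getYear() + "-" + m.getMonthValue() + suffix);
        }
        return result;
    }

    public static void main(String[] args) {
        // Example values only; use the prefix and suffix from your own scripts.
        names("dspace-log-monthly-", ".dat",
              YearMonth.of(2004, 11), YearMonth.of(2005, 2))
            .forEach(System.out::println);
    }
}
```

<p>Whatever values you choose, the <code>$in_prefix</code> and <code>$in_suffix</code> of the matching <code>-report-</code> script need to agree with them, otherwise the report step will not find the analysis files.</p>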
|
||||||
|
|
||||||
|
<p>
|
||||||
|
<strong>dstat-monthly:</strong><br />
|
||||||
|
|
||||||
|
<code>$out_prefix</code>: prefix to place in front of each output file.<br />
|
||||||
|
<code>$out_suffix</code>: suffix for output file. A date will be inserted between the
|
||||||
|
prefix and suffix<br />
|
||||||
|
<code>$dsrun</code>: path to your dsrun script, usually <code>[dspace]/bin/dsrun</code><br />
|
||||||
|
<code>$out_directory</code>: directory into which to place analysis files, for example
|
||||||
|
<code>[dspace]/bin/log/</code><br />
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
<strong>dstat-general:</strong><br />
|
||||||
|
|
||||||
|
<code>$out_prefix</code>: prefix to place in front of each output file.<br />
|
||||||
|
<code>$out_suffix</code>: suffix for output file. Today's date will be inserted between the
|
||||||
|
prefix and suffix<br />
|
||||||
|
<code>$dsrun</code>: path to your dsrun script, usually <code>[dspace]/bin/dsrun</code><br />
|
||||||
|
<code>$out_directory</code>: directory into which to place analysis files, for example
|
||||||
|
<code>[dspace]/bin/log/</code><br />
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
<strong>dstat-report-initial:</strong><br />
|
||||||
|
|
||||||
|
<code>$in_prefix</code>: the prefix of the files generated by dstat-initial<br />
|
||||||
|
<code>$in_suffix</code>: the suffix of the files generated by dstat-initial<br />
|
||||||
|
<code>$out_prefix</code>: the report file prefix. Should be "<code>report-</code>" in order to work with
|
||||||
|
DSpace UI<br />
|
||||||
|
<code>$out_suffix</code>: the report file suffix. Should be "<code>.html</code>" in order to work with
|
||||||
|
DSpace UI<br />
|
||||||
|
<code>$start_year</code>: the start year used in dstat-initial<br />
|
||||||
|
<code>$start_month</code>: the start month used in dstat-initial<br />
|
||||||
|
<code>$dsrun</code>: path to your dsrun script, usually <code>[dspace]/bin/dsrun</code><br />
|
||||||
|
<code>$in_directory</code>: directory where analysis files were placed in dstat-initial<br />
|
||||||
|
<code>$out_directory</code>: the live reports directory: <code>[dspace]/reports/</code><br />
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
<strong>dstat-report-monthly:</strong><br />
|
||||||
|
|
||||||
|
<code>$in_prefix</code>: the prefix of the files generated by dstat-monthly<br />
|
||||||
|
<code>$in_suffix</code>: the suffix of the files generated by dstat-monthly<br />
|
||||||
|
<code>$out_prefix</code>: the report file prefix. Should be "<code>report-</code>" in order to work with
|
||||||
|
DSpace UI<br />
|
||||||
|
<code>$out_suffix</code>: the report file suffix. Should be "<code>.html</code>" in order to work with
|
||||||
|
DSpace UI<br />
|
||||||
|
<code>$dsrun</code>: path to your dsrun script, usually <code>[dspace]/bin/dsrun</code><br />
|
||||||
|
<code>$in_directory</code>: directory where analysis files were placed in dstat-monthly<br />
|
||||||
|
<code>$out_directory</code>: the live reports directory: <code>[dspace]/reports/</code><br />
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
<strong>dstat-report-general:</strong><br />
|
||||||
|
|
||||||
|
<code>$in_prefix</code>: the prefix of the files generated by dstat-general<br />
|
||||||
|
<code>$in_suffix</code>: the suffix of the files generated by dstat-general<br />
|
||||||
|
<code>$out_prefix</code>: the report file prefix. Should be "<code>report-general-</code>" in order to
|
||||||
|
work with DSpace UI<br />
|
||||||
|
<code>$out_suffix</code>: the report file suffix. Should be "<code>.html</code>" in order to work with
|
||||||
|
DSpace UI<br />
|
||||||
|
<code>$dsrun</code>: path to your dsrun script, usually <code>[dspace]/bin/dsrun</code><br />
|
||||||
|
<code>$in_directory</code>: directory where analysis files were placed in dstat-general<br />
|
||||||
|
<code>$out_directory</code>: the live reports directory: <code>[dspace]/reports/</code><br />
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
If you want additional customisations, you will need to modify the lines which
|
||||||
|
build the command to be executed and change the parameters passed to the java
|
||||||
|
processes which actually carry out the analysis. For more information on these
|
||||||
|
processes either build the javadocs or run:
|
||||||
|
</p>
|
||||||
|
|
||||||
|
|
||||||
|
<code>[dspace]/bin/dsrun ac.ed.dspace.stats.LogAnalyser -help</code><br />
|
||||||
|
|
||||||
|
<code>[dspace]/bin/dsrun ac.ed.dspace.stats.ReportGenerator -help</code>
|
||||||
|
|
||||||
|
<h3>Cron Jobs</h3>
|
||||||
|
|
||||||
|
<p>In order that the reports should be generated regularly and thus kept up to date you should set up the following cron jobs:</p>
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
#run stat analyses
|
||||||
|
0 1 * * * [dspace]/bin/stat-general
|
||||||
|
0 1 * * * [dspace]/bin/stat-monthly
|
||||||
|
0 2 * * * [dspace]/bin/stat-report-general
|
||||||
|
0 2 * * * [dspace]/bin/stat-report-monthly
|
||||||
|
</pre>
|
||||||
|
|
||||||
|
<p>Choose execution times that suit your installation, and ensure that the <code>-report-</code> scripts run a short while after the analysis scripts to give them time to complete (a run over around 8 months' worth of logs can take around 25 seconds); the resulting reports state how long the analysis took, so you can adjust your cron times accordingly.</p>
|
||||||
|
|
||||||
|
|
||||||
|
</li>
|
||||||
|
<li>If the end-user is NOT authenticated, i.e. is accessing DSpace
|
||||||
|
anonymously:
|
||||||
|
<ul>
|
||||||
|
<li>The parameters etc. of the request are stored</li>
|
||||||
|
<li>The <code>startAuthentication</code> method is invoked on
|
||||||
|
the currently configured <code>SiteAuthenticator</code> implementation</li>
|
||||||
|
<li>That <code>startAuthentication</code> might instantly
|
||||||
|
authenticate the user somehow, or forward the request to some sort of
|
||||||
|
log-in page--the parameters of the original request are safely stored
|
||||||
|
and will be accessible after the log-in process has completed</li>
|
||||||
|
<li>If authentication is successful, the original request is
|
||||||
|
resumed from step 2. above.</li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
</ol>
|
||||||
|
<p>Please see the <code>SiteAuthenticator.java</code> source file for
|
||||||
|
information about each of the methods. The default implementation, <code>org.dspace.app.webui.SimpleAuthenticator</code>,
|
||||||
|
is a simple implementation that implements these policies:</p>
|
||||||
|
<ul>
|
||||||
|
<li>
|
||||||
|
<p>Use of inbuilt e-mail address/password-based log-in. This is
|
||||||
|
achieved by forwarding a request that is attempting an action requiring
|
||||||
|
authorization to the password log-in servlet, <code>/password-login</code>.
|
||||||
|
The password log-in servlet (<code>org.dspace.app.webui.servlet.PasswordServlet</code>)
|
||||||
|
contains code that will resume the original request if authentication
|
||||||
|
is successful, as per step 3. described above.</p>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Users can register themselves (i.e. add themselves as e-people
|
||||||
|
without needing approval from the administrators), and can set their
|
||||||
|
own passwords when they do this.</p>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Users are not members of any special (dynamic) e-person groups</p>
|
||||||
|
</li>
|
||||||
|
</ul>
|
||||||
|
<p>Included in the source is the implementation of <code>SiteAuthenticator</code>
|
||||||
|
used at MIT, <code>edu.mit.dspace.MITAuthenticator</code>. This
|
||||||
|
implements a slightly more complex authentication mechanism:</p>
|
||||||
|
<ul>
|
||||||
|
<li>
|
||||||
|
<p>If the user being authenticated is an MIT user, they must log in using
|
||||||
|
an X509 Web certificate. The <code>certificate-login</code> servlet,
|
||||||
|
similar to the <code>password-login</code> servlet, authenticates
|
||||||
|
users via these certificates, and if successful, resumes the original
|
||||||
|
request just as the password log-in servlet would.</p>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>MIT users are also automatically added to the special (dynamic)
|
||||||
|
group called 'MIT Users' (which must be present in the system!). This
|
||||||
|
allows us to create authorization policies for MIT users without having
|
||||||
|
to manually maintain membership of the MIT users group.</p>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Anyone can register themselves, but MIT users doing this cannot
|
||||||
|
set a password; they must use their X509 Web certificate to log in.</p>
|
||||||
|
</li>
|
||||||
|
</ul>
|
||||||
|
<p>The X509 certificate login servlet has an extra feature: If the <code>webui.cert.autoregister</code>
|
||||||
|
configuration property is <code>true</code>, it will automatically
|
||||||
|
register the user with the system.</p>
|
||||||
|
<p>You could create a customized version of the password login servlet
|
||||||
|
to perform a similar action. For example, if your organization uses
|
||||||
|
Windows NT domain authentication, you could implement a version of
|
||||||
|
PasswordServlet.java that validates against Windows NT authentication,
|
||||||
|
and automatically adds an e-person record for new users. It is strongly
|
||||||
|
recommended that you do not edit PasswordServlet but create a new
|
||||||
|
servlet for this, so that future updates of the DSpace code do not
|
||||||
|
overwrite your changes. You would also have to implement a customized <code>SiteAuthenticator</code>
|
||||||
|
in which the <code>startAuthentication</code> method would forward
|
||||||
|
requests to your new servlet.</p>
|
||||||
|
<h2><a name="webuithumbs">Displaying Image Thumbnails</a></h2>
|
||||||
|
<h3>Browse and Search Results Page Thumbnails</h3>
|
||||||
|
<p>Image thumbnails can be enabled on the Browse and Search Results
|
||||||
|
pages by setting the appropriate configuration values. To enable the
|
||||||
|
display of thumbnails the following items must be set in the <code>dspace.cfg</code>
|
||||||
|
file:</p>
|
||||||
|
<p><code>webui.browse.thumbnail.show = true</code></p>
|
||||||
|
<p>If set to <code>false</code> or this configuration item is missing
|
||||||
|
then thumbnails will not be shown.</p>
|
||||||
|
<p>The size of the browse/search thumbnails can also be configured to a
|
||||||
|
smaller size than that generated by the mediafilter. To do this set the
|
||||||
|
following configuration items:</p>
|
||||||
|
<p><code>webui.browse.thumbnail.maxheight = <maxheight in pixels></code></p>
|
||||||
|
<p><code>webui.browse.thumbnail.maxwidth = <maxwidth in pixels></code></p>
|
||||||
|
<p>If these configuration items are not set, <code>thumbnail.maxheight</code>
|
||||||
|
and <code>thumbnail.maxwidth</code> are used. Setting these values
|
||||||
|
greater than or equal to the size of the thumbnail generated by the
|
||||||
|
mediafilter (i.e. <code>thumbnail.maxheight</code> and <code>thumbnail.maxwidth</code>)
|
||||||
|
will have no effect.</p>
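<p>The fallback and capping rules just described can be sketched as below. This is not the DSpace implementation, only a minimal illustration; the class and method names are hypothetical, and it assumes the mediafilter limits (<code>thumbnail.maxheight</code>, <code>thumbnail.maxwidth</code>) are always present in the configuration.</p>

```java
import java.util.Properties;

// Hypothetical helper illustrating the rules in the text; not DSpace code.
final class BrowseThumbnailSize {

    // dimension is "maxheight" or "maxwidth".
    static int effective(Properties cfg, String dimension) {
        // Assumption: the mediafilter limit is always configured.
        int filterMax = Integer.parseInt(cfg.getProperty("thumbnail." + dimension).trim());
        String browse = cfg.getProperty("webui.browse.thumbnail." + dimension);
        if (browse == null) {
            return filterMax; // not set: fall back to the mediafilter size
        }
        // Values at or above the generated size have no effect.
        return Math.min(Integer.parseInt(browse.trim()), filterMax);
    }

    public static void main(String[] args) {
        Properties cfg = new Properties();
        cfg.setProperty("thumbnail.maxheight", "80");
        cfg.setProperty("webui.browse.thumbnail.maxheight", "50");
        System.out.println(effective(cfg, "maxheight"));
    }
}
```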
|
||||||
|
<p><strong>Note:</strong></p>
|
||||||
|
<ul>
|
||||||
|
<li>where the primary bitstream is HTML, no thumbnail is shown;</li>
|
||||||
|
<li>where the primary bitstream has a thumbnail, its thumbnail is
|
||||||
|
shown;</li>
|
||||||
|
<li>where the primary bitstream is not set, the first thumbnail found
|
||||||
|
by DSpace will be shown;</li>
|
||||||
|
<li>where the user does not have read access to the thumbnail
|
||||||
|
bitstream, no thumbnail is shown;</li>
|
||||||
|
<li>currently, for a thumbnail to display, a JPEG thumbnail under the
|
||||||
|
current implementation rules must exist (i.e. primary bitstream name
|
||||||
|
with ".jpg" suffix).</li>
|
||||||
|
</ul>
|
||||||
|
<h3>Configuring Thumbnail Link Behaviour</h3>
|
||||||
|
<p>The target of a thumbnail in the Browse and Search Results Page can
|
||||||
|
be configured by setting the following configuration item:</p>
|
||||||
|
<p><code>webui.browse.thumbnail.linkbehaviour = <target page type></code></p>
|
||||||
|
<p>Currently the values <code>item</code> and <code>bitstream</code>
|
||||||
|
are allowed. If this configuration item is not set, or set incorrectly,
|
||||||
|
the default is <code>item</code>.</p>
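<p>In other words, any missing or unrecognised value behaves like <code>item</code>. A minimal sketch of that defaulting rule follows; the class and method names are hypothetical, and whether DSpace itself trims whitespace or compares case-sensitively is an assumption here.</p>

```java
// Hypothetical helper illustrating the defaulting rule; not DSpace code.
final class ThumbnailLink {

    // Returns "bitstream" only when explicitly configured; otherwise "item".
    static String linkBehaviour(String configured) {
        if (configured != null && configured.trim().equals("bitstream")) {
            return "bitstream";
        }
        return "item"; // unset or invalid values fall back to the default
    }

    public static void main(String[] args) {
        System.out.println(linkBehaviour("bitstream"));
        System.out.println(linkBehaviour("page"));
        System.out.println(linkBehaviour(null));
    }
}
```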
|
||||||
|
<h3>Item Display Page Thumbnails</h3>
|
||||||
|
<p>Thumbnails may also be enabled or disabled on the Item Display page
|
||||||
|
by setting the following configuration item in <code>dspace.cfg</code>:</p>
|
||||||
|
<p><code>webui.item.thumbnail.show = true</code></p>
|
||||||
|
<p>If set to <code>false</code> or this configuration item is missing
|
||||||
|
then thumbnails will not be shown.</p>
|
||||||
|
<h2><a name="strengths">Displaying Community and Collection Item Counts</a></h2>
|
||||||
|
<p>To show the item count against communities and collections set the <code>webui.strengths.show</code>
|
||||||
|
configuration item in the <code>dspace.cfg</code> file as follows:</p>
|
||||||
|
<p><code>webui.strengths.show = true</code></p>
|
||||||
|
<p>If this config item is missing or is set to any value other than <code>true</code>
|
||||||
|
the item counts will not be shown.</p>
|
||||||
|
|
||||||
|
<h2><a name="i18n">Internationalisation</a></h2>
|
||||||
|
<p>
|
||||||
|
The <a class="external" href="http://jakarta.apache.org/taglibs/doc/standard-1.0-doc/intro.html">Java Standard Tag Library v1.0</a> is used to specify messages in the JSPs like this:
|
||||||
|
|
||||||
|
</p>
|
||||||
|
<p>
|
||||||
|
OLD:
|
||||||
|
<pre><H1>Search Results</H1>
|
||||||
|
</pre>
|
||||||
|
</p>
|
||||||
|
<p>
|
||||||
|
NEW:
|
||||||
|
<pre><H1><fmt:message key="jsp.search.results.title" /></H1>
|
||||||
|
</pre>
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
This message can now be changed using the <tt>config/Messages_en.properties</tt> file. (This must be done at build-time: <tt>Messages_en.properties</tt> is placed in the <tt>dspace.war</tt> Web application file.)
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<pre>jsp.search.results.title = Search Results
|
||||||
|
</pre><p>
|
||||||
|
Phrases may have parameters to be passed in, to make the job of translating easier, reduce the number of 'keys' and to allow translators to make the translated text flow more appropriately for the target language.
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
OLD:
|
||||||
|
<pre><P>Results <%= r.getFirst() %> to <%= r.getLast() %> of <%= r.getTotal() %></P>
|
||||||
|
</pre>
|
||||||
|
</p>
|
||||||
|
<p>
|
||||||
|
NEW:
|
||||||
|
|
||||||
|
<pre><fmt:message key="jsp.search.results.text">
|
||||||
|
<fmt:param><%= r.getFirst() %></fmt:param>
|
||||||
|
<fmt:param><%= r.getLast() %></fmt:param>
|
||||||
|
<fmt:param><%= r.getTotal() %></fmt:param>
|
||||||
|
|
||||||
|
</fmt:message>
|
||||||
|
</pre>
|
||||||
|
</p>
|
||||||
|
<p>
|
||||||
|
(Note: JSTL 1.0 does not seem to allow JSP <%= %> expressions to be passed in as values of attribute in <fmt:param value=""/>)
|
||||||
|
</p>
|
||||||
|
<p>
|
||||||
|
The above would appear in the <tt>Messages_xx.properties</tt> file as:
|
||||||
|
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<pre>jsp.search.results.text = Results {0}-{1} of {2}
|
||||||
|
</pre><p>
|
||||||
|
Introducing number parameters that should be formatted according to the locale used makes no difference in the message key compared to string parameters:
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<pre>jsp.submit.show-uploaded-file.size-in-bytes = {0} bytes
|
||||||
|
</pre><p>
|
||||||
|
In the JSP, this key can be used as shown below:
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<pre><fmt:message key="jsp.submit.show-uploaded-file.size-in-bytes">
|
||||||
|
<fmt:param><fmt:formatNumber><%= bitstream.getSize() %></fmt:formatNumber></fmt:param>
|
||||||
|
|
||||||
|
</fmt:message>
|
||||||
|
</pre><p>
|
||||||
|
(Note: JSTL offers a way to include numbers in the message keys as <tt>jsp.foo.key = {0,number} bytes</tt>. Setting the parameter as <tt><fmt:param value="${variable}" /></tt> works when <tt>variable</tt> is a single variable name, but does not work when trying to use a method's return value instead: <tt>bitstream.getSize()</tt>. Passing the number as a string (or using the <%= %> expression) also does not work.)
|
||||||
|
|
||||||
|
</p>
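<p>The <tt>{0}</tt>, <tt>{1}</tt> placeholders follow the <tt>java.text.MessageFormat</tt> pattern syntax, which, as far as we are aware, is what JSTL uses for parameter substitution. The stand-alone sketch below shows how such patterns behave outside a JSP, including locale-dependent number formatting. The class name and sample values are made up for the example.</p>

```java
import java.text.MessageFormat;
import java.util.Locale;

// Stand-alone illustration of MessageFormat patterns; not DSpace code.
final class MessageParams {

    // Formats a message pattern with the given parameters for one locale.
    static String format(String pattern, Locale locale, Object... params) {
        return new MessageFormat(pattern, locale).format(params);
    }

    public static void main(String[] args) {
        // Positional parameters, as in jsp.search.results.text
        System.out.println(format("Results {0}-{1} of {2}", Locale.US, 1, 10, 57));
        // A numeric parameter is grouped according to the locale
        System.out.println(format("{0,number} bytes", Locale.US, 1234567));
        System.out.println(format("{0,number} bytes", Locale.GERMANY, 1234567));
    }
}
```

<p>Note that the parameters here are real <tt>Number</tt> objects. A number passed as a string is substituted as-is, which is consistent with the note above about string values not being formatted.</p>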
|
||||||
|
<p>
|
||||||
|
Multiple <tt>Messages.properties</tt> files can be created for different languages. See <a class="external" href="http://java.sun.com/j2se/1.4.2/docs/api/java/util/ResourceBundle.html#getBundle(java.lang.String,%20java.util.Locale,%20java.lang.ClassLoader)">ResourceBundle.getBundle</a>. For example, you can add German and Canadian French translations:
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<pre>Messages_de.properties
|
||||||
|
Messages_fr_CA.properties
|
||||||
|
</pre><p>
|
||||||
|
The end user's browser settings determine which language is used. <tt>Messages_en.properties</tt> (or the default server locale) will be used as a default if there's no language bundle for the end user's preferred language.
|
||||||
|
</p>
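<p>To see which file names the standard Java lookup considers for a given browser locale, the following sketch lists the candidates in order. It uses only <tt>java.util.ResourceBundle.Control</tt>; the class name is hypothetical, and the sketch covers only the candidates for the requested locale, not the subsequent fallback to the server's default locale.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;

// Illustration of the standard ResourceBundle candidate order; not DSpace code.
final class BundleCandidates {

    // Lists the properties file names tried for one locale, most specific first.
    static List<String> candidates(String baseName, Locale locale) {
        ResourceBundle.Control control =
            ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_PROPERTIES);
        List<String> names = new ArrayList<>();
        for (Locale candidate : control.getCandidateLocales(baseName, locale)) {
            names.add(control.toBundleName(baseName, candidate) + ".properties");
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(candidates("Messages", Locale.CANADA_FRENCH));
        System.out.println(candidates("Messages", Locale.GERMAN));
    }
}
```

<p>For a Canadian French browser this lists <tt>Messages_fr_CA.properties</tt>, then <tt>Messages_fr.properties</tt>, then <tt>Messages.properties</tt>, so a plain <tt>Messages_fr.properties</tt> would also serve Canadian French users if no more specific file exists.</p>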
|
||||||
|
|
||||||
|
<p>
|
||||||
|
The <tt>dspace:layout</tt> tag has been updated to allow dictionary keys to be passed in for the titles. It now has two new parameters: <tt>titlekey</tt> and <tt>parenttitlekey</tt>. So where before you'd do:
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<pre><dspace:layout title="Here"
|
||||||
|
parentlink="/mydspace"
|
||||||
|
parenttitle="My DSpace">
|
||||||
|
</pre><p>
|
||||||
|
You now do:
|
||||||
|
<pre><dspace:layout titlekey="jsp.page.title"
|
||||||
|
parentlink="/mydspace"
|
||||||
|
parenttitlekey="jsp.mydspace">
|
||||||
|
|
||||||
|
</pre>The layout tag itself then retrieves the relevant text from the dictionary. <tt>title</tt> and <tt>parenttitle</tt> still work as before for backwards compatibility, and for the odd spot where that's preferable.
|
||||||
|
</p>
|
||||||
|
<h3>Message Key Convention</h3>
|
||||||
|
<p>
|
||||||
|
When translating further pages, please follow the convention for naming message keys to avoid clashes.
|
||||||
|
</p>
|
||||||
|
<p>
|
||||||
|
<strong>For text in JSPs</strong> use the complete path + filename of the JSP, then a one-word name for the message. e.g. for the title of <tt>jsp/mydspace/main.jsp</tt> use:
|
||||||
|
|
||||||
|
<pre>jsp.mydspace.main.title
|
||||||
|
</pre>
|
||||||
|
</p>
|
||||||
|
<p>
|
||||||
|
Some common words (e.g. "Help") can be brought out into keys starting <tt>jsp.</tt> for ease of translation, e.g.:
|
||||||
|
<pre>jsp.admin = Administer
|
||||||
|
</pre>
|
||||||
|
</p>
|
||||||
|
<p>
|
||||||
|
Other common words/phrases are brought out into 'general' parameters if they relate to a set (directory) of JSPs, e.g.
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<pre>jsp.tools.general.delete = Delete
|
||||||
|
</pre><p>
|
||||||
|
|
||||||
|
Phrases that relate <strong>strongly</strong> to a topic (e.g. MyDSpace) but are used in many JSPs outside that topic's directory are more conveniently cross-referenced. For example, one could use the key below in <tt>jsp/submit/saved.jsp</tt> to provide a link back to the user's <em>MyDSpace</em>:
|
||||||
|
</p>
|
||||||
|
<p>
|
||||||
|
<em>(Cross-referencing of keys <strong>in general</strong> is not a good idea as it may make maintenance more difficult. But in some cases it has more advantages as the meaning is obvious.)</em>
|
||||||
|
<pre>jsp.mydspace.general.goto-mydspace = Go to My DSpace
|
||||||
|
|
||||||
|
</pre>
|
||||||
|
</p>
|
||||||
|
<p>
|
||||||
|
<strong>For text in servlet code</strong>, in custom JSP tags or wherever applicable use the fully qualified classname + a one-word name for the message. e.g.
|
||||||
|
<pre>org.dspace.app.webui.jsptag.ItemListTag.title = Title
|
||||||
|
org.dspace.app.webui.servlet.CommunityListServlet.
|
||||||
|
</pre>
|
||||||
|
</p>
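<p>The two main naming rules above (JSP path plus a one-word name, and fully qualified class name plus a one-word name) are mechanical enough to sketch in code. The helper below is hypothetical and exists only to make the convention concrete; DSpace does not ship such a class.</p>

```java
// Hypothetical helper illustrating the key naming convention; not DSpace code.
final class MessageKeys {

    // "jsp/mydspace/main.jsp" + "title" -> "jsp.mydspace.main.title"
    static String forJsp(String jspPath, String name) {
        String path = jspPath.endsWith(".jsp")
            ? jspPath.substring(0, jspPath.length() - ".jsp".length())
            : jspPath;
        return path.replace('/', '.') + "." + name;
    }

    // Fully qualified class name + "." + one-word name
    static String forClass(Class<?> type, String name) {
        return type.getName() + "." + name;
    }

    public static void main(String[] args) {
        System.out.println(forJsp("jsp/mydspace/main.jsp", "title"));
        System.out.println(forClass(MessageKeys.class, "title"));
    }
}
```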
|
||||||
|
<h3>Which Languages are currently supported?</h3>
|
||||||
|
<p>To view translations currently being developed, please refer to the <a href="http://wiki.dspace.org/I18nSupport">i18n page</a> of the DSpace Wiki.</p>
|
||||||
|
|
||||||
|
<h2><a name="help">On-line Help About File Formats</a></h2>
|
||||||
|
<p>Because the file format support policy is determined by each
|
||||||
|
individual institution, the on-line help on this subject is
|
||||||
|
intentionally left blank. The help file will, however, retrieve a list
|
||||||
|
of formats and the support levels associated with them in your database
|
||||||
|
and display this information to the user. We highly recommend that you
|
||||||
|
edit the "Format Support Policy" section of the file <code><i>[dspace-source]</i>/jsp/help/formats.jsp</code>.</p>
|
||||||
|
<hr>
|
||||||
|
<address> Copyright © 2002-2004 MIT and Hewlett Packard </address>
|
||||||
|
</body>
|
||||||
|
</html>
|
135
dspace/docs/directories.html
Normal file
@@ -0,0 +1,135 @@
|
|||||||
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">
|
||||||
|
<HTML>
|
||||||
|
<HEAD>
|
||||||
|
<TITLE>DSpace System Documentation: Directories and Files</TITLE>
|
||||||
|
<LINK REL=StyleSheet HREF="style.css" TYPE="text/css">
|
||||||
|
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" >
|
||||||
|
</HEAD>
|
||||||
|
<BODY>
|
||||||
|
<H1>DSpace System Documentation: Directories and Files</H1>
|
||||||
|
|
||||||
|
<P><A HREF="index.html">Back to contents</A></P>
|
||||||
|
|
||||||
|
<H2><A NAME="overview">Overview</A></H2>
|
||||||
|
|
||||||
|
<p>A complete DSpace installation consists of three separate directory trees:</p>
|
||||||
|
|
||||||
|
<dl>
|
||||||
|
<dt>The source directory:</dt>
|
||||||
|
<dd><p>This is where (surprise!) the source code lives. Note that the config files here are used only during the initial install process. After the install, config files should be changed in the install directory. It is referred to in this document as <code><i>[dspace-source]</i></code>.</p></dd>
|
||||||
|
|
||||||
|
<dt>The install directory:</dt>
|
||||||
|
<dd><p>This directory is populated during the install process and also by DSpace as it runs. It contains config files, command-line tools (and the libraries necessary to run them), and usually--although not necessarily--the contents of the DSpace archive (depending on how DSpace is configured). After the initial build and install, changes to config files should be made in this directory. It is referred to in this document as <code><i>[dspace]</i></code>.</p></dd>
|
||||||
|
|
||||||
|
<dt>The web deployment directory: </dt>
|
||||||
|
<dd><p>This directory is generated by the web server the first time it finds a dspace.war file in its webapps directory. It contains the unpacked contents of dspace.war, i.e. the JSPs and java classes and libraries necessary to run DSpace. Files in this directory should never be edited directly; if you wish to modify your DSpace installation, you should edit files in the source directory and then rebuild. The contents of this directory aren't listed here since its creation is completely automatic. It is usually referred to in this document as <code><i>[tomcat]</i>/webapps/dspace</code>.</p></dd>
|
||||||
|
</dl>
|
||||||
|
|
||||||
|
<H2><A NAME="sourcedir">Source Directory Layout</A></H2>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li><code><i>[dspace-source]</i></code>
|
||||||
|
<ul>
|
||||||
|
<li><code>bin/</code> - Some shell scripts for running DSpace command-line tasks</li>
|
||||||
|
<li><code>build.xml</code> - build file for Ant</li>
|
||||||
|
<li><code>config/</code> - configuration files
|
||||||
|
<ul>
|
||||||
|
<li><code>dspace.cfg</code> - main DSpace configuration file</li>
|
||||||
|
<li><code>dc2mods.cfg</code> - Mappings from Dublin Core metadata to <A HREF="http://www.loc.gov/standards/mods/">MODS</A> for the METS export</li>
|
||||||
|
<li><code>default.license</code> - default license that users must grant when submitting items</li>
|
||||||
|
<li><code>mediafilter.cfg</code> - Media Filter configuration</li>
|
||||||
|
<li><code>news-side.html</code> - Text of the front-page news in the sidebar</li>
|
||||||
|
<li><code>news-top.html</code> - Text of the front-page news in the top box</li>
|
||||||
|
<li><code>emails/</code> - Texts of emails sent out by the system</li>
|
||||||
|
<li><code>registries/</code> - INITIAL contents of the bitstream format registry and Dublin Core element/qualifier registry. These are only used on initial system setup, after which they are maintained in the database.</li>
|
||||||
|
<li><code>templates/</code> - configuration files for libraries and external applications (e.g. Apache, Tomcat) are kept and edited here. They can refer to properties in the main DSpace configuration - have a look at a couple. When they're updated, a command line tool fills out these files with appropriate values from dspace.cfg, and copies them to their appropriate location (hence "templates".)</li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
<li><code>etc/</code> - miscellaneous files that are not really to do with system configuration - e.g. the database schema, and a couple of configuration files that are used during the build process but not by the live system. Also contains the deployment descriptors (<code>web.xml</code> files) for the Web UI and OAI-PMH support <code>.war</code> files.</li>
|
||||||
|
<li><code>jsp/</code> - The Web UI JSPs. As much as possible, these are simply HTML with little bits of Java - the business code resides in the servlets</li>
|
||||||
|
<li><code>lib/</code> - Library JARs used by the system
|
||||||
|
<ul>
|
||||||
|
<li><code>README</code> - Lists the packaged third-party libraries (JARs) and their use</li>
|
||||||
|
<li><code>licenses</code> - Contains the licenses associated with the JARs</li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
<li><code>src/</code> - DSpace system source code. For details on how this is laid out, see the overview page of the Javadoc.</li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
|
||||||
|
<H2><A NAME="installdir">Installed Directory Layout</A></H2>
|
||||||
|
|
||||||
|
<P>Below is the basic layout of a DSpace installation using the default configuration. These paths can be configured if necessary.</P>
|
||||||
|
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<li><code><i>[dspace]</i></code>
|
||||||
|
<UL>
|
||||||
|
<li><code>assetstore/</code> - asset store files</li>
|
||||||
|
<li><code>bin/</code> - shell scripts</li>
|
||||||
|
<li><code>config/</code> - configuration</li>
|
||||||
|
<li><code>handle-server/</code> - Handle server files</li>
|
||||||
|
<li><code>history/</code> - history files</li>
|
||||||
|
<li><code>lib/</code> - JARs, including dspace.jar, containing the DSpace classes</li>
|
||||||
|
<li><code>log/</code> - Log files</li>
|
||||||
|
<li><code>search/</code> - Lucene search index files</li>
|
||||||
|
</ul>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<H2><A NAME="logfiles">Log Files</A></H2>
|
||||||
|
|
||||||
|
<P>The first source of potential confusion is the log files. Since DSpace uses a number of third-party tools, problems can occur in a variety of places. Below is a table listing the main log files used in a typical DSpace setup. The locations given are defaults, and might be different for your system depending on where you installed DSpace and the third-party tools. The ordering of the list is roughly the recommended order for searching them for the details about a particular problem or error.</P>
|
||||||
|
|
||||||
|
<TABLE>
|
||||||
|
<CAPTION>DSpace Log File Locations</CAPTION>
|
||||||
|
<TR>
|
||||||
|
<TH>Log File</TH>
|
||||||
|
<TH>What's In It</TH>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><code><i>[dspace]</i>/log/dspace.log</code></TD>
|
||||||
|
<TD>Main DSpace log file. This is where the DSpace code writes a simple log of events and errors that occur within the DSpace code. You can control the verbosity of this by editing the <code><i>[dspace]</i>/config/templates/log4j.properties</code> file and then running <code><i>[dspace]</i>/bin/install-configs</code>.</TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><code><i>[tomcat]</i>/logs/catalina.out</code></TD>
|
||||||
|
<TD>This is where Tomcat's standard output is written. Many errors that occur within the Tomcat code are logged here. For example, if Tomcat can't find the DSpace code (<code>dspace.jar</code>), it would be logged in <code>catalina.out</code>.</TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><code><i>[tomcat]</i>/logs/hostname_log.yyyy-mm-dd.txt</code></TD>
|
||||||
|
<TD>If you're running Tomcat stand-alone (without Apache), it logs some information and errors for specific Web applications to this log file. <code>hostname</code> will be your host name (e.g. <code>dspace.myu.edu</code>) and <code>yyyy-mm-dd</code> will be the date.</TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><code><i>[tomcat]</i>/logs/apache_log.yyyy-mm-dd.txt</code></TD>
|
||||||
|
<TD>If you're using Apache, Tomcat logs information about Web applications running through Apache (<code>mod_webapp</code>) in this log file (<code>yyyy-mm-dd</code> being the date.)</TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><code><i>[apache]</i>/error_log</code></TD>
|
||||||
|
<TD>Apache logs to this file. If there is a problem with getting <code>mod_webapp</code> working, this is a good place to look for clues. Apache also writes to several other log files, though <code>error_log</code> tends to contain the most useful information for tracking down problems.</TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><code><i>[dspace]</i>/log/handle-plug.log</code></TD>
|
||||||
|
<TD>The Handle server runs as a separate process from the DSpace Web UI (which runs under Tomcat's JVM). Due to a limitation of log4j's 'rolling file appenders', the DSpace code running in the Handle server's JVM must use a separate log file. The DSpace code that is run as part of a Handle resolution request writes log information to this file. You can control the verbosity of this by editing <code><i>[dspace]</i>/config/templates/log4j-handle-plugin.properties</code>.</TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><code><i>[dspace]</i>/log/handle-server.log</code></TD>
|
||||||
|
<TD>This is the log file for CNRI's Handle server code. If a problem occurs within the Handle server code, before DSpace's plug-in is invoked, this is where it may be logged.</TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD><code><i>[dspace]</i>/handle-server/error.log</code></TD>
|
||||||
|
<TD>Alternatively, a problem within CNRI's Handle server code might be logged here rather than in <code>handle-server.log</code>.</TD>
|
||||||
|
</TR>
|
||||||
|
<TR>
|
||||||
|
<TD>PostgreSQL log</TD>
|
||||||
|
<TD>PostgreSQL also writes a log file. It doesn't seem to have a default location; you probably had to specify it yourself at some point during installation. In general, this log file rarely contains pertinent information. PostgreSQL is pretty stable, so you're more likely to encounter problems connecting via JDBC, and these problems will be logged in <code>dspace.log</code>.</TD>
|
||||||
|
</TR>
|
||||||
|
</TABLE>
|
||||||
|
|
||||||
|
<HR>
|
||||||
|
|
||||||
|
<ADDRESS>
|
||||||
|
Copyright © 2002-2004 MIT and Hewlett Packard
|
||||||
|
</ADDRESS>
|
||||||
|
</BODY>
|
||||||
|
</HTML>
|
765
dspace/docs/functional.html
Normal file
@@ -0,0 +1,765 @@
|
|||||||
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">
|
||||||
|
<html>
|
||||||
|
<head>
|
||||||
|
<title>DSpace System
|
||||||
|
Documentation: Functional Overview</title>
|
||||||
|
<link rel="StyleSheet" href="style.css" type="text/css">
|
||||||
|
<meta http-equiv="Content-Type"
|
||||||
|
content="text/html; charset=iso-8859-1">
|
||||||
|
</head>
|
||||||
|
<body>
|
||||||
|
<h1>DSpace
|
||||||
|
System Documentation:
|
||||||
|
Functional Overview</h1>
|
||||||
|
<p><a href="index.html">Back to contents</a></p>
|
||||||
|
<p>The following sections describe
|
||||||
|
the various functional aspects of the DSpace system.</p>
|
||||||
|
<h2><a name="data_model">Data Model</a></h2>
|
||||||
|
<p class="figure"><img src="image/data-model.gif"
|
||||||
|
alt="Data Model Diagram"></p>
|
||||||
|
<p class="caption">Data
|
||||||
|
Model Diagram</p>
|
||||||
|
<p>The way data is organized in
|
||||||
|
DSpace is intended to reflect the structure of the organization using
|
||||||
|
the DSpace system. Each DSpace site is divided into <em>communities</em>;
|
||||||
|
these typically correspond to a laboratory, research center or
|
||||||
|
department. As of DSpace version 1.2, these communities can be
|
||||||
|
organized into a hierarchy.</p>
|
||||||
|
<p>Communities contain <em>collections</em>,
|
||||||
|
which are groupings of related content. A collection may appear in more
|
||||||
|
than one community.</p>
|
||||||
|
<p>Each collection is composed of <em>items</em>,
|
||||||
|
which are the basic archival elements of the archive. Each item is
|
||||||
|
owned by one collection. Additionally, an item may appear in additional
|
||||||
|
collections; however every item has one and only one owning collection.</p>
|
||||||
|
<p>Items are further subdivided
|
||||||
|
into named <em>bundles</em>
|
||||||
|
of <em>bitstreams</em>.
|
||||||
|
Bitstreams are, as the name suggests, streams of bits, usually ordinary
|
||||||
|
computer files. Bitstreams that are somehow closely related, for
|
||||||
|
example HTML files and images that compose a single HTML document, are
|
||||||
|
organized into bundles.</p>
|
||||||
|
<p>In practice, most items tend to
|
||||||
|
have three named bundles:</p>
|
||||||
|
<ul>
|
||||||
|
<li><em>ORIGINAL</em>
|
||||||
|
-- the bundle with the original, deposited bitstreams</li>
|
||||||
|
<li><em>THUMBNAILS</em>
|
||||||
|
-- thumbnails of any image bitstreams</li>
|
||||||
|
<li><em>TEXT</em>
|
||||||
|
-- extracted full-text from bitstreams in ORIGINAL, for indexing</li>
|
||||||
|
</ul>
|
||||||
|
<p>Each bitstream is associated
|
||||||
|
with one <em>Bitstream Format</em>.
|
||||||
|
Because preservation services may be an important aspect of the DSpace
|
||||||
|
service, it is important to capture the specific formats of files that
|
||||||
|
users submit. In DSpace, a bitstream format is a unique and consistent
|
||||||
|
way to refer to a particular file format. An integral part of a
|
||||||
|
bitstream format is an either implicit or explicit notion of how
|
||||||
|
material in that format can be interpreted. For example, the
|
||||||
|
interpretation for bitstreams encoded in the JPEG standard for still
|
||||||
|
image compression is defined explicitly in the Standard ISO/IEC
|
||||||
|
10918-1. The interpretation of bitstreams in Microsoft Word 2000 format
|
||||||
|
is defined implicitly, through reference to the Microsoft Word 2000
|
||||||
|
application. Bitstream formats can be more specific than MIME types or
|
||||||
|
file suffixes. For example, <code>application/ms-word</code>
|
||||||
|
and <code>.doc</code>
|
||||||
|
span multiple versions of the Microsoft Word application, each of which
|
||||||
|
produces bitstreams with presumably different characteristics.</p>
|
||||||
|
<p>Each bitstream format
|
||||||
|
additionally has a <em>support
|
||||||
|
level</em>, indicating how well the
|
||||||
|
hosting institution is likely to be able to preserve content in the
|
||||||
|
format in the future. There are three possible support levels that
|
||||||
|
bitstream formats may be assigned by the hosting institution. The host
|
||||||
|
institution should determine the exact meaning of each support level,
|
||||||
|
after careful consideration of costs and requirements. MIT Libraries'
|
||||||
|
interpretation is shown below:</p>
|
||||||
|
<table>
|
||||||
|
<caption>MIT Libraries'
|
||||||
|
Definitions of Bitstream Format Support Levels</caption> <tbody>
|
||||||
|
<tr>
|
||||||
|
<td><strong>Supported</strong></td>
|
||||||
|
<td>The format is
|
||||||
|
recognized, and the hosting institution is confident it can make
|
||||||
|
bitstreams of this format useable in the future, using whatever
|
||||||
|
combination of techniques (such as migration, emulation, etc.) is
|
||||||
|
appropriate given the context of need.</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td><strong>Known</strong></td>
|
||||||
|
<td>The format is
|
||||||
|
recognized, and the hosting institution will promise to preserve the
|
||||||
|
bitstream as-is, and allow it to be retrieved. The hosting institution
|
||||||
|
will attempt to obtain enough information to enable the format to be
|
||||||
|
upgraded to the 'supported' level.</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td><strong>Unsupported</strong></td>
|
||||||
|
<td>The format is
|
||||||
|
unrecognized, but the hosting institution will undertake to preserve
|
||||||
|
the bitstream as-is and allow it to be retrieved.</td>
|
||||||
|
</tr>
|
||||||
|
</tbody>
|
||||||
|
</table>
|
||||||
|
<p>Each item has one qualified
|
||||||
|
Dublin Core metadata record. Other metadata might be stored in an item
|
||||||
|
as a serialized bitstream, but we store Dublin Core for every item for
|
||||||
|
interoperability and ease of discovery. The Dublin Core may be entered
|
||||||
|
by end-users as they submit content, or it might be derived from other
|
||||||
|
metadata as part of an ingest process.</p>
|
||||||
|
<p><a name="#deletions">Items can be removed from DSpace in
|
||||||
|
one of two ways:</a> They may be
|
||||||
|
'withdrawn', which means they remain in the archive but are completely
|
||||||
|
hidden from view. In this case, if an end-user attempts to access the
|
||||||
|
withdrawn item, they are presented with a 'tombstone' that indicates
|
||||||
|
the item has been removed. For whatever reason, an item may also be
|
||||||
|
'expunged' if necessary, in which case all traces of it are removed
|
||||||
|
from the archive.</p>
|
||||||
|
<table>
|
||||||
|
<caption>Objects in the DSpace
|
||||||
|
Data Model</caption> <tbody>
|
||||||
|
<tr>
|
||||||
|
<th>Object</th>
|
||||||
|
<th>Example</th>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>Community</td>
|
||||||
|
<td>Laboratory of Computer
|
||||||
|
Science; Oceanographic Research Center</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>Collection</td>
|
||||||
|
<td>LCS Technical Reports;
|
||||||
|
ORC Statistical Data Sets</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>Item</td>
|
||||||
|
<td>A technical report; a
|
||||||
|
data set with accompanying description; a video recording of a lecture</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>Bundle</td>
|
||||||
|
<td>A group of HTML and
|
||||||
|
image bitstreams making up an HTML document</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>Bitstream</td>
|
||||||
|
<td>A single HTML file; a
|
||||||
|
single image file; a source code file</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>Bitstream Format</td>
|
||||||
|
<td>Microsoft Word version
|
||||||
|
6.0; JPEG encoded image format</td>
|
||||||
|
</tr>
|
||||||
|
</tbody>
|
||||||
|
</table>
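<p>The containment rules described in this section can be sketched in a few lines of Java. This is only an illustrative model under simplifying assumptions: the class names <code>SketchCollection</code>, <code>SketchItem</code> and <code>DataModelSketch</code> are hypothetical and are not DSpace's actual classes, communities are omitted, and bitstreams are reduced to file names.</p>

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the containment rules; not DSpace's real API.
class SketchCollection {
    final String name;
    final List<SketchItem> items = new ArrayList<>();

    SketchCollection(String name) {
        this.name = name;
    }
}

class SketchItem {
    final String title;
    // Every item has one and only one owning collection.
    final SketchCollection owningCollection;
    // Named bundles, each holding the names of its bitstreams.
    final Map<String, List<String>> bundles = new LinkedHashMap<>();

    SketchItem(String title, SketchCollection owner) {
        this.title = title;
        this.owningCollection = owner;
        owner.items.add(this);
    }

    // Make the item appear in an additional collection; ownership is unchanged.
    void mapInto(SketchCollection other) {
        if (!other.items.contains(this)) {
            other.items.add(this);
        }
    }

    // Add a bitstream to the named bundle, creating the bundle if needed.
    void addBitstream(String bundleName, String fileName) {
        bundles.computeIfAbsent(bundleName, k -> new ArrayList<>()).add(fileName);
    }
}

class DataModelSketch {
    public static void main(String[] args) {
        SketchCollection reports = new SketchCollection("LCS Technical Reports");
        SketchCollection data = new SketchCollection("ORC Statistical Data Sets");

        SketchItem item = new SketchItem("A technical report", reports);
        item.mapInto(data);
        item.addBitstream("ORIGINAL", "report.html");
        item.addBitstream("ORIGINAL", "figure1.gif");
        item.addBitstream("THUMBNAILS", "figure1.gif.jpg");

        System.out.println("Owned by: " + item.owningCollection.name);
        System.out.println("Bundles: " + item.bundles);
    }
}
```

<p>In this sketch the owning collection is fixed when the item is created, while <code>mapInto</code> only adds a further appearance, mirroring the distinction between owning and additional collections.</p>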
|
||||||
|
<h2><a name="metadata">Metadata</a></h2>
|
||||||
|
<p>Broadly speaking, DSpace holds
|
||||||
|
three sorts of metadata about archived content:</p>
|
||||||
|
<dl>
|
||||||
|
<dt>Descriptive Metadata</dt>
|
||||||
|
<dd>
|
||||||
|
<p>Each <em>Item</em>
|
||||||
|
has one qualified Dublin Core metadata record. The <a
|
||||||
|
href="http://dspace.org/technology/metadata.html">set
|
||||||
|
of elements and qualifiers used by MIT Libraries</a>
|
||||||
|
is the default configuration included in the DSpace source code. These
|
||||||
|
are loosely based on the <a
|
||||||
|
href="http://www.dublincore.org/documents/library-application-profile/">Library
|
||||||
|
Application Profile</a> set of
|
||||||
|
elements and qualifiers, though there are some differences.</p>
|
||||||
|
<p>Other descriptive metadata
|
||||||
|
about items may be held in serialized bitstreams. <em>Communities</em>
|
||||||
|
and <em>collections</em>
|
||||||
|
have some simple descriptive metadata (a name, and some descriptive
|
||||||
|
prose), held in the DBMS.</p>
|
||||||
|
</dd>
|
||||||
|
<dt>Administrative Metadata</dt>
|
||||||
|
<dd>
|
||||||
|
<p>This includes preservation
|
||||||
|
metadata, provenance and authorization policy data. Most of this is
|
||||||
|
held within DSpace's relational DBMS schema. Provenance metadata (prose)
|
||||||
|
is stored in Dublin Core records. Additionally, some other
|
||||||
|
administrative metadata (for example, bitstream byte sizes and MIME
|
||||||
|
types) is replicated in Dublin Core records so that it is easily
|
||||||
|
accessible outside of DSpace.</p>
|
||||||
|
</dd>
|
||||||
|
<dt>Structural Metadata</dt>
|
||||||
|
<dd>
|
||||||
|
<p>This includes information
|
||||||
|
about how to present an item, or bitstreams within an item, to an
|
||||||
|
end-user, and the relationships between constituent parts of the item.
|
||||||
|
As an example, consider a thesis consisting of a number of TIFF images,
|
||||||
|
each depicting a single page of the thesis. Structural metadata would
|
||||||
|
include the fact that each image is a single page, and the ordering of
|
||||||
|
the TIFF images/pages. Structural metadata in DSpace is currently
|
||||||
|
fairly basic; within an item, bitstreams can be arranged into separate
|
||||||
|
bundles as <a href="#data_model">described
|
||||||
|
above</a>. A bundle may also
|
||||||
|
optionally have a <em>primary
|
||||||
|
bitstream</em>. This is currently
|
||||||
|
used by the <a href="#html">HTML
|
||||||
|
support</a> to indicate which
|
||||||
|
bitstream in the bundle is the first HTML file to send to a browser.</p>
|
||||||
|
<p>In addition to some basic
|
||||||
|
technical metadata, each bitstream also has a 'sequence ID' that uniquely
|
||||||
|
identifies it within an item. This is used to produce a <a
|
||||||
|
href="#bitstream_ids">'persistent' bitstream
|
||||||
|
identifier</a> for each bitstream.</p>
|
||||||
|
<p>Additional structural
|
||||||
|
metadata can be stored in serialized bitstreams, but DSpace does not
|
||||||
|
currently understand this natively.</p>
|
||||||
|
</dd>
|
||||||
|
</dl>
|
||||||
|
<h2><a name="epeople">E-people</a></h2>
|
||||||
|
<p>Many of DSpace's features such
|
||||||
|
as document discovery and retrieval can be used anonymously, but users
|
||||||
|
must be authenticated to perform functions such as submission, email
|
||||||
|
notification ('subscriptions') or administration. Users are also
|
||||||
|
grouped for easier administration. DSpace calls users <em>e-people</em>,
|
||||||
|
to reflect that some users may be machines rather than actual people.</p>
|
||||||
|
<p>DSpace holds the following
|
||||||
|
information about each e-person:</p>
|
||||||
|
<ul>
|
||||||
|
<li>E-mail address</li>
|
||||||
|
<li>First and last names</li>
|
||||||
|
<li>Whether the user is able to
|
||||||
|
log in to the system via the Web UI, and whether they must use an X509
|
||||||
|
certificate to do so</li>
|
||||||
|
<li>A password (encrypted), if
|
||||||
|
appropriate</li>
|
||||||
|
<li>A list of collections for
|
||||||
|
which the e-person wishes to be notified of new items</li>
|
||||||
|
<li>Whether the e-person
|
||||||
|
'self-registered' with the system; that is, whether the system created
|
||||||
|
the e-person record automatically as a result of the end-user
|
||||||
|
independently registering with the system, as opposed to the e-person
|
||||||
|
record being generated from the institution's personnel database, for
|
||||||
|
example.</li>
|
||||||
|
</ul>
|
||||||
|
<p>E-people authenticate with
|
||||||
|
username/password pairs or X509 certificates. E-people can be members
|
||||||
|
of 'groups' to make administrators' lives easier when manipulating
|
||||||
|
authorization policies.</p>
|
||||||
|
<h2><a name="auth">Authorization</a></h2>
|
||||||
|
<p>DSpace's authorization system
|
||||||
|
is based on associating actions with objects and the lists of EPeople
|
||||||
|
who can perform them. The associations are called Resource Policies,
|
||||||
|
and the lists of EPeople are called Groups. There are two special
|
||||||
|
groups: 'administrators', who can do anything in a site, and
|
||||||
|
'anonymous', which is a list that contains all users. Assigning a
|
||||||
|
policy for an action on an object to anonymous means giving everyone
|
||||||
|
permission to do that action. (For example, most objects in DSpace
|
||||||
|
sites have a policy of 'anonymous' READ.) Permissions must be explicit
|
||||||
|
- lack of an explicit permission results in the default policy of
|
||||||
|
'deny'. Permissions also do not 'commute' (carry over to contained objects); for example, if an e-person
|
||||||
|
has READ permission on an item, they might not necessarily have READ
|
||||||
|
permission on the bundles and bitstreams in that item. Currently
|
||||||
|
Collections, Communities and Items are discoverable in the browse and
|
||||||
|
search systems regardless of READ authorization.</p>
|
||||||
|
<p>The following actions are
|
||||||
|
possible:</p>
|
||||||
|
<p><strong>Community</strong></p>
|
||||||
|
<table>
|
||||||
|
<tbody>
|
||||||
|
<tr>
|
||||||
|
<td>ADD/REMOVE</td>
|
||||||
|
<td>add or remove
|
||||||
|
collections or sub-communities</td>
|
||||||
|
</tr>
|
||||||
|
</tbody>
|
||||||
|
</table>
|
||||||
|
<p><strong>Collection</strong></p>
|
||||||
|
<table>
|
||||||
|
<tbody>
|
||||||
|
<tr>
|
||||||
|
<td>ADD/REMOVE</td>
|
||||||
|
<td>add or remove items (ADD
|
||||||
|
= permission to submit items)</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>DEFAULT_ITEM_READ</td>
|
||||||
|
<td>inherited as READ by all
|
||||||
|
submitted items</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>DEFAULT_BITSTREAM_READ</td>
|
||||||
|
<td>inherited as READ by
|
||||||
|
bitstreams of all submitted items</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>COLLECTION_ADMIN</td>
|
||||||
|
<td>collection admins can
|
||||||
|
edit items in a collection, withdraw items, map other items into this
|
||||||
|
collection.</td>
|
||||||
|
</tr>
|
||||||
|
</tbody>
|
||||||
|
</table>
|
||||||
|
<p><strong>Item</strong></p>
|
||||||
|
<table>
|
||||||
|
<tbody>
|
||||||
|
<tr>
|
||||||
|
<td>ADD/REMOVE</td>
|
||||||
|
<td>add or remove bundles</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>READ</td>
|
||||||
|
<td>can view item (item
|
||||||
|
metadata is always viewable)</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>WRITE</td>
|
||||||
|
<td>can modify item</td>
|
||||||
|
</tr>
|
||||||
|
</tbody>
|
||||||
|
</table>
|
||||||
|
<p><strong>Bundle</strong></p>
|
||||||
|
<table>
|
||||||
|
<tbody>
|
||||||
|
<tr>
|
||||||
|
<td>ADD/REMOVE</td>
|
||||||
|
<td>add or remove bitstreams
|
||||||
|
to or from a bundle</td>
|
||||||
|
</tr>
|
||||||
|
</tbody>
|
||||||
|
</table>
|
||||||
|
<p><strong>Bitstream</strong></p>
|
||||||
|
<table>
|
||||||
|
<tbody>
|
||||||
|
<tr>
|
||||||
|
<td>READ</td>
|
||||||
|
<td>view bitstream</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>WRITE</td>
|
||||||
|
<td>modify bitstream</td>
|
||||||
|
</tr>
|
||||||
|
</tbody>
|
||||||
|
</table>
|
||||||
|
<p>Note that there is no 'DELETE'
|
||||||
|
action. In order to 'delete' an object (e.g. an item) from the archive,
|
||||||
|
one must have REMOVE permission on all objects (in this case,
|
||||||
|
collection) that contain it. The 'orphaned' item is automatically
|
||||||
|
deleted.</p>
|
||||||
|
<p>Policies can apply to
|
||||||
|
individual e-people or groups of e-people.</p>
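<p>The policy rules in this section (explicit permissions, a default of 'deny', the two special groups, and no carry-over from an item to its bitstreams) can be illustrated with a short Java sketch. It is a simplified model, not DSpace's authorization code: the class <code>AuthSketch</code>, its methods, and the string object identifiers such as <code>item/1</code> are all hypothetical, and policies here are attached to groups only.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: class and method names are hypothetical,
// not DSpace's actual authorization API.
class AuthSketch {
    static final String ANONYMOUS = "anonymous";
    static final String ADMINISTRATORS = "administrators";

    // A resource policy associates an action on one object with a group.
    static final class Policy {
        final String objectId;
        final String action;
        final String group;

        Policy(String objectId, String action, String group) {
            this.objectId = objectId;
            this.action = action;
            this.group = group;
        }
    }

    // Group name mapped to the e-mail addresses of its members.
    private final Map<String, Set<String>> groups = new HashMap<>();
    private final List<Policy> policies = new ArrayList<>();

    void addMember(String group, String email) {
        groups.computeIfAbsent(group, g -> new HashSet<>()).add(email);
    }

    void addPolicy(String objectId, String action, String group) {
        policies.add(new Policy(objectId, action, group));
    }

    private boolean inGroup(String group, String email) {
        if (ANONYMOUS.equals(group)) {
            return true; // 'anonymous' contains all users
        }
        Set<String> members = groups.get(group);
        return members != null && members.contains(email);
    }

    boolean isAuthorized(String email, String objectId, String action) {
        if (inGroup(ADMINISTRATORS, email)) {
            return true; // administrators can do anything in a site
        }
        for (Policy p : policies) {
            if (p.objectId.equals(objectId) && p.action.equals(action)
                    && inGroup(p.group, email)) {
                return true;
            }
        }
        return false; // no explicit permission: the default policy is 'deny'
    }

    public static void main(String[] args) {
        AuthSketch auth = new AuthSketch();
        auth.addMember(ADMINISTRATORS, "admin@myu.edu");
        auth.addPolicy("item/1", "READ", ANONYMOUS);

        // READ on the item does not imply READ on a bitstream inside it.
        System.out.println(auth.isAuthorized("reader@myu.edu", "item/1", "READ"));
        System.out.println(auth.isAuthorized("reader@myu.edu", "bitstream/7", "READ"));
        System.out.println(auth.isAuthorized("admin@myu.edu", "bitstream/7", "WRITE"));
    }
}
```

<p>Because each object is checked only against its own policies, granting READ on an item in this sketch says nothing about its bundles or bitstreams, which is the behaviour described above.</p>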
|
||||||
|
<h2><a name="ingest">Ingest Process and Workflow</a></h2>
|
||||||
|
<p>Rather than being a single
|
||||||
|
subsystem, ingesting is a process that spans several. Below is a simple
|
||||||
|
illustration of the current ingesting process in DSpace.</p>
|
||||||
|
<p class="figure"><img src="image/ingest.gif"
|
||||||
|
alt="Ingest Process Diagram"></p>
|
||||||
|
<p class="caption">DSpace
|
||||||
|
Ingest Process</p>
|
||||||
|
<p>The batch item importer is an
|
||||||
|
application, which turns an external SIP (an XML metadata document with
|
||||||
|
some content files) into an "in progress submission" object. The Web
|
||||||
|
submission UI is similarly used by an end-user to assemble an "in
|
||||||
|
progress submission" object.</p>
|
||||||
|
<p>Depending on the policy of the
|
||||||
|
collection to which the submission is targeted, a workflow process may
|
||||||
|
be started. This typically allows one or more human reviewers or
|
||||||
|
'gatekeepers' to check over the submission and ensure it is suitable
|
||||||
|
for inclusion in the collection.</p>
|
||||||
|
<p>When the Batch Ingester or Web
|
||||||
|
Submit UI completes the InProgressSubmission object, and invokes the
|
||||||
|
next stage of ingest (be that workflow or item installation), a
|
||||||
|
provenance message is added to the Dublin Core which includes the
|
||||||
|
filenames and checksums of the content of the submission. Likewise,
|
||||||
|
each time a workflow changes state (e.g. a reviewer accepts the
|
||||||
|
submission), a similar provenance statement is added. This allows us to
|
||||||
|
track how the item has changed since a user submitted it. (The <a
|
||||||
|
href="#history">History system</a>
|
||||||
|
is also invoked, but provenance is easier for us to access at the
|
||||||
|
moment.)</p>
|
||||||
|
<p>Once any workflow process is
|
||||||
|
successfully and positively completed, the InProgressSubmission object
|
||||||
|
is consumed by an "item installer", that converts the
|
||||||
|
InProgressSubmission into a fully blown archived item in DSpace. The
|
||||||
|
item installer:</p>
|
||||||
|
<ul>
|
||||||
|
<li>Assigns an accession date</li>
|
||||||
|
<li>Adds a "date.available"
|
||||||
|
value to the Dublin Core metadata record of the item</li>
|
||||||
|
<li>Adds an issue date if none
|
||||||
|
already present</li>
|
||||||
|
<li>Adds a provenance message
|
||||||
|
(including bitstream checksums)</li>
|
||||||
|
<li>Assigns a <a href="#handles">Handle</a>
|
||||||
|
persistent identifier</li>
|
||||||
|
<li>Adds the item to the target
|
||||||
|
collection, and adds appropriate authorization policies</li>
|
||||||
|
<li>Adds the new item to the
|
||||||
|
search and browse indices</li>
|
||||||
|
</ul>
|
||||||
|
<h3>Workflow Steps</h3>
|
||||||
|
<p>A collection's workflow can
|
||||||
|
have up to three steps. Each collection may have an associated e-person
|
||||||
|
group for performing each step; if no group is associated with a
|
||||||
|
certain step, that step is skipped. If a collection has no e-person
|
||||||
|
groups associated with any step, submissions to that collection are
|
||||||
|
installed straight into the main archive.</p>
|
||||||
|
<p>In other words, the sequence is
|
||||||
|
this: The collection receives a submission. If the collection has a
|
||||||
|
group assigned for workflow step 1, that step is invoked, and the group
|
||||||
|
is notified. Otherwise, workflow step 1 is skipped. Likewise, workflow
|
||||||
|
steps 2 and 3 are performed if and only if the collection has a group
|
||||||
|
assigned to those steps.</p>
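<p>The step-skipping rule can be expressed compactly in code. The Java sketch below is a hedged illustration only: <code>WorkflowSketch</code> and <code>stepsToPerform</code> are hypothetical names, and a collection's workflow configuration is reduced to an array of up to three group names in which <code>null</code> means that no group is assigned to that step.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the step-skipping rule; not DSpace's workflow code.
class WorkflowSketch {
    static final int MAX_STEPS = 3;

    // stepGroups[i] is the group assigned to workflow step i + 1,
    // or null if no group is assigned to that step.
    static List<Integer> stepsToPerform(String[] stepGroups) {
        List<Integer> steps = new ArrayList<>();
        for (int i = 0; i < stepGroups.length && i < MAX_STEPS; i++) {
            if (stepGroups[i] != null) {
                steps.add(i + 1); // a group is assigned, so the step is performed
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        // Only step 2 has a group, so steps 1 and 3 are skipped.
        System.out.println(stepsToPerform(new String[] {null, "metadata-editors", null}));

        // No groups at all: the list is empty, meaning the submission would be
        // installed straight into the main archive.
        System.out.println(stepsToPerform(new String[] {null, null, null}));
    }
}
```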
|
||||||
|
<p>When a step is invoked, the
|
||||||
|
task of performing that workflow step is put in the 'task pool' of the
|
||||||
|
associated group. One member of that group takes the task from the
|
||||||
|
pool, and it is then removed from the task pool, to avoid the situation
|
||||||
|
where several people in the group may be performing the same task
|
||||||
|
without realizing it.</p>
|
||||||
|
<p>The member of the group who has
|
||||||
|
taken the task from the pool may then perform one of three actions:</p>
|
||||||
|
<table>
|
||||||
|
<tbody>
|
||||||
|
<tr>
|
||||||
|
<th>Workflow Step</th>
|
||||||
|
<th>Possible actions</th>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>1</td>
|
||||||
|
<td>Can accept submission
|
||||||
|
for inclusion, or reject submission.</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>2</td>
|
||||||
|
<td>Can edit metadata
|
||||||
|
provided by the user with the submission, but cannot change the
|
||||||
|
submitted files. Can accept submission for inclusion, or reject
|
||||||
|
submission.</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>3</td>
|
||||||
|
<td>Can edit metadata
|
||||||
|
provided by the user with the submission, but cannot change the
|
||||||
|
submitted files. Must then commit to archive; may not reject submission.</td>
|
||||||
|
</tr>
|
||||||
|
</tbody>
|
||||||
|
</table>
|
||||||
|
<p class="figure"><img src="image/workflow.gif"
|
||||||
|
alt="Submission Workflow Diagram"></p>
|
||||||
|
<p class="caption">Submission
|
||||||
|
Workflow in DSpace</p>
|
||||||
|
<p>If a submission is rejected,
|
||||||
|
the reason (entered by the workflow participant) is e-mailed to the
|
||||||
|
submitter, and it is returned to the submitter's 'My DSpace' page. The
|
||||||
|
submitter can then make any necessary modifications and re-submit,
|
||||||
|
whereupon the process starts again.</p>
|
||||||
|
<p>If a submission is 'accepted',
|
||||||
|
it is passed to the next step in the workflow. If there are no more
|
||||||
|
workflow steps with associated groups, the submission is installed in
|
||||||
|
the main archive.</p>
|
||||||
|
<p>One last possibility is that a
|
||||||
|
workflow can be 'aborted' by a DSpace site administrator. This is
|
||||||
|
accomplished using the administration UI.</p>
|
||||||
|
<p>The reason for this apparently
|
||||||
|
arbitrary design is that it was the simplest case that covered the
|
||||||
|
needs of the early adopter communities at MIT. The functionality of the
|
||||||
|
workflow system will no doubt be extended in the future.</p>
|
||||||
|
<h2><a name="handles">Handles</a></h2>
|
||||||
|
<p>Researchers require a stable
|
||||||
|
point of reference for their works. The simple evolution from sharing
|
||||||
|
of citations to emailing of URLs broke when Web users learned that
|
||||||
|
sites can disappear or be reconfigured without notice, and that their
|
||||||
|
bookmark files containing critical links to research results couldn't
|
||||||
|
be trusted long term. To help solve this problem, a core DSpace feature
|
||||||
|
is the creation of a persistent identifier for every item, collection and
|
||||||
|
community stored in DSpace. To persist identifiers, DSpace requires a
|
||||||
|
storage- and location-independent mechanism for creating and
|
||||||
|
maintaining identifiers. DSpace uses the <a
|
||||||
|
href="http://www.handle.net/">CNRI Handle System</a>
|
||||||
|
for creating these identifiers. The rest of this section assumes a
|
||||||
|
basic familiarity with the Handle system.</p>
|
||||||
|
<p>DSpace uses Handles primarily
|
||||||
|
as a means of assigning globally unique identifiers to objects. Each
|
||||||
|
site running DSpace needs to obtain a Handle 'prefix' from CNRI, so we
|
||||||
|
know that if we create identifiers with that prefix, they won't clash
|
||||||
|
with identifiers created elsewhere.</p>
|
||||||
|
<p>Presently, Handles are assigned
|
||||||
|
to communities, collections, and items. Bundles and bitstreams are not
|
||||||
|
assigned Handles, since over time, the way in which an item is encoded
|
||||||
|
as bits may change, in order to allow access with future technologies
|
||||||
|
and devices. Older versions may be moved to off-line storage as a new
|
||||||
|
standard becomes de facto. Since it's usually the <em>item</em>
|
||||||
|
that is being preserved, rather than the particular bit encoding, it
|
||||||
|
only makes sense to persistently identify and allow access to the item,
|
||||||
|
and allow users to access the appropriate bit encoding from there.</p>
|
||||||
|
<p>Of course, it may be that a
|
||||||
|
particular bit encoding of a file is explicitly being preserved; in
|
||||||
|
this case, the bitstream could be the only one in the item, and the
|
||||||
|
item's Handle would then essentially refer just to that bitstream. The
|
||||||
|
same bitstream can also be included in other items, and thus would be
|
||||||
|
citable as part of a greater item, or individually.</p>
|
||||||
|
<p>The Handle system also features
|
||||||
|
a global resolution infrastructure; that is, an end-user can enter a
|
||||||
|
Handle into any service (e.g. Web page) that can resolve Handles, and
|
||||||
|
the end-user will be directed to the object (in the case of DSpace,
|
||||||
|
community, collection or item) identified by that Handle. In order to
|
||||||
|
take advantage of this feature of the Handle system, a DSpace site must
|
||||||
|
also run a 'Handle server' that can accept and resolve incoming
|
||||||
|
resolution requests. All the code for this is included in the DSpace
|
||||||
|
source code bundle.</p>
|
||||||
|
<p>Handles can be written in two
|
||||||
|
forms:</p>
|
||||||
|
<pre>hdl:1721.123/4567<br>http://hdl.handle.net/1721.123/4567</pre>
|
||||||
|
<p>The above represent the same
|
||||||
|
Handle. The first is possibly more convenient to use only as an
|
||||||
|
identifier; however, by using the second form, any Web browser becomes
|
||||||
|
capable of resolving Handles. An end-user need only access this form of
|
||||||
|
the Handle as they would any other URL. It is possible to enable some
|
||||||
|
browsers to resolve the first form of Handle as if they were standard
|
||||||
|
URLs using <a href="http://www.handle.net/resolver/index.html">CNRI's
|
||||||
|
Handle Resolver plug-in</a>, but
|
||||||
|
since the first form can always be simply derived from the second,
|
||||||
|
DSpace displays Handles in the second form, so that it is more useful
|
||||||
|
for end-users.</p>
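<p>As a rough illustration of how the two written forms relate, the following Java sketch converts between them using plain string manipulation. The class <code>HandleForms</code> and its methods are hypothetical; they are not part of DSpace or of CNRI's Handle library, and the sketch assumes the <code>http://hdl.handle.net/</code> resolver address shown above.</p>

```java
// Hypothetical helper for the two written forms of a Handle;
// not part of DSpace or CNRI's Handle library.
class HandleForms {
    static final String HDL_SCHEME = "hdl:";
    static final String RESOLVER = "http://hdl.handle.net/";

    // "hdl:1721.123/4567" becomes "http://hdl.handle.net/1721.123/4567".
    static String toUrlForm(String hdlForm) {
        if (!hdlForm.startsWith(HDL_SCHEME)) {
            throw new IllegalArgumentException("Not an hdl: identifier: " + hdlForm);
        }
        return RESOLVER + hdlForm.substring(HDL_SCHEME.length());
    }

    // "http://hdl.handle.net/1721.123/4567" becomes "hdl:1721.123/4567".
    static String toHdlForm(String urlForm) {
        if (!urlForm.startsWith(RESOLVER)) {
            throw new IllegalArgumentException("Not a resolver URL: " + urlForm);
        }
        return HDL_SCHEME + urlForm.substring(RESOLVER.length());
    }

    // The site prefix is the part before the first '/', e.g. "1721.123".
    static String sitePrefix(String hdlForm) {
        if (!hdlForm.startsWith(HDL_SCHEME)) {
            throw new IllegalArgumentException("Not an hdl: identifier: " + hdlForm);
        }
        String handle = hdlForm.substring(HDL_SCHEME.length());
        int slash = handle.indexOf('/');
        if (slash < 0) {
            throw new IllegalArgumentException("Handle has no local part: " + hdlForm);
        }
        return handle.substring(0, slash);
    }

    public static void main(String[] args) {
        String hdl = "hdl:1721.123/4567";
        System.out.println(toUrlForm(hdl));
        System.out.println(toHdlForm(toUrlForm(hdl)));
        System.out.println(sitePrefix(hdl));
    }
}
```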
|
||||||
|
<p>It is important to note that
|
||||||
|
DSpace uses the CNRI Handle infrastructure only at the 'site' level.
|
||||||
|
For example, in the above example, the DSpace site has been assigned
|
||||||
|
the prefix '1721.123'. It is still the responsibility of the DSpace
|
||||||
|
site to maintain the association between a full Handle (including the
|
||||||
|
'4567' local part) and the community, collection or item in question.</p>
|
||||||
|
<h2><a name="bitstream_ids">Bitstream 'Persistent'
|
||||||
|
Identifiers</a></h2>
|
||||||
|
<p>As of DSpace 1.2, bitstreams in
|
||||||
|
DSpace also have more persistent identifiers. They are more volatile
|
||||||
|
than Handles, since if the content is moved to a different server or
|
||||||
|
organization, they will no longer work (hence the quotes around
|
||||||
|
'persistent'). However, they are more easily persisted than the simple
|
||||||
|
URLs based on database primary keys previously used. This means that
|
||||||
|
external systems can more reliably refer to specific bitstreams stored
|
||||||
|
in a DSpace instance.</p>
|
||||||
|
<p>Each bitstream has a sequence
|
||||||
|
ID, unique within an item. This sequence ID is used to create a
|
||||||
|
persistent ID, of the form:</p>
|
||||||
|
<p><code><em>dspace
|
||||||
|
url</em>/bitstream/<em>handle</em>/<em>sequence
|
||||||
|
ID</em>/<em>filename</em></code></p>
|
||||||
|
<p>For example:</p>
|
||||||
|
<pre>https://dspace.myu.edu/bitstream/123.456/789/24/foo.html</pre>
|
||||||
|
<p>The above refers to the
|
||||||
|
bitstream with sequence ID 24 in the item with the Handle <code>hdl:123.456/789</code>.
|
||||||
|
The <code>foo.html</code>
|
||||||
|
is really just there as a hint to browsers: Although DSpace will
|
||||||
|
provide the appropriate MIME type, some browsers only function
|
||||||
|
correctly if the file has an expected extension.
|
||||||
|
</p>
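<p>The URL pattern above can be assembled mechanically from its four parts. The Java sketch below shows one way to do so; <code>BitstreamUrlSketch</code> and <code>persistentUrl</code> are hypothetical names rather than DSpace code, and the file name is assumed to be already safe for use in a URL.</p>

```java
// Hypothetical sketch of the bitstream URL pattern; not DSpace's own code.
class BitstreamUrlSketch {
    // Builds: <dspace url>/bitstream/<handle>/<sequence ID>/<filename>
    // The handle is given without the "hdl:" scheme, e.g. "123.456/789".
    // Assumption: filename needs no further URL escaping.
    static String persistentUrl(String dspaceUrl, String handle,
                                int sequenceId, String filename) {
        // Avoid a doubled slash if the site URL ends with one.
        String base = dspaceUrl.endsWith("/")
                ? dspaceUrl.substring(0, dspaceUrl.length() - 1)
                : dspaceUrl;
        return base + "/bitstream/" + handle + "/" + sequenceId + "/" + filename;
    }

    public static void main(String[] args) {
        System.out.println(
                persistentUrl("https://dspace.myu.edu", "123.456/789", 24, "foo.html"));
    }
}
```

<p>With the values from the example above, this sketch produces the same URL, <code>https://dspace.myu.edu/bitstream/123.456/789/24/foo.html</code>.</p>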
|
||||||
|
<h2><a name="srb">Storage Resource Broker (SRB) Support
|
||||||
|
</a></h2>
|
||||||
|
<p>DSpace offers two means for
|
||||||
|
storing bitstreams. The first is in the file system on the server. The
|
||||||
|
second is using <a href="http://www.sdsc.edu/srb">SRB (Storage
|
||||||
|
Resource
|
||||||
|
Broker)</a>. Both are achieved using
|
||||||
|
a simple, lightweight API.
|
||||||
|
</p>
|
||||||
|
<p>SRB is purely optional, but may
|
||||||
|
be used in lieu of the server's file system or in addition to the file
|
||||||
|
system. Without going into a full description, SRB is a very robust,
|
||||||
|
sophisticated storage manager that offers essentially unlimited storage
|
||||||
|
and straightforward means to replicate (in simple terms, back up) the
|
||||||
|
content on other local or remote storage resources.
|
||||||
|
</p>
|
||||||
|
<h2><a name="search_browse">Search and Browse</a></h2>
|
||||||
|
<p>DSpace allows end-users to
|
||||||
|
discover content in a number of ways, including:</p>
|
||||||
|
<ul>
|
||||||
|
<li>Via external reference, such
|
||||||
|
as a Handle</li>
|
||||||
|
<li>Searching for one or more
|
||||||
|
keywords in metadata or extracted full-text</li>
|
||||||
|
<li>Browsing through title, date
|
||||||
|
and author indices, with optional image thumbnails</li>
|
||||||
|
</ul>
|
||||||
|
<p>Search is an essential
|
||||||
|
component of discovery in DSpace. Users' expectations from a search
|
||||||
|
engine are quite high, so a goal for DSpace is to supply as many search
|
||||||
|
features as possible. DSpace's indexing and search module has a very
|
||||||
|
simple API which allows for indexing new content, regenerating the
|
||||||
|
index, and performing searches on the entire corpus, a community, or
|
||||||
|
collection. Behind the API is the open source Java search engine <a
|
||||||
|
href="http://jakarta.apache.org/lucene/">Lucene</a>.
|
||||||
|
Lucene gives us fielded searching, stop word removal, stemming, and the
|
||||||
|
ability to incrementally add new indexed content without regenerating
|
||||||
|
the entire index.</p>
|
||||||
|
<p>Another important mechanism for
|
||||||
|
discovery in DSpace is the browse. This is the process whereby the user
|
||||||
|
views a particular index, such as the title index, and navigates around
|
||||||
|
it in search of interesting items. The browse subsystem provides a
|
||||||
|
simple API for achieving this by allowing a caller to specify an index,
|
||||||
|
and a subsection of that index. The browse subsystem then returns the
|
||||||
|
portion of the index of interest. Indices that may be browsed are item
|
||||||
|
title, item issue date and authors. Additionally, the browse can be
|
||||||
|
limited to items within a particular collection or community.</p>
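<p>As a rough illustration of the browse idea described above, the following sketch keeps titles in a sorted set and returns a window of entries starting at a given point. It is not DSpace's browse API: the class <code>TitleIndexSketch</code> and its methods are hypothetical names chosen for this example, and a real index would also carry item identifiers and a collection or community scope.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch of "specify an index and a subsection of it".
// Not the DSpace browse API; all names here are assumptions.
final class TitleIndexSketch {
    // Titles kept in case-insensitive alphabetical order.
    // Simplification: a set drops duplicate titles.
    private final TreeSet<String> titles = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);

    void add(String title) {
        titles.add(title);
    }

    /** Returns at most 'max' titles that sort at or after 'startingAt'. */
    List<String> browse(String startingAt, int max) {
        List<String> page = new ArrayList<>();
        for (String title : titles.tailSet(startingAt, true)) {
            if (page.size() >= max) {
                break;
            }
            page.add(title);
        }
        return page;
    }
}
```

<p>A caller would add titles as items are archived and then request, for example, the first two titles from "B" onwards with <code>browse("B", 2)</code>.</p>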
|
||||||
|
<h2><a name="html">HTML Support</a></h2>
|
||||||
|
<p>For the most part, at present
|
||||||
|
DSpace simply supports uploading and downloading of bitstreams as-is.
|
||||||
|
This is fine for the majority of commonly-used file formats -- for
|
||||||
|
example PDFs, Microsoft Word documents, spreadsheets and so forth. HTML
|
||||||
|
documents (Web sites and Web pages) are far more complicated, and this
|
||||||
|
has important ramifications when it comes to digital preservation:</p>
|
||||||
|
<ul>
|
||||||
|
<li>
|
||||||
|
<p>Web pages tend to consist
|
||||||
|
of several files -- one or more HTML files that contain references to
|
||||||
|
each other, and stylesheets and image files that are referenced by the
|
||||||
|
HTML files.</p>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Web pages also link to or
|
||||||
|
include content from other sites, often imperceptibly to the end-user.
|
||||||
|
Thus, in a few years' time, when someone views the preserved Web site,
|
||||||
|
they will probably find that many links are now broken or refer to
|
||||||
|
other sites that are now out of context.</p>
|
||||||
|
<p>In fact, it may be unclear
|
||||||
|
to an end-user when they are viewing content stored in DSpace and when
|
||||||
|
they are seeing content included from another site, or have navigated
|
||||||
|
to a page that is not stored in DSpace. This problem can manifest when
|
||||||
|
a submitter uploads some HTML content. For example, the HTML document
|
||||||
|
may include an image from an external Web site, or even their local
|
||||||
|
hard drive. When the submitter views the HTML in DSpace, their browser
|
||||||
|
is able to use the reference in the HTML to retrieve the appropriate
|
||||||
|
image, and so to the submitter, the whole HTML document appears to have
|
||||||
|
been deposited correctly. However, later on, when another user tries to
|
||||||
|
view that HTML, their browser might not be able to retrieve the
|
||||||
|
included image since it may have been removed from the external server.
|
||||||
|
Hence the HTML will seem broken.</p>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Often Web pages are
|
||||||
|
produced dynamically by software running on the Web server, and
|
||||||
|
represent the state of a changing database underneath it.</p>
|
||||||
|
</li>
|
||||||
|
</ul>
|
||||||
|
<p>Dealing with these issues is
|
||||||
|
the topic of much active research. Currently, DSpace bites off a small,
|
||||||
|
tractable chunk of this problem. DSpace can store and provide on-line
|
||||||
|
browsing capability for <em>self-contained,
|
||||||
|
non-dynamic</em> HTML documents. In
|
||||||
|
practical terms, this means:</p>
|
||||||
|
<ul>
|
||||||
|
<li>No dynamic content (CGI
|
||||||
|
scripts and so forth)</li>
|
||||||
|
<li>All links to preserved
|
||||||
|
content must be <em>relative links</em>,
|
||||||
|
that do not refer to 'parents':
|
||||||
|
<ul>
|
||||||
|
<li><code>diagram.gif</code>
|
||||||
|
is OK</li>
|
||||||
|
<li><code>image/foo.gif</code>
|
||||||
|
is OK</li>
|
||||||
|
<li><code>/stylesheet.css</code>
|
||||||
|
is not OK</li>
|
||||||
|
<li><code>http://somedomain.com/content.html</code>
|
||||||
|
is not OK</li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
<li>Any 'absolute links' (e.g. <code>http://somedomain.com/content.html</code>)
|
||||||
|
are stored 'as is', and will continue to link to the external content
|
||||||
|
(as opposed to relative links, which will link to the copy of the
|
||||||
|
content stored in DSpace.) Thus, over time, the content referred to by
|
||||||
|
the absolute link may change or disappear.</li>
|
||||||
|
</ul>
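<p>The link rules listed above can be sketched as a small check. This is only an illustration of the rules as stated in this section, not code from DSpace: the class name <code>RelativeLinkCheck</code> and the method <code>isSelfContained</code> are hypothetical, and treating an unparseable link as "not OK" is an assumption of this sketch.</p>

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper illustrating the "relative links only, no parents" rule.
final class RelativeLinkCheck {

    /** Returns true if the link is relative and never refers to a parent directory. */
    static boolean isSelfContained(String href) {
        URI uri;
        try {
            uri = new URI(href);
        } catch (URISyntaxException e) {
            return false; // assumption: unparseable links are treated as not OK
        }
        if (uri.isAbsolute()) {
            return false; // has a scheme, e.g. http://somedomain.com/content.html
        }
        if (uri.getRawAuthority() != null) {
            return false; // scheme-relative form such as //somedomain.com/x
        }
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            return true; // fragment-only link within the same document
        }
        if (path.startsWith("/")) {
            return false; // server-root link, e.g. /stylesheet.css
        }
        for (String segment : path.split("/")) {
            if (segment.equals("..")) {
                return false; // refers to a parent directory
            }
        }
        return true;
    }
}
```

<p>With this sketch, <code>diagram.gif</code> and <code>image/foo.gif</code> pass, while <code>/stylesheet.css</code> and <code>http://somedomain.com/content.html</code> do not, matching the examples in the list.</p>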
|
||||||
|
<h2><a name="oai">OAI Support</a></h2>
|
||||||
|
<p>The <a href="http://www.openarchives.org/">Open Archives
|
||||||
|
Initiative</a> has developed a <a
|
||||||
|
href="http://www.openarchives.org/OAI/openarchivesprotocol.html">protocol
|
||||||
|
for metadata harvesting</a>. This
|
||||||
|
allows sites to programmatically retrieve or 'harvest' the metadata
|
||||||
|
from several sources, and offer services using that metadata, such as
|
||||||
|
indexing or linking services. Such a service could allow users to
|
||||||
|
access information from a large number of sites from one place.</p>
|
||||||
|
<p>DSpace exposes the Dublin Core
|
||||||
|
metadata for items that are publicly (anonymously) accessible.
|
||||||
|
Additionally, the collection structure is also exposed via the OAI
|
||||||
|
protocol's 'sets' mechanism. OCLC's open source <a
|
||||||
|
href="http://www.oclc.org/research/software/oai/cat.shtm">OAICat</a>
|
||||||
|
framework is used to provide this functionality.</p>
|
||||||
|
<p>DSpace's OAI service does
|
||||||
|
support the exposing of deletion information for withdrawn items, but
|
||||||
|
not for items that are 'expunged' (<a href="#deletions">see above</a>).
|
||||||
|
DSpace also supports OAI-PMH resumption tokens.</p>
|
||||||
|
<h2><a name="openurl">OpenURL Support</a></h2>
|
||||||
|
<p>DSpace supports the <a href="http://www.sfxit.com/OpenURL/">OpenURL
|
||||||
|
protocol</a>
|
||||||
|
from <a href="http://www.sfxit.com/">SFX</a>,
|
||||||
|
in a rather simple fashion. If your institution has an SFX server,
|
||||||
|
DSpace will display an OpenURL link on every item page, automatically
|
||||||
|
using the Dublin Core metadata. Additionally, DSpace can respond to
|
||||||
|
incoming OpenURLs. Presently it simply passes the information in the
|
||||||
|
OpenURL to the search subsystem. A list of results is then displayed,
|
||||||
|
which usually gives the relevant item (if it is in DSpace) at the top
|
||||||
|
of the list.</p>
|
||||||
|
<h2><a name="creativecommons">Creative Commons Support</a></h2>
|
||||||
|
<p>DSpace provides support for
|
||||||
|
Creative Commons licenses to be attached to items in the repository.
|
||||||
|
They represent an alternative to traditional copyright. To learn more
|
||||||
|
about Creative Commons, visit <a href="http://creativecommons.org">their
|
||||||
|
website</a>.
|
||||||
|
Support for the licenses is controlled by a site-wide configuration
|
||||||
|
option, and since license selection involves redirection to the
|
||||||
|
Creative Commons website, additional parameters may be configured to
|
||||||
|
work with a proxy server. If the option is enabled, users may select a
|
||||||
|
Creative Commons license during the submission process, or elect to
|
||||||
|
skip Creative Commons licensing. If a selection is made, a copy of the
|
||||||
|
license text and RDF metadata is stored along with the item in the
|
||||||
|
repository. There is also an indication - text and a Creative Commons
|
||||||
|
icon - in the item display page of the web user interface when an item
|
||||||
|
is licensed under Creative Commons.</p>
|
||||||
|
<h2><a name="subscriptions">Subscriptions</a></h2>
|
||||||
|
<p>As <a href="#epeople">noted above</a>,
|
||||||
|
end-users (e-people) may 'subscribe' to collections in order to be
|
||||||
|
alerted when new items appear in those collections. Each day, end-users
|
||||||
|
who are subscribed to one or more collections will receive an e-mail
|
||||||
|
giving brief details of all new items that appeared in any of those
|
||||||
|
collections the previous day. If no new items appeared in any of the
|
||||||
|
subscribed collections, no e-mail is sent. Users can unsubscribe
|
||||||
|
themselves at any time.</p>
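<p>The daily alert logic described above can be sketched as follows. This is a simplified illustration rather than DSpace's subscription code: the class <code>SubscriptionDigestSketch</code>, the method <code>buildDigests</code>, and the use of plain strings for e-people, collections and items are all assumptions made for the example.</p>

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: one digest per subscriber, none when nothing is new.
final class SubscriptionDigestSketch {

    /**
     * @param subscriptions        subscriber e-mail -> subscribed collection names
     * @param newItemsByCollection collection name -> items that appeared the previous day
     * @return subscriber e-mail -> new items to report; subscribers with no new items are omitted
     */
    static Map<String, List<String>> buildDigests(
            Map<String, List<String>> subscriptions,
            Map<String, List<String>> newItemsByCollection) {
        Map<String, List<String>> digests = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : subscriptions.entrySet()) {
            List<String> items = new ArrayList<>();
            for (String collection : entry.getValue()) {
                items.addAll(newItemsByCollection.getOrDefault(
                        collection, Collections.emptyList()));
            }
            if (!items.isEmpty()) {
                digests.put(entry.getKey(), items); // no new items means no e-mail
            }
        }
        return digests;
    }
}
```

<p>Each entry in the returned map would correspond to one e-mail; a subscriber whose collections received nothing new does not appear in the map at all.</p>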
|
||||||
|
<h2><a name="history">History</a></h2>
|
||||||
|
<p>While provenance information in
|
||||||
|
the form of prose is very useful, it is not easily programmatically
|
||||||
|
manipulated. The History system captures a time-based record of
|
||||||
|
significant changes in DSpace, in a manner suitable for later
|
||||||
|
'refactoring' or repurposing.</p>
|
||||||
|
<p>Currently, the History
|
||||||
|
subsystem is explicitly invoked when significant events occur (e.g.,
|
||||||
|
DSpace accepts an item into the archive). The History subsystem then
|
||||||
|
creates RDF data describing the current state of the object. The RDF
|
||||||
|
data is modeled using <a href="http://www.metadata.net/harmony/">Harmony/ABC</a>,
|
||||||
|
an ontology for describing temporal-based data, and stored in the file
|
||||||
|
system. Some simple indices for unwinding the data are available.</p>
|
||||||
|
<h2><a name="importexport">Import
|
||||||
|
and Export</a></h2>
|
||||||
|
<p>DSpace also includes batch
|
||||||
|
tools to import and export items in a simple directory structure, where
|
||||||
|
the Dublin Core metadata is stored in an XML file. This may be used as
|
||||||
|
the basis for moving content between DSpace and other systems.</p>
|
||||||
|
<p>There is also a METS-based
|
||||||
|
export tool, which exports items as METS-based metadata with associated
|
||||||
|
bitstreams referenced from the METS file.<br>
|
||||||
|
</p>
|
||||||
|
<h2><a name="registration">Registration</a></h2>
|
||||||
|
<p>Registration is an alternate
|
||||||
|
means of incorporating items, their metadata, and their bitstreams into
|
||||||
|
DSpace by taking advantage of the bitstreams already being in
|
||||||
|
accessible computer storage. An example might be that there is a
|
||||||
|
repository for existing digital assets. Rather than using the normal
|
||||||
|
<a href="#ingest">interactive
|
||||||
|
ingest process</a> or the <a href="#importexport">batch import</a>
|
||||||
|
to furnish DSpace with the metadata
|
||||||
|
and to upload bitstreams, registration provides DSpace with the metadata and
|
||||||
|
the <span style="font-style: italic;">location</span>
|
||||||
|
of the
|
||||||
|
bitstreams. DSpace uses a variation of the import tool to accomplish
|
||||||
|
registration.<br>
|
||||||
|
</p>
|
||||||
|
<hr>
|
||||||
|
<address>Copyright ©
|
||||||
|
2002-2004 MIT and Hewlett Packard</address>
|
||||||
|
</body>
|
||||||
|
</html>
|
535
dspace/docs/history.html
Normal file
@@ -0,0 +1,535 @@
|
|||||||
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">
|
||||||
|
<html>
|
||||||
|
<head>
|
||||||
|
<title>DSpace System Documentation: Version History</title>
|
||||||
|
<link rel="StyleSheet" href="style.css" type="text/css">
|
||||||
|
<meta http-equiv="Content-Type"
|
||||||
|
content="text/html; charset=iso-8859-1">
|
||||||
|
</head>
|
||||||
|
<body>
|
||||||
|
<h1>DSpace System Documentation: Version History</h1>
|
||||||
|
<p><a href="index.html">Back to contents</a></p>
|
||||||
|
|
||||||
|
<h2><a name="version1_3">Changes in DSpace 1.3beta1</a></h2>
|
||||||
|
|
||||||
|
<h3>General Improvements</h3>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Initial i18n Support for JSPs - Note: the implementation of this feature required changes to almost all JSP pages</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<h3>Bug fixes</h3>
|
||||||
|
<ul>
|
||||||
|
<li>Set the content type in the HTTP header</li>
|
||||||
|
<li>Fix issue where EPerson edit would not work due to form indexing (partial fix)</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<h2><a name="version1_3">Changes in DSpace 1.2.2</a></h2>
|
||||||
|
|
||||||
|
<H3>General Improvements</h3>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>LDAP authentication support</li>
|
||||||
|
<li>Log file analysis and report generation</li>
|
||||||
|
<li>Item licence viewing</li>
|
||||||
|
<li>Supervision order/collaborative workspace administrative tools</li>
|
||||||
|
<li>Basic workspace for submissions in progress, with support for supervision</li>
|
||||||
|
<li>SRB storage system option</li>
|
||||||
|
<li>Updated handle server system</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<h3>Bug fixes</h3>
|
||||||
|
<ul>
|
||||||
|
<li>POST handling in HTMLServlet</li>
|
||||||
|
<li>Missing ContentType directives added to some JSPs</li>
|
||||||
|
<li>Name dependency on Collection Admin and Submitter groups fixed</li>
|
||||||
|
<li>Improved OAI-PMH XML encoding</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<H3><A NAME="jsp-changes-1_2_1-1_3">Changes in JSPs (since 1.2.1)</A></H3>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li><code>collection-home.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>community-home.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>community-list.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>display-item.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>home.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>styles.css.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>components/ldap-form.jsp</code> <em>is new</em></li>
|
||||||
|
<li><code>dspace-admin/eperson-edit.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>dspace-admin/list-formats.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>dspace-admin/supervise-confirm-remove.jsp</code> <em>is new</em></li>
|
||||||
|
<li><code>dspace-admin/supervise-duplicate.jsp</code> <em>is new</em></li>
|
||||||
|
<li><code>dspace-admin/supervise-link.jsp</code> <em>is new</em></li>
|
||||||
|
<li><code>dspace-admin/supervise-list.jsp</code> <em>is new</em></li>
|
||||||
|
<li><code>dspace-admin/supervise-main.jsp</code> <em>is new</em></li>
|
||||||
|
<li><code>dspace-admin/wizard-questions.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>error/404.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>layout/footer-default.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>layout/header-default.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>layout/location-bar.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>layout/navbar-admin.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>layout/navbar-default.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>login/ldap-incorrect.jsp</code> <em>is new</em></li>
|
||||||
|
<li><code>login/ldap.jsp</code> <em>is new</em></li>
|
||||||
|
<li><code>mydspace/main.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>register/edit-profile.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>register/new-ldap-user.jsp</code> <em>is new</em></li>
|
||||||
|
<li><code>register/registration-form.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>search/results.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>statistics/no-report.jsp</code> <em>is new</em></li>
|
||||||
|
<li><code>statistics/report.jsp</code> <em>is new</em></li>
|
||||||
|
<li><code>submit/cancel.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/change-file-description.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/choose-file.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/complete.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/creative-commons.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/edit-metadata-1.jsp</code> <em>is removed</em></li>
|
||||||
|
<li><code>submit/edit-metadata-2.jsp</code> <em>is removed</em></li>
|
||||||
|
<li><code>submit/edit-metadata.jsp</code> <em>is new</em></li>
|
||||||
|
<li><code>submit/get-file-format.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/initial-questions.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/progressbar.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/review.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/select-collection.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/show-license.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/show-uploaded-file.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/upload-error.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/upload-file-list.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>workspace/ws-error.jsp</code> <em>is new</em></li>
|
||||||
|
<li><code>workspace/ws-main.jsp</code> <em>is new</em></li>
|
||||||
|
<li><code>workspace/wsv-error.jsp</code> <em>is new</em></li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<h2><a name="version1_2_2">Changes in DSpace 1.2.2</a></h2>
|
||||||
|
|
||||||
|
<H3>General Improvements</h3>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Customisable submission forms added</li>
|
||||||
|
<li>Configurable number of index terms in Lucene for full-text indexing</li>
|
||||||
|
<li>Improved scalability in media filter</li>
|
||||||
|
<li>Submit button on collection pages only appears if user has authorisation</li>
|
||||||
|
<li>PostgreSQL 8.0 compatibility</li>
|
||||||
|
<li>Search scope retention to improve browsing</li>
|
||||||
|
<li>Community and collection strengths displayed</li>
|
||||||
|
<li>Upgraded OAICat software</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<h3>Bug fixes</h3>
|
||||||
|
<ul>
|
||||||
|
<li>Fix for Oracle too many cursors problem.</li>
|
||||||
|
<li>Fix for UTF-8 encoded searches in advanced search.</li>
|
||||||
|
<li>Fix for handling "\" in bitstream names.</li>
|
||||||
|
<li>Fix to prevent deletion of "unknown" bitstream format</li>
|
||||||
|
<li>Fix for ItemImport creating new handles for replaced items</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<H3><A NAME="jsp-changes-1_2_1-1_2_2">Changes in JSPs</A></H3>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li><code>collection-home.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>community-home.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>community-list.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>home.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>dspace-admin/list-formats.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>dspace-admin/wizard-questions.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>search/results.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/cancel.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/change-file-description.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/choose-file.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/complete.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/creative-commons.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/edit-metadata.jsp</code> <em>new</em></li>
|
||||||
|
<li><code>submit/get-file-format.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/initial-questions.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/progressbar.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/review.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/select-collection.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/show-license.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/show-uploaded-file.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/upload-error.jsp</code> <em>changed</em></li>
|
||||||
|
<li><code>submit/upload-file-list.jsp</code> <em>changed</em></li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<h2><a name="version1_2_1">Changes in DSpace 1.2.1</a></h2>
|
||||||
|
|
||||||
|
|
||||||
|
<H3>General Improvements</h3>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Oracle support added</li>
|
||||||
|
<li>Thumbnails in item view can now be switched off/on</li>
|
||||||
|
<li>Browse and search thumbnail options</li>
|
||||||
|
<li>Improved item importer
|
||||||
|
<ul>
|
||||||
|
<li> can now import to multiple collections</li>
|
||||||
|
<li> added --test flag to simulate an import, without actually making any changes </li>
|
||||||
|
<li> added --resume flag to try to resume the import in case the import is aborted</li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
<li>Configurable fields for the search index</li>
|
||||||
|
<li>Script for transferring items between DSpace instances</li>
|
||||||
|
<li>Sun library JARs (JavaMail, Java Activation Framework and Servlet) now included in DSpace source code bundle</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<h3>Bug fixes</h3>
|
||||||
|
<ul>
|
||||||
|
<li>A logo can now be added to an existing collection. Fixes SF bug #1065933</li>
|
||||||
|
<li>The community logo can now be edited. Fixes SF bug #1035692</li>
|
||||||
|
<li>MediaFilterManager no longer 'touches' every item every time. Fixes SF bug #1015296 </li>
|
||||||
|
<li>Supported formats help page: the format support level is now set to "known" by default</li>
|
||||||
|
<LI>Fixed various database connection pool leaks</LI>
|
||||||
|
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<H3><A NAME="jsp-changes-1_2-1_2_1">Changed JSPs</A></H3>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<li><code>collection-home</code> <em>changed</em></li>
|
||||||
|
<li><code>community-home</code> <em>changed</em></li>
|
||||||
|
<li><code>display-item</code> <em>changed</em></li>
|
||||||
|
<li><code>dspace-admin/confirm-delete-collection</code> <em>moved to tools/ and changed</em></li>
|
||||||
|
<li><code>dspace-admin/confirm-delete-community</code> <em>moved to tools/ and changed</em></li>
|
||||||
|
<li><code>dspace-admin/edit-collection</code> <em>moved to tools/ and changed</em></li>
|
||||||
|
<li><code>dspace-admin/edit-community</code> <em>moved to tools/ and changed</em></li>
|
||||||
|
<li><code>dspace-admin/index</code> <em>changed </em></li>
|
||||||
|
<li><code>dspace-admin/upload-logo</code> <em>changed </em></li>
|
||||||
|
<li><code>dspace-admin/wizard-basicinfo</code> <em>changed </em></li>
|
||||||
|
<li><code>dspace-admin/wizard-default-item</code> <em>changed </em></li>
|
||||||
|
<li><code>dspace-admin/wizard-permissions</code> <em>changed </em></li>
|
||||||
|
<li><code>dspace-admin/wizard-questions</code> <em>changed </em></li>
|
||||||
|
<li><code>help/formats.html</code> <em>removed</em></li>
|
||||||
|
<li><code>help/formats</code> <em>changed</em></li>
|
||||||
|
<li><code>index</code> <em>changed</em></li>
|
||||||
|
<li><code>layout/navbar-admin</code> <em>changed</em></li>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
|
||||||
|
<h2><a name="version1_2">Changes in DSpace 1.2</a></h2>
|
||||||
|
|
||||||
|
<H3>General Improvements</h3>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Communities can now contain sub-communities</li>
|
||||||
|
<li>Items may be included in more than one collection</li>
|
||||||
|
<li>Full text extraction and searching for MS Word, PDF, HTML, text
|
||||||
|
documents</li>
|
||||||
|
<li>Thumbnails displayed in item view for items that contain images</li>
|
||||||
|
<li>Configurable MediaFilter tool creates both extracted text and
|
||||||
|
thumbnails</li>
|
||||||
|
<li>Bitstream IDs are now persistent - generated from item's handle
|
||||||
|
and a sequence number</li>
|
||||||
|
<li>Creative Commons licenses can optionally be added to items
|
||||||
|
during web submission process</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<H3>Administration</H3>
|
||||||
|
<ul>
|
||||||
|
<li>If you are logged in as administrator, you see admin buttons on
|
||||||
|
item, collection, and community pages</li>
|
||||||
|
<li>New collection administration wizard</li>
|
||||||
|
<li>Can now administer collection's submitters from collection admin
|
||||||
|
tool</li>
|
||||||
|
<li>Delegated administration - new 'collection editor' role - edits
|
||||||
|
item metadata, manages submitters list, edits collection metadata, links
|
||||||
|
to items from other collections, and can withdraw items</li>
|
||||||
|
<li>Admin UI moved from /admin to /dspace-admin to avoid conflict
|
||||||
|
with Tomcat /admin JSPs</li>
|
||||||
|
<li>New EPerson selector popup makes Group editing much easier</li>
|
||||||
|
<li>'News' section is now editable using admin UI (no more mucking
|
||||||
|
with JSPs)</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<H3>Import/Export/OAI</H3>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>New tool that exports DSpace content in AIPs that use METS XML
|
||||||
|
for metadata (incomplete)</li>
|
||||||
|
<li>OAI - sets are now collections, identified by Handles ('safe'
|
||||||
|
with /, : converted to _)</li>
|
||||||
|
<li>OAI - contributor.author now mapped to oai_dc:creator</li>
|
||||||
|
</ul>
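<p>The OAI set naming rule in the list above ('/' and ':' converted to '_') can be illustrated with a one-line conversion. This is a sketch of the stated rule only: the class <code>OaiSetSpecSketch</code> and method <code>fromHandle</code> are hypothetical names, and whether the input carries an <code>hdl:</code> prefix is an assumption of this example.</p>

```java
// Hypothetical sketch of turning a collection Handle into a 'safe' OAI set name.
final class OaiSetSpecSketch {

    /** Replaces '/' and ':' with '_' as described in the change list. */
    static String fromHandle(String handle) {
        return handle.replace('/', '_').replace(':', '_');
    }
}
```

<p>Under these assumptions, <code>fromHandle("hdl:123.456/789")</code> yields <code>hdl_123.456_789</code>.</p>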
|
||||||
|
|
||||||
|
<H3>Miscellaneous</H3>
|
||||||
|
<ul>
|
||||||
|
<li>Build process streamlined with use of WAR files, symbolic links
|
||||||
|
no longer used, friendlier to later versions of Tomcat</li>
|
||||||
|
<li>MIT-specific aspects of UI removed to avoid confusion</li>
|
||||||
|
<li>Item metadata now rendered to avoid interpreting as HTML
|
||||||
|
(displays as entered)</li>
|
||||||
|
<li>Forms now have no-cache directive to avoid trouble with browser
|
||||||
|
'back' button</li>
|
||||||
|
<li>Bundles now have 'names' for more structure in item's content</li>
|
||||||
|
</ul>
|
||||||
|
<div id="jsp-file-changes">
|
||||||
|
<h3><a name="jsp-file-changes">JSP file changes between 1.1 and 1.2</a></h3>
|
||||||
|
<p>This list was generated with <code>cvs -Q rdiff -s -r dspace-1_1 dspace</code>
|
||||||
|
and a sprinkling of perl.</p>
|
||||||
|
<ul>
|
||||||
|
<li>Changed: dspace/jsp/collection-home.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/community-home.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/community-list.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/display-item.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/index.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/home.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/styles.css.jsp </li>
|
||||||
|
<li>Moved to dspace-admin and changed: dspace/jsp/admin/authorize-advanced.jsp </li>
|
||||||
|
<li>Moved to dspace-admin and changed: dspace/jsp/admin/authorize-collection-edit.jsp </li>
|
||||||
|
<li>Moved to dspace-admin and changed: dspace/jsp/admin/authorize-community-edit.jsp </li>
|
||||||
|
<li>Moved to dspace-admin and changed: dspace/jsp/admin/authorize-item-edit.jsp </li>
|
||||||
|
<li>Moved to dspace-admin and changed: dspace/jsp/admin/authorize-main.jsp </li>
|
||||||
|
<li>Moved to dspace-admin and changed: dspace/jsp/admin/authorize-policy-edit.jsp </li>
|
||||||
|
<li>Moved to dspace-admin: dspace/jsp/admin/collection-select.jsp </li>
|
||||||
|
<li>Moved to dspace-admin: dspace/jsp/admin/community-select.jsp </li>
|
||||||
|
<li>Moved to dspace-admin: dspace/jsp/admin/confirm-delete-collection.jsp </li>
|
||||||
|
<li>Moved to dspace-admin: dspace/jsp/admin/confirm-delete-community.jsp </li>
|
||||||
|
<li>Moved to dspace-admin: dspace/jsp/admin/confirm-delete-dctype.jsp </li>
|
||||||
|
<li>Moved to dspace-admin: dspace/jsp/admin/confirm-delete-eperson.jsp </li>
|
||||||
|
<li>Moved to dspace-admin: dspace/jsp/admin/confirm-delete-format.jsp </li>
|
||||||
|
<li>Moved to dspace/jsp/tools: dspace/jsp/admin/confirm-delete-item.jsp </li>
|
||||||
|
<li>Moved to dspace/jsp/tools: dspace/jsp/admin/confirm-withdraw-item.jsp </li>
|
||||||
|
<li>Moved to dspace-admin and changed: dspace/jsp/admin/edit-collection.jsp </li>
|
||||||
|
<li>Moved to dspace-admin and changed: dspace/jsp/admin/edit-community.jsp </li>
|
||||||
|
<li>Moved to dspace/jsp/tools and changed: dspace/jsp/admin/edit-item-form.jsp </li>
|
||||||
|
<li>Moved to dspace-admin and changed: dspace/jsp/admin/eperson-browse.jsp </li>
|
||||||
|
<li>Moved to dspace-admin: dspace/jsp/admin/eperson-confirm-delete.jsp </li>
|
||||||
|
<li>Moved to dspace-admin and changed: dspace/jsp/admin/eperson-edit.jsp </li>
|
||||||
|
<li>Moved to dspace-admin and changed: dspace/jsp/admin/eperson-main.jsp </li>
|
||||||
|
<li>Moved to dspace/jsp/tools and changed: dspace/jsp/admin/get-item-id.jsp </li>
|
||||||
|
<li>Moved to dspace/jsp/tools and changed: dspace/jsp/admin/group-edit.jsp </li>
|
||||||
|
<li>Moved to dspace-admin and changed: dspace/jsp/admin/group-eperson-select.jsp </li>
|
||||||
|
<li>Moved to dspace/jsp/tools and changed: dspace/jsp/admin/group-list.jsp </li>
|
||||||
|
<li>Moved to dspace-admin: dspace/jsp/admin/index.jsp </li>
|
||||||
|
<li>Moved to dspace-admin and changed: dspace/jsp/admin/item-select.jsp </li>
|
||||||
|
<li>Moved to dspace-admin and changed: dspace/jsp/admin/list-communities.jsp </li>
|
||||||
|
<li>Moved to dspace-admin and changed: dspace/jsp/admin/list-dc-types.jsp </li>
|
||||||
|
<li>Removed: dspace/jsp/admin/list-epeople.jsp </li>
|
||||||
|
<li>Moved to dspace-admin and changed: dspace/jsp/admin/list-formats.jsp </li>
|
||||||
|
<li>Moved to dspace/jsp/tools: dspace/jsp/admin/upload-bitstream.jsp </li>
|
||||||
|
<li>Moved to dspace-admin and changed: dspace/jsp/admin/upload-logo.jsp </li>
|
||||||
|
<li>Moved to dspace-admin: dspace/jsp/admin/workflow-abort-confirm.jsp </li>
|
||||||
|
<li>Moved to dspace-admin and changed: dspace/jsp/admin/workflow-list.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/browse/authors.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/browse/items-by-author.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/browse/items-by-date.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/browse/no-results.jsp </li>
|
||||||
|
<li>New: dspace-admin/eperson-deletion-error.jsp </li>
|
||||||
|
<li>New: dspace/jsp/dspace-admin/news-edit.jsp </li>
|
||||||
|
<li>New: dspace/jsp/dspace-admin/news-main.jsp </li>
|
||||||
|
<li>New: dspace/jsp/dspace-admin/wizard-basicinfo.jsp </li>
|
||||||
|
<li>New: dspace/jsp/dspace-admin/wizard-default-item.jsp </li>
|
||||||
|
<li>New: dspace/jsp/dspace-admin/wizard-permissions.jsp </li>
|
||||||
|
<li>New: dspace/jsp/dspace-admin/wizard-questions.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/components/contact-info.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/error/internal.jsp </li>
|
||||||
|
<li>New: dspace/jsp/help/formats.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/layout/footer-default.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/layout/header-default.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/layout/navbar-admin.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/layout/navbar-default.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/login/password.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/mydspace/main.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/mydspace/perform-task.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/mydspace/preview-task.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/mydspace/reject-reason.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/mydspace/remove-item.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/register/edit-profile.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/register/inactive-account.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/register/new-password.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/register/registration-form.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/search/advanced.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/search/results.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/submit/cancel.jsp </li>
|
||||||
|
<li>New: dspace/jsp/submit/cc-license.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/submit/choose-file.jsp </li>
|
||||||
|
<li>New: dspace/jsp/submit/creative-commons.css </li>
|
||||||
|
<li>New: dspace/jsp/submit/creative-commons.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/submit/edit-metadata-1.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/submit/edit-metadata-2.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/submit/get-file-format.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/submit/initial-questions.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/submit/progressbar.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/submit/review.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/submit/select-collection.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/submit/show-license.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/submit/show-uploaded-file.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/submit/upload-error.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/submit/upload-file-list.jsp </li>
|
||||||
|
<li>Changed: dspace/jsp/submit/verify-prune.jsp </li>
|
||||||
|
<li>New: dspace/jsp/tools/edit-item-form.jsp </li>
|
||||||
|
<li>New: dspace/jsp/tools/eperson-list.jsp </li>
|
||||||
|
<li>New: dspace/jsp/tools/itemmap-browse.jsp </li>
|
||||||
|
<li>New: dspace/jsp/tools/itemmap-info.jsp </li>
|
||||||
|
<li>New: dspace/jsp/tools/itemmap-main.jsp </li>
|
||||||
|
|
||||||
|
</ul>
|
||||||
|
</div>
|
||||||
|
<div id="java-file-changes">
|
||||||
|
<h3><a name="java-file-changes">Java file changes between 1.1 and 1.2</a></h3>
|
||||||
|
<p>This list was generated with <code>cvs -Q rdiff -s -r dspace-1_1 dspace</code>
|
||||||
|
and a sprinkling of perl.</p>
|
||||||
|
<ul>
|
||||||
|
<li>New: dspace/src/org/dspace/administer/CommunityFiliator.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/administer/CreateAdministrator.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/administer/DCType.java </li>
|
||||||
|
<li>New: dspace/src/org/dspace/administer/Upgrade11To12.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/itemexport/ItemExport.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/itemimport/ItemImport.java </li>
|
||||||
|
<li>New: dspace/src/org/dspace/app/mediafilter/HTMLFilter.java </li>
|
||||||
|
<li>New: dspace/src/org/dspace/app/mediafilter/JPEGFilter.java </li>
|
||||||
|
<li>New: dspace/src/org/dspace/app/mediafilter/MediaFilter.java </li>
|
||||||
|
<li>New: dspace/src/org/dspace/app/mediafilter/MediaFilterManager.java </li>
|
||||||
|
<li>New: dspace/src/org/dspace/app/mediafilter/PDFFilter.java </li>
|
||||||
|
<li>New: dspace/src/org/dspace/app/mediafilter/WordFilter.java </li>
|
||||||
|
<li>New: dspace/src/org/dspace/app/mets/METSExport.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/oai/DSpaceOAICatalog.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/oai/DSpaceRecordFactory.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/oai/OAIDCCrosswalk.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/filter/AdminOnlyFilter.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/filter/RegisteredOnlyFilter.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/jsptag/IncludeTag.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/jsptag/ItemListTag.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/jsptag/ItemTag.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/jsptag/LayoutTag.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/jsptag/PopupTag.java </li>
|
||||||
|
<li>New: dspace/src/org/dspace/app/webui/jsptag/SelectEPersonTag.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/servlet/AdvancedSearchServlet.java </li>
|
||||||
|
<li>New: dspace/src/org/dspace/app/webui/servlet/BitstreamServlet.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/servlet/CommunityListServlet.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/servlet/HandleServlet.java </li>
|
||||||
|
<li>New: dspace/src/org/dspace/app/webui/servlet/HTMLServlet.java </li>
|
||||||
|
<li>New: dspace/src/org/dspace/app/webui/servlet/LoadDSpaceConfig.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/servlet/RegisterServlet.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/servlet/SimpleSearchServlet.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/servlet/SubmitServlet.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/servlet/admin/AuthorizeAdminServlet.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/servlet/admin/BitstreamFormatRegistry.java </li>
|
||||||
|
<li>New: dspace/src/org/dspace/app/webui/servlet/admin/CollectionWizardServlet.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/servlet/admin/DCTypeRegistryServlet.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/servlet/admin/EditCommunitiesServlet.java </li>
|
||||||
|
<li>Removed: dspace/src/org/dspace/app/webui/servlet/admin/EditEPersonServlet.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/servlet/admin/EditItemServlet.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/servlet/admin/EPersonAdminServlet.java </li>
|
||||||
|
<li>New: dspace/src/org/dspace/app/webui/servlet/admin/EPersonListServlet.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/servlet/admin/GroupEditServlet.java </li>
|
||||||
|
<li>New: dspace/src/org/dspace/app/webui/servlet/admin/ItemMapServlet.java </li>
|
||||||
|
<li>New: dspace/src/org/dspace/app/webui/servlet/admin/NewsEditServlet.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/servlet/admin/WorkflowAbortServlet.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/util/Authenticate.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/util/FileUploadRequest.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/util/JSPManager.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/app/webui/util/UIUtil.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/authorize/AuthorizeManager.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/authorize/FixDefaultPolicies.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/authorize/PolicySet.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/authorize/ResourcePolicy.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/browse/Browse.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/content/Bitstream.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/content/Bundle.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/content/Collection.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/content/Community.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/content/DSpaceObject.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/content/InProgressSubmission.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/content/InstallItem.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/content/Item.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/content/WorkspaceItem.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/core/ConfigurationManager.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/core/Constants.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/core/Utils.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/eperson/EPerson.java </li>
|
||||||
|
<li>New: dspace/src/org/dspace/eperson/EPersonDeletionException.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/eperson/Group.java </li>
|
||||||
|
<li>New: dspace/src/org/dspace/license/CreativeCommons.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/search/DSIndexer.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/search/DSQuery.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/search/Harvest.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/search/HarvestedItemInfo.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/search/QueryArgs.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/search/QueryResults.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/storage/rdbms/DatabaseManager.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/storage/rdbms/TableRow.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/workflow/WorkflowItem.java </li>
|
||||||
|
<li>Changed: dspace/src/org/dspace/workflow/WorkflowManager.java </li>
|
||||||
|
</ul>
|
||||||
|
</div>
|
||||||
|
<h2><a name="version1_1_1">Changes in DSpace 1.1.1</a></h2>
|
||||||
|
<h3>Bug fixes</h3>
|
||||||
|
<ul>
|
||||||
|
<li>non-administrators can now submit again</li>
|
||||||
|
<li>installations now preserve file creation dates, eliminating
|
||||||
|
confusion with upgrades</li>
|
||||||
|
<li>authorization editing pages no longer create null entries in
|
||||||
|
the database, and no longer handle existing null entries poorly (no more blank page
|
||||||
|
instead of displaying policies.)</li>
|
||||||
|
<li>registration page: an 'Invalid token' error page is now displayed when an
|
||||||
|
invalid token is received (as opposed to an internal server error). Fixes
|
||||||
|
SF bug #739999</li>
|
||||||
|
<li>eperson admin: 'recent submission' links fixed for DSpace instances
|
||||||
|
deployed somewhere other than at / (e.g. /dspace).</li>
|
||||||
|
<li>help pages: the link to the help pages now includes the servlet context (e.g.
|
||||||
|
'/dspace'). Fixes SF bug #738399.</li>
|
||||||
|
</ul>
|
||||||
|
<h3>Improvements</h3>
|
||||||
|
<ul>
|
||||||
|
<li><code>bin/dspace-info.pl</code> now checks jsp and asset store
|
||||||
|
files for zero-length files</li>
|
||||||
|
<li><code>make-release-package</code> now works with SourceForge CVS</li>
|
||||||
|
<li>eperson editor no longer displays the spurious text 'null'</li>
|
||||||
|
<li>item exporter now uses the Jakarta Commons CLI command-line argument parser
|
||||||
|
(much cleaner)</li>
|
||||||
|
<li>item importer improvements:
|
||||||
|
<ul>
|
||||||
|
<li>now uses the Jakarta Commons CLI command-line argument parser (much cleaner)</li>
|
||||||
|
<li>imported items can now be routed through a workflow</li>
|
||||||
|
<li>more validation and error messages before import</li>
|
||||||
|
<li>can now use email addresses and handles instead of just
|
||||||
|
database IDs</li>
|
||||||
|
<li>can import an item to a collection with the workflow
|
||||||
|
suppressed</li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
</ul>
|
||||||
|
<h2><a name="version1_1">Changes in DSpace 1.1</a></h2>
|
||||||
|
<ul>
|
||||||
|
<li>Fixed various OAI-related bugs; DSpace's OAI support should now
|
||||||
|
be correct. Note that harvesting is now based on the new Item 'last
|
||||||
|
modified' date (as opposed to the Dublin Core <code>date.available</code>
|
||||||
|
date.)</li>
|
||||||
|
<li>Fixed Handle support--DSpace now responds to naming authority
|
||||||
|
requests correctly.</li>
|
||||||
|
<li>Multiple bitstream stores can now be specified; this allows
|
||||||
|
DSpace storage to span several disks, and so there is no longer a hard
|
||||||
|
limit on storage.</li>
|
||||||
|
<li>Search improvements:
|
||||||
|
<ul>
|
||||||
|
<li>New fielded searching UI</li>
|
||||||
|
<li>Search results are now paged</li>
|
||||||
|
<li>Abstracts are indexed</li>
|
||||||
|
<li>Better use of Lucene API; should stop the number of open file
|
||||||
|
handles getting large</li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
<li>Submission UI improvements:
|
||||||
|
<ul>
|
||||||
|
<li>now insists on a title being specified</li>
|
||||||
|
<li>fixed navigation on file upload page</li>
|
||||||
|
<li>citation &amp; identifier fields for previously published
|
||||||
|
submissions now fixed</li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
<li>Many Unicode fixes to the database and Web user interface</li>
|
||||||
|
<li>Collections can now be deleted</li>
|
||||||
|
<li>Bitstream descriptions (if available) displayed on item display
|
||||||
|
page</li>
|
||||||
|
<li>Modified a couple of servlets to handle invalid parameters better
|
||||||
|
(i.e. to report a suitable error message instead of an internal server
|
||||||
|
error)</li>
|
||||||
|
<li>Item templates now work</li>
|
||||||
|
<li>Fixed registration token expiration problem (they no longer
|
||||||
|
expire.)</li>
|
||||||
|
</ul>
|
||||||
|
<hr>
|
||||||
|
<address> Copyright © 2002-2004 MIT and Hewlett Packard </address>
|
||||||
|
</body>
|
||||||
|
</html>
|
BIN
dspace/docs/image/architecture-600x450.gif
Normal file
Binary file not shown.
After Width: | Height: | Size: 23 KiB |
BIN
dspace/docs/image/data-model.gif
Normal file
Binary file not shown.
After Width: | Height: | Size: 13 KiB |
BIN
dspace/docs/image/ingest.gif
Normal file
Binary file not shown.
After Width: | Height: | Size: 10 KiB |
BIN
dspace/docs/image/web-ui-flow.gif
Normal file
Binary file not shown.
After Width: | Height: | Size: 29 KiB |
BIN
dspace/docs/image/workflow.gif
Normal file
Binary file not shown.
After Width: | Height: | Size: 8.1 KiB |
135
dspace/docs/index.html
Normal file
@@ -0,0 +1,135 @@
|
|||||||
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">
|
||||||
|
<HTML>
|
||||||
|
<HEAD>
|
||||||
|
<TITLE>DSpace System Documentation: Contents</TITLE>
|
||||||
|
<LINK REL=StyleSheet HREF="style.css" TYPE="text/css">
|
||||||
|
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" >
|
||||||
|
</HEAD>
|
||||||
|
<BODY>
|
||||||
|
<H1>DSpace System Documentation: Contents</H1>
|
||||||
|
|
||||||
|
<P>Authors: Robert Tansley, Mick Bass, Margret Branschofsky, Grace Carpenter, Greg McClellan, David Stuve</P>
|
||||||
|
|
||||||
|
<P>DSpace Version: <strong>1.3beta1</strong> (28-Jun-2005)</P>
|
||||||
|
<P>Documentation Version: <strong>1.3beta1</strong> (28-Jun-2005)</P>
|
||||||
|
<P>Documentation for other versions of DSpace may be <A HREF="http://sourceforge.net/project/showfiles.php?group_id=19984&amp;package_id=64664">downloaded from SourceForge</A></P>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI><A HREF="introduction.html">Introduction</A></LI>
|
||||||
|
<LI><A HREF="functional.html">Functional Overview</A>
|
||||||
|
<UL>
|
||||||
|
<LI><A HREF="functional.html#data_model">Data Model</A></LI>
|
||||||
|
<LI><A HREF="functional.html#metadata">Metadata</A></LI>
|
||||||
|
<LI><A HREF="functional.html#epeople">E-people</A></LI>
|
||||||
|
<LI><A HREF="functional.html#auth">Authorization</A></LI>
|
||||||
|
<LI><A HREF="functional.html#ingest">Ingest Process and Workflow</A></LI>
|
||||||
|
<LI><A HREF="functional.html#handles">Handles</A></LI>
|
||||||
|
<LI><A HREF="functional.html#bitstream_ids">Bitstream 'Persistent' Identifiers</A></LI>
|
||||||
|
<li><a href="functional.html#srb">Storage Resource Broker (SRB) Support</a></li>
|
||||||
|
<LI><A HREF="functional.html#search_browse">Search and Browse</A></LI>
|
||||||
|
<LI><A HREF="functional.html#html">HTML Support</A></LI>
|
||||||
|
<LI><A HREF="functional.html#oai">OAI Support</A></LI>
|
||||||
|
<LI><A HREF="functional.html#openurl">OpenURL Support</A></LI>
|
||||||
|
<LI><A HREF="functional.html#creativecommons">Creative Commons Support</A></LI>
|
||||||
|
<LI><A HREF="functional.html#subscriptions">Subscriptions</A></LI>
|
||||||
|
<LI><A HREF="functional.html#history">History</A></LI>
|
||||||
|
<LI><A HREF="functional.html#history">Import and Export</A></LI>
|
||||||
|
<li><a href="functional.html#registration">Registration</a></li>
|
||||||
|
</UL>
|
||||||
|
</LI>
|
||||||
|
<LI><A HREF="install.html">Installation</A>
|
||||||
|
<UL>
|
||||||
|
<LI><A HREF="install.html#prerequisite">Prerequisite Software</A></LI>
|
||||||
|
<LI><A HREF="install.html#installsteps">Quick Installation Steps</A></LI>
|
||||||
|
<LI><A HREF="install.html#advancedinstall">Advanced Installation</A></LI>
|
||||||
|
<LI><A HREF="install.html#knownbugs">Known Bugs</A></LI>
|
||||||
|
<LI><A HREF="install.html#problems">Common Problems</A></LI>
|
||||||
|
</UL>
|
||||||
|
</LI>
|
||||||
|
<LI><A HREF="update.html">Updating a DSpace Installation</A>
|
||||||
|
<UL>
|
||||||
|
<LI><A HREF="update.html#12_13">Updating From 1.2.x to 1.3</A></LI>
|
||||||
|
<LI><A HREF="update.html#121_122">Updating From 1.2.1 to 1.2.2</A></LI>
|
||||||
|
<LI><A HREF="update.html#12_121">Updating From 1.2 to 1.2.1</A></LI>
|
||||||
|
<LI><A HREF="update.html#11_12">Updating From 1.1 (or 1.1.1) to 1.2</A></LI>
|
||||||
|
<LI><A HREF="update.html#11_111">Updating From 1.1 to 1.1.1</A></LI>
|
||||||
|
<LI><A HREF="update.html#101_11">Updating From 1.0.1 to 1.1</A></LI>
|
||||||
|
</UL>
|
||||||
|
</LI>
|
||||||
|
<LI><A HREF="configure.html">Configuration and Customization</A>
|
||||||
|
<UL>
|
||||||
|
<LI><A HREF="configure.html#dspacecfg">The <code>dspace.cfg</code> Configuration Properties File</A></LI>
|
||||||
|
<LI><A HREF="configure.html#email">Wording of E-mail Messages</A></LI>
|
||||||
|
<LI><A HREF="configure.html#registries">The Dublin Core and Bitstream Format Registries</A></LI>
|
||||||
|
<LI><A HREF="configure.html#templates">Configuration Files for Other Applications</A></LI>
|
||||||
|
<LI><A HREF="configure.html#customui">Customizing the Web User Interface</A></LI>
|
||||||
|
<LI><A HREF="submission.html">Customizing Submission Metadata Entry</A></LI>
|
||||||
|
<LI><A HREF="configure.html#authenticate">Custom Authentication Code</A></LI>
|
||||||
|
<LI><A HREF="configure.html#ldap">LDAP Authentication</A></LI>
|
||||||
|
<LI><A HREF="configure.html#webuithumbs">Displaying Image Thumbnails</A></LI>
|
||||||
|
<LI><A HREF="configure.html#strengths">Displaying Item Counts</A></LI>
|
||||||
|
<LI><A HREF="configure.html#statistics">System Statistical Reports</A></LI>
|
||||||
|
<li><a href="configure.html#i18n">Internationalisation</a></li>
|
||||||
|
</UL>
|
||||||
|
</LI>
|
||||||
|
<LI><A HREF="directories.html">Directories and Files</A>
|
||||||
|
<UL>
|
||||||
|
<LI><A HREF="directories.html#sourcedir">Source Directory Layout</A></LI>
|
||||||
|
<LI><A HREF="directories.html#installdir">Installed Directory Layout</A></LI>
|
||||||
|
<LI><A HREF="directories.html#logfiles">Log Files</A></LI>
|
||||||
|
</UL>
|
||||||
|
</LI>
|
||||||
|
<LI><A HREF="architecture.html">Architecture</A>
|
||||||
|
<UL>
|
||||||
|
<LI><A HREF="architecture.html#overview">Overview</A></LI>
|
||||||
|
<LI><A HREF="storage.html">Storage Layer</A>
|
||||||
|
<UL>
|
||||||
|
<LI><A HREF="storage.html#rdbms">RDBMS</A></LI>
|
||||||
|
<LI><A HREF="storage.html#bitstreams">Bitstream Store</A></LI>
|
||||||
|
</UL>
|
||||||
|
</LI>
|
||||||
|
<LI><A HREF="business.html">Business Logic Layer</A>
|
||||||
|
<UL>
|
||||||
|
<LI><A HREF="business.html#core">Core Classes</A></LI>
|
||||||
|
<LI><A HREF="business.html#content">Content Management API</A></LI>
|
||||||
|
<LI><A HREF="business.html#workflow">Workflow System</A></LI>
|
||||||
|
<LI><A HREF="business.html#administer">Administration Toolkit</A></LI>
|
||||||
|
<LI><A HREF="business.html#eperson">E-person/Group Manager</A></LI>
|
||||||
|
<LI><A HREF="business.html#authorize">Authorization</A></LI>
|
||||||
|
<LI><A HREF="business.html#handle">Handle Manager/Handle Plugin</A></LI>
|
||||||
|
<LI><A HREF="business.html#search">Search</A></LI>
|
||||||
|
<LI><A HREF="business.html#browse">Browse API</A></LI>
|
||||||
|
<LI><A HREF="business.html#history">History Recorder</A></LI>
|
||||||
|
</UL>
|
||||||
|
</LI>
|
||||||
|
<LI><A HREF="application.html">Application Layer</A>
|
||||||
|
<UL>
|
||||||
|
<LI><A HREF="application.html#webui">Web User Interface</A></LI>
|
||||||
|
<LI><A HREF="application.html#oai">OAI-PMH Data Provider</A></LI>
|
||||||
|
<LI><A HREF="application.html#itemimporter">Item Importer and Exporter</A></LI>
|
||||||
|
<LI><A HREF="application.html#transferitem">Transferring Items Between DSpace Instances</A></LI>
|
||||||
|
<LI><A HREF="application.html#mets">METS Tools</A></LI>
|
||||||
|
<LI><A HREF="application.html#mediafilters">Media Filters</A></LI>
|
||||||
|
<LI><A HREF="application.html#filiator">Sub-Community Management</A></LI>
|
||||||
|
</UL>
|
||||||
|
</LI>
|
||||||
|
</UL>
|
||||||
|
</LI>
|
||||||
|
<LI><A HREF="history.html">Version History</A>
|
||||||
|
<UL>
|
||||||
|
<LI><A HREF="history.html#version1_3">Changes in DSpace 1.3</A></LI>
|
||||||
|
<LI><A HREF="history.html#version1_2_2">Changes in DSpace 1.2.2</A></LI>
|
||||||
|
<LI><A HREF="history.html#version1_2_1">Changes in DSpace 1.2.1</A></LI>
|
||||||
|
<LI><A HREF="history.html#version1_2">Changes in DSpace 1.2</A></LI>
|
||||||
|
<LI><A HREF="history.html#version1_1_1">Changes in DSpace 1.1.1</A></LI>
|
||||||
|
<LI><A HREF="history.html#version1_1">Changes in DSpace 1.1</A></LI>
|
||||||
|
</UL>
|
||||||
|
</LI>
|
||||||
|
</UL>
|
||||||
|
<HR>
|
||||||
|
|
||||||
|
<ADDRESS>
|
||||||
|
Copyright © 2002-2005 MIT and Hewlett Packard
|
||||||
|
</ADDRESS>
|
||||||
|
</BODY>
|
||||||
|
</HTML>
|
336
dspace/docs/install.html
Normal file
@@ -0,0 +1,336 @@
|
|||||||
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">
|
||||||
|
<HTML>
|
||||||
|
<HEAD>
|
||||||
|
<TITLE>DSpace System Documentation: Installation</TITLE>
|
||||||
|
<LINK REL=StyleSheet HREF="style.css" TYPE="text/css">
|
||||||
|
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" >
|
||||||
|
</HEAD>
|
||||||
|
<BODY>
|
||||||
|
<H1>DSpace System Documentation: Installation</H1>
|
||||||
|
|
||||||
|
<P><A HREF="index.html">Back to contents</A></P>
|
||||||
|
|
||||||
|
|
||||||
|
<H2><A NAME="prerequisite">Prerequisites</A></H2>
|
||||||
|
|
||||||
|
<P>The list below describes the third-party components and tools you'll need to run a DSpace server. These are simply recommendations based on our setup at MIT; since DSpace is built on open source, standards-based tools, there are numerous other possibilities and setups.</P>
|
||||||
|
|
||||||
|
<P>Also, please note that the configuration and installation guidelines relating to a particular tool below are here for convenience. You should refer to the documentation for each individual component for complete and up-to-date details. Many of the tools are updated on a frequent basis, and the guidelines below may become out of date.</P>
|
||||||
|
|
||||||
|
<ol>
|
||||||
|
<li><P>UNIX-like OS (Linux, HP-UX etc.)</P></li>
|
||||||
|
|
||||||
|
<li><P><A HREF="http://java.sun.com/">Java 1.4</A> or later (standard SDK is fine, you don't need J2EE)</P></li>
|
||||||
|
|
||||||
|
<li><P><A HREF="http://jakarta.apache.org/ant/index.html">Apache Ant 1.5</A> or later (Java make-like tool)</P></li>
|
||||||
|
|
||||||
|
<li>
|
||||||
|
<P><A HREF="http://www.postgresql.org/">PostgreSQL 7.3</A> or later, an open source relational database.</P>
|
||||||
|
|
||||||
|
<P>Be sure to compile with the following options to the '<code>configure</code>' script:</P>
|
||||||
|
|
||||||
|
<PRE>--enable-multibyte --enable-unicode --with-java</PRE>
|
||||||
|
|
||||||
|
<P><A NAME="enabletcpip"></a>Once installed, you need to enable TCP/IP connections (DSpace uses JDBC). Edit <code>postgresql.conf</code> (usually in <code>/usr/local/pgsql/data</code> or <code>/var/lib/pgsql/data</code>), and add this line:</P>
|
||||||
|
|
||||||
|
<PRE>tcpip_socket = true</PRE>
|
||||||
|
|
||||||
|
<P>Then tighten up security a bit by editing <code>pg_hba.conf</code> and adding this line:</P>
|
||||||
|
|
||||||
|
<PRE>host dspace dspace 127.0.0.1 255.255.255.255 md5</PRE>
|
||||||
|
|
||||||
|
<P>Then restart PostgreSQL.</P>
|
||||||
|
</li>
|
||||||
|
|
||||||
|
<li><P><A HREF="http://jakarta.apache.org/tomcat/">Jakarta Tomcat 4.x/5.x</A> or equivalent, such as <A HREF="http://www.mortbay.org/jetty/index.html">Jetty</A> or <A HREF="http://www.caucho.com/">Caucho Resin</A>.</P>
|
||||||
|
|
||||||
|
<P>Note that DSpace will need to run as the same user as Tomcat, so you might want to install and run Tomcat as a user called '<code>dspace</code>'.</P>
|
||||||
|
|
||||||
|
<P>You need to ensure that Tomcat a) has enough memory to run DSpace and b) uses UTF-8 as its default file encoding for international character support. Make sure that your startup scripts (or equivalent) set the following environment variable:</P>
|
||||||
|
|
||||||
|
<PRE>JAVA_OPTS="-Xmx512M -Xms64M -Dfile.encoding=UTF-8"</PRE>
|
||||||
|
|
||||||
|
<P>You also need to alter Tomcat's default configuration so that searching and browsing of multi-byte UTF-8 work correctly. Add the following configuration option to the <code>&lt;Connector&gt;</code> element in <code><i>[tomcat]</i>/conf/server.xml</code>:</P>
|
||||||
|
|
||||||
|
<PRE>URIEncoding="UTF-8"</PRE>
|
||||||
|
|
||||||
|
<P>e.g. if you're using the default Tomcat config, it should read:</P>
|
||||||
|
|
||||||
|
<PRE>&lt;!-- Define a non-SSL HTTP/1.1 Connector on port 8080 --&gt;
|
||||||
|
&lt;Connector port="8080"
|
||||||
|
maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
|
||||||
|
enableLookups="false" redirectPort="8443" acceptCount="100"
|
||||||
|
connectionTimeout="20000" disableUploadTimeout="true"
|
||||||
|
<strong>URIEncoding="UTF-8"</strong> /&gt;</PRE>
|
||||||
|
|
||||||
|
<P>Jetty and Resin are configured for correct handling of UTF-8 by default.</P>
|
||||||
|
</li>
|
||||||
|
</ol>
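<P>As an illustration of the Tomcat memory and encoding settings above, a startup wrapper might set <code>JAVA_OPTS</code> before launching Tomcat. This is only a sketch: it assumes Tomcat is installed under <code>$CATALINA_HOME</code> and is started with its standard <code>bin/startup.sh</code> script. The wrapper itself is hypothetical and not part of DSpace.</P>

<PRE># Hypothetical wrapper script, run as the user Tomcat runs as (e.g. dspace)
# Give the JVM enough memory and make UTF-8 the default file encoding
JAVA_OPTS="-Xmx512M -Xms64M -Dfile.encoding=UTF-8"
export JAVA_OPTS
# Assumes CATALINA_HOME points at your Tomcat installation
"$CATALINA_HOME"/bin/startup.sh</PRE>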
|
||||||
|
|
||||||
|
|
||||||
|
<H2><A NAME="installsteps">Quick Installation Steps</A></H2>
|
||||||
|
|
||||||
|
<p><strong>But First, a Word on Directories and Path Names</strong></p>
|
||||||
|
|
||||||
|
<p>DSpace uses three separate directory trees. Although you don't need to know all the details
|
||||||
|
of them in order to install DSpace, you do need to know they exist and also know how they're referred to in this document:</p>
|
||||||
|
<ul>
|
||||||
|
<li>the source directory, referred to as <i><code>[dspace-source]</code></i></li>
|
||||||
|
<li>the install directory, referred to as <i><code>[dspace]</code></i></li>
|
||||||
|
<li>the web deployment directory. If you're using Tomcat, this will be <code><i>[tomcat]</i>/webapps/dspace</code> (with <code><i>[tomcat]</i></code> being wherever
|
||||||
|
you installed Tomcat--also known as $CATALINA_HOME). This directory is generated by the web server when it unpacks dspace.war, and should never be edited.</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<p>For details on the contents of these separate directory trees, refer to
|
||||||
|
<a href="directories.html">directories.html</a>.
|
||||||
|
<strong>Note that the source directory and install directory should always be separate!</strong></p>
|
||||||
|
|
||||||
|
<ol>
|
||||||
|
<li>
|
||||||
|
<P>Create the DSpace user. This needs to be the same user that Tomcat (or Jetty etc.) will run as. For example, as root run:</P>
|
||||||
|
|
||||||
|
<PRE>useradd -m dspace</PRE>
|
||||||
|
</li>
|
||||||
|
|
||||||
|
<li>
|
||||||
|
<P>Download the <A HREF="http://sourceforge.net/projects/dspace/">latest DSpace source code release</A> and unpack it:</P>
|
||||||
|
|
||||||
|
<PRE>gunzip -c dspace-source-1.x.tar.gz | tar -xf -</PRE>
|
||||||
|
|
||||||
|
</li>
|
||||||
|
|
||||||
|
<li>
|
||||||
|
<P><A NAME="pgdriver"></A>Copy the PostgreSQL JDBC driver (<code>.jar</code> file) into
|
||||||
|
<code><i>[dspace-source]</i>/lib</code>. If you compiled PostgreSQL yourself, it'll be in <code>postgresql-7.x.x/src/interfaces/jdbc/jars/postgresql.jar</code>. Alternatively you can download it directly from <A HREF="http://jdbc.postgresql.org/download.html">the PostgreSQL JDBC site</A>. Make sure you get the driver for the version of PostgreSQL you're running and for JDBC2.</P>
|
||||||
|
</li>
|
||||||
|
|
||||||
|
<li>
|
||||||
|
<P>Create a <code>dspace</code> database, owned by the <code>dspace</code> PostgreSQL user:</P>
|
||||||
|
|
||||||
|
<PRE>createuser -U postgres -d -A -P dspace ; createdb -U dspace -E UNICODE dspace</PRE>
|
||||||
|
|
||||||
|
<P>Enter a password for the DSpace database. (This isn't the same as the <code>dspace</code> user's UNIX password.)</P>
|
||||||
|
</li>
|
||||||
|
|
||||||
|
<li>
|
||||||
|
<P>Edit <code><i>[dspace-source]</i>/config/dspace.cfg</code>. In particular, you'll need to set these properties:</P>
|
||||||
|
|
||||||
|
<PRE>dspace.url
|
||||||
|
dspace.hostname
|
||||||
|
dspace.name
|
||||||
|
db.password (the password you entered in the previous step)
|
||||||
|
mail.server
|
||||||
|
mail.from.address
|
||||||
|
feedback.recipient
|
||||||
|
mail.admin
|
||||||
|
alert.recipient (not essential but very useful!)</PRE>
|
||||||
|
|
||||||
|
<P>Note that if you change <code>dspace.dir</code> you'll also need to change other properties with values that start with <code>/dspace</code>, e.g. <code>assetstore.dir</code>, <code>log.dir</code>...</P>
|
||||||
|
</li>
|
||||||
|
|
||||||
|
<li>
|
||||||
|
<P>Create the directory for the DSpace installation. As root, run:</P>
|
||||||
|
|
||||||
|
<PRE>mkdir <i>[dspace]</i> ; chown dspace <i>[dspace]</i></PRE>
|
||||||
|
|
||||||
|
<P>(Assuming the <code>dspace</code> UNIX username.)</P>
|
||||||
|
</li>
|
||||||
|
|
||||||
|
<li>
|
||||||
|
<P>As the <code>dspace</code> UNIX user, compile and install DSpace:</P>
|
||||||
|
|
||||||
|
<pre>cd <i>[dspace-source]</i> ; ant fresh_install</pre>
|
||||||
|
|
||||||
|
<P>The most likely thing to go wrong here is the database connection. See the <A HREF="#problems">common problems section</A>.</P>
|
||||||
|
</li>
|
||||||
|
|
||||||
|
<li>
|
||||||
|
<P>Copy the DSpace Web application archives (<code>.war</code> files) to the appropriate directory in your Tomcat/Jetty/Resin installation. For example:</P>
|
||||||
|
|
||||||
|
<PRE>cp <i>[dspace-source]</i>/build/*.war <i>[tomcat]</i>/webapps</PRE>
|
||||||
|
</li>
|
||||||
|
|
||||||
|
<LI>
|
||||||
|
<p>Create an initial administrator account:</p>
|
||||||
|
|
||||||
|
<pre><i>[dspace]</i>/bin/create-administrator</pre>
|
||||||
|
</LI>
|
||||||
|
|
||||||
|
<LI>
|
||||||
|
<P>Now the moment of truth! Start up (or restart) Tomcat. Visit the base URL of your server, e.g. http://dspace.myu.edu:8080/dspace. You should see the DSpace home page. Congratulations!</P>
|
||||||
|
</LI>
|
||||||
|
</ol>
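<P>As an illustration of the <code>dspace.cfg</code> step above, the edited properties might look something like the following. Every value is a made-up placeholder for a hypothetical site (the host name is borrowed from the example URL used in this document), so substitute your own.</P>

<PRE># Example values only: each one below is a placeholder
dspace.url = http://dspace.myu.edu:8080/dspace
dspace.hostname = dspace.myu.edu
dspace.name = DSpace at My University
# The password you entered when running createuser
db.password = <i>your-database-password</i>
mail.server = smtp.myu.edu
mail.from.address = dspace-noreply@myu.edu
feedback.recipient = dspace-help@myu.edu
mail.admin = dspace-help@myu.edu
alert.recipient = dspace-admin@myu.edu</PRE>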
|
||||||
|
|
||||||
|
<P>In order to set up some communities and collections, you'll need to access the administration UI. To do this, append '/dspace-admin' to your DSpace URL, e.g. http://dspace.myu.edu:8080/dspace/dspace-admin.</P>
|
||||||
|
|
||||||
|
<H2><A NAME="advancedinstall">Advanced Installation</A></H2>
|
||||||
|
|
||||||
|
<P>The above installation steps are sufficient to set up a test server to play around with, but there are a few other steps and options you should probably consider before deploying a DSpace production site.</P>
|
||||||
|
|
||||||
|
<H3>'cron' Jobs</H3>
|
||||||
|
|
||||||
|
<P>A couple of DSpace features require a script to be run regularly: the e-mail subscription feature, which alerts users to newly deposited items, and the new 'media filter' tool, which generates thumbnails of images and extracts the full text of documents for indexing.</P>
|
||||||
|
|
||||||
|
<P>To set these up, you just need to run the following command as the <code>dspace</code> UNIX user:</P>
|
||||||
|
|
||||||
|
<PRE>crontab -e</PRE>
|
||||||
|
|
||||||
|
<P>Then add the following lines:</P>
|
||||||
|
|
||||||
|
<PRE># Send out subscription e-mails at 01:00 every day
|
||||||
|
0 1 * * * <i>[dspace]</i>/bin/sub-daily
|
||||||
|
# Run the media filter at 02:00 every day
|
||||||
|
0 2 * * * <i>[dspace]</i>/bin/filter-media</PRE>
|
||||||
|
|
||||||
|
<P>Naturally you should change the frequencies to suit your environment.</P>
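<P>For instance, a lower-traffic site might prefer to run the media filter only once a week. The schedule below is purely an illustrative assumption, not a recommendation:</P>

<PRE># Run the media filter at 03:00 every Sunday (day-of-week 0)
0 3 * * 0 <i>[dspace]</i>/bin/filter-media</PRE>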
|
||||||
|
|
||||||
|
<P>PostgreSQL also benefits from regular 'vacuuming', which optimizes the indices and clears out any deleted data. Become the <code>postgres</code> UNIX user, run <code>crontab -e</code> and add (for example):</P>
|
||||||
|
|
||||||
|
<pre># Clean up the database nightly at 2.40am
|
||||||
|
40 2 * * * vacuumdb --analyze dspace &gt; /dev/null 2&gt;&amp;1</pre>
|
||||||
|
|
||||||
|
<H3>DSpace over HTTPS</H3>
|
||||||
|
|
||||||
|
<P>Plain old HTTP is totally insecure, and if your DSpace uses username/password authentication or stores some restricted content, running it over HTTPS (HTTP over Secure Sockets Layer, SSL) is advisable. There are two options for this: using Apache HTTPD, or Tomcat/Jetty's built-in HTTPS support.</P>
|
||||||
|
|
||||||
|
<P><strong>To use Apache HTTPD:</strong> The DSpace source bundle includes a partial Apache configuration <code>apache13.conf</code>, which contains most of the DSpace-specific configuration required. It assumes you're using <A HREF="http://jakarta.apache.org/tomcat/tomcat-4.1-doc/config/webapp.html">mod_webapp</A>, which is deprecated and tricky to compile but a lot easier to configure than <code>mod_jk2</code>, which is the current recommendation from Tomcat. Using this file is optional; you might just want to treat it as an example. To use it directly, in the main Apache <code>httpd.conf</code>, you should:</P>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI>Make sure <code>mod_ssl</code> and <code>mod_webapp</code> are configured and loaded</LI>
|
||||||
|
<LI>Remove/comment out etc. any existing or default SSL virtual host</LI>
|
||||||
|
<LI>Ensure Apache will run with the UNIX user and group DSpace will run as</LI>
|
||||||
|
<LI>Include the DSpace part, e.g. with: <code>Include <i>[dspace]</i>/config/httpd.conf</code>. You can decide where the DSpace part will go in your file system--see the <A HREF="configure.html#templates">configuration section</A>.</LI>
|
||||||
|
</UL>
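<P>A minimal sketch of the corresponding additions to the main <code>httpd.conf</code> is given here. The module file paths and the <code>dspace</code> user and group names are assumptions that depend on how Apache and DSpace were installed, so treat this as an outline rather than a drop-in configuration.</P>
<PRE># Load the SSL and webapp modules (file locations vary between Apache builds)
LoadModule ssl_module    libexec/libssl.so
LoadModule webapp_module libexec/mod_webapp.so

# Run Apache as the same UNIX user and group that DSpace runs as
User  dspace
Group dspace

# Pull in the DSpace-specific part of the configuration
Include <i>[dspace]</i>/config/httpd.conf</PRE>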
|
||||||
|
|
||||||
|
<P><strong>To use Tomcat or Jetty's HTTPS support</strong> consult the documentation for the relevant tool. Also, <A HREF="http://wiki.dspace.org/index.php/DspaceInstallationDocs">these alternative DSpace install docs</A> briefly describe getting Tomcat running with SSL.</P>
|
||||||
|
|
||||||
|
|
||||||
|
<H3><A NAME="handles">The Handle Server</A></H3>
|
||||||
|
|
||||||
|
<P>First a few facts to clear up some common misconceptions:</P>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI><P>You don't <strong>have</strong> to use CNRI's Handle system. At the moment, you need to change the code a little to use something else (e.g. PURLs) but that should change soon.</P></LI>
|
||||||
|
|
||||||
|
<LI><P>You'll notice that while you've been playing around with a test server, DSpace has apparently been creating handles for you looking like <code>hdl:123456789/24</code> and so forth. These aren't really Handles, since the global Handle system doesn't actually know about them, and lots of other DSpace test installs will have created the same IDs.</P>
|
||||||
|
|
||||||
|
<P>They're only really Handles once you've registered a prefix with CNRI (see below) and have correctly set up the Handle server included in the DSpace distribution. This Handle server communicates with the rest of the global Handle infrastructure so that anyone that understands Handles can find the Handles your DSpace has created.</P>
|
||||||
|
</LI>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<P>If you want to use the Handle system, you'll need to set up a Handle server. This is included with DSpace. Note that this is not required in order to evaluate DSpace; you only need one if you are running a production service. You'll need to obtain a Handle prefix from <A HREF="http://www.handle.net/">the central CNRI Handle site</A>.</P>
|
||||||
|
|
||||||
|
<P>A Handle server runs as a separate process that receives TCP requests from other Handle servers, and issues resolution requests to a global server or servers if a Handle entered locally does not correspond to some local content. The Handle protocol is based on TCP, so the Handle server will need to be installed on a machine that can send and receive TCP traffic on port 2641.</P>
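<P>One simple way to check that the port is reachable, once the Handle server is running, is to open a TCP connection to it from a machine outside your network. The host name here is the illustrative one used earlier in this document; substitute your own.</P>
<PRE># Try to open a TCP connection to the Handle server port
telnet dspace.myu.edu 2641</PRE>
<P>If the connection opens, the port is reachable. A refused connection or a timeout suggests that the Handle server is not running or that a firewall is blocking the port.</P>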
|
||||||
|
|
||||||
|
<P>The Handle server code is included with the DSpace code in
|
||||||
|
<code><i>[dspace-source]</i>/lib/handle.jar</code>. A script exists to create a simple Handle configuration - simply run <code><i>[dspace]</i>/bin/make-handle-config</code> after you've set the appropriate parameters in <code>dspace.cfg</code>. You can also create a Handle configuration directly by following the <A HREF="http://www.handle.net/hs_manual_18jan02/server_manual_2.html">installation instructions on handle.net</A>, but with these changes:</P>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI>Instead of running:
|
||||||
|
<PRE>java -cp /hs/bin/handle.jar net.handle.server.SimpleSetup /hs/svr_1</pre>
|
||||||
|
as directed in the <A HREF="http://hdl.handle.net/4263537/4093">Handle Server Administration Guide</A>, you should run
|
||||||
|
<pre><i>[dspace]</i>/bin/dsrun net.handle.server.SimpleSetup <i>[dspace]</i>/handle-server</pre>
|
||||||
|
ensuring that <code><i>[dspace]</i>/handle-server</code> matches whatever you have in <code>dspace.cfg</code> for the <code>handle.dir</code> property.</LI>
|
||||||
|
|
||||||
|
<LI>Edit the resulting <code><i>[dspace]</i>/handle-server/config.dct</code> file to include the following lines in the <code>"server_config"</code> clause:
|
||||||
|
|
||||||
|
<pre>"storage_type" = "CUSTOM"
|
||||||
|
"storage_class" = "org.dspace.handle.HandlePlugin"</pre>
|
||||||
|
|
||||||
|
<P>This tells the Handle server to get information about individual Handles from the DSpace code.</P></LI>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<P>Whichever approach you take, start the Handle server with <code><i>[dspace]</i>/bin/start-handle-server</code>, as the DSpace user. You will need to send the <code>sitebndl.zip</code> file to <A HREF="mailto:hdladmin@cnri.reston.va.us">hdladmin@cnri.reston.va.us</A> as described in the <A HREF="http://www.handle.net/hs_manual_18jan02/server_manual_2.html#SEC14">Handle server documentation</A>.</P>
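<P>Taken together, the scripted route might look like the following sketch. It assumes that <code>handle.dir</code> in <code>dspace.cfg</code> is already set to <code><i>[dspace]</i>/handle-server</code> and that the commands are run as the DSpace user.</P>
<PRE># Generate a simple Handle configuration from the settings in dspace.cfg
<i>[dspace]</i>/bin/make-handle-config

# Start the Handle server
<i>[dspace]</i>/bin/start-handle-server</PRE>
<P>The <code>sitebndl.zip</code> file to send to CNRI should then be in the Handle server directory.</P>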
|
||||||
|
|
||||||
|
<P>Note that since the DSpace code manages individual Handles, administrative operations such as Handle creation and modification aren't supported by DSpace's Handle server.</P>
|
||||||
|
|
||||||
|
|
||||||
|
<H2><A NAME="checking">Checking Your Installation</A></H2>
|
||||||
|
<p>TODO</p>
|
||||||
|
|
||||||
|
<H2><A NAME="knownbugs">Known Bugs</A></H2>
|
||||||
|
|
||||||
|
<P>In any software project of the scale of DSpace, there will be bugs. Sometimes, a stable version of DSpace includes known bugs. We do not always wait until every known bug is fixed before a release. If the software is sufficiently stable and an improvement on the previous release, and the bugs are minor and have known workarounds, we release it to enable the community to take advantage of those improvements.</P>
|
||||||
|
|
||||||
|
<P>The known bugs in a release are documented in the <code>KNOWN_BUGS</code> file in the source package.</P>
|
||||||
|
|
||||||
|
<P>Please see the <A HREF="http://sourceforge.net/tracker/?atid=119984&group_id=19984&func=browse">DSpace bug tracker</A> for further information on current bugs, and to find out if the bug has subsequently been fixed. This is also where you can report any further bugs you find.</P>
|
||||||
|
|
||||||
|
|
||||||
|
<H2><A NAME="problems">Common Problems</A></H2>
|
||||||
|
|
||||||
|
<P>In an ideal world everyone would follow the above steps and have a fully functioning DSpace. Of course, in the real world it doesn't always seem to work out that way. This section lists common problems that people encounter when installing DSpace, and likely causes and fixes. This is likely to grow over time as we learn about users' experiences.</P>
|
||||||
|
|
||||||
|
<DL>
|
||||||
|
<DT>Database errors occur when you run <code>ant fresh_install</code></DT>
|
||||||
|
|
||||||
|
<DD>
|
||||||
|
<P>There are two common errors that occur. If your error looks like this--</P>
|
||||||
|
|
||||||
|
<PRE>[java] 2004-03-25 15:17:07,730 INFO org.dspace.storage.rdbms.InitializeDatabase @ Initializing Database
|
||||||
|
[java] 2004-03-25 15:17:08,816 FATAL org.dspace.storage.rdbms.InitializeDatabase @ Caught exception:
|
||||||
|
[java] org.postgresql.util.PSQLException: Connection refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.
|
||||||
|
[java] at org.postgresql.jdbc1.AbstractJdbc1Connection.openConnection(AbstractJdbc1Connection.java:204)
|
||||||
|
[java] at org.postgresql.Driver.connect(Driver.java:139)</PRE>
|
||||||
|
|
||||||
|
<P>it usually means you haven't yet added the relevant configuration parameter to your PostgreSQL configuration <A HREF="#enabletcpip">(see above)</A>, or perhaps you haven't restarted PostgreSQL after making the change.
|
||||||
|
Also, make sure that the <code>db.username</code> and <code>db.password</code> properties are correctly set in
|
||||||
|
<code><i>[dspace-source]</i>/config/dspace.cfg</code>.</P>
|
||||||
|
|
||||||
|
<P>An easy way to check that your DB is working OK over TCP/IP is to try this on the command line:</P>
|
||||||
|
|
||||||
|
<PRE>psql -U dspace -W -h localhost</PRE>
|
||||||
|
|
||||||
|
<P>Enter the <code>dspace</code> <em>database</em> password, and you should be dropped into the psql tool with a <code>dspace=></code> prompt.</P>
|
||||||
|
|
||||||
|
<P>Another common error looks like this:</P>
|
||||||
|
|
||||||
|
<PRE>[java] 2004-03-25 16:37:16,757 INFO org.dspace.storage.rdbms.InitializeDatabase @ Initializing Database
|
||||||
|
[java] 2004-03-25 16:37:17,139 WARN org.dspace.storage.rdbms.DatabaseManager @ Exception initializing DB pool
|
||||||
|
[java] java.lang.ClassNotFoundException: org.postgresql.Driver
|
||||||
|
[java] at java.net.URLClassLoader$1.run(URLClassLoader.java:198)
|
||||||
|
[java] at java.security.AccessController.doPrivileged(Native Method)
|
||||||
|
[java] at java.net.URLClassLoader.findClass(URLClassLoader.java:186)</PRE>
|
||||||
|
|
||||||
|
<P>This means that the PostgreSQL JDBC driver is not present in <code><i>[dspace-source]</i>/lib</code>. <A HREF="#pgdriver">See above.</A></P>
|
||||||
|
|
||||||
|
|
||||||
|
<DT>Tomcat doesn't shut down</DT>
|
||||||
|
<DD><P>If you're trying to tweak Tomcat's configuration but nothing seems to make a difference to the error you're seeing, you might find that Tomcat hasn't been shutting down properly, perhaps because it's waiting for a stale connection to close gracefully which won't happen. To see if this is the case, try:</P>
|
||||||
|
|
||||||
|
<PRE>ps -ef | grep java</PRE>
|
||||||
|
|
||||||
|
<P>and look for Tomcat's Java processes. If they stay around after running Tomcat's <code>shutdown.sh</code> script, try <code>kill</code>ing them (with <code>-9</code> if necessary), then starting Tomcat again.</P></DD>
|
||||||
|
|
||||||
|
<DT>Database connections don't work, or accessing DSpace takes forever</DT>
|
||||||
|
<DD><P>If you find that when you try to access a DSpace Web page and your browser sits there connecting, or if the database connections fail, you might find that a 'zombie' database connection is hanging around preventing normal operation. To see if this is the case, try:</P>
|
||||||
|
|
||||||
|
<PRE>ps -ef | grep postgres</PRE>
|
||||||
|
|
||||||
|
<P>You might see some processes like this</P>
|
||||||
|
|
||||||
|
<PRE>dspace 16325 1997 0 Feb 14 ? 0:00 postgres: dspace dspace 127.0.0.1 idle in transaction</PRE>
|
||||||
|
|
||||||
|
<P>This is normal--DSpace maintains a 'pool' of open database connections, which are re-used to avoid the overhead of constantly opening and closing connections. If they're 'idle' it's OK; they're waiting to be used. However sometimes, if something went wrong, they might be stuck in the middle of a query, which seems to prevent other connections from operating, e.g.:</P>
|
||||||
|
|
||||||
|
<PRE>dspace 16325 1997 0 Feb 14 ? 0:00 postgres: dspace dspace 127.0.0.1 SELECT</PRE>
|
||||||
|
|
||||||
|
<P>This means the connection is in the middle of a <CODE>SELECT</CODE> operation, and if you're not using DSpace right that instant, it's probably a 'zombie' connection. If this is the case, try <code>kill</code>ing the process, and stopping and restarting Tomcat.</P></DD>
|
||||||
|
|
||||||
|
<dt>You've made changes to the code or to the JSPs and rebuilt DSpace successfully, but when you run Tomcat
|
||||||
|
you don't see any of your changes in DSpace.</dt>
|
||||||
|
|
||||||
|
<dd><p>After you've rebuilt DSpace and copied <code>dspace.war</code> from your <code><i>[dspace-source]</i>/build</code> directory
|
||||||
|
into your <code><i>[tomcat]</i>/webapps</code> directory, you must
|
||||||
|
also <strong>delete</strong> the existing <code><i>[tomcat]</i>/webapps/dspace</code> directory <strong>before</strong> re-starting Tomcat. Otherwise
|
||||||
|
Tomcat will continue to use the old code.</p></dd>
|
||||||
|
|
||||||
|
</DL>
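<P>The Tomcat-related fixes above can be combined into one redeployment sequence. This is a sketch that assumes a standard Tomcat layout with <code>shutdown.sh</code> and <code>startup.sh</code> in <code><i>[tomcat]</i>/bin</code>; adapt it if your Tomcat is started some other way.</P>
<PRE># Stop Tomcat, then confirm that no Tomcat Java processes remain
<i>[tomcat]</i>/bin/shutdown.sh
ps -ef | grep java

# Copy in the rebuilt web application and delete the old expanded copy
cp <i>[dspace-source]</i>/build/dspace.war <i>[tomcat]</i>/webapps/
rm -rf <i>[tomcat]</i>/webapps/dspace

# Start Tomcat again so that it unpacks the new dspace.war
<i>[tomcat]</i>/bin/startup.sh</PRE>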
|
||||||
|
|
||||||
|
<HR>
|
||||||
|
|
||||||
|
<ADDRESS>
|
||||||
|
Copyright © 2002-2004 MIT and Hewlett Packard
|
||||||
|
</ADDRESS>
|
||||||
|
</BODY>
|
||||||
|
</HTML>
|
48
dspace/docs/introduction.html
Normal file
@@ -0,0 +1,48 @@
|
|||||||
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">
|
||||||
|
<HTML>
|
||||||
|
<HEAD>
|
||||||
|
<TITLE>DSpace System Documentation: Introduction</TITLE>
|
||||||
|
<LINK REL=StyleSheet HREF="style.css" TYPE="text/css">
|
||||||
|
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" >
|
||||||
|
</HEAD>
|
||||||
|
<BODY>
|
||||||
|
<H1>DSpace System Documentation: Introduction</H1>
|
||||||
|
|
||||||
|
<P><A HREF="index.html">Back to contents</A></P>
|
||||||
|
|
||||||
|
<P>DSpace is an open source software platform that enables institutions to:</p>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>capture and describe digital works using a submission workflow module</li>
|
||||||
|
<li>distribute an institution's digital works over the web through a search and retrieval system</li>
|
||||||
|
<li>preserve digital works over the long term</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<P>This system documentation includes <A HREF="functional.html">a functional overview of the system</A>, which is a good introduction to the capabilities of the system, and should be readable by non-technical folk. Everyone should read this section first because it introduces some terminology used throughout the rest of the documentation.</P>
|
||||||
|
|
||||||
|
<P>For people actually running a DSpace service, there is
|
||||||
|
<A HREF="install.html">an installation guide</A>, and sections on <A HREF="configure.html">configuration</A> and <A HREF="directories.html">the directory structure</A>. Note that as of DSpace 1.2, the administration user interface guide is now on-line help available from within the DSpace system.</P>
|
||||||
|
|
||||||
|
<P>Finally, for those interested in the details of how DSpace works, and those potentially interested in modifying the code for their own purposes, there is <A HREF="architecture.html">a detailed architecture and design section</A>.</P>
|
||||||
|
|
||||||
|
<P>Other good sources of information are:</P>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI>The DSpace Public API Javadocs. Build these with the <code>public_api</code> Ant target.</LI>
|
||||||
|
|
||||||
|
<LI>The <A HREF="http://wiki.dspace.org/">DSpace Wiki</A> contains stacks of useful information about the DSpace platform and the work people are doing with it. You are strongly encouraged to visit this site and add information about your own work.</LI>
|
||||||
|
|
||||||
|
<LI><A HREF="http://www.dspace.org/">www.dspace.org</A> has announcements and contains useful information about bringing up an instance of DSpace at your organization.</LI>
|
||||||
|
|
||||||
|
<LI>The University of Tennessee's Jason Simms has written some <A HREF="http://sunsite.utk.edu/diglib/dspace/">additional installation notes</A>.</LI>
|
||||||
|
|
||||||
|
<LI>The <A HREF="http://sourceforge.net/mailarchive/forum.php?forum_id=13580">dspace-tech e-mail list on SourceForge</A> is the recommended place to ask questions, since a growing community of DSpace developers and users is on hand on that list to help with any questions you might have. The e-mail archive of that list is a useful resource. For example, the archive contains <A HREF="http://sourceforge.net/mailarchive/forum.php?thread_id=2014424&forum_id=13580">notes on running DSpace on Windows platform</A>.</LI>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<HR>
|
||||||
|
|
||||||
|
<ADDRESS>
|
||||||
|
Copyright © 2002-2005 MIT and Hewlett Packard
|
||||||
|
</ADDRESS>
|
||||||
|
</BODY>
|
||||||
|
</HTML>
|
36
dspace/docs/make-doc-package
Executable file
@@ -0,0 +1,36 @@
|
|||||||
|
#!/bin/sh
|
||||||
|
|
||||||
|
USAGE="$0 cvs-tag version"
|
||||||
|
|
||||||
|
# Just in case you need to 'socksify' etc
|
||||||
|
CVS_COMMAND="cvs"
|
||||||
|
|
||||||
|
# Check args
|
||||||
|
if [ "$#" != "2" ]; then
|
||||||
|
echo "$USAGE"
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
|
||||||
|
FILENAME="dspace-$2-docs"
|
||||||
|
|
||||||
|
|
||||||
|
mkdir tmp
|
||||||
|
cd tmp
|
||||||
|
|
||||||
|
echo "Checking out docs..."
|
||||||
|
$CVS_COMMAND -q export -r "$1" docs
|
||||||
|
|
||||||
|
# Remove stuff not to include in ZIP
|
||||||
|
rm -f docs/.cvsignore
|
||||||
|
rm -rf docs/originals
|
||||||
|
rm -f docs/make-doc-package
|
||||||
|
|
||||||
|
echo "Creating ZIP..."
|
||||||
|
mv docs $FILENAME
|
||||||
|
zip -9 -r ../$FILENAME.zip $FILENAME
|
||||||
|
|
||||||
|
echo "Cleaning up..."
|
||||||
|
cd ..
|
||||||
|
rm -rf tmp
|
||||||
|
|
||||||
|
echo "Package created as $FILENAME.zip"
|
BIN
dspace/docs/originals/architecture.vsd
Normal file
Binary file not shown.
BIN
dspace/docs/originals/data-model.vsd
Normal file
Binary file not shown.
BIN
dspace/docs/originals/ingest.vsd
Normal file
Binary file not shown.
BIN
dspace/docs/originals/web-ui-flow.vsd
Normal file
Binary file not shown.
BIN
dspace/docs/originals/workflow.vsd
Normal file
Binary file not shown.
117
dspace/docs/postgres-upgrade-notes.txt
Normal file
@@ -0,0 +1,117 @@
|
|||||||
|
Updating Postgres with a DSpace installation.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
1. Build new postgres.
|
||||||
|
Be sure to run configure with at least these options:
|
||||||
|
./configure --enable-multibyte --enable-unicode --with-java
|
||||||
|
|
||||||
|
2. shutdown tomcat
|
||||||
|
|
||||||
|
3. dump current data
|
||||||
|
pg_dumpall -o >dspace.out
|
||||||
|
|
||||||
|
4. shut down postgres
|
||||||
|
pg_ctl stop -D /dspace/database/data -m fast
|
||||||
|
|
||||||
|
5. back up old data directory
|
||||||
|
mv /dspace/database/data /dspace/database/data.old
|
||||||
|
|
||||||
|
6. install new postgres
|
||||||
|
|
||||||
|
7. start new postgres
|
||||||
|
initdb -D /dspace/database/data
|
||||||
|
|
||||||
|
edit /dspace/database/data/postgresql.conf (Add 'tcpip_socket = true')
|
||||||
|
|
||||||
|
pg_ctl start -D /dspace/database/data
|
||||||
|
|
||||||
|
8. restore data
|
||||||
|
psql -d template1 -f dspace.out
|
||||||
|
|
||||||
|
9. Install new JDBC driver
|
||||||
|
from the new postgres installation directory:
|
||||||
|
cp share/java/postgres.jar /dspace/lib
|
||||||
|
|
||||||
|
10. restart tomcat
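
Optional check (a suggestion, not part of the original procedure):
to confirm that the restore worked, connect with the new psql and count the
rows in a table that should contain data. This assumes the database and its
owner are both named "dspace":

psql -U dspace -d dspace -c "SELECT COUNT(*) FROM epersongroup;"

A DSpace database has at least the anonymous and administrator groups, so
the count should be 2 or more.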
|
||||||
|
|
||||||
|
|
||||||
|
-------------------------------------------------------------------------------
|
||||||
|
Notes from postgres install docs:
|
||||||
|
-------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
If You Are Upgrading
|
||||||
|
|
||||||
|
The internal data storage format changes with new releases of PostgreSQL.
|
||||||
|
Therefore, if you are upgrading an existing installation that does not have a
|
||||||
|
version number "7.3.x", you must back up and restore your data as shown here.
|
||||||
|
These instructions assume that your existing installation is under the "/usr/
|
||||||
|
local/pgsql" directory, and that the data area is in "/usr/local/pgsql/data".
|
||||||
|
Substitute your paths appropriately.
|
||||||
|
|
||||||
|
1. Make sure that your database is not updated during or after the backup.
|
||||||
|
This does not affect the integrity of the backup, but the changed data
|
||||||
|
would of course not be included. If necessary, edit the permissions in
|
||||||
|
the file "/usr/local/pgsql/data/pg_hba.conf" (or equivalent) to disallow
|
||||||
|
access from everyone except you.
|
||||||
|
|
||||||
|
2. To back up your database installation, type:
|
||||||
|
|
||||||
|
pg_dumpall > outputfile
|
||||||
|
|
||||||
|
If you need to preserve OIDs (such as when using them as foreign keys),
|
||||||
|
then use the "-o" option when running "pg_dumpall".
|
||||||
|
"pg_dumpall" does not save large objects. Check the Administrator's Guide
|
||||||
|
if you need to do this.
|
||||||
|
To make the backup, you can use the "pg_dumpall" command from the version
|
||||||
|
you are currently running. For best results, however, try to use the
|
||||||
|
"pg_dumpall" command from PostgreSQL 7.3.1, since this version contains
|
||||||
|
bug fixes and improvements over older versions. While this advice might
|
||||||
|
seem idiosyncratic since you haven't installed the new version yet, it is
|
||||||
|
advisable to follow it if you plan to install the new version in parallel
|
||||||
|
with the old version. In that case you can complete the installation
|
||||||
|
normally and transfer the data later. This will also decrease the
|
||||||
|
downtime.
|
||||||
|
|
||||||
|
3. If you are installing the new version at the same location as the old one
|
||||||
|
then shut down the old server, at the latest before you install the new
|
||||||
|
files:
|
||||||
|
|
||||||
|
kill -INT `cat /usr/local/pgsql/data/postmaster.pid`
|
||||||
|
|
||||||
|
Versions prior to 7.0 do not have this "postmaster.pid" file. If you are
|
||||||
|
using such a version you must find out the process id of the server
|
||||||
|
yourself, for example by typing "ps ax | grep postmaster", and supply it
|
||||||
|
to the "kill" command.
|
||||||
|
On systems that have PostgreSQL started at boot time, there is probably a
|
||||||
|
start-up file that will accomplish the same thing. For example, on a Red
|
||||||
|
Hat Linux system one might find that
|
||||||
|
|
||||||
|
/etc/rc.d/init.d/postgresql stop
|
||||||
|
|
||||||
|
works. Another possibility is "pg_ctl stop".
|
||||||
|
|
||||||
|
4. If you are installing in the same place as the old version then it is
|
||||||
|
also a good idea to move the old installation out of the way, in case you
|
||||||
|
have trouble and need to revert to it. Use a command like this:
|
||||||
|
|
||||||
|
mv /usr/local/pgsql /usr/local/pgsql.old
|
||||||
|
|
||||||
|
After you have installed PostgreSQL 7.3.1, create a new database directory and
|
||||||
|
start the new server. Remember that you must execute these commands while
|
||||||
|
logged in to the special database user account (which you already have if you
|
||||||
|
are upgrading).
|
||||||
|
|
||||||
|
/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
|
||||||
|
/usr/local/pgsql/bin/postmaster -D /usr/local/pgsql/data
|
||||||
|
|
||||||
|
Finally, restore your data with
|
||||||
|
|
||||||
|
/usr/local/pgsql/bin/psql -d template1 -f outputfile
|
||||||
|
|
||||||
|
using the *new* psql.
|
||||||
|
These topics are discussed at length in the Administrator's Guide, which you
|
||||||
|
are encouraged to read in any case.
|
||||||
|
|
||||||
|
-------------------------------------------------------------------------------
|
383
dspace/docs/storage.html
Normal file
@@ -0,0 +1,383 @@
|
|||||||
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">
|
||||||
|
<html>
|
||||||
|
<head>
|
||||||
|
<title>DSpace System Documentation: Storage Layer</title>
|
||||||
|
<link rel="StyleSheet" href="style.css" type="text/css">
|
||||||
|
<meta http-equiv="Content-Type"
|
||||||
|
content="text/html; charset=iso-8859-1">
|
||||||
|
</head>
|
||||||
|
<body>
|
||||||
|
<h1>DSpace
|
||||||
|
System Documentation:
|
||||||
|
Storage Layer</h1>
|
||||||
|
<p><a href="index.html">Back to contents</a><br>
|
||||||
|
<a href="architecture.html">Back
|
||||||
|
to architecture overview</a></p>
|
||||||
|
<h2><a name="rdbms">RDBMS</a></h2>
|
||||||
|
<p>DSpace uses a relational
|
||||||
|
database to store all information about the organization of content,
|
||||||
|
metadata about the content, information about e-people and
|
||||||
|
authorization, and the state of currently-running workflows. The DSpace
|
||||||
|
system also uses the relational database in order to maintain indices
|
||||||
|
that users can browse.</p>
|
||||||
|
<p>Most of the functionality that
|
||||||
|
DSpace uses can be offered by any standard SQL database that supports
|
||||||
|
transactions. Presently, the browse indices use some features specific
|
||||||
|
to <a href="http://www.postgresql.org/">PostgreSQL</a>,
|
||||||
|
a mature open-source relational database, so some modification to the
|
||||||
|
code would be needed before DSpace would function fully with an
|
||||||
|
alternative database back-end.</p>
|
||||||
|
<p>The <code>org.dspace.storage.rdbms</code>
|
||||||
|
package provides access to an SQL database in a somewhat simpler form
|
||||||
|
than using JDBC directly. The main class is <code>DatabaseManager</code>,
|
||||||
|
which executes SQL queries and returns <code>TableRow</code>
|
||||||
|
or <code>TableRowIterator</code>
|
||||||
|
objects. The <code>InitializeDatabase</code>
|
||||||
|
class is used to load SQL into the database via JDBC, for example to
|
||||||
|
set up the schema.</p>
|
||||||
|
<p>All calls to the <code>DatabaseManager</code>
|
||||||
|
require a <a href="business.html#core">DSpace <code>Context</code>
|
||||||
|
object</a>. Example use of the
|
||||||
|
database manager API is given in the <code>org.dspace.storage.rdbms</code>
|
||||||
|
package Javadoc.</p>
|
||||||
|
<p>The database schema used by
|
||||||
|
DSpace is stored in <code><i>[dspace-source]</i>/etc/database_schema.sql</code>
|
||||||
|
in the source distribution. It is stored in the form of SQL that can be
|
||||||
|
fed straight into the DBMS to construct the database. The schema SQL
|
||||||
|
file also directly creates two e-person groups in the database that are
|
||||||
|
required for the system to function properly.</p>
|
||||||
|
<p>The DSpace database code uses
|
||||||
|
an SQL function <code>getnextid</code>
|
||||||
|
to assign primary keys to newly created rows. This SQL function must be
|
||||||
|
safe to use if several JVMs are accessing the database at once; for
|
||||||
|
example, the Web UI might be creating new rows in the database at the
|
||||||
|
same time as the batch item importer. The PostgreSQL-specific
|
||||||
|
implementation of the method uses <code>SEQUENCES</code>
|
||||||
|
for each table in order to create new IDs. If an alternative database
|
||||||
|
backend were to be used, the implementation of <code>getnextid</code>
|
||||||
|
could be updated to operate with that specific DBMS.</p>
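<p>To make the idea concrete, here is a sketch of how such a function could be written for PostgreSQL. It is illustrative only: the convention that each table has a sequence named after it with a <code>_seq</code> suffix is an assumption, and the definitive version is the one in <code>database_schema.sql</code>.</p>
<pre>-- Return the next primary key for the table whose name is passed in,
-- by drawing from a per-table sequence (assumed to be named table_seq)
CREATE FUNCTION getnextid(VARCHAR(40)) RETURNS INTEGER AS
    'SELECT CAST (nextval($1 || ''_seq'') AS INTEGER) AS RESULT;' LANGUAGE SQL;

-- Example call: obtain a new ID for a row in the epersongroup table
SELECT getnextid('epersongroup');</pre>
<p>Because sequences hand out each value only once, even to concurrent sessions, this approach stays safe when several JVMs create rows at the same time.</p>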
|
||||||
|
<p>The <code>etc</code>
|
||||||
|
directory in the source distribution contains two further SQL files. <code>clean-database.sql</code>
|
||||||
|
contains the SQL necessary to completely clean out the database, so use
|
||||||
|
with caution! The Ant target <code>clean_database</code>
|
||||||
|
can be used to execute this. <code>update-sequences.sql</code>
|
||||||
|
contains SQL to reset the primary key generation sequences to
|
||||||
|
appropriate values. You'd need to do this if, for example, you're
|
||||||
|
restoring a backup database dump which creates rows with specific
|
||||||
|
primary keys already defined. In such a case, the sequences would
|
||||||
|
allocate primary keys that were already used.</p>
|
||||||
|
<h3>Maintenance and Backup</h3>
|
||||||
|
<p>When using PostgreSQL, it's a
|
||||||
|
good idea to perform regular 'vacuuming' of the database to optimize
|
||||||
|
performance. This is performed by the <code>vacuumdb</code>
|
||||||
|
command which can be executed via a 'cron' job, for example by putting
|
||||||
|
this in the system <code>crontab</code>:</p>
|
||||||
|
<pre># clean up the database nightly<br>40 2 * * * /usr/local/pgsql/bin/vacuumdb --analyze dspace > /dev/null 2>&1</pre>
|
||||||
|
<p>The DSpace database can be
|
||||||
|
backed up and restored using usual methods, for example with <code>pg_dump</code>
|
||||||
|
and <code>psql</code>.
|
||||||
|
However when restoring a database, you will need to perform these
|
||||||
|
additional steps:</p>
|
||||||
|
<ul>
|
||||||
|
<li>
|
||||||
|
<p>The <code>fresh_install</code>
|
||||||
|
target loads up the initial contents of the Dublin Core type and
|
||||||
|
bitstream format registries, as well as two entries in the <code>epersongroup</code>
|
||||||
|
table for the system anonymous and administrator groups. Before you
|
||||||
|
restore a raw backup of your database you will need to remove these,
|
||||||
|
since they will already exist in your backup, possibly having been
|
||||||
|
modified. For example, use:</p>
|
||||||
|
<pre>DELETE FROM dctyperegistry;<br>DELETE FROM bitstreamformatregistry;<br>DELETE FROM epersongroup;</pre>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>After restoring a backup,
|
||||||
|
you will need to reset the primary key generation sequences so that
|
||||||
|
they do not produce already-used primary keys. Do this by executing the
|
||||||
|
SQL in <code><i>[dspace-source]</i>/etc/update-sequences.sql</code>,
|
||||||
|
for example with:</p>
|
||||||
|
<pre>psql -U dspace -f <i>[dspace-source]</i>/etc/update-sequences.sql</pre>
|
||||||
|
</li>
|
||||||
|
</ul>
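<p>Put together, a backup and restore might look like the following sketch. It assumes that the database and its owner are both named <code>dspace</code>, that the backup is a data-only dump, and that it is restored into a database freshly created by <code>fresh_install</code>; the backup file name is illustrative.</p>
<pre># Back up the data (not the schema) of the dspace database
pg_dump -U dspace --data-only -f dspace-backup.sql dspace

# Before restoring, remove the rows that fresh_install created
psql -U dspace -d dspace -c "DELETE FROM dctyperegistry; DELETE FROM bitstreamformatregistry; DELETE FROM epersongroup;"

# Restore the data, then reset the primary key sequences
psql -U dspace -d dspace -f dspace-backup.sql
psql -U dspace -d dspace -f <i>[dspace-source]</i>/etc/update-sequences.sql</pre>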
|
||||||
|
<p>Future updates of DSpace may
|
||||||
|
involve minor changes to the database schema. Specific instructions on
|
||||||
|
how to update the schema whilst keeping live data will be included. The
|
||||||
|
current schema also contains a few currently unused database columns,
|
||||||
|
to be used for extra functionality in future releases. These unused
|
||||||
|
columns have been added in advance to minimize the effort required to
|
||||||
|
upgrade.</p>
|
||||||
|
<h3>Configuring the RDBMS Component</h3>
|
||||||
|
<p>The database manager is
|
||||||
|
configured with the following properties in <code>dspace.cfg</code>:</p>
|
||||||
|
<table>
|
||||||
|
<tbody>
|
||||||
|
<tr>
|
||||||
|
<td><code>db.url</code></td>
|
||||||
|
<td>The JDBC URL to use for
|
||||||
|
accessing the database. This should not point to a connection pool,
|
||||||
|
since DSpace already implements a connection pool.</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td><code>db.driver</code></td>
|
||||||
|
<td>JDBC driver class name.
|
||||||
|
Since DSpace presently uses PostgreSQL-specific features, this should
|
||||||
|
be <code>org.postgresql.Driver</code>.</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td><code>db.username</code></td>
|
||||||
|
<td>Username to use when
|
||||||
|
accessing the database.</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td><code>db.password</code></td>
|
||||||
|
<td>Corresponding password
|
||||||
|
to use when accessing the database.</td>
|
||||||
|
</tr>
|
||||||
|
</tbody>
|
||||||
|
</table>
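<p>For illustration, the corresponding section of <code>dspace.cfg</code> for a PostgreSQL server running on the same machine might look like this. The host, port, database name, username and password shown are example values only.</p>
<pre># JDBC URL of the database itself (not of a connection pool)
db.url = jdbc:postgresql://localhost:5432/dspace

# JDBC driver class for PostgreSQL
db.driver = org.postgresql.Driver

# Database credentials (example values; use your own)
db.username = dspace
db.password = dspace</pre>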
|
||||||
|
<h2><a name="bitstreams">Bitstream Store</a></h2>
|
||||||
|
<p>DSpace offers two means for
|
||||||
|
storing content. The first is in the file system on the server. The
|
||||||
|
second is using <a href="http://www.sdsc.edu/srb">SRB (Storage
|
||||||
|
Resource
|
||||||
|
Broker)</a>. Both are achieved using
|
||||||
|
a simple, lightweight API. <br>
|
||||||
|
</p>
|
||||||
|
<p>SRB is purely an option but may
|
||||||
|
be used in lieu of the server's file system or in addition to the file
|
||||||
|
system. Without going into a full description, SRB is a very robust,
|
||||||
|
sophisticated storage manager that offers essentially unlimited storage
|
||||||
|
and straightforward means to replicate (in simple terms, backup) the
|
||||||
|
content on other local or remote storage resources.<br>
|
||||||
|
</p>
|
||||||
|
<p>The terms "store", "retrieve",
|
||||||
|
"in the system", "storage", and so forth, used below can refer to
|
||||||
|
storage in the file system on the server ("traditional") or in SRB.<br>
|
||||||
|
</p>
|
||||||
|
<p>The <code>BitstreamStorageManager</code>
|
||||||
|
provides low-level access to bitstreams stored in the system. In
|
||||||
|
general, it should not be used directly; instead, use the <code>Bitstream</code>
|
||||||
|
object in the <a href="business.html#content">content management API</a>
|
||||||
|
since that encapsulates authorization and other metadata to do with a
|
||||||
|
bitstream that are not maintained by the <code>BitstreamStorageManager</code>.</p>
|
||||||
|
<p>The bitstream storage manager
|
||||||
|
provides three methods that store, retrieve and delete bitstreams.
|
||||||
|
Bitstreams are referred to by their 'ID'; that is the primary key <code>bitstream_id</code>
|
||||||
|
column of the corresponding row in the database.</p>
|
||||||
|
<p>As of DSpace version 1.1, there
|
||||||
|
can be multiple bitstream stores. Each of these bitstream stores can be
|
||||||
|
traditional storage or SRB storage. This means that the potential
|
||||||
|
storage of a DSpace system is not bound by the maximum size of a single
|
||||||
|
disk or file system and also that traditional and SRB storage can be
|
||||||
|
combined in one DSpace installation. Both traditional and SRB storage
|
||||||
|
are specified by <a href="configure.html">configuration
|
||||||
|
parameters</a>. Also see Configuring
|
||||||
|
the Bitstream Store below.<br>
|
||||||
|
</p>
|
||||||
|
<p>Stores are numbered, starting
|
||||||
|
with zero, then counting upwards. Each bitstream entry in the database
|
||||||
|
has a store number, used to retrieve the bitstream when required.</p>
|
||||||
|
<p>At the moment, the store in
|
||||||
|
which new bitstreams are placed is decided using a configuration
|
||||||
|
parameter, and there is no provision for moving bitstreams between
|
||||||
|
stores. Administrative tools for manipulating bitstreams and stores
|
||||||
|
will be provided in future releases. Right now you can move a whole
|
||||||
|
store (e.g. you could move store number 1 from <code>/localdisk/store</code>
|
||||||
|
to <code>/fs/anotherdisk/store</code>),
|
||||||
|
but it would still have to be store number 1 and have the exact same
|
||||||
|
contents.</p>
|
||||||
|
<p>Bitstreams also have an
|
||||||
|
38-digit internal ID, different from the primary key ID of the
|
||||||
|
bitstream table row. This is not visible or used outside of the
|
||||||
|
bitstream storage manager. It is used to determine the exact location
|
||||||
|
(relative to the relevant store directory) that the bitstream is stored
|
||||||
|
in traditional or SRB storage. The first three pairs of digits are the
|
||||||
|
directory path that the bitstream is stored under. The bitstream is
|
||||||
|
stored in a file with the internal ID as the filename.</p>
|
||||||
|
<p>For example, a bitstream with
|
||||||
|
the internal ID <code>12345678901234567890123456789012345678</code>
|
||||||
|
is stored at the following path:</p>
|
||||||
|
<pre>(assetstore dir)/12/34/56/12345678901234567890123456789012345678</pre>
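<p>The mapping from internal ID to file location can be sketched in a few lines. This is an illustrative Python sketch, not the code DSpace itself uses; the function name <code>bitstream_path</code> and the strict 38-digit check are assumptions made for the example.</p>

```python
import posixpath


def bitstream_path(store_dir: str, internal_id: str) -> str:
    """Return the location of a bitstream file inside one asset store."""
    # Hypothetical validation: the text describes a 38-digit internal ID.
    if len(internal_id) != 38 or not (internal_id.isascii() and internal_id.isdigit()):
        raise ValueError("internal ID must be exactly 38 decimal digits")
    # The first three pairs of digits become three nested directories.
    pairs = [internal_id[i:i + 2] for i in (0, 2, 4)]
    # The file itself is named after the full internal ID.
    return posixpath.join(store_dir, *pairs, internal_id)
```

<p>Calling it with the store directory and the ID from the example above yields a path ending in <code>12/34/56/</code> followed by the full ID.</p>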
|
||||||
|
<p>The reasons for storing files
|
||||||
|
this way are:</p>
|
||||||
|
<ul>
|
||||||
|
<li>
|
||||||
|
<p>Using a randomly-generated
|
||||||
|
38-digit number means that the 'number space' is less cluttered than
|
||||||
|
simply using the primary keys, which are allocated sequentially and are
|
||||||
|
thus close together. This means that the bitstreams in the store are
|
||||||
|
distributed around the directory structure, improving access efficiency.</p>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>The internal ID is used as
|
||||||
|
the filename partly to avoid requiring an extra lookup of the filename
|
||||||
|
of the bitstream, and partly because bitstreams may be received from a
|
||||||
|
variety of operating systems. The original name of a bitstream may be
|
||||||
|
an illegal UNIX filename.</p>
|
||||||
|
</li>
|
||||||
|
</ul>
|
||||||
|
<p>When storing a bitstream, the <code>BitstreamStorageManager</code>
|
||||||
|
DOES set the following fields in the corresponding database table row:</p>
|
||||||
|
<ul>
|
||||||
|
<li><code>bitstream_id</code></li>
|
||||||
|
<li><code>size</code></li>
|
||||||
|
<li><code>checksum</code></li>
|
||||||
|
<li><code>checksum_algorithm</code></li>
|
||||||
|
<li><code>internal_id</code></li>
|
||||||
|
<li><code>deleted</code></li>
|
||||||
|
<li><code>store_number</code></li>
|
||||||
|
</ul>
|
||||||
|
<p>The remaining fields are the
|
||||||
|
responsibility of the <code>Bitstream</code>
|
||||||
|
content management API class.</p>
|
||||||
|
<p>The bitstream storage manager
|
||||||
|
is fully transaction-safe. In order to implement transaction-safety,
|
||||||
|
the following algorithm is used to store bitstreams:</p>
|
||||||
|
<ol>
|
||||||
|
<li>A database connection is
|
||||||
|
created, separately from the currently active connection in the <a
|
||||||
|
href="business.html#core">current DSpace context</a>.</li>
|
||||||
|
<li>A unique internal
|
||||||
|
identifier (separate from the database primary key) is generated.</li>
|
||||||
|
<li>The bitstream DB table row
|
||||||
|
is created using this new connection, with the <code>deleted</code>
|
||||||
|
column set to <code>true</code>.</li>
|
||||||
|
<li>The new connection is <code>commit</code>ted,
|
||||||
|
so the 'deleted' bitstream row is written to the database</li>
|
||||||
|
<li>The bitstream itself is
|
||||||
|
stored in a file in the configured 'asset store directory', with a
|
||||||
|
directory path and filename derived from the internal ID</li>
|
||||||
|
<li>The <code>deleted</code>
|
||||||
|
flag in the bitstream row is set to <code>false</code>.
|
||||||
|
This will occur (or not) as part of the current DSpace <code>Context</code>.</li>
|
||||||
|
</ol>
|
||||||
|
<p>This means that should anything
|
||||||
|
go wrong before, during or after the bitstream storage, only one of the
|
||||||
|
following can be true:</p>
|
||||||
|
<ul>
|
||||||
|
<li>No bitstream table row was
|
||||||
|
created, and no file was stored</li>
|
||||||
|
<li>A bitstream table row with <code>deleted=true</code>
|
||||||
|
was created, no file was stored</li>
|
||||||
|
<li>A bitstream table row with <code>deleted=true</code>
|
||||||
|
was created, and a file was stored</li>
|
||||||
|
</ul>
|
||||||
|
<p>None of these affect the
|
||||||
|
integrity of the data in the database or bitstream store.</p>
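<p>The six steps above can be sketched as follows. This is a minimal Python illustration that uses SQLite as a stand-in for the real database; the reduced table layout, the <code>SCHEMA</code> constant and the function <code>store_bitstream</code> are assumptions made for the example and are not DSpace's own code.</p>

```python
import os
import secrets
import sqlite3

# Simplified stand-in for the bitstream table; the real table has more columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS bitstream (
    bitstream_id INTEGER PRIMARY KEY,
    internal_id  TEXT NOT NULL,
    deleted      INTEGER NOT NULL
)
"""


def store_bitstream(db_path, store_dir, data, context_conn):
    """Store data and return the new bitstream_id (hypothetical helper)."""
    # Step 1: open a connection separate from the caller's context connection.
    side_conn = sqlite3.connect(db_path)
    try:
        # Step 2: generate a random 38-digit internal ID.
        internal_id = "".join(secrets.choice("0123456789") for _ in range(38))
        # Step 3: create the row with deleted = true.
        cursor = side_conn.execute(
            "INSERT INTO bitstream (internal_id, deleted) VALUES (?, 1)",
            (internal_id,),
        )
        bitstream_id = cursor.lastrowid
        # Step 4: commit, so the 'deleted' row is written straight away.
        side_conn.commit()
    finally:
        side_conn.close()

    # Step 5: write the file under a path derived from the internal ID.
    directory = os.path.join(
        store_dir, internal_id[0:2], internal_id[2:4], internal_id[4:6]
    )
    os.makedirs(directory, exist_ok=True)
    with open(os.path.join(directory, internal_id), "wb") as handle:
        handle.write(data)

    # Step 6: clear the flag on the caller's connection. This only takes
    # effect if the caller later commits its own transaction.
    context_conn.execute(
        "UPDATE bitstream SET deleted = 0 WHERE bitstream_id = ?",
        (bitstream_id,),
    )
    return bitstream_id
```

<p>If the caller rolls back instead of committing, the row stays marked as deleted and the file stays on disk, which is the third outcome listed above. SQLite allows only one writer at a time, so this sketch assumes the caller's connection has no uncommitted writes when the function is called.</p>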
|
||||||
|
<p>Similarly, when a bitstream is
|
||||||
|
deleted for some reason, its <code>deleted</code>
|
||||||
|
flag is set to true as part of the overall transaction, and the
|
||||||
|
corresponding file in storage is <em>not</em>
|
||||||
|
deleted.</p>
|
||||||
|
<p>The above techniques mean that
|
||||||
|
the bitstream storage manager is transaction-safe. Over time, the
|
||||||
|
bitstream database table and file store may contain a number of
|
||||||
|
'deleted' bitstreams. The <code>cleanup</code>
|
||||||
|
method of <code>BitstreamStorageManager</code>
|
||||||
|
goes through these deleted rows, and actually deletes them along with
|
||||||
|
any corresponding files left in the storage. It only removes 'deleted'
|
||||||
|
bitstreams that are more than one hour old, just in case cleanup is
|
||||||
|
happening in the middle of a storage operation.</p>
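<p>The cleanup pass can be pictured with the following Python sketch. It is only an illustration: the rows are plain dictionaries rather than database rows, and the <code>marked_at</code> timestamp used to decide whether a bitstream is more than one hour old is a hypothetical field, since this document does not say how the age is measured.</p>

```python
import os
import time

ONE_HOUR = 3600  # seconds


def cleanup(rows, store_dir, now=None, min_age=ONE_HOUR):
    """Remove expired 'deleted' bitstreams and return the rows that remain.

    Each row is a dict with 'internal_id', 'deleted' and 'marked_at' keys;
    'marked_at' (epoch seconds) is an assumption made for this sketch.
    """
    now = time.time() if now is None else now
    kept = []
    for row in rows:
        expired = row["deleted"] and now - row["marked_at"] > min_age
        if not expired:
            # Live bitstreams and recently deleted ones are left alone.
            kept.append(row)
            continue
        iid = row["internal_id"]
        path = os.path.join(store_dir, iid[0:2], iid[2:4], iid[4:6], iid)
        # Remove the file if one was left in storage, then drop the row.
        if os.path.exists(path):
            os.remove(path)
    return kept
```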
|
||||||
|
<p>This cleanup can be invoked
|
||||||
|
from the command line via the <code>Cleanup</code>
|
||||||
|
class, which can in turn be easily executed from a shell on the server
|
||||||
|
machine using <code>/dspace/bin/cleanup</code>.
|
||||||
|
You might like to have this run regularly by <code>cron</code>,
|
||||||
|
though since DSpace is read-lots, write-not-so-much it doesn't need to
|
||||||
|
be run very often.</p>
|
||||||
|
<h3>Backup</h3>
|
||||||
|
<p>The bitstreams (files) in
|
||||||
|
traditional storage may be backed up very easily by simply 'tarring' or
|
||||||
|
'zipping' the <code>assetstore</code>
|
||||||
|
directory (or whichever directory is configured in <code>dspace.cfg</code>).
|
||||||
|
Restoring is as simple as extracting the backed-up compressed file in
|
||||||
|
the appropriate location.<br>
|
||||||
|
</p>
|
||||||
|
<p>Similar means could be used for
|
||||||
|
SRB, but SRB offers many more options for managing backup.<br>
|
||||||
|
</p>
|
||||||
|
<p>It is important to note that
|
||||||
|
since the bitstream storage manager holds the bitstreams in storage,
|
||||||
|
and information about them in the database, a database backup and
|
||||||
|
a backup of the files in the bitstream store must be made at the same
|
||||||
|
time; the bitstream data in the database must correspond to the stored
|
||||||
|
files.</p>
|
||||||
|
<p>Of course, it isn't really
|
||||||
|
ideal to 'freeze' the system while backing up to ensure that the
|
||||||
|
database and files match up. Since DSpace uses the bitstream data in
|
||||||
|
the database as the authoritative record, it's best to back up the
|
||||||
|
database before the files. This is because it's better to have a
|
||||||
|
bitstream in storage but not the database (effectively non-existent to
|
||||||
|
DSpace) than a bitstream record in the database but not storage, since
|
||||||
|
people would be able to find the bitstream but not actually get the
|
||||||
|
contents.</p>
|
||||||
|
<h3>Configuring the Bitstream Store</h3>
|
||||||
|
Both traditional and SRB bitstream stores are configured in <code>dspace.cfg</code>.
|
||||||
|
<h4>Configuring Traditional Storage</h4>
|
||||||
|
Bitstream stores in the file system on the server are configured like
|
||||||
|
this:<span style="font-family: monospace;"><br>
|
||||||
|
</span>
|
||||||
|
<pre>assetstore.dir = <i>[dspace]</i>/assetstore</pre>
|
||||||
|
<p>(Remember that <i>[dspace]</i>
|
||||||
|
is a placeholder for the actual name of your DSpace install directory).</p>
|
||||||
|
<p>The above example specifies a
|
||||||
|
single asset store.</p>
|
||||||
|
<pre>assetstore.dir = <i>[dspace]</i>/assetstore_0<br>assetstore.dir.1 = /mnt/other_filesystem/assetstore_1</pre>
|
||||||
|
<p>The above example specifies two
|
||||||
|
asset stores. assetstore.dir specifies the asset store number 0 (zero);
|
||||||
|
after that use assetstore.dir.1, assetstore.dir.2 and so on. The
|
||||||
|
particular asset store a bitstream is stored in is held in the
|
||||||
|
database, so don't move bitstreams between asset stores, and don't
|
||||||
|
renumber them.</p>
|
||||||
|
<p>By default, newly created
|
||||||
|
bitstreams are put in asset store 0 (i.e. the one specified by the
|
||||||
|
assetstore.dir property.) This allows backwards compatibility with
|
||||||
|
pre-DSpace 1.1 configurations. To change this, for example when asset
|
||||||
|
store 0 is getting full, add a line to <code>dspace.cfg</code>
|
||||||
|
like:</p>
|
||||||
|
<pre>assetstore.incoming = 1</pre>
|
||||||
|
<p>Then restart DSpace (Tomcat).
|
||||||
|
New bitstreams will be written to the asset store specified by <code>assetstore.dir.1</code>,
|
||||||
|
which is <code>/mnt/other_filesystem/assetstore_1</code>
|
||||||
|
in the above example.<br>
|
||||||
|
</p>
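<p>How the numbered properties resolve to stores can be sketched in Python. This is an illustration of the naming rule described above, not DSpace's own configuration code; the function names and the use of a plain dictionary for the properties are assumptions.</p>

```python
def asset_stores(props):
    """Map store number to directory from dspace.cfg-style properties."""
    prefix = "assetstore.dir."
    stores = {}
    for key, value in props.items():
        if key == "assetstore.dir":
            # The unnumbered property is asset store 0.
            stores[0] = value
        elif key.startswith(prefix):
            stores[int(key[len(prefix):])] = value
    return stores


def incoming_store(props):
    """Return (number, directory) of the store that receives new bitstreams."""
    # Without assetstore.incoming, new bitstreams go to store 0.
    number = int(props.get("assetstore.incoming", "0"))
    stores = asset_stores(props)
    if number not in stores:
        raise KeyError(f"assetstore.incoming refers to undefined store {number}")
    return number, stores[number]
```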
|
||||||
|
<h4>Configuring SRB Storage</h4>
|
||||||
|
The same framework is used to configure SRB storage. That is, the asset
|
||||||
|
store number (0..n) can reference a file system directory as above or
|
||||||
|
it can reference a <span style="font-weight: bold;">set</span>
|
||||||
|
of SRB account parameters. But any particular asset store number can
|
||||||
|
reference one or the other but not both. This way traditional and SRB
|
||||||
|
storage can both be used but with different asset store numbers. The
|
||||||
|
same cautions mentioned above apply to SRB asset stores as well: The
|
||||||
|
particular asset store a bitstream is stored in is held in the
|
||||||
|
database, so don't move bitstreams between asset stores, and don't
|
||||||
|
renumber them.<br>
|
||||||
|
<br>
|
||||||
|
For example, let's say asset store number 1 will refer to SRB. Then
|
||||||
|
there will be a set of SRB account parameters like this:<br>
|
||||||
|
<pre>srb.host.1 = mysrbmcathost.myu.edu<br>srb.port.1 = 5544<br>srb.mcatzone.1 = mysrbzone<br>srb.mdasdomainname.1 = mysrbdomain<br>srb.defaultstorageresource.1 = mydefaultsrbresource<br>srb.username.1 = mysrbuser<br>srb.password.1 = mysrbpassword<br>srb.homedirectory.1 = /mysrbzone/home/mysrbuser.mysrbdomain<br>srb.parentdir.1 = mysrbdspaceassetstore</pre>
|
||||||
|
Several of the terms, such as <span style="font-family: monospace;">mcatzone</span>,
|
||||||
|
have meaning only in the SRB context and will be familiar to SRB users.
|
||||||
|
The last, <span style="font-family: monospace;">srb.parentdir.n</span>,
|
||||||
|
can be used for additional (SRB) upper directory structure within
|
||||||
|
an SRB account. This property value could be blank as well.<br>
|
||||||
|
<br>
|
||||||
|
(If asset store 0 were to refer to SRB, it would be <span
|
||||||
|
style="font-family: monospace;">srb.host =</span> ...,
|
||||||
|
<span style="font-family: monospace;">srb.port =</span> ..., and so on (<span
|
||||||
|
style="font-family: monospace;">.0</span> omitted) to be consistent
|
||||||
|
with the traditional storage configuration above.)<br>
|
||||||
|
<br>
|
||||||
|
The similar use of <span style="font-family: monospace;">assetstore.incoming</span>
|
||||||
|
to reference asset store 0 (default) or 1..n (explicit property) means
|
||||||
|
that new bitstreams will be written to traditional or SRB storage
|
||||||
|
determined by whether a file system directory on the server is
|
||||||
|
referenced or a set of SRB account parameters is referenced.<br>
|
||||||
|
<br>
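<p>The rule that a store number refers either to a directory or to a set of SRB parameters, but not both, can be sketched as follows. This Python fragment is illustrative only; treating the presence of <code>srb.host</code> as the sign of an SRB store is an assumption made for the example.</p>

```python
def store_kind(props, number):
    """Return 'traditional' or 'srb' for one asset store number."""
    # Store 0 uses the unnumbered property names; others carry a suffix.
    suffix = "" if number == 0 else f".{number}"
    has_dir = f"assetstore.dir{suffix}" in props
    has_srb = f"srb.host{suffix}" in props
    if has_dir and has_srb:
        raise ValueError(f"asset store {number} is configured twice")
    if has_dir:
        return "traditional"
    if has_srb:
        return "srb"
    raise KeyError(f"asset store {number} is not configured")
```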
|
||||||
|
There are comments in dspace.cfg that further elaborate the
|
||||||
|
configuration of traditional and SRB storage.<br>
|
||||||
|
<br>
|
||||||
|
<hr>
|
||||||
|
<address> Copyright ©
|
||||||
|
2002-2004 MIT and Hewlett Packard </address>
|
||||||
|
</body>
|
||||||
|
</html>
|
29
dspace/docs/style.css
Normal file
@@ -0,0 +1,29 @@
|
|||||||
|
BODY { font-family: "verdana", Arial, Helvetica, sans-serif;
|
||||||
|
font-size: 10pt;
|
||||||
|
font-style: normal;
|
||||||
|
color: #000000;
|
||||||
|
background: #ffffff;
|
||||||
|
margin: 30px }
|
||||||
|
|
||||||
|
P { text-align: justify }
|
||||||
|
|
||||||
|
H1 { text-align: center }
|
||||||
|
|
||||||
|
TABLE { text-align: center }
|
||||||
|
|
||||||
|
TH { text-align: center;
|
||||||
|
font-size: 10pt;
|
||||||
|
font-weight: bold }
|
||||||
|
|
||||||
|
TD { text-align: left;
|
||||||
|
font-size: 10pt;
|
||||||
|
padding: 4px }
|
||||||
|
|
||||||
|
DT { font-weight: bold }
|
||||||
|
|
||||||
|
.figure { text-align: center;
|
||||||
|
margin-bottom: 2px }
|
||||||
|
|
||||||
|
.caption { text-align: center;
|
||||||
|
margin-top: 0;
|
||||||
|
font-size: 8pt }
|
274
dspace/docs/submission.html
Normal file
@@ -0,0 +1,274 @@
|
|||||||
|
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
|
||||||
|
"http://www.w3.org/TR/html4/loose.dtd">
|
||||||
|
|
||||||
|
<HTML>
|
||||||
|
<HEAD>
|
||||||
|
<META name="generator" content="HTML Tidy for Windows (vers 1st December 2004), see www.w3.org">
|
||||||
|
|
||||||
|
<TITLE>DSpace System Documentation: Submission Forms Customization</TITLE>
|
||||||
|
<LINK rel="StyleSheet" href="style.css" type="text/css">
|
||||||
|
<META http-equiv="Content-Type" content="text/html; charset=us-ascii">
|
||||||
|
</HEAD>
|
||||||
|
|
||||||
|
<BODY>
|
||||||
|
<H1>Custom Metadata-entry Pages for Submission</H1>
|
||||||
|
|
||||||
|
<P><A href="index.html">Back to contents</A></P>
|
||||||
|
|
||||||
|
<H2>Introduction</H2>
|
||||||
|
|
||||||
|
<P>This section explains how to customize the Web forms used by submitters and editors to enter and modify the metadata for a new item.</P>
|
||||||
|
|
||||||
|
<P>You can customize the "default" metadata forms used by all collections, and also create alternate sets of metadata forms and assign them to specific collections. In creating custom metadata forms, you can choose:</P>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI>The number of metadata-entry pages.</LI>
|
||||||
|
|
||||||
|
<LI>Which fields appear on each page, and their sequence.</LI>
|
||||||
|
|
||||||
|
<LI>Labels, prompts, and other text associated with each field.</LI>
|
||||||
|
|
||||||
|
<LI>List of available choices for each menu-driven field.</LI>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<P><STRONG>N.B.</STRONG> The cosmetic and ergonomic details of metadata entry fields remain the same as the fixed metadata pages in previous DSpace releases, and can only be altered by modifying the appropriate stylesheet and JSP pages.</P>
|
||||||
|
|
||||||
|
<P>All of the custom metadata-entry forms for a DSpace instance are controlled by a single XML file, <CODE>input-forms.xml</CODE>, in the <CODE>config</CODE> subdirectory under the DSpace home. DSpace comes with a sample configuration that implements the traditional metadata-entry forms, which also serves as a well-documented example. The rest of this section explains how to create your own sets of custom forms.</P>
|
||||||
|
|
||||||
|
<H2>Describing Custom Metadata Forms</H2>
|
||||||
|
|
||||||
|
<P>The description of a set of pages through which submitters enter their metadata is called a <EM>form</EM> (although it is actually a set of forms, in the HTML sense of the term). A form is identified by a unique symbolic <EM>name</EM>. In the XML structure, the <EM>form</EM> is broken down into a series of <EM>pages</EM>: each of these represents a separate Web page for collecting metadata elements.</P>
|
||||||
|
|
||||||
|
<P>To set up one of your DSpace collections with customized submission forms, first you make an entry in the <EM>form-map</EM>. This is effectively a table that relates a collection to a form set, by connecting the collection's <EM>Handle</EM> to the form name. Collections are identified by handle because their names are mutable and not necessarily unique, while handles are unique and persistent.</P>
|
||||||
|
|
||||||
|
<P>A special map entry, for the collection handle "default", defines the <EM>default</EM> form set. It applies to all collections which are not explicitly mentioned in the map. In the example XML this form set is named <CODE>traditional</CODE> (for the "traditional" DSpace user interface) but it could be named anything.</P>
|
||||||
|
|
||||||
|
<H2>The Structure of <CODE>input-forms.xml</CODE></H2>
|
||||||
|
|
||||||
|
<P>The XML configuration file has a single top-level element, <CODE>input-forms</CODE>, which contains three elements in a specific order. The outline is as follows:</P>
|
||||||
|
<PRE>
|
||||||
|
<input-forms>
|
||||||
|
|
||||||
|
<!-- <EM>Map of Collections to Form Sets</EM> -->
|
||||||
|
<form-map>
|
||||||
|
<name-map collection-handle="default" form-name="traditional" />
|
||||||
|
...
|
||||||
|
</form-map>
|
||||||
|
|
||||||
|
<!-- <EM>Form Set Definitions</EM> -->
|
||||||
|
<form-definitions>
|
||||||
|
<form name="traditional">
|
||||||
|
...
|
||||||
|
</form-definitions>
|
||||||
|
|
||||||
|
<!-- <EM>Name/Value Pairs used within Multiple Choice Widgets</EM> -->
|
||||||
|
<form-value-pairs>
|
||||||
|
<value-pairs value-pairs-name="common_iso_languages" dc-term="language_iso">
|
||||||
|
...
|
||||||
|
</form-value-pairs>
|
||||||
|
</input-forms>
|
||||||
|
</PRE>
|
||||||
|
|
||||||
|
<H3>Adding a Collection Map</H3>
|
||||||
|
|
||||||
|
<P>Each <CODE>name-map</CODE> element within <CODE>form-map</CODE> associates a collection with the name of a form set. Its <CODE>collection-handle</CODE> attribute is the Handle of the collection, and its <CODE>form-name</CODE> attribute is the form set name, which must match the <CODE>name</CODE> attribute of a <CODE>form</CODE> element.</P>
|
||||||
|
|
||||||
|
<P>For example, the following fragment shows how the collection with handle "12345.6789/42" is attached to the "TechRpt" form set:</P>
|
||||||
|
<PRE>
|
||||||
|
<form-map>
|
||||||
|
<name-map collection-handle="<STRONG>12345.6789/42</STRONG>" form-name="<STRONG>TechRpt</STRONG>" />
|
||||||
|
...
|
||||||
|
</form-map>
|
||||||
|
|
||||||
|
<form-definitions>
|
||||||
|
<form name="<STRONG>TechRpt</STRONG>">
|
||||||
|
...
|
||||||
|
</form-definitions>
|
||||||
|
</PRE>
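<P>The lookup described here, including the fallback to the <CODE>default</CODE> entry, can be sketched in Python with the standard XML parser. This is an illustration of the mapping rule only, not the DSpace implementation; the function name <CODE>form_name_for</CODE> is hypothetical.</P>

```python
import xml.etree.ElementTree as ET


def form_name_for(input_forms_xml, handle):
    """Return the form set name that applies to a collection handle."""
    root = ET.fromstring(input_forms_xml)
    mapping = {
        entry.get("collection-handle"): entry.get("form-name")
        for entry in root.findall("./form-map/name-map")
    }
    # Collections without their own entry use the "default" mapping.
    name = mapping.get(handle, mapping.get("default"))
    if name is None:
        raise LookupError(f"no form mapped for {handle} and no default entry")
    # The form-name must match the name attribute of a form element.
    defined = {form.get("name") for form in root.findall("./form-definitions/form")}
    if name not in defined:
        raise LookupError(f"form set {name!r} is mapped but not defined")
    return name
```

<P>The second check catches a mapping whose <CODE>form-name</CODE> does not exactly match any <CODE>form</CODE> definition, for instance because of a misspelling.</P>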
|
||||||
|
|
||||||
|
<P>It's a good idea to keep the definition of the <CODE><STRONG>default</STRONG></CODE> name-map from the example <CODE>input-forms.xml</CODE> so there is always a default for collections which do not have a custom form set.</P>
|
||||||
|
|
||||||
|
<H4>Getting A Collection's Handle</H4>
|
||||||
|
|
||||||
|
<P>You will need the <EM>handle</EM> of a collection in order to assign it a custom form set. To discover the handle, go to the "Communities & Collections" page under "<STRONG>Browse</STRONG>" in the left-hand menu on your DSpace home page. Then, find the link to your collection. It should look something like:</P>
|
||||||
|
<PRE>
|
||||||
|
http://myhost.my.edu/dspace/handle/<U><STRONG>12345.6789/42</STRONG></U>
|
||||||
|
</PRE>
|
||||||
|
|
||||||
|
<P>The underlined part of the URL is the handle. It should look familiar to any DSpace administrator. That is what goes in the <CODE>collection-handle</CODE> attribute of your <CODE>name-map</CODE> element.</P>
|
||||||
|
|
||||||
|
<H3>Adding a Form Set</H3>
|
||||||
|
|
||||||
|
<P>You can add a new form set by creating a new <CODE>form</CODE> element within the <CODE>form-definitions</CODE> element. It has one attribute, <CODE>name</CODE>, which, as seen above, must match the <CODE>form-name</CODE> attribute of the <CODE>name-map</CODE> entry for each collection it is to be used for.</P>
|
||||||
|
|
||||||
|
<H4>Forms and Pages</H4>
|
||||||
|
|
||||||
|
<P>The content of the <CODE>form</CODE> is a sequence of <CODE>page</CODE> elements. Each of these corresponds to a Web page of forms for entering metadata elements, presented in sequence between the initial "Describe" page and the final "Verify" page (which presents a summary of all the metadata collected).</P>
|
||||||
|
|
||||||
|
<P>A <CODE>form</CODE> must contain at least one and at most six pages. They are presented in the order they appear in the XML. Each <CODE>page</CODE> element must include a <CODE>number</CODE> attribute, which should be its sequence number, e.g.</P>
|
||||||
|
<PRE>
|
||||||
|
<page number="1">
|
||||||
|
</PRE>
|
||||||
|
|
||||||
|
<P>The <CODE>page</CODE> element, in turn, contains a sequence of <CODE>field</CODE> elements. Each field defines an interactive dialog where the submitter enters one of the Dublin Core metadata items.</P>
|
||||||
|
|
||||||
|
<H4>Composition of a Field</H4>
|
||||||
|
|
||||||
|
<P>Each <CODE>field</CODE> contains the following elements, in the order indicated. The required sub-elements are so marked:</P>
|
||||||
|
|
||||||
|
<DL>
|
||||||
|
<DT><STRONG><CODE>dc-element</CODE></STRONG> <EM>(Required)</EM></DT>
|
||||||
|
|
||||||
|
<DD>Name of the Dublin Core element entered in this field, e.g. <CODE>contributor</CODE>.</DD>
|
||||||
|
|
||||||
|
<DT><STRONG><CODE>dc-qualifier</CODE></STRONG></DT>
|
||||||
|
|
||||||
|
<DD>Qualifier of the Dublin Core element entered in this field, e.g. when the field is <CODE>contributor.advisor</CODE> the value of this element would be <CODE>advisor</CODE>. Leaving this out means the input is for an unqualified DC element.</DD>
|
||||||
|
|
||||||
|
<DT><STRONG><CODE>repeatable</CODE></STRONG></DT>
|
||||||
|
|
||||||
|
<DD>Value is <CODE>true</CODE> when multiple values of this field are allowed, <CODE>false</CODE> otherwise. When you mark a field repeatable, the UI servlet will add a control to let the user ask for more fields to enter additional values. Intended to be used for arbitrarily-repeating fields such as subject keywords, when it is impossible to know in advance how many input boxes to provide.</DD>
|
||||||
|
|
||||||
|
<DT><STRONG><CODE>label</CODE></STRONG> <EM>(Required)</EM></DT>
|
||||||
|
|
||||||
|
<DD>Text to display as the label of this field, describing what to enter, e.g. "<CODE>Your Advisor's Name</CODE>".</DD>
|
||||||
|
|
||||||
|
<DT><STRONG><CODE>input-type</CODE></STRONG> <EM>(Required)</EM></DT>
|
||||||
|
|
||||||
|
<DD>
|
||||||
|
Defines the kind of interactive widget to put in the form to collect the Dublin Core value. Content must be one of the following keywords:
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI><STRONG>onebox</STRONG> -- A single text-entry box.</LI>
|
||||||
|
|
||||||
|
<LI><STRONG>twobox</STRONG> -- A pair of simple text-entry boxes, used for <EM>repeatable</EM> values such as the DC <CODE>subject</CODE> item.</LI>
|
||||||
|
|
||||||
|
<LI><STRONG>textarea</STRONG> -- Large block of text that can be entered on multiple lines, e.g. for an abstract.</LI>
|
||||||
|
|
||||||
|
<LI><STRONG>name</STRONG> -- Personal name, with separate fields for family name and first name.</LI>
|
||||||
|
|
||||||
|
<LI><STRONG>date</STRONG> -- Calendar date. When required, demands that at least the year be entered.</LI>
|
||||||
|
|
||||||
|
<LI><STRONG>dropdown</STRONG> -- Choose value(s) from a "drop-down" menu list. <STRONG>Note:</STRONG> You must also include a value for the <CODE>value-pairs-name</CODE> attribute to specify a list of menu entries, from which to choose, for this item. Use this to make a choice from a restricted set of options, such as for the <CODE>language</CODE> item.</LI>
|
||||||
|
|
||||||
|
<LI><STRONG>qualdrop_value</STRONG> -- Enter a "qualified value", which includes <EM>both</EM> a qualifier from a drop-down menu and a free-text value. Used to enter items like alternate identifiers and codes for a submitted item, e.g. the DC <CODE>identifier</CODE> field. <STRONG>Note:</STRONG> As for the <CODE>dropdown</CODE> type, you must include the <CODE>value-pairs-name</CODE> attribute to specify a menu choice list.</LI>
|
||||||
|
</UL>
|
||||||
|
</DD>
|
||||||
|
|
||||||
|
<DT><STRONG><CODE>hint</CODE></STRONG> <EM>(Required)</EM></DT>
|
||||||
|
|
||||||
|
<DD>Content is the text that will appear as a "hint", or instructions, next to the input fields. Can be left empty, but it must be present.</DD>
|
||||||
|
|
||||||
|
<DT><STRONG><CODE>required</CODE></STRONG></DT>
|
||||||
|
|
||||||
|
<DD>When this element is included with any content, it marks the field as a required input. If the user tries to leave the page without entering a value for this field, that text is displayed as a warning message. For example,<BR>
|
||||||
|
<CODE><required>You must enter a title.</required></CODE><BR>
Note that leaving the <CODE>required</CODE> element empty will <EM>not</EM> mark a field as required, e.g.:<BR>
|
||||||
|
<CODE><required></required></CODE></DD>
|
||||||
|
</DL>
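<P>The rules for a <CODE>field</CODE> can be checked mechanically. The following Python sketch is illustrative and is not how DSpace validates the file; the helper names are hypothetical, and it assumes the <CODE>value-pairs-name</CODE> attribute is written on the <CODE>input-type</CODE> element.</P>

```python
import xml.etree.ElementTree as ET

REQUIRED_CHILDREN = ("dc-element", "label", "input-type", "hint")
INPUT_TYPES = {
    "onebox", "twobox", "textarea", "name", "date", "dropdown", "qualdrop_value",
}


def check_field(field):
    """Return a list of problems found in one field element (empty if none)."""
    problems = [
        f"missing <{tag}>" for tag in REQUIRED_CHILDREN if field.find(tag) is None
    ]
    input_type = field.find("input-type")
    if input_type is not None:
        kind = (input_type.text or "").strip()
        if kind not in INPUT_TYPES:
            problems.append(f"unknown input-type {kind!r}")
        elif kind in ("dropdown", "qualdrop_value") and not input_type.get("value-pairs-name"):
            # Assumption: the attribute belongs to the input-type element.
            problems.append("input-type needs a value-pairs-name attribute")
    return problems


def is_required(field):
    """A field is required only if its required element has text content."""
    required = field.find("required")
    return required is not None and bool((required.text or "").strip())
```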
|
||||||
|
|
||||||
|
<P>Look at the example <CODE>input-forms.xml</CODE> and experiment with a trial custom form to learn this specification language thoroughly. It is a very simple way to express the layout of data-entry forms, but the only way to learn all its subtleties is to use it.</P>
|
||||||
|
|
||||||
|
<H4>Automatically Elided Fields</H4>
|
||||||
|
|
||||||
|
<P>You may notice that some fields are automatically skipped when a custom form page is displayed, depending on the kind of item being submitted. This is because the DSpace user-interface engine skips Dublin Core fields which are not needed, according to the initial description of the item. For example, if the user indicates there are no alternate titles on the first "Describe" page (the one with a few checkboxes), the input for the <CODE>title.alternative</CODE> DC element is automatically elided, <EM>even on custom submission pages.</EM></P>When a user initiates a submission, DSpace first displays what we'll call the "initial-questions page". By default, it contains three questions with check-boxes:
|
||||||
|
|
||||||
|
<OL>
|
||||||
|
<LI><STRONG>The item has more than one title, e.g. a translated title</STRONG><BR>
|
||||||
|
Controls <CODE>title.alternative</CODE> field.</LI>
|
||||||
|
|
||||||
|
<LI>
|
||||||
|
<STRONG>The item has been published or publicly distributed before</STRONG><BR>
|
||||||
|
Controls DC fields:
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI><CODE>date.issued</CODE></LI>
|
||||||
|
|
||||||
|
<LI><CODE>publisher</CODE></LI>
|
||||||
|
|
||||||
|
<LI><CODE>identifier.citation</CODE></LI>
|
||||||
|
</UL>
|
||||||
|
</LI>
|
||||||
|
|
||||||
|
<LI><STRONG>The item consists of more than one file</STRONG><BR>
|
||||||
|
<EM>Does not affect any metadata input fields.</EM></LI>
|
||||||
|
</OL>The answers to the first two questions control whether inputs for certain of the DC metadata fields will be displayed, even if they are defined as fields in a custom page.
|
||||||
|
|
||||||
|
<P>Conversely, if the metadata fields controlled by a checkbox are not mentioned in the custom form, the checkbox is elided from the initial page to avoid confusing or misleading the user.</P>
|
||||||
|
|
||||||
|
<P>The two relevant checkbox entries are "The item has more than one title, e.g. a translated title", and "The item has been published or publicly distributed before". The checkbox for multiple titles triggers the display of the field with dc-element equal to 'title' and dc-qualifier equal to 'alternative'. If the controlling collection's form set does not contain this field, then the multiple titles question will not appear on the initial questions page.</P>
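<P>The two-way relationship between the initial questions and the custom form fields can be sketched in Python. This is only an illustration of the rules described in this section; the dictionary keys <CODE>multiple_titles</CODE> and <CODE>published_before</CODE> are hypothetical names for the two checkboxes.</P>

```python
# Fields controlled by each initial question, as listed in this section.
CONTROLLED = {
    "multiple_titles": {"title.alternative"},
    "published_before": {"date.issued", "publisher", "identifier.citation"},
}


def visible_fields(form_fields, answers):
    """Drop fields whose controlling checkbox was left unchecked."""
    hidden = set()
    for question, fields in CONTROLLED.items():
        if not answers.get(question, False):
            hidden |= fields
    return [name for name in form_fields if name not in hidden]


def questions_to_show(form_fields):
    """A question is shown only if the form uses a field it controls."""
    present = set(form_fields)
    return [q for q, fields in CONTROLLED.items() if fields & present]
```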
|
||||||
|
|
||||||
|
<H3>Adding <CODE>Value-Pairs</CODE></H3>Finally, your custom form description needs to define the "value pairs" for any fields with input types that refer to them. Do this by adding a <CODE>value-pairs</CODE> element to the contents of <CODE>form-value-pairs</CODE>. It has the following required attributes:
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI><STRONG><CODE>value-pairs-name</CODE></STRONG> -- Name by which an <CODE>input-type</CODE> refers to this list.</LI>
|
||||||
|
|
||||||
|
<LI><STRONG><CODE>dc-term</CODE></STRONG> -- Qualified Dublin Core field for which this choice list is selecting a value.</LI>
|
||||||
|
</UL>Each <CODE>value-pairs</CODE> element contains a sequence of <CODE>pair</CODE> sub-elements, each of which in turn contains two elements:
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI><STRONG><CODE>displayed-value</CODE></STRONG> -- Name shown (on the web page) for the menu entry.</LI>
|
||||||
|
|
||||||
|
<LI><STRONG><CODE>stored-value</CODE></STRONG> -- Value stored in the DC element when this entry is chosen.</LI>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<P>Unlike the HTML <CODE>select</CODE> tag, there is no way to indicate one of the entries should be the default, so the first entry is always the default choice.</P>
|
||||||
|
|
||||||
|
<H4>Example</H4>
|
||||||
|
|
||||||
|
<P>Here is a menu of types of common identifiers:</P>
|
||||||
|
<PRE>
|
||||||
|
<value-pairs value-pairs-name="common_identifiers" dc-term="identifier">
|
||||||
|
<pair>
|
||||||
|
<displayed-value>Gov't Doc #</displayed-value>
|
||||||
|
<stored-value>govdoc</stored-value>
|
||||||
|
</pair>
|
||||||
|
<pair>
|
||||||
|
<displayed-value>URI</displayed-value>
|
||||||
|
<stored-value>uri</stored-value>
|
||||||
|
</pair>
|
||||||
|
<pair>
|
||||||
|
<displayed-value>ISBN</displayed-value>
|
||||||
|
<stored-value>isbn</stored-value>
|
||||||
|
</pair>
|
||||||
|
</value-pairs>
|
||||||
|
</PRE>It generates the following HTML, which results in the menu widget below. (Note that there is no way to indicate a default choice in the custom input XML, so it cannot generate the HTML <CODE>SELECTED</CODE> attribute to mark one of the options as a pre-selected default.)
|
||||||
|
<PRE>
|
||||||
|
<select name="identifier_qualifier_0">
|
||||||
|
<option VALUE="govdoc">Gov't Doc #</option>
|
||||||
|
<option VALUE="uri">URI</option>
|
||||||
|
<option VALUE="isbn">ISBN</option>
|
||||||
|
</select>
|
||||||
|
</PRE>
|
||||||
|
|
||||||
|
<FORM ACTION="submission.html">
|
||||||
|
<STRONG>Identifiers:</STRONG> <SELECT name="identifier_qualifier_0">
|
||||||
|
<OPTION value="govdoc">
|
||||||
|
Gov't Doc #
|
||||||
|
</OPTION>
|
||||||
|
|
||||||
|
<OPTION value="uri">
|
||||||
|
URI
|
||||||
|
</OPTION>
|
||||||
|
|
||||||
|
<OPTION value="isbn">
|
||||||
|
ISBN
|
||||||
|
</OPTION>
|
||||||
|
</SELECT>
|
||||||
|
</FORM>
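<P>The mapping from a <CODE>value-pairs</CODE> element to the generated menu can be sketched in a few lines of Python. This is only an illustration of the transformation described above, not the code DSpace itself uses; the function name <CODE>render_select</CODE> and the <CODE>EXAMPLE</CODE> constant are hypothetical.</P>

```python
import xml.etree.ElementTree as ET
from html import escape

# The example value-pairs element from this section.
EXAMPLE = """\
<value-pairs value-pairs-name="common_identifiers" dc-term="identifier">
  <pair>
    <displayed-value>Gov't Doc #</displayed-value>
    <stored-value>govdoc</stored-value>
  </pair>
  <pair>
    <displayed-value>URI</displayed-value>
    <stored-value>uri</stored-value>
  </pair>
  <pair>
    <displayed-value>ISBN</displayed-value>
    <stored-value>isbn</stored-value>
  </pair>
</value-pairs>
"""


def render_select(value_pairs_xml, field_name):
    """Return the HTML for a drop-down built from one value-pairs element."""
    root = ET.fromstring(value_pairs_xml)
    lines = ['<select name="%s">' % escape(field_name)]
    # Pairs are emitted in document order, so the first pair is the
    # browser's default choice; no SELECTED attribute is ever written.
    for pair in root.findall("pair"):
        shown = pair.findtext("displayed-value", default="")
        stored = pair.findtext("stored-value", default="")
        lines.append('<option VALUE="%s">%s</option>'
                     % (escape(stored), escape(shown, quote=False)))
    lines.append("</select>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_select(EXAMPLE, "identifier_qualifier_0"))
```

<P>Because the options are written in document order, reordering the <CODE>pair</CODE> elements is the only way to change which entry appears first.</P>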
|
||||||
|
|
||||||
|
<H2>Deploying Your Custom Forms</H2>The DSpace web application only reads your custom form definitions when it starts up, so it is important to remember:
|
||||||
|
|
||||||
|
<BLOCKQUOTE>
|
||||||
|
<EM><STRONG>You must always restart Tomcat</STRONG> (or whatever servlet container you are using) for changes made to the <CODE>input-forms.xml</CODE> file to take effect.</EM>
|
||||||
|
</BLOCKQUOTE>
|
||||||
|
|
||||||
|
<P>Any mistake in the syntax or semantics of the form definitions, such as poorly formed XML or a reference to a nonexistent field name, will cause a fatal error in the DSpace UI. The exception message (at the top of the stack trace in the <CODE>dspace.log</CODE> file) usually has a concise and helpful explanation of what went wrong. Don't forget to stop and restart the servlet container before testing your fix to a bug.</P>
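<P>Since every fix costs a restart, it can save time to catch the simplest mistakes before restarting. The sketch below checks only what this section describes: that the text is well-formed XML, that each <CODE>value-pairs</CODE> element carries both required attributes, and that each <CODE>pair</CODE> has both sub-elements. It is not DSpace's own validation, and the function name <CODE>check_value_pairs</CODE> is hypothetical.</P>

```python
import xml.etree.ElementTree as ET

REQUIRED_ATTRS = ("value-pairs-name", "dc-term")
REQUIRED_CHILDREN = ("displayed-value", "stored-value")


def check_value_pairs(xml_text):
    """Return a list of problems found; an empty list means none were found."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # Poorly formed XML: nothing else can be checked.
        return ["not well-formed XML: %s" % err]

    problems = []
    for vp in root.iter("value-pairs"):
        name = vp.get("value-pairs-name", "(unnamed)")
        for attr in REQUIRED_ATTRS:
            if vp.get(attr) is None:
                problems.append(
                    "value-pairs %s: missing attribute %s" % (name, attr))
        for index, pair in enumerate(vp.findall("pair"), start=1):
            for child in REQUIRED_CHILDREN:
                if pair.find(child) is None:
                    problems.append(
                        "value-pairs %s, pair %d: missing %s"
                        % (name, index, child))
    return problems
```

<P>An empty result does not guarantee that DSpace will accept the file; references to nonexistent field names, for example, are not checked here and will still only show up in <CODE>dspace.log</CODE>.</P>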
|
||||||
|
|
||||||
|
<HR>
|
||||||
|
|
||||||
|
<ADDRESS>
|
||||||
|
Copyright © 2002-2005 MIT and Hewlett Packard
|
||||||
|
</ADDRESS>
|
||||||
|
</BODY>
|
||||||
|
</HTML>
|
522
dspace/docs/update.html
Normal file
522
dspace/docs/update.html
Normal file
@@ -0,0 +1,522 @@
|
|||||||
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">
|
||||||
|
<HTML>
|
||||||
|
<HEAD>
|
||||||
|
<TITLE>DSpace System Documentation: Updating a DSpace Installation</TITLE>
|
||||||
|
<LINK REL=StyleSheet HREF="style.css" TYPE="text/css">
|
||||||
|
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" >
|
||||||
|
</HEAD>
|
||||||
|
<BODY>
|
||||||
|
<H1>DSpace System Documentation: Updating a DSpace Installation</H1>
|
||||||
|
|
||||||
|
<P><A HREF="index.html">Back to contents</A></P>
|
||||||
|
|
||||||
|
<P>This section describes how to update a DSpace installation from one version to the next. Details of the differences between the functionality of each version are given in the <A HREF="history.html">Version History</A> section.</P>
|
||||||
|
|
||||||
|
<H2><A NAME="12_13">Updating From 1.2.x to 1.3</A></H2>
|
||||||
|
|
||||||
|
FIXME: this is the start of the 1.3 update documentation
|
||||||
|
|
||||||
|
<P>In the notes below <code><i>[dspace]</i></code> refers to the install directory for your existing DSpace installation, and <code><i>[dspace-1.3-source]</i></code> to the source directory for DSpace 1.3. Whenever you see these path references, be sure to replace them with the actual path names on your local system.</p>
|
||||||
|
|
||||||
|
<ol>
|
||||||
|
<LI><P>Step one is, of course, to <strong>back up all your data</strong> before proceeding!! Include all of the contents of <code><i>[dspace]</i></code> and the PostgreSQL database in your backup.</P></LI>
|
||||||
|
<LI><P>Get the new DSpace 1.3 source code from <A HREF="http://sourceforge.net/projects/dspace/">the DSpace page on SourceForge</A> and unpack it somewhere. Do not unpack it on top of your existing installation!!</P></LI>
|
||||||
|
<LI><P>Copy the PostgreSQL driver JAR to the source tree. For example:</P>
|
||||||
|
<p><code>cd <i>[dspace]</i>/lib</code></p>
|
||||||
|
<p><code>cp postgresql.jar <i>[dspace-1.3-source]</i>/lib</code></p></LI>
|
||||||
|
<LI><P>Take down Tomcat (or whichever servlet container you're using).</P></LI>
|
||||||
|
<li><p>Install the new config files by moving <code>dstat.cfg</code> and <code>dstat.map</code> from <code>[dspace-1.3-source]/config/</code> to <code>[dspace]/config</code>.</p></li>
|
||||||
|
<LI><P>You need to add new parameters to your <code><i>[dspace]/</i>dspace.cfg</code>:</P>
|
||||||
|
<pre>
|
||||||
|
##### SRB File Storage #####
|
||||||
|
|
||||||
|
# The same 'assetstore.incoming' property is used to support the use of SRB
|
||||||
|
# (Storage Resource Broker - see http://www.sdsc.edu/srb/) as an _optional_
|
||||||
|
# replacement of or supplement to conventional file storage. DSpace will work
|
||||||
|
# with or without SRB and full backward compatibility is maintained.
|
||||||
|
#
|
||||||
|
# The 'assetstore.incoming' property is an integer that references where _new_
|
||||||
|
# bitstreams will be stored. The default (say the starting reference) is zero.
|
||||||
|
# The value will be used to identify the storage where all new bitstreams will
|
||||||
|
# be stored until this number is changed. This number is stored in the
|
||||||
|
# Bitstream table (store_number column) in the DSpace database, so older
|
||||||
|
# bitstreams that may have been stored when 'assetstore.incoming' had a different
|
||||||
|
# value can be found.
|
||||||
|
#
|
||||||
|
# In the simple case in which DSpace uses local (or mounted) storage the
|
||||||
|
# number can refer to different directories (or partitions). This gives DSpace
|
||||||
|
# some level of scalability. The number links to another set of properties
|
||||||
|
# 'assetstore.dir', 'assetstore.dir.1' (remember zero is default),
|
||||||
|
# 'assetstore.dir.2', etc., where the values are directories.
|
||||||
|
#
|
||||||
|
# To support the use of SRB DSpace uses this same scheme but broadened to
|
||||||
|
# support:
|
||||||
|
# - using SRB instead of the local filesystem
|
||||||
|
# - using the local filesystem (native DSpace)
|
||||||
|
# - using a mix of SRB and local filesystem
|
||||||
|
#
|
||||||
|
# In this broadened use the 'assetstore.incoming' integer will refer to one of the
|
||||||
|
# following storage locations
|
||||||
|
# - a local filesystem directory (native DSpace)
|
||||||
|
# - a set of SRB account parameters (host, port, zone, domain, username,
|
||||||
|
# password, home directory, and resource)
|
||||||
|
#
|
||||||
|
# Should there be any conflict, like '2' referring to a local directory and
|
||||||
|
# to a set of SRB parameters, the program will select the local directory.
|
||||||
|
#
|
||||||
|
# If SRB is chosen from the first install of DSpace, it is suggested that
|
||||||
|
# 'assetstore.dir' (no integer appended) be retained to reference a local
|
||||||
|
# directory (as above under File Storage) because build.xml uses this value
|
||||||
|
# to do a mkdir. In this case, 'assetstore.incoming' can be set to 1 (i.e.
|
||||||
|
# uncomment the line in File Storage above) and the 'assetstore.dir' will not
|
||||||
|
# be used.
|
||||||
|
#
|
||||||
|
# Here is an example set of SRB parameters:
|
||||||
|
# Assetstore 1 - SRB
|
||||||
|
#srb.host.1 = mysrbmcathost.myu.edu
|
||||||
|
#srb.port.1 = 5544
|
||||||
|
#srb.mcatzone.1 = mysrbzone
|
||||||
|
#srb.mdasdomainname.1 = mysrbdomain
|
||||||
|
#srb.defaultstorageresource.1 = mydefaultsrbresource
|
||||||
|
#srb.username.1 = mysrbuser
|
||||||
|
#srb.password.1 = mysrbpassword
|
||||||
|
#srb.homedirectory.1 = /mysrbzone/home/mysrbuser.mysrbdomain
|
||||||
|
#srb.parentdir.1 = mysrbdspaceassetstore
|
||||||
|
#
|
||||||
|
# Assetstore n, n+1, ...
|
||||||
|
# Follow same pattern as for assetstores above (local or SRB)
|
||||||
|
|
||||||
|
# Directory for history serializations
|
||||||
|
history.dir = /dspace/history
|
||||||
|
|
||||||
|
# Where to put search index files
|
||||||
|
search.dir = /dspace/search
|
||||||
|
# Higher values of search.max-clauses will enable prefix searches to work on large
|
||||||
|
# repositories
|
||||||
|
#search.max-clauses=2048
|
||||||
|
|
||||||
|
# Where to put the logs
|
||||||
|
log.dir = /dspace/log
|
||||||
|
|
||||||
|
# Where to temporarily store uploaded files
|
||||||
|
upload.temp.dir = /dspace/upload
|
||||||
|
|
||||||
|
# Maximum size of uploaded files in bytes, must be positive
|
||||||
|
# 512Mb
|
||||||
|
upload.max = 536870912
|
||||||
|
|
||||||
|
|
||||||
|
###### Statistical Report Configuration Settings ######
|
||||||
|
|
||||||
|
# directory where live reports are stored
|
||||||
|
report.directory = /dspace/reports/
|
||||||
|
</pre>
|
||||||
|
<LI><P>Build and install the updated DSpace 1.3 code. Go to the <code>[dspace-1.3-source]</code> directory, and run:</P>
|
||||||
|
<p><code>ant -Dconfig=<i>[dspace]</i>/config/dspace.cfg update</code></p>
|
||||||
|
<LI><P>You'll need to make some changes to the database schema in your PostgreSQL database. <code><i>[dspace-1.3-source]</i>/etc/database_schema_12-13.sql</code> contains the SQL commands to achieve this. If you've modified the schema locally, you may need to check over this and make alterations.</P>
|
||||||
|
<P>To apply the changes, go to the source directory, and run:</P>
|
||||||
|
<p><code>psql -f etc/database_schema_12-13.sql [DSpace database name] -h localhost</code></p>
|
||||||
|
<li><p>Customise the statistics generation as per the instructions in <a href="configure.html#statistics">System Statistical Reports</a>.</p>
|
||||||
|
<li><p>Initialise the statistics using:</p>
|
||||||
|
<p><code>[dspace]/bin/stat-initial</code></p>
|
||||||
|
<p><code>[dspace]/bin/stat-general</code></p>
|
||||||
|
<p><code>[dspace]/bin/stat-report-initial</code></p>
|
||||||
|
<p><code>[dspace]/bin/stat-report-general</code></p>
|
||||||
|
<LI><P>Rebuild the search indices:</P>
|
||||||
|
<p><code><i>[dspace]</i>/bin/index-all</code></p>
|
||||||
|
<LI><P>Copy the <code>.war</code> Web application files in <code><i>[dspace-1.3-source]</i>/build</code> to the <code>webapps</code> sub-directory of your servlet container (e.g. Tomcat). e.g.:</P>
|
||||||
|
<p><code>cp <i>[dspace-1.3-source]</i>/build/*.war <i>[tomcat]</i>/webapps</code></p>
|
||||||
|
<LI><P>Restart Tomcat.</P></LI>
|
||||||
|
</ol>
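<P>Before rebuilding, it may help to confirm that the parameters listed above really made it into <code>dspace.cfg</code>. The following is a minimal sketch, assuming the file uses plain one-line <code>key = value</code> entries (continuation lines and escapes are not handled). The function names are hypothetical, and the list of keys to check is only an illustration taken from the block above.</P>

```python
import re


def parse_properties(text):
    """Parse simple 'key = value' lines; comments and blank lines are skipped."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split at the first '=' or ':' (with optional spaces), or, failing
        # that, at the first run of whitespace.
        parts = re.split(r"\s*[=:]\s*|\s+", line, maxsplit=1)
        props[parts[0]] = parts[1] if len(parts) > 1 else ""
    return props


def missing_keys(cfg_text, required):
    """Return the required keys that are absent or still commented out."""
    present = parse_properties(cfg_text)
    return [key for key in required if key not in present]


if __name__ == "__main__":
    sample = (
        "# Maximum size of uploaded files in bytes\n"
        "upload.max = 536870912\n"
        "log.dir = /dspace/log\n"
    )
    print(missing_keys(sample, ["upload.max", "report.directory"]))
    # 512 MB expressed in bytes is 512 * 1024 * 1024.
    print(int(parse_properties(sample)["upload.max"]) == 512 * 1024 * 1024)
```

<P>Lines that are still commented out, such as the sample SRB settings, are deliberately treated as absent.</P>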
|
||||||
|
|
||||||
|
<H2><A NAME="121_122">Updating From 1.2.1 to 1.2.2</A></H2>
|
||||||
|
|
||||||
|
<P>The changes in 1.2.2 are only code and config changes so the update should be fairly simple.</P>
|
||||||
|
|
||||||
|
<P>In the notes below <code><i>[dspace]</i></code> refers to the install directory for your existing DSpace installation, and <code><i>[dspace-1.2.2-source]</i></code> to the source directory for DSpace 1.2.2. Whenever you see these path references, be sure to replace them with the actual path names on your local system.</p>
|
||||||
|
|
||||||
|
<OL>
|
||||||
|
<LI><P>Get the new DSpace 1.2.2 source code from <A HREF="http://sourceforge.net/projects/dspace/">the DSpace page on SourceForge</A> and unpack it somewhere. Do not unpack it on top of your existing installation!!</P></LI>
|
||||||
|
|
||||||
|
<LI><P>Copy the PostgreSQL driver JAR to the source tree. For example:</P>
|
||||||
|
|
||||||
|
<PRE>cd <i>[dspace]</i>/lib
|
||||||
|
cp postgresql.jar <i>[dspace-1.2.2-source]</i>/lib</PRE></LI>
|
||||||
|
|
||||||
|
<LI><P>Take down Tomcat (or whichever servlet container you're using).</P></LI>
|
||||||
|
|
||||||
|
<LI><P>Your 'localized' JSPs (those in jsp/local) now need to be maintained in the <em>source</em> directory. If you have locally modified JSPs in your <code><i>[dspace]</i>/jsp/local</code> directory, you might like to merge the changes in the new 1.2.2 versions into your locally modified ones. You can use the <code>diff</code> command to compare the 1.2.1 and 1.2.2 versions to do this. Also see <A HREF="history.html#jsp-changes-1_2_1-1_2_2">the version history</A> for a list of modified JSPs.</P></LI>
|
||||||
|
|
||||||
|
<LI>
|
||||||
|
<P>You need to add a new parameter to your <code><i>[dspace]/</i>dspace.cfg</code> for configurable full-text indexing:</P>
|
||||||
|
|
||||||
|
|
||||||
|
<PRE>##### Fulltext Indexing settings #####
|
||||||
|
# Maximum number of terms indexed for a single field in Lucene.
|
||||||
|
# Default is 10,000 words - often not enough for full-text indexing.
|
||||||
|
# If you change this, you'll need to re-index for the change
|
||||||
|
# to take effect on previously added items.
|
||||||
|
# -1 = unlimited (Integer.MAX_VALUE)
|
||||||
|
search.maxfieldlength = 10000
|
||||||
|
</PRE></LI>
|
||||||
|
|
||||||
|
<LI><P>In <code><i>[dspace-1.2.2-source]</i></code> run:</P>
|
||||||
|
|
||||||
|
<pre>ant -Dconfig=<i>[dspace]</i>/config/dspace.cfg update</pre></LI>
|
||||||
|
|
||||||
|
<LI><P>Copy the <code>.war</code> Web application files in <code><i>[dspace-1.2.2-source]</i>/build</code> to the <code>webapps</code> sub-directory of your servlet container (e.g. Tomcat). e.g.:</P>
|
||||||
|
|
||||||
|
<PRE>cp <i>[dspace-1.2.2-source]</i>/build/*.war <i>[tomcat]</i>/webapps</PRE>
|
||||||
|
|
||||||
|
<P>If you're using Tomcat, you need to delete the directories corresponding to the old <code>.war</code> files. For example, if <code>dspace.war</code> is installed in <code><i>[tomcat]</i>/webapps/dspace.war</code>, you should delete the <code><i>[tomcat]</i>/webapps/dspace</code> directory. Otherwise, Tomcat will continue to use the old code in that directory. </P></LI>
|
||||||
|
|
||||||
|
<LI><P>To finalise the install of the new configurable submission forms you need to copy the file <code><em>[dspace-1.2.2-source]</em>/config/input-forms.xml</code> into <code><em>[dspace]</em>/config</code>.</P></LI>
|
||||||
|
|
||||||
|
<LI><P>Restart Tomcat.</P></LI>
|
||||||
|
</OL>
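<P>The Tomcat clean-up in the steps above is easy to forget. As a rough sketch, the following lists every expanded directory in a <code>webapps</code> directory that sits next to a <code>.war</code> file of the same name; these are the directories Tomcat would otherwise keep serving old code from. The function name <code>stale_webapp_dirs</code> is hypothetical, and the script only reports directories; it does not delete anything.</P>

```python
from pathlib import Path


def stale_webapp_dirs(webapps):
    """List expanded directories that sit next to a .war file of the same name."""
    webapps = Path(webapps)
    stale = []
    for war in sorted(webapps.glob("*.war")):
        expanded = webapps / war.stem  # dspace.war -> dspace
        if expanded.is_dir():
            stale.append(expanded)
    return stale


if __name__ == "__main__":
    import sys

    # Usage: python stale_dirs.py /path/to/tomcat/webapps
    for directory in stale_webapp_dirs(sys.argv[1]):
        print(directory)
```

<P>Run it while Tomcat is stopped, and review the list before removing anything by hand.</P>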
|
||||||
|
|
||||||
|
|
||||||
|
<H2><A NAME="12_121">Updating From 1.2 to 1.2.1</A></H2>
|
||||||
|
|
||||||
|
<P>The changes in 1.2.1 are only code changes so the update should be fairly simple.</P>
|
||||||
|
|
||||||
|
<P>In the notes below <code><i>[dspace]</i></code> refers to the install directory for your existing DSpace installation, and <code><i>[dspace-1.2.1-source]</i></code> to the source directory for DSpace 1.2.1. Whenever you see these path references, be sure to replace them with the actual path names on your local system.</p>
|
||||||
|
|
||||||
|
<OL>
|
||||||
|
<LI><P>Get the new DSpace 1.2.1 source code from <A HREF="http://sourceforge.net/projects/dspace/">the DSpace page on SourceForge</A> and unpack it somewhere. Do not unpack it on top of your existing installation!!</P></LI>
|
||||||
|
|
||||||
|
<LI><P>Copy the PostgreSQL driver JAR to the source tree. For example:</P>
|
||||||
|
|
||||||
|
<PRE>cd <i>[dspace]</i>/lib
|
||||||
|
cp postgresql.jar <i>[dspace-1.2.1-source]</i>/lib</PRE></LI>
|
||||||
|
|
||||||
|
<LI><P>Take down Tomcat (or whichever servlet container you're using).</P></LI>
|
||||||
|
|
||||||
|
<LI><P>Your 'localized' JSPs (those in jsp/local) now need to be maintained in the <em>source</em> directory. If you have locally modified JSPs in your <code><i>[dspace]</i>/jsp/local</code> directory, you might like to merge the changes in the new 1.2.1 versions into your locally modified ones. You can use the <code>diff</code> command to compare the 1.2 and 1.2.1 versions to do this. Also see <A HREF="history.html#jsp-changes-1_2-1_2_1">the version history</A> for a list of modified JSPs.</P></LI>
|
||||||
|
|
||||||
|
<LI>
|
||||||
|
<P>You need to add a few new parameters to your <code><i>[dspace]/</i>dspace.cfg</code> for browse/search and item thumbnails display, and for configurable DC metadata fields to be indexed. </P>
|
||||||
|
|
||||||
|
|
||||||
|
<PRE># whether to display thumbnails on browse and search results pages (1.2+)
|
||||||
|
webui.browse.thumbnail.show = false
|
||||||
|
|
||||||
|
# max dimensions of the browse/search thumbs. Must be <= thumbnail.maxwidth
|
||||||
|
# and thumbnail.maxheight. Only need to be set if required to be smaller than
|
||||||
|
# dimension of thumbnails generated by mediafilter (1.2+)
|
||||||
|
#webui.browse.thumbnail.maxheight = 80
|
||||||
|
#webui.browse.thumbnail.maxwidth = 80
|
||||||
|
|
||||||
|
# whether to display the thumb against each bitstream (1.2+)
|
||||||
|
webui.item.thumbnail.show = true
|
||||||
|
|
||||||
|
# where should clicking on a thumbnail from browse/search take the user
|
||||||
|
# Only values currently supported are "item" and "bitstream"
|
||||||
|
#webui.browse.thumbnail.linkbehaviour = item
|
||||||
|
|
||||||
|
|
||||||
|
##### Fields to Index for Search #####
|
||||||
|
|
||||||
|
# DC metadata elements.qualifiers to be indexed for search
|
||||||
|
# format: - search.index.[number] = [search field]:element.qualifier
|
||||||
|
# - * used as wildcard
|
||||||
|
|
||||||
|
### changing these will change your search results, ###
|
||||||
|
### but will NOT automatically change your search displays ###
|
||||||
|
|
||||||
|
search.index.1 = author:contributor.*
|
||||||
|
search.index.2 = author:creator.*
|
||||||
|
search.index.3 = title:title.*
|
||||||
|
search.index.4 = keyword:subject.*
|
||||||
|
search.index.5 = abstract:description.abstract
|
||||||
|
search.index.6 = author:description.statementofresponsibility
|
||||||
|
search.index.7 = series:relation.ispartofseries
|
||||||
|
search.index.8 = abstract:description.tableofcontents
|
||||||
|
search.index.9 = mime:format.mimetype
|
||||||
|
search.index.10 = sponsor:description.sponsorship
|
||||||
|
search.index.11 = id:identifier.* </PRE></LI>
|
||||||
|
|
||||||
|
<LI><P>In <code><i>[dspace-1.2.1-source]</i></code> run:</P>
|
||||||
|
|
||||||
|
<pre>ant -Dconfig=<i>[dspace]</i>/config/dspace.cfg update</pre></LI>
|
||||||
|
|
||||||
|
<LI><P>Copy the <code>.war</code> Web application files in <code><i>[dspace-1.2.1-source]</i>/build</code> to the <code>webapps</code> sub-directory of your servlet container (e.g. Tomcat). e.g.:</P>
|
||||||
|
|
||||||
|
<PRE>cp <i>[dspace-1.2.1-source]</i>/build/*.war <i>[tomcat]</i>/webapps</PRE>
|
||||||
|
|
||||||
|
<P>If you're using Tomcat, you need to delete the directories corresponding to the old <code>.war</code> files. For example, if <code>dspace.war</code> is installed in <code><i>[tomcat]</i>/webapps/dspace.war</code>, you should delete the <code><i>[tomcat]</i>/webapps/dspace</code> directory. Otherwise, Tomcat will continue to use the old code in that directory. </P></LI>
|
||||||
|
|
||||||
|
<LI><P>Restart Tomcat.</P></LI>
|
||||||
|
</OL>
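<P>The <code>search.index</code> format shown in the steps above (<code>search.index.[number] = [search field]:element.qualifier</code>) can be checked with a short script before rebuilding. This is a sketch of that format only, written from the comment in the configuration block; it is not how DSpace itself reads the file, and <code>parse_search_index</code> is a hypothetical name.</P>

```python
import re

# search.index.[number] = [search field]:element.qualifier
INDEX_LINE = re.compile(
    r"^search\.index\.(\d+)\s*=\s*([^:\s]+):([^.\s]+)(?:\.(\S+))?\s*$")


def parse_search_index(line):
    """Split one search.index line into (number, search field, element, qualifier)."""
    match = INDEX_LINE.match(line.strip())
    if match is None:
        raise ValueError("not a search.index line: %r" % line)
    number, field, element, qualifier = match.groups()
    # A qualifier of '*' is the wildcard; None means no qualifier was given.
    return int(number), field, element, qualifier


if __name__ == "__main__":
    for entry in ("search.index.1 = author:contributor.*",
                  "search.index.5 = abstract:description.abstract"):
        print(parse_search_index(entry))
```

<P>Note that several index lines may feed the same search field: in the defaults above, <code>author</code> is built from three different DC elements.</P>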
|
||||||
|
<H2><A NAME="11_12">Updating From 1.1 (or 1.1.1) to 1.2</A></H2>
|
||||||
|
|
||||||
|
<P>The process for upgrading to 1.2 from either 1.1 or 1.1.1 is the same. If you are running DSpace 1.0 or 1.0.1, you need to follow the <A HREF="#101_11">instructions for upgrading from 1.0.1 to 1.1</A> before following these instructions.</P>
|
||||||
|
|
||||||
|
<P>Note also that if you've substantially modified DSpace, these instructions apply to an unmodified 1.1.1 DSpace instance, and you'll need to adapt the process to any modifications you've made.</P>
|
||||||
|
|
||||||
|
<p>This document refers to the install directory for your existing DSpace installation as <code><i>[dspace]</i></code>, and to the source directory for
|
||||||
|
DSpace 1.2 as <code><i>[dspace-1.2-source]</i></code>. Whenever you see these path references below, be sure to replace them with the actual path names on your local system.</p>
|
||||||
|
|
||||||
|
<OL>
|
||||||
|
<LI><P>Step one is, of course, to <strong>back up all your data</strong> before proceeding!! Include all of the contents of <code><i>[dspace]</i></code> and the PostgreSQL database in your backup.</P></LI>
|
||||||
|
|
||||||
|
<LI><P>Get the new DSpace 1.2 source code from <A HREF="http://sourceforge.net/projects/dspace/">the DSpace page on SourceForge</A> and unpack it somewhere. Do not unpack it on top of your existing installation!!</P></LI>
|
||||||
|
|
||||||
|
<LI><P>Copy the <A HREF="install.html#javalibs">required Java libraries</A> that we couldn't include in the bundle to the source tree. For example:</P>
|
||||||
|
|
||||||
|
<PRE>cd <i>[dspace]</i>/lib
|
||||||
|
cp activation.jar servlet.jar mail.jar <i>[dspace-1.2-source]</i>/lib</PRE></LI>
|
||||||
|
|
||||||
|
<LI><P>Stop Tomcat (or other servlet container).</P></LI>
|
||||||
|
|
||||||
|
<LI>
|
||||||
|
<P>It's a good idea to upgrade all of the various third-party tools that DSpace uses to their latest versions:</P>
|
||||||
|
<UL>
|
||||||
|
<LI><P>Java (note that now version 1.4.0 or later is <em>required</em>)</P></LI>
|
||||||
|
<LI><P>Tomcat (Any version after 4.0 will work; symbolic links are no longer an issue)</P></LI>
|
||||||
|
<LI><P>PostgreSQL (don't forget to build/download an updated JDBC driver .jar file! Also, <strong>back up the database</strong> first.)</P></LI>
|
||||||
|
<LI><P>Ant</P></LI>
|
||||||
|
</UL>
|
||||||
|
</LI>
|
||||||
|
|
||||||
|
<LI>
|
||||||
|
<P>You need to add the following new parameters to your <code><i>[dspace]/</i>dspace.cfg</code>:</P>
|
||||||
|
|
||||||
|
<PRE>##### Media Filter settings #####
|
||||||
|
# maximum width and height of generated thumbnails
|
||||||
|
thumbnail.maxwidth 80
|
||||||
|
thumbnail.maxheight 80</PRE>
|
||||||
|
|
||||||
|
<P>There are one or two other, optional extra parameters (for controlling the pool of database connections). See <A HREF="history.html">the version history</A> for details. If you leave them out, defaults will be used.</P>
|
||||||
|
|
||||||
|
<P>Also, to avoid future confusion, you might like to <strong>remove</strong> the following property, which is no longer required:</P>
|
||||||
|
|
||||||
|
<PRE>config.template.oai-web.xml = <em>[dspace]</em>/oai/WEB-INF/web.xml</PRE>
|
||||||
|
</LI>
|
||||||
|
|
||||||
|
<LI><P>The layout of the installation directory (i.e. the structure of the contents of <code><i>[dspace]</i></code>) has changed somewhat since 1.1.1. First up, your 'localized' JSPs (those in jsp/local) now need to be maintained in the <em>source</em> directory. So make a copy of them now!</P>
|
||||||
|
|
||||||
|
<P>Once you've done that, you can remove <code><i>[dspace]/</i>jsp</code> and <code><i>[dspace]</i>/oai</code>; these are no longer used (.war Web application archive files are used instead).</P>
|
||||||
|
|
||||||
|
<P>Also, if you're using the same version of Tomcat as before, you need to <strong>remove the lines from Tomcat's conf/server.xml file that enable symbolic links for DSpace.</strong> These are the <code><Context></code> elements you added to get DSpace 1.1.1 working, looking something like this:</P>
|
||||||
|
|
||||||
|
<pre><Context path="/dspace" docBase="dspace" debug="0" reloadable="true" crossContext="true">
|
||||||
|
<Resources className="org.apache.naming.resources.FileDirContext" allowLinking="true" />
|
||||||
|
</Context></pre>
|
||||||
|
|
||||||
|
<P>Be sure to remove the <Context> elements for both the Web UI and the OAI Web applications.</P>
|
||||||
|
</LI>
|
||||||
|
|
||||||
|
<LI><P>Build and install the updated DSpace 1.2 code. Go to the DSpace 1.2 source directory, and run:</P>
|
||||||
|
|
||||||
|
<PRE>ant -Dconfig=<i>[dspace]</i>/config/dspace.cfg update</PRE></LI>
|
||||||
|
|
||||||
|
<LI><P>Copy the new config files in <code>config</code> to your installation, e.g.:</P>
|
||||||
|
|
||||||
|
<PRE>cp <i>[dspace-1.2-source]</i>/config/news-* <i>[dspace-1.2-source]</i>/config/mediafilter.cfg <i>[dspace-1.2-source]</i>/config/dc2mods.cfg <i>[dspace]</i>/config</PRE></LI>
|
||||||
|
|
||||||
|
<LI>
|
||||||
|
<P>You'll need to make some changes to the database schema in your PostgreSQL database. <code><i>[dspace-1.2-source]</i>/etc/database_schema_11-12.sql</code> contains the SQL commands to achieve this. If you've modified the schema locally, you may need to check over this and make alterations.</P>
|
||||||
|
|
||||||
|
<P>To apply the changes, go to the source directory, and run:</P>
|
||||||
|
|
||||||
|
<pre>psql -f etc/database_schema_11-12.sql [DSpace database name] -h localhost</pre>
|
||||||
|
</LI>
|
||||||
|
|
||||||
|
<LI><P>A tool supplied with the DSpace 1.2 codebase will then update the actual data in the relational database. Run it using:</P>
|
||||||
|
|
||||||
|
<PRE><i>[dspace]</i>/bin/dsrun org.dspace.administer.Upgrade11To12</PRE></LI>
|
||||||
|
|
||||||
|
<LI><P>Then rebuild the search indices:</P>
|
||||||
|
|
||||||
|
<PRE><i>[dspace]</i>/bin/index-all</PRE></LI>
|
||||||
|
|
||||||
|
<LI><P>Delete the existing symlinks from your servlet container's (e.g. Tomcat's) <code>webapps</code> sub-directory.</P>
|
||||||
|
|
||||||
|
<P>Copy the <code>.war</code> Web application files in <code><i>[dspace-1.2-source]</i>/build</code> to the <code>webapps</code> sub-directory of your servlet container (e.g. Tomcat). e.g.:</P>
|
||||||
|
|
||||||
|
<PRE>cp <i>[dspace-1.2-source]</i>/build/*.war <i>[tomcat]</i>/webapps</PRE></LI>
|
||||||
|
|
||||||
|
<LI><P>Restart Tomcat.</P></LI>
|
||||||
|
|
||||||
|
<LI><P>To get image thumbnails generated and full-text extracted for indexing automatically, you need to set up a 'cron' job, for example one like this:</P>
|
||||||
|
|
||||||
|
<PRE># Run the media filter at 02:00 every day
|
||||||
|
0 2 * * * <i>[dspace]</i>/bin/filter-media</PRE>
|
||||||
|
|
||||||
|
<P>You might also wish to run it now to generate thumbnails and index full text for the content already in your system.</P></LI>
|
||||||
|
|
||||||
|
<LI>
|
||||||
|
<P><strong>Note 1</strong>: This update process has effectively 'touched' all of your items. Although the dates in the Dublin Core metadata won't have changed (accession date and so forth), the 'last modified' date in the database for each will have been changed.</P>
|
||||||
|
|
||||||
|
<P>This means the e-mail subscription tool may be confused, thinking that all items in the archive have been deposited that day, and could thus send a rather long email to lots of subscribers. So, it is recommended that you <strong>turn off the e-mail subscription feature for the next day</strong>, by commenting out the relevant line in DSpace's cron job, and then re-activating it the next day.</P>
|
||||||
|
|
||||||
|
<P>Say you performed the update on 08-June-2004 (UTC), and your e-mail subscription cron job runs at 4am (UTC). When the subscription tool runs at 4am on 09-June-2004, it will find that everything in the system has a modification date of 08-June-2004, and accordingly send out huge emails. So, immediately after the update, you would edit DSpace's 'crontab' and comment out the <code>/dspace/bin/subs-daily</code> line. Then, after 4am on 09-June-2004 you'd 'un-comment' it, so that things proceed normally.</P>
|
||||||
|
|
||||||
|
<P>Of course, this means any <em>real</em> new deposits on 08-June-2004 won't get e-mailed; however, if you're updating the system it's likely to be down for some time, so this shouldn't be a big problem.</P>
|
||||||
|
</LI>
|
||||||
|
|
||||||
|
<LI>
|
||||||
|
<P><strong>Note 2:</strong> After consultation with the OAI community, various OAI-PMH changes have occurred:</P>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI><P>The OAI-PMH identifiers have changed (they're now of the form <code>oai:<em>hostname</em>:<em>handle</em></code> as opposed to just Handles).</P></LI>
|
||||||
|
|
||||||
|
<LI><P>The set structure has changed, due to the new sub-communities feature.</P></LI>
|
||||||
|
|
||||||
|
<LI><P>The default base URL has changed</P></LI>
|
||||||
|
|
||||||
|
<LI><P>As noted in note 1, every item has been 'touched' and will need re-harvesting.</P></LI>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<P>The above means that, if already registered and harvested, you will need to re-register your repository, effectively as a 'new' OAI-PMH data provider. You should also consider posting an announcement to the <A HREF="http://www.openarchives.org/mailman/listinfo/OAI-implementers">OAI implementers e-mail list</A> so that harvesters know to update their systems.</P>
|
||||||
|
|
||||||
|
<P>Also note that your site may, over the next few days, take quite a big hit from OAI-PMH harvesters. The resumption token support should alleviate this a little, but you might want to temporarily whack up the database connection pool parameters in <code><em>[dspace]</em>/config/dspace.cfg</code>. See the <code>dspace.cfg</code> distributed with the source code to see what these parameters are and how to use them. (You need to stop and restart Tomcat after changing them.)</P>
|
||||||
|
|
||||||
|
<P>I realize this is not ideal; for discussion as to the reasons behind this please see relevant posts to the OAI community: <A HREF="http://openarchives.org/pipermail/oai-implementers/2004-June/001214.html">post one</A>, <A HREF="http://openarchives.org/pipermail/oai-implementers/2004-June/001224.html">post two</A>, as well as <A HREF="http://sourceforge.net/mailarchive/forum.php?thread_id=4961727&forum_id=13580">this post to the dspace-tech mailing list</A>.</P>
|
||||||
|
|
||||||
|
<P>If you really can't live with updating the base URL like this, you can fairly easily have things proceed more-or-less as they are, by doing the following:</P>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI><P>Change the value of <code>OAI_ID_PREFIX</code> at the top of the <code>org.dspace.app.oai.DSpaceOAICatalog</code> class to <code>hdl:</code></P></LI>
|
||||||
|
<LI><P>Change the servlet mapping for the <code>OAIHandler</code> servlet back to <code>/</code> (from <code>/request</code>)</P></LI>
|
||||||
|
<LI><P>Rebuild and deploy <code>dspace-oai.war</code></P></LI>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<P>However, note that in this case, all the records will be re-harvested by harvesters anyway, so you still need to brace for the associated DB activity; also note that the set spec changes may not be picked up by some harvesters. It's recommended you read the above-linked mailing list posts to understand why the change was made.</P>
|
||||||
|
</LI>
|
||||||
|
</OL>
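<P>To make the identifier change in Note 2 concrete, here is a small sketch that builds the new-style identifier (<code>oai:<em>hostname</em>:<em>handle</em></code>) from an old one. It assumes that old identifiers were either a bare Handle or a Handle with an <code>hdl:</code> prefix; the host name and Handle in the example are made up, and the function names are hypothetical.</P>

```python
def new_oai_identifier(hostname, handle):
    """Build an identifier of the form oai:hostname:handle."""
    return "oai:%s:%s" % (hostname, handle)


def from_old_identifier(old_identifier, hostname):
    """Convert an old Handle-based identifier to the new form.

    Assumption: the old identifier is a bare Handle, optionally
    prefixed with 'hdl:'.
    """
    prefix = "hdl:"
    if old_identifier.startswith(prefix):
        handle = old_identifier[len(prefix):]
    else:
        handle = old_identifier
    return new_oai_identifier(hostname, handle)


if __name__ == "__main__":
    # Hypothetical host name and Handle, for illustration only.
    print(from_old_identifier("hdl:123456789/42", "dspace.example.edu"))
```

<P>A harvester sees the old and new identifiers as unrelated records, which is why the repository effectively has to be treated as a new data provider.</P>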
|
||||||
|
|
||||||
|
<P>Now, you should be finished!</P>
|
||||||
|
|
||||||
|
|
||||||
|
<H2><A NAME="11_111">Updating From 1.1 to 1.1.1</A></H2>
|
||||||
|
|
||||||
|
<P>Fortunately the changes in 1.1.1 are only code changes so the update is fairly simple.</P>
|
||||||
|
|
||||||
|
<p>In the notes below <code><i>[dspace]</i></code> refers to the install directory for your existing DSpace installation,
|
||||||
|
and <code><i>[dspace-1.1.1-source]</i></code> to the source directory for DSpace 1.1.1. Whenever you see these path
|
||||||
|
references, be sure to replace them with the actual path names on your local system.</p>
|
||||||
|
|
||||||
|
<OL>
|
||||||
|
<LI><P>Take down Tomcat.</P></LI>
|
||||||
|
|
||||||
|
<LI><P>It would be a good idea to update any of the third-party tools used by DSpace at this point (e.g. PostgreSQL), following the instructions provided with the relevant tools.</P></LI>
|
||||||
|
|
||||||
|
<LI><P>In <code><i>[dspace-1.1.1-source]</i></code> run:</P>
|
||||||
|
|
||||||
|
<pre>ant -Dconfig=<i>[dspace]</i>/config/dspace.cfg update</pre></LI>
|
||||||
|
|
||||||
|
<LI><P>If you have locally modified versions of the following JSPs in your <code><i>[dspace]</i>/jsp/local</code> directory, you might like to merge the changes in the new 1.1.1 versions into your locally modified ones. You can use the <code>diff</code> command to compare the 1.1 and 1.1.1 versions to do this. The changes are quite minor.</P>
|
||||||
|
|
||||||
|
<PRE>collection-home.jsp
|
||||||
|
admin/authorize-collection-edit.jsp
|
||||||
|
admin/authorize-community-edit.jsp
|
||||||
|
admin/authorize-item-edit.jsp
|
||||||
|
admin/eperson-edit.jsp</PRE></LI>
|
||||||
|
|
||||||
|
<LI><P>Restart Tomcat.</P></LI>
|
||||||
|
</OL>
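<P>If the <code>diff</code> command is not available on your system, the comparison of JSP versions mentioned in the steps above can be approximated with Python's standard <code>difflib</code> module. This is a minimal sketch; the function name <code>jsp_changes</code> is hypothetical, and you would read the two versions of each file from your own 1.1 and 1.1.1 source trees.</P>

```python
import difflib


def jsp_changes(old_text, new_text, name):
    """Return a unified diff between the 1.1 and 1.1.1 versions of one JSP."""
    return "".join(difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile="1.1/" + name,
        tofile="1.1.1/" + name))


if __name__ == "__main__":
    # Made-up file contents, for illustration only.
    old = "<h1>Collection</h1>\n<p>old text</p>\n"
    new = "<h1>Collection</h1>\n<p>new text</p>\n"
    print(jsp_changes(old, new, "collection-home.jsp"), end="")
```

<P>An empty result means the two versions are identical, so there is nothing to merge for that file.</P>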
|
||||||
|
|
||||||
|
<H2><A NAME="101_11">Updating From 1.0.1 to 1.1</A></H2>
|
||||||
|
|
||||||
|
<P>To upgrade from DSpace 1.0.1 to 1.1, follow the steps below. Your <code>dspace.cfg</code> does not need to be changed.
|
||||||
|
In the notes below <code><i>[dspace]</i></code> refers to the install directory for your existing DSpace installation,
|
||||||
|
and <code><i>[dspace-1.1-source]</i></code> to the source directory for DSpace 1.1. Whenever you see these path
|
||||||
|
references, be sure to replace them with the actual path names on your local system.</P>
|
||||||
|
|
||||||
|
<OL>
|
||||||
|
<LI><P>Take down Tomcat (or whichever servlet container you're using).</P></LI>
|
||||||
|
|
||||||
|
<LI><P>We recommend that you upgrade to the latest version of PostgreSQL (7.3.2). Included are some <A HREF="postgres-upgrade-notes.txt">notes to help you do this</A>. Note that you will also have to upgrade Ant to version 1.5 if you do this.</P></LI>
|
||||||
|
|
||||||
|
<LI><P>Make the necessary changes to the DSpace database. These include a couple of minor schema changes, and some new indices which should improve performance. Also, the names of a couple of database views have been changed since the old names were so long they were causing problems. First run <code>psql</code> to access your database (e.g. <code>psql -U dspace -W</code> and then enter the password), and enter these SQL commands:</P>
|
||||||
|
|
||||||
|
<PRE>ALTER TABLE bitstream ADD store_number INTEGER;
|
||||||
|
UPDATE bitstream SET store_number = 0;
|
||||||
|
|
||||||
|
ALTER TABLE item ADD last_modified TIMESTAMP;
|
||||||
|
CREATE INDEX last_modified_idx ON Item(last_modified);
|
||||||
|
|
||||||
|
CREATE INDEX eperson_email_idx ON EPerson(email);
|
||||||
|
CREATE INDEX item2bundle_item_idx on Item2Bundle(item_id);
|
||||||
|
CREATE INDEX bundle2bitstream_bundle_idx ON Bundle2Bitstream(bundle_id);
|
||||||
|
CREATE INDEX dcvalue_item_idx on DCValue(item_id);
|
||||||
|
CREATE INDEX collection2item_collection_idx ON Collection2Item(collection_id);
|
||||||
|
CREATE INDEX resourcepolicy_type_id_idx ON ResourcePolicy (resource_type_id,resource_id);
|
||||||
|
CREATE INDEX epersongroup2eperson_group_idx on EPersonGroup2EPerson(eperson_group_id);
|
||||||
|
CREATE INDEX handle_handle_idx ON Handle(handle);
|
||||||
|
CREATE INDEX sort_author_idx on ItemsByAuthor(sort_author);
|
||||||
|
CREATE INDEX sort_title_idx on ItemsByTitle(sort_title);
|
||||||
|
CREATE INDEX date_issued_idx on ItemsByDate(date_issued);
|
||||||
|
|
||||||
|
DROP VIEW CollectionItemsByDateAccessioned;
|
||||||
|
|
||||||
|
DROP VIEW CommunityItemsByDateAccessioned;
|
||||||
|
CREATE VIEW CommunityItemsByDateAccession as SELECT Community2Item.community_id, ItemsByDateAccessioned.* FROM ItemsByDateAccessioned, Community2Item WHERE ItemsByDateAccessioned.item_id = Community2Item.item_id;
|
||||||
|
CREATE VIEW CollectionItemsByDateAccession AS SELECT collection2item.collection_id, itemsbydateaccessioned.items_by_date_accessioned_id, itemsbydateaccessioned.item_id, itemsbydateaccessioned.date_accessioned FROM itemsbydateaccessioned, collection2item WHERE (itemsbydateaccessioned.item_id = collection2item.item_id);</PRE></LI>
|
||||||
|
|
||||||
|
<LI><P>Fix your JSPs for Unicode. If you've modified the site 'skin' (<code>jsp/local/layout/header-default.jsp</code>) you'll need to add the Unicode header, i.e.:</P>
|
||||||
|
|
||||||
|
<PRE>&lt;meta http-equiv="Content-Type" content="text/html; charset=UTF-8"&gt;</PRE>
|
||||||
|
|
||||||
|
<P>to the <code>&lt;HEAD&gt;</code> element. If you have any locally-edited JSPs, you need to add this page directive to the top of all of them:</P>
|
||||||
|
|
||||||
|
<PRE>&lt;%@ page contentType="text/html;charset=UTF-8" %&gt;</PRE>
|
||||||
|
|
||||||
|
<P>(If you haven't modified any JSPs, you don't have to do anything.)</P></LI>
|
||||||
|
|
||||||
|
|
||||||
|
<LI><P>Copy the <A HREF="install.html#javalibs">required Java libraries</A> that we couldn't include in the bundle to the source tree. For example:</P>
|
||||||
|
|
||||||
|
<PRE>cd <i>[dspace]</i>/lib
|
||||||
|
cp *.policy activation.jar servlet.jar mail.jar <i>[dspace-1.1-source]</i>/lib</PRE></LI>
|
||||||
|
|
||||||
|
|
||||||
|
<LI><P>Compile up the new DSpace code, replacing <code><i>[dspace]</i>/config/dspace.cfg</code> with the path to your current, LIVE configuration. (The second line, <code>touch `find .`</code>, is a precaution, which ensures that the new code has a current datestamp and will overwrite the old code. Note that those are back quotes.)</P>
|
||||||
|
|
||||||
|
<PRE>cd <i>[dspace-1.1-source]</i>
|
||||||
|
touch `find .`
|
||||||
|
ant
|
||||||
|
ant -Dconfig=<i>[dspace]</i>/config/dspace.cfg update</PRE></LI>
|
||||||
|
|
||||||
|
|
||||||
|
<LI><P>Update the database tables using the upgrader tool, which sets up the new <code>last_modified</code> date in the item table:</P>
|
||||||
|
|
||||||
|
<PRE><i>[dspace]</i>/bin/dsrun org.dspace.administer.Upgrade101To11</PRE></LI>
|
||||||
|
|
||||||
|
|
||||||
|
<LI><P>Run the collection default authorisation policy tool:</P>
|
||||||
|
|
||||||
|
<PRE><i>[dspace]</i>/bin/dsrun org.dspace.authorize.FixDefaultPolicies</PRE></LI>
|
||||||
|
|
||||||
|
|
||||||
|
<LI><P>Fix the OAICat properties file. Edit <code><i>[dspace]</i>/config/templates/oaicat.properties</code>. Change the line that says</P>
|
||||||
|
|
||||||
|
<PRE>Identify.deletedRecord=yes</PRE>
|
||||||
|
|
||||||
|
<P>To:</P>
|
||||||
|
|
||||||
|
<PRE>Identify.deletedRecord=persistent</PRE>
|
||||||
|
|
||||||
|
<P>This is needed to fix the OAI-PMH 'Identify' verb response. Then run <code><i>[dspace]</i>/bin/install-configs</code>.</P></LI>
|
||||||
|
|
||||||
|
|
||||||
|
<LI><P>Re-run the indexing to index abstracts and fill out the renamed database views:</P>
|
||||||
|
|
||||||
|
<PRE><i>[dspace]</i>/bin/index-all</PRE></LI>
|
||||||
|
|
||||||
|
|
||||||
|
<LI><P>Restart Tomcat. Tomcat should be run with the following environment variable set, to ensure that Unicode is handled properly. Also, the default JVM memory heap sizes are rather small. Adjust <code>-Xmx512M</code> (512Mb maximum heap size) and <code>-Xms64M</code> (64Mb initial heap size) to suit your hardware.</P>
|
||||||
|
|
||||||
|
<PRE>JAVA_OPTS="-Xmx512M -Xms64M -Dfile.encoding=UTF-8"</PRE></LI>
|
||||||
|
</OL>
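<P>The Unicode step above asks for a page directive at the top of every locally-edited JSP. As a rough sketch of how to check this, the hypothetical helper below (<code>find_jsps_without_utf8</code>, not a DSpace tool) lists the JSP files under a directory that do not yet contain the directive. It only reports files and does not modify them, and it assumes the directive is written on one line in the form shown in the step.</P>

```shell
#!/bin/sh
# Sketch only: find_jsps_without_utf8 is a hypothetical helper, not a DSpace tool.
# It prints the path of every *.jsp file under the given directory that does
# not contain the UTF-8 page directive. Files are never modified.
# Assumption: the directive appears on a single line, optionally with spaces
# after the semicolon.

# Usage: find_jsps_without_utf8 JSP_DIRECTORY
find_jsps_without_utf8() {
    jsp_dir=$1
    find "$jsp_dir" -type f -name '*.jsp' | while IFS= read -r jsp_file; do
        # grep -q succeeds when the directive is present; report the file otherwise.
        if ! grep -q 'contentType="text/html; *charset=UTF-8"' "$jsp_file"; then
            echo "$jsp_file"
        fi
    done
}
```

<P>For example, <code>find_jsps_without_utf8 <i>[dspace]</i>/jsp/local</code> would print the local JSPs that still need the directive added by hand. An empty result suggests that nothing more needs to be done for that step.</P>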
|
||||||
|
|
||||||
|
|
||||||
|
<HR>
|
||||||
|
|
||||||
|
<ADDRESS>
|
||||||
|
Copyright © 2002-2004 MIT and Hewlett Packard
|
||||||
|
</ADDRESS>
|
||||||
|
</BODY>
|
||||||
|
</HTML>
|
@@ -1,18 +1,10 @@
|
|||||||
#!/bin/sh
|
#!/bin/sh
|
||||||
|
|
||||||
USAGE="$0 [-d <doc-cvs-tag>] cvs-tag version"
|
USAGE="$0 cvs-tag version"
|
||||||
|
|
||||||
# Just in case you need to 'socksify' etc
|
# Just in case you need to 'socksify' etc
|
||||||
CVS_COMMAND="cvs"
|
CVS_COMMAND="cvs"
|
||||||
|
|
||||||
DOC_CVSTAG="no"
|
|
||||||
|
|
||||||
# Check for doc CVS tag
|
|
||||||
if [ "$1" = "-d" ]; then
|
|
||||||
DOC_CVSTAG=$2
|
|
||||||
shift;shift
|
|
||||||
fi
|
|
||||||
|
|
||||||
# Check we have required command-line arguments
|
# Check we have required command-line arguments
|
||||||
if [ "$#" != "2" ]; then
|
if [ "$#" != "2" ]; then
|
||||||
echo $USAGE
|
echo $USAGE
|
||||||
@@ -21,7 +13,6 @@ fi
|
|||||||
|
|
||||||
FILENAME="dspace-$2-source"
|
FILENAME="dspace-$2-source"
|
||||||
|
|
||||||
|
|
||||||
mkdir tmp
|
mkdir tmp
|
||||||
cd tmp
|
cd tmp
|
||||||
|
|
||||||
@@ -34,21 +25,6 @@ rm -f dspace/make-release-package
|
|||||||
# Or silly cvsignore files
|
# Or silly cvsignore files
|
||||||
rm -f `find dspace -name .cvsignore`
|
rm -f `find dspace -name .cvsignore`
|
||||||
|
|
||||||
# Check out docs if appropriate
|
|
||||||
if [ "$DOC_CVSTAG" != "no" ]; then
|
|
||||||
|
|
||||||
echo "Checking out docs..."
|
|
||||||
cd dspace
|
|
||||||
$CVS_COMMAND -Q export -r $DOC_CVSTAG docs
|
|
||||||
|
|
||||||
# Remove unwanted stuff
|
|
||||||
rm -f docs/.cvsignore
|
|
||||||
rm -rf docs/originals
|
|
||||||
rm -f docs/make-doc-package
|
|
||||||
|
|
||||||
cd ..
|
|
||||||
fi
|
|
||||||
|
|
||||||
echo "Creating tarball..."
|
echo "Creating tarball..."
|
||||||
mv dspace $FILENAME
|
mv dspace $FILENAME
|
||||||
|
|
||||||
|