From 0ab5d3bad40f72353add73aa5834b6b8a43fbc1e Mon Sep 17 00:00:00 2001
From: Stuart Lewis
Date: Tue, 28 Aug 2007 11:05:15 +0000
Subject: [PATCH] SF Patch #1591871 Docs for google and html sitemaps
git-svn-id: http://scm.dspace.org/svn/repo/trunk@2170 9c30dcfa-912a-0410-8fc2-9e0234be79fd
---
.../dspace/app/sitemap/GenerateSitemaps.java | 5 ++-
dspace/CHANGES | 5 ++-
dspace/bin/generate-sitemaps | 2 +-
dspace/docs/install.html | 33 +++++++++++++++++++
4 files changed, 42 insertions(+), 3 deletions(-)
diff --git a/dspace-api/src/main/java/org/dspace/app/sitemap/GenerateSitemaps.java b/dspace-api/src/main/java/org/dspace/app/sitemap/GenerateSitemaps.java
index 38db2604a7..1b913773d9 100644
--- a/dspace-api/src/main/java/org/dspace/app/sitemap/GenerateSitemaps.java
+++ b/dspace-api/src/main/java/org/dspace/app/sitemap/GenerateSitemaps.java
@@ -188,7 +188,10 @@ public class GenerateSitemaps
File outputDir = new File(ConfigurationManager
.getProperty("dspace.dir"), "sitemaps");
-
+ if (!outputDir.exists()) {
+ outputDir.mkdir();
+ }
+
AbstractGenerator html = null;
AbstractGenerator sitemapsOrg = null;
diff --git a/dspace/CHANGES b/dspace/CHANGES
index a83c5bc595..373919b877 100644
--- a/dspace/CHANGES
+++ b/dspace/CHANGES
@@ -58,10 +58,14 @@
(Stuart Lewis)
- SF Patch #1737792 Patch for bug 1552760 - Submit interface looks bad in Safari
- Removal of message for Netscape users in choose-file.jsp + removal of supporting text and images in the help file
+- SF Patch #1591871 Docs for google and html sitemaps
(Chris yates)
- SF Patch #1724330 Removes "null" being displayed in community-home.jsp
+(Robert Tansley / Stuart Lewis)
+- SF Patch #1587225 Google and html sitemap generator
+
1.4.2 beta
===========
@@ -111,7 +115,6 @@
(Stuart Lewis)
- SF Patch #1670110 for SF Bug #1670106 Onebox and textarea fail when visibility set to workflow
- SF Patch #1628889 Improve file size descriptions in ItemTag
-- SF Patch #1587225 Google and html sitemap generator
- SF Patch #1641678 [dspace]/bin scripts for import and export
- SF Patch #1642336 Restrict domains of self-registered users
diff --git a/dspace/bin/generate-sitemaps b/dspace/bin/generate-sitemaps
index 80a5ce4c20..1f9407b579 100644
--- a/dspace/bin/generate-sitemaps
+++ b/dspace/bin/generate-sitemaps
@@ -47,5 +47,5 @@
# Get the DSPACE/bin directory
BINDIR=`dirname $0`
-echo "Generating HTML sitemaps"
+echo "Generating sitemaps"
$BINDIR/dsrun org.dspace.app.sitemap.GenerateSitemaps $@
diff --git a/dspace/docs/install.html b/dspace/docs/install.html
index e11e903af6..452f3bd613 100644
--- a/dspace/docs/install.html
+++ b/dspace/docs/install.html
@@ -528,6 +528,39 @@ $JAVA_HOME/bin/keytool -genkey -alias tomcat -keyalg RSA -keysize 1024 \
will change any handles currently assigned prefix 123456789 to prefix 1303, so for example handle 123456789/23 will be updated to 1303/23 in the database.
+
+
+ To help web crawlers index the content within your repository, you can make use of sitemaps. There are currently two forms of sitemaps included in DSpace: Google sitemaps and HTML sitemaps.
+
+ Sitemaps allow DSpace to expose its content without the crawlers having to index every page. HTML sitemaps provide a list of all items, collections and communities in HTML format, whilst Google sitemaps provide the same information in gzipped XML format.
+
+ To generate the sitemaps, you need to run [dspace]/bin/generate-sitemaps. This creates the sitemaps in [dspace]/sitemaps/
+
+ The sitemaps can be accessed from the following URLs:
+
+
+ - http://dspace.example.com/dspace/sitemap - Index sitemap
+ - http://dspace.example.com/dspace/sitemap?map=0 - First list of items (up to 50,000)
+ - http://dspace.example.com/dspace/sitemap?map=n - Subsequent lists of items (e.g. 50,001 to 100,000), and so on
+
+
+ HTML sitemaps follow the same procedure:
+
+
+ - http://dspace.example.com/dspace/htmlmap - Index sitemap
+ - etc...
+
+
+
+
+ When you run [dspace]/bin/generate-sitemaps, the script informs Google that the sitemaps have been updated. For this update to register correctly, you must first register your Google sitemap index page (/dspace/sitemap) with Google at http://www.google.com/webmasters/sitemaps/. If your DSpace server requires the use of an HTTP proxy to connect to the Internet, ensure that you have set http.proxy.host and http.proxy.port in [dspace]/config/dspace.cfg.
+
+ The URLs used to ping Google (and, in future, other search engines) are configured in [dspace]/config/dspace.cfg using the sitemap.engineurls setting, which takes a comma-separated list of URLs to 'ping'.
+
+
You can generate the sitemaps automatically every day using an additional cron job:
+
+ # Generate sitemaps
0 6 * * * [dspace]/bin/generate-sitemaps
+
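
Note on the paging scheme described in the install.html text above: the documentation says each sitemap file lists up to 50,000 entries, addressed as sitemap?map=0, sitemap?map=1, and so on. The following is a minimal sketch of that arithmetic only, not the code in GenerateSitemaps.java. The class name SitemapPaging, its method names, and the example total of 120,000 entries are hypothetical; the 50,000 limit and the URL pattern are taken from the documentation text.

```java
/**
 * Illustrative sketch only: shows how entries could map onto numbered
 * sitemap files. Class and method names are hypothetical and are not
 * part of DSpace.
 */
public class SitemapPaging {

    // Per-file limit, assumed from the "up to 50,000" wording in the docs.
    static final int MAX_URLS_PER_FILE = 50000;

    /** Number of sitemap files needed for the given number of entries. */
    static int fileCount(int totalUrls) {
        // Ceiling division without floating point.
        return (totalUrls + MAX_URLS_PER_FILE - 1) / MAX_URLS_PER_FILE;
    }

    /** Zero-based map index for a 1-based entry position. */
    static int mapIndexFor(int position) {
        return (position - 1) / MAX_URLS_PER_FILE;
    }

    /** Builds a sitemap?map=n URL from a base URL such as http://dspace.example.com/dspace. */
    static String mapUrl(String baseUrl, int mapIndex) {
        return baseUrl + "/sitemap?map=" + mapIndex;
    }

    public static void main(String[] args) {
        String base = "http://dspace.example.com/dspace";
        int total = 120000; // hypothetical repository size

        // The index sitemap comes first, then one URL per numbered file.
        System.out.println(base + "/sitemap");
        for (int i = 0; i < fileCount(total); i++) {
            System.out.println(mapUrl(base, i));
        }
    }
}
```

Under these assumptions, entry 50,000 falls in map=0 and entry 50,001 in map=1, which matches the ranges given in the documentation text.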