add docs on configuring rewriteDownloadUrls

This commit is contained in:
Art Lowel
2020-09-18 17:34:39 +02:00
parent f96549d5c7
commit 47edd9b6d8

View File

@@ -63,3 +63,70 @@ Angulartics can be configured to work with a number of other services besides Go
In order to start using one of these services, select it from the [Angulartics Providers page](https://angulartics.github.io/angulartics2/#providers), and follow the instructions on how to configure it.
The Google Analytics script was added in [`main.browser.ts`](https://github.com/DSpace/dspace-angular/blob/ff04760f4af91ac3e7add5e7424a46cb2439e874/src/main.browser.ts#L33) instead of the `<head>` tag in `index.html` to ensure events get sent when the page is shown in a client's browser, and not when it's rendered on the universal server. Likely you'll want to do the same when adding a new service.
## SEO when hosting REST Api and UI on different servers
Indexers such as Google Scholar require that files are hosted on the same domain as the page that links them. In DSpace 7, Bitstreams are served from the REST server. So if you use different servers for the REST api and the UI you'll want to ensure that Bitstream downloads are proxied through the UI server.
In order to achieve this we'll need to do two things:
- **Proxy the Bitstream downloads through the UI server.** You'll need to put a webserver such as httpd or nginx in front of the UI server in order to achieve this. [Below](#apache-http-server-config) you'll find a section explaining how to do it in httpd.
- **Update the URLs for Bitstream downloads to match the UI server.** This can be done using a setting in the UI environment file.
### UI config
If you set the property `rewriteDownloadUrls` to `true` in your `environment.prod.ts` file, the [origin](https://developer.mozilla.org/en-US/docs/Glossary/Origin) of any download URL will be replaced by the origin of the UI. This will also happen for the `citation_pdf_url` `<meta>` tag on Item pages.
The app will determine the UI origin currently in use, so the external UI URL doesn't need to be configured anywhere and rewrites will still work if you host the UI from multiple domains.
### Apache HTTP Server config
#### Basics
In order to be able to host bitstreams from the UI Server you'll need to enable mod_proxy and add the following to the httpd config of your UI server:
```
ProxyPassMatch "/server/api/core/bitstreams/([^/]+)/content" "http://rest.api/server/api/core/bitstreams/$1/content"
ProxyPassReverse "/server/api/core/bitstreams/([^/]+)/content" "http://rest.api/server/api/core/bitstreams/$1/content"
```
Replace http://rest.api in with the correct origin for your REST server.
The `ProxyPassMatch` line forwards all requests matching the regular expression for a bitstream download URL to the corresponding path on the REST server
The `ProxyPassReverse` ensures that if the REST server were to return redirect response, httpd would also swap out its hostname for the hostname of the UI before forwarding the response to the client.
#### Using HTTPS
If your REST server uses https, you'll need to enable mod_ssl and ensure `SSLProxyEngine on` is part of your UI server's httpd config as well
If the UI hostname doesn't match the CN in the SSL certificate of the REST server (which is likely if they're on different domains), you'll also need to add the following lines
```
SSLProxyCheckPeerCN off
SSLProxyCheckPeerName off
```
These are two names for [the same directive](https://httpd.apache.org/docs/trunk/mod/mod_ssl.html#sslproxycheckpeername) that have been used for various versions of httpd, old versions need the former, then some in-between versions need both, and newer versions only need the latter. Keeping them both doesn't harm anything.
So the entire config becomes:
```
SSLProxyEngine on
SSLProxyCheckPeerCN off
SSLProxyCheckPeerName off
ProxyPassMatch "/server/api/core/bitstreams/([^/]+)/content" "https://rest.api/server/api/core/bitstreams/$1/content"
ProxyPassReverse "/server/api/core/bitstreams/([^/]+)/content" "https://rest.api/server/api/core/bitstreams/$1/content"
```
If you don't want httpd to verify the certificate of the REST server, you can also turn all checks off with the following config:
```
SSLProxyEngine on
SSLProxyVerify none
SSLProxyCheckPeerCN off
SSLProxyCheckPeerName off
SSLProxyCheckPeerExpire off
ProxyPassMatch "/server/api/core/bitstreams/([^/]+)/content" "https://rest.api/server/api/core/bitstreams/$1/content"
ProxyPassReverse "/server/api/core/bitstreams/([^/]+)/content" "https://rest.api/server/api/core/bitstreams/$1/content"
```