Alan Orth
|
abad6116cb
|
DS-4587: Update the spider user agent file
This updates the included spider user agent file to the latest from
the COUNTER-Robots project. DSpace's own copy is over five years old
and is missing a bunch of new patterns, which greatly decreases the
accuracy of the Solr usage statistics.
Port of #3333 to main for DSpace 7.x.
See: https://jira.lyrasis.org/browse/DS-4587
See: https://github.com/atmire/COUNTER-Robots/releases/tag/2021-07-05
|
2021-09-04 21:24:16 +03:00 |
|
Mark H. Wood
|
522c6fb696
|
[DS-2463] Escape pattern characters that are significant in regular expressions.
|
2015-10-18 11:01:21 -04:00 |
|
Mark H. Wood
|
014128bf2e
|
[DS-2463] Remove stale spider file; don't poll to update it; extract interesting UA patterns.
|
2015-10-18 08:43:37 -04:00 |
|
Bram Luyten
|
14a4850b0b
|
DS-2531 New entries for the robots hostname list
|
2015-04-01 16:10:29 +02:00 |
|
Hardy Pottinger
|
4f5846f2b8
|
DS-1841: adding example files for agent and domain-based spider filtering, borrowed from OSU Libraries, with much thanks
|
2013-12-12 23:17:52 +00:00 |
|
Mark Diggory
|
a5beae59c2
|
[DS-440] Adjust SpiderDownloader to download multiple files in a "config/spiders" directory relative to ${dspace.dir}
git-svn-id: http://scm.dspace.org/svn/repo/dspace/trunk@4744 9c30dcfa-912a-0410-8fc2-9e0234be79fd
|
2010-02-07 16:42:56 +00:00 |
|