Commit Graph

102 Commits

Author SHA1 Message Date
Jean-Yves Gaulier
5947468eaa PHRAS-716 #time 10m 2015-09-21 16:31:55 +02:00
Jean-Yves Gaulier
a02b3961bf PHRAS-716 #time 1d
fix : operation list by databox
2015-09-21 15:58:43 +02:00
Jean-Yves Gaulier
5146881076 PHRAS-716 #time 3d
fix : admin/base progress bar (indexation)
fix : reindex button
fix : indexation does not skip records between fetches
2015-09-17 11:03:42 +02:00
Mathieu Darse
1fa0b511b1 Remove performance degradation with some subfields 2015-08-31 19:18:10 +02:00
Mathieu Darse
a51fe87f11 Highlight using raw field data too 2015-08-31 19:17:46 +02:00
Mathieu Darse
5bbd4bfe33 Fix values lookup on thesaurus hydrator 2015-07-24 18:13:18 +02:00
Jean-Yves Gaulier
82a8c97790 #PHRAS-606 #time 5m
fix: mime is forced into es
2015-07-21 18:23:21 +02:00
Jean-Yves Gaulier
00f8c8735d #PHRAS-606 #time 4h
fix default "binary" thumbnails for stories
nb : must re-populate
2015-07-21 16:54:36 +02:00
Jean-Yves Gaulier
d012978508 PHRAS-504 #time 8h
fix: prod / cr-lf in metadata is ok for aggregate filter
cr, lf and crlf are normalized:
- when getting field values from recordadapter
- during es indexation (direct sql read in metadata...)
- before querying
2015-07-15 11:10:38 +02:00
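A minimal sketch of the kind of normalization this commit message describes, assuming that CR, LF and CRLF are all collapsed to one canonical separator before a value is indexed or compared. The function name and the choice of `\n` as the canonical form are assumptions made for illustration; they are not taken from the project's PHP code.

```python
import re

# Matches CRLF first, then a lone CR or a lone LF.
_LINE_BREAK = re.compile(r"\r\n|\r|\n")


def normalize_line_breaks(value: str, canonical: str = "\n") -> str:
    """Replace every CR, LF or CRLF with one canonical separator.

    Hypothetical helper: the canonical separator is an assumption.
    """
    return _LINE_BREAK.sub(canonical, value)


# The same caption value stored with three different line-break styles
# ends up identical once normalized, so an aggregate filter built from
# one variant can match records stored with another.
variants = ["line one\r\nline two", "line one\rline two", "line one\nline two"]
normalized = {normalize_line_breaks(v) for v in variants}
```

Applying the same function at all three points listed in the message (reading field values, indexing, querying) is what keeps the indexed value and the filter value comparable.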
Jean-Yves Gaulier
90ae2f2474 PHRAS-454 #time 4h 2015-06-22 17:41:32 +02:00
jygaulier
e4b5b86d97 Merge pull request #1397 from jygaulier/PHRAS-582_INDEX_REVERSE_ORDER
PHRAS-582 #time 10 m
2015-06-22 14:30:44 +02:00
Mathieu Darse
418eac8e5a Only create thesaurus filters if needed 2015-06-19 21:53:08 +02:00
Jean-Yves Gaulier
b21d3ded24 PHRAS-582 #time 10 m
populate : records are indexed from newer to older (rid desc)
2015-06-11 17:56:08 +02:00
Jean-Yves Gaulier
bea0c2a6e6 PHRAS-462 #time 5 h
populate : candidates are written under the correct field
2015-06-11 17:31:23 +02:00
Mathieu Darse
1e4669c122 Test structure more extensively & make isPrivate() throw on invalid field name 2015-06-03 19:45:48 +02:00
Mathieu Darse
421684757a Refactor merged field structure
Here is the new model:

+-----------------------------+
|          Structure          |
+-----------------------------+
| +createFromDataboxes()      |
| getAllFields()              |
| getUnrestrictedFields()     |
| getPrivateFields()          |
| getFacetsFields()           |
| getThesaurusEnabledFields() |
| getDateFields()             |
|- - - - - - - - - - - - - - -|
| add()                       |
| get()                       |
| typeOf()                    |
| isPrivate()                 |
+-------+-+-+-----------------+
        | | |          +---------------------+
        | | +--------> |        Field        |
        | |            +---------------------+
        | |            | getName()           |
        | |            | getType()           |
        | |            | isXXX()             |
        | |            | getThesaurusRoots() |
        | |            +---------------------+
        | |
        | |            +-------+
        | +----------> | Field |
        |              +-------+
        |
        |              +-------+
        +------------> | Field |
                       +-------+

It was driven by the following use cases:
- Get list of facets (only searchable fields)
- Get list of fields with concept inference
- Get list of all fields
    - Split into private / public fields (to define mapping)
- Get all date fields
- Get field type
    - To apply sanitization rules
    - To define mapping
- Check if concept inference enabled
- Check if the field is searchable
- Check if the field is a facet
- Check if the field is private
- Dereference field from label (still to be done)

(The last two UCs are new)

Also removed old code from legacy search engines.

[#PHRAS-500]
2015-06-03 19:45:48 +02:00
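The model drawn in this commit message can be sketched as follows. This is an illustrative Python rendering, not the project's PHP implementation: the method names are snake_case counterparts of those in the diagram, while the constructor arguments, the `"date"` type string, the boolean flags and the `KeyError` raised on an unknown field are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Field:
    name: str
    type: str                      # e.g. "string", "date" (assumed values)
    private: bool = False
    searchable: bool = True
    facet: bool = False
    thesaurus_roots: Tuple[str, ...] = ()  # empty means no concept inference


class Structure:
    """Merged field structure: one lookup point for all field questions."""

    def __init__(self) -> None:
        self._fields: Dict[str, Field] = {}

    def add(self, field: Field) -> None:
        self._fields[field.name] = field

    def get(self, name: str) -> Field:
        # Assumption: an unknown name is an error rather than a None result.
        if name not in self._fields:
            raise KeyError(f"unknown field: {name}")
        return self._fields[name]

    def type_of(self, name: str) -> str:
        return self.get(name).type

    def is_private(self, name: str) -> bool:
        return self.get(name).private

    def get_all_fields(self) -> List[Field]:
        return list(self._fields.values())

    def get_unrestricted_fields(self) -> List[Field]:
        return [f for f in self._fields.values() if not f.private]

    def get_private_fields(self) -> List[Field]:
        return [f for f in self._fields.values() if f.private]

    def get_facets_fields(self) -> List[Field]:
        # "Only searchable fields", per the use-case list.
        return [f for f in self._fields.values() if f.facet and f.searchable]

    def get_thesaurus_enabled_fields(self) -> List[Field]:
        return [f for f in self._fields.values() if f.thesaurus_roots]

    def get_date_fields(self) -> List[Field]:
        return [f for f in self._fields.values() if f.type == "date"]


structure = Structure()
structure.add(Field("Title", "string", facet=True))
structure.add(Field("ShootDate", "date"))
structure.add(Field("Notes", "string", private=True, thesaurus_roots=("/root",)))
```

Keeping every predicate on `Field` and every list query on `Structure` means that mapping definition, sanitization and facet building can all read from the same object.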
Benoît Burnichon
3856622d48 Merge pull request #1361 from mdarse/thesaurus-prefixes
Thesaurus prefixes in field structure (the return)
2015-05-11 12:07:48 +02:00
Nicolas Le Goff
2d5a36f5a2 Add highlights query 2015-05-05 11:09:28 +02:00
Mathieu Darse
f3a4b35420 Make ThesaurusHydrator code style uniform 2015-04-30 16:47:41 +02:00
Mathieu Darse
3e2c2da4a2 Fix typos 2015-04-30 16:47:01 +02:00
Mathieu Darse
feb7fd057e Add some indexer logging 2015-04-29 20:43:20 +02:00
Mathieu Darse
24bcdba635 Handle field root concepts (prefixes) on indexing 2015-04-29 20:42:51 +02:00
Mathieu Darse
c6075fcc1a Thesaurus prefixes in field structure
Also fixes candidates collected from all string fields
2015-04-22 20:46:37 +02:00
Mathieu Darse
4785fbc8ed Fix indexing issue with date fields
- Date and number types sanitization
- Remove `RecordIndexer` dependency on `ElasticSearchEngine`
- Move some sanitization from `RecordIndexer` to `RecordHelper`
2015-04-15 18:28:12 +02:00
Benoît Burnichon
e5c1cab623 Merge pull request #1347 from mdarse/facets-strict-match
Strict facet matching
2015-04-10 17:29:46 +02:00
Mathieu Darse
38465a591f Strict facet matching
Clicking on a facet value on the left pane now returns the expected result count.

This commit implements a new "raw" matcher. It can be used like
`r"some raw value"`. It operates on the `.raw` multi-field and skips all
analysis.
Escaping `"` is supported by prepending a backslash `\"`. You can also escape
the escaping character `\` by doubling it (`\\`).

Adds a new `ContextAbleInterface` to differentiate matchers supporting an
optional context from those that can't.

Fixes an issue with `QueryContext::narrowToFields()` ignoring passed fields.
2015-04-09 20:32:13 +02:00
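The escaping rules stated in this commit message (`\"` for a quote, `\\` for a backslash) can be illustrated with a small decoder and its inverse. This is a sketch under assumptions, not the project's query parser: the function names are hypothetical, and the handling of any other backslash sequence (kept literally here) is not specified by the message.

```python
def parse_raw_literal(token: str) -> str:
    """Decode a literal such as r"some raw value" into its raw value."""
    if len(token) < 3 or not token.startswith('r"') or not token.endswith('"'):
        raise ValueError("not a raw literal")
    body = token[2:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            if i + 1 >= len(body):
                raise ValueError("dangling escape character")
            nxt = body[i + 1]
            if nxt in ('"', "\\"):
                out.append(nxt)   # \" becomes "  and  \\ becomes \
                i += 2
                continue
            out.append(ch)        # assumption: other escapes stay literal
            i += 1
            continue
        if ch == '"':
            raise ValueError("unescaped quote inside raw literal")
        out.append(ch)
        i += 1
    return "".join(out)


def to_raw_literal(value: str) -> str:
    """Encode a value as a raw literal, escaping backslashes first."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return 'r"' + escaped + '"'
```

Escaping the backslash before the quote in `to_raw_literal` matters: doing it in the other order would double the backslash that was just added in front of each quote.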
Nicolas Maillat
8a977db621 Audio sample rate may not be an integer. 2015-04-02 18:52:41 +02:00
Benoît Burnichon
a814208a6b Merge pull request #1315 from mdarse/cross-fields-multi-word-query
Working cross-fields queries with multiple words (without operators)
2015-03-24 18:42:13 +01:00
Mathieu Darse
771aa5b765 Working cross-fields queries with multiple words (without operators)
- Index the full content of a record in a (private_)content_all field
- Handle all-fields-wide search as a special case (drastically simplifies queries)
- QueryContext doesn't take all allowed fields anymore, but whether private
fields are allowed or not. Since private fields are namespaced, field level
restriction is not needed anymore.
2015-03-24 17:52:30 +01:00
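One way to picture the approach described in this commit message: every caption value of a record is copied into a catch-all field, with private values kept in a separately named one, so a search across all fields only needs to know whether private fields are allowed. The field names `content_all` and `private_content_all` come from the message; the function names, the input shape and the space-joined concatenation are assumptions made for illustration.

```python
from typing import Dict, List, Set


def build_content_fields(
    caption: Dict[str, List[str]], private_names: Set[str]
) -> Dict[str, str]:
    """Concatenate caption values into public and private catch-all fields."""
    public: List[str] = []
    private: List[str] = []
    for name, values in caption.items():
        target = private if name in private_names else public
        target.extend(values)
    return {
        "content_all": " ".join(public),
        "private_content_all": " ".join(private),
    }


def fields_to_search(allow_private: bool) -> List[str]:
    """Only one boolean is needed, since private content is namespaced."""
    if allow_private:
        return ["content_all", "private_content_all"]
    return ["content_all"]


doc = build_content_fields(
    {"Title": ["Sunset"], "Keywords": ["sea", "sky"], "Notes": ["internal"]},
    private_names={"Notes"},
)
```

Because a multi-word query is matched against one field holding the full content, words that live in different caption fields of the same record can still match together.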
Mathieu Darse
a3dae412f1 Index MIME type 2015-03-19 19:00:56 +01:00
Benoît Burnichon
6c825f4582 Add logger to Elastic SearchEngine 2015-03-16 13:52:28 +01:00
Nicolas Le Goff
315e780fb9 Merge pull request #1298 from mdarse/facet-collections
Add collection facet
2015-03-12 19:49:05 +01:00
Mathieu Darse
0ec75c85b3 Add collection facet 2015-03-12 19:47:31 +01:00
Mathieu Darse
71c7fd8adb Enhance thesaurus strict mode 2015-03-12 19:13:13 +01:00
Mathieu Darse
4d7ea8debb Throw on failing bulk operation item 2015-03-11 17:06:51 +01:00
Benoît Burnichon
118bb2f03c Some fixup for ES instance 2015-03-11 15:22:20 +01:00
Mathieu Darse
400ecad8e6 Use thesaurus bulk API for indexing 2015-03-10 14:53:32 +01:00
Mathieu Darse
5d3f8b3123 Merge pull request #35 from nlegoff/poplate-by-database
Add possibility to explicitly set database to populate
2015-03-10 11:55:10 +01:00
Mathieu Darse
e889d19b7d Filter databoxes earlier (in Indexer) and databox id error handling 2015-03-10 11:52:30 +01:00
Mathieu Darse
ac42daa062 Fill candidate terms while indexing 2015-03-05 14:49:26 +01:00
Mathieu Darse
c917c7f952 Thesaurus matching while indexing records
- Add a new hydrator to query the thesaurus on the fly
- Add a filtering system on thesaurus
- And a databox filter friend
2015-03-03 18:50:34 +01:00
Nicolas Le Goff
4daf40029a Add possibility to explicitly set database to populate 2015-02-27 16:23:26 +01:00
Mathieu Darse
6bf03de2ca Refactor metadata indexing
- Caption fields are multi-valued by default
- Simpler query
- Technical data (EXIF) is always single-valued
- Error handling
2015-02-26 17:09:16 +01:00
Nicolas Le Goff
6d580ac5fd Fix bitfield key 2015-02-24 15:27:04 +01:00
Nicolas Le Goff
c99f2e7746 Fix sql field name
Signed-off-by: Mathieu Darse <mdarse@jolicode.com>
2015-02-23 17:50:26 +01:00
Nicolas Le Goff
111755fa9b Refactor status display 2015-02-19 15:14:23 +01:00
Mathieu Darse
2faec3686e Refactor indexer job 2015-02-18 18:58:19 +01:00
Mathieu Darse
23ccca28c7 Fetcher post fetch hook 2015-02-18 18:53:53 +01:00
Mathieu Darse
9dc653c543 Move bulk operation into indexer namespace 2015-02-18 12:19:10 +01:00
Mathieu Darse
123b700685 Move fetcher with indexer stuff and its friend classes 2015-02-18 12:06:04 +01:00