Search indexing of documents
Documents can be included in the search index so that the content of external files can be found by the search. Both the data and the metadata of the documents are indexed.
The following document types are supported.
Type | XML-based | Supported | Not supported | File extension |
---|---|---|---|---|
Excel | Yes | from Excel '97-2003 file format, Excel 2007+ .xlsx OOXML | No limitation known | xls, xla, xlw, xlt |
Word | Yes | from Word '97(-2007) file format, Word 2007+ .docx OOXML | No limitation known | doc, dot |
PowerPoint | Yes | from PowerPoint 2007+ .pptx OOXML | No limitation known | ppt, pps, ppa, pot |
Adobe PDF | No | No limitation known | No limitation known | |
Open Document Format or Open Office 2.0 | Yes | No limitation known (but possibly amplified 'noise') | No limitation known (but possibly amplified 'noise') | odg, dtg, odp, otp, odt, ott, odf, ods, ots |
Open Office 1.0 | Yes | No limitation known (but possibly amplified 'noise') | No limitation known (but possibly amplified 'noise') | sxd, std, sxi, sti, sxw, stw, sxc, stc, sxm |
Star Office (StarDraw 3.0; StarImpress 5.0, 4.0; StarDraw 5.0; StarWriter 5.0 / 4.0 / 3.0; StarMath 5.0; StarCalc 5.0 / 4.0 / 3.0) | Yes | XML-based (very likely amplified 'noise') | No limitation known | vor, sdd, sda, sdw, smf, sdc |
XML | Yes | Amplified noise | No limitation known | xml (consider configuration settings, so that also mindmaps etc. are searchable) |
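To estimate which files in a document folder match a row of the table, the File extension column can be turned into a simple lookup. The following is a minimal sketch, not part of Aeneis: the function name `listed_type` and the dictionary are hypothetical, and the lookup only reflects the File extension column. It does not cover Adobe PDF (no extension is listed) or the .xlsx, .docx and .pptx formats named in the Supported column, and it says nothing about how Aeneis itself decides what to index.

```python
from pathlib import PurePath
from typing import Optional

# Extensions copied from the "File extension" column of the table.
# The dictionary keys are illustrative labels taken from the "Type" column.
LISTED_EXTENSIONS = {
    "Excel": {"xls", "xla", "xlw", "xlt"},
    "Word": {"doc", "dot"},
    "PowerPoint": {"ppt", "pps", "ppa", "pot"},
    "Open Document Format or Open Office 2.0": {
        "odg", "dtg", "odp", "otp", "odt", "ott", "odf", "ods", "ots",
    },
    "Open Office 1.0": {
        "sxd", "std", "sxi", "sti", "sxw", "stw", "sxc", "stc", "sxm",
    },
    "Star Office": {"vor", "sdd", "sda", "sdw", "smf", "sdc"},
    "XML": {"xml"},
}


def listed_type(filename: str) -> Optional[str]:
    """Return the table row that lists the file's extension, or None."""
    extension = PurePath(filename).suffix.lstrip(".").lower()
    for doc_type, extensions in LISTED_EXTENSIONS.items():
        if extension in extensions:
            return doc_type
    return None
```

A result of `None` only means that the extension does not appear in the File extension column, for example for `report.pdf`.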
Requirements:
- In the schema, the Include objects in index option is enabled for the Document category.
- The old index (folder with the same name in the database directory) must be deleted before Aeneis is started.
  See also: Delete index
- In the Portal report, the Document category must be referenced in the Searched Categories entry.
Limitations:
- Complete indexing of document contents is not guaranteed, as it also depends on the functionality of the underlying libraries. This applies especially to unsupported document types, but also to supported ones.
- The memory requirements for the index can increase drastically.
- Queries can become considerably slower because of the document contents. This depends on the performance of the Lucene search engine.
- Especially with files in XML-based formats, file format information that is not useful in the search (values such as 'true', 'false', coordinates, etc.) may be indexed unintentionally. This is what 'noise' refers to in the table above.
- Issues arising in these contexts that do not result in exceptions can generally be handled in support cases as enhancements, not as bugs.
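The 'noise' described in the limitations can be illustrated with a small sketch. It assumes a naive extractor that collects every attribute value and text node from an XML file; how Aeneis and its underlying libraries actually extract content is not documented here, and the sample XML, the function names and the noise pattern are all invented for illustration.

```python
import re
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical XML document, loosely modeled on a mind map file.
SAMPLE = """<map>
  <node id="1" folded="true" x="120" y="45.5">
    <text>Quarterly sales report</text>
    <visible>false</visible>
  </node>
</map>"""

# Assumed definition of noise: boolean literals and plain numbers.
NOISE_PATTERN = re.compile(r"^(true|false|-?\d+(\.\d+)?)$", re.IGNORECASE)


def extract_tokens(xml_text: str) -> List[str]:
    """Collect all attribute values and text nodes, as a naive indexer might."""
    root = ET.fromstring(xml_text)
    tokens: List[str] = []
    for element in root.iter():
        tokens.extend(element.attrib.values())
        if element.text and element.text.strip():
            tokens.extend(element.text.split())
    return tokens


def is_noise(token: str) -> bool:
    """Return True for tokens that match the assumed noise pattern."""
    return bool(NOISE_PATTERN.match(token))


tokens = extract_tokens(SAMPLE)
noise = [t for t in tokens if is_noise(t)]
useful = [t for t in tokens if not is_noise(t)]
print("noise:", noise)
print("useful:", useful)
```

In this sample, more tokens are format values than actual content, which is why XML-based formats can inflate the index and clutter search results.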