Searching

SCAM currently supports two search methods: RDQL and freetext. The Search EJB is responsible for executing all queries.

RDQL

RDQL (RDF Data Query Language) is a query language for RDF, similar to SQL. In short, it is a triple-matching syntax. The language has some limitations compared to SQL; for instance, you cannot use the OR operation between triples. Furthermore, the query engine currently used (part of Jena) has some performance issues. All triples matching the first RDQL triple are extracted from the database, and the rest of the query is evaluated against this subset. As a result, the amount of data read from the database can become overwhelmingly large. RDQL string matching (matching parts of strings) is also performed in memory, with the same effect, as are all RDQL AND constructions.
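The cost of in-memory matching can be illustrated with a small, self-contained Python sketch. This is not Jena's or SCAM's code: the `triples` table, its columns, and the `in_memory_match` function are hypothetical and only mimic the behaviour described above (one database read for the first triple, then filtering in memory).

```python
import sqlite3

# Hypothetical triple table; SCAM's real database schema is not shown here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subj TEXT, pred TEXT, obj TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("http://somewhere/res1", "http://somewhere/pred1", "hello world"),
        ("http://somewhere/res2", "http://somewhere/pred1", "goodbye"),
        ("http://somewhere/res3", "http://somewhere/pred1", "say hello"),
        ("http://somewhere/res4", "http://somewhere/pred1", "nothing here"),
        ("http://somewhere/res1", "http://somewhere/pred2", "hello again"),
    ],
)


def in_memory_match(conn, predicate, substring):
    """Read every triple matching the first pattern, then apply the
    string constraint in memory (the strategy described in the text)."""
    rows = conn.execute(
        "SELECT subj, obj FROM triples WHERE pred = ?", (predicate,)
    ).fetchall()
    # The constraint is applied only after all rows have been transferred.
    hits = [subj for subj, obj in rows if substring in obj]
    return hits, len(rows)


hits, rows_read = in_memory_match(conn, "http://somewhere/pred1", "hello")
print(f"rows read: {rows_read}, matches: {len(hits)}")
```

In this toy data set four rows are read to produce two matches. With a frequently used property in a large repository, the gap between rows read and rows returned grows accordingly.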

Example 11.1. RDQL: Retrieve all objects of a known property of a known resource


    SELECT ?x
    WHERE  (<http://somewhere/res1>, <http://somewhere/pred1>, ?x)

    

Example 11.2. RDQL: Constraints


    SELECT ?a, ?b
    WHERE  (?a, <http://somewhere/pred1>, ?b)
    AND    (?b < 5)

    

Example 11.3. RDQL: Paths


    SELECT ?a, ?b
    WHERE  (?a, <some:pred1>, ?c),
           (?c, <some:pred2>, ?b)
    USING some FOR <http://somewhere/>

    

Example 11.4. RDQL: String matching, object contains the substring 'hello'


    SELECT ?a
    WHERE  (?a, <some:pred1>, ?c)
    AND    (?c ~~ "hello")
    USING some FOR <http://somewhere/>

    

Freetext

The freetext search finds every Component that has any object (value) matching the supplied pattern. This method would not be required if the existing RDQL query engine could be improved, since RDQL itself can express such a query. The difference is that the Jena query engine performs string matching in memory, whereas the SCAM freetext engine uses native SQL and can therefore take advantage of the RDB's strengths.
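As a rough illustration of letting the database do the matching, the following Python sketch runs a substring search as a single SQL statement. It does not show SCAM's actual implementation: the `triples` table, its columns, the sample data, and the `freetext_search` function are all assumptions made for the example, and SQLite stands in for whatever RDB SCAM is deployed on.

```python
import sqlite3

# Hypothetical triple table; SCAM's real database schema is not shown here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subj TEXT, pred TEXT, obj TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("http://somewhere/comp1", "http://somewhere/title", "hello world"),
        ("http://somewhere/comp1", "http://somewhere/description", "a greeting"),
        ("http://somewhere/comp2", "http://somewhere/title", "goodbye"),
        ("http://somewhere/comp3", "http://somewhere/description", "say hello twice"),
        ("http://somewhere/comp3", "http://somewhere/title", "hello"),
        ("http://somewhere/comp4", "http://somewhere/title", "100% done"),
    ],
)


def freetext_search(conn, pattern):
    """Return the subjects that have any object value containing `pattern`."""
    # Escape LIKE wildcards so the pattern is matched as a literal substring.
    escaped = (
        pattern.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    # The database performs the matching; only the hits are transferred.
    # Note: SQLite's LIKE is case-insensitive for ASCII characters by default.
    rows = conn.execute(
        "SELECT DISTINCT subj FROM triples "
        "WHERE obj LIKE ? ESCAPE '\\' ORDER BY subj",
        (f"%{escaped}%",),
    )
    return [subj for (subj,) in rows]


print(freetext_search(conn, "hello"))
```

Because the filter is part of the SQL statement, the database returns only the matching subjects instead of every candidate triple, which is the advantage the freetext engine is described as having over in-memory matching.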