Remote search: returning sorted results with start and count. Best practices



Fredric Berling
7 Nov 2009, 5:19 AM
Hi all you yellow bleeding heroes.
I've been using Ext JS for some time now and it works very well together with Domino. This question could be misplaced since it has nothing to do with Ext.nd (which I love), but please help me out here.

Since all Lotus Notes users are used to browsing their documents in views, they often request this in their web apps too, right? This might not seem like a big problem since we have Ext.nd generating our views beautifully, but I often have to handle very large search results, say 2000 documents or more. This is where the problem starts.

I make these searches with Ajax calls to a Domino agent that searches the FT index of the database and returns a NotesDocumentCollection, which is then printed out as JSON. The problem is that the results from an FT index search are returned unsorted, and I often want them sorted on a specific field, so the NotesDocumentCollection needs to be re-sorted every time. The result must also handle pagination (start and count), so I need to sort the entire NotesDocumentCollection just to be able to return 50 rows of JSON. This causes serious performance issues even on very fast servers.
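To make the cost concrete, here is a minimal sketch in plain JavaScript of what the agent effectively has to do for every page request. The function name `sortedPage` and the sample field name are made up for illustration; the real agent works on a NotesDocumentCollection, not on a JavaScript array.

```javascript
// Hypothetical helper: returns one page of a result set, sorted on one field.
function sortedPage(docs, field, start, count) {
    // The whole result set has to be sorted first ...
    var sorted = docs.slice().sort(function (a, b) {
        if (a[field] < b[field]) { return -1; }
        if (a[field] > b[field]) { return 1; }
        return 0;
    });
    // ... and only then can the requested window be cut out.
    return sorted.slice(start, start + count);
}
```

Every request for 50 rows pays for sorting all the hits, which is why the cost grows with the size of the result and not with the size of the page.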

So what are the alternatives?

Find a quicker sorting algorithm for NotesDocumentCollections?
Maybe... but it's a short-term solution. What if the result is 20,000 documents?

Searching with views?
No. The results are not sorted by field, and sorting 4 doesn't work with the start and count parameters.

Export my documents to SQL and use this as a fake index?
Maybe... has anyone tried this? (I like the idea.)

Google Search Appliance
There seems to be a box you could buy that indexes Domino content, but I don't know if I could query it with my own calls from Ajax. Has anyone tried this approach?

Other third-party full-text indexers for Domino
Please share your knowledge about those.

Ask IBM to include a sortby parameter that actually sorts the result from the FT index search?
It's a miracle that it isn't there, and I'm not very hopeful it will turn up anytime soon.


Please help me get a discussion started on this matter. We need a solid, future-proof way to access large amounts of sorted data from Domino.

markroberts906
9 Nov 2009, 7:56 AM
Hi Fredric,

Here might be another option for you I can toss out there...

Use your agent, and instead of returning JSON have it toss those documents into a folder. I would imagine you could use the name of the person or such to identify that folder and reuse it for each search they do, clearing the folder at the start of your agent. Then return a script that draws your view/folder into the display.

I haven't thought it completely through, so I'm not sure whether you would run into other hurdles with that. However, it might give you an idea to jog your thinking in a direction.

Hope it helps some, good luck on your solution either way.

Mark

Fredric Berling
10 Nov 2009, 10:18 AM
Thanks for your input, but I'm not sure how it would be faster to move several thousand documents to a folder. And a folder has no better sorting than a view. And I really need the JSON...

I have now started testing an open source solution called Apache Solr.
Solr uses the Lucene Java search library at its core for full-text indexing and search. It supports highlighting etc.
I'm running it under Tomcat and it seems very fast.
You can feed the search engine index through a Java API or through Ajax calls posting small chunks of XML.

It's fast as fu..k, and it handles start, count, and sort perfectly. I'm just in the testing phase, but it seems to work very well. Has anyone written a Java agent for delivering index content to Solr? (phonetic: "Solar")
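As a rough illustration of how start, count, and sort map onto a Solr select request, here is a small plain-JavaScript sketch. The helper name `buildSolrQuery`, the host, and the field name `subject` are placeholders, not part of any setup described in this thread; `q`, `start`, `rows`, `sort`, and `wt` are standard Solr query parameters.

```javascript
// Hypothetical helper: builds a Solr select URL for one page of sorted results.
function buildSolrQuery(baseUrl, query, start, count, sort) {
    var params = {
        q: query,
        start: start,   // offset of the first row to return
        rows: count,    // Solr's name for the page size
        sort: sort,     // e.g. "subject asc"
        wt: 'json'      // ask for a JSON response
    };
    var parts = [];
    for (var key in params) {
        if (params.hasOwnProperty(key)) {
            parts.push(encodeURIComponent(key) + '=' + encodeURIComponent(params[key]));
        }
    }
    return baseUrl + '?' + parts.join('&');
}

// Second page of 50 rows, sorted on an assumed "subject" field.
var url = buildSolrQuery('http://solr.example.com:8080/solr/select', 'test', 50, 50, 'subject asc');
```

The sorting and windowing happen inside the search engine, so the client only ever receives the rows for the page it asked for.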

Fredric Berling
23 Nov 2009, 9:12 AM
Solr accepts the parameter "json.wrf", which wraps the JSON result in a call to the function you name.

This makes it possible to run Solr on a Tomcat server next to your Domino server. Cross-domain? No problem.


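proxy: new Ext.data.ScriptTagProxy({
    // Solr runs under Tomcat on another host; the script tag request is what makes cross-domain work
    url: 'http://solr.mydomain.com:8080/solrdev/select/?',
    method: 'GET',
    // Solr wraps the JSON in a call to the function named in json.wrf
    callbackParam: 'json.wrf'
}),
baseParams: {
    q: 'test',
    start: '0',
    version: '2.2',
    rows: '10',
    indent: 'on',
    sort: 'subject asc',
    wt: 'json'
},
reader: new Ext.data.JsonReader({
    root: "response.docs",
    id: "id",
    idProperty: "id",
    successProperty: "success",
    // Solr's JSON response reports the total hit count as response.numFound
    totalProperty: "response.numFound",
    fields: ["subject", "id"]
}),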
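For reference, Solr's JSON response nests the hits under a `response` object with `numFound`, `start`, and `docs`. The following plain-JavaScript sketch shows how a page could be read out of it without Ext; the helper name `readSolrPage` and the sample values are invented for illustration.

```javascript
// Hypothetical helper: pulls the paging information out of a Solr JSON response.
function readSolrPage(solrResponse) {
    var r = solrResponse.response;
    return {
        total: r.numFound,  // total number of hits, not just this page
        start: r.start,     // offset of the first returned row
        rows: r.docs        // the documents for this page
    };
}

// Made-up sample in the shape Solr returns with wt=json.
var sample = {
    responseHeader: { status: 0 },
    response: {
        numFound: 2000,
        start: 50,
        docs: [{ id: 'A1', subject: 'Budget' }, { id: 'A2', subject: 'Calendar' }]
    }
};
var page = readSolrPage(sample);
```

A paging control needs `total` to know how many pages exist, so it has to come from `numFound` and not from the length of `docs`.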

jratcliff
23 Nov 2009, 9:29 AM
Hi Fredric,

So how does Solr search the view/database? Is it searching the full-text index that Domino generates? Or are you writing code to connect to the database and calling the Domino .FTSearch method?

Jack

Fredric Berling
24 Nov 2009, 12:24 AM
Solr is a Lucene search engine, so it has its own index that you must control from the outside. This is one of the main advantages for me: I just want SOME stuff in the index, and I want control over it. The FT index has its limitations, as I stated earlier in this thread, plus you get all the nice stuff that comes with a serious search engine.

Fields and field types are described in a schema, where you have LOADS of options for how to handle each specific field.

There are several options for updating the index in Solr.

It is meant to act RESTful (or do I mean CRUDful?), so you can post changes to the index with Ajax calls or through the Java API. There are also extensions for other ways to do it.

There are also a number of DataImportHandlers you can configure in Solr that can fetch the complete index from your source through JDBC or other connectors, for full rebuilds etc. I haven't looked into which of these fit Domino best, but I'm sure we could use the views with some OutputFormat for this.

Right now I'm exporting XML files from my Notes views and using a small JAR tool (it comes with the package) to import them into my index, but I will probably write a Java agent that updates it through the API. I'm not that good at Java, so I was hoping someone else had already done this. :)
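For anyone who has not seen the XML that Solr's update handler accepts, here is a minimal plain-JavaScript sketch that builds an `<add>` message from a list of documents. The helper names and the field names `id` and `subject` are assumptions for illustration; the fields must match whatever your own schema defines.

```javascript
// Escapes the characters that are not allowed in XML text and attribute values.
function escapeXml(value) {
    return String(value)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Hypothetical helper: turns plain objects into a Solr <add> update message.
function buildSolrAddXml(docs) {
    var xml = ['<add>'];
    docs.forEach(function (doc) {
        xml.push('<doc>');
        Object.keys(doc).forEach(function (name) {
            xml.push('<field name="' + escapeXml(name) + '">' + escapeXml(doc[name]) + '</field>');
        });
        xml.push('</doc>');
    });
    xml.push('</add>');
    return xml.join('');
}

var message = buildSolrAddXml([{ id: 'A1', subject: 'Q&A' }]);
```

The message is posted to the update handler, and the changes become searchable once a `<commit/>` has been sent as well.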

Zakaroonikov
30 Nov 2009, 4:36 PM
This search server sounds very interesting. So the general idea is to create some agent in Domino that, when called, will spit out XML representing index update requests. It seems like a lot of overhead to keep this index up to date, but it may be worth it if you can get away from the Lotus Domino black-box search index.

We currently use the domain index with a customized result set to produce our search results. The problem is that there is NO documentation on how to optimize the search results. The search summary field in these results appears to pick a random point in the rich text field. This would be adequate if we did not store HTML content in these fields. The end result is broken HTML tags in the summary.

It is a shame that there is no code already available, because after creating the agent you may find the server impact is too large to make this solution viable.

breckster
10 Dec 2009, 8:53 PM
We are experimenting with this appliance. Too soon to tell if it will do the job.

yogd
5 May 2010, 7:41 AM
Hi

I have seen your posts regarding Solr and Domino integration. How did your testing with Solr and Domino go? Did you proceed with it and implement it in production? How did you pump Domino data into the Lucene index? Did you write a Domino connector to do that?

Thanks,


Fredric Berling
7 May 2010, 12:10 AM
Hi yogd

I have been using Solr with Domino for quite some time now in a product development phase. The automatic pump between Domino and Solr is not developed yet. It will probably be a Java agent using the Solr Java API. This agent will be called on every doc.save and will also be able to do a full index update/replace. Until then I just produce an XML file from my Domino data that I import into Solr.
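For the full index update, one way to keep each post small is to split the documents into batches and send one update message per batch. This is only a sketch of that idea in plain JavaScript, not the Java agent described above; the helper name `splitIntoBatches` and the batch size are arbitrary choices.

```javascript
// Hypothetical helper: splits a list of documents into batches of at most batchSize items.
function splitIntoBatches(items, batchSize) {
    if (batchSize < 1) {
        throw new Error('batchSize must be at least 1');
    }
    var batches = [];
    for (var i = 0; i < items.length; i += batchSize) {
        batches.push(items.slice(i, i + batchSize));
    }
    return batches;
}

// Five made-up document ids, two per batch; the last batch holds the remainder.
var batches = splitIntoBatches(['A1', 'A2', 'A3', 'A4', 'A5'], 2);
```

Each batch would then be turned into one update message, with a single commit after the last batch has been sent.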