This year we wanted to make big improvements in the Sencha Forums. We started by adding post voting, JIRA integration and trends. Next up, we set out to make search better. Here’s what we’ve done to make search more useful for the community.
The Sencha forum is built on vBulletin, which has a built-in search function. I’ve found vBulletin’s built-in search to be pretty accurate for most cases and it offers many advanced search options. The built-in search also allows us to render results using Sencha templates—which is nice. The issue with this search is that it’s not optimized for large communities, and as the community grew, we had to replace the native search with Google search for performance reasons.
However, this had its own disadvantages: Google search was very fast, but you couldn’t search particular forums, accuracy wasn’t as good as native search, and the results were not rendered within our template. In other words, we fixed the server load issue but dropped features.
Neither solution worked for our community, which led us to build our own search solution for the Sencha Forums. Our goals were for the solution to be fast, flexible and easy on the servers without sacrificing accuracy. Our top priority was to have both accuracy and performance.
The core of the new search service is powered by Apache Solr, an open source enterprise search platform from the Apache Lucene project. At the heart of Apache Solr is Lucene Core, an advanced search engine library implemented in Java and designed to be highly accurate and very fast at indexing and searching documents at large scale. Apache Solr extends Lucene Core to provide an easy-to-use RESTful API, additional data types, advanced text analysis, query parser extensions, full and delta import capabilities, and much more.
Apache Solr was a perfect fit for this project because of its well-known high-performance search capabilities. Beyond high performance, we needed our search solution to return results scored according to relevancy rules we could control and change over time as needed. We also wanted to provide a search query interface that anyone could use easily, yet that stays powerful and flexible for the power user. Solr gave us the control and flexibility we needed for our specific search requirements.
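To give a feel for what a request to Solr's HTTP API can look like, here is a minimal sketch that only builds a URL for Solr's standard `/select` handler. The query parameters (`q`, `defType`, `qf`, `fq`, `hl`, `start`, `rows`, `wt`) are regular Solr parameters, but the base URL, the core name `forums`, the field names `title`, `body` and `forum_id`, and the title boost are assumptions made for illustration, not the actual Sencha configuration.

```python
from urllib.parse import urlencode


def build_select_url(base_url, user_query, forum_ids=(), page=0, page_size=10):
    """Build a Solr /select URL; all field names and boosts are hypothetical."""
    params = [
        ("q", user_query),
        ("defType", "edismax"),      # extended DisMax query parser
        ("qf", "title^2 body"),      # assumed relevancy rule: title counts double
        ("hl", "true"),              # ask Solr to highlight matched text
        ("hl.fl", "title,body"),
        ("start", page * page_size), # paging offset
        ("rows", page_size),
        ("wt", "json"),
    ]
    if forum_ids:
        # A filter query restricts results without affecting scoring.
        ids = " OR ".join(str(i) for i in forum_ids)
        params.append(("fq", f"forum_id:({ids})"))
    return base_url.rstrip("/") + "/select?" + urlencode(params)


url = build_select_url("http://localhost:8983/solr/forums", "grid", forum_ids=[1, 2])
print(url)
```

Keeping the relevancy rule in a single parameter such as `qf` is one way scoring could be adjusted over time without touching the indexed data.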
Our new search is very flexible. Queries can be as simple or as advanced as you want. The new search interface allows you to search with advanced options (conditions) without opening an “advanced options” form. We’re calling this ‘smart queries’. You can still use the “Options” dialog to add or remove conditions.
Here is an example of a smart query:
grid -title:(cell row) +author:mitchell* from:2011-01-01 to:2012-01-01 #ext4 #st2
Before we analyze this search request, a little background on syntax. Currently there are 4 fields that you can search on: title, author, from and to. For each field, you can include or exclude matches from the results. To exclude matches for a field, just add a minus sign ‘-’ before the field. To include matches, add a plus sign ‘+’ before the field (which is the default behavior). In the above example we’re searching on “grid” but excluding matches that have a title with ‘(cell row)’, and including matches that are written by ‘mitchell*’. The ability to include and exclude based on fields gives you flexibility and accuracy. You can also specify multiple matches for each field.
The value you pass to a field is very flexible and can be specified in multiple formats. You can specify a simple term like ‘grid’ or you can use parentheses as in our example above. Within the parentheses, you can use AND and OR explicitly. The implicit behavior within parentheses is to treat multiple terms as connected with ANDs. In other words, ‘(cell row)’ is the same as ‘(cell AND row)’. In the case of our example above, we exclude results that have ‘cell’ AND ‘row’ in the title. You can also use an asterisk as a wildcard anywhere in the value. In our example, ‘mitchell*’ will return any author with a username that begins with ‘mitchell’. You can also use the wildcard in a term within parentheses like ‘(*cell row*)’. Here, ‘*cell’ matches any word that ends with ‘cell’ and ‘row*’ matches any word that starts with ‘row’.
You can also search in any given date range with the “from” and “to” fields. There are two things to note. First, you can only search a single date range. Second, unlike the title and author fields, the “from” and “to” fields require a specific date format. The required format is YYYY-MM-DD, which is why the example was formatted as ‘2011-01-01’ and ‘2012-01-01’.
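Putting the pieces together, the example smart query can be broken into its parts with a small tokenizer. The following is a rough sketch under stated assumptions, not the parser the forum uses: the `SmartQuery` class, the `parse_smart_query` function and the regular expression are all hypothetical, and hashtags such as `#ext4` are simply collected as written rather than resolved to forums.

```python
import re
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

# One token is either a (signed) field clause, a #hashtag, or a free-text term.
TOKEN = re.compile(r"""
    (?P<sign>[+-])?(?P<field>title|author|from|to):\s*(?P<value>\([^)]*\)|\S+)
  | \#(?P<tag>\w+)
  | (?P<text>\S+)
""", re.VERBOSE)

DATE_FORMAT = re.compile(r"\d{4}-\d{2}-\d{2}")


@dataclass
class SmartQuery:
    text: list = field(default_factory=list)      # free-text terms
    include: dict = field(default_factory=dict)   # field -> values to match
    exclude: dict = field(default_factory=dict)   # field -> values to reject
    date_from: Optional[date] = None
    date_to: Optional[date] = None
    forums: list = field(default_factory=list)    # hashtags without the '#'


def parse_date(value: str) -> date:
    """Accept only YYYY-MM-DD, the format the from/to fields require."""
    if not DATE_FORMAT.fullmatch(value):
        raise ValueError(f"expected YYYY-MM-DD, got {value!r}")
    return date.fromisoformat(value)


def parse_smart_query(query: str) -> SmartQuery:
    parsed = SmartQuery()
    for m in TOKEN.finditer(query):
        if m["tag"]:
            parsed.forums.append(m["tag"])
        elif m["field"] == "from":
            parsed.date_from = parse_date(m["value"])
        elif m["field"] == "to":
            parsed.date_to = parse_date(m["value"])
        elif m["field"]:
            # A leading '-' excludes; '+' or no sign includes (the default).
            bucket = parsed.exclude if m["sign"] == "-" else parsed.include
            bucket.setdefault(m["field"], []).append(m["value"])
        else:
            parsed.text.append(m["text"])
    return parsed


q = parse_smart_query(
    "grid -title:(cell row) +author:mitchell* from:2011-01-01 to:2012-01-01 #ext4 #st2"
)
print(q)
```

Storing included and excluded values as lists per field mirrors the rule that each field can carry multiple matches, while the single `date_from`/`date_to` pair reflects the single date range.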
In some cases, you may want to specify what forums to search. You can do this by adding hashtags to your query. In our example above, we wanted only the Ext JS 4 and Sencha Touch 2 forums so we used ‘#ext4’ and ‘#st2’. We have pre-specified which forums fall under which hashtag. #ext4 has all the Ext JS 4 forums, and #st2 includes all the Sencha Touch 2 forums. Premium forum content is also included. If there are multiple versions of a product (Ext JS has 4 versions), we also have a more general hashtag to include all versions. For example, ‘#ext’ includes all forums related to Ext JS. Here are the available hashtags today:
Because the underlying search service is very fast, we wanted the user interface to be equally fast, simple and flexible. The main search input is on the forum index page where the Google Search box once lived. To search, just enter a query and press Enter or click the search icon.
By default, results are shown in a popup modal dialog but you can see them in page-view by clicking the “Open” link at the top-right of the dialog. If you specified any matching conditions or forum hashtags, you will see them listed between the search box and the results panel in page-view. Each result highlights matched text in green and shows which forum the post is from. You can go directly to the source post by clicking on the result.
You can also add matching rules to your result set by clicking on “Options”. The Options interface is a wizard that allows you to add matching rules via a GUI (if you don’t want to type your own smart query). The Options dialog has three tabs: Conditions, Forums and Settings. The Conditions tab allows you to add new matching rules. The Forums tab allows you to pick individual forums to include and is more granular than the hashtags we support in the search command line.
The Settings tab allows you to configure your default search settings.
- Persist forum selection—When checked, the forums you select in the Forums tab will be used in all searches.
- Grow field on focus—If checked, the search input box’s width will grow when you focus on it. If unchecked or if you are using Internet Explorer, the field will have the full width.
- Reset paging on new search—If checked, each new search will reset to the first page.
- View results as—This is where you can choose whether page view or popup view is the default display mode for search results.
- Refresh page on success—If checked, when you click the “Save” button, the browser will reset to the original search page.
We hope the new Sencha Forum search is useful. We built this so you can easily and quickly find the information you need to build great applications. Since this service is for the community, if you have any suggestions or find any bugs, please post them on the forums and we will do our very best to address them.
At last! Thank you for attending to this. The forums are such a rich mine of information and it is great that we have a better way to find what we are looking for.
A couple of questions from my testing:
1. Will you be adding the ability to search the body of posts? If a person has not been very descriptive in their title, the post that contains the information I need may not be found.
2. As a test I searched the Ext:User Extensions and Plugins for posts with “title: Ext.ux”. eg to find all the posts related to an Ext class name (or part thereof). The search string above finds no results. I tried “title: Ext.ux.grid.feature.RowFilter” which worked, but neither “title: Ext.ux.grid.feature” nor “title: Ext.ux.grid.feature.*” returned any results. How can one search for partial matches on Ext class names?
Great! Significant plus in searching.
It seems like it needs a little tuning. As @Murray pointed out, searching “title: Ext.ux.grid.feature” or “title: Ext.ux.grid.feature.*” returns no results, but using the same string with a lowercase first letter gives results, e.g. “title: ext.ux.grid.feature*” or “title: ext.ux.grid.feature.*”.
Also, try these simple pairs; in each, the first returns no results and the second does:
“title: Ext*” and “title: ext*”
“title: Grid*” and “title: grid*”
I hope it won’t be a big deal to correct this behavior.
Mitchell Simoens says
As @Nick said, you would need to use the wildcard (*) for partial word searching.
Yes, I did try that (see post: “title: Ext.ux.grid.feature.*” <-- note the * ) - it doesn't work in this case. I have tested @Nick's suggestion above and can confirm that if the first letter of the string is UpperCase you get no results but if the first letter is lowerCase you get results. Please see @Nick's post. @Nick: thanks for the workaround. I would never have thought to test case sensitive! Thanks, Murray
Kazuhiro Kotsutsumi says
I translated it into Japanese.
Provision: Japan Sencha User Group
Mitchell Simoens says
We finally found out what is going on and believe we have a fix. We are just discussing the best place to put the fix, and it should be deployed today.
Something I didn’t mention is that if you are searching for a literal asterisk, you have to escape it in your query; otherwise it will be used as a wildcard.
Thanks. I was intending the asterisk to be a wildcard – I hadn’t thought about it being part of the search. And, therefore, is the escape character a ‘/’ ? An example of escaping in the help might be useful.
Re: my first question: Any answer on searching in post body?
Mitchell Simoens says
The actual issue was that Solr was keeping the case of the query, but everything is lowercased at indexing time. Since the indexed data is lowercase and a wildcard query kept its original case, nothing matched. We pushed the fix last night and it works when I try: title:Ext.ux*
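The mismatch described here can be reproduced with a toy example. This is only a sketch of the general idea, a lowercased index queried with a wildcard pattern that keeps its case, and says nothing about how Solr itself tokenizes text or where the real fix was applied. The names `INDEX` and `wildcard_search` are made up for illustration.

```python
import fnmatch

# Titles are lowercased when they are added to the (toy) index.
INDEX = [title.lower() for title in (
    "Ext.ux.grid.feature.RowFilter",
    "Grid cell editing",
)]


def wildcard_search(pattern: str, normalize: bool = False) -> list:
    """Case-sensitive wildcard match against the lowercased index."""
    if normalize:
        # The fix in spirit: treat the wildcard term like the indexed data.
        pattern = pattern.lower()
    return [doc for doc in INDEX if fnmatch.fnmatchcase(doc, pattern)]


print(wildcard_search("Ext.ux*"))                  # uppercase 'E' never matches
print(wildcard_search("ext.ux*"))                  # the lowercase workaround
print(wildcard_search("Ext.ux*", normalize=True))  # normalized pattern matches
```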
About the post body, that is already working. If you search for: grid, that will search both the title and the post body for grid.
That’s great! Thanks.
Re: the post body: I interpreted the line in the blog “Currently there are 4 fields that you can search on: title, author, from and to.” to mean it *only* searched those fields. I now see that it does, in fact search the body if you leave the field qualifier out. Cool!
Thanks again. I look forward to finally being able to find what I want in the forums!