Using Xtext to create an editor for Elastic Search JSON Search API

25 May 2014 / 30 January 2017

The Elastic Search REST API for searching can be quite overwhelming. There are lots of different queries you can run, each with multiple options, and some of them can even be nested. The JSON format can be quite clumsy with all the brackets and colons. For this reason I prefer the Java API, which provides a lot more type information.

However, sometimes it is preferable to use the REST API, e.g. to address Elastic Search directly from client-side JavaScript.

As the Java API already proves, it can be quite handy to have a dedicated editor with autocomplete, syntax highlighting and syntax checking. After some googling I couldn’t find one, neither in Eclipse nor in some query builder tool. So I decided to build one, at least for the most common queries. This turned out to be quite simple using Xtext. It has an excellent tutorial which gets you up and running in a few minutes. I won’t replicate it here, see the links below for more information. The tutorial guides you through setting up a project which compiles to an Eclipse plugin. At the core of that project is a grammar.

For anyone who has ever done anything with context-free grammars at university or somewhere else, it should look pretty familiar. If you’ve worked with regular expressions, some of the syntax will be quite similar, although a grammar is much more powerful. I’ll give a brief introduction to the grammar as I am developing it. It is not finished yet, but I hope to release it “Real Soon Now” ™. Xtext uses ANTLR to read the grammar and create the parsers and editors from it.

Let’s have a look at a simple example grammar. This was my first attempt following the Xtext tutorial, and I’ll expand a bit on the information there.

grammar com.first8.elasticsearch.dsl.SearchDsl 
        with org.eclipse.xtext.common.Terminals
generate search "http://www.first8.com/elasticsearch/dsl/SearchDsl"

Query:
  '{' Number '}';

Number:
  Float | INT;

terminal Float:	
  '-'? INT? '.' INT (('E'|'e') '-'? INT)?;

The first three lines contain some magic which is picked up by Xtext. The grammar line declares the fully qualified name of the language, which Xtext also uses for the package names of the generated parsers. Using the with keyword, it also reuses the predefined rules from the Terminals grammar that ships with Xtext. One of the rules defined in that Terminals file is INT, a basic unsigned integer. The generate keyword tells Xtext to derive a model (an EMF package) from the grammar; it is followed by the name of that package and its namespace URI. The file extension itself is chosen when the project is created. In this case, “*.search” files are automatically opened in the editor-to-be-created.

Next come the actual grammar rules. The first rule should describe the text to be parsed at the highest level. In this case, the entire text should match the rule Query. A rule consists of a name followed by a colon (“:”), then the elements you expect, and is closed with a semicolon (“;”). If you put something between quotes ('), you indicate that you expect that literal text to appear. So, in this case we expect an opening curly brace, followed by something that is a Number, and then a closing curly brace.

The following should thus be a valid document:

{ 123 }

The next rule describes what a number should look like. Here it is defined as either a floating point number (Float) or an INT. The pipe (‘|’) stands for an ‘OR’ operation, as in regular expressions. The INT is defined in the imported Terminals file. The Float is defined in the next rule and is a bit more complex; I basically copied it from an example JSON grammar. A question mark indicates that the element before it is optional. Parentheses are used to group things together. So what this rule says is that a float can optionally start with a minus sign, optionally followed by some digits. It must contain a dot followed by at least one more digit. And optionally again, there is the E notation for a scientific exponent, followed by an optionally negative integer.
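Since the rule syntax is so close to regular expressions, it can help to spell the same rules out as regexes. The following is only a rough sketch in Java, not something Xtext generates: the class and method names are made up for illustration, INT is assumed to be one or more digits as in the Terminals grammar, and only whitespace between tokens is skipped (Xtext also skips comments by default).

```java
import java.util.regex.Pattern;

// Hypothetical helper, only meant to mirror the three grammar rules above.
final class NumberRuleSketch {
    // INT (assumed to match the definition in Terminals): one or more digits.
    static final String INT = "[0-9]+";

    // terminal Float: '-'? INT? '.' INT (('E'|'e') '-'? INT)?
    static final String FLOAT =
        "-?(?:" + INT + ")?\\.(?:" + INT + ")(?:[Ee]-?" + INT + ")?";

    // Number: Float | INT
    static final String NUMBER = "(?:" + FLOAT + "|" + INT + ")";

    // Query: '{' Number '}', with optional whitespace between the tokens.
    static final String QUERY = "\\s*\\{\\s*" + NUMBER + "\\s*\\}\\s*";

    private static final Pattern FLOAT_PATTERN = Pattern.compile(FLOAT);
    private static final Pattern NUMBER_PATTERN = Pattern.compile(NUMBER);
    private static final Pattern QUERY_PATTERN = Pattern.compile(QUERY);

    static boolean isFloat(String text) {
        return FLOAT_PATTERN.matcher(text).matches();
    }

    static boolean isNumber(String text) {
        return NUMBER_PATTERN.matcher(text).matches();
    }

    static boolean isQuery(String text) {
        return QUERY_PATTERN.matcher(text).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"{ 123 }", "{ -.5 }", "{ 2.0e-3 }", "{ 1. }", "{ -123 }", "{ }"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + isQuery(sample));
        }
    }
}
```

If I read the rules correctly, this also exposes a small gap: a negative integer such as -123 is not a Number, because INT has no sign and a Float always needs a dot.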

So with a reasonably simple grammar (see below), you can generate an editor which has autocomplete and syntax checking out of the box:
[Screenshot: Elastic Search Query DSL editor]

To give an impression, here is a shortened version of the grammar I am working on and which fuels the editor shown above:

grammar com.first8.elasticsearch.dsl.SearchDsl 
        with org.eclipse.xtext.common.Terminals

generate search "http://www.first8.com/elasticsearch/dsl/SearchDsl"

SearchDSL:
	'{' 
		(search+=SearchField)
		(',' search+=SearchField)*
	'}';

SearchField:
	( From | Size | Query); 

From:
	'"from"' ':' Number;

Size:
	'"size"' ':' Number;

Query:
	'"query"' ':' '{' queryType=QueryType '}';

QueryType:
	MatchQuery | TermQuery  ;	

TermQuery:
	'"term"' ':' Object;

MatchQuery:
	'"match"' ':' (ShortMatchObject | MatchObject);

ShortMatchObject:
	'{' fieldName=STRING ':' query=STRING '}' ;	

MatchObject:
	'{' fieldName=STRING ':' MatchObjectParameters '}' ;	

MatchObjectParameters:
	'{' 
		'"query"' ':' STRING
		(',' '"operator"' ':' (AND_OR))?
		(',' '"zero_terms_query"' ':' (NONE_ALL))?
		(',' '"cutoff_frequency"' ':' Number)?

	'}'
;

terminal NONE_ALL:
	'"none"' | '"all"';

terminal AND_OR:
	'"and"' | '"or"';

Object:
	'{' (members+=Member (',' members+=Member)*)? '}';

Member:
	key=STRING ':' value=Value;

Value:
	Object | STRING | Array | Boolean | Null | Number;

Array:
	'[' (values+=Value (',' values+=Value)*)? ']';

terminal Boolean:
  'true' | 'false';

terminal Null:
  'null';

terminal Number:
	Float | INT;

terminal Float:	
	'-'? INT? '.' INT (('E'|'e') '-'? INT)?;

More reading: