Last year I extended the CDK SMARTS implementation to match component groupings and stereochemistry. Specifying stereochemistry presents some interesting logical predicates that can be tricky to handle.
Here are some examples that I came up with for testing the correctness of query handling. They start simple before getting a little mischievous. First, recursion and component grouping.
query | targets | nmatch | Comment |
Component grouping (fragment) |
(O).(O) | O=O | 0 | Example from Daylight |
| OCCO | 0 |
| O.CCO | 2 |
Component grouping (connected) |
(O.O) | O=O | 2 | Example from Daylight |
| OCCO | 2 |
| O.CCO | 0 |
Recursion, ad infinitum |
[$(CC[$(CCO),$(CCN)])] | CCCCO | 1 |
| CCCCN | 1 |
| CCCCC | 0 |
Recursive component grouping |
[O;D1;$(([a,A]).([A,a]))][CH]=O | OC=O.c1ccccc1 | 1 | Feature/Bug #1312 |
| OC=O | 0 |
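The component grouping rows can be reproduced with a small sketch. It is not the CDK implementation; it only assumes the Daylight rule that query atoms in the same group must land in the same target component, and atoms in different groups must land in different components. The names `count_matches` and `grouping_ok` are hypothetical, the targets are written out by hand as (element, component id) pairs, and bonds are ignored because these two queries have none.

```python
from itertools import permutations


def grouping_ok(query, target, mapping):
    """Check the component grouping constraint for one atom mapping."""
    group_to_component = {}
    for q, t in enumerate(mapping):
        group = query[q][1]
        if group is None:  # ungrouped query atoms are unconstrained
            continue
        component = target[t][1]
        # atoms of one query group must all fall in one target component
        if group_to_component.setdefault(group, component) != component:
            return False
    # distinct query groups must fall in distinct target components
    components = list(group_to_component.values())
    return len(components) == len(set(components))


def count_matches(query, target):
    """Count (non-unique) mappings; only elements are compared, no bonds."""
    count = 0
    for mapping in permutations(range(len(target)), len(query)):
        if any(query[q][0] != target[t][0] for q, t in enumerate(mapping)):
            continue
        if grouping_ok(query, target, mapping):
            count += 1
    return count


# Queries as (element, group id): (O).(O) and (O.O)
FRAGMENT = [("O", 0), ("O", 1)]
CONNECTED = [("O", 0), ("O", 0)]

# Targets as (element, connected component id), written out by hand
TARGETS = {
    "O=O": [("O", 0), ("O", 0)],
    "OCCO": [("O", 0), ("C", 0), ("C", 0), ("O", 0)],
    "O.CCO": [("O", 0), ("C", 1), ("C", 1), ("O", 1)],
}

for smiles, atoms in TARGETS.items():
    print(smiles, count_matches(FRAGMENT, atoms), count_matches(CONNECTED, atoms))
```

The counts are non-unique, which is why a symmetric hit such as O=O against (O.O) is reported twice.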
These next ones are concerned with logic and stereochemistry.
query | targets | nmatch | Comment |
Ensure local stereo matching |
*[@](*)(*)(*) | O[C@](N)(C)CC | 12 | a tetrahedron has 12 rotational symmetries |
| O[C@@](N)(C)CC | 12 |
| O[C](N)(C)CC | 0 |
Implicit (hydrogen or lone-pair) neighbour |
CC[S@](C)=O | CC[S@](C)=O | 1 |
| CC[S@@](C)=O | 0 |
| CC[S](C)=O | 0 |
Either (tetrahedral) |
CC[@,@@](C)O | CC[C@H](C)O | 1 |
| CC[C@@H](C)O | 1 |
| CCC(C)O | 0 |
Both (tetrahedral) |
CC[@&@@](C)O | CC[C@H](C)O | 0 |
| CC[C@@H](C)O | 0 |
| CCC(C)O | 0 |
Respect logical precedence 1 |
CC[@,Si@@](C)O | CC[C@H](C)O | 1 |
| CC[C@@H](C)O | 0 |
| CCC(C)O | 0 |
Respect logical precedence 2 |
CC[C@,Si@@](C)O | CC[C@H](C)O | 1 |
| CC[C@@H](C)O | 0 |
| CCC(C)O | 0 |
| CC[Si@H](C)O | 0 |
| CC[Si@@H](C)O | 1 |
| CC[Si](C)O | 0 |
Unspecified |
CC[@@?](C)O | CC[C@H](C)O | 0 |
| CC[C@@H](C)O | 1 |
| CCC(C)O | 1 |
Negation |
CC[!@](C)O | CC[C@H](C)O | 0 | !@ is equivalent to @@?; likewise !@@ is equivalent to @? |
| CC[C@@H](C)O | 1 |
| CCC(C)O | 1 |
Neither (tetrahedral) using 'or unspecified' |
CC[@?@@?](C)O | CC[C@H](C)O | 0 |
| CC[C@@H](C)O | 0 |
| CCC(C)O | 1 |
Neither (tetrahedral) using negation |
CC[!@!@@](C)O | CC[C@H](C)O | 0 |
| CC[C@@H](C)O | 0 |
| CCC(C)O | 1 |
Either (geometric) |
C/C=C/,\C | C/C=C/C | 1 |
| C/C=C\C | 1 |
| CC=CC | 0 |
Neither (geometric) |
C/C=C!/!\C | C/C=C/C | 0 |
| C/C=C\C | 0 |
| CC=CC | 1 |
The last two are quite tricky (and not currently implemented), but once the atom-centric handling is correct it's a simple reduction. It's quite fun to work out, so I'll leave that up to the reader.
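For the atom-centric (tetrahedral) rows, the logic can be checked with a minimal sketch of a predicate evaluator. It is not how CDK does it. It assumes the target's chirality is already expressed relative to the query's neighbour order (true for the examples above, where query and target list neighbours in the same order), and it supports only the primitives used in the table: `@`, `@@`, `@?`, `@@?`, `C` and `Si`. The function names are hypothetical. Precedence follows SMARTS: `!` binds tightest, then `&` or juxtaposition, then `,`, then `;`.

```python
import re

TOKEN = re.compile(r"@@\?|@\?|@@|@|Si|C|[!&,;]")


def tokenize(expr):
    tokens, pos = [], 0
    while pos < len(expr):
        m = TOKEN.match(expr, pos)
        if m is None:
            raise ValueError(f"unsupported token at {pos} in {expr!r}")
        tokens.append(m.group())
        pos = m.end()
    return tokens


def primitive(tok, element, chirality):
    """chirality is '@', '@@' or None (unspecified)."""
    if tok == "@":
        return chirality == "@"
    if tok == "@@":
        return chirality == "@@"
    if tok == "@?":
        return chirality in ("@", None)
    if tok == "@@?":
        return chirality in ("@@", None)
    return tok == element  # element primitives: 'C' or 'Si'


def matches(expr, element, chirality):
    """Evaluate a bracket-atom expression such as 'C@,Si@@' on one atom."""
    tokens = tokenize(expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def unary():
        nonlocal pos
        tok = peek()
        if tok is None or tok in ("&", ",", ";"):
            raise ValueError(f"expected a primitive in {expr!r}")
        pos += 1
        if tok == "!":
            return not unary()
        return primitive(tok, element, chirality)

    def conj():  # high-precedence and: '&' or plain juxtaposition
        nonlocal pos
        result = unary()
        while peek() not in (None, ",", ";"):
            if peek() == "&":
                pos += 1
            rhs = unary()  # always parse the operand before combining
            result = result and rhs
        return result

    def disj():  # or: ','
        nonlocal pos
        result = conj()
        while peek() == ",":
            pos += 1
            rhs = conj()
            result = result or rhs
        return result

    def low_and():  # low-precedence and: ';'
        nonlocal pos
        result = disj()
        while peek() == ";":
            pos += 1
            rhs = disj()
            result = result and rhs
        return result

    return low_and()


# 'Respect logical precedence 2': C@,Si@@ reads as (C and @) or (Si and @@)
for element in ("C", "Si"):
    for chirality in ("@", "@@", None):
        print(element, chirality, matches("C@,Si@@", element, chirality))
```

Evaluating both operands before combining them keeps the token position in step even when the left-hand side alone would already decide the result.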