Last year I extended the CDK SMARTS implementation to match component groupings and stereochemistry. Specifying stereochemistry introduces some interesting logical predicates that can be tricky to handle.
Here are some examples that I came up with for testing the correctness of query handling. They start simple before getting a little mischievous. First, recursion and component grouping.
| Query | Target | nmatch | Comment |
|---|---|---|---|
| Component grouping (fragment) | | | |
| `(O).(O)` | `O=O` | 0 | Example from Daylight |
| | `OCCO` | 0 | |
| | `O.CCO` | 2 | |
| Component grouping (connected) | | | |
| `(O.O)` | `O=O` | 2 | Example from Daylight |
| | `OCCO` | 2 | |
| | `O.CCO` | 0 | |
| Recursion, ad infinitum | | | |
| `[$(CC[$(CCO),$(CCN)])]` | `CCCCO` | 1 | |
| | `CCCCN` | 1 | |
| | `CCCCC` | 0 | |
| Recursive component grouping | | | |
| `[O;D1;$(([a,A]).([A,a]))][CH]=O` | `OC=O.c1ccccc1` | 1 | Feature/Bug #1312 |
| | `OC=O` | 0 | |
These next ones are concerned with logic and stereochemistry.
| Query | Target | nmatch | Comment |
|---|---|---|---|
| Ensure local stereo matching | | | |
| `*[@](*)(*)(*)` | `O[C@](N)(C)CC` | 12 | A tetrahedron has 12 rotational symmetries |
| | `O[C@@](N)(C)CC` | 12 | |
| | `O[C](N)(C)CC` | 0 | |
| Implicit (hydrogen or lone-pair) neighbour | | | |
| `CC[S@](C)=O` | `CC[S@](C)=O` | 1 | |
| | `CC[S@@](C)=O` | 0 | |
| | `CC[S](C)=O` | 0 | |
| Either (tetrahedral) | | | |
| `CC[@,@@](C)O` | `CC[C@H](C)O` | 1 | |
| | `CC[C@@H](C)O` | 1 | |
| | `CCC(C)O` | 0 | |
| Both (tetrahedral) | | | |
| `CC[@&@@](C)O` | `CC[C@H](C)O` | 0 | |
| | `CC[C@@H](C)O` | 0 | |
| | `CCC(C)O` | 0 | |
| Respect logical precedence 1 | | | |
| `CC[@,Si@@](C)O` | `CC[C@H](C)O` | 1 | |
| | `CC[C@@H](C)O` | 0 | |
| | `CCC(C)=O` | 0 | |
| Respect logical precedence 2 | | | |
| `CC[C@,Si@@](C)O` | `CC[C@H](C)O` | 1 | |
| | `CC[C@@H](C)O` | 0 | |
| | `CCC(C)O` | 0 | |
| | `CC[Si@H](C)O` | 0 | |
| | `CC[Si@@H](C)O` | 1 | |
| | `CC[Si](C)O` | 0 | |
| Unspecified | | | |
| `CC[@@?](C)O` | `CC[C@H](C)O` | 0 | |
| | `CC[C@@H](C)O` | 1 | |
| | `CCC(C)O` | 1 | |
| Negation | | | |
| `CC[!@](C)O` | `CC[C@H](C)O` | 0 | `!@@` is also equivalent to `@?` |
| | `CC[C@@H](C)O` | 1 | |
| | `CCC(C)O` | 1 | |
| Neither (tetrahedral) using 'or unspecified' | | | |
| `CC[@?@@?](C)O` | `CC[C@H](C)O` | 0 | |
| | `CC[C@@H](C)O` | 0 | |
| | `CCC(C)O` | 1 | |
| Neither (tetrahedral) using negation | | | |
| `CC[!@!@@](C)O` | `CC[C@H](C)O` | 0 | |
| | `CC[C@@H](C)O` | 0 | |
| | `CCC(C)O` | 1 | |
| Either (geometric) | | | |
| `C/C=C/,\C` | `C/C=C/C` | 1 | |
| | `C/C=C\C` | 1 | |
| | `CC=CC` | 0 | |
| Neither (geometric) | | | |
| `C/C=C!/!\C` | `C/C=C/C` | 0 | |
| | `C/C=C\C` | 0 | |
| | `CC=CC` | 1 | |
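The tetrahedral rows boil down to evaluating a small boolean expression over the target atom. Here is a minimal sketch of that logic, assuming the neighbour ordering has already been normalised so the target's configuration can be compared directly as `"@"`, `"@@"` or `None` (unspecified). The `parse` function, the `(element, chirality)` atom model and the tiny token set (only `C` and `Si` as elements) are my own simplifications, not how CDK does it. The precedence follows SMARTS: `!` binds tightest, then `&` or juxtaposition, then `,`, then `;`.

```python
import re

# Longest alternatives first so "@@?" is not split into "@@" and a stray "?".
TOKEN = re.compile(r"@@\?|@@|@\?|@|Si|C|[!&,;]")

# Each primitive is a predicate over (element, chirality).
PRIMITIVES = {
    "@":   lambda el, chi: chi == "@",
    "@@":  lambda el, chi: chi == "@@",
    "@?":  lambda el, chi: chi in ("@", None),
    "@@?": lambda el, chi: chi in ("@@", None),
    "C":   lambda el, chi: el == "C",
    "Si":  lambda el, chi: el == "Si",
}

def parse(expr):
    """Turn an atom expression such as 'C@,Si@@' into a predicate."""
    tokens = TOKEN.findall(expr)
    if "".join(tokens) != expr:
        raise ValueError("unsupported token in " + expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def unary():
        tok = take()
        if tok == "!":
            inner = unary()
            return lambda el, chi: not inner(el, chi)
        return PRIMITIVES[tok]

    def conj():  # '&' or plain juxtaposition, binds tighter than ','
        terms = [unary()]
        while peek() is not None and peek() not in (",", ";"):
            if peek() == "&":
                take()
            terms.append(unary())
        return lambda el, chi: all(t(el, chi) for t in terms)

    def disj():  # ','
        terms = [conj()]
        while peek() == ",":
            take()
            terms.append(conj())
        return lambda el, chi: any(t(el, chi) for t in terms)

    def low_conj():  # ';' is the lowest-precedence 'and'
        terms = [disj()]
        while peek() == ";":
            take()
            terms.append(disj())
        return lambda el, chi: all(t(el, chi) for t in terms)

    return low_conj()

# '@,Si@@' reads as '@ or (Si and @@)', so a carbon with @@ must not match.
pred = parse("@,Si@@")
print(pred("C", "@"), pred("C", "@@"), pred("Si", "@@"))  # True False True
```

Running each bracketed expression from the table through `parse` against `("C", "@")`, `("C", "@@")` and `("C", None)` reproduces the 0/1 pattern in the nmatch column, which is a handy sanity check when writing the real matcher.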
The last two are quite tricky (and not currently implemented), but once the atom-centric handling is correct it's a simple reduction. It's quite fun to work out, so I'll leave that up to the reader.