Items not being indexed after publish


Our marketing team recently reported that items were no longer being automatically indexed after a publish. On further investigation, it became clear that all of our custom indexes, as well as sitecore_web_index, were affected. In each case, triggering a full rebuild of the index via the Control Panel successfully added the items to the index.

Troubleshooting the issue

Reviewing each of the affected indexes, it was clear that they all used the onPublishEndAsync index strategy and were all targeting the web database.

<strategies hint="list:AddStrategy">
  <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/onPublishEndAsync" />
</strategies>
<locations hint="list:AddCrawler">
  <crawler type="Sitecore.ContentSearch.SitecoreItemCrawler, Sitecore.ContentSearch">
    <Database>web</Database>
    <Root>/sitecore/content/some location</Root>
  </crawler>
</locations>

Looking at the crawling logs, we could see that the strategy was being initialized successfully and the crawler was set up:

[Index=sitecore_web_index] Initializing SitecoreItemCrawler. DB:web / Root:/sitecore
[Index=sitecore_web_index] Initializing OnPublishEndAsynchronousSingleInstanceStrategy [Stamp:2682411].

When a publish was triggered, the item was successfully pushed to the web database and the publish operation completed without any errors.

Recently Deployed Web Database

One further piece of the puzzle was that we had recently deployed a new web database for our CA environment. The timing roughly coincided with the start of these problems, which was too big a coincidence to ignore. However, we couldn't work out how or why this event could explain the symptoms we were experiencing.

The new database (after having done a full republish to it) seemed to be working exactly as we expected. No errors were present and the CA environment front end was displaying perfectly.

Sitecore Support

A quick ticket to Sitecore Support provided the answers we were looking for. In short, the problem was a mismatch between a stamp stored in the Properties table of the CORE database and the stamps on new items being added to the Event Queue in the new web database.

How the OnPublishEndAsync Indexing Strategy Works

As you make changes in the Sitecore back end, multiple events are registered in the master database's Event Queue (EQ). Each of these events has a 'Stamp' property, which is incremented sequentially (note that it is not a timestamp). When you trigger a publish, those events are then copied to the EQ of each publishing target (e.g. the web database).

Note: the 'Stamp' of an event in the master EQ is not necessarily the same as the 'Stamp' of the corresponding event in the web EQ.
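To illustrate the note above, here is a minimal toy model in Python, not Sitecore code. It assumes each database hands out stamps from its own sequential counter. The `EventQueue` class and the starting value for master are hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class EventQueue:
    """Toy event queue: each database issues its own sequential stamps."""
    name: str
    next_stamp: int = 1
    events: list = field(default_factory=list)

    def add(self, event_type: str) -> int:
        # Assign the next stamp from this database's own counter.
        stamp = self.next_stamp
        self.next_stamp += 1
        self.events.append((stamp, event_type))
        return stamp


# Starting values are illustrative; the master value is made up.
master = EventQueue("master", next_stamp=512340)
web = EventQueue("web", next_stamp=290005)

# Edits in the back end are recorded in the master queue.
master_stamps = [master.add(t) for t in ("Property Changed Event", "Save Event")]

# On publish, corresponding events appear in the web queue, but each
# one receives a stamp from the web database's own counter.
web_stamps = [web.add(event_type) for _, event_type in master.events]
web_stamps.append(web.add("Publish Event"))

print(master_stamps)  # [512340, 512341]
print(web_stamps)     # [290005, 290006, 290007]
```

The same logical events end up with unrelated stamp values in the two queues, which is why a stamp is only meaningful relative to the database that issued it.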

The OnPublishEndAsync strategy listens for a 'publish' event being added to the EQ of the database defined in the config. At that point, it checks the Last Updated Timestamp for that specific index (from the Properties table in the CORE database), then finds all events in its EQ that have a 'Stamp' after that value. Those events are then processed and the index is updated accordingly.
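The behaviour described above can be sketched roughly as follows. This is a simplified Python model based only on the explanation we were given, not the actual Sitecore implementation. The function name `process_publish` and the stamp values are hypothetical.

```python
def process_publish(event_queue, last_stamp):
    """Return (events_to_index, new_last_stamp) for one publish.

    event_queue: list of (stamp, event_type) tuples from the target database.
    last_stamp:  value stored for the index (a stand-in for the
                 ..._LAST_UPDATED_TIMESTAMP property; hypothetical model).
    """
    # Pick up only events whose stamp is after the stored value.
    pending = sorted(e for e in event_queue if e[0] > last_stamp)
    if not pending:
        # Nothing newer than the stored stamp: nothing is indexed.
        return [], last_stamp
    # Remember the stamp of the last event that was processed.
    return pending, pending[-1][0]


queue = [
    (103, "Publish Event"),
    (101, "Property Changed Event"),
    (102, "Save Event"),
]

pending, last = process_publish(queue, 100)
print(pending, last)   # all three events, 103

again, last2 = process_publish(queue, last)
print(again, last2)    # [] 103
```

The second call returns nothing because every stamp in the queue is at or below the stored value, which is the normal state between publishes.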

Confusing Terminology

What I found confusing was the terminology used, both in the Sitecore docs and in the Properties table. The property in question is labelled CORE_SITECORE_WEB_INDEX_MyMachineName-MySite.local_LAST_UPDATED_TIMESTAMP. However, it's not actually a timestamp, but rather a sequential counter.

Example

To make this a little clearer, I will show what was happening in our situation.

Situation One: Prior to the new database being deployed

CORE Database | Properties table
CORE_SITECORE_WEB_INDEX_xxx_LAST_UPDATED_TIMESTAMP: 290006
WEB Database | Event Queue
290007 | Publish Event
290006 | Save Event
290005 | Property Changed Event

So when the publish event was processed by the OnPublishEndAsync handler, it processed all events after the stamp stored in the CORE database, then set that stored value to the stamp of the last event it processed. Future events would be added to the EQ as 290008, 290009 and so on.

Situation Two: After new database was deployed

When the new web database was deployed, the incremental value used in the EQ was reset to zero. So in the test that I performed, I noticed the following values:

CORE Database | Properties table
CORE_SITECORE_WEB_INDEX_xxx_LAST_UPDATED_TIMESTAMP: 290006
WEB Database | Event Queue
2003 | Publish Event
2002 | Save Event
2001 | Property Changed Event

In the above situation, when the publish event is triggered, the OnPublishEndAsync handler looks for all events in the event queue after 290006, which returns 0 records. Therefore no items are processed and no items are added to the index.
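Using the values from the two situations above, the mismatch can be checked with a small Python sketch. It is only an illustration of the comparison, with hypothetical helper names, and does not query a real database.

```python
def events_after(last_stamp, queue_stamps):
    """Stamps the strategy would pick up: those after the stored value."""
    return [s for s in queue_stamps if s > last_stamp]


def stored_stamp_is_ahead(last_stamp, queue_stamps):
    """True when the stored value is higher than every stamp in the queue."""
    return bool(queue_stamps) and last_stamp > max(queue_stamps)


stored = 290006                       # value in the Properties table
old_queue = [290005, 290006, 290007]  # Situation One
new_queue = [2001, 2002, 2003]        # Situation Two

print(events_after(stored, old_queue))           # [290007]
print(events_after(stored, new_queue))           # []
print(stored_stamp_is_ahead(stored, old_queue))  # False
print(stored_stamp_is_ahead(stored, new_queue))  # True
```

A stored value that is ahead of everything in the queue is a strong hint that the queue's counter has been reset underneath the index.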

Solution

The solution to the problem shown above is straightforward to apply. Run the following SQL commands on the Core, Master and all publishing target databases:

TRUNCATE TABLE [EventQueue]
TRUNCATE TABLE [PublishQueue]
TRUNCATE TABLE [Properties]

This effectively resets all of the queues and related properties in each of the databases. Once in sync again, the OnPublishEndAsync strategy worked exactly as expected.

Conclusion

Our initial goal was to deploy a new web database in order to reduce its overall size and that of its backup files. The amount of data in the DB had been significantly reduced, but as a result it had a lot of unnecessary underlying files attached. It was also the product of several upgrades, so it seemed a reasonable idea to start with a fresh DB and publish over the top of it. It goes to show how a seemingly simple task could have much wider ramifications if you don't fully understand how things hang together.

That said, diagnosing and solving the problem has given us a better insight into the underlying processes involved.

Useful Links

How to restore a Sitecore database from backup: https://support.sitecore.com/kb?id=kb_article_view&sysparm_article=KB1000799

OnPublishEndAsync strategy: https://doc.sitecore.com/xp/en/developers/101/platform-administration-and-architecture/index-update-strategies.html#onpublishendasync-strategy
