This series of blog posts looks at the steps I took to create a page recommendation engine for Sitecore. The main points covered are as follows:
- The concept
- How to build a machine learning service
- Building our page recommendation ML Model
- Triggering Page Level Goals and Events
- Collating, merging data and training our model
- Generating page recommendations and storing the results
Companion code for this series of blog posts can be found on GitHub: https://github.com/deanobrien/page-recommender-for-sitecore
The aim is to create a recommendation service that works the way you would expect one to work on a movie site, where people rate each movie they have seen between 1 and 10. A machine learning algorithm then analyses that data and makes predictions, based on the idea:
“if User A likes a lot of the movies that Users B and C like, User A will most likely also rate highly the other movies that B and C like. The service would therefore predict a high rating for those movies for User A”.
However, for our page recommendation service, instead of predicting what Rating a ‘User’ will give a ‘Movie’, we will predict what Engagement a ‘User ID’ will have with a ‘Page ID’.
Movie Recommendation Service – Example Data
When provided with the information above, we might expect the movie recommendation service to predict that Ben would give “Die Hard” a rating of “9.5”, based on the fact that both Bill and Bob like similar movies to Ben and rated that movie highly.
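The intuition behind that prediction can be sketched in a few lines of Python. This is only an illustration of the "similar users" idea using a similarity-weighted average; it is not necessarily the algorithm used later in this series. The ratings and the extra movie titles are invented for the example, and the helper names `cosine_similarity` and `predict` are hypothetical.

```python
from math import sqrt

# Invented ratings (1-10) purely for illustration; not the data from this post.
ratings = {
    "Bill": {"Alien": 9.0, "Predator": 8.0, "Die Hard": 10.0},
    "Bob": {"Alien": 8.0, "Predator": 9.0, "Die Hard": 9.0},
    "Ben": {"Alien": 9.0, "Predator": 8.0},  # Ben has not rated "Die Hard"
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue  # skip the user themselves and anyone who has not rated it
        weight = cosine_similarity(ratings[user], their_ratings)
        weighted_sum += weight * their_ratings[item]
        weight_total += weight
    return weighted_sum / weight_total if weight_total else None


prediction = predict(ratings, "Ben", "Die Hard")
print(prediction)
```

Because Bill and Bob both rate the shared movies much as Ben does, their high ratings for "Die Hard" dominate the weighted average, so the predicted rating for Ben lands between their two scores.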
Page Recommendation Service – Example Data
When provided with the information above, we might expect the page recommendation service to predict that “9fdeeefb-8418-4638-9e06-3efa0e7d9634” would give “ce6793e8-98e6-455e-8915-611d1a073150” an engagement value of “26”.
In practice, for the prediction service to be effective, we need to provide the machine learning model with thousands of rows of data, containing engagement values for a wide range of pages (PageIDs) and users (UserIDs).
To generate the data, we will trigger a range of “in-page goals and events” and then use the Cortex Processing Engine to first gather all the engagements and then merge the results together.
For example, if a user triggers 10 separate page events, each with a different engagement value, these will be merged to give a single result showing that user's total engagement on a given page.
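To make the merge step concrete, here is a minimal sketch in plain Python. It does not use the Cortex Processing Engine or any Sitecore API; it only shows the aggregation itself. The event tuples, the IDs, and the `merge_engagement` helper are hypothetical and exist purely for illustration.

```python
from collections import defaultdict

# Hypothetical raw events: (user_id, page_id, engagement_value).
events = [
    ("user-1", "page-a", 5),
    ("user-1", "page-a", 10),
    ("user-1", "page-b", 2),
    ("user-2", "page-a", 7),
]


def merge_engagement(events):
    """Sum engagement values so each (user, page) pair yields one row."""
    totals = defaultdict(int)
    for user_id, page_id, value in events:
        totals[(user_id, page_id)] += value
    return dict(totals)


merged = merge_engagement(events)
for (user_id, page_id), total in sorted(merged.items()):
    print(user_id, page_id, total)
```

Each resulting row has the shape the model needs for training: one user ID, one page ID, and a single total engagement value.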
Next up in the series: How to build a machine learning service