This series of blog posts walks through the steps I took to create a page recommendation engine for Sitecore. The main points covered are as follows:
- The concept
- How to build a machine learning service
- Building our page recommendation ML Model
- Triggering Page Level Goals and Events
- Collating, merging data and training our model
- Generating page recommendations and storing the results
Companion code for this series of blog posts can be found on GitHub: https://github.com/deanobrien/page-recommender-for-sitecore
The Concept
The aim is to create a recommendation service that works the way you would expect one to work on a movie site. There, people rate the movies they have seen on a scale of 1 to 10, and a machine learning algorithm analyses those ratings and makes predictions based on the idea:
“if User A likes a lot of the movies that Users B and C like, he/she will most likely rate other movies they like highly as well – therefore the service would predict a high rating for those movies for User A”.
However, for our page recommendation service, instead of predicting what Rating a ‘User’ will give a ‘Movie’, we will predict what Engagement a ‘User ID’ will have with a ‘Page ID’.
Examples
Movie Recommendation Service – Example Data
User | Movie | Rating |
---|---|---|
Bill | Aliens | 9 |
Bob | Aliens | 8 |
Bob | Die Hard | 10 |
Bob | Predator | 9 |
Bill | Kill Bill | 8 |
Bill | Predator | 9 |
Ben | Predator | 10 |
Ben | Aliens | 8 |
When provided with the information above, we might expect the movie recommendation service to predict that Ben would give “Die Hard” a rating of around “9.5”, based on the fact that Bill and Bob both like similar movies to Ben, and Bob rated that movie highly.
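To make the idea concrete, here is a minimal Python sketch of one simple way to produce such a prediction: a similarity-weighted average of other users' ratings. This is an illustration only and is not necessarily the algorithm used later in the series; the function names `similarity` and `predict` are hypothetical.

```python
from math import sqrt

# Ratings from the example table above.
ratings = {
    "Bill": {"Aliens": 9, "Kill Bill": 8, "Predator": 9},
    "Bob": {"Aliens": 8, "Die Hard": 10, "Predator": 9},
    "Ben": {"Predator": 10, "Aliens": 8},
}


def similarity(a, b):
    """Cosine similarity over the movies both users have rated."""
    shared = set(ratings[a]) & set(ratings[b])
    if not shared:
        return 0.0
    dot = sum(ratings[a][m] * ratings[b][m] for m in shared)
    norm_a = sqrt(sum(ratings[a][m] ** 2 for m in shared))
    norm_b = sqrt(sum(ratings[b][m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)


def predict(user, movie):
    """Similarity-weighted average of other users' ratings for the movie.

    Returns None when no other user has rated the movie.
    """
    weighted = 0.0
    total = 0.0
    for other, seen in ratings.items():
        if other == user or movie not in seen:
            continue
        weight = similarity(user, other)
        weighted += weight * seen[movie]
        total += weight
    return weighted / total if total else None


print(predict("Ben", "Die Hard"))
```

With a table this small, only Bob has rated “Die Hard”, so the sketch effectively returns his rating of 10. A model trained on thousands of rows blends many more signals, which is how a less extreme figure such as 9.5 could arise.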
Page Recommendation Service – Example Data
UserID | PageID | Engagement |
---|---|---|
20e98813-b4f2-4dd3-aad6-eee188264f83 | 97f20fd5-f2e2-424b-8bfe-14479f8cca13 | 20 |
0573eb5e-e310-4f60-9a39-c2c1c09126c1 | 97f20fd5-f2e2-424b-8bfe-14479f8cca13 | 22 |
0573eb5e-e310-4f60-9a39-c2c1c09126c1 | ce6793e8-98e6-455e-8915-611d1a073150 | 20 |
0573eb5e-e310-4f60-9a39-c2c1c09126c1 | c32f30ff-867a-4b75-8643-5808ae3a6378 | 32 |
20e98813-b4f2-4dd3-aad6-eee188264f83 | b591fd13-f564-405b-baeb-428d9e53e4a4 | 12 |
20e98813-b4f2-4dd3-aad6-eee188264f83 | c32f30ff-867a-4b75-8643-5808ae3a6378 | 18 |
9fdeeefb-8418-4638-9e06-3efa0e7d9634 | c32f30ff-867a-4b75-8643-5808ae3a6378 | 10 |
9fdeeefb-8418-4638-9e06-3efa0e7d9634 | 97f20fd5-f2e2-424b-8bfe-14479f8cca13 | 21 |
When provided with the information above, we might expect the page recommendation service to predict that user “9fdeeefb-8418-4638-9e06-3efa0e7d9634” would have an engagement value of around “26” for page “ce6793e8-98e6-455e-8915-611d1a073150”.
Summary
In practice, for the prediction service to be effective, we need to provide the machine learning model with thousands of rows of data, containing engagement values for a wide range of pages (PageIDs) and users (UserIDs).
To generate the data, we will trigger a range of “in-page goals and events” and then use the Cortex Processing Engine to first gather all the engagements, then merge the results together.
For example, if a user triggers 10 separate page events, each with a different engagement value, these will be merged together to give one single result showing the total engagement for that user on a given page.
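The merge step described above can be sketched in Python as a sum per (UserID, PageID) pair. This is only an illustration of the idea, not the Cortex Processing Engine code used later in the series; the function name `merge_engagements` and the individual event values are hypothetical.

```python
from collections import defaultdict


def merge_engagements(events):
    """Sum engagement values per (user_id, page_id) pair.

    `events` is an iterable of (user_id, page_id, engagement_value) tuples,
    one tuple per triggered goal or page event.
    """
    totals = defaultdict(int)
    for user_id, page_id, value in events:
        totals[(user_id, page_id)] += value
    return dict(totals)


# Hypothetical raw events: one user triggers three events on the same page,
# and a second user triggers a single event on that page.
user_a = "20e98813-b4f2-4dd3-aad6-eee188264f83"
user_b = "0573eb5e-e310-4f60-9a39-c2c1c09126c1"
page = "97f20fd5-f2e2-424b-8bfe-14479f8cca13"

events = [
    (user_a, page, 5),
    (user_a, page, 10),
    (user_b, page, 22),
    (user_a, page, 5),
]

merged = merge_engagements(events)
for (user_id, page_id), total in merged.items():
    print(user_id, page_id, total)
```

Each merged row has the same shape as the example table: one UserID, one PageID and a single total engagement value, which is the form the model is trained on.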
Next up in the series: How to build a machine learning service