### Segmentation Algorithms and Rater Bias: How our proprietary algorithm for market segmentation outperforms standard approaches

Segmentation is a method that partitions a market into segments: subgroups of consumers who share similar characteristics, such as their attitude toward shopping, their life philosophy, or their preference for a product. Segmentation helps firms develop more efficient marketing strategies, for example by targeting products to the differing needs of each segment. There are generally two types of segmentation: a priori and post hoc. With a priori segmentation, segments are defined prior to analysis, and the goal of the analysis is simply to profile these segments. With post hoc segmentation, segments are identified by using either hierarchical or partitioning methods (Dillon, 1994).

**Partitioning Methods**

Partitioning methods generally assign cases to a fixed number of groups so as to minimize some measure of distance from the group centers, called centroids or medoids. The two most common partitioning methods are K-means and Pam. The K-means algorithm clusters cases by minimizing their squared distance from a specified number of centroids (a multidimensional version of group means). The Pam algorithm minimizes the sum of dissimilarities from medoids (a multidimensional version of group medians). The Pam algorithm is generally more robust against outliers than the K-means algorithm (Insightful Corporation, 1988).
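The difference between a centroid and a medoid can be illustrated with a minimal sketch. It assumes Euclidean distance as the dissimilarity measure, and the sample points and function names are made up for illustration; this is not the K-means or Pam algorithm itself, only the two kinds of group center they rely on.

```python
import math

def centroid(points):
    """Coordinate-wise mean of the group (the kind of center K-means uses)."""
    dims = len(points[0])
    return [sum(p[j] for p in points) / len(points) for j in range(dims)]

def medoid(points):
    """Group member with the smallest total distance to all other members
    (the kind of center Pam uses). Euclidean distance is an assumption here."""
    return min(points, key=lambda c: sum(math.dist(c, p) for p in points))

# Hypothetical group: four tightly packed cases plus one outlier.
group = [[1.0, 1.0], [2.0, 1.0], [1.0, 2.0], [2.0, 2.0], [50.0, 50.0]]

group_centroid = centroid(group)  # pulled toward the outlier
group_medoid = medoid(group)      # remains one of the tightly packed cases
```

Because the medoid must be an actual case, a single extreme respondent cannot drag it far from the bulk of the group, which is one way to see why Pam tends to be more robust against outliers.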

**Hierarchical Methods**

There are two general approaches to hierarchical clustering: agglomerative and divisive. Agglomerative methods, such as the Agnes algorithm, start with each observation as its own group and then, step by step, combine the two most similar groups until all observations are in one group. Divisive methods, such as the Diana algorithm, start with all observations in one group and then, step by step, create new groups by dividing the group with the highest dissimilarity until every group contains only one observation (Insightful Corporation, 1988).
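The agglomerative procedure can be sketched in a few lines. This is a simplified illustration, not the Agnes implementation: it assumes Euclidean distance and average linkage as the measure of similarity between groups, and it stops once a requested number of groups remains instead of merging all the way down to one group. The function names and sample points are hypothetical.

```python
import math
from itertools import combinations

def average_linkage(a, b):
    """Mean pairwise Euclidean distance between two groups (assumed linkage)."""
    return sum(math.dist(p, q) for p in a for q in b) / (len(a) * len(b))

def agglomerate(points, n_groups):
    """Start with one group per observation and repeatedly merge the two
    most similar groups until only n_groups remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_groups:
        # Find the pair of groups with the smallest average dissimilarity.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # merge group j into group i
        del clusters[j]                          # i < j, so index i stays valid
    return clusters

points = [[0, 0], [0, 1], [10, 10], [10, 11]]
two_groups = agglomerate(points, 2)
```

A divisive method runs in the opposite direction, starting from a single group and splitting, so it is not covered by this sketch.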

**Alternative Market Segmentation Method Developed by Horizon Consumer Science**

Unfortunately, all the methods described above tend to be sensitive to **rater bias**, the tendency of people to use a restricted range of a rating scale to evaluate all rated attributes. This typically shows up as high intercorrelation between rated attributes. Figure 1 shows a two-dimensional distribution of two variables representing two attributes, each rated on a 9-point scale. For convenience, each of these variables was rescaled into three categories: Low (rating 1 through 5), Medium (rating 6 and 7), and High (rating 8 and 9). This two-dimensional distribution shows a correlation of .50 between the two variables.
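To see why rater bias shows up as intercorrelation, consider a small constructed example. The data are invented for illustration and are not taken from the study: each respondent's rating is modeled as a personal rating level plus a relative preference, and the relative preferences for the two attributes are deliberately uncorrelated.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical relative preferences for two attributes (uncorrelated by design).
pref_a = [1, -1, 1, -1, 1, -1, 1, -1]
pref_b = [1, 1, -1, -1, 1, 1, -1, -1]

# Hypothetical rater levels: four respondents rate low, four rate high.
level = [3, 3, 3, 3, 7, 7, 7, 7]

# Observed ratings on the 9-point scale = rater level + relative preference.
rating_a = [l + p for l, p in zip(level, pref_a)]
rating_b = [l + p for l, p in zip(level, pref_b)]

r_preferences = pearson(pref_a, pref_b)   # correlation of the preferences alone
r_observed = pearson(rating_a, rating_b)  # correlation of the observed ratings
```

In this construction the correlation between the observed ratings comes entirely from the shared rater level, since the underlying preferences are unrelated. A distance-based clustering method applied to such ratings will tend to separate the low raters from the high raters first.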

A proprietary method was developed by Jan Muska for Horizon Consumer Science (HCS). This method can segment consumers without being influenced by rater bias, returning segments that are superior to those produced by the other methods.

**Rating and Segmentation**

As an example of how this new method outperforms existing classification methods, one attribute-rating question was selected from the Horizon Consumer Science market study. The stem for this question is “Please indicate how desirable each statement is to you personally when you travel internationally and shop for luxury goods?” Ratings were given on a 9-point scale where *9=Extremely desirable* and *1=Extremely undesirable*. A 4-group segmentation was run with each of the methods described above to compare them with the HCS algorithm on rating data.

Table 1 shows the 4-group segmentation produced by the K-means method. It is immediately clear that the method captured the rater bias and grouped people by their general rating tendency, which has limited utility. Similar results were obtained with the other clustering methods: Pam, Agnes, and Diana.

Table 2 shows the segmentation summary for the HCS algorithm. Clearly, each group is well defined, despite the average intercorrelation of .46 among the 16 attributes of the rating question.

**Dealing with Rater Bias**

A simple approach to removing rater bias is to case normalize the rating matrix (one row per respondent). To do that, subtract the row mean from each row, which converts all row means to zero. Then divide each row by its row standard deviation, which converts all row standard deviations to one. For example, let one individual’s ratings for the questions we are considering be (3, 5, 3, 5). To case normalize this row, first subtract its mean (4) to obtain (-1, 1, -1, 1). Then divide these centered observations by the row standard deviation (1.15) to obtain the case normalized ratings (-.87, .87, -.87, .87). As rater bias is the tendency to rate all attributes higher or lower, with reduced spread, subtracting the row mean and dividing by the standard deviation equalizes the impact of each case. The only differences that remain are the relative differences among the individual attributes within each row of the rating matrix. The rater bias has been removed.
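The case normalization step can be sketched as follows. This is a minimal illustration that assumes the sample standard deviation (n - 1 denominator), which matches the 1.15 in the worked example. The handling of a respondent who gives every attribute the same rating is an assumption not covered in the text, and the function name is hypothetical.

```python
from statistics import mean, stdev

def case_normalize(ratings):
    """Center each respondent's row on its own mean and scale it by its own
    sample standard deviation."""
    normalized = []
    for row in ratings:
        m = mean(row)
        s = stdev(row)  # sample standard deviation (n - 1 denominator)
        if s == 0:
            # Assumption: a flat row carries no relative information,
            # so it is mapped to all zeros instead of dividing by zero.
            normalized.append([0.0 for _ in row])
        else:
            normalized.append([(x - m) / s for x in row])
    return normalized

# The worked example row, plus a second hypothetical respondent who shows
# the same relative pattern but rates everything four points higher.
rows = [[3, 5, 3, 5], [7, 9, 7, 9]]
result = case_normalize(rows)
```

After normalization the two rows are identical, which is the point of the procedure: respondents who differ only in their overall rating level no longer differ at all.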

#### Assessing the Segmentation

One way to assess the quality of segmentation is to look at the percent of explained variance EV, or the degree to which the segmentation model explains the variation in the responses.
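The text does not give the formula used for EV. One common definition, assumed here, is the share of the total sum of squares that is removed once each case is measured from its own group center: EV = 1 - SS_within / SS_total. The function name and the sample data in this sketch are hypothetical.

```python
def explained_variance(data, labels):
    """Assumed definition: 1 - (within-group sum of squares / total sum of squares)."""
    n_cols = len(data[0])

    # Total sum of squares around the grand mean of each column.
    grand = [sum(row[j] for row in data) / len(data) for j in range(n_cols)]
    ss_total = sum((row[j] - grand[j]) ** 2 for row in data for j in range(n_cols))

    # Collect the rows belonging to each segment.
    groups = {}
    for row, label in zip(data, labels):
        groups.setdefault(label, []).append(row)

    # Within-group sum of squares around each segment's own center.
    ss_within = 0.0
    for members in groups.values():
        center = [sum(r[j] for r in members) / len(members) for j in range(n_cols)]
        ss_within += sum((r[j] - center[j]) ** 2 for r in members for j in range(n_cols))

    return 1.0 - ss_within / ss_total

data = [[1, 1], [1, 3], [9, 9], [9, 11]]
ev_two_groups = explained_variance(data, [0, 0, 1, 1])
ev_one_group = explained_variance(data, [0, 0, 0, 0])
```

Under this definition a segmentation that merely separates high raters from low raters can score a high EV on raw ratings, because much of the total variance in such data is variance in overall rating level.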

Table 3 shows EV results for the original data set and for the case normalized data set. As Table 3 demonstrates, the HCS algorithm does not look impressive when EV is computed on the ordinary X. However, the case normalized X shows that the apparently stellar performance of the K-means, Pam, Agnes, and Diana algorithms was due to capturing the rater bias. Once the rater bias was removed, the HCS algorithm was the only one that provided a reasonable solution.

Note that the performance of the standard methods on rating data can be improved by simply case normalizing the rating variables first. Table 4 shows EV for segmentations run on the same rating data after case normalization. The performance of the classical methods clearly improved. Notably, K-means and Diana returned results that look technically better than those of the HCS algorithm.

Table 5 shows the 4-group segmentation produced by the K-means algorithm for the case normalized matrix X. Even though there is an observable improvement compared to Table 1, the segmentation is not as clear-cut as the one in Table 2 for the HCS algorithm. The result for the Diana algorithm shows a similar outcome and is not shown here.

So, even though case normalizing can improve the performance of the classical methods to even outperform the HCS algorithm on technical grounds, the segmentation results provided by the HCS algorithm are cleaner and better defined.