JL – I always remember James as a very creative and self-possessed student. An unusual combination. When inviting him to contribute his MSc Reflections, I wrote: ‘Feel free to do something novel – that was always your forte, as I remember’. He seems to have taken me at my word – no answers to my questions, for example. Also, I am not sure that I have got the hang of this horse-racing thing. Have you? For the purposes in hand, however, James was my horse and I have won (I think…..). Thanks James, much appreciated.
To which James replied:
Hi John,
That made me smile. I love the idea of being a horse, and it was fun to do. And, I didn’t mention it, but I was lucky falling into HCI at the time I did, as it’s made a big difference to my life.
HCI Reflections, Horse Racing & Chocolates
I did the course at the end of the 90s. I hadn’t any practical experience with computers but had a background in psychology, philosophy and art, and as it turned out this was a pretty good combo for HCI. More luck than planning.
I enjoyed the course and the people I met. I thought it was like a foundation course, as I got to sample lots of different bits and pieces. There are a few things that have stuck with me. We had a practical design fortnight where we had to complete a project, from concept to prototype. This was great fun, very stimulating and really useful. I came away feeling that the course should do more of this. All the computing elements were good, as I was keen to learn more about this. I found the introduction to programming especially useful, as I hadn’t done anything like this before. It’s fun making stuff that works, and this really helps when it comes to testing and communicating an idea. And then there was John’s Engineering approach. I liked the idea but thought, as I did with a lot of the HCI stuff, that there was a little too much emphasis on the method and process rather than doing. I felt at the time that focusing so much on how to go about a thing gets in the way of doing it.
OK, so how has this translated to work? Well, a lot of what I currently do has an engineer-ish quality (sorry John, I can’t remember the exact definition, but I feel I’m sticking to the spirit of the thing). By this I mean we make changes for a reason, measure the impact and make adjustments based on the feedback.
I work for commercial clients who generally want to sell more. This, for me, involves running quite a lot of live tests: A/B, multivariate and personalisation. In short, creating alternate versions of a web page, splitting the traffic so that different users are given different pages and measuring the results. The goal is always to outperform what is currently there (the default page). Oh, and personalisation isn’t as grand as it sounds. It’s a great word, as it immediately conjures up in clients’ minds a world where every customer is dealt with on a one-to-one basis; bespoke tailoring always pops into my head when I hear it. The reality is, we operate in broader strokes: you are a return visitor, or you are at university, then this is for you.
When I started doing live testing I was not particularly well informed. From chatting to others, who knew a bit more than me, I was open to the idea that what I would consider to be a well designed page had little bearing on what would work. I had been shown pages that were truly ugly, not well organised and had nothing that looked as if there was a sense of purpose behind them, but had apparently performed miraculously. So God is not a designer?
The first few years of running these tests were emotionally exhausting. Sometimes we (me, project manager and client) would get lucky, most of the time we wouldn’t. We would set a test live and then watch the results day by day. It’s like gambling at the race track (I’m not a gambler): you pick your favourite and then watch it go. Horse racing is probably more rational, horses have form. This was a crazy horse race, a horse race organised by the Marx Brothers. We would see a particular design top the results table for a week. When we returned on Monday morning the picture had reversed. This happened time and time again. What made it more frustrating was that the current winner had hit a level of significance where it would be reasonable to call it (as a winner): 95%, or 99%, and in some cases higher. Not so much engineering, more the occult, Black Magic perhaps. We got to the point that when our horse was winning, we wanted to stop the race. Definitely not engineering.
This left me feeling like I wanted to get out of the live testing business. We lost a few project managers along the way, and client side there was quite a bit of shuffling. However, I didn’t have much choice, so I started chatting with a colleague and scratching my head. I wondered what a test would look like if we just ran the same design but showed it to different groups. Then, after some more chatting, my colleague helped me realise that I didn’t need to run the test, I could simulate it. The basic ingredients of one of these tests are a number of experiences (we have run tests from 3 to 128 variations of a page), each assigned to a group of users (once a user is allocated to an experience this is the only experience they will see), and measured against some metric (how many tickets purchased). So with a bit of JavaScript I built a simulation to see what the results of this test would look like if there were absolutely no variation in the pages shown. All I had to do was plug in the number of variations, the size of the population (how much traffic will see the test) and the baseline conversion rate. And voila, what you get is a Marx Brothers horse race. Great fun, but not good for the nerves.
Here is the simulation, so that you can have a go yourselves. It should give you an idea of the range of variation you can get due to the sample. Play around with the numbers, hit the refresh button a few times, and see how often you can get a significant result (anything above 95% confidence) by essentially doing nothing.
http://www.paperst.co.uk/MVT/mvt.htm
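For anyone who would rather read code than click, here is a minimal sketch of the same idea. It is not the original JavaScript behind the link; it is written in Python, and the function names (simulate_aa_test, results_table), the example numbers and the choice of a pooled two-proportion z-test for the confidence figure are all assumptions made for illustration. Every group is shown the same page, so any "uplift" it reports comes purely from who happened to land in which group.

```python
import math
import random


def simulate_aa_test(variations, visitors, baseline_rate, seed=None):
    """Randomly assign visitors to identical experiences and count conversions."""
    rng = random.Random(seed)
    visits = [0] * variations
    conversions = [0] * variations
    for _ in range(visitors):
        group = rng.randrange(variations)   # random allocation, one group per visitor
        visits[group] += 1
        if rng.random() < baseline_rate:    # the same true rate for every group
            conversions[group] += 1
    return visits, conversions


def results_table(visits, conversions, default=0):
    """Return (rate, uplift, confidence) per group, measured against the default.

    Confidence is 1 minus the two-sided p-value of a pooled two-proportion
    z-test. That test is an assumed choice, not necessarily the one used
    by the original tool.
    """
    n0, c0 = visits[default], conversions[default]
    rate0 = c0 / n0 if n0 else 0.0
    rows = []
    for n, c in zip(visits, conversions):
        rate = c / n if n else 0.0
        uplift = (rate - rate0) / rate0 if rate0 else 0.0
        total = n + n0
        pooled = (c + c0) / total if total else 0.0
        se = math.sqrt(pooled * (1 - pooled) * (1 / n + 1 / n0)) if n and n0 else 0.0
        z = (rate - rate0) / se if se else 0.0
        rows.append((rate, uplift, math.erf(abs(z) / math.sqrt(2))))
    return rows


if __name__ == "__main__":
    # Illustrative inputs only: 4 identical pages, 20,000 visitors, 3% baseline.
    visits, conversions = simulate_aa_test(variations=4, visitors=20000, baseline_rate=0.03)
    for i, (rate, uplift, conf) in enumerate(results_table(visits, conversions)):
        label = "default" if i == 0 else f"variation {i}"
        print(f"{label:12s} rate={rate:.4f} uplift={uplift:+.1%} confidence={conf:.1%}")
```

Running it a few times without a seed is the equivalent of hitting the refresh button: the table reshuffles on each run even though nothing about the pages has changed.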
The key to this is random distribution. As we can’t control who sees each experience, we have to assign them randomly in order to remove any sampling bias. Our population that gets randomly assigned to a particular group will be made up of convertors and non-convertors. Some groups will have more convertors, others fewer. One of these groups will be our default. It could be at the top of the list or at the bottom, who knows? The uplift (positive or negative) is calculated off the default. As it’s random, it changes each time we run it: this week my horse is winning, next week it’s someone else’s. I call this simulation the Black Magic box. So, in the words of Forrest Gump, “Live testing is like a box of chocolates”.
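To put a rough number on "this week my horse is winning", the sketch below repeats a do-nothing test many times and counts how often at least one variation beats or trails the default at 95% confidence or better. As before, this is an illustration in Python, not the Black Magic box itself: the names (run_once, confidence, false_winner_rate), the example inputs and the pooled two-proportion z-test are assumptions.

```python
import math
import random


def run_once(variations, visitors, baseline_rate, rng):
    """One do-nothing test: identical pages, visitors assigned at random."""
    visits = [0] * variations
    conversions = [0] * variations
    for _ in range(visitors):
        group = rng.randrange(variations)
        visits[group] += 1
        if rng.random() < baseline_rate:
            conversions[group] += 1
    return visits, conversions


def confidence(n_a, c_a, n_b, c_b):
    """1 minus the two-sided p-value of a pooled two-proportion z-test (assumed test)."""
    if n_a == 0 or n_b == 0:
        return 0.0
    pooled = (c_a + c_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0
    z = (c_b / n_b - c_a / n_a) / se
    return math.erf(abs(z) / math.sqrt(2))


def false_winner_rate(runs, variations, visitors, baseline_rate,
                      threshold=0.95, seed=None):
    """Share of runs in which any non-default group looks significant vs the default."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        visits, conv = run_once(variations, visitors, baseline_rate, rng)
        if any(confidence(visits[0], conv[0], visits[i], conv[i]) >= threshold
               for i in range(1, variations)):
            hits += 1
    return hits / runs


if __name__ == "__main__":
    # Illustrative inputs only: compare a 2-way test with an 8-way test.
    for k in (2, 8):
        rate = false_winner_rate(runs=200, variations=k, visitors=5000, baseline_rate=0.05)
        print(f"{k} identical pages: {rate:.0%} of runs showed a 'significant' result")
```

With a single challenger, roughly one run in twenty should cross the 95% line by chance alone. The more variations are compared against the default, the more likely it becomes that at least one of them does.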
1996/97 James Sinclair