Top 11 A/B Testing Tips

  1. Always test. You should constantly be testing something on your direct mail pieces, emails, web content, and telemarketing/in-person scripts. You should always be looking to improve. Your existing materials may end up being the best, but you don’t know that until you test them.
  2. Test one item at a time. To learn what makes the difference, change just one element at a time. It might be the message, colour, envelope, greeting, or another small piece of the campaign. If you change both the colour and the reply device, you will not know which made the difference if the test performs better than your existing package. Certain software allows you to test multiple items at a time on your website. This should be reserved for relatively minor elements, rather than the layout.
  3. Test for statistical significance. Though the response rate or average gift may be higher on your test package, you need to test if this is statistically significant or not. Average gifts can be brought up by one large gift, making it look like the test package performed better, when in actuality it performed worse, aside from the one large gift. 
  4. Ensure your test group has enough recipients. This can be a tricky number to determine, as it really depends on how sure you want to be that the differences seen aren’t just due to chance. I recommend a test group at least one-third the size of the control group. For the exact numbers needed for a statistically significant test, contact us and get a copy of our campaign analysis Excel template.
  5. When testing website elements, be patient. Depending on your site traffic, let the test run for anywhere from a couple of days to a couple of weeks.
  6. Prioritize which tests to run first based on implementation difficulty and opportunity. For webpage testing, start with the most frequent landing page, which may not be your home page, as this is the page most customers see first. Top exit pages, labeled % exit in Google Analytics, may also indicate a problem and an opportunity for improvement.
  7. When testing web or email, concentrate your results analysis on the donation rate, rather than just the click-through rate.
  8. Ensure your control and test group are selected randomly. Random selection ensures that other factors don’t influence the results. For instance, if your test group contained only donors in one post code and your control group contained only donors in another post code, the difference in demographics may be the reason for a difference in donations rather than the tested element.
  9. Run your test and control groups at the same time. Because time of day, week, or month may influence response rates and donation amounts, the test and control need to be done at the same time.
  10. Do not worry about the possible loss of revenue. If you are changing only one element in a package, script, email, or webpage, you are unlikely to experience a large loss of revenue. To improve, you must take risks.
  11. Don’t discount elements that were previously less effective than a test element. Test elements every so often, as effective packages sometimes experience fatigue and may just need to be rotated out for a time.  
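
Tips 3, 4, and 8 can be sketched in a few lines of Python. This is a minimal illustration, not the method behind the Excel template mentioned above: it assumes a simple random split and a two-proportion z-test on response rates, and the function names and the sample counts (3,000 control recipients, 1,000 test recipients) are made up for the example. It does not cover average gift, which needs a different test because one large gift can skew the mean.

```python
import math
import random


def split_randomly(recipients, test_fraction=0.25, seed=42):
    """Shuffle recipients and split them into (control, test) groups.

    A test_fraction of 0.25 gives a test group one-third the size of
    the control group.
    """
    shuffled = list(recipients)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def two_proportion_z_test(responses_a, size_a, responses_b, size_b):
    """Return (z, two-sided p-value) for the difference in response rates."""
    rate_a = responses_a / size_a
    rate_b = responses_b / size_b
    pooled = (responses_a + responses_b) / (size_a + size_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / size_a + 1 / size_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical mailing: 4,000 donor IDs split into control and test.
control, test = split_randomly(range(4000))

# Hypothetical results: 90 of 3,000 control and 38 of 1,000 test responded.
z, p = two_proportion_z_test(90, 3000, 38, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

In this made-up case the test package has the higher response rate (3.8% against 3.0%), yet the p-value is well above the usual 0.05 threshold, so the difference could easily be due to chance.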

Zero is Not the Same as Missing

Zero is not the same as a missing value. A zero is a specific, known value, while a missing value indicates that the value is unknown, and it is almost never appropriate to substitute one for the other. Many people replace missing values with zeros to make their graphs look better or to prepare the data for modeling. For instance, when we are tramping the Kepler and counting the number of people on the track, we would have to leave the count blank with the view below. We simply cannot see all of the track, so we cannot know how many people are on it at that moment by visual inspection.

On the other hand, if we are counting the number of people on the deck of the hut below, we can easily see there are zero people. 

There are rare times when replacing missing values with zeros might be appropriate. For instance, if your CRM has no pledges for a donor prospect, most reports would by default show this donor's giving as a missing value. However, you know that they have not given anything yet, so replacing their giving amount with a zero on reports is appropriate. By contrast, if you track donors' number of children, replacing a missing value with a zero would not be appropriate, because an unrecorded number is not the same as having no children. Taking the time to think about each value will help you decide when it is appropriate to replace missing values with zeros (or vice versa).
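
To see how much the substitution can distort a simple statistic, here is a small sketch using the number-of-children example. The data and function names are invented for illustration, and `None` stands in for a missing value.

```python
def mean_known(values):
    """Average only the known values; None marks a missing value."""
    known = [v for v in values if v is not None]
    return sum(known) / len(known) if known else None


def mean_zero_filled(values):
    """Average after replacing every missing value with zero."""
    filled = [0 if v is None else v for v in values]
    return sum(filled) / len(filled) if filled else None


# Hypothetical donor records: two donors have no recorded number of children.
children = [2, None, 3, None, 1]

print(mean_known(children))        # 2.0
print(mean_zero_filled(children))  # 1.2
```

Filling with zeros quietly claims that the two unknown donors have no children, which pulls the average down from 2.0 to 1.2.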

Dashboarding for Data Driven Decisions

Getting information in a timely, easy-to-understand way is not only expected by business professionals, it is necessary for businesses to stay competitive. Dashboards allow a user to get needed information at a glance. Stephen Few, a data visualization expert, defines a dashboard as “a visual display of the most important information needed to achieve one or more objectives; consolidated and arranged on a single screen so the information can be monitored at a glance”. The design of a dashboard can either facilitate data-driven decision making or hinder it, and getting the design right can be tricky.


There are a lot of tools that you can use for dashboarding these days, some of them open-source and free. Some tools give users the power to customize their dashboard by arranging elements in the manner that best suits their needs, while others require a developer to design and change all elements. Power BI by Microsoft is quickly gaining in popularity for SQL-based data, particularly as it has a free user level. Tableau can produce excellent visualizations and has great customization, but tends to be picky about the format of the data fed into the tool. There are several tools that allow you to program a dashboard in Java, but these require a developer for any future changes. In all cases, you should work closely with the DBA and/or BI developer to ensure the data fed into the dashboard has the correct structure for the tool, allowing you to fully use the features and accurately represent the data. Also, ensure your data is complete and correct, or your dashboard will not give you accurate measures.
 

 Desk.com dashboard sample


When starting a dashboarding project, it is necessary to collect the requirements from business owners. It is also key to define who the user group is and whether only one dashboard is being built or one per business area. Avoid creating a dashboard for each individual; separate dashboards for separate business units, however, allow each unit to see the data that is most pertinent to it. Try to understand what data the business owners use to make daily decisions. Talk to several of the users in order to get a full picture of how they might use data to assist in their work. A data visualization expert should put together the “how” of the dashboard, while users and business owners should define the “what”.


Once you have collected the information users rely on for decision making, divide the data elements into those that will go into the dashboard and those that will go into a report or drill-down area. When possible, use software that has drill-down capabilities, which allow the user to click on a data element to pull up further details. This greatly expands the detail you can include in your dashboard while keeping the main dashboard simple and free of clutter.
 

 Tableau sample dashboard


You should strive to have fewer than 10 data elements in the dashboard. The dashboard data should change often, making it useful to see that piece of data on a daily basis. If a data element changes only once per month or less frequently, or changes by inconsequential amounts over long periods of time, it is better placed in a report than on a dashboard. Also, if data is complicated and not easily put into a single data visualization, it is best to reserve this data for reports or drill-down areas of a dashboard.


Design the dashboard for simplicity and user-friendliness. There should be zero learning curve, and users should immediately understand the data they are viewing. Context should be given for any data element whose meaning is not well known or obvious. For instance, if a percentage is given as a figure on the dashboard, indicate visually or with additional numbers/text whether this number is good or bad, better or worse (e.g. than last month or the average).
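
As a rough sketch of what "giving context" can mean in practice, the helper below turns a bare percentage into a label that states the direction of change and whether that change is good or bad. The function name, the metric names, and the figures are all hypothetical; a real dashboard tool would usually do this with an arrow or colour rather than text.

```python
def describe_metric(name, current, previous, higher_is_better=True):
    """Return a one-line label comparing a percentage with last month's value."""
    change = current - previous
    if change == 0:
        return f"{name}: {current:.1f}% (no change from last month)"
    direction = "up" if change > 0 else "down"
    # A rise is good for some metrics and bad for others.
    improved = (change > 0) == higher_is_better
    verdict = "better" if improved else "worse"
    return (
        f"{name}: {current:.1f}% "
        f"({direction} {abs(change):.1f} points from last month, {verdict})"
    )


print(describe_metric("Donor retention", 46.5, 44.0))
print(describe_metric("Bounce rate", 38.0, 41.5, higher_is_better=False))
```

The second call shows why the good/bad framing matters: a falling number is an improvement for a metric such as bounce rate.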
 

 Geckoboard.com sample dashboard


Keep visualizations to those that are easy to understand. There is a great deal of science behind the perception of data visualizations and there is a reason you see particular visualizations frequently. Just because there is a new, fancy chart available in the tool of your choice doesn’t mean you should use it, particularly in a dashboard. Save complex visualizations for reports that a user will have time to study. Avoid pie charts and doughnut charts, as these are almost always the wrong choice for conveying data. Bar charts, line graphs, maps, gauges, and simple tables generally make the best visualizations on a dashboard. Arrows, sparklines, and bullet charts can give your text charts a visual element that allows for easy understanding of directionality.


Try to use a logical order for the data elements, particularly if there are any time-dependent elements. If two visualizations relate to one another, put them next to each other. If you have visualizations relating to one-month, one-year, and ten-year performance, put them in the order you would expect to read them in, left to right, top to bottom. Use different colours or styles to allow the user to immediately find the data element they are looking for, once they are familiar with the dashboard. Do not try to make the dashboard a piece of art. Art is something you want the user to stare at for long periods of time, while you want users of a dashboard to view it for a few seconds and come away with data useful for their decision making.
 

 Piraeus Consulting Tableau Dashboard


Review the dashboard periodically. As business processes and needs change, so too should the dashboard. Check in with users to see if any element is no longer useful, or if there are additional items needed. A seasonal element may be appropriate at certain times of the year, such as a year-end revenue visualization. Once the year-end has passed, this element may not be needed for another 11 months, so it should not be kept on the dashboard.

18 Best Practices for Dashboards

  1. Choose a tool that is easy to update and fits your data structure, or fit your data structure to the tool.
  2. Choose a tool that allows your users the amount of desired control for customization. 
  3. Ensure your data is complete and correct.
  4. Create a view most pertinent to a particular group. Don’t create a separate dashboard for each person.
  5. Data elements should be those that promote decisions and action, not just nice-to-know.
  6. Data should change often enough to be useful. Data that rarely changes should be saved for reports.
  7. The data elements chosen should be those that drive action, rather than those that satisfy curiosity. 
  8. Simplicity is key. Keep the dashboard to fewer than 10 data elements.
  9. Keep the dashboard to a size that fits on one screen. Drill-down reports can be used for further details.
  10. Dashboards are not pieces of art. In fact, the most effective dashboards are those looked at for the least amount of time. 
  11. Graphs should be those that are proven to effectively convey data and can be properly perceived by users.
  12. Don’t use pie graphs (or doughnuts). Though related to number 11, it bears repeating as some need to be broken of this bad habit.
  13. Use a logical order to the elements.
  14. Use a cohesive style throughout, but vary the elements enough that the user can quickly find what they’re looking for.
  15. Use colours to highlight certain areas and allow the user to quickly find elements.
  16. Define the framework of the measures. Is the measure up/down from previous or good/bad, etc.?
  17. Ensure the learning curve for users is zero.
  18. Review the dashboard elements periodically and update as business needs change.

How to hire a data scientist

So you've decided you want to hire a data scientist. What kind of considerations should you make when writing the job description? How can you find the best person for the type of work you want done?

First, let's look at the definition of a data scientist. Data scientists are generally expected to be great at coding, math and statistics, graphic design, business understanding, and translating technical knowledge into business insights. The coding allows the data scientist to combine data sources and transform the data into a usable form. They may also use their coding skills to create software, apps, or APIs to function as a front-end for a machine learning algorithm or predictive model. The mathematical and statistical skills are at the core of a data scientist’s worth to your organization. These are used to build the predictive models, machine learning algorithms, tests for significance and appropriate data structures, and optimization calculations. From there, a data scientist must be able to translate these algorithms into real value for the organization. This means they must understand how your business works, translate complicated technical projects into insights and uses understandable by business line owners, and create visualizations that convey these ideas. How can you find someone like this, you ask? Easy: a typical data scientist is pictured below.

unicorn-1615680_1280.jpg

Since no one, or at least very few people, can claim to be an expert in all five of the areas core to data science, you need to outline which skills are most important to your company’s needs. Determine what kind of projects your new hire will be working on. If you are looking to build a piece of software that utilizes machine learning, it may be most important that the prospect have high coding and analytical ability, while they may not need excellent graphic design or presentation skills. If your data scientist will be primarily working on predictive models that will be used in the business, some coding ability can be sacrificed for better business and graphic design skills. Some of the tasks could be done by an existing employee or by someone cheaper than a data scientist, such as an intern.

If you have a data scientist who has great analytical, graphic design, and presentation skills, but has less experience doing data extraction, loading, and transformation (ETL), you can generally find someone else to do this function. One thing you should not worry too much about is specific software or programming languages, as someone who knows how to program can easily learn a new language or data structure and someone who knows many different software programs can pick up a new one with little trouble.


Now that you have decided what skills are most important to your open position, you need to attract the right people to apply. If you want someone who can hit the ground running, make sure you set the pay correctly at six figures. To take a cheaper route, you can hire someone junior with little experience; however, you will need to give them more time to learn on the job. Of course, once these data scientists have upskilled sufficiently, they will likely expect more pay and go looking for it if you don’t offer it. It can help your case if you offer some other benefits, such as more annual leave, flexible working hours and/or location, or subsidized education. You could also try shorter working hours with the same amount of work, encouraging efficiency and high productivity for the hours they are at work.

There are plenty of blog posts and articles that list data science interview questions. I’ve listed some below. Most data scientists wouldn't be able to answer every one of these, nor would all apply to the position they are applying for. Though these might be a good starting point, make sure you understand what you are asking and adapt the question to relate to your business, or you will not know if the interviewee answered appropriately. If you aren’t technically inclined and have no technical colleague available to sit in on the interview with you, it is best to check that the data scientist is able to translate technical information into business language, as this will likely be important for their position. I caution you against including an exercise or presentation requirement in your initial interviews. There are plenty of companies looking for data scientists and any barriers to the first interview should be avoided. A short exercise or presentation can be included in second interviews, or you can ask for a past project, anonymized, after the first interview.

KDNuggets.com

RPubs.com

Springboard.com

AnalyticsVidhya.com

In short, finding the right data scientist for your organization can be difficult, but if you keep your expectations realistic, it’s not impossible. Data scientists by nature are curious and love to learn, so once you have them, you can often keep them in your organization by keeping them interested and engaged.

What data science can’t predict

Many may be shocked that Nate Silver didn’t accurately predict the outcome of the US election, getting some of the key states wrong. If you don’t know who Nate Silver is, in short, he is a data geek (he hates the term data scientist) who has gained fame from correct political and sports predictions, bringing predictive analytics to the forefront of election coverage. He originally stated that Donald Trump had only a 2% chance of even winning the Republican nomination in the first place. One must remember that all of these predictions are probabilities, not certainties.

 FiveThirtyEight.com's last prediction map for the 2016 US election


You can think of this like a weather forecast. If there is only a 30% chance of rain on any given day, it doesn’t mean it won’t rain that day. It simply means that it is more likely not to rain than to rain. This is perhaps a bad example, as predicting weather is infinitely easier than predicting human behavior. And predicting weather is not easy.


Nate Silver’s website, fivethirtyeight.com, predicted Clinton had a 55% chance of winning North Carolina. Trump took the state. Would you take an umbrella if someone told you there was a 55% chance of it not raining that day? Probably, because these percentages are far from certainty. Clinton was given the same chance of winning Florida. With 98% of votes counted, Trump is ahead of Clinton there by 128,630 votes, far fewer than the votes cast for a third party candidate.
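
A quick simulation makes the point about 55% forecasts concrete. This is only a sketch with a made-up function name: it treats each contest as an independent coin flip weighted 55/45, which real state results are not (they tend to move together), so it illustrates what a probability means rather than modelling the election.

```python
import random


def simulate_upsets(win_probability, n_contests, seed=0):
    """Return the share of simulated contests in which the favourite loses."""
    rng = random.Random(seed)
    upsets = sum(
        1 for _ in range(n_contests) if rng.random() >= win_probability
    )
    return upsets / n_contests


# A favourite given a 55% chance, replayed 100,000 times.
share = simulate_upsets(0.55, 100_000)
print(f"The favourite lost in {share:.1%} of simulated contests")
```

A forecast that is exactly right about the 55% will still see the favourite lose close to 45% of the time, so a single loss says very little about whether the forecast was sound.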

 CNN Florida Election R


In short, data science can’t predict everything. It has the most trouble when human emotions are involved, as they are during an election. If data scientists could predict everything, we would probably all be rich, own an island, and be living the high life, having accurately predicted all stock market movements. Unfortunately the stock market, like an election, is driven primarily by emotions and, thus, is not accurately predictable.

 CNN Election Results


When you look at the US election results, don’t blame the analysts for predicting that America would come to its senses. Don’t lose faith in analytics, as they weren’t as far off as the outcome would suggest. Analytics uses logic, and illogical behavior cannot be accurately predicted.

Fun with Paralympic Stats

Kiwi athletes are once again doing well in Rio, this time in the Paralympics. New Zealand has participated in each Summer Paralympics since 1968, winning a total of 169 medals prior to 2016. The most successful Kiwi Paralympian as of today is the late Eve M. Rimmer, who competed in athletics at each Games from 1968 to 1980, winning 14 medals. Swimmer Sophie Pascoe today won her 13th medal, a tally amassed over 3 Paralympic Games and equaling nearly a quarter of all New Zealand Paralympic swimming medals ever. Pascoe has 2 more races this week in which to tie or beat the record for most successful NZ Paralympian. If not in Rio, Pascoe's young age of 23 makes it likely we will see her go for more medals in 4 years' time in Tokyo.

New Zealand is currently at the top of the medal per capita chart. However, we are third in gold medals per capita, behind Fiji and Jamaica. At the 2012 London Paralympics, Kiwis came in 1st on the medals per capita ratings and 4th on the gold medals per capita ratings, behind Grenada, Jamaica, and Trinidad/Tobago. New Zealand is currently 11th overall on the medal chart in Rio. 

The 31 New Zealand athletes range in age from 15 (swimmer Tupou Neiufi) to 58 (sailor Chris Sharp). By comparison, in London the oldest Kiwi Paralympian was 61 and the youngest 13. Ten of the athletes are from the South Island, with the majority of the athletics team residing in Dunedin.

Most of the Kiwi athletes are students or full-time athletes, though 10 have additional occupations outside the home.

Kate Horan, a para-cyclist and mother of three, is competing in her third Paralympics, but her first as a cyclist. She competed in athletics as a sprinter in Athens in 2004 and Beijing in 2008.

Rio has a total of 4,350 athletes competing in 22 sports with 526 medal events. Triathlon and canoeing are making their first appearance at the Paralympics this year.

Some Fun Olympic Stats

At the time of this writing, Michael Phelps has just won his 22nd gold medal and New Zealand's Eric Murray and Hamish Bond have won the 43rd New Zealand gold medal ever. In other words, if Phelps were a Kiwi, he would have more than half of the gold medals won by the country since the first Kiwis competed in 1920 (though a few Kiwis competed for Great Britain and Australasia in earlier Olympics).

New Zealand fares much better in the Summer games than the Winter games. The only Winter Olympic medal for New Zealand was a silver won in 1992 by Annelise Coberger for alpine skiing. She was the first medalist from the Southern Hemisphere in the Winter Olympics.

New Zealand's biggest strength comes in boating of all kinds, having won a medal in rowing, canoeing, kayaking, or sailing in all but 2 Olympics since 1956. 49 of New Zealand's 106 medals have been from these sports.

The Olympics of 1900 had only 95 events in 20 different sports. The Rio Olympics has 306 events in 35 sports. Only athletics, swimming, cycling, fencing and gymnastics have been in each modern Olympic games.

In a ranking of Olympic medals per million people, New Zealand comes in 12th with 23.7 medals per million Kiwis. However, the top 10 is primarily populated by countries with high Winter Olympics medal counts. So far in the Rio Olympics, New Zealand ranks 2nd in medals per capita, just behind Slovenia.
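
The per-capita figure is simple arithmetic, sketched below with hypothetical function names. The post gives 106 total medals and 23.7 medals per million but not the population used, so the population here is back-calculated from those two numbers and is an assumption, not a quoted statistic.

```python
def medals_per_million(medals, population):
    """Medals won per one million residents."""
    return medals / (population / 1_000_000)


def implied_population(medals, per_million):
    """Population implied by a medal count and a medals-per-million rate."""
    return medals / per_million * 1_000_000


# Figures from the post: 106 medals, 23.7 medals per million Kiwis.
population = implied_population(106, 23.7)
print(round(population / 1_000_000, 2))               # 4.47 (million people)
print(round(medals_per_million(106, population), 1))  # 23.7
```

An implied population of roughly 4.47 million is in the right range for New Zealand, which suggests the ranking used all 106 medals.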

Another Misleading Graph

In general, using images in your graph should be avoided. Similar to using a bar graph with a vertical axis starting above zero, using images in your graph can mislead an audience. An image is generally perceived as having volume. Increasing the image size to match the desired y-axis value visually increases the image along 3 dimensions.

For example, the graph above uses a tent image to represent the number of tents bought by the non-profit organization from funds donated during two particular campaigns. Though Campaign B only surpassed Campaign A by 10 tents, it visually appears to have increased by 30, because the image was enlarged along 3 axes.
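
One way to put numbers on this effect is to compare how the height, the area, and the apparent volume of an image grow when it is scaled to match a value. The sketch below assumes the image is scaled uniformly, and the tent counts (40 and 50) are invented for illustration since the actual figures are only in the graph.

```python
def perceived_ratios(value_a, value_b):
    """Growth in height, area, and apparent volume when an image is scaled
    uniformly so that its height matches value_b instead of value_a."""
    k = value_b / value_a
    return {"height": k, "area": k ** 2, "volume": k ** 3}


# Hypothetical campaigns: 40 tents and 50 tents, a 25% increase.
ratios = perceived_ratios(40, 50)
for dimension, ratio in ratios.items():
    print(f"{dimension}: {ratio:.2f}x")
```

With these made-up numbers, a 25% increase in tents is drawn as an image with about 56% more area, and it reads as nearly double the volume.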

If you really love images and want to use them in a graph, consider either putting the image within a bar on a bar chart or using the image as a point on your line graph, as seen in the graph below.

Misleading Visualizations: Cropping Axis on Bar Graphs

One of the most common problems in graphs is the use of a cropped axis in a bar graph. This can lead to a visually overstated difference between two numbers. In the graph on the left, which was automatically created by Excel, Campaign B looks far more successful than the other campaigns in terms of average gift. Though Campaign B was more successful than the other campaigns, the average gift for Campaign B is only 10% higher than for Campaign A. However, in a graph with an axis that starts above 0, like the graph on the left, the bar for Campaign B is more than twice the size of the bar for Campaign A, producing a misleading visual representation of the data. The graph on the right shows an accurate visualization of the data, with the axis starting at 0.
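
The distortion is easy to quantify: the drawn height of a bar is its value minus the point where the axis starts. In the sketch below the average gifts ($50 and $55) and the axis start ($46) are hypothetical numbers chosen to match the 10% difference described above, and the function name is my own.

```python
def bar_height_ratio(value_a, value_b, axis_start=0.0):
    """Ratio of the drawn bar heights for two values on a given axis."""
    if axis_start >= min(value_a, value_b):
        raise ValueError("axis_start must be below both values")
    return (value_b - axis_start) / (value_a - axis_start)


# Hypothetical average gifts: Campaign A $50, Campaign B $55 (10% higher).
honest = bar_height_ratio(50, 55)                  # axis starts at 0
cropped = bar_height_ratio(50, 55, axis_start=46)  # axis starts at $46

print(f"Axis from 0:   bar B is {honest:.2f}x bar A")
print(f"Axis from $46: bar B is {cropped:.2f}x bar A")
```

With the axis at 0 the bars differ by the true 10%; with the axis cropped to $46, the same data draws bar B at 2.25 times the height of bar A.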

But what if you want to emphasize the differences in the data, even if they are a small percentage of the data (e.g. a difference of $1000 in bars that are in the millions)? In this case, you have a few options. You can use a table containing the numbers and/or their differences from each other or from an average. Not everything needs to be visually represented, particularly when there are only a few categories. Another option is to use a line graph with a cropped axis. Because a line does not have a solid area like a bar, the differences in size do not confuse the audience in a line graph as they would in a bar graph. One more way to represent this data is to visualize it in terms of difference from an overall average or other reference point. You can decide which method works best; just avoid cropping the axis for any graph with a solid area.
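
The difference-from-average option can be prepared with very little code. This sketch uses invented campaign totals in the millions that differ by $1,000, and a hypothetical helper name; the resulting differences could go straight into a small table or a bar chart whose axis legitimately starts at 0.

```python
def differences_from_mean(values):
    """Map each category to its difference from the overall mean."""
    mean = sum(values.values()) / len(values)
    return {name: value - mean for name, value in values.items()}


# Hypothetical campaign revenue, all close to $2 million.
revenue = {
    "Campaign A": 2_000_000,
    "Campaign B": 2_001_000,
    "Campaign C": 1_999_000,
}

differences = differences_from_mean(revenue)
for name, diff in differences.items():
    print(f"{name}: {diff:+,.0f}")
```

Plotted as raw totals, these three bars would look identical; plotted as differences from the average, the $1,000 gaps are the whole picture and no axis needs to be cropped.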