A Layered Approach to Better Data Cleansing: How Getwizer Optimizes Survey Data through Automation

  • By Alon Ravid
  • May 11, 2022
  • 3 min

The quality of research data can make or break the business decisions that stem from it. Yet the raw data of market research is a mixed bag at best. Among the thousands of open-ended survey responses, one finds both valuable insights and quite a bit of bad or “dirty” data. From irrelevant, inaccurate, or fragmentary input to poorly vetted respondents, duplicate submissions, and repetitions, there’s no shortage of potential problems. Data cleansing is the process of eliminating them from the sample.

Sorting the wheat from the chaff

Manual, post-panel data cleansing is painfully inefficient. With 5-10% of responses unusable on average, sorting through them in the field is slow, laborious, and expensive. It can also shrink the sample significantly, which means the panel has to be provisioned again.

Getwizer’s 3-layer solution

Getwizer’s advanced consumer insights platform features intelligent automated data optimization that delivers quality results in real time, eliminating the need for manual data processing. This rapid, nuanced methodology consists of three successive steps.

Step one: the panel

The Getwizer algorithm starts by ensuring the quality of the survey panel. Panelists must represent an appropriate demographic and regional cross section of the population. The personal information entered by the panelists is scanned and verified, giving the client full control over vetting and categorizing panelists for specific target panels.

Step two: the survey

The Getwizer platform applies several artificial intelligence techniques in real time, including Natural Language Processing (NLP), which analyzes the text of open-ended responses for relevance and readability. The algorithm also checks for answering patterns, inconsistent responses, and “speeders” who complete the survey too fast to provide quality answers.
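
To make two of these checks concrete, here is a minimal Python sketch of flagging speeders and one common answering pattern, "straightlining" (giving the same rating to every item in a grid). It is an illustration only, not Getwizer's actual algorithm, which this article does not describe. The function names, the one-third-of-median speed cutoff, and the sample data are all assumptions made for the example.

```python
from statistics import median

def flag_speeders(durations, fraction=0.33):
    """Return IDs of respondents who finished in under `fraction` of the median time.

    `durations` maps respondent ID -> completion time in seconds.
    The 0.33 cutoff is an assumed rule of thumb, not a Getwizer setting.
    """
    cutoff = median(durations.values()) * fraction
    return {rid for rid, seconds in durations.items() if seconds < cutoff}

def flag_straightliners(grids, min_items=5):
    """Return IDs of respondents who gave one identical answer across a rating grid.

    `grids` maps respondent ID -> list of ratings for the grid's items.
    Grids shorter than `min_items` are ignored, since identical answers
    on a very short grid are not suspicious on their own.
    """
    return {
        rid
        for rid, answers in grids.items()
        if len(answers) >= min_items and len(set(answers)) == 1
    }

# Hypothetical sample data for five respondents.
durations = {"r1": 310, "r2": 295, "r3": 62, "r4": 340, "r5": 280}
grids = {
    "r1": [4, 2, 5, 3, 4],
    "r2": [3, 3, 3, 3, 3],
    "r3": [1, 1, 1, 1, 1],
    "r4": [5, 4, 4, 2, 3],
    "r5": [2, 3, 2, 4, 5],
}

speeders = flag_speeders(durations)
straightliners = flag_straightliners(grids)
# A respondent caught by both checks is a strong candidate for removal.
flagged_by_both = speeders & straightliners
```

In practice a respondent would probably be dropped only when several signals agree, which is why the sketch ends by intersecting the two sets rather than removing anyone on a single flag.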

Step three: the data

Once you’ve got the raw dataset, Getwizer’s algorithm runs through it to weed out problematic data such as expletives, typos, gibberish, and any other unusable responses that might skew or invalidate your insights. The end result is a clean, usable dataset, achieved in a fraction of the time and cost that manual processing would take.

This real-time cleansing algorithm automatically corrects typos and name variants, and understands slang. If necessary, you can also coach it to distinguish between correct and incorrect analysis as per your special requirements. 
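
The filtering part of this step can be pictured with a short Python sketch. It drops open-ended responses that are too short, look like keyboard mashing, contain a blocklisted word, or duplicate an earlier answer. This is a simplified stand-in under stated assumptions, not Getwizer's method: the heuristics (vowel ratio, repeated characters), the thresholds, the placeholder blocklist, and all names are hypothetical, and typo correction and slang handling are left out entirely.

```python
import re

# Placeholder blocklist; a real one would be far longer and language-specific.
BLOCKLIST = {"damn", "crap"}
VOWELS = set("aeiou")

def is_unusable(text, min_letters=3, min_vowel_ratio=0.2):
    """Crude heuristic for responses that are too short or look like gibberish."""
    lowered = text.lower()
    letters = [c for c in lowered if c.isalpha()]
    if len(letters) < min_letters:
        return True  # too short to be informative
    vowel_ratio = sum(c in VOWELS for c in letters) / len(letters)
    # The same character five or more times in a row, e.g. "goooooood".
    has_long_run = re.search(r"(.)\1{4,}", lowered) is not None
    return vowel_ratio < min_vowel_ratio or has_long_run

def clean_responses(responses):
    """Return the responses that survive all filters, in their original order."""
    seen = set()
    kept = []
    for text in responses:
        # Normalize case and whitespace so near-identical duplicates match.
        normalized = " ".join(text.lower().split())
        words = set(re.findall(r"[a-z']+", normalized))
        if is_unusable(normalized):
            continue
        if words & BLOCKLIST:
            continue
        if normalized in seen:
            continue
        seen.add(normalized)
        kept.append(text)
    return kept

# Hypothetical open-ended answers.
raw = [
    "I like the new packaging",
    "asdfgh jkl",
    "i like the  new packaging",
    "This is crap",
    "goooooood",
    "ok",
]
cleaned = clean_responses(raw)
```

Rule-based filters like these are cheap and easy to audit, but they misfire on legitimate short answers and on languages with different letter statistics, which is one reason a production system would lean on trained language models instead.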

Getting the picture

Researchers often allow for image-based responses. That’s why, in addition to text, Getwizer’s data cleansing also includes tagging and categorization for user-generated images. Using machine vision algorithms, the system enables research teams to quickly gain insights from images, without the need to manually sift through hundreds of poorly-shot smartphone photos.

Cleaner, faster, better

With Getwizer’s real-time data cleansing, speed and quality are no longer a tradeoff. You get clean, high-quality data to fuel your business insights – while dramatically reducing your survey costs and improving efficiency. It’s a win-win.

Find out how it’s done! Book a demo.
