DMP 1-2-3

Almost every marketer is starting to lean into data management technology. Whether they are trying to build an in-house programmatic practice, use data for site personalization, or obtain the fabled “360 degree user view,” the goal is to get a handle on their data and weaponize it to beat their competition.

In the right hands, a data management platform (DMP) can do some truly wonderful things. With so many use cases, different ways to leverage data technology, and fast moving buzzwords, it’s easy for early conversations to get way too “deep in the weeds” and devolve into discussions of “match rates” and how cross-device identity management works. The truth is that data management technology can be much simpler than you think.

At its most basic level, DMP comes down to “data in” and “data out.” While there are many nuances around the collection, normalization, and activation of the data itself, let’s look at the “data in” story, the “data out” story, and an example of how those two things come together to create an amazing use case for marketers.


The DMP Tie Fighter: The left wing shows data coming into the DMP, and the right wing shows the data activated on various channels.



The “Data In” Story

To most marketers, the voodoo that happens inside the machine isn’t the interesting part of the DMP, but it’s really where the action happens. Understanding the “truth” of user identity (who are all these anonymous people I see on my site and apps?) is what makes the DMP useful in the first place, making one-to-one marketing and understanding customer journeys something that goes beyond AdExchanger article concepts, and starts to really make a difference!

  • Not Just Cookies: Early DMPs focused on mapping cookie IDs to a defined taxonomy and matching those cookies with execution platforms. Most DMPs—from lightweight “media DMPs” inside of DSPs to full-blown “first-party” platforms—handle this type of data collection with ease. Most first-generation DMPs were architected as cookie collection and distribution platforms, meant to associate a cookie with an audience segment, and pass it along to a DSP for targeting. The problem is that people are spending more time in cookie-less environments, and more time on mobile (and other devices). That means today’s DMPs have to have the ability to do more than organize cookies, but also be able to capture a large variety of disparate identity data, which can also include hashed CRM data, data from a point-of-sale (POS) system, and maybe even data from a beacon signal.
  • Ability to Capture Device Data: To a marketer, I look like eight different “Chris O’Haras”: three Apple IDFAs, several Safari unique browser signatures, a Roku device ID, and a hashed e-mail identity or two. These “child identities” must be reconciled to a “Universal ID” that is persistent and collects attributes over time. Most DMPs were architected to store and manage cookies for display advertising, not cross-device applications, so platforms’ ability to ingest highly differentiated structured and unstructured data is all over the map. Yet, with more and more time dedicated to devices instead of desktop, cookies only cover 40% of today’s pie.
  • Embedded Device Graph: Cross-device identification is notoriously difficult, requiring both the ability to identify people through deterministic data (users authenticate across mobile and desktop devices) and the skill to apply smart algorithms across massive datasets to make probabilistic guesses that match users and their devices. Over the next several years, the term “device graph” will figure prominently in our industry, as more companies try to innovate a path to cross-device user identity without data from “walled garden” platforms like Google and Facebook. Since most algorithms operate in the same manner, look for scale of data; the bigger the user set, the more “truth” the algorithms can identify and model to make accurate guesses of user identity.

The “data in” story is the fundamental part of DMP: without being able to ingest all kinds of identifiers and understand the truth of user identity, one-to-one marketing, sequential messaging, and true attribution are impossible.
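To make the idea of reconciling “child identities” to a Universal ID concrete, here is a minimal sketch of the deterministic half of a device graph. It assumes that a link exists whenever the same hashed login is seen on two identifiers; the class name, the identifier strings, and the use of a union-find structure are all illustrative assumptions, not how any particular DMP works.

```python
from collections import defaultdict


class IdentityGraph:
    """Union-find over identifiers; each connected group stands for one person."""

    def __init__(self):
        self._parent = {}

    def _find(self, identifier):
        # Register unseen identifiers as their own root.
        self._parent.setdefault(identifier, identifier)
        root = identifier
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every node on the path directly at the root.
        while self._parent[identifier] != root:
            next_id = self._parent[identifier]
            self._parent[identifier] = root
            identifier = next_id
        return root

    def link(self, id_a, id_b):
        """Record a deterministic link, e.g. the same login seen on both IDs."""
        root_a, root_b = self._find(id_a), self._find(id_b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def universal_ids(self):
        """Return {root identifier: set of all child identities in that group}."""
        groups = defaultdict(set)
        for identifier in list(self._parent):
            groups[self._find(identifier)].add(identifier)
        return dict(groups)


# Hypothetical identifiers observed across devices.
graph = IdentityGraph()
graph.link("email:hash_abc", "idfa:phone_1")
graph.link("email:hash_abc", "cookie:safari_9")
graph.link("email:hash_xyz", "roku:tv_7")
people = graph.universal_ids()
```

In this toy data, three identifiers collapse into one person and two into another. Probabilistic matching would add further links with a confidence score attached, which this sketch leaves out.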

The “Data Out” Story

While the “data in” story gets pretty technical, the “data out” story starts to really resonate with marketers because it ties three key aspects of data-driven marketing together. Here’s what a DMP should be able to do:

  • Reconcile Platform Identity: Just like I look like eight different “Chris O’Haras” based on my device, I also look like 8 different people across media channels. I am a cookie in DataXu, another cookie in Google’s DoubleClick, and yet another cookie on a site like the New York Times. The role of the DMP is to user match with all of these platforms, so that the DMP’s universal identifier (UID) maps to lots of different platform IDs (child identities). That means the DMP must have the ability to connect directly with each platform (a server-to-server integration being preferable), and also the chops to trade data quickly, and frequently.
  • Unify the Data Across Channels: To a marketer, every click, open, like, tweet, download, and view is another speck of gold to mine from a river of data. When aggregated at scale, these data turn into highly valuable nuggets of information we call “insights.” The problem for most marketers that operate across channels (display, video, mobile, site-direct, social, and search, just to name a few) is that the fantastic data points they receive all live separately. You can log into a DSP and get plenty of campaign information, but how do you relate a click in a DSP with a video view, an e-mail “open,” or someone who has watched a YouTube video on an owned and operated channel? The answer is that even the most talented Excel jockey running twelve macros can’t aggregate enough ad reports to get decent insights. You need a “people layer” of data that spans across channels. To a certain extent, who cares what channel performed best, unless you can reconcile the data at the segment level? Maybe Minivan Moms convert at a higher percentage after seeing multiple video ads, but Suburban Dads are more easily converted on display? Without unifying the data across all addressable channels, you are shooting in the dark.
  • Global Delivery Management: The other thing that becomes possible when you tie both cross-device user identity and channel IDs together with a central platform is the ability to manage delivery globally. More on this below!
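The first two bullets can be sketched together: translate each platform’s own ID to the DMP’s universal ID, then roll results up by segment and channel. Everything here (the ID formats, the mapping table, the segment names, and the event tuples) is made-up illustrative data, and a real integration would sync IDs server-to-server rather than from an in-memory dictionary.

```python
from collections import defaultdict

# Hypothetical ID sync table: platform ID -> DMP universal ID.
platform_to_uid = {
    "dsp:ck_11": "u1",
    "video:ck_52": "u1",
    "video:ck_77": "u3",
    "dsp:ck_20": "u2",
    "dsp:ck_31": "u4",
}

# Hypothetical segment membership keyed by universal ID.
segment_of = {
    "u1": "Minivan Moms",
    "u3": "Minivan Moms",
    "u2": "Suburban Dads",
    "u4": "Suburban Dads",
}

# Hypothetical exposure log from separate platforms: (platform ID, channel, converted).
events = [
    ("dsp:ck_11", "display", False),
    ("video:ck_52", "video", True),
    ("video:ck_77", "video", False),
    ("dsp:ck_20", "display", True),
    ("dsp:ck_31", "display", True),
    ("dsp:ck_99", "display", False),  # unmatched ID, cannot be tied to a person
]


def conversion_by_segment_and_channel(events, platform_to_uid, segment_of):
    """Return {(segment, channel): conversion rate} for events that can be matched."""
    totals = defaultdict(lambda: [0, 0])  # [conversions, exposures]
    for platform_id, channel, converted in events:
        uid = platform_to_uid.get(platform_id)
        segment = segment_of.get(uid)
        if segment is None:
            continue  # skip events with no known person or segment
        cell = totals[(segment, channel)]
        cell[0] += int(converted)
        cell[1] += 1
    return {key: conv / seen for key, (conv, seen) in totals.items()}


rates = conversion_by_segment_and_channel(events, platform_to_uid, segment_of)
```

The unmatched event is dropped from the report, which is one reason match rates matter: data that cannot be tied to a universal ID never reaches the “people layer.”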

Global Delivery Management

If I am a different user on each channel, and each channel’s platform or site lets the marketer set its own frequency cap, it is likely that I am being over-served ads. If a marketer runs ads in five channels and caps each one at 10 impressions a month per user, I can receive as many as 50 impressions over the course of a month, and probably more if my devices are not reconciled to a single identity. But what if the ideal frequency to drive conversion is only 10 impressions? The marketer just spent 5 times too much to make an impact. Controlling frequency at the global level means being able to pull budget out of ineffective long-tail impressions and reallocate it to the sweet spot of frequency where users are most likely to convert, and to plug money back into the short tail, where marketers get deduplicated reach.
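The arithmetic in the paragraph above can be checked with a small simulation. This is only a sketch: the channel names, the number of ad requests, and the `simulate` function are assumptions for illustration, and the caps are the ones from the example (10 per channel, 10 as the ideal total).

```python
from collections import Counter


def simulate(ad_requests, per_channel_cap, global_cap=None):
    """Count impressions served to one person under the given caps."""
    per_channel = Counter()
    total = 0
    for channel in ad_requests:
        if per_channel[channel] >= per_channel_cap:
            continue  # this channel has hit its own cap
        if global_cap is not None and total >= global_cap:
            continue  # the person has hit the cross-channel cap
        per_channel[channel] += 1
        total += 1
    return total


channels = ["display", "video", "mobile", "social", "search"]
# Assume each channel gets 15 chances in the month to show this person an ad.
ad_requests = channels * 15

siloed = simulate(ad_requests, per_channel_cap=10)
managed = simulate(ad_requests, per_channel_cap=10, global_cap=10)
overspend_factor = siloed / managed
```

With siloed caps the person ends up with 50 impressions; with a global cap keyed to one identity the count stops at 10, which is the 5x difference described in the text.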

In the above example, 40% of a marketer’s budget was being spent delivering between 1-3 impressions per user every month. Another 20% was spent delivering between 4-7 impressions, which conversion data revealed to be where the majority of conversions were occurring. The rest of the budget (40%) was spent on impressions with little to very little conversion impact.

In this scenario, there are two basic plays to run: Firstly, the marketer wants to completely eliminate the long tail of impressions and reinvest it into more reach. Secondly, the marketer wants to push more people from the short tail down into the “sweet spot” where conversions happen. Cutting off long tail impressions is relatively easy, through sending suppression sets of users to execution platforms.

“Sweet spot targeting” involves understanding when a user has seen her third impression, and knowing the 4th, 5th, and 6th impressions have a higher likelihood of producing an action. That means sending signals to biddable platforms (search and display) to bid higher to win a potentially more valuable user.
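As a rough sketch of the two plays, the snippet below splits users into a suppression set, a bid-up set, and everyone else, based on how many impressions each universal ID has already seen. The impression counts are invented, and the 4-to-7 sweet spot is taken from the budget example above; a real marketer would derive it from conversion data.

```python
# Hypothetical monthly impression counts keyed by universal ID.
impressions_seen = {"u1": 2, "u2": 3, "u3": 5, "u4": 7, "u5": 12, "u6": 30}

SWEET_SPOT_START = 4  # first impression in the high-conversion band (assumed)
SWEET_SPOT_END = 7    # last impression in the high-conversion band (assumed)


def plan_delivery(impressions_seen):
    """Decide what to do with each user's NEXT impression."""
    suppress, bid_up, normal = set(), set(), set()
    for uid, seen in impressions_seen.items():
        next_impression = seen + 1
        if next_impression > SWEET_SPOT_END:
            suppress.add(uid)   # long tail: send to platforms as a suppression set
        elif next_impression >= SWEET_SPOT_START:
            bid_up.add(uid)     # sweet spot: signal biddable platforms to bid higher
        else:
            normal.add(uid)     # short tail: keep building reach at normal bids
    return suppress, bid_up, normal


suppress, bid_up, normal = plan_delivery(impressions_seen)
```

Here the user who has seen three impressions lands in the bid-up set, because the fourth impression is the first one inside the sweet spot.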

It’s Rocket Science, But It’s Not

If you really want to get deep, the nuts and bolts of data management are very complicated, involving real big data science and velocity at Internet speed. That said, applying DMP science to the common problems within addressable marketing is not only accessible—it’s making DMPs the must-have technology for the next ten years, and global delivery management is only one use case out of many. Marketers are starting to understand the importance of capturing the right data (data in), and applying it to addressable channels (data out), and using the insights they collect to optimize their approach to people (not devices).

It’s a great time to be a data-driven marketer!


A Brief History of Banner Time

It’s been a long time since publishers have truly been in control of their inventory, but new trends in procurement methodologies and technology are steadily giving premium publishers the upper hand.

The story of display inventory procurement started with the Publisher Direct Era, when publishers were firmly in control of their banners, and kept them safely hidden behind sales forces and rate cards. Then the Network Era crept in, and smart companies like Tacoda took all the unwanted banners and categorized them. Advertisers liked to buy based on behavior, and publishers liked the extra check at the end of the month for hard-to-sell inventory.

That was no fun for the demand side though. They started the Programmatic Era, building trading desks, and leveraging DSPs to make sure they were the ones scraping a few percentage points from media deals. Why let networks have all of the arbitrage fun? The poor publisher was left to try and fight back with SSPs and more technology to battle the technology that was disintermediating them, kind of like a robot fight on the Science Channel.

But all of a sudden, publishers realized how silly it was to let someone else determine the value of their inventory, and launched the DMP Era. They ingested first-party data from their registration and page activity and created real “auto intenders” and “cereal moms” and wonderful segments that they could use to effectively sell to marketers. Now, every smart publisher knows more about their inventory than 3rd parties, and they can also find their readers across the wider Web through exchanges. A win-win!

Then all of the marketers in the world started reading AdExchanger, and saw the publisher example, and thought, “Wow, good call!” They started to truly understand how much money Programmatic companies were taking out of the investment they earmarked for media (silly marketers, Y U no read Kawaja’s first IAB deck?), and decided to use their own technology and data to power audience targeting. If it were a baseball game, this DMP Era for Marketers would be in the first or second inning, but the pitcher is throwing at a fast pace.

The next thing that happened was the Programmatic Direct Era, which lasted about ten minutes and effectively jumped the shark when Rubicon bought two of the more prominent companies involved (ShinyAds and iSocket). Programmatic Direct marketplaces promised a flip of the yield curve for publishers to expose the “fat middle” of undervalued impressions. They attempted this by placing blocks of inventory in a marketplace, and enabled the publisher to set rates, impression levels, and provide API access directly into their ad server. Alas, a tweak to Google’s API did not an industry make. Marketers loved the idea, but since they use audience as the primary mechanism to value inventory, PD marketplaces failed as stand-alone entities and were gobbled up. Under the steady hand of RTB-based technologies, they slowly evolve based on buy-side methodologies. Again, the demand side foils a perfectly reasonable, publisher-derived procurement scheme!

So, what’s next?

The Programmatic Direct Era still lives, albeit within private marketplaces (PMPs) and Direct Deal functionality. The IAB’s Open Direct protocol remains stuck at 1.0, but there is hope—and this time it’s a change that is positive for both marketers and publishers. The latest Era in inventory procurement is what I call Total Automation. Let me explain.

Say a big auto manufacturer has a DMP and has identified, via purchase information, the exact profile of everyone who buys their minivan. Call them “Van Moms.” Then suppose the publisher, who licenses an instance of the same DMP, is a women-friendly publication chock full of those Van Moms, and of women who just happen to look like Van Moms. It’s pretty easy to pipe those Moms from the marketer right to the publisher. That process, which you might call Programmatic Direct 2.0, is interesting.

It requires no exchanges, no 3rd party data, no DSPs, no “private marketplace,” no SSP, and potentially no agencies (perish the thought!). All it requires is some technology to map users and port them directly into an ad server.

What I just described is happening today, and moving quickly. Marketers are discovering that the change from demo-based buying to purchase-based buying through 1st party data is winning them more customers. Publishers are asking for—and commanding—high CPMs, and those CPMs are backing out for marketers. Thanks to all the crap in open exchanges, paying more for quality premium, “well lit” inventory actually works better than slogging through exchanges trying to find the audience needle in a haystack full of robots and “natural born clickers.”

The new Era of Total Automation will start putting publishers back on the map—but not all of them. The big distinction between the winners and losers will not only be the quality of their audience but, more importantly, the first-party data used to derive that audience. Not long ago, it was easy to apply a layer of 3rd party data and call someone an “auto intender” if they brushed past an article on the latest BMW. But compare that to the quality of an “auto intender” on a car site that has looked at 5 sedans over the last 2 weeks, and also used a loan calculator. There’s no comparison. The latter “intender,” collected from page- and user-level attributes directly by the publisher is 10 times more valuable (or $30 CPM rather than $3, if you like). The reason? That user volunteered real, deterministic information about herself that the publisher can validate. I am willing to bet that an auto manufacturer would pay a high CPM for access to an identified basket of those intenders on an ongoing, “always on” basis.

This is fantastic news for publishers that have great, quality inventory and have implemented a first-party data strategy. It’s even better news for the marketers that have embraced data management, and can extract and find their perfect audience on those sites. The Era of Total Automation will be over when every single marketer has a DMP. At that time, we will discover that there is no longer a glut of display inventory—all of the quality “Van Moms” and “Business Travelers” and the like will be completely spoken for. What will be left is a large pile of unreliable, long tail inventory available for the brave DR marketer and his DSP.

I think both marketers and publishers should welcome this new Era of data-driven one-to-one marketing. The crazy thing is that, once we get it right, it looks just like an anonymized version of direct mail—perhaps the oldest, greatest, most effective and measurable marketing tactic ever invented!

[This post originally appeared in AdExchanger on 7/2/15]


What Marketers Want from AdTech


If you read AdExchanger regularly, you might think that nearly every global marketer has a programmatic trading strategy. They also seem to be leveraging data management technology to get the fabled “360-degree view” of their customers, to whom they are delivering concise one-to-one omnichannel experiences.

The reality is that most marketers are just starting to figure this out. Their experience ranges from asking, “What’s a DMP?” to “Tell me your current thinking on machine-derived segmentation.”

A small, but significant, number of major global marketers are aggressively leaning into data-driven omnichannel marketing, pioneering a trend that is not going anywhere anytime soon. Over the next five years, nearly every global marketer will have a data-management platform (DMP), programmatic strategy and “chief marketing technologist,” a hybrid chief marketing officer/chief information officer that marries marketing and technology. These are exciting times for people in data-driven marketing.

So, what are marketers looking for from technology today? Although these conversations ultimately become technical in nature, you soon discover that marketers want some pretty basic, “table stakes” type of stuff.

Better Segmentation Through First-Party Data 

Marketers spend a lot of time building customer personas. Once a customer is in their customer relationship management (CRM) database and generates some sales data, it’s pretty easy to understand who they are, what they like to buy and where they generally can be found. From a programmatic perspective, these are the equivalent of a car dealer’s “auto intenders,” neatly packaged up by ad networks and data providers to be targeted in exchanges.

That’s still available today, but the amazing amount of robotic traffic, click fraud and media arbitrage has made marketers realize just how loose some segment definitions may be. Data companies have a great deal of incentive to create and sell lots of auto intenders, so marketers are starting to look deeper at how such segments are actually created. It turns out that some auto intenders are people who brushed past a car picture on the web, which lumped them into a $12 cost per mille (CPM) audience segment.

Those days seem to be coming to an abrupt close as marketers increasingly use their own data to curate such segments, and as premium publishers, which do have auto intenders among their readerships, use data-management tools to make highly granular segments available directly to the demand side. Marketers are now willing to pay premium prices for premium audiences in a dynamic being driven by more transparency into how audiences are created in the first place. Audiences composed of first- and second-party data will win every time in a transparent ecosystem.

Less Waste, More Efficiency

Part and parcel of better audience segmentation is less waste and more media efficiency. The old saw, “I know half of my marketing works, I just don’t know which half,” goes away with good data and better attribution.

As an industry, we promised to eliminate waste 20 years ago. The banner ad was supposed to usher in a brave new world of media accountability, but we ended up creating a hell of a mess. Luckily, venture money backed “solutions” to the problems of click fraud, faulty measurement and endless complexity in digital marketing workflow.

Marketers don’t want to buy more technology problems they need to fix. And they don’t want to spend money chasing the same people around the web. They want to limit how much they spend trying to achieve reach. Data-management technology is starting to rein in wasteful spending, via tactics including global frequency management, more precise segmentation, overlap analysis and search suppression.

Marketers want to use data to be more precise. They are starting to leverage systems that help them understand viewability and get a better sense of attribution by moving away from stale last-click models. The days are numbered for marketers with black-box technology that creates a layer between their segmentation strategies and how performance is achieved against it.

One-To-One Communication Via Cross-Device Identity

Maybe the biggest trend and aspiration among marketers is the ability to truly achieve one-to-one marketing. A few years ago, that meant email, telemarketing and direct mail. Today, if you want to have a one-to-one customer relationship, you must be able to associate the “one” person with as many as five or six connected devices.

That is extremely difficult, mostly because we have been highly dependent on the browser-based “cookie” to determine identity. Cookie-based technologies evolved to ensure different cookies match up in different systems, but it’s a new world today.

Really understanding user identity means being able to reconcile different device signals with a universal ID. That means lots of cookies from different browsers, Safari’s unique browser signature, IDFAs, Android device IDs and even signals from devices like Roku, not to mention reliably “onboarding” anonymized offline data, such as CRM records.

Without device mapping, an individual looks like seven different devices to a marketer, making it impossible to deliver the “right message, right place, right time.” Frequency management is tougher, attribution models start to break and sequential messaging is hard to do. Marketers want a reliable way to reconcile user identity across devices so they can adapt their messages to your situation.

Data-Derived Insights 

Marketers inject tons of dollars into the advertising ecosystem and expect detailed performance reports. Each dollar spent is an investment. Some dollars create sales results, but all dollars spent in addressable channels create some kind of data.

Surprisingly, that data is still mostly siloed, with social data signals not connected to display results. Much of it is delivered in the form of weekly spreadsheets put together by an agency account manager. It seems crazy that marketers can’t fully take advantage of all the data produced by their digital marketing, but that is still very much the reality of 2015.

Thankfully, that dynamic is changing quickly. Data technology is rapidly offering a “people layer” of intelligence across all channels. Data coming into a central system can look at campaign performance across many dimensions, but the key is aggregating that data at the people level. How did a segment of “shopping cart abandoners” perform on display vs. video?

Marketers now operate under the new but valid assumption that they will be able to track performance in this way. They are starting to understand that every addressable media investment can create more than just sales – it can produce data that helps them get smarter about their media investments going forward.

It’s a great time to be a data-driven marketer.

[This post originally appeared in AdExchanger on 4.6.15]


The Agency’s Role in Data Management


Twenty years after the first banner ad, the programmatic media era has firmly taken hold. The Holy Grail for marketers is a map to the “consumer journey,” a circuitous route filled with multiple addressable customer touchpoints. With consumers spending more of their time on mobile devices – and interacting with brands like never before through social channels, review sites, pricing comparison sites and apps – how can marketers influence customers everywhere they encounter a brand?

It’s a tough nut to crack, but it is starting to become an achievable reality for companies dedicated to collecting, understanding and activating their data. Marketers are starting to turn towards data management platforms (DMPs), which help them connect people with their various devices, develop granular audience segments, gain valuable insights and integrate with various platforms where they can activate that data. In addition to technology, marketers also have to configure their entire enterprises to align with the new data-driven realities on the ground.

The question is: Where do marketers turn for help with this challenging, enterprise-level transition?

Many argue that agencies cannot support the type of deep domain expertise needed for the complicated integrations, data science and modeling that have become everyday issues in modern marketing. But should data management software selection and integration be the sole province of the Accentures and IBMs of the world, or is there room for agencies to play?

For lots of software companies, having an agency in between an advertiser and their marketing platform sounds like a problem to overcome, rather than a solution. Many ad tech sellers out there have lamented the process of the dreaded agency “lunch and learn” to develop a software capability “point of view” for a big client.

Yet, there are highly compelling ways agencies add value to the software selection process. The best agencies insert themselves into the data conversation and use their media and creative expertise to influence what DMPs marketers choose, as well as their role within the managed stack.

From Digital To Enterprise

It makes perfect sense that agencies are involved with data management. The first intersection of data and media added the “targeting” column to the digital RFP. Agencies have started to evolve beyond the Excel-based media planning process to start their plans with an audience persona that is developed in conjunction with their clients. Today, plans begin with audience data applied to as many channels as are reachable. Audience data has moved beyond digital to become universal.

Agencies have also been at the tip of the spear, both from an audience research standpoint (understanding where the most relevant audiences can be found across channels) and an activation standpoint (applying huge media budgets to supply partners). Since they are on the front lines of where media dollars are expressed, they often get the first practical look at where data impacts consumer engagement. During campaigns and after they conclude, the agency also owns the analytics piece. How did this channel, partner and creative perform? Why?

Having formerly limited agencies to doing campaign development and execution, marketers are now turning to the collected expertise of their agency media and analytics teams and asking them to embed the culture of audience data into their larger organization. When it’s time to select the DMP—the internal machine that will drive the people-based marketing enterprise—the agency is naturally called upon.

Data Management Is About Ownership

Although a small portion of innovative marketers have begun leveraging DMP technology and taken media execution “in-house,” the vast majority still relies on agencies and ad tech platform partners to operate their stacks through a managed services approach. Whether a marketer should own the capability to manage its own ad technology stack is a matter of choice, but data ownership shouldn’t be. Brands may not want to own the process of applying audience data to cross-channel media, but they absolutely must own their data.

Where Agencies Play in Data Management

The Initial Approach: Most agencies have experience leveraging marketers’ first-party data through retargeting on display advertising. In an initial DMP engagement, marketers will rely on their agencies to build effective audience personas, map those to available attributes that exist within the marketer’s taxonomy and apply the segments to existing addressable channels. Marketers can and should rely on past campaign insights, attribution reports and other data insights from their agencies when test-driving DMPs.

Connect the Dots: For most marketers, agencies have been the de facto connector of their diverse systems. Media teams operate display, video and mobile DSPs, ad serving platforms, and attribution tools. Helping a marketer and their DMP partner tie these execution platforms together, and understand both the audience data and the performance data generated from campaigns, is a critical part of a successful DMP implementation.

Operator: Last, but not least, is the agency as operator of the DMP. Marketers want their data safely protected in their own DMP, with strong governance rules around how first-party data is shared. They also need a hub for utilizing third-party data and integrating it with various execution and analytics platforms. Marketers may not want to operate the DMP themselves, though. Agencies can win by helping marketers wring the most value from their platforms.

Marketers have strong expertise in their products, markets and customer base – and should focus on their core strengths to grow. Agencies are great at finding audiences, building compelling creative and applying marketing investment dollars across channels, but are not necessarily the right stewards of others’ data.

Future success for agencies will come from helping marketers implement their data management strategy, align their data with their existing technology stack and return insights that drive ongoing results.

[This post originally appeared in AdExchanger on 2.2.15]



How Can Advertisers Bypass The Industry’s Walled Gardens?

In this increasingly cross-device world, marketers have been steadily losing the ability to connect with consumers in meaningful ways. Being a marketer has gone from three-martini lunches where you commit to a year’s worth of advertising in November to a constant hunt for new and existing customers along a multifaceted “customer journey” where the message is no longer controlled.

Consumers’ attention migrates from device to device, where they spread their limited attention among multiple applications. It’s become a technology game to try and track them down, and it is starting to become a big data game to serve them the “right message, at the right place, at the right time.”

Modern ad tech is supposed to be the marketer’s savior, helping him sort out how to migrate budgets from traditional media, such as TV, radio and print, to the addressable channels where people now spend all of their time. Marketers and their agencies need a technology “stack,” but they end up with a hot mess of different solutions, including various DSPs for multiple channels, content marketing software and ad servers.

Operating and managing all of them is possible, but laborious and difficult to do right. Worse still, these systems are nearly impossible to connect. Am I targeting the same consumer over and over through various channels? How to manage messaging, frequency and sequencing of ads?

Since all of these systems purport to connect marketers to customers on the audience level, the coin of the realm is data. It’s not just “audience data” but actual data on the individuals the marketer wants to target.

Marketing is now a people game.

Yet, in the cross-channel, evolving world of addressable media, connecting people to their various devices is difficult. You need to see a lot of user data, and you have to not only collect web-based event data, but also mobile data where cookies don’t exist. Deterministic data, such as a website’s registration data, can lay the foundation for identity. When blended with probabilistic data and modeled from user behavior and other signals, it becomes possible to find an individual.

Right now, the overlords of the people marketing game are platforms like Google, where people are happy to stay logged in to their email application on desktop, mobile and tablet, or Facebook, which knows everything because we are nice enough to tell them. Regular publishers may be lucky enough to have subscription users that log in to desktop and mobile devices, but most publishers don’t collect such data. Their ability to deliver true one-to-one marketing to their advertisers is limited to their ability to identify users.

This dynamic rapidly makes the big “walled gardens” of the Internet the only place big marketers can go to unlock the customer journey. That might work for Google and Facebook shareholders and employees, but it’s not good for anyone else. In our increasingly data-dependent world, not all marketers are comfortable borrowing the keys to user identity from platforms that sell their customers advertising. Soon, everyone will have to either pay a stiff toll to access such user data, or come up with innovation that enables a different way to unlock people-centric marketing.

What is needed is an independent “truth set” that advertisers can leverage to match their anonymous traffic with rich customer profiles, so they can actually start to unlock the coveted “360-degree view of the user.” Not only does a large truth set of users create better match rates with first-party data to improve targeting, but it also holds the key to making things like lookalike modeling and algorithmic optimization work. Put simply, the more data the machine has to work with, the more patterns it finds and the better it learns. In the case of user identity, the probabilistic models most DMPs deploy today are very similar. Their individual effectiveness depends on the underlying data they can leverage to do their jobs.

In the new cross-device reality: If you can’t leverage a huge data set to target users, it’s time to take your toys and go home. Little Johnny doesn’t use his desktop anymore.

Think about the three principal assets most companies have: their brand, their intellectual property and products, and their customer data. Why should a company make a third of its internal value dependent upon a third party, whether or not that party pledges “no evil”? Those that offer a “triple play” of mobile, cable television and phone services are also among the few companies that can match a user across various devices. The problem? They all sell, or facilitate the sale of, lots of advertising. Marketers are not sure they want to depend on them for unlocking the puzzle of user identity.

Some of the greatest providers of audience data are independent publishers who, banded together, can create great scale and assemble a truth set as great as Facebook and Google. Maybe it’s time to create a data alliance that breaks the existing paradigm. The “give to get” proposition would be simple: Publishers contribute anonymized audience identity data to a central platform and get access to identity services as a participant. This syndicate could enable the deployment of a universal ID that helps marketers match consumers to their devices and create an alternative to the large walled gardens.

The real truth is that, without banding together, even great premium publishers will have a hard time unlocking the enigma of cross-device identity for marketers. Why not build a garden with your neighbors, rather than play in somebody else’s?

[This post was originally published in AdExchanger on 12.11.14]

Data Management Platform · DMP · Publishing

Are Publisher Trading Desks Next?

A long time ago, I was selling highly premium banner ad inventory to major advertisers. Part of a larger media organization, our site had great consumer electronics content tailored to successful professional and amateur product enthusiasts. The thing we loved most was sponsorships and advertorials. We practically had a micro-agency inside our shop, and we produced amazing custom websites, contests, and branded content sections for our best clients. They loved our creative approach, subject matter expertise, and association with our amazing brand. They still capture this revenue today.

The next thing we loved was our homepage and index page banner inventory. We sold all of our premium inventory (mostly 728×90 and 300×250 banners) by hand, and realized very nice CPMs. Back then, we were getting CPMs upwards of $50, since we had an audience of high-spending B2B readers. I imagine that today, the same site is running lots of premium video and rich media, and getting CPMs in the high teens for their above-the-fold inventory and pre-roll in their video player. I was on the site recently, and saw most of the same major advertisers running strong throughout the popular parts of the site. Today, a lot of this “transactional RFP” activity is being handled by programmatic direct technologies that include companies like NextMark, Centro, iSocket, and AdSlot, not to mention MediaOcean.

What about remnant? We really didn’t think about it much. Actually, realizing how worthless most of that below-the-fold and deep-paged inventory was, we ran house ads, or bundled lots of “value added ROS” impressions together for our good customers. Those were simple days, when monetization was focused on having salespeople sell more, and pushing your editorial team to produce more content worthy of high CPM banner placements.

Come to think of it, it seems like not much has changed over the last 10 years, with the notable exception of publishers’ approach to remnant inventory. About five years ago, they found some ad tech folks to take 100% of it off their hands. Even though they didn’t get a lot of money for it, they figured it was okay, since they could focus on their premium inventory and sales relationships. In doing some of those early network deals, I wondered who the hell would want millions of below-the-fold banners and 468x60s, anyway? Boy, was I stupid. Close your eyes for a year or two, and a whole “Kawaja map” pops up.

Anyhow, we all know what happened next. Networks used data and technology to make the crap they were buying more relevant to advertisers (“audience targeting”), and the demand side, seeing CPMs drop from $17 to $7, played right along. Advertisers LOVE programmatic RTB buying. It puts them in the driver’s seat, lets them determine pricing, and also (thanks to “agency trading desks”) lets them enhance their shrinking margins with a media vigorish. Unfortunately, for publishers, it meant that a rising sea of audience targeting capability only lifted the agency and ad tech boats. Publishers were seeing CPMs decline, networks eat into overall ad spending, DSPs further devalue inventory, and self-service platforms like Facebook siphon off more of the pie.

How do publishers get control back of their remnant inventory—and start to take their rightful ownership of audience targeting?

That has now become simple (well, it’s simple after some painful tech implementation). Data Management Platforms are the key for publishers looking to segment, target, and expand their audiences via lookalike modeling. They can leverage their clients’ first party data and their own to drive powerful audience-targeted campaigns right within their own domains, and start capturing real CPMs for their inventory rather than handing networks and SSPs the lion’s share of the advertising dollar. That is step number one, and any publisher with a significant amount of under-monetized inventory would be foolish to do otherwise. Why did Lotame switch from network to DMP years ago? Because they saw this coming. Now they help publishers power their own inventory and get back control. Understanding your audience—and having powerful insights to help your advertisers understand it—is the key to success. Right now, there are about a dozen DMPs that are highly effective for audience activation.

What is even more interesting to me is what a publisher can do after they start to understand audiences better. The really cool thing about DMPs is that they can enable a publisher to have their own type of “trading desk.” Before we go wild and start talking about “PTDs” or PTSDs or whatever, let me explain.

If I am BigSportsSite, for example, and I am the world’s foremost expert in sports content, ranking #1 or #2 in Comscore for my category, and consistently selling my inventory at a premium, what happens when I only have $800,000 worth of “basketball enthusiasts” inventory in a month and my advertiser needs $1,000,000 worth? What happens today is that the agency buys up every last scrap of premium inventory it can find on my site and others, and then plunks the rest of its budget down on an agency trading desk, which uses MediaMath to find “basketball intenders” and other likely males across a wide range of exchange inventory.

But doesn’t BigSportsSite know more about this particular audience than anyone else? Aren’t they the ones with historical campaign data, access to tons of first-party site data, and access to their clients’ first party data as well? Aren’t they the ones with the content expertise which enables them to see what types of pages and context perform well for various types of creative? Also, doesn’t BigSportsSite license content to a larger network of pre-qualified, premium sites that also have access to a similar audience? If the answer to all of the above is yes, why doesn’t BigSportsSite run a trading desk, and do reach extension on their advertisers’ behalf?

I think the answer is that they haven’t had access to the right set of tools so far—and, more so, the notion of “audience discovery” has somehow been put in the hands of the demand side. I think that’s a huge mistake. If I’m a publisher who frequently runs out of category-specific inventory like “sports lovers,” I am immediately going to install a DMP and hire a very smart guy to help me when I can’t monetize the last $200,000 of an RFP. Advertisers trust BigSportsSite to be the authority in their audience, and (as importantly) the arbiter of what constitutes high quality category content.

Why let the demand side have all of the fun? Publishers who understand their audience can find them on their own site, their clients’ sites, across an affiliated network of partner sites, and in the long tail through exchanges. These multi-tiered audience packages can be delivered through one trusted partner, and aligned with their concurrent sponsorship and transactional premium direct advertising.

Maybe we shouldn’t call them Publisher Trading Desks, but every good publisher should have one.

[This article originally appeared in AdExchanger on 4/5/2013]

DMP · Marketing

The Data-Driven CMO

Mark Zagorski, the CEO of data management platform eXelate, has worked with dozens of big marketers to help them put all kinds of data to work, including their own.

“Right now most organizations are dealing with terabytes of data. Over a third [manage] more than 10 terabytes of data and one-fifth will manage half a petabyte of data within three years,” Zagorski tells me. “The key objective for marketers seeking to harness the power of big-data is to make it actionable.”

As a marketer, it is likely that you have access to a great deal of data, and maybe even the kind of big-data we’ve been hearing so much about. CRM data grows every day; point-of-sale data gets easier and less expensive to store; tag-collected data from websites and social sites expands daily; and there is a seemingly infinite amount of third-party data available for purchasing and mixing in with your own.

The modern CMO must find a way to value the data assets she has, learn to listen for the real signals among the noise, and find a way to put that data to use. Mostly, that means understanding customer attributes, what drives them to transact, and how much it costs to get them to do so more frequently.

For Darren Herman, in charge of digital media at forward-looking agency Media Kitchen, data is all about the way it can be leveraged for his clients. “We care less about big-data and more about actionable data. Our clients have tons of first-party data,” Herman tells me, adding that the real challenge is in “uniting the data between silos (usually within client organizations) and making them available and actionable for advertising and marketing decisions. Much of the time, the clients’ data is available through the IT organization, and it’s not quite understood how it will be used for marketing decisions.”

In many ways, data-driven CMOs face two challenges: Firstly, winning the internal battle with the CTO to get access to disparate data sources, and bringing them together in a way that creates the opportunity to glean global insights; and secondly, building the platform that enables them to normalize many discrete data types, query that data quickly, and “activate” that data to produce a sales outcome.

Think of a large, global consumer products organization. A company that sells soap suds around the world may have up to 20 regional operating companies, and as many as 200 separate datacenters throughout the organization. Within all of those data silos are digital stories of marketing success and failure. Imagine if you could duplicate the promotional dynamics that drove a 20 percent increase in Italian diaper sales across the entire global organization, or leverage the learnings that one operating company had when a key discounting scheme failed?

These types of insights can be obtained when the CMO asks the right questions, and when he has data management platforms behind him that can make it possible to get the answers. Being a data-driven marketer isn’t about how much data you can centralize in a single platform. The data may be big, but ultimately the data you store is only as valuable as your ability to extract insights from it — and act upon it.

[This post originally appeared in the CMO Site on 3/20/13]

Big Data · Data Management Platform · DMP

Managing Data in [real] Real-Time

A Conversation with Srini Srinivasan, Founder and VP Operations of Aerospike

Even today, the notion that a consumer can go to a website, be identified, trigger a live auction involving as many as a dozen or more advertisers, and be served an ad in real-time, seems like a marvel of technology. It takes a tremendous amount of hardware and, even more than ever, a tremendous amount of lightning-fast software to accomplish. What has been driving the trend towards ever faster computing within ad technology are new NoSQL database technologies, specifically designed to read and write data in millisecond timeframes. We talked with one of the creators of this evolving type of database software, who has been quietly powering companies including BlueKai, AppNexus, and [x+1], and got his perspective on data science, what “real time” really means, and “the cloud.”

Data is growing exponentially, and becoming easier and cheaper to store and access. Does more data always equal more results for marketers?

Srini Srinivasan: Big Data is data that cannot be managed by traditional relational databases because it is unstructured or semi-structured, and the most important big data is hot data: data you can act on in real-time. It’s not so much the size of the data but rather the rate at which data is changing. It is about the ability to adapt applications to react to the fast changes in large amounts of data that are happening constantly on the Web.

Let’s consider a consumer who is visiting a Web page, or buying something online, or viewing an ad. The data associated with each of these interactions is small. However, when these interactions are multiplied by the millions of people online at any moment, they generate a huge amount of data. AppNexus, which uses our Aerospike NoSQL database to power its real-time bidding platform, handles more than 30 billion transactions per day.

The other aspect is that real-time online consumer data has a very short half life. It is extremely valuable the moment it arrives, but as the consumer continues to move around the Web it quickly loses relevance. In short, if you can’t act on it in real-time, it’s not that useful. That is why our customers demand a database that handles reads and writes in milliseconds with sub-millisecond latency.

Let me give you a couple examples. [x+1] uses our database to analyze thousands of attributes and return a response within 4 milliseconds. LiveRail uses our database to reliably handle 200,000 transactions per second (TPS) while making data accessible within 5 milliseconds at least 99% of the time.

This leads into the last dimension, which is predictable high performance. Because so much of consumer-driven big data loses value almost immediately, downtime is not an option. Moreover, a 5-millisecond response has to be consistent, whether a marketing platform is processing 50,000 TPS or 300,000 TPS.
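A target like “within 5 milliseconds at least 99% of the time” is easy to check once you have latency samples. The sketch below is illustrative only: the function names, the default budget and target, and the synthetic samples are assumptions, not anything from Aerospike or its customers. It simply computes the share of requests that finished within the budget and compares it to the target.

```python
def fraction_within(latencies_ms, budget_ms):
    """Share of requests that completed within the latency budget."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    return sum(1 for x in latencies_ms if x <= budget_ms) / len(latencies_ms)


def meets_sla(latencies_ms, budget_ms=5.0, target=0.99):
    """True when at least `target` of requests finish within `budget_ms`."""
    return fraction_within(latencies_ms, budget_ms) >= target


# Synthetic samples (hypothetical): mostly 2 ms reads with a few 12 ms outliers.
steady = [2.0] * 995 + [12.0] * 5    # 99.5% within 5 ms
spiky = [2.0] * 980 + [12.0] * 20    # 98.0% within 5 ms

steady_ok = meets_sla(steady)
spiky_ok = meets_sla(spiky)
```

Note that both sample sets have a low average latency. The point of a percentile-style target is that the average hides the slow tail, which is exactly the part that causes missed auctions.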

What are some of the meta-trends you see that are making data management easier (standardization around a platform such as Hadoop, the emergence of NoSQL systems, the accessibility of cloud hosting)?

SS: Today, with consumers engaged more with Web applications, social media sites like Facebook, and mobile devices, marketers need to do a tremendous amount of analysis against data to make sure that they are drawing the right conclusions. They need data management platforms that can absorb terabytes of data—structured and unstructured—while enabling more flexible queries on flexible schema.

In my opinion, classical data systems have completely failed to meet these needs over the last 10 years. That is why we are seeing an explosion of new products, so-called NoSQL databases that work on individual use cases. Going forward, I think we’ll see a consolidation as databases and other data management platforms extend their capabilities to handle multiple use cases. There will still be batch analysis platforms like Hadoop, real-time transactional systems, and some databases like Aerospike that combine the two. Additionally, there will be a role for a few special-purpose platforms, just like in the old days we had OLTP, OLAP and special-purpose platforms like IBM IMS. However, you won’t see 10 different types of systems trying to solve different pieces of the puzzle.

The fact is we are beginning to see the creation of a whole new market to address the question, “How do you produce insights and do so at scale?”

One of the biggest challenges for marketers has been that useful data is often in silos and not shared. What are some of the new techniques and technologies making data collection and integration easier and more accessible for today’s marketer?

SS: Many of our customers are in the ad-tech space, which is generally at the front-end of technology trends adopted by the broader marketing sector. We are just beginning to see a new trend among some of these customers, who are using Aerospike as a streaming database. They are eliminating the ETL (extract, transform, load) process. By removing the multi-stage processing pipeline, these companies are making big data usable, faster than ever.

The ability to achieve real-time speed at Web-scale is making it possible to rethink how companies approach processing their data. Traditional relational databases haven’t provided this speed at scale. However, new technology developments in clustering and SSD optimization are enabling much greater amounts of data to be stored in a cluster, and for that data to be processed in milliseconds.

This is just one new way that real-time is changing how marketers capitalize on their big data. I think we’ll continue to see other innovative new approaches that we wouldn’t have imagined just a couple years ago.

Storing lots of data and making it accessible quickly requires lots of expensive hardware and database software. The trend has been rapidly shifting from legacy models (hosted Oracle or Netezza solutions) to cloud-based hosting through Rackspace or Amazon, among others. Open source database software solutions such as Hadoop are also shifting the paradigm. Where does this end up? What are the advantages of cloud vs. hosted solutions? How should companies be thinking about storing their marketing-specific data for the next 5-10 years?

SS: A couple years ago nearly everyone was looking at the cloud. While some applications are well suited for the cloud, those built around real-time responses require bare metal performance. Fundamentally it depends on the SLA of the applications. If you need response times in the milliseconds, you can’t afford the cloud’s lack of predictable performance. The demand for efficient scalability is also driving more people back from the cloud. We’re even seeing this with implementations of Hadoop, which is used for batch processing. If a company can run a 100-server cluster locally versus having to depend on a 1,000-server cluster in the cloud, the local 100-server option will win out because efficiency and predictability matter in performance.

What are top companies doing right now to leverage disparate data sets? Are the hardware and software technology available today adequate to build global, integrated marketing “stacks?”

SS: Many of the companies we work with today have two, four, sometimes more data centers in order to get as close to their customers as possible. Ad-tech companies in particular tell us they have about 100 milliseconds—just one-tenth of a second—to receive data, analyze it, and deliver a response. Shortening the physical distance to the customer helps to minimize the time that information travels the network.

Many of these firms take advantage of cross data center replication to include partial or full copies of their data at each location. This gives marketers more information on which to make decisions. It also addresses the demand for their systems to deliver 100% uptime. Our live link approach to replication makes it possible to copy data from one data center to another with no impact on performance and ensures high availability.

Over the last year, we’ve had customers experience a power failure at one data center due to severe weather, but with one or more data centers available to immediately pick up the workload, they were able to continue business as usual. It comes back to the earlier discussion. Data has the highest value when marketers can act on it in real-time, 100% of the time.

This interview, among many others, appears in EConsultancy’s recently published Best Practices in Data Management by Chris O’Hara. Chris is an ad technology executive, the author of Best Practices in Digital Display Media, a frequent contributor to a number of trade publications, and a blogger.

This post also appeared on the iMediaConnection Blog 1/11/12.

Data Management Platform · DMP

Matching Offline Data for Online Targeting

A Conversation with LiveRamp’s CEO Auren Hoffman

Now that all marketers have universal access to an entire world of third party online segmentation data, advertisers are increasingly turning offline for an edge. Leveraging established and deep CRM data, marketers are matching their customer databases to online cookies for targeting and retargeting, and going beyond basic demographic data by bringing multiple data sets into the digital marketing mix. I recently interviewed LiveRamp’s Auren Hoffman to learn more about how traditional databases are getting matched to online cookies, and made available for targeting.

Offline data versus online data. You hear first-party data talked about like it’s the gold standard. Just how much more valuable is a company’s first party data?

Auren Hoffman (AH): First, some clarification: Offline does not equal first-party data; nor is online equivalent to third party data.

The gold standard is not first-party data. It’s the rich knowledge (and capacity for segmentation) that lies in a company’s CRM database, typically tied to a name/address or an email address (including purchase history, direct mail, email campaigns, and loyalty). That knowledge, which is largely (but not exclusively) first-party data, exists almost exclusively offline.

Oftentimes, this specific customer knowledge – first-party data belonging to a brand or business – is augmented by complementary third-party data (for example, zip code-based psychographic typing). Also added into the mix is certain online data (largely transactional, where the customer is known) that has been taken offline (into the CRM database).

This deep customer knowledge has – before now – really only been usable offline (to manage direct marketing, for example). Customer segmentation derived from CRM data is commonly used to target certain audiences with specific messages. That same knowledge has not been – could not be – used to achieve better targeting online through display advertising… until recently.

Companies such as LiveRamp take the knowledge about individual customers from offline CRM databases to form useful and rich customer segmentation that can be “onboarded” – taken online and used for highly-focused display advertising, in a safe and privacy-centric way. For example, catalog recipients (from a CRM-driven direct marketing campaign) who are known both to purchase online and to focus on a particular product line in their purchases can be transformed into an online audience with a very focused marketing message. This is what LiveRamp does: translate rich offline data (first- or third-party, or both) into anonymized online segments that can be used to create highly targeted and therefore more effective display advertising. LiveRamp is the only company focused solely on providing data onboarding that can be used to achieve “CRM Retargeting” (using CRM data to enable highly-targeted display advertising).

It should be emphasized that onboarded data is anonymized – that is, unlike CRM data which is frequently used in its individualized form (specific customers tied to an email or postal address), onboarded data is aggregated based on customer segments (e.g. a possible segment could be customers who have not purchased from the brand in more than six months) who receive a specific message (e.g. special incentive to return to the brand). So the customer’s privacy is protected, while the customer is still able to receive an offer or message likely to be of specific appeal. With CRM retargeting, brands can target last year’s shoppers with relevant ads about the upcoming holiday season to remind them about the brand’s offer, regardless of whether, or when, they have been to the brand’s site.

What kind of offline data should marketers consider bringing online? What offline data do you consider to be the most valuable in terms of audience targeting?

AH: Marketers should consider any data that allows them to create more targeted – and therefore more valuable – segmentation for use in online display advertising, which will vary depending on a brand’s business and messaging strategy. The most valuable such data is that which, when linked with focused messaging, is most likely to achieve resonance with the audience segment. Onboarded data, as noted above, is anonymized; consequently the objective is not to track down and message individual consumers (which would be intrusive), but rather to develop creative messaging to groups of (anonymized) customers (e.g. lapsed customers, or those with particular product or service requirements – for instance, customers with car leases about to expire might well be interested in incentives for a new lease).

Though the most valuable data is likely to be based on transactional history or product/service preferences, it is by no means limited to this. The most valuable data is that determined by the brand to create segmentation – and the accompanying messaging – needed to elicit a positive customer response and in turn ROI.

How should marketers manage their data? Now that data is so cheap to collect, transfer, load, and store the tendency is to make almost every piece of data available for analysis. Where should marketers draw the line? What about recency? Does the cost of keeping certain datasets (transaction events, for example) recent outweigh their potential value?

AH: We’re agnostic on this. (That is, we’re not in the business of managing the data, just bridging the offline/online divide with onboarding expertise.) Each marketer must judge for him or herself the value of data in relation to its potential use for targeted segmentation.

How does it work? Please describe, in layman’s terms where possible, the various methodologies for matching offline data with an online consumer (cookie matching, key value pair match, etc.).

  • A brand (or a brand’s agency) provides LiveRamp with an encoded CRM file through our secure upload portal.
  • LiveRamp matches your offline data, keyed off an email address, to an anonymous online audience via cookies, with extensive coverage and high accuracy.
  • LiveRamp places the online audience on a brand’s existing DSP or DMP (or we can suggest one of our partner platforms) and the display campaign runs as normal with a larger, more valuable, and more targeted audience.
  • Your customers see a relevant and timely message from your brand.
  • LiveRamp does not buy or sell data. We do not collect any data from a site, our cookies do not contain PII, and we do not pass any site audience information to any third party.
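
The steps above can be sketched in a few lines of Python. This is not LiveRamp’s implementation: the use of SHA-256 as the “encoding,” the match table, the cookie IDs, and the segment names are all assumptions made for illustration. The sketch normalizes and hashes each CRM email, looks the hash up in a hypothetical hash-to-cookie match table, and emits segments that contain only anonymous cookie IDs.

```python
import hashlib


def hash_email(email):
    """Normalize an email address and return its SHA-256 hex digest."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


# Hypothetical CRM export: each row carries an email and a marketer-defined segment.
crm_rows = [
    {"email": "Ann@example.com", "segment": "lapsed_6_months"},
    {"email": "bob@example.com ", "segment": "lapsed_6_months"},
    {"email": "cy@example.com", "segment": "lease_expiring"},
]

# Hypothetical match table held by the onboarding provider:
# hashed email -> anonymous cookie IDs seen for that person.
match_table = {
    hash_email("ann@example.com"): ["ck_101", "ck_102"],
    hash_email("cy@example.com"): ["ck_200"],
}


def onboard(crm_rows, match_table):
    """Return {segment: set of cookie IDs}; no email leaves this function."""
    segments = {}
    for row in crm_rows:
        key = hash_email(row["email"])
        for cookie in match_table.get(key, []):
            segments.setdefault(row["segment"], set()).add(cookie)
    return segments


def match_rate(crm_rows, match_table):
    """Share of CRM rows whose hashed email appears in the match table."""
    matched = sum(1 for row in crm_rows if hash_email(row["email"]) in match_table)
    return matched / len(crm_rows)


segments = onboard(crm_rows, match_table)
rate = match_rate(crm_rows, match_table)
```

Here two of the three CRM rows find a match, and the output that would be handed to a DSP or DMP is just segment names mapped to cookie IDs. Normalizing before hashing (trimming whitespace, lowercasing) matters in practice, because the same address typed two ways would otherwise produce two different hashes and lower the match rate.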

This interview, among many others, appears in EConsultancy’s recently published Best Practices in Data Management by Chris O’Hara. Chris is an ad technology executive, the author of Best Practices in Digital Display Media, a frequent contributor to a number of trade publications, and a blogger.

This post also appeared on the iMediaConnection blog on 1/3/12.

Data Management Platform · DMP · nPario

What is Data Science?

A Conversation with Ankur Teredesai, Data Scientist, nPario

These days, the term “big data” is all the rage and over a dozen data management platforms are competing for the right to manage audience segmentation, targeting, analytics, and lookalike modeling for advertisers and publishers. I recently sat down with noted data scientist Ankur Teredesai to help understand data science, and how ad technology companies are using data science principles to help publishers and marketers understand audiences better. Ankur is the head of data science for nPario, a WPP portfolio company focused on data management, and he is also a professor at the University of Washington.

You hear lots of digital advertising people talk about “data science.” Does their perception differ from the broader, academic understanding of what data science is? Is the notion of utilizing data science in digital marketing applications new?

Ankur Teredesai (AT): It’s very interesting to see that the digital advertising folks have started deep conversations with data scientists. Data science is a very interesting space these days, where data mining and database management technology is now helping a variety of disciplines establish a scientific approach to decision making. Data science dealing with the problems of finding patterns in large amounts of data is not a new concept for digital marketing. What is new is the advent of technologies that now support finding useful patterns in a large variety and velocity of data in addition to volume, thereby advancing the state of the art in marketing analytics.

Do ad technology companies really rely on data science? What does being a data-driven organization really mean?

AT: No ad-tech company can afford to NOT rely on data science in some shape or form. The power of predictive modeling is quickly differentiating the players who are making quick inroads by using the low-hanging fruit of data science for these domains from the ones that are treating data science as a passing buzzword. My advice to all ad-tech companies is to get at least one data scientist in their ranks, even if they don’t like the term data science for one reason or another. Machine learning, data mining and big data analytics are all equally acceptable today.

Describe the concept of data modeling for the non-academic user. What kind of models are being built for digital marketing applications?

AT: A variety of problems in digital marketing are being addressed using predictive modeling. Some examples of the work we are doing at nPario are (a) look-alike modeling that helps find targetable audiences to “rightsize” or expand a particular segment, (b) recommending cost-aware segments that are similar to desired audience segments for targeting, (c) providing comparative analytics for exploring the unique properties of a given segment, and (d) enabling real-time audience classification to reduce the time to target in an efficient and effective manner.

Is the concept of lookalike modeling legitimate? How does this work? Is LAM a scalable targeting practice?

AT: Given customer behavior data, the lookalike model estimates and exploits the variations and similarities in behavior across various segments. Once the model figures out which similarities or differences are robust enough in the audience to be useful for predicting future behavior, it exploits these attributes in the data to expand a particular segment’s size by including those customers that are similar to the base segment but were not included because they did not meet the segment definition criteria. This allows fairly restrictive criteria to be relaxed using data science methods such as association mining and regression analysis to expand the segment size accurately, confidently, and in a scalable manner.
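To make the idea of segment expansion concrete, here is a minimal sketch. It is not nPario's method: instead of the association mining and regression analysis mentioned above, it swaps in a much simpler technique, scoring each user by cosine similarity to the average behavior of the base segment. The user IDs, the three behavior counts per user, and the 0.9 threshold are all hypothetical values chosen for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Component-wise average of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def expand_segment(users, seed_ids, threshold=0.9):
    """Return users outside the seed segment whose behavior resembles it.

    users: dict of user ID -> behavior vector (hypothetical event counts).
    seed_ids: set of user IDs that already meet the segment definition.
    threshold: assumed similarity cutoff; a lower value relaxes the criteria.
    """
    seed_center = centroid([users[u] for u in seed_ids])
    return sorted(
        user_id
        for user_id, vector in users.items()
        if user_id not in seed_ids and cosine(vector, seed_center) >= threshold
    )


# Hypothetical behavior counts: [auto page views, recipe page views, video plays]
users = {
    "u1": [5, 0, 3],
    "u2": [4, 1, 3],
    "u3": [4, 0, 2],
    "u4": [0, 5, 0],
    "u5": [1, 4, 1],
}
seed = {"u1", "u2"}  # users who met the original segment definition

lookalikes = expand_segment(users, seed)
print(lookalikes)
```

With this toy data only `u3` behaves enough like the base segment to be added. Lowering the threshold widens the expanded segment at the cost of precision, which is the trade-off a production model would need to tune against real campaign results.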

The entire digital advertising ecosystem is driven by data. What are the most valuable types of data for targeting? How do you see the future for the ecosystem? Will those with the most data win?

AT: This is central to the success of the entire digital advertising world and to helping end consumers find the right, useful advertising. If we have to understand the user and focus on making digital advertising useful without making it adversarial, we have to focus carefully on the types and granularity of the data being collected. At both nPario and the Institute of Technology, University of Washington Tacoma, where I hold an academic appointment, we stress the need for developing an ecosystem of data collection, management, and mining that is customer centric, with the highest regard for security through robust multitenancy, cryptography, and privacy-aware practices. The entire technology stack at nPario is, for example, data agnostic. We decided very early on in the company not to be suppliers of data but to be data-pool neutral, allowing our clients to bring their own first-, second-, and third-party data. Our platforms help clients derive value from a variety of big datasets while ensuring that customer and end-user privacy is preserved and not compromised through our actions in any manner whatsoever. To address your question of whether those with the most data win, I would say this: everybody has some data, and some just have lots of data. It is the ones that have the right tools at the right time that will monetize their data the best.

This interview, among many others, appears in EConsultancy’s recently published Best Practices in Data Management by Chris O’Hara. Chris is an ad technology executive, the author of Best Practices in Digital Display Media, a frequent contributor to a number of trade publications, and a blogger.