Every Product Manager will bump into the Big Data challenge sooner or later. Will your product be ready to store, analyze, and present the flood of data it produces? In this post, I share 6 critical areas every Product Manager should address when building Products that manage large amounts of data.
A few years ago, when I was working as Product Manager of an Enterprise Data Management Product, a company approached us to help them solve a major challenge. They had just invested in a state-of-the-art automation system capable of generating tons of data (at a rate of gigabytes per second). A few months later, they realized that their new investment wasn’t paying off because they had just replaced one problem with another: they were able to produce quality data, but they found themselves inundated by it. They weren’t able to analyze it, store it, or manage it.
Many companies in all industries face this situation today: our ability to produce data is growing at unprecedented rates, which creates a new challenge commonly known as Big Data. And as technology evolves, this challenge will only get worse. With trends like the Internet of Things, more and more devices are becoming data producers. In short, Big Data is only getting started.
As Software Product Managers, we need to make sure that our product is ready to handle large amounts of data while providing a good user experience. There are many considerations in this area, but here are 6 key areas you should definitely put some thought into to ensure your Product is able to keep up with the times.
The 6 areas every Product Manager should consider are:
- User personas
- Data storage
- Analytics
- Data visualization
- Data access tools
- Security
#1 – Personas: Who are you building this for?
An important first step is to understand the different ‘data personas’ that will use your Product. For example, here are some high-level groupings of possible persona types:
- Data producers
- Data consumers
- Personas focused on report creation
- Personas focused on maintenance or data integrity
And within each of these groupings, you’ll have sub-categories based on different user roles and contexts. For example, data producers might be broken into:
- At the lab (desktop use, investigatory by nature)
- On the go (tablet use, quick access to specific data)
- At his desk (web-based use, reports and drill-down capabilities)
- On the go (phone use, quick notifications and dashboards)
Think carefully about all of your personas and map out each of their scenarios, so you can have a good picture of all of your use cases. These personas will have an impact on the decisions you make in each of the following 5 areas.
#2 – Storage: Think carefully about your strategy
The two main considerations regarding storage are: how to store and where to store. How to store your data depends on your overall use case. The type of data you produce will determine the type of data store you will require. If you have structured data, then a relational database such as SQL Server or MySQL is your best bet. On the other hand, if you have unstructured or semi-structured data such as images, videos, or tweets, then you probably need a schema-less document database such as MongoDB or a distributed storage and processing framework such as Hadoop. Or maybe, like some systems I’ve worked on, you need both.
I’m not suggesting that Product Managers dictate the type of DB or the architecture of the data tier. That’s the role of your Architecture and IT team. However, it IS our job as Product Managers to define clear use cases and convey those to our technical team, so they can implement the right infrastructure for your product.
As far as where to store, well, that’s more of a business question. Depending on your business model and the understanding of your customers, you need to decide if your product should be installed on-premise or in the Cloud.
If your Product is Cloud-based, you’ll need to ask yourself some questions about your servers, such as:
- Should my company build and maintain that infrastructure ourselves?
- Should we leverage third party infrastructure providers such as Public or Private Clouds?
- Should we go for a Managed Hosting approach?
Again, it is IT’s role to define the infrastructure, but as Product Managers, we should define the usage requirements, the scalability, elasticity, and ultimately the business model. Whichever direction we go with, we need to make sure it’s backed by some good ROI analysis.
On the other hand, if your approach is to be on-premise, then on top of storage, you’ll need to decide how your solution will be deployed. For example:
- Should you only sell the software, and require your customer’s IT to be responsible for providing the hardware?
- Should you provide and deploy the hardware in the form of servers that can be expanded as needed? Will you offer a hardware maintenance contract?
- Will you offer your product as an appliance, meaning they get a system in a box and therefore, the storage characteristics are pre-defined?
Regardless of the choices you make, it’s important to consider the amount of data your system will produce and project the storage needs for 1 year, 5 years, and so on. That way you can calculate the elasticity of your Product and even consider using storage as a parameter in your pricing model.
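To make that projection concrete, here is a minimal sketch of the arithmetic in Python. The function name, the daily ingest figure, the growth rate, and the replication factor are all hypothetical assumptions for illustration; plug in the numbers your own system produces.

```python
def project_storage_tb(daily_gb: float, years: int,
                       annual_growth: float = 0.0,
                       replication: int = 1) -> float:
    """Estimate cumulative storage (in TB) after a number of years.

    Assumptions (illustrative only): data is never deleted, daily ingest
    is constant within a year and grows by `annual_growth` at each
    year boundary, and every byte is stored `replication` times.
    """
    total_gb = 0.0
    rate = daily_gb
    for _ in range(years):
        total_gb += rate * 365      # one year of ingest at the current rate
        rate *= 1 + annual_growth   # ingest grows for the following year
    return total_gb * replication / 1000  # decimal GB -> TB


if __name__ == "__main__":
    # Hypothetical product: 100 GB/day, growing 50% per year, 3 copies kept.
    for horizon in (1, 5):
        tb = project_storage_tb(100, horizon, annual_growth=0.5, replication=3)
        print(f"{horizon} year(s): {tb:,.1f} TB")
```

Even a rough model like this is useful in pricing discussions, because it shows how quickly storage costs compound once growth and redundancy are included.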
#3 – Number Crunching and Analytics
The goal (and challenge) of Big Data is to get actionable information out of raw data. Therefore, providing good analytics is very important. Many companies focus only on storing large amounts of data, but their analytic tools often fall short. Remember, storage is just half the battle. Some questions to consider regarding analytics include:
- Will you serve your users with raw data, or will you massage the data before consumption (e.g. reports)?
- Will the data processing occur on-demand, or will you pre-process the data to increase performance (e.g. cubes)?
- Is all data coming from the same place, or do you need to aggregate data from multiple locations?
- What are your performance requirements for retrieving data?
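The on-demand versus pre-processed question is easier to discuss with your technical team when you can picture the difference. The sketch below is a toy illustration in Python, not how a real BI engine or OLAP cube is implemented: the sample readings, device names, and function names are all made up. It computes the same average two ways, once by scanning raw rows at query time and once by looking up a pre-aggregated cell.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw readings: (day, device_id, value)
RAW = [
    (date(2024, 1, 1), "sensor-a", 10.0),
    (date(2024, 1, 1), "sensor-a", 14.0),
    (date(2024, 1, 1), "sensor-b", 7.0),
    (date(2024, 1, 2), "sensor-a", 12.0),
]


def average_on_demand(rows, day, device):
    """On-demand: scan every raw row each time a user asks."""
    values = [v for d, dev, v in rows if d == day and dev == device]
    return sum(values) / len(values) if values else None


def build_cube(rows):
    """Pre-processing: aggregate once into (day, device) -> [sum, count]."""
    cube = defaultdict(lambda: [0.0, 0])
    for d, dev, v in rows:
        cell = cube[(d, dev)]
        cell[0] += v
        cell[1] += 1
    return dict(cube)


def average_from_cube(cube, day, device):
    """Query time: a single dictionary lookup, no scan of the raw data."""
    cell = cube.get((day, device))
    return cell[0] / cell[1] if cell else None


if __name__ == "__main__":
    cube = build_cube(RAW)
    print(average_on_demand(RAW, date(2024, 1, 1), "sensor-a"))
    print(average_from_cube(cube, date(2024, 1, 1), "sensor-a"))
```

The trade-off is the usual one: the pre-aggregated path answers faster but only for the dimensions you chose in advance, and it needs refreshing as new data arrives.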
When building your Product, keep in mind that building analytics tools is a huge endeavor. The Business Intelligence (BI) industry has come a long way, and BI vendors often provide very powerful tools for you to white-label. Unless your product is in fact an analytics tool, I strongly recommend outsourcing this part of your solution. That way, you can focus on building your core competency while leveraging best-of-breed data analytics and visualization tools. There are many BI players that offer robust OEM solutions. Gartner’s BI Magic Quadrant shows some of the top ones. Now, keep in mind that these solutions are not cheap. But think of it this way: for less than the yearly salary of one single engineer, you can include a top-of-the-line BI solution in your product. The price often includes updates, support, plus assurance that you’ll be protected against obsolescence. It’s easy to see the ROI of this investment if you compare that cost with how much progress one engineer can make in a year.
#4 – Provide Great Tools for Data Visualization
Number crunching is important, but in my opinion, it means nothing if the data is not presented in a user-friendly and useful way. When creating a roadmap for data visualization, it’s very important to understand your user personas and demographics. It’s also very important to understand the user state of mind when accessing data, including form factor. The same persona might be interested in different visualizations when looking at data at his desk, on a tablet, or on a phone.
Also, try not to be prescriptive. One of the benefits of Big Data is the opportunity to explore and discover trends in our same ol’ data. As Product Managers, we should understand the use case and provide some basic reports and views. But we can’t assume we know everything that the user will want to know or do. Instead, we should provide tools that enable the user to explore the data at will. I remember when a customer told me:
“With your Product, I need to be able to see my data in any way I can think of today or any day in the future.”
At the time, I was frustrated with his comment, but now I understand the need for this flexibility. I apply his wisdom every time I’m working with my team on data visualization. By the way, visualization is another area where BI platforms shine. They already provide dashboards and tools that give the user this exploratory ability out of the box. Something to keep in mind.
#5 – Provide Multiple Ways to Access the Data
As much as we’d like to anticipate all the possible user scenarios and build them into our product, we know that’s not realistic. Let’s face it: no matter how good a tool you provide, your users will always want to use other tools, maybe because they are already familiar with them or maybe because they are superior. For example, business users will continue to love Excel for a simple reason: it’s an amazing tool!
So, instead of fighting it, embrace it. As part of your roadmap, be sure to include ways to get the data in and out of the system through various methods, not only your UIs. Good examples include Import/Export utilities and migration tools.
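As a small illustration of what an export utility boils down to, here is a sketch in Python that turns records into CSV text a user could open in Excel. The `export_csv` function and the sample record are hypothetical; a real export would also need to deal with paging through large result sets, permissions, and character encodings.

```python
import csv
import io


def export_csv(records, columns):
    """Serialize a list of dict records to CSV text.

    Only the requested `columns` are written; any other keys on a record
    (for example internal fields) are silently left out.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical records; "internal_note" is deliberately not exported.
    rows = [
        {"id": 1, "name": "Line A, Plant 2", "internal_note": "do not share"},
        {"id": 2, "name": "Line B"},
    ]
    print(export_csv(rows, ["id", "name"]))
```

Choosing explicitly which columns leave the system is a product decision as much as a technical one, which is why it belongs in your requirements.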
Of course, the most flexible approach is to provide an open API to access your data. That way you can build additional tools yourself, or you can enable your customer or the community to write whatever tools they require. In fact, this approach has worked well for companies like Box and Dropbox, which have built developer platforms around their APIs. Having an open API is a big topic that has huge business and technical implications…a topic for a future post.
#6 – Focus on Security
Security is always a big concern, and it has been getting a lot more attention given the recent security breaches of companies like Target. For Cloud-based systems, there’s an added concern since the data is stored outside of the customer’s firewalls. Plus, if you are building a multi-tenant system, then there’s the added security risk of other tenants getting access to your data. Yikes!
The topic of security is a very big one, and it really needs to be a focus for every Software Product Manager. It’s our job to define the right criteria and to make sure the right quality controls are in place so that the system is as close to bulletproof as possible. Security cannot be an afterthought. It must be baked into every feature and should go across the full stack, from databases to API to User Interface. It should also be part of your QA testing and acceptance plans. It should be tested in every sprint, and security bugs should always be given high priority and resolved before any release.
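To give a flavor of what “baked in” can mean for the multi-tenant risk, here is a minimal sketch in Python using the standard `sqlite3` module. The table, tenant names, and helper functions are hypothetical, and a real system would layer authentication, authorization, and auditing on top. The idea shown is simply that every data access is scoped to the caller’s tenant and uses parameterized queries.

```python
import sqlite3


def open_demo_db():
    """Create an in-memory database holding rows for two made-up tenants."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE reports ("
        "id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, title TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO reports (tenant_id, title) VALUES (?, ?)",
        [("acme", "Q1 yield"), ("acme", "Q2 yield"), ("globex", "Audit log")],
    )
    return conn


def reports_for_tenant(conn, tenant_id):
    """Return report titles for one tenant only.

    The tenant filter is always applied, and the value is passed as a
    bound parameter so it cannot alter the SQL statement itself.
    """
    rows = conn.execute(
        "SELECT title FROM reports WHERE tenant_id = ? ORDER BY id",
        (tenant_id,),
    )
    return [title for (title,) in rows]


if __name__ == "__main__":
    conn = open_demo_db()
    print(reports_for_tenant(conn, "acme"))
    # A classic injection string is treated as a literal tenant name.
    print(reports_for_tenant(conn, "acme' OR '1'='1"))
```

A check like the second call is the kind of thing that can live in your acceptance tests, so tenant isolation is verified every sprint rather than assumed.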
I highly recommend hiring security companies to act like “hackers” and try to break into your system to show you where the weaknesses are (often called penetration testing). Financial institutions use these services all the time, and they are becoming common practice for any Cloud-based company.
The Bottom Line
Big Data might be a buzzword, but it’s a very real challenge, and it’s going to hit your Product sooner or later. Will you be ready? The considerations above are certainly not all-encompassing, but hopefully, they will provide food for thought and somewhere to start.
I’d love to hear from you about your approach and experiences managing Big Data, and any additional important areas you consider in your product plans. Please leave a comment below, and if you liked this article, please share it around!