Welcome back to the second post in my series, Your First 60 Days as a First Data Hire. In my last post, I discussed some great early advice as you start your first 2 weeks as a data hire, including the unique challenges of data roles, the importance of understanding the business, the product, and your team.
Today, I’ll dive deeper into the technical side of things: setting up foundational infrastructure, understanding the common metrics of your industry, gently guiding your colleagues, and planning to plan.
As a reminder, Data Columns is a biweekly community-led newsletter by me (Pedram Navid!) that brings practical advice about leveling-up as a data practitioner. My focus is bringing relevant, topical advice on practical matters that data people care about to your inbox.
If you want to know more about the newsletter, or want to contribute, drop me a line on Twitter or by email.
If you've followed the advice for the first two weeks, you've spent some time better understanding the business, the product, and your colleagues. You've talked to your manager about expectations for the role, and you've dug into the pain points people are feeling. The next step is to define your team's purpose in order to be able to set the foundational infrastructure to help deliver the most value you can, as quickly as possible.
A Data Team's Purpose
At this stage, your job is really to figure out three things: what do I build, when, and in what order? These questions center around what the output of your team is, how to prioritize competing demands on your time, and how to estimate the time it will take to build the various pieces. Even if your job title isn't Data Team Manager, you will often be acting as one. Management comes in various flavours, and a lack of a direct report doesn't absolve you of the same responsibilities other managers tend to face.
If we think about a data team in terms of a black box with inputs, work, and outputs, then it might look something like this:
I like to start with this view of a data team as it helps clarify what their purpose is. We can debate whether Insights, Knowledge, or Shared Context is the right output, but the idea is to have a clear mission for your team, as this will help prioritize work to maximize the leverage of your team.
This is a lot harder than it sounds. I asked on Twitter what a Data Team's output is, and I think all 71 comments were different answers.
If I had to pick a single best answer, it would be Katie Bauer's response:
"a shared context across the organization" (Katie Bauer, @imightbemary, March 8, 2022)
What I love about this is that it really highlights the value of a data team as mediators of an organization. I highly encourage diving into that thread. There's so much great insight there that it's even forcing me to rethink what I thought my purpose was on my team right now.
Defining the purpose of your team also goes back to the work you've done in the first two weeks. Speaking with your stakeholders and validating expectations for your roles can help clarify what value you can bring to the organization. Remember that it's very common for people to not know what they want and that there's always room for negotiation in terms of scope and responsibilities. As your maturity grows in your career, you will often find that decisions become less top-down and start to move toward a push-and-pull model as you and your manager continually refine what the best use of your time will be.
Only with your team's purpose in mind will you be able to start planning out the next steps. Neal Coleman acutely observes that being able to prioritize and zero in on the highest leverage work is critical in your early days.
If your team's raison-d'être is, for example, to deliver insights that help the company make better decisions, then your first steps should be finding one piece of work that will help deliver on that promise.
Through the interviews and discussions you've had, you'll likely have identified several pain points and one will stand out as the most obvious one to fix. One example from my own career was the lack of a data warehouse at a company that was relying heavily on Excel spreadsheets to answer fundamental questions about the business. The pain point was clear: the manual process was brittle, hard to reason about, and quickly becoming impossible to update. The issues began early in the pipeline, with data being queried directly from a production database, and an ever-evolving schema coupled with manual data entry. The complex business logic and transformations required no longer fit the existing tooling.
The spreadsheet itself was massive, and replacing the entire thing would've been too large an undertaking. Instead, I focused on the path that would lead to the greatest leverage. I narrowed my scope down to only the portions that could be readily automated. My goal was to build a pipeline that would ingest production data into a warehouse, and to migrate the bulk of the transformations in the Excel spreadsheet to version-controlled dbt code. I didn't plan on replacing the spreadsheet with Looker, or finding a way to automate all the manual work; that was work that could be done later.
By focusing on automating the ingestion and analysis of production data, I knew I would maximize my leverage, as that data would be useful across the organization and would serve as the foundation for everything else. Getting that right is how I built trust across the organization, so that I could get more resources and help when I needed it later.
Don't just listen to me. Kelly Burdine and Emilie Schario both echo the same sentiment, and they're far smarter than I could ever hope to be. It can be easy to get caught up in the trap of over-optimizing early on. If there's one thing I know for certain, it's that the work never ends. Be strategic about what you focus on. Your time is the only resource you cannot buy more of, so remember that every time you say yes to one thing, you implicitly say no to a universe of other things you could work on.
As you begin planning out your initial project, make sure you understand what industry best practices are for measuring performance. For example, if you're focused on sales and revenue, you'll want to figure out what MRR is and how to measure it. If your focus is on product, map out a minimal event taxonomy and start measuring product metrics.
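To make the MRR example concrete, here is a minimal sketch of how it might be computed. Everything in it is an assumption for illustration: the `Subscription` record, its fields, and the sample data are hypothetical, and I'm using one common convention (annual plans are divided by 12, and a subscription counts only while it is active). Your company's definition may differ, so validate it with Finance before you ship a number.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Subscription:
    """Hypothetical subscription record; field names are illustrative."""
    customer: str
    amount: float                # price charged per billing interval
    interval: str                # "month" or "year"
    start: date
    end: Optional[date] = None   # None means the subscription is still active


def monthly_amount(sub: Subscription) -> float:
    """Normalize a subscription's price to a monthly figure."""
    if sub.interval == "month":
        return sub.amount
    if sub.interval == "year":
        return sub.amount / 12
    raise ValueError(f"Unknown billing interval: {sub.interval}")


def mrr(subs: list, as_of: date) -> float:
    """Sum the monthly value of every subscription active on `as_of`."""
    return sum(
        monthly_amount(s)
        for s in subs
        if s.start <= as_of and (s.end is None or as_of < s.end)
    )


# Made-up sample data, not real customers.
subscriptions = [
    Subscription("acme", 100.0, "month", date(2022, 1, 1)),
    Subscription("globex", 1200.0, "year", date(2022, 2, 1)),
    Subscription("initech", 50.0, "month", date(2022, 1, 15), end=date(2022, 3, 1)),
]

print(mrr(subscriptions, date(2022, 2, 15)))  # prints 250.0
```

Even a toy version like this forces useful questions early: do trials count, how are discounts handled, and on what date does a cancelled plan stop contributing?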
There are plenty of great resources available for understanding the key metrics across any type of industry. Senovo has a good one for B2B SaaS, and UX Collective has a great guide to product metrics.
Now's also a great time to find communities centered around the work you do. #measure is a great place to go for digital marketing, Locally Optimistic is a wonderful Slack community for anyone working in data, and Wizards of Ops serves the RevOps community.
The idea here is to broaden your horizons outside of the world of data, and to better understand the world of the people you serve. If you're wondering why this wasn't part of the initial two weeks, I've personally found it hard to digest information that is too abstract and removed from the work I'm used to. If I were to join a RevOps community, for example, without having anyone on a RevOps team to talk to, I'd find myself overwhelmed and unable to connect what I'm reading with how it might apply to what I'm working on. But if I'm trying to solve a particular use case, it's much easier for me to dive into a particular area of focus, as the context for what I'm reading becomes clearer.
Create Avenues for Questions
Undoubtedly as you start to build and improve the fundamental infrastructure required for the data team, you'll start to be able to answer questions from your colleagues in ways that they did not have access to before.
At first, this might look like direct requests to you whenever a question occurs. If you're not getting questions, this is often a sign that people aren't really sure how best to engage with you, so it's worth encouraging your colleagues. When I do get questions, I make a point of thanking people for asking me. If they ever say anything that suggests they're worried about bothering me, or that their questions are dumb, I make it very clear that answering their questions is my job, which encourages more questions. The last thing I want is people feeling like the data team is a mystical team that shall not be bothered unless there's an emergency.
Soon, you'll want to make the avenues for question-asking more explicit. I have a #help-data Slack channel where I invite questions to be asked in public. This helps build a culture of question-asking and curiosity that is really vital to develop as you build out a data function. If people aren't excited by the potential of a data team, you'll have a hard time building out a data function. There will come a time where the questions might have to be more formalized – you may ask for forms, or templates, or tickets – but hold off on that for now. Your goal is to show value, and making it easy to talk to the data team is the first step.
That's not to say every question deserves an answer. Some may be so difficult that they're not worth answering immediately. But questions serve two purposes: they're a means for people to get answers and, more importantly, they serve as information to you about what concerns the people you work with. That feedback is invaluable, as it can help shape the scope of work you're planning to accomplish. If you're getting question after question about product usage, then you have a very different task ahead of you than if every question is about revenue and churn.
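One lightweight way to turn those questions into a signal is to keep a simple log and tally it by topic. The sketch below assumes a hypothetical hand-kept list of (date, topic) pairs; the topics and dates are made up, and a spreadsheet would do the job just as well.

```python
from collections import Counter

# Hypothetical log of questions asked in a help channel: (date asked, topic).
questions = [
    ("2022-03-01", "product usage"),
    ("2022-03-02", "revenue"),
    ("2022-03-04", "product usage"),
    ("2022-03-07", "churn"),
    ("2022-03-08", "product usage"),
]

# Count how often each topic comes up.
counts = Counter(topic for _, topic in questions)

# The most frequent topic is a hint about where to focus next.
top_topic, top_count = counts.most_common(1)[0]
print(f"Most asked about: {top_topic} ({top_count} of {len(questions)} questions)")
```

The point is not precision. A rough tally is enough to tell you whether the next few weeks should lean toward product data or revenue data.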
Plan Your Attack
Once you've defined your team’s purpose, started some groundwork on the fundamentals, built an understanding of the industry, and started soliciting feedback internally, your last step for these two weeks is to plan your attack.
Planning can sometimes be a daunting word, but I like to approach the planning portion of my job with a very simple 2-column table. On the left side, I describe the current state. On the right side, I describe the desired state six months or a year from now. The focus is less on what is achievable and more on what is desirable for this first step.
For example, the current state might be:
- we have very simple reporting and analytics on common business concepts like users and product usage.
- we have no self-serve or business intelligence tools, all outputs are via Google Sheets and SQL queries
- we only support one functional team, which is sales
Our desired state a year from now might be:
- more complex reporting and analysis on outcomes like churn, retention, and daily active users
- a self-serve functionality, or at least a BI tool that allows for automated reporting
- supporting two more functional teams, marketing and product
Once the present and future states are identified, the planning step becomes a lot easier. What will it take to get where you need to be in a year? What are the constraints preventing us from getting there? Are they resource-based? Do we need better tools? More headcount? Better support from the engineering teams?
With the constraints identified, you now can develop a plan and validate this with your manager. If you need an additional analytics engineer, it means working with recruiting to build out a Job Description and hiring process. If you need to buy Looker, it means working with Finance to somehow convince them that a $40,000 license for charts is money well spent. All of these things take time, and the planning process is there to help you better understand when to tackle each step.
Open the Black Box
While the output of your team is what you want to maximize, you'll need some indicators that will help guide you day-to-day.
If we assume again that our output as a data team is insights, then what are the activities that a data team performs that generate insight? There's the obvious, such as reports and analysis – but there's also models, documentation, tests, queries, and even meetings. Here is where metadata from your various systems can become really powerful.
The goal here isn't to create a dashboard on every activity that your team performs, but rather to think deliberately about what are the indicators you can use to determine the health of your team. Don’t burn yourself out trying to capture every nuance, but instead see what quick-wins you can identify for yourself to better understand your team.
For example, maybe documentation and test coverage on your models is dropping, even while the number of models is increasing. This might be an important indicator that your team is over-worked, or facing tighter deadlines. Maybe the number of sources and rows you've ingested has grown 10x over the past 3 months, and the number of teams you support has doubled. This could be a sign that it's time to think about hiring another analyst.
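As a rough sketch of what tracking an indicator like this could look like, the example below compares documentation and test coverage across two snapshots of your models. The record shape (`name`, `description`, `tests`) and the snapshots themselves are hypothetical; in practice you'd derive something similar from whatever metadata your own tooling exposes.

```python
def coverage(models: list) -> dict:
    """Summarize documentation and test coverage for a list of model records.

    Each record is a hypothetical dict with a `description` string and a
    `tests` count; the shape is an assumption made for this sketch.
    """
    total = len(models)
    if total == 0:
        return {"models": 0, "documented_pct": 0.0, "tested_pct": 0.0}
    documented = sum(1 for m in models if m.get("description", "").strip())
    tested = sum(1 for m in models if m.get("tests", 0) > 0)
    return {
        "models": total,
        "documented_pct": round(100 * documented / total, 1),
        "tested_pct": round(100 * tested / total, 1),
    }


# Made-up snapshots: the model count doubles while coverage slips.
january = [
    {"name": "users", "description": "One row per user", "tests": 2},
    {"name": "orders", "description": "One row per order", "tests": 1},
    {"name": "sessions", "description": "One row per session", "tests": 1},
    {"name": "events", "description": "", "tests": 1},
]
march = january + [
    {"name": "invoices", "description": "One row per invoice", "tests": 0},
    {"name": "refunds", "description": "", "tests": 0},
    {"name": "plans", "description": "", "tests": 0},
    {"name": "trials", "description": "", "tests": 0},
]

print("January:", coverage(january))
print("March:  ", coverage(march))
```

In this made-up data, documentation coverage falls from 75% to 50% and test coverage from 100% to 50% as the number of models doubles, which is the kind of trend worth raising before it becomes a problem.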
Decide what's important to you, and generate some internal reports for yourself. These don't need to be public; odds are no one else cares about these indicators, but they can be really useful when an important conversation comes up at your next one-on-one.
Consider how "In the last six months, the number of models I support has increased from 50 to 400, and I'm fielding about 3 questions a week for new data requests. I think I need a raise, an analyst, and a bonus" compares to the much tougher conversation: "I'm really busy, give me more money."
This should be more than enough work for your third and fourth weeks. At this point, you’re likely moving from excitement to stress and anxiety. Congratulations, you’re well on your way to becoming a better data leader.
In our next post, we’ll dig deeper into the build, iterate, test cycle and explore some of the thorny edges that surround data teams. Stay tuned, and as always, let me know what you think. I’d love to hear from you!