Webinar Transcript: What Does a Data Scientist Do in the Working World?
The following is a raw transcript of my webinar for the students of the Royal Institute School on behalf of the school’s Career Guidance Unit.
Hi everyone and thank you for that introduction.
I’m aware that the majority of the audience today is O-Level and A-Level students so I’m going to do my very best to keep things simple but informative and hopefully to keep you somewhat entertained as well.
There’s a lot that I can talk about on this topic but what I’ll be doing today is, instead of diving deep into one or two specific examples, I’ll give you a few high-level ways of thinking about data science that you can then use to figure out how you can carve out a role for yourself as a professional with data science skills in the business world.
Please feel free to ask questions throughout the webinar, drop them in the chat, and I’ll get to them at the end.
Since this webinar is on the working world, I figured let’s start with why I work in data:
I spent my late teenage years and my early 20s doing a lot of exploration. I worked in market research, I worked in radio, I worked as a voiceover artist, I was an entrepreneur, a digital marketer, a blogger, a salesman, and the whole aim of this mess of affairs and roles, at least in my mind, was to get exposure to as many different problems as possible, so that I could figure out which ones fascinated me enough that I would want to work on them for the rest of my life.
Now I had some specific criteria for what problems are fascinating to me:
- I wanted to wear many different hats or work in many different roles while solving the problem. In other words, I did not want any two days to be the same. I did not want to live a life of monotony where you do the same thing every single day. I wanted the freedom and flexibility to be able to shift into another role that’s different enough to engage me but still relevant to my experience.
- I wanted the problems to require the use of both sides of my brain — the creative right and the analytical left. I knew that the field that would fascinate me would be that which requires using both science and art to solve its problem.
- And finally, and perhaps most importantly, I wanted to work on problems that are fascinating to everyone, not just to those working in the field. I didn’t want people to turn around and walk away when I start talking to them about what I do. I wanted them to be as fascinated as I am about the amount of potential and possibilities that could be unlocked by solving these problems. Because in all honesty, that’s where the money is. In our digital world, the money is where everyone’s attention is.
Naturally, this led to me stumbling upon data science and data analytics, which checked all of these criteria.
Just so we’re all on the same page because I know there may be students here who don’t exactly know what data science is: very simply put, data science involves extracting insights from raw data, for example patterns and trends, to be used for decision making either by people or by machines.
For example, if you’ve ever bought anything from any major e-commerce store like Amazon or Daraz, you would’ve seen a section titled “customers who were interested in this product, also bought these other products”, so if you were interested in buying a bicycle of a certain premium brand you’d see premium bike seat covers or a high-end bike lock in the recommendations.
What’s happening behind the scenes is that a machine is looking over very large volumes of transaction data, finding patterns in what customers purchase together, and then automatically recommending related products that you might find helpful. And it’s doing this because the company obviously wants you to buy more so they can make as much money from you as possible.
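To make that a little more concrete, here is a minimal sketch of the simplest version of the idea: counting which products show up in the same order. The orders, the product names, and the helper functions `count_pairs` and `recommend` are all made up for illustration, and real e-commerce recommenders are far more sophisticated than this.

```python
from collections import Counter
from itertools import combinations

# Hypothetical orders: each set is one customer's basket (made-up data).
orders = [
    {"bicycle", "bike lock", "seat cover"},
    {"bicycle", "bike lock"},
    {"bicycle", "helmet"},
    {"bike lock", "helmet"},
]

def count_pairs(orders):
    """Count how many orders contain each pair of products."""
    pair_counts = Counter()
    for order in orders:
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(order), 2):
            pair_counts[(a, b)] += 1
    return pair_counts

def recommend(product, orders, top_n=1):
    """Return the products most often bought together with `product`."""
    scores = Counter()
    for (a, b), n in count_pairs(orders).items():
        if a == product:
            scores[b] += n
        elif b == product:
            scores[a] += n
    return [item for item, _ in scores.most_common(top_n)]

print(recommend("bicycle", orders))
```

In this toy data the bike lock appears alongside the bicycle more often than anything else, so it is what gets recommended.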
That’s an example of what you can achieve with data science but the really neat thing about data science is that it is so broad it cuts across every single function of an organization and every single industry and appeals to different types of people with different skillsets and different personalities.
Now these are three things I want to expand on.
Let’s start with the first – cutting across every function and several industries.
Every business in the world is made up of teams of people with certain roles and responsibilities:
- You have management (the CEOs, COOs, CFOs, the strategists and planners) who are responsible for steering the ship.
- You have sales and marketing which are responsible for generating business.
- You have finance that’s responsible for ensuring the company stays afloat.
- You have operations or engineering or manufacturing that’s responsible for designing, creating, and delivering products or services.
- You have legal whose job is to ensure the company isn’t breaking any laws or putting itself at risk of being sued.
- You have HR who needs to make sure there are enough people with the right skills employed by the company.
- You have customer service that needs to help customers with their problems and ensure they’re happy.
- You have IT which needs to make sure all the systems are kept alive and everyone has the tools they need to do their jobs.
I could keep going but these are largely the main functions of most organizations and data science is extremely useful and relevant to every single one of these functions. Let me give you some examples:
- Using data science, management could figure out which regions or markets are the most profitable to expand into because those are really expensive decisions where you can’t afford to make a mistake.
- Using data science, the marketing team can test thousands of different types of advertisements and then spend significantly more on the ones that customers respond to the best thereby maximizing the performance of their campaign.
- Using data science, the finance team can predict if the company is on its way to achieving its revenue and profit targets for the quarter and alert management if it’s not.
- Using data science, the operations team can predict how many orders the company might get in the near future so that they can make sure there are enough people, machines, and warehouse space ready to meet the demand.
- Using data science, the legal team can very quickly find past cases similar to whatever legal case they’re facing right now so they spend less time researching and more time strategizing and planning on how to win.
- Using data science, HR can automatically reduce the thousands of CVs they’ve received for a certain role to just the top 10 that meet all the criteria required for the role.
- Using data science, customer service can proactively reach out to customers who’ve recently bought but are not using their product so they can ensure they don’t regret their purchase.
- And using data science, IT can monitor the usage of systems so they can get rid of systems that aren’t in use or shut them down and save some money.
What this means for data scientists is that you have a buffet of work to choose from. You could be a generalist and work across all these functions or you could be a specialist in, for example marketing analytics or operational analytics, or you could keep jumping around whenever you get bored of one thing. That’s the beauty of data science, there are so many applications within the business world and beyond it.
My advice: you’re probably already interested in one of these things, learn it and then pair it with knowledge of data science. Even if you don’t end up being called a data scientist, possessing data science skills along with expertise in a certain domain will put you a notch above the rest.
We’ve looked at a range of functions and how data science cuts across all of them. It’s easy to imagine how data science cuts across various industries too. Instead of generic examples across a ton of different industries or the stuff you’ve probably heard about like self-driving cars and highly targeted advertising, I’d like to talk about 3 slightly unorthodox uses of data science.
- The first is farm analytics. This is where data science is used to better manage farms. Think of drones carrying cameras using computer vision to keep track of herds of sheep and cows on a very large farm. Think of sensors being used below and above the land to continuously measure the health of soil and crops and then automatically trigger water and fertilizer dispensers if needed. A lot of this stuff saves farmers a lot of time and labour and it helps them focus on growing more nutritious crops at cheaper prices that eventually benefit all of us.
- The second is sports analytics and I’m going to use Formula 1 as an example. Formula 1 cars have over 300 sensors that generate nearly 1.5 million data points per second. That’s a lot of data and so they need to use high performance computing solutions to crunch all those numbers and put them in formats and visualizations that the engineers and drivers can easily understand. The engineers then use this information to make changes to the configuration of the car, and the drivers use it to see which parts of the track they need to improve on, for example, where they’re braking too much or too little. Because even if they can shave off half a second in their performance, in Formula 1 that can mean the difference between 10th place and 1st place.
- The third is entertainment analytics. There are fascinating case studies about how movie studios are using computer vision to continuously monitor the emotions and engagement of an audience during a test screening of a movie. The program can tell when people are laughing, interested, or just bored based on their facial expressions and body language. So, for example, if the program discovers that nobody laughed after a joke in the movie, the scriptwriters can rewrite that part of the movie and re-test it. That way, using data science, movie studios can create movies that are highly entertaining from start to finish.
Now this is one way of looking at what you could do as a data scientist in the working world. There is another and that takes me back to the other point I wanted to expand on — appealing to different types of people with different skillsets.
Let’s look at a typical data science workflow:
- We always start with a question we want to answer or a problem we want to solve. For example, which of our customers should we focus on? We have 100,000 but we want to identify the ones that we should spend the most time with and who’ll generate the highest return for our business. Remember companies don’t have infinite resources, there’s only a set amount of money and time you can spend and you need to prioritize.
- Then we need to collect data relevant to the question. We need to extract all our customer data from all the systems we’re using in this company and we need to figure out how to put it all together so we’re working with datasets that give us a complete picture of who our customers are and how they’ve interacted with our business.
- Then we need to understand and explore this data. Who are our largest customers? Have our biggest customers changed positions over the years? Which data about our customers have issues? How should this data be cleaned?
- Once we have a good understanding of the data we’re working with and the general patterns and trends, we can begin to think of how we can create a model to predict which customers could potentially generate the most revenue for the company over the next 3 months.
- And then finally, we’ll want to communicate these findings to management and the sales and marketing teams so they can start acting on our findings. We’ll want to create a dashboard for example so everyone can freely explore our findings and dive deeper into the data if necessary.
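As a rough illustration of the steps above, here is a minimal sketch that walks through collecting, cleaning, predicting, and communicating on a tiny made-up dataset. The customer IDs, names, spend figures, and function names are all hypothetical, and the "model" is only a naive baseline that assumes the next 3 months will look like the recent average. It is not a machine learning model.

```python
from statistics import mean

# Step 2 (collect): hypothetical records from two made-up systems.
# Each billing entry is monthly spend for the last 3 months; None = missing.
billing = {"C001": [120, 150, 180], "C002": [900, 850, None], "C003": [40, 45, 50]}
crm = {"C001": "Asha", "C002": "Bimal", "C003": "Chen"}

def clean(monthly_spend):
    """Step 3 (explore and clean): drop missing months."""
    return [x for x in monthly_spend if x is not None]

def forecast_next_quarter(monthly_spend):
    """Step 4 (model): naive baseline, 3 times the recent monthly average."""
    return 3 * mean(clean(monthly_spend))

def rank_customers(billing, crm):
    """Join both sources on customer ID and rank by forecast revenue."""
    rows = [(crm[cid], forecast_next_quarter(spend)) for cid, spend in billing.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)

# Step 5 (communicate): a plain text table standing in for a dashboard.
for name, revenue in rank_customers(billing, crm):
    print(f"{name:<8}{revenue:>10.2f}")
```

Even in a toy like this, the cleaning step matters: how the missing month for the second customer is handled changes that customer's forecast.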
What’s interesting is that you could be a data scientist or a data professional that works across the entire workflow and usually in smaller organizations you’re going to have to do that. But in a large organization, you’ll find people specializing in a single aspect of this workflow.
- For instance, it’s generally a data-savvy management professional or a strategist that comes up with this question and identifies it as one of the most important things that a company should answer.
- It’s a data engineer working with systems experts, database administrators, and cloud engineers to figure out how the data should be extracted from systems, transformed, combined, and stored.
- It’s generally a business and data analyst that explores the data, identifies and rectifies its issues and they do this because they have a very good understanding of how the business works and they’re able to tell what’s good and representative data from what’s bad and inaccurate data. A data engineer could help here as well.
- Then, and most people find this the more interesting part, it’s the machine learning model developers who decide between all the models they know of to figure out which one would get the job done best.
- And then, it’s the dashboard developers who prepare the reports and dashboards and tell stories to the management and sales and marketing team to convince them that this model is working well and according to its predictions this is who the company should focus on and if they do so, the company could earn 2 times or 3 times more revenue. These dashboard developers could be the business and data analysts as well since who better to tell stories than someone who understands the business deeply. There tends to be some overlap between these roles.
- And of course, let’s not forget that there will be a data science-savvy project manager overseeing the entire workflow and ensuring the work is getting done according to the set budgets and timelines.
Now there’s a lot of different stuff you can be, how in the world do you decide between all of these roles? This takes me to the final thing I wanted to touch upon, and that is data science appeals to different types of people with different personalities as well.
Let’s simplify your personality and let’s reduce it to two factors. One of those factors is how much you like working with people, the other is how much you like working with technology. There will be some of you, like me, that like working with both — a healthy amount of interactions with customers and management teams as well as interactions with systems and code.
Given that our physical and digital worlds are so tightly integrated, you can’t really choose one and completely avoid the other but you may have a preference or a bias towards one.
So, wherever you may put yourself in this Venn diagram, there will be a role in data science for you.
- If you primarily enjoy working with technology, then you could consider roles like a machine learning engineer or a data engineer. These are really tech heavy roles. As a machine learning engineer, you spend time building models and algorithms and helping the company integrate them into its products or services. As a data engineer, you spend time connecting systems, making sense of the organization’s data, and making sure that data is flowing the right way through the organization so it’s available for analysis.
- If you primarily enjoy working with people and overseeing them and interacting with them and organizing their work, you could consider a role like a data project manager or roles in data product sales, which involve establishing a deep understanding of a company’s data product or service and then convincing potential customers to purchase it by telling them how it could solve their specific problems.
- And if you enjoy working with both people and technology, you could consider roles in business and data analytics. These analysts are the middlemen between the more technical data resources within a company and the less technical business resources; they’re the translators and the storytellers who help decision makers make data-driven decisions. You can consider roles in business intelligence too. BI specialists don’t just figure out what kinds of dashboards or analytics business leaders need to do their jobs, they also make sure business intelligence tools are configured correctly and the data powering these dashboards is accurate, relevant, and up-to-date.
This is an illustrative example and there may be some of you on the call who’ve got some experience or knowledge and completely disagree and believe that a project manager should sit in the middle or a data analyst should sit on the right or the lines between a data analyst and a BI specialist are really blurry.
The thing is, different companies do things in different ways and because this is an emerging field the world really hasn’t reached a consensus on where to draw the lines between many of these roles. The key takeaway here is that you can do a lot more with data science skills than just be a guy or girl that builds machine learning models day in and day out.
With data science skills you can carve your own path in the business world.
Before I stop for questions, I want to leave you with three things that I sincerely believe every aspiring data scientist should learn. This is regardless of your personality type because, from experience, they’ve been incredibly beneficial to my career.
- The first is learning to tell stories. We often get lost in the rational and logical aspects of data science because, well, it’s got science in its name. But at the end of the day you want to use insights to convince someone to make a certain decision and what works is not raw facts and figures but emotional stories. This can be in the form of a speech or an interactive visualization or a web application. Whatever it is, use data to speak to people’s hearts, not their minds.
- The second is learning to embrace mistakes. This is a field that’s growing insanely fast. There is so much I don’t know, there’s so much I need to learn, so naturally I make mistakes. I do the wrong things and I say the wrong things, but you need to be alright with not being perfect. It’s a lifelong learning journey and if you don’t let your mistakes get to you, the journey can be very rewarding.
- The third is learning to empathize. By empathize, I mean putting yourself in the shoes of other people. You don’t necessarily need to be working in an office to do this, you can start by doing it with your friends at school. Whenever they’re facing a problem, try and understand how they feel and why they want that problem solved. If you can do this for your friends, you can do it for your colleagues and your customers and that’ll be very helpful in figuring out how you can use data science to actually enrich people’s lives in the working world.
Commence Q&A Session
Thank you for your time today, I hope this was helpful. Feel free to connect with me on LinkedIn, just look for Stefan Jaro.
If you need more information about data science, how you can get started in the field, how RI or RIC can help you, feel free to reach out to the RIC Career Guidance Unit. They’ll set you up for a career of success.
Thanks again and have a great Friday evening.