Many aspiring data analysts, and others who want to use data better, have likely Googled “How to start learning data analysis”, only to be confronted with millions of search results suggesting everything from tools such as SQL, R, Python, Tableau, and PowerBI to classes from DataCamp, DataQuest, Udacity, Coursera, and Galvanize. While this information is useful, the sheer amount of it is overwhelming.
I am a largely self-taught data analyst: I studied economics in college, but saw the writing on the wall that every organization needs and uses data, and that working with it is an in-demand skill. Here are my suggested first five steps to help you clear a path through the jargon and buzzwords and get to the data.
Understand the process of data analysis
Learn statistics
Maximize your prowess with Excel
Learn a data visualization tool like Tableau
Consider a scripting language, like Python or R
1. Understand the principles and process of data analysis
At its core, data analysis is about understanding and decisions. How can we understand what is happening, and how can we leverage what we know to draw better conclusions and take action? The path to a great decision is not linear; it requires an iterative process. Create a hypothesis, collect data, clean data, explore data, review hypotheses, test them, explore some more, create new hypotheses, repeat, repeat, repeat.
For practice, take a look at data analyses on sites such as Towards Data Science to understand how analysts approach the process and what questions they ask of a dataset. Before reading the article, try looking at the data and thinking about what kinds of questions you would ask. Then consider how your questions differ from the ones the author asked, or how the author went about the process differently.
This will help you to develop the analytical mindset, which is the most important tool for data science/analytics.
2. Learn statistics
Statistics, not any particular tool or technique, is at the heart of data analysis. Data analysis is about taking a hypothesis that we hold and confirming whether or not it is true, and under what conditions.
For example, students who attended the summer after-school program appear to get better grades than students who didn’t attend, but how do we know that this is not just due to something else, such as their personal characteristics or the way they were selected for the program? How do we know for sure that the difference is attributable to our summer after-school program? For this, we turn to statistics.
An understanding of descriptive statistics and inferential statistics, and some probability theory, is a great place to start. Khan Academy’s videos on statistics and probability are some of the most accessible videos that I have found online for these topics.
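To make the after-school program example concrete, here is a minimal sketch of one inferential technique, a permutation test, written in Python with only the standard library. The grades are made up for illustration, and the function name `permutation_test` is my own. The test asks: if attending made no difference, how often would randomly relabeling the students produce a gap in average grades at least as large as the one we observed?

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Return (observed difference in means, one-sided p-value)."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_permutations):
        # Randomly reassign the "attended" label across all students
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_permutations


# Hypothetical grades, invented for this example
attended = [88, 92, 79, 85, 90, 87, 91, 84]
not_attended = [82, 75, 80, 78, 85, 77, 81, 79]

observed, p_value = permutation_test(attended, not_attended)
print(f"Difference in mean grade: {observed:.2f}")
print(f"One-sided p-value: {p_value:.4f}")
```

A small p-value suggests the gap is unlikely to be due to chance alone. It does not rule out the other explanations raised above, such as how students were selected for the program; that takes careful study design, not just a test.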
3. Maximize your prowess with Excel
Although many people are quick to bash Excel, it is truly an incredibly powerful tool, and it has a much gentler learning curve than other tools like Python or R. Learning how to create macros, use VLOOKUP/HLOOKUP, create Pivot Tables, build dashboards, and utilize the Analysis ToolPak will take you a long way with Excel. Even if you eventually “graduate” to another tool like R, Excel will still be a valuable part of your toolkit for when you need to do quick data entry and manipulation. And almost every company has Excel available, giving you access to a powerful data analysis tool whether you’re in a big corporation or a tiny non-profit.
4. Learn a data visualization tool
PowerBI and Tableau are the two most popular business intelligence tools. Designed for use in a broad range of business analytics contexts, they offer more capabilities than Excel, and make it easy for users to connect to data sources, perform ad hoc analyses, and create interactive visualizations and dashboards. Humans are visual by nature and can often grasp concepts much more quickly through visualizations. Check out how easily information can be communicated with Tableau and PowerBI through these sample dashboards here and here.
If you have Windows, you may already have access to PowerBI through your Microsoft Office Suite. Tableau has a free version called Tableau Public that you can download. Like Excel, these tools are becoming standard in many organizations, so understanding how they work and using them to tell your data story can be invaluable.
5. Consider Python or R
Once you’ve gotten the hang of data analysis, it may make sense to try a tool like Python or R. These languages make your results reproducible, can handle much larger datasets than Excel, and open up more advanced capabilities and analysis such as machine learning.
The Internet is awash with arguments about which one you should start with. If you’re interested in learning how to program in the future, Python may be the better option since it is a general purpose programming language, and you can use it for web scraping, creating apps, and more. But if you’re only interested in data analysis, then either one can work for you.
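To give a feel for what reproducibility means in practice, here is a small sketch in Python using only the standard library. The data, column names, and the function `average_grade_by_group` are hypothetical; in a real project the text would come from a file rather than a string. The script does roughly what a simple Pivot Table does, but every cleaning and calculation step is written down, so anyone can rerun it and get the same numbers.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical data; normally this would be read from a CSV file on disk
RAW_CSV = """student,program,grade
A,attended,88
B,attended,92
C,not_attended,75
D,not_attended,
E,attended,79
F,not_attended,82
"""


def average_grade_by_group(csv_text):
    """Return a dict mapping each program group to its average grade."""
    grades = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if not row["grade"].strip():
            continue  # cleaning step: skip rows with a missing grade
        grades[row["program"]].append(float(row["grade"]))
    return {group: mean(values) for group, values in grades.items()}


summary = average_grade_by_group(RAW_CSV)
for group, average in sorted(summary.items()):
    print(f"{group}: {average:.1f}")
```

Because the rule for handling the missing grade lives in the code rather than in a manual spreadsheet edit, the choice is visible and easy to change later.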
These five tips might not make you a data analysis expert at first, but they’ll set you on the right path to be more effective and efficient in using data. The most important part is to just get started!
Want more help getting started with data analysis? Check out the DigitalC Learning Studios Data Analytics Workshop, offered twice this fall! We cover the data analysis process, Excel, and Tableau in a 3-day workshop.
October 10-12 Workshop