How to Build a Data Analyst Portfolio
Here are 5 must-do actions that can help you build a great portfolio for a data science role
December 16, 2022
15 minutes to read
"Data science is a combination of three things: quantitative analysis, programming and narrative. Quantitative analysis refers to the rigor required to understand your data. Programming allows you to process your data and act on your insights whereas narrative means to comprehend what the data means)." — Edwin Chen
It can be scary trying to build a great data analyst portfolio. There are tons of concepts you have learned and plenty you haven’t. There are a million project ideas all around and you aren’t sure where to begin. So, to get you started I have compiled some tips for you!
1. Explore Kaggle Datasets for Beginners
Kaggle is a go-to site for many enthusiasts to find data and begin their analysis. Pick an area of interest such as health care, finance, movies, or food. Download the data, clean it, produce some visualizations and summary statistics, and there you go: you've just done a data project.
If Kaggle looks overwhelming, start with your own personal data, or conduct your own survey and ask people to fill out a questionnaire. Next time, step it up a little and look online for a dataset. Go with something you are interested in, such as video games, sports, or movies, and set yourself a question to answer. This might not feel like a major thing to show recruiters, but you'll be building the skills and confidence to move on to more involved projects.
Pro-tip: Once you have worked through a few of the available datasets, try competing in one of Kaggle's popular competitions to get a real-world feel for building things from scratch. 📊
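To make the "download, clean, summarize" loop concrete, here is a minimal sketch using only the Python standard library. The inline CSV, its column names, and the helper functions `load_ratings` and `summarize` are all made up for illustration. In a real project you would read the file you downloaded, most likely with a dedicated data library, and use a proper chart instead of the text bars.

```python
import csv
import io
import statistics

# Hypothetical stand-in for a downloaded CSV file; a real project would
# open the file from disk instead of using this inline string.
RAW_CSV = """title,genre,rating
Movie A,Drama,7.5
Movie B,Comedy,6.0
Movie C,Drama,8.5
Movie D,Comedy,
Movie E,Drama,9.0
"""


def load_ratings(text):
    """Read rows and keep only those with a usable numeric rating."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rating = row["rating"].strip()
        if not rating:  # skip rows where the rating is missing
            continue
        rows.append(
            {"title": row["title"], "genre": row["genre"], "rating": float(rating)}
        )
    return rows


def summarize(rows):
    """Return the count, mean, and median rating for each genre."""
    by_genre = {}
    for row in rows:
        by_genre.setdefault(row["genre"], []).append(row["rating"])
    return {
        genre: {
            "count": len(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
        }
        for genre, values in by_genre.items()
    }


rows = load_ratings(RAW_CSV)
summary = summarize(rows)
for genre, stats in summary.items():
    # A crude text bar stands in for a real visualization.
    bar = "#" * round(stats["mean"])
    print(f"{genre:<8} n={stats['count']} mean={stats['mean']:.2f} {bar}")
```

Even a tiny script like this gives you something to talk about: why one row was dropped, and what question the per-genre summary answers.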
2. Get your hands dirty with data cleaning
Getting started is the hardest part, but you'll never have impressive portfolio projects to show off if you don't start small. When looking online for a dataset, you'll find a number of freely available ones at the Registry of Open Data on AWS. These datasets are also compatible with AWS cloud computing resources. The best part is that you can find something unique, whereas with Kaggle you might be repeating work others have already done.
Pro-tip: Open data from government sources is extremely useful. These datasets are often messy and unstructured, and you will need to spend time processing and cleaning them into a usable form. You will face similar circumstances in the real world too. So get your hands dirty and fully engage in the data cleaning process, which can often take up the majority of your time! 📈
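As a rough illustration of what that cleaning work can look like, the sketch below tidies a small, invented export: it normalizes headers, trims whitespace, converts number strings such as "1,200" to integers, marks placeholder values as missing, and drops duplicate rows. The data, the `MISSING` markers, and the `clean_rows` helper are assumptions for the example, not features of any particular open-data portal.

```python
import csv
import io

# Hypothetical messy export: padded headers, inconsistent casing,
# thousands separators, a placeholder for missing data, and a duplicate.
RAW_CSV = """City , Population ,Year
 Springfield ,"1,200",2020
springfield,1200,2020
Shelbyville,N/A,2020
 Ogdenville ,"3,450",2021
"""

# Strings treated as "no value" (an assumption; adjust per dataset).
MISSING = {"", "n/a", "na", "-"}


def clean_rows(text):
    """Return a list of tidy records with duplicates removed."""
    reader = csv.reader(io.StringIO(text))
    header = [name.strip().lower() for name in next(reader)]
    seen = set()
    cleaned = []
    for raw in reader:
        row = dict(zip(header, (cell.strip() for cell in raw)))
        city = row["city"].title()  # normalize casing
        population_text = row["population"].replace(",", "")
        if population_text.lower() in MISSING:
            population = None  # keep the row, but flag the gap
        else:
            population = int(population_text)
        year = int(row["year"])
        key = (city, population, year)
        if key in seen:  # skip rows that are duplicates after cleaning
            continue
        seen.add(key)
        cleaned.append({"city": city, "population": population, "year": year})
    return cleaned


cleaned = clean_rows(RAW_CSV)
for record in cleaned:
    print(record)
```

Note the design choice to keep the row with a missing population rather than delete it. Whether to drop, keep, or fill missing values depends on the question you are trying to answer, and explaining that decision is exactly the kind of reasoning worth showing in a portfolio.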
3. Data Analysis Algorithms
Employers are often more interested in your thought process than in a shopping list of the models and programming languages you've used.
So don't overthink it: just do a project and see what comes out. I often start with a dataset and work on applying the major algorithms: Linear Regression, Logistic Regression, SVM, Gradient Boosting, Random Forest, PCA, K-Means, Collaborative Filtering, KNN, and ARIMA.
Pro-tip: Don't just run the algorithms; make sure you truly understand when and why to use each! 🤔
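One way to build that understanding is to implement a simple algorithm by hand at least once. The sketch below fits a one-variable linear regression with ordinary least squares and reports R², using only plain Python. The study-hours data and the function names `fit_line` and `r_squared` are invented for illustration. In practice you would normally reach for an established library.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("y values must not all be identical")
    return 1 - ss_res / ss_tot


# Hypothetical data: hours studied versus exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, scores)
fit_quality = r_squared(hours, scores, slope, intercept)
print(f"score = {slope:.2f} * hours + {intercept:.2f} (R^2 = {fit_quality:.3f})")
```

Writing it out makes the "when and why" questions easier to answer: a straight line only makes sense if the relationship looks roughly linear, and a high R² on five points says little about how the model would behave on new data.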
4. Displaying Your Data Analytics Portfolio
4.1 GitHub Portfolio
Use GitHub Pages to showcase your projects. It is very easy to set up, and you can show off some data cleaning and feature engineering on datasets you find interesting. Then gradually add to them or take on different challenges.
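If you would rather not hand-write the landing page, a short script can generate one from a list of projects. This is only a sketch: the project titles, summaries, relative URLs, and the `render_index` helper are all hypothetical. It assumes you save the output as `index.html` in the folder your GitHub Pages site is published from.

```python
from html import escape

# Hypothetical project list; replace with your own titles and links.
PROJECTS = [
    {
        "title": "Movie ratings exploration",
        "summary": "Cleaning and summary statistics on a public ratings dataset.",
        "url": "movie-ratings/",
    },
    {
        "title": "City population cleanup",
        "summary": "Turning a messy open-data export into tidy records.",
        "url": "city-population/",
    },
]


def render_index(owner, projects):
    """Build a minimal HTML landing page that links to each project."""
    items = "\n".join(
        '    <li><a href="{url}">{title}</a>: {summary}</li>'.format(
            url=escape(p["url"]),
            title=escape(p["title"]),
            summary=escape(p["summary"]),
        )
        for p in projects
    )
    template = (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        '<head><meta charset="utf-8"><title>{owner}: Data Portfolio</title></head>\n'
        "<body>\n"
        "  <h1>{owner}: Data Portfolio</h1>\n"
        "  <ul>\n"
        "{items}\n"
        "  </ul>\n"
        "</body>\n"
        "</html>\n"
    )
    # escape() guards against characters such as & or < in the text.
    return template.format(owner=escape(owner), items=items)


page = render_index("Jane Doe", PROJECTS)
print(page)
```

Redirecting the printed output to a file (for example `python build_index.py > index.html`, where the script name is again just a placeholder) gives you a page you can commit alongside your project folders.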
4.2 WordPress Personal Website
Alternatively, you can build a free website using WordPress, where you can share many more of your ideas about data science, technology, or your hobbies in general. You can optimize your site for search engines (SEO) to help it rank higher in relevant Google searches.
Congrats! Everyone in the world can now see your work and your passion for data science. 👏
5. Networking matters
Use LinkedIn to connect with people who share similar interests. Don't hesitate to reach out to recruiters and have conversations about their business needs and how you can add value to their firm.
Use Meetup to find local data science and tech meetups, attend them, and network with people who have similar interests. These are great places to reach out to recruiters. 🤝
And here’s a bonus tip for all the Qureos minds
Pick an industry or business that interests you and try to identify a pain point in that business that data science can solve. That will impress hiring managers, as it shows you actually understand their business and how to leverage data to create value. That's what they hire for at the end of the day! 💼
Abhay Sharma is an Associate Consultant in the Tech Consulting division at Ernst & Young (EY). A triple-major degree in Computer Science, Mathematics, and Statistics has given him the diverse experience to use data and business tools for solving challenging problems. Abhay has mentored over 150 students and professionals on tools like Tableau, SQL, Excel, and more. You can also book a 1:1 mentorship slot with him.