From Zero to Beginner in Machine Learning and Data Science — Part 2



So many blogs already list hundreds of resources (MOOCs, blogs, video series, books, GitHub repositories) that I don’t want to add another one, especially under a title that is the exact opposite of clickbait. In the first part of this blog (here), I wrote about a few things from my journey into Machine Learning. In this part, I will list some of the practices that have helped me make progress in Machine Learning and Data Science. Before I go ahead, I would recommend reading this paper: How to Read a Paper. It’s a paper about how to read research papers. Though it’s written with researchers in mind, who spend a lot of time reading papers, the practices it describes apply to reading and learning in general.

For many people getting started with ML, the big question is, “Which is the best course for beginners?” There are some common answers, such as Andrew Ng’s course on Coursera, Yaser Abu-Mostafa’s Learning from Data, and some courses from Udacity and Udemy. The field is so promising and full of possibilities that people from diverse and unconventional backgrounds are trying to get into it. So the answer depends not only on what a course offers but also on how much education and experience we already have in Math and programming. Having said that, do we need to build all the skills from the ground up before getting started with ML? Not really.

I got started with Andrew Ng’s course. I finished the first few weeks and then left it halfway to do some courses on Calculus and Probability. Though the Math I learned and revised really helped, I never got back to the course. Whether it’s Andrew’s course or any other that is usually recommended for beginners, most of these courses offer a wealth of knowledge, taught by excellent teachers. So what I learnt from personal experience, in line with the paper I mentioned above, is to finish the course in one go and then dig deeper into the Math if you want to. Also, if your approach is to start a course, see if it works, and switch to another if it doesn’t, you can get stuck in a loop of switching courses. Remember too that it’s hard to design a course that generalises to people from every background and meets every requirement, considering how vast and interdisciplinary this field is. Two resources that helped me get started with ML are Kevin Markham’s Dataschool tutorials and Kirill Eremenko’s Machine Learning A to Z.

Enough course talk, let’s get to practices. One thing I suggest you start immediately, if you don’t do it already, is DOCUMENTATION. You must maintain a journal or a document. Why is it that important? Say you come across something like one-hot encoding and, after searching for a while, you find a blog or a Stack Overflow answer that helps you understand it. Considering how many new concepts and terms you will pick up as you keep learning, you may not remember everything. This, along with many other reasons, makes documenting a must-follow practice. Whether it’s bookmarking useful links, taking notes in a book, or saving content in a document, keeping track of what you learnt is going to save a lot of time later. Here’s how I keep track of things:

Start going through research papers. Start with Computing Machinery and Intelligence, a paper written more than 60 years ago, and be amazed at how many of the AI advancements we are witnessing now were predicted back then. I thought I should leave this practice for later, when I was comfortable with most ML terms and concepts, but I was wrong. There are some very good papers that make sense even to beginners, like this one: A Few Useful Things to Know about Machine Learning. The first thing I did after reading it was search for its author, Pedro Domingos, which led me to his book, The Master Algorithm, which I will be reading soon.

Learning an algorithm and trying it out are two different things. You need hands-on practice, and you should code. For beginners, GitHub is the place for this. I use Python, and there are many Jupyter notebooks that you can code along with. Find examples of code that use different algorithms. Until you are comfortable picking a problem and coding it yourself, don’t feel shy about taking someone’s code and trying it. But make sure you DON’T COPY-PASTE the code. Type it line by line and understand what it does. When a line of code seems too complex, break it down: try adding one parameter or component at a time and see how the output changes. As an example of a project to code along with, try this one, in which the author explains an end-to-end flow very well, from data scraping through modelling. He scraped data from IMDB and TMDB and used it (both text and poster images) for genre prediction. Kaggle is another place with well-explained Kernels to get started.
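To make the advice about breaking down a complex line a little more concrete, here is a minimal sketch in plain Python. It is only an illustration and not code from any of the projects mentioned: the `colors` list is made-up sample data, and the dense one-liner (a hand-rolled one-hot encoding) is just a stand-in for the kind of line you might meet in someone else’s notebook.

```python
# Hypothetical sample data, made up for this illustration.
colors = ["red", "green", "blue", "green"]

# A dense one-liner of the kind you might find in someone else's notebook.
one_liner = [[int(c == k) for k in sorted(set(colors))] for c in colors]

# The same logic, typed out one piece at a time.
unique_values = set(colors)          # step 1: the distinct categories
categories = sorted(unique_values)   # step 2: a fixed column order
print(categories)

step_by_step = []
for c in colors:                     # step 3: one row per input value
    row = []
    for k in categories:             # step 4: one column per category
        row.append(int(c == k))      # 1 where the value matches, else 0
    step_by_step.append(row)
    print(c, row)                    # inspect the output as it grows

# Both versions should produce the same rows.
assert step_by_step == one_liner
```

Printing after each step is the point here: you see what every piece contributes before you put the pieces back together into a single line.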

Below are a few of my epiphanies and confessions:

You don’t necessarily have to understand 100% of what’s taught in a course. It’s great if you do, and it’s also fine if there are parts you don’t understand; you can learn those from other resources.
There are no good and bad courses (maybe a few). There are only courses which you finish and courses which you leave halfway.
Follow the blogs of companies that use ML, such as Google, Uber, and Amazon.
If you are not on LinkedIn yet, join asap. You will be surprised how approachable people are on the platform.
There will be ML slumps. Yes, they will go away if you are really passionate about ML.
ML is not hard. It needs time and perseverance.
There will be a few concepts that are hard to comprehend, and that happens to everyone. Do extra learning when required.
Learn, Code, Document and Review
Use LinkedIn, Twitter, Reddit, Stack Overflow, and Discord to interact with other people who are into ML. After I started using these platforms, it took only a few days for me to realise how willing people are to talk, share, and help others with what they know.
Use platforms like Coursera, edX, Udacity, and DataCamp not just for learning but also for engaging in the discussion forums (you will meet cool people and gain more knowledge).
Share and contribute to the community in any way possible.

I tried my best to summarise and share everything I feel can help those who are getting started with ML. If you feel anything should be added to this blog, feel free to leave a response. Happy learning.


Avinash Kappa
