R programming language – DILIP MERALA

When a class is named after your graduation major, which is also one of the most popular disciplines in the world today, you know it’s going to be pivotal in your learning path. BA with R proved to be just that. The brilliant Dr. Sourav Chatterjee made it clear right at the beginning that R programming would be used just as a tool (which it is) to understand and master the nuances of business analytics. Having said that, his course material left no stone unturned in taking us through all aspects of R programming needed for data science.

I had worked a bit with Java and PHP, but this was my first experience with the R programming language. I started with an introductory course on Datacamp to quickly learn the very basics of R, like vectors, matrices and data frames. Then, in class, Dr. Chatterjee proved to be a dedicated and patient professor as he started with basic manipulations and sample generation in R and then quickly moved to the foundations of data analytics. We got familiar with libraries like tidyverse, forecast and gplots, and toyed with data visualization using ggplot2 on some interesting data sets. We created several plots, graphs, charts, and heatmaps before scaling up to larger data sets.

This was followed by some of the most important things a business analyst or data scientist learns in their career. So far, everything had looked pretty straightforward to me, but now was the time to push boundaries and actually dive deep into analytics. I was introduced to dimension reduction, the correlation matrix and the all-important analytics task of principal component analysis (PCA). I learnt how to evaluate the performance of models, create lift and decile charts, and assess classification results with the help of a confusion matrix – all with just a few lines of code. As Dr. Chatterjee explained time and again, it was never about the code. It was about knowing when and how to use it and what to do with the result.
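For readers who have not met a confusion matrix before, here is a minimal sketch in base R. The labels and predictions are made up purely for illustration and are not from the course; the variable names (`actual`, `predicted`, `conf_mat`) are likewise hypothetical.

```r
# Hypothetical true labels and model predictions (illustrative data only)
actual    <- factor(c("spam", "ham", "spam", "ham", "ham", "spam", "ham", "ham"),
                    levels = c("ham", "spam"))
predicted <- factor(c("spam", "ham", "ham", "ham", "spam", "spam", "ham", "ham"),
                    levels = c("ham", "spam"))

# Cross-tabulate predictions against the truth: this table is the confusion matrix
conf_mat <- table(Predicted = predicted, Actual = actual)
print(conf_mat)

# Accuracy is the share of cases that fall on the diagonal
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(accuracy)
```

The off-diagonal cells are the interesting part: they show which kind of mistake the model makes, which is exactly the "what to do with the result" question.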

We then followed the natural analytics progression with simple and multiple linear regression, where I learned about partitioning data and generating predictions. This was followed by a thorough understanding of the KNN model and how and when to run it. By now, I was beginning to get the hang of problem statements and the approach to take to solve them, thanks to class assignments on real-world scenarios like employee performance and spam detection. Through the examples done in class, it was easy to grasp the concepts of the R-squared value and the p-value and the roles they play in model evaluation. It was in this class that I understood logistic regression, discriminant analysis and association rules for the first time, and I have been working on them ever since, in every data science course or project that I have taken up.
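To make the partition, fit and predict workflow concrete, here is a small sketch in base R. It is not taken from the class material: it assumes the built-in `mtcars` data set as a stand-in for a business data set, an arbitrary 60/40 split, and an arbitrary choice of predictors.

```r
set.seed(1)  # arbitrary seed so the random split is reproducible

# mtcars ships with R; it stands in here for a real business data set
n         <- nrow(mtcars)
train_idx <- sample(n, size = round(0.6 * n))  # assumed 60% training share
train     <- mtcars[train_idx, ]
valid     <- mtcars[-train_idx, ]

# Multiple linear regression: fuel economy from weight and horsepower
fit <- lm(mpg ~ wt + hp, data = train)
print(summary(fit)$r.squared)     # R-squared on the training partition
print(summary(fit)$coefficients)  # estimates with their p-values

# Predict on the held-out partition and measure the error with RMSE
pred <- predict(fit, newdata = valid)
rmse <- sqrt(mean((valid$mpg - pred)^2))
print(rmse)
```

Scoring the model on rows it never saw during fitting is the point of the partition: a high R-squared on the training data alone says little about how well the predictions will hold up.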

All of this knowledge and Dr. Chatterjee’s guidelines were put to use in the final project where I worked with a group led by the talented Abhishek Pandey on London cabs data. After rigorous work on large data sets downloaded/extracted from various sources, we trained a model to predict arrival times for cabs by comparing RMSE across random forests, logistic regression, and SVMs. It was a great way to put into practice everything we had learned over four months.

And with that, I had laid a robust foundation in data analytics and was ready to build on it in the time to come. By January 2019, I was confident enough to dive into analytics projects and work on complex data sets to generate prediction models using the tools taught by Dr. Sourav Chatterjee.


This is the second post of my #10DaysToGraduate series where I share 10 key lessons from my Master’s degree in the form of a countdown to May 8, my graduation date.

For over a decade now, I have chased the dream of becoming a Bollywood star. It has been an amazing ride full of ups and downs. There have been some minor breakthroughs but nothing significant enough for me to make a living out of it. So, while this journey as an aspiring Bollywood actor has taught me a lot and I have thoroughly enjoyed and loved every bit of it, I have come to realize that it is time to pull the plug.

It has taken a lot of effort for me to come to terms with the fact that my acting career is going nowhere. For over 15 years, all that I wanted was this. No matter what I did, no matter where I went, I always felt that it would connect back to my dream. But now, I feel like I do not want to invest any more of my youth in this “struggle”. I need to accept that I have failed. And it is now time to move on.

It makes me very sad. I feel like something is dying inside me. After all, it’s a dream I have chased since I was 16. However, I have found some solace in the knowledge that acting is now a part of who I am and I can always continue being an actor on the side. This is where acting becomes a hobby for me, like playing the guitar or dancing or travelling. Maybe I can get back to doing theatre and join the countless doctors, engineers and working professionals who use it as a way of expressing themselves! With that in mind, I have made my peace with my decision to give up my Bollywood aspirations.

Once I made this call, I started looking at other things that excite me – other areas where I thought I could make a difference. I have worked as a Senior Travel Writer, Editor and Manager over the last few years. During this time, I have had time to travel, volunteer, teach, write, think and reconsider my career options. After a fair amount of self-discovery, I have concluded that the best combination of what I would like to do and what the world needs right now is data science in the solar energy sector.

The world of renewable energy, like every other field these days, generates huge amounts of data and there is a need for analysts and scientists who can make sense of this data. With skilled effort in the right direction, a lot can be done to bring down solar implementation costs. That to me is an exciting future to work towards. With my background in Electronics and Telecommunications engineering, and my interest in programming and statistics, it felt like the right thing to pursue next.

I started my data science journey last year with an introductory course on the R programming language on a website called Datacamp. I have followed it up with an MIT OCW course on Introduction to Computational Thinking Using Python. I have also applied to several universities for my Master’s in Business Analytics, Data Science or Information Systems. If all goes well, I hope to begin higher studies in Fall 2018.

This is a new beginning and, as one would expect, I am nervous and anxious, just like I was at the beginning of my Bollywood struggle. I am 32 now and it scares the shit out of me to restart my whole career. Nevertheless, I am driven by the fact that I now have a new purpose – one that can add some value to the world and also help me meet my true potential. I realize that this may look like a clichéd choice, a silly one even. But what matters to me is that it feels like something worth doing, no matter how people perceive it. It is what my heart is pointing me towards.

I do plan to continue theatre and acting in some form or another. But now, it will be just for me and not with the motive of “chasing a dream”. My dream has now been replaced by an ambition: to become a skilled data scientist and make a revolutionary impact in the renewable energy sector.

Love,
Dilip Merala

Tag: R programming language

Diving Deep into Business Analytics with R Programming

Beginning Data Science: A New Journey