Machine Learning, Programming

An intro to recommender systems with live implementation

What should I watch this evening?

How often do you wonder, after a hectic day at work, what you should watch next? For me the answer is yes, and more than once. From Netflix to Prime Video, building robust movie recommendation systems is extremely important, given the huge demand among modern consumers for personalized content.

Once at home, sitting in front of the TV seems like a fruitless exercise, with no control over and no memory of the content we consumed. We tend to prefer an intelligent platform that understands our tastes and preferences and does not just run on autopilot.

I have…

Data Science

The market knows it all, and it has prepared itself for the post-Covid world using millions of data points. Have you?

Source: image by Author

“In life, unlike chess, the game continues after checkmate.” - Isaac Asimov

Coronavirus took everyone by surprise, and it seemed like the end of the world and of the human race, the closest thing to checkmate in chess. However, life must go on, and after analyzing more than 30 million data points across 35 years, ranging from the Dow Jones index to sectors to individual companies for price movements, financial results, news, management guidance, etc., Mr. Market has a clearer idea of how the post-Covid world is going to shape up.

If Mr. Market is to be believed, below is a summary of the new normal based on…

Photo by Luke Chesser on Unsplash

The New York Times on an average Sunday contains more information than a Renaissance-era person had access to in an entire lifetime. We’re getting better and better at collecting data, but we lag in what we can do with it. Lots of data is out there, but it’s not being used to its greatest potential because it’s not being visualized as well as it could be.

As legend has it, more than 90% of data science projects never see the light of day, and ML models die slow deaths inside Jupyter notebooks. Absence of a sweet intersecting spot between Data Scientists…

Deep Learning

Recommendations served on your own, Watch on

A deep-learning-based floating movie recommender

Have you ever watched Inception and wondered about a dream within a dream and the possibility of real-world items floating in space? Well, we can’t float real items in space (at least so far), but we can definitely project Python vectors into multi-dimensional space.

Follow along and, by the end of this post, you will have a floating recommender system for movies to your name without any fuss around renting servers, writing Flask code, scraping data, or maintaining a portal. Additionally, why limit ourselves to movies and not extend it to our social network, stocks, sportspersons, TV shows, books, etc. …

End to End Parallelized Data Science from Reading Big Data to Data Manipulation to Visualisation to Machine Learning

Dask: Familiar pandas with superpowers

As the saying goes, a data scientist spends 90% of their time cleaning data and 10% complaining about the data. Their complaints may range from data size, faulty data distributions, null values, and data randomness to systematic errors in data capture and differences between train and test sets, and the list goes on and on.

One common bottleneck is the sheer size of the data: either the data doesn’t fit into memory, or the processing time is so long (on the order of several minutes) that the underlying pattern analysis goes for a toss. Data scientists by nature are curious human beings…

Teaching Computers to describe pictures

Image captioning refers to the process of generating a textual description of an image, based on the objects and actions in the image. This is a three-part series on implementing image captioning as presented by Andrej Karpathy in his PhD thesis at Stanford.

Computer Generated Captions using Neural Nets

In the process, we will learn the basics of neural networks, create a Convolutional Neural Network (CNN) in Keras (a wrapper around TensorFlow), explore state-of-the-art NLP models (Sequence to Sequence, GloVe, BERT, etc.), and stack together the CNN and the NLP model using an LSTM to generate captions for an image.

We would take it from there and create Recommender…


Intended Audience: Data Scientists with a working knowledge of Python, SQL, and Linux

How often do we see the error below, followed by a terminal shutdown, followed by despair over lost work:


“I bought this company because my friend believes in it” or “Every morning I see this stock in the recommendations of CNBC and they must have researched it well.” The trading diary of a retail investor would be full of such anecdotes.

Data for all

After I had worked for a few years and the initial euphoria of having money in my savings accounts had settled down, there came a stage where inflation kicked in and I realized that idle funds earn negative returns. It was my first exposure to the stock market, and within a week I was lost in a sea of equities jargon.

MS Dhoni may drag the match to the last over, or Maxwell may finish the match too soon, but every such innings leaves a footprint in the form of a data trail. Such data points are consciously tracked, and after 10 seasons, 636 matches, and 150,461 balls, a Machine Learning model is ready to take off.

Only Dhoni knows when the match will finish. Not anymore!

Problem Statement: The duration of a match affects advertising revenue, and the broadcaster would want to place its most premium inventory in the highest-rated (maximum TRP) slot. Even if the highest-rated slot isn’t guaranteed, the ad inventory shouldn’t be wasted in case of an early finish.

Unanswered questions before attacking the bigger problem…

Ravi Shankar

Data Scientist II - Amazon, Ex-Hotstar, IIM Ahmedabad
