Tanaya Jha, "Predicting Music Genres using Spotify's Audio Features"

In this project, I tie together two of my interests, music and programming, in order to understand music from a programming perspective. The end product is a document that outlines my thinking as I attempt each step of the data science process (with code) in my first independent machine learning project. The process is broken down into four main parts: creating the dataset, analyzing the data to gain some insights, building a couple of machine learning models, and a discussion reflecting upon my work. From analyzing the dataset, it looked like danceability, energy, and valence had the greatest differences in distribution across the genres. The models chosen were a decision tree classifier and a gradient boosting classifier, both of which are known to work well on multi-class classification problems. I learned that tuning is a very complicated process that requires more background knowledge to navigate efficiently, because the models have many parameters that can each take on a wide range of values. Finding the optimal combination is not a straightforward task, but I picked a few common parameters and attempted to find a ballpark range of values where the model worked best on new data. I was able to achieve a prediction accuracy of 0.84 on the test dataset using the gradient boosting algorithm, but it should be noted that the test dataset is a very small selection of data. In the future, the models could benefit from being trained on more diverse data, going through a more rigorous optimization process, and being tested on a larger sample.