Meet the Class of 2020: Raj Biswas, Data Scientist at Amazon Machine Learning Solutions Lab

Raj Biswas studied computer science at the Vellore Institute of Technology in India before spending two years working as a data scientist for NextOrbit. The 2020 alumnus of the M.S. in data science program through the Data Science Institute (DSI) at Columbia University is now a data scientist at Amazon Machine Learning Solutions Lab in Palo Alto. He credits DSI with giving him “great friends, up-to-date technical knowledge, and the confidence to tackle any data science and machine learning problem.”

How did you become interested in data science?

I got interested in data science through Coursera courses around 2013, when it wasn't as popular as it is today. The fact that I could combine mathematics and computer science to get interesting insights from data attracted me. I got instant gratification from doing simple things and getting interesting results, which also helped me win a few hackathons.

How did your undergraduate experience prepare you for graduate school?

My undergraduate experience in computer science helped me a lot in my journey. Taking courses like data structures, algorithms, and object-oriented programming helped me develop as a data scientist because, more often than not, data science modules have to work in tandem with a lot of different computer science systems. These fundamentals definitely help in building data science systems that integrate well with the rest of the components. My undergraduate studies also gave me good exposure to mathematics courses like multivariate calculus, linear algebra, numerical analysis, and statistics, which form the basis of data science.

Why did you choose to come to DSI?

DSI has a very reputable program. Based on the feedback that I received from alumni and on the course content, I could see that the courses were both rigorous and up-to-date with the most recent trends in the industry. Another reason I picked Columbia was its location in NYC. I was interested in the finance industry at that time and wanted to explore more in that area. NYC, being the financial capital of the world, could provide me with ample opportunities.

Did you enjoy living in New York City?

This was my first time in NYC as well as my first time in the USA. I still remember taking my first yellow cab ride from JFK. Entering Manhattan was mesmerizing. Driving through downtown Manhattan made me feel like I was in a movie. Central Park was a different world altogether amidst the concrete jungle. The best part about NYC is the diversity of the people living there. It felt like a miniature representation of the entire world. The nightlife of NYC was great as well; we could always find places that were open as late as 4 a.m. The winter was certainly something that I had to adapt to; I had never experienced sub-zero temperatures before.

What was your favorite course at DSI?

My favorite course was Algorithms for Data Science, taught by Eleni Drinea, as it was very challenging and made me think the most. I also loved the Applied Machine Learning course taught by Andreas Mueller, as it covered the practical aspects of machine learning, and I still find myself using most of that knowledge in the industry.

Which internship opportunities have you had during your DSI studies?

I interned at Autodesk, where I was given a very open-ended problem with no clear business objectives. My biggest challenge and learning point from this experience was working with problems that are at a very nascent stage. I learned how to reduce very abstract ideas to machine learning problems and link them to business outcomes that would be beneficial to the company.

Tell us about your capstone project.

My capstone project was Unsupervised Entity Resolution on Multi-Type Graphs, in partnership with Capital One. The goal of the project was to identify distinct entities in textual data and link the same entities across multiple datasets using a graph methodology. We found a research paper that had a similar solution, implemented its algorithm from scratch on our end to evaluate its performance, and made it useful for the folks at Capital One. After implementing the algorithm and experimenting with it, we realized that it did not perform as claimed. We went further and tweaked different parts of the algorithm to improve its performance. It was a great experience, as we were being guided by people working in the industry as well as our faculty advisor, Tian Zheng.


Media Contact: Sharnice Ottley, so2506@columbia.edu


550 West 120th Street, Northwest Corner Building, Suite 1401, New York, N.Y. 10027    212.854.5660
©2020 Columbia University