Machine Learning Coding Interview
I was provided with two CSV files containing different sets of attributes for the same entity and tasked with developing a classification model.
Anonymous user
My approach was to merge the CSV files, remove duplicates, select the relevant features, encode categorical variables, and then train models using Logistic Regression and Random Forest. The interviewer went through the code and asked questions such as:
1. Why was Random Forest performing well?
2. How would the model have behaved if duplicates weren't removed? (Overfitting)
3. Where could you have done better? (Using pipelines, etc.)
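For anyone preparing for a similar round, here is a rough sketch of how the merge, de-duplication, encoding, and training steps could be wired together with scikit-learn pipelines. This is not the code from the interview: the two DataFrames are synthetic stand-ins for the CSV files, and the column names (`entity_id`, `age`, `plan`, `usage`, `label`), the join key, and the model settings are all assumptions made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-ins for the two CSV files (hypothetical columns).
# With real files this would be pd.read_csv(...) for each one.
rng = np.random.default_rng(0)
n = 200
ids = np.arange(n)
left = pd.DataFrame({
    "entity_id": ids,
    "age": rng.integers(18, 70, size=n),
    "plan": rng.choice(["basic", "plus", "pro"], size=n),
})
right = pd.DataFrame({
    "entity_id": ids,
    "usage": rng.normal(50, 15, size=n),
})
right["label"] = (right["usage"] + rng.normal(0, 5, size=n) > 50).astype(int)

# Simulate duplicated rows in one of the files.
left = pd.concat([left, left.iloc[:20]], ignore_index=True)

# Merge on the shared key, then drop duplicates BEFORE splitting.
merged = left.merge(right, on="entity_id", how="inner")
deduped = merged.drop_duplicates(subset="entity_id").reset_index(drop=True)

X = deduped[["age", "plan", "usage"]]
y = deduped["label"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Preprocessing lives inside the pipeline, so it is fit on training data only.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["age", "usage"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["plan"]),
])

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, clf in models.items():
    pipe = Pipeline([("preprocess", preprocess), ("clf", clf)])
    pipe.fit(X_train, y_train)
    scores[name] = pipe.score(X_test, y_test)  # accuracy on held-out rows

print(scores)
```

On the duplicates question: if duplicated rows are left in, copies of the same row can land on both sides of the train/test split, so the test score looks better than it would on genuinely unseen data. On the pipelines question: putting the scaler and encoder inside the `Pipeline` means they are fit only on the training split, which avoids leaking test-set statistics into preprocessing.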