Overview
Solving the pricing guesswork problem for Airbnb hosts
Hosts often set prices without data, which leads to underpriced listings, lost bookings, or inconsistent revenue. Our goal was to build a pricing recommendation system, trained on 7,800+ Airbnb listings, that suggests an optimal nightly rate instantly and replaces gut feeling with data-driven estimates.
7,800+
Airbnb listings used to train and test the models
0.85 R²
R² on the training set for the final XGBoost model
6 models
Models built and compared before selecting XGBoost as the final model
What I Did
From raw data to real-time pricing tool
- Data cleaning: Led numerical data cleaning using log transforms, winsorisation, and imputation to prepare 7,800+ listings for modelling.
- Model building: Built and compared six models (OLS regression, Ridge, Lasso, Decision Tree, Random Forest, and XGBoost) to identify the best performer.
- Model selection: Selected XGBoost as the final model because it outperformed the alternatives, with an R² of 0.85 on the training set, an R² of 0.63 on the test set, and an RMSE of 77.4.
- Streamlit app: Co-built a Streamlit application that lets hosts enter their listing details and receive a price recommendation in real time.
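The numerical cleaning step above can be sketched roughly as follows. This is a minimal illustration, assuming pandas and NumPy, and not the project's actual code: the function name `clean_numeric`, the column names, the median imputation, the 1st/99th percentile caps, and the order of the three steps are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def clean_numeric(df: pd.DataFrame, columns: list[str],
                  lower: float = 0.01, upper: float = 0.99) -> pd.DataFrame:
    """Impute, winsorise, and log-transform the given numeric columns.

    Hypothetical helper: strategy and thresholds are assumptions.
    """
    out = df.copy()  # leave the caller's DataFrame untouched
    for col in columns:
        # Imputation: fill missing values with the column median.
        out[col] = out[col].fillna(out[col].median())
        # Winsorisation: cap extreme values at the chosen quantiles.
        lo, hi = out[col].quantile([lower, upper])
        out[col] = out[col].clip(lower=lo, upper=hi)
        # Log transform: log1p compresses right skew and is defined at zero.
        out[f"log_{col}"] = np.log1p(out[col])
    return out


# Toy listings with illustrative column names (not the real dataset).
listings = pd.DataFrame({
    "price": [80.0, 120.0, np.nan, 95.0, 5000.0],
    "reviews": [12.0, 0.0, 40.0, np.nan, 7.0],
})
cleaned = clean_numeric(listings, ["price", "reviews"])
print(cleaned.round(2))
```

Imputing before winsorising keeps the quantile caps from being computed on a column with gaps, and `log1p` is used rather than `log` so that zero-valued fields such as review counts stay finite.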
Project Visuals
Selected slides from the final report
Key Takeaway
People often set prices on instinct, without understanding why a given outcome occurred. Data provides evidence and direction: it shows which aspects of a model can be developed or improved, turning vague intuition into actionable, defensible decisions.