This project aims to measure recent volatility in the equities market by fitting a simple linear regression model to SPDR S&P 500 ETF Trust (SPY) price data from April 21, 2024, to July 21, 2024.
- Clean and preprocess SPY market data
- Develop a linear regression model to predict closing prices
- Analyze the model's performance and feature importance
- Visualize the results using a Streamlit web application
- Python 3.7+
- pandas
- numpy
- scikit-learn
- matplotlib
- streamlit
- dataCleaner.py: Handles data preprocessing and feature engineering
- model.py: Contains the linear regression model and training functions
- app.py: Streamlit web application for displaying results and visualizations
- SPY_4-21-2024_to_7-21-2024.csv: Raw data file (not included in repository)
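
As a rough illustration of the kind of preprocessing and feature engineering dataCleaner.py is described as handling, here is a minimal sketch. It is not the repository's actual code: the column names (`Date`, `Open`, `High`, `Low`, `Close`, `Volume`), the function name `clean_spy_data`, and the two engineered features are all assumptions, and the sample rows are made up so the example runs without the CSV.

```python
import pandas as pd


def clean_spy_data(raw: pd.DataFrame) -> pd.DataFrame:
    """Clean daily price rows and add two simple features.

    Column names are assumptions about the raw file's layout.
    """
    df = raw.copy()
    df["Date"] = pd.to_datetime(df["Date"])
    # Put rows in chronological order and keep one row per trading day.
    df = df.sort_values("Date").drop_duplicates(subset="Date")

    price_cols = ["Open", "High", "Low", "Close", "Volume"]
    # Coerce non-numeric entries (such as stray strings) to NaN, then drop them.
    df[price_cols] = df[price_cols].apply(pd.to_numeric, errors="coerce")
    df = df.dropna(subset=price_cols)

    # Hypothetical engineered features: previous close and daily high-low range.
    df["PrevClose"] = df["Close"].shift(1)
    df["Range"] = df["High"] - df["Low"]

    # The first remaining row has no previous close, so it is dropped.
    return df.dropna(subset=["PrevClose"]).reset_index(drop=True)


# Made-up sample standing in for the CSV; the values are illustrative only.
sample = pd.DataFrame({
    "Date": ["2024-04-23", "2024-04-22", "2024-04-24", "2024-04-24", "2024-04-25"],
    "Open": [500.0, 497.0, 505.0, 505.0, "n/a"],
    "High": [506.0, 502.0, 507.0, 507.0, 504.0],
    "Low": [499.0, 495.0, 503.0, 503.0, 498.0],
    "Close": [505.0, 499.0, 504.0, 504.0, 503.0],
    "Volume": [60_000_000, 55_000_000, 50_000_000, 50_000_000, 58_000_000],
})
cleaned = clean_spy_data(sample)
print(cleaned)
```

With the real file, the sample frame would presumably be replaced by `pd.read_csv(...)` on the CSV listed above, adjusted to whatever columns it actually contains.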
- Clone the repository
- Install required packages:
pip install -r requirements.txt
- Ensure the SPY data file is in the correct location
- Run the Streamlit app:
streamlit run app.py
- Data cleaning and preprocessing
- Linear regression model training
- Model performance metrics (MSE, RMSE, R-squared)
- Visualization of actual vs predicted close prices
- Feature importance analysis
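
The training, metrics, and feature importance steps listed above could be sketched roughly as follows with scikit-learn. This is an illustration under stated assumptions, not the code in model.py: the feature names are hypothetical, the data is synthetic, the 80/20 chronological split is an arbitrary choice, and "feature importance" is taken here to mean the absolute value of coefficients fitted on standardized features, which is only one possible definition.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 60  # roughly three months of trading days
feature_names = ["PrevClose", "Range", "Volume"]  # hypothetical features

# Synthetic stand-in data; the real project would use the cleaned SPY rows.
prev_close = 500 + np.cumsum(rng.normal(0, 2, n))
day_range = rng.uniform(2, 8, n)
volume = rng.uniform(4e7, 8e7, n)
X = np.column_stack([prev_close, day_range, volume])
y = prev_close + rng.normal(0, 1.5, n)  # close = previous close + noise

# Chronological split: train on the first 80%, test on the remainder.
split = int(n * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Standardize using training statistics so coefficients are comparable.
scaler = StandardScaler().fit(X_train)
model = LinearRegression().fit(scaler.transform(X_train), y_train)
y_pred = model.predict(scaler.transform(X_test))

# Performance metrics: MSE, RMSE, and R-squared on the held-out rows.
mse = mean_squared_error(y_test, y_pred)
rmse = float(np.sqrt(mse))
r2 = r2_score(y_test, y_pred)
print(f"MSE={mse:.3f}  RMSE={rmse:.3f}  R^2={r2:.3f}")

# Coefficient magnitude on standardized inputs as a simple importance score.
importance = dict(zip(feature_names, np.abs(model.coef_)))
for name, score in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

A chronological split is used instead of a shuffled one because the rows are a time series, and shuffling would let the model train on days that come after the ones it is tested on.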
Undecided