To help with my fantasy baseball league, I developed a tool that analyzes rosters and suggests trades that would benefit both teams. The tool combines web scraping, data analysis, and forecasting to model the rest of a given fantasy baseball season.
Data Scraping and Cleaning
Since ESPN provides no public API for fantasy data, the tool first uses the Selenium library to scrape the data needed for the project. It launches a web driver, logs in to ESPN, and scrapes the tables for the standings as well as each team’s player stats. This data is then cleaned and loaded into Pandas DataFrames: one for the standings and one per team for its players. The table below shows an example of a team’s player data.
Slot | Player | Positions | (Season, HR) | (Season, RBI) | (Season, BB) | ... | (7 Days, AVG) | (7 Days, OBP) | (7 Days, FPCT) |
C | J.T. Realmuto | C | 14 | 48 | 29 | | .278 | .350 | 1.000 |
1B | Matt Olson | 1B | 43 | 107 | 79 | | .414 | .575 | 1.000 |
2B | Andres Gimenez | 2B | 11 | 45 | 27 | | .435 | .480 | 1.000 |
... | | | | | | | | | |
UTIL | Ronald Acuna Jr. | OF, DH | 26 | 71 | 65 | | .313 | .450 | 1.000 |
UTIL | Brandon Nimmo | OF | 16 | 48 | 58 | | .391 | .481 | 0.929 |
UTIL | Justin Turner | 3B, 1B, 2B, DH | 19 | 73 | 39 | | .444 | .444 | 1.000 |
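The cleaning step can be sketched roughly as follows. This is a minimal, standard-library illustration rather than the tool's actual code: the helper names `clean_cell` and `clean_row`, the `--` placeholder for a missing stat, and the three-column header are all assumptions made for the example. Stat columns are keyed by (period, stat) pairs to mirror the table above.

```python
from typing import Dict, List, Optional, Tuple, Union

Number = Union[int, float]


def clean_cell(raw: str) -> Optional[Number]:
    """Convert a scraped stat string such as '14' or '.278' to a number."""
    text = raw.strip()
    if text in ("", "--"):  # assumed placeholders for a missing stat
        return None
    return float(text) if "." in text else int(text)


def clean_row(header: List[Tuple[str, str]], cells: List[str]) -> Dict:
    """Build one player record from the raw cell text of a table row.

    The first three cells are the slot, the player name, and the eligible
    positions; the remaining cells are stats in the same order as `header`.
    """
    slot, player, positions, *stats = cells
    record: Dict = {
        "Slot": slot.strip(),
        "Player": player.strip(),
        "Positions": [p.strip() for p in positions.split(",")],
    }
    for key, raw in zip(header, stats):
        record[key] = clean_cell(raw)
    return record


# Hypothetical three-column header; the real table has many more stats.
header = [("Season", "HR"), ("Season", "RBI"), ("7 Days", "AVG")]
raw_cells = ["UTIL", " Justin Turner ", "3B, 1B, 2B, DH", "19", "73", ".444"]
row = clean_row(header, raw_cells)
print(row)
```

A list of records like this could then be handed to Pandas to build the per-team DataFrame.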
Roster Optimization
For my personal roster, I created an additional tool that optimizes the starting lineup by prioritizing certain stat categories. This is a constraint satisfaction problem, since each player can only occupy certain roster positions. The tool first finds the top 14 players on the roster for a given stat category. It then tries to build a lineup using just those players.
If it can’t fill every roster spot, it loosens the constraint: it fills as many spots as it can with the top 13 players, then fills the final spot with the next-best player eligible at that position. If this still doesn’t work, it keeps loosening the constraint, trying the top n-1 players each time, until it finds a complete lineup.
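The loosening loop can be sketched as below. This is an illustration under stated assumptions, not the tool's actual implementation: the function names (`can_fill`, `place_all`, `build_lineup`), the rule that a UTIL slot accepts any hitter, and the use of a small backtracking search to place the top players are all choices made for the example. The four-slot lineup is made up; the players are ranked by the Season HR column from the table above.

```python
from typing import Dict, List, Optional


def can_fill(slot: str, positions: List[str]) -> bool:
    """Assumption: a UTIL slot accepts any hitter; other slots need a match."""
    return slot == "UTIL" or slot in positions


def place_all(players: List[str], slots: List[str],
              eligible: Dict[str, List[str]]) -> Optional[Dict[int, str]]:
    """Try to give every listed player a distinct slot via backtracking.

    Returns {slot_index: player}, or None if some player cannot be placed.
    """
    def backtrack(i: int, used: Dict[int, str]) -> Optional[Dict[int, str]]:
        if i == len(players):
            return dict(used)
        for s, slot in enumerate(slots):
            if s not in used and can_fill(slot, eligible[players[i]]):
                used[s] = players[i]
                result = backtrack(i + 1, used)
                if result is not None:
                    return result
                del used[s]
        return None

    return backtrack(0, {})


def build_lineup(ranked: List[str], slots: List[str],
                 eligible: Dict[str, List[str]]) -> Optional[List[str]]:
    """`ranked` is the roster sorted best-first for the chosen stat."""
    for k in range(min(len(slots), len(ranked)), -1, -1):
        core, rest = ranked[:k], ranked[k:]
        lineup = place_all(core, slots, eligible)
        if lineup is None:
            continue  # the top k do not fit together: loosen to top k-1
        # Fill any open slot with the next-best eligible player.
        for s, slot in enumerate(slots):
            if s in lineup:
                continue
            pick = next((p for p in rest if p not in lineup.values()
                         and can_fill(slot, eligible[p])), None)
            if pick is None:
                break
            lineup[s] = pick
        if len(lineup) == len(slots):
            return [lineup[s] for s in range(len(slots))]
    return None


eligible = {
    "Matt Olson": ["1B"],
    "Ronald Acuna Jr.": ["OF", "DH"],
    "Justin Turner": ["3B", "1B", "2B", "DH"],
    "Brandon Nimmo": ["OF"],
    "J.T. Realmuto": ["C"],
}
ranked = list(eligible)  # already ordered by Season HR, best first
slots = ["C", "1B", "OF", "UTIL"]  # made-up four-slot lineup
lineup = build_lineup(ranked, slots, eligible)
print(dict(zip(slots, lineup)))
```

Here the top four home-run hitters cannot cover the catcher slot, so the sketch falls back to the top three and fills catcher with the next-best eligible player.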
The image below shows how the tool steps through the logic using an example stat. It repeatedly iterates through the players and staffs any player with only a single position remaining (the tightest constraint). Once a position has been staffed, the tool removes that position from the other remaining players. If a player has no positions remaining, the tool staffs that player in the utility spot. It continues iterating until all players are staffed.
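The staffing pass described above resembles a most-constrained-first heuristic, and a minimal sketch might look like the following. Several details are assumptions for illustration: the function name `staff_players`, picking the player with the fewest remaining positions when nobody is down to exactly one, and choosing alphabetically among several open positions. The example assumes the outfield slots are already taken, so no OF slot is listed as open.

```python
from typing import Dict, List


def staff_players(eligible: Dict[str, List[str]],
                  open_slots: Dict[str, int],
                  util_spots: int = 1) -> Dict[str, str]:
    """Assign each player a position, tightest constraint first."""
    slots_left = dict(open_slots)
    # For each player, keep only the positions that still have an open slot.
    remaining = {
        player: {pos for pos in positions if slots_left.get(pos, 0) > 0}
        for player, positions in eligible.items()
    }
    lineup: Dict[str, str] = {}
    while remaining:
        # Most constrained player: fewest positions still available.
        player = min(remaining, key=lambda p: len(remaining[p]))
        options = remaining.pop(player)
        if not options:
            # No position left for this player: use a utility spot.
            if util_spots == 0:
                raise ValueError(f"no open spot for {player}")
            lineup[player] = "UTIL"
            util_spots -= 1
            continue
        position = min(options)  # assumption: arbitrary alphabetical pick
        lineup[player] = position
        slots_left[position] -= 1
        if slots_left[position] == 0:
            # Position is full: remove it from everyone still unstaffed.
            for other in remaining.values():
                other.discard(position)
    return lineup


eligible = {
    "J.T. Realmuto": ["C"],
    "Matt Olson": ["1B"],
    "Andres Gimenez": ["2B"],
    "Justin Turner": ["3B", "1B", "2B", "DH"],
    "Brandon Nimmo": ["OF"],
}
open_slots = {"C": 1, "1B": 1, "2B": 1, "3B": 1}  # OF assumed already filled
staffed = staff_players(eligible, open_slots)
for player, position in staffed.items():
    print(f"{position:5}{player}")
```

In this example Justin Turner starts with three open positions, loses 1B and 2B as they are staffed, and ends up at 3B, while Brandon Nimmo has no open position and takes the utility spot.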
Season Forecasting
Once the roster is staffed, the tool then checks every stat to see which lineup projects the best season-ending points standings.
- It uses these stats to forecast the rest of the season and project season-ending stats.
- It uses the strengths and weaknesses of each team’s roster to suggest trades that would improve both teams’ rosters.
- For each of these trades, it has to consider how adding and removing players from each roster would affect both the possible starting lineups and the stat projections.
- It then shows a summary of each trade and how much it projects the trade would shift each team’s season-ending ranking.
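
As a rough illustration of how a trade could be scored, the sketch below adds rest-of-season projections to each team's current totals, converts the projected totals to standings points, and compares the points before and after a swap. Everything specific here is an assumption, not the tool's actual method: rotisserie-style scoring (one point per rank in each category), the function names, the team and player labels, and all of the numbers, which are made up. It also treats every rostered player as a starter and skips the lineup re-optimization mentioned above.

```python
from typing import Dict, List

Stats = Dict[str, float]


def final_totals(current: Dict[str, Stats], ros: Dict[str, Stats],
                 rosters: Dict[str, List[str]]) -> Dict[str, Stats]:
    """Current team totals plus each rostered player's rest-of-season stats."""
    finals = {}
    for team, roster in rosters.items():
        totals = dict(current[team])
        for player in roster:
            for stat, value in ros[player].items():
                totals[stat] = totals.get(stat, 0) + value
        finals[team] = totals
    return finals


def roto_points(finals: Dict[str, Stats]) -> Dict[str, int]:
    """Assumed scoring: worst team in a category gets 1 point, best gets N.

    Ties are not handled specially in this sketch.
    """
    points = {team: 0 for team in finals}
    for stat in next(iter(finals.values())):
        ordered = sorted(finals, key=lambda team: finals[team][stat])
        for rank, team in enumerate(ordered, start=1):
            points[team] += rank
    return points


def trade_delta(current, ros, rosters, team_a, gives_a, team_b, gives_b):
    """Change in projected points for both teams if the trade goes through."""
    before = roto_points(final_totals(current, ros, rosters))
    swapped = {team: list(roster) for team, roster in rosters.items()}
    for player in gives_a:
        swapped[team_a].remove(player)
        swapped[team_b].append(player)
    for player in gives_b:
        swapped[team_b].remove(player)
        swapped[team_a].append(player)
    after = roto_points(final_totals(current, ros, swapped))
    return {team: after[team] - before[team] for team in (team_a, team_b)}


# Made-up numbers: team A is strong in BB and weak in HR, team B the reverse.
current = {"A": {"HR": 100, "BB": 200},
           "B": {"HR": 120, "BB": 150},
           "C": {"HR": 110, "BB": 180}}
ros = {"a1": {"HR": 6, "BB": 55},   # hypothetical rest-of-season projections
       "b1": {"HR": 25, "BB": 8},
       "c1": {"HR": 10, "BB": 20}}
rosters = {"A": ["a1"], "B": ["b1"], "C": ["c1"]}
delta = trade_delta(current, ros, rosters, "A", ["a1"], "B", ["b1"])
print(delta)
```

With these numbers, swapping the two players moves each team up one place in its weaker category without costing it a place in its stronger one, which is the kind of mutually beneficial trade the tool looks for.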