The upside to a regular job is a regular paycheck! The downside is that it distracts me from all the interesting things I can solo as a developer.
With the week off, I'm trying to figure out where I left off in exploring MLflow and how I can use it to tackle the current generation of machine learning problems.
Although I've enthusiastically embraced LLMs, I'm always glad to have a snippet of my own writing to reference. Fortunately, the MLflow example uses the Iris data set, and I wrote a post on that back in 2021 (https://www.owlmountain.net/post/visualizing-labeled-point-data-in-pyspark).
So now I need to figure out what I want to do with MLflow. Oh yeah, start up the UI and see what it does. Aha, it shows me a record of my logistic regression runs. My last one is the binary classification of Iris data. It pays to revisit the standards.
One more cup of coffee...
OK, so the MLflow API lets me track runs of the logistic model and then manage it with the UI.
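For future me, here is a minimal sketch of what that tracking might look like. It is not the code from the repo: the experiment name, the `binary_labels` helper, the setosa-versus-rest split, and the `C` value are all assumptions made for illustration. It assumes `mlflow` and `scikit-learn` are installed.

```python
def binary_labels(targets, positive_class):
    """Collapse multi-class labels into 1 (positive_class) vs. 0 (everything else)."""
    return [1 if t == positive_class else 0 for t in targets]


def train_and_log_iris(experiment_name="iris-binary-logistic", C=1.0):
    """Train a binary logistic regression on Iris and record the run in MLflow.

    The experiment name and hyperparameter are hypothetical choices.
    """
    # Third-party imports live inside the function so the helper above
    # stays usable even where mlflow / scikit-learn are not installed.
    import mlflow
    import mlflow.sklearn
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    # Assumption: "binary" here means setosa (class 0) vs. the other two species.
    y_bin = binary_labels(y, positive_class=0)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y_bin, test_size=0.3, random_state=42, stratify=y_bin
    )

    mlflow.set_experiment(experiment_name)
    with mlflow.start_run() as run:
        model = LogisticRegression(C=C, max_iter=200)
        model.fit(X_train, y_train)
        accuracy = accuracy_score(y_test, model.predict(X_test))

        # Everything logged here shows up under the run in the MLflow UI.
        mlflow.log_param("C", C)
        mlflow.log_metric("accuracy", accuracy)
        mlflow.sklearn.log_model(model, "model")

    return run.info.run_id, accuracy
```

Calling `train_and_log_iris()` and then running `mlflow ui` from the same directory should list the run with its parameter, metric, and saved model.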
ChatGPT is an excellent/shitty tool to use as a development assistant. It's excellent because I can skip reading the docs and just ask it questions. It gives me the general idea of the code. However, the code is shitty and doesn't work until I fix it.
So now I need to dig around the world of interesting problems and find something less trivial than classifying flowers.
If I don't pick this up again for a while, here is the code so I don't forget. https://github.com/timowlmtn/bigdataplatforms/tree/master/src/mlflow
Zillow Rental Data
I would rather do anything but clean my house.
Instead, I wrote some scraper code and started building a rental price forecast based on the Zillow data (https://www.zillow.com/research/data/).
Scrapers, logistic regression, forecasting, MLflow. Much more fun.
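As a rough idea of where the forecasting could start, here is a small sketch using only the standard library. It assumes the downloaded CSV is in a wide layout with a `RegionName` column and one column per month, which should be checked against the actual file. The sample rows are made-up numbers, not real Zillow data, and the straight-line trend is a placeholder baseline rather than a serious forecasting model.

```python
import csv
import io
import re

# Made-up sample in an assumed wide layout: one row per region, one column per month.
SAMPLE_CSV = """RegionName,2023-01-31,2023-02-28,2023-03-31,2023-04-30
Example City,1000.0,1010.0,1020.0,1030.0
Another Town,1500.0,1505.0,1512.0,1520.0
"""

DATE_COLUMN = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def load_series(csv_text, region_name):
    """Return the monthly values for one region, in column order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        if row["RegionName"] == region_name:
            # Keep only date-named columns; empty cells are skipped,
            # which compresses any gaps in the series.
            return [
                float(value)
                for column, value in row.items()
                if DATE_COLUMN.match(column) and value
            ]
    raise KeyError(f"region not found: {region_name}")


def linear_trend_forecast(values, steps):
    """Fit a least-squares line to the values and extrapolate `steps` months ahead."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return [intercept + slope * (n + i) for i in range(steps)]


rents = load_series(SAMPLE_CSV, "Example City")
forecast = linear_trend_forecast(rents, steps=2)
print(forecast)
```

A baseline like this is mostly useful as something to beat. The error of each fancier model could be logged to MLflow alongside it for comparison.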