I’m trying to train a machine learning model to detect if an image is blurred or not.
I have 11,798 unblurred images, and I have a script to blur them and then use that to train my model.
However when I run the exact same training 5 times the results are wildly inconsistent (as you can see below). It also only gets to 98.67% accuracy max.
I’m pretty new to machine learning, so maybe I’m doing something really wrong. But coming from a software engineering background and just starting to learn machine learning, I have tons of questions. It’s a struggle to know why it’s so inconsistent between runs. It’s a struggle to know how good is good enough (ie. when should I deploy the model). It’s a struggle to know how to continue to improve the accuracy and make the model better.
Any advice or insight would be greatly appreciated.
View all the code: https://gist.github.com/fishcharlie/68e808c45537d79b4f4d33c26e2391dd
I feel like 98+ % is pretty good. I’m not an expert but I know over-fitting is something to watch out for.
As for inconsistency- man I’m not sure. You’d think it’d be 1 to 1 wouldn’t you with the same dataset. Perhaps the order of the input files isn’t the same between runs?
I know that when training you get diminishing returns on optimization and that there are MANY factors that affect performance and accuracy which can be really hard to guess.
I did some ML optimization tutorials a while back and you can iterate through algorithms and net sizes and graph the results to empirically find the optimal combinations for your data set. Then when you think you have it locked, you run your full training set with your dialed in parameters.
Keep us updated if you figure something out!
I think what you’re referring to with iterating through algorithms and such is called hyper parameter tuning. I think there is a tool called Keras Tuner you can use for this.
However. I’m incredibly skeptical that will work in this situation because of how variable the results are between runs. I run it with the same input, same code, everything, and get wildly different results. So I think in order for that to be effective it needs to be fairly consistent between runs.
I could be totally off base here tho. (I haven’t worked with this stuff a ton yet).