How do we solve them? So I will take you directly to the part I wanted to show, the solution part. I promised to show you the solution at 12 and I am delayed by 10 minutes, I'm sorry for that, but here you go: there is a concept called AutoML. I have talked to you about imputation techniques, I have talked to you about feature selection, dimensionality reduction, selecting the best algorithm, selecting the best ensemble, and all of it, right? Ditch all of that. You don't have to do any of that; people have come up with smarter ways of doing things. For example, they've come up with something called AutoML, right?

You just feed it the data and it does everything for you. All that you need to say is: give it some time. Say, "go ahead, do your experimentation for the next two hours and get me the output." And in the real world this is what I do; I don't actually go ahead and create models individually. I'm going to show you an example that I have created for this session. I have taken the IDS data set.

I'm not sure if you have gone through any AutoML libraries yet. There are lots and lots of libraries; the most popular is something called TPOT, and because you are all familiar with scikit-learn, sklearn also has a library called auto-sklearn, and that's what we are talking about and what we're going to be using here. So this is the typical credit card loan status data set that I have taken. Can all of you see my code? Yeah, very good, thank you, Richard. So if you look at it, I have read the data set and I have done some preprocessing, dropping the null records and all of it. I have checked how many records there are, what the shape is, and whether there are any null values. In the preprocessing, dropna removes all of those records; I was just quickly putting this notebook together yesterday.
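That cleaning step can be sketched out like this. Note this is a small made-up frame standing in for the actual loan data set; the column names here are hypothetical, not the speaker's notebook:

```python
import numpy as np
import pandas as pd

# Stand-in for the credit card loan status data set (hypothetical columns).
df = pd.DataFrame({
    "loan_amount": [1000.0, 2500.0, np.nan, 4000.0],
    "term": ["short", "long", "long", None],
    "loan_status": ["paid", "default", "paid", "paid"],
})

print(df.shape)           # how many records, and the shape of the data
print(df.isnull().sum())  # null count per column

# Drop every record containing a null instead of imputing.
df = df.dropna()
print(df.shape)
```

`dropna()` is the quick-and-dirty route the speaker describes: it throws away incomplete rows rather than filling them in, which is fine for a fast prototype but loses data.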

I dropped them instead of doing any imputation. After that, unfortunately, I see that certain columns have been read as object, which is an internal representation for Python and is not recognized by my model; therefore I have converted them to categorical with this particular code, and I'm again looking at the data set here. And this is the interesting part, and this part you would know, right? So what is it doing? It is dividing my data into training and test portions, ninety-ten; it is keeping aside the test data. After that, I'm converting the values into a NumPy array. This is how my X looks, and this is how my X_test array looks after converting it into a NumPy array. And all that I have to do after all of this is just these commands: I am importing the libraries, which are auto-sklearn, the model selection library and so on.
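The dtype conversion and the ninety-ten split described above might look roughly like this, again on illustrative data rather than the actual notebook:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Illustrative stand-in for the loan data (hypothetical columns).
df = pd.DataFrame({
    "term": ["short", "long", "long", "short"] * 25,  # read in as object
    "loan_amount": np.linspace(1000, 5000, 100),
    "loan_status": [0, 1] * 50,
})

# Object columns are Python's internal string representation; convert
# them to pandas' categorical dtype so the modelling layer recognises them.
for col in df.select_dtypes(include="object").columns:
    df[col] = df[col].astype("category")

X = df.drop(columns="loan_status")
y = df["loan_status"]

# Keep 10% aside as the test set: a ninety-ten split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, random_state=42
)

# Convert the values into NumPy arrays before handing them to auto-sklearn.
X_train_arr, X_test_arr = X_train.values, X_test.values
print(X_train_arr.shape, X_test_arr.shape)
```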

This is all I have to do. In a real-world industry situation, this is what I do, and then I go play my poker game or whatever game. What I am telling it here is: I am passing the data set, I am actually instantiating that object, and I'm saying run my task for one hundred and twenty seconds, and within that run experiment with multiple models and multiple techniques, spend about 30 seconds on each of the models, and get me the output. And how do I evaluate which model is better? I'm saying use a five-fold cross-validation strategy, and after the run is done, go ahead and delete the temporary folder. That's all I'm saying. If you look here...

I've got the AutoML results. "Does auto-sklearn need a high-performance machine?" Someone has asked me this question. No, it doesn't need a high-performance machine; what it needs is a normal machine. I just had a simple four-core machine and I was still able to run it, and multi-core processing is possible. It's just that you have to decide how much time is required to find the optimum model, and that is a heuristic; nobody can tell you how long you should let AutoML do its thing. In this particular case I just let it run for 120 seconds. "But what did you arrive at?" Priya, I am not sure about your question; can you rephrase the question, please? Okay, sure.

I just left it for 120 seconds because I wanted to do a quick run, and that's what I did. What it actually does is give me multiple experiment results; I will show you one experiment it has done. This is a class-imbalanced data set, so it weighted the data set. Oh yes, you can do 10-fold cross-validation if you're careful; what you select for k is up to you. I wanted to do evaluation using k equal to 5 here. And then I talked about one-hot encoding, right? Without me having to do that transformation, it has already done it. It has chosen a classifier called the Extra Trees classifier, and wherever it had to do imputation, it has done imputation using the median. In the earlier case you would have to individually try mode, median, mean and so on and test each one; here it has done it for you. And it has not done bootstrapping, and it has not done any ensembling.
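For comparison, here is roughly what that discovered pipeline would look like if you wrote it by hand in plain scikit-learn: median imputation, one-hot encoding for the categoricals, and an Extra Trees classifier with class weighting and no bootstrapping. The data and column names are made up for illustration:

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "loan_amount": rng.normal(3000, 500, 200),
    "term": rng.choice(["short", "long"], 200),
    "loan_status": rng.choice([0, 0, 0, 1], 200),  # imbalanced classes
})
df.loc[::20, "loan_amount"] = np.nan  # sprinkle in some missing values

numeric = ["loan_amount"]
categorical = ["term"]

pipeline = Pipeline([
    ("prep", ColumnTransformer([
        # median imputation for the numeric columns
        ("num", SimpleImputer(strategy="median"), numeric),
        # one-hot encoding for the categorical columns
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ])),
    # Extra Trees, weighted for the class imbalance, no bootstrapping
    ("clf", ExtraTreesClassifier(
        class_weight="balanced", bootstrap=False, random_state=0
    )),
])

pipeline.fit(df[numeric + categorical], df["loan_status"])
score = pipeline.score(df[numeric + categorical], df["loan_status"])
print(score)
```

This is the manual work auto-sklearn is replacing: every choice in that pipeline (imputation strategy, encoding, classifier, class weighting) is something it searched over for you.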

Furthermore, I also have the option to pass the ensemble size as an option here. I set ensemble_size equal to 1; I could say ensemble_size equal to 50, which essentially means it is going to combine 50 different algorithms and see what the five-fold cross-validation result is for the combination. It is going to experiment internally with almost all of the roughly 14 different algorithms inside scikit-learn and then fetch you the right kind of algorithm. When I say automl.show_models(), it is going to show me the best model out of all the models it has experimented with in the last 120 seconds.

You are absolutely not touching anything; you are not doing any of it by hand. Okay, sure, all right. "Is AutoML a library? Can we use auto-sklearn?" Yes, no problem. I have used this in a lot of hackathons; I am a winner of multiple hackathons, and this is one of the techniques I follow there as well. As soon as I go in, I take the raw data set and dump it into an auto-sklearn kind of algorithm, and it is going to automatically try various models and get you the best model. Then, if I were creating new features, what I would try is to see if I can beat the benchmark that the AutoML algorithm has given me. Yes, it will take care of categorical variables; if you look at the example I have just explained, you see that there is one-hot encoding, and in some cases it was doing no encoding.

It skips the encoding when the algorithm doesn't demand it, but you will see that it also does one-hot encoding where needed. It is automatically taking care of that; you don't have to do anything, so you are not doing feature engineering by hand. "Can you please share this code?" Yes, yes, we are sharing it, sure. You have asked me to repeat the thing once again, so I am doing that for you. Auto-sklearn has both classifiers and regressors; depending on whether you are solving a regression problem or a classification problem, you need to use that particular auto-sklearn API. What I am doing here is telling the auto-sklearn function to run several experiments for 120 seconds, and it will try the combinations.
