Understanding how customers are engaging, in terms of recency, frequency and monetary value, allows me to apply a mathematical model called RFM to understand the data. Applying that model is possible only through data preprocessing. As part of data preprocessing I also have to clean up the data, make sure that my data is evenly balanced, and apply several data preparation techniques. This is unique to machine learning projects, where you have to go through a certain set of steps to understand your data better and to process your data better.

Once you clean up the data, you start playing around with it. What do I mean by playing around with the data? In our case of customer segmentation, I am purely looking at how recently a customer has transacted, how frequently a customer has transacted, and what monetary value that customer is bringing in. So I have to create some new columns, new business fields called recency score, frequency score and monetary score, and I have to combine those fields into another column called the RFM score. So as part of data manipulation I do what is called feature creation, where I create the new columns that are needed for running an effective customer segmentation.
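As a rough sketch of that feature creation step, here is how the recency, frequency and monetary columns and a combined RFM score could be built. The transactions, the snapshot date, the 1 to 4 scoring scale and the choice to add the three scores together are all assumptions made for illustration; the talk does not say how the scores are actually combined. It uses plain Python so it runs without extra libraries.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, invoice_date, amount).
transactions = [
    ("C1", date(2024, 1, 5), 120.0),
    ("C1", date(2024, 3, 20), 80.0),
    ("C2", date(2023, 11, 2), 40.0),
    ("C3", date(2024, 3, 28), 500.0),
    ("C3", date(2024, 2, 14), 300.0),
    ("C3", date(2024, 1, 9), 250.0),
    ("C4", date(2023, 9, 15), 15.0),
]
snapshot = date(2024, 4, 1)  # assumed "today" used to measure recency


def rfm_table(rows, as_of):
    """Return {customer: (recency_days, frequency, monetary)}."""
    last_seen, count, total = {}, defaultdict(int), defaultdict(float)
    for customer, day, amount in rows:
        last_seen[customer] = max(day, last_seen.get(customer, day))
        count[customer] += 1
        total[customer] += amount
    return {
        c: ((as_of - last_seen[c]).days, count[c], total[c])
        for c in last_seen
    }


def rank_scores(values, higher_is_better=True, bins=4):
    """Score each customer from 1 (worst) to `bins` (best); ties share a score."""
    n = len(values)
    scores = {}
    for customer, value in values.items():
        # Count how many customers are strictly worse than this one.
        worse = sum(
            1 for other in values.values()
            if (other < value if higher_is_better else other > value)
        )
        scores[customer] = 1 + (worse * bins) // n
    return scores


rfm = rfm_table(transactions, snapshot)
# Fewer days since the last purchase is better, so recency is ranked in reverse.
r_score = rank_scores({c: v[0] for c, v in rfm.items()}, higher_is_better=False)
f_score = rank_scores({c: v[1] for c, v in rfm.items()})
m_score = rank_scores({c: v[2] for c, v in rfm.items()})

# Assumption: the combined RFM score is the plain sum of the three scores.
rfm_score = {c: r_score[c] + f_score[c] + m_score[c] for c in rfm}

for c in sorted(rfm):
    print(c, rfm[c], r_score[c], f_score[c], m_score[c], rfm_score[c])
```

On a real data set the same columns would more likely be built with a data frame library, but the idea is the same: one row per customer, with three new scored fields and one combined field.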

OK, so once I create these columns as part of my data manipulation exercise, I have a data set on which my machine learning algorithm is going to work. I'm not going to go through all of these steps one by one, because I think that is way too much for you at this point; I will just show them at a high level. Once I have those new fields, as you can see, I am segmenting the customers based on them as loyal customers, big spenders, lost customers and bad spenders.
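To make the segmentation step a little more concrete, here is a minimal sketch of turning the three scores into the four segment names. The rules and thresholds in `assign_segment` are hypothetical, picked only to show the shape of the logic; the talk does not give the actual rules, and the example scores are invented.

```python
def assign_segment(r, f, m, top=4):
    """Map recency/frequency/monetary scores (1..top) to a segment.

    The thresholds are illustrative assumptions, not the rules from the talk.
    """
    if r <= 1:
        return "Lost Customer"    # lowest recency score: inactive the longest
    if r >= top - 1 and f >= top - 1:
        return "Loyal Customer"   # buys both recently and often
    if m >= top - 1:
        return "Big Spender"      # high monetary value without high loyalty
    return "Bad Spender"          # still active, but low frequency and value


# Hypothetical (recency, frequency, monetary) scores per customer.
scores = {
    "C1": (3, 3, 3),
    "C2": (2, 1, 2),
    "C3": (4, 4, 4),
    "C4": (1, 1, 1),
    "C5": (2, 1, 4),
}

segments = {customer: assign_segment(*s) for customer, s in scores.items()}

for customer, segment in segments.items():
    print(customer, segment)
```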

I am creating four segments out of this data, and all of this happens during the data manipulation step. Once I create those columns, I start performing what is called exploratory data analysis, meaning I start analyzing the data. What kind of analysis am I doing? The simple analysis is: what is the percentage of big spenders, what is the percentage of lost customers, what is the percentage of bad spenders and of loyal customers? That is one form of analysis. The other form of analysis could be around how many customers are recent and how many customers are frequent. So I do a lot of univariate and bivariate analysis, which you will be learning as part of your course: univariate analysis looks at one variable on its own, and bivariate analysis looks at how one variable works with another.
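The simple percentage analysis described here can be sketched in a few lines. The customer rows and the cut-offs for "recent" (30 days) and "frequent" (5 purchases) are made up for illustration.

```python
from collections import Counter

# Hypothetical per-customer rows: (segment, recency_days, frequency).
customers = [
    ("Loyal Customer", 4, 12),
    ("Loyal Customer", 12, 9),
    ("Big Spender", 45, 3),
    ("Lost Customer", 199, 1),
    ("Bad Spender", 151, 1),
]


def segment_percentages(rows):
    """Return the share of customers in each segment, in percent."""
    counts = Counter(segment for segment, _, _ in rows)
    return {segment: 100.0 * n / len(rows) for segment, n in counts.items()}


shares = segment_percentages(customers)

# Univariate counts; both cut-offs are assumptions chosen for this example.
recent = sum(1 for _, days, _ in customers if days <= 30)
frequent = sum(1 for _, _, freq in customers if freq >= 5)

for segment, pct in shares.items():
    print(f"{segment}: {pct:.0f}%")
print(f"recent customers: {recent}, frequent customers: {frequent}")
```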

For example, here you can see how many customers are frequent customers and how many are recent customers. With all that kind of analysis you start digging deeper into your data: across all the segments, which segment has been spending more money? Several such analyses are possible through simple Python visualization. So why do we need to analyze? Because I get to understand more about the data, and that is why I need to do exploratory data analysis, or EDA. Once I do visualization, once I do EDA, I get more information about my customers. For example, here I have gone down to the invoice level, looking at which invoices are higher and which are not. The more you analyze and the more you explore, the better your chances of coming out with better predictions.
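Two of the questions raised here, which segment spends the most and which invoices are the highest, could be answered along these lines. The invoice numbers, segments and amounts are invented for the example, and a chart would normally be drawn from the same totals.

```python
from collections import defaultdict
from heapq import nlargest

# Hypothetical invoices: (invoice_no, segment, total_price).
invoices = [
    ("INV-001", "Loyal Customer", 120.0),
    ("INV-002", "Loyal Customer", 80.0),
    ("INV-003", "Big Spender", 600.0),
    ("INV-004", "Lost Customer", 15.0),
    ("INV-005", "Bad Spender", 40.0),
    ("INV-006", "Big Spender", 450.0),
]


def spend_by_segment(rows):
    """Sum the invoice totals for each segment."""
    totals = defaultdict(float)
    for _, segment, total_price in rows:
        totals[segment] += total_price
    return dict(totals)


totals = spend_by_segment(invoices)
top_segment = max(totals, key=totals.get)  # segment with the largest total spend
top_invoices = nlargest(2, invoices, key=lambda row: row[2])  # two largest invoices

print(totals)
print("highest-spending segment:", top_segment)
print("largest invoices:", [invoice_no for invoice_no, _, _ in top_invoices])
```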

So as a data scientist, 70 to 75 percent of your work is going to be purely exploring and analyzing your data. Data preparation, data manipulation and data exploration are the key aspects for any data scientist. Once you analyze the data, you start running your machine learning models and you start understanding which models can be applied for prediction. I have just played around with a lot of analysis here, and you don't have to break your heads over it, but I am moving on to a stage where I look at how each variable works with every other variable. That is another aspect of data science, where you can understand at the data level how one variable is affecting another. There is a concept called correlation, which you will be learning as part of your course, that lets you understand how total price is related to recency, or how frequency is related to the monetary value, and so on.

This data-level analysis is something that you do not get in a typical software SDLC project; it is something that you will learn better only by becoming a data scientist. That is where exploration becomes a very important aspect. Once you understand that, you get into the phase of model selection. What is model selection? Basically, you know your data well, you know which fields influence which other fields, and you know how the fields are connected. Now for customer segmentation you are supposed to do two things. One is to understand the existing customer segments, which you have understood by exploration: loyal customers, big spenders, lost customers and bad spenders.
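For the correlation idea, here is a minimal sketch of the Pearson correlation coefficient, which measures how strongly two variables move together on a scale from -1 to 1. The three columns are invented, and the coefficient only shows association, not which variable causes the other.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-customer columns, in the same customer order.
recency_days = [199, 151, 12, 30, 4]
frequency = [1, 1, 2, 3, 5]
monetary = [15.0, 40.0, 200.0, 600.0, 1050.0]

print("frequency vs monetary:", round(pearson(frequency, monetary), 2))
print("recency vs monetary:", round(pearson(recency_days, monetary), 2))
```

In this made-up data, customers who buy more often also spend more, so the first value is strongly positive, while customers who have been away longer spend less, so the second is negative.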

So with that pattern, the second thing is that I want to come out with a machine learning model that will help me predict which segment any customer is likely to fall into. Given a new set of customers, with the help of my machine learning algorithm I would be able to understand which customer segment each customer might fall into. That is the advantage of machine learning: it does not only give you the pattern of the existing data, it also helps you understand, when some new data is passed in, which of the customer segments that new data will fit into. Imagine a retail owner having that power. If he knows which segment a particular new customer might fall into, don't you think he'll be able to treat the customer better, even before the customer has really started showing potential?

That is where machine learning comes into play. As part of machine learning model selection, we look at various techniques, for example classification techniques like random forest, support vector machine, KNN or naive Bayes. In this case we are working on a classification algorithm because we want to put customers into different segments, and each segment is a class. In other cases you might want to work on regression algorithms, and for that there are other techniques.
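To show what "each segment is a class" means in practice, here is a small k-nearest-neighbours (KNN) classifier written by hand. This is only a sketch under stated assumptions: the labelled customers are invented, `k=3` is an arbitrary choice, and a real project would more likely use a library implementation of random forest, support vector machine or KNN rather than this hand-rolled version.

```python
from collections import Counter
from math import dist

# Hypothetical labelled customers:
# (recency_score, frequency_score, monetary_score) -> known segment.
training = [
    ((4, 4, 4), "Loyal Customer"),
    ((3, 4, 3), "Loyal Customer"),
    ((4, 3, 3), "Loyal Customer"),
    ((2, 1, 4), "Big Spender"),
    ((3, 2, 4), "Big Spender"),
    ((2, 2, 4), "Big Spender"),
    ((1, 1, 1), "Lost Customer"),
    ((1, 2, 1), "Lost Customer"),
    ((1, 1, 2), "Lost Customer"),
    ((2, 1, 1), "Bad Spender"),
    ((3, 1, 2), "Bad Spender"),
    ((2, 2, 1), "Bad Spender"),
]


def predict_segment(features, labelled, k=3):
    """Predict a segment by majority vote among the k closest labelled customers."""
    nearest = sorted(labelled, key=lambda row: dist(row[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# New customers the model has not seen before.
print(predict_segment((4, 4, 3), training))
print(predict_segment((2, 1, 4), training))
print(predict_segment((1, 1, 1), training))
```

The point is the one made above: once the existing segments are known, a new customer's scores are enough to predict which segment that customer is likely to fall into.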
