I gave a presentation at the MelbURN R User Group last night on a few techniques to improve the accuracy of your predictive models, and also talked a bit about the Heritage Health Prize story and how our win unfolded. The video of the talk is below, followed by the R code I used in the simulations.

You will probably be able to enhance the code and make it more efficient. If you do, please let us all know.

Thanks for all your positive feedback - hope you all had an enjoyable night.

# Code from the presentation

## Footy Tipping

```
#--------------------------------------
# Footy Tipping Simulation
# Ensembling of Tipsters
#
# Phil Brierley
# Oct 2013
#---------------------------------------
#clean all objects from memory
rm(list=ls())
#------------------------------
# adjustable parameters to set
#------------------------------
Number_of_Tipsters <- 12
Number_of_Games <- 2000
Number_ofSeasons <- 100
Tipster_Strength <- 0.6
#------------------------------
#simulate each season
for (Season in 1:Number_ofSeasons){
#generate random tips
Results = matrix(rbinom(Number_of_Games*Number_of_Tipsters,1,Tipster_Strength),Number_of_Games,Number_of_Tipsters)
#majority vote = median score
Results[,Number_of_Tipsters] <- apply(Results[,1:(Number_of_Tipsters-1)],1,median)
#find the mean score per tipster over the season
seasonSummary <- apply(Results,2,mean)
#stack the seasons together
if (Season == 1) {
runningSummary <- seasonSummary
} else {
runningSummary <- rbind(runningSummary,seasonSummary)
}
} #Number_ofSeasons
#give the columns sensible names
colnames(Results)[1:Number_of_Tipsters] <- paste('Tipster',1:Number_of_Tipsters)
colnames(Results)[Number_of_Tipsters] <- paste('Majority Vote')
colnames(runningSummary) <- colnames(Results)
#plot the results
bestPunter <- apply(runningSummary[,1:(Number_of_Tipsters-1)],1,max)
plot(runningSummary[,Number_of_Tipsters]
,type='l'
,col='red'
,ylim=c(Tipster_Strength - 0.1,1)
,xlab='Season'
,ylab='% of Games Correct')
lines(bestPunter,col='blue')
abline(h=Tipster_Strength,col='green')
bp <- mean(bestPunter)
abline(h=bp,col='blue')
mv <- mean(runningSummary[,Number_of_Tipsters])
abline(h=mv,col='red')
legend("topright"
, inset=.05
, c(paste("Majority Vote (avg=",mv,")"),paste("Best Tipster (avg=",bp,")"),paste('expected (',Tipster_Strength,')'))
,fill=c('red','blue','green')
,horiz=FALSE)
```
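As a rough cross-check on the simulation above, the majority-vote line can be compared with its theoretical value. This is a minimal sketch that assumes, as the simulation does, that the tipsters are independent and equally skilled. `majorityVoteAccuracy` is a name made up for this illustration and is not part of the code from the talk.

```r
# Probability that a strict majority of independent tipsters is correct.
# Assumes an odd number of voters so that ties cannot occur.
majorityVoteAccuracy <- function(nVoters, strength) {
  stopifnot(nVoters %% 2 == 1)
  needed <- (nVoters + 1) / 2            # votes needed for a majority
  # P(X >= needed) where X ~ Binomial(nVoters, strength)
  pbinom(needed - 1, nVoters, strength, lower.tail = FALSE)
}

# 11 voting tipsters (the 12th column holds the majority vote itself)
expectedMajority <- majorityVoteAccuracy(11, 0.6)
print(expectedMajority)
```

With 11 tipsters at a strength of 0.6 this comes out at roughly 0.75, which should sit close to the red average line in the plot.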

## Footy Tipping Parallel

```
# I think there is a 'parallel' package in base R now also.
library(help = "parallel")
```

```
###############################
# Footy Tipping Simulation
# Ensembling of Tipsters
#
# Parallel Version
#
# Phil Brierley
# Oct 2013
#
###############################
#clean all objects from memory
rm(list=ls())
#set memory (Windows only; memory.limit() is just a stub from R 4.2.0 onwards)
memsize <- 3200
if (memory.limit() < memsize) memory.limit(size=memsize)
#----------------------
#parameters to set
#-----------------------
Number_of_Tipsters <- 12
Number_of_Games <- 20000
Number_ofSeasons <- 100
Tipster_Strength <- 0.6
threads <- 8 #depends on how many processors you have
#----------------------------------------------
# main function to simulate a season
#----------------------------------------------
simulateSeason <- function(f){
#generate random tips
Results = matrix(rbinom(Number_of_Games*Number_of_Tipsters,1,Tipster_Strength),Number_of_Games,Number_of_Tipsters)
#my tip is a majority vote - hence median score
Results[,Number_of_Tipsters] <- apply(Results[,1:(Number_of_Tipsters-1)],1,median)
#find the mean score per tipster over the season and return it
apply(Results,2,mean)
}
#end of function
#---------------------------------------------
#--------------------------------
# the parallel stuff
#--------------------------------
#load library
library(snowfall)
#initiate clusters
sfStop()
sfInit(parallel = TRUE, cpus = threads, type = "SOCK")
sfExport(list = c("Number_of_Games","Number_of_Tipsters","Tipster_Strength"))
#start the clock
timeStart <- Sys.time()
#do the calculation in parallel
seasonSummary <- sfClusterApplyLB(1:Number_ofSeasons, simulateSeason)
#stack results together into a data frame
runningSummary <- do.call(rbind.data.frame, seasonSummary)
colnames(runningSummary) <- paste('punter',1:ncol(runningSummary))
colnames(runningSummary)[ncol(runningSummary)] <- paste('ensemble')
#record the time it took
totTime <- as.numeric(Sys.time() - timeStart, units = "secs")
myText <- paste('Avg calculation time per season = ', formatC(totTime/ Number_ofSeasons,digits=2,format='f') ,'seconds')
#stop clusters
sfStop()
#--------------------------------------
#plot the results
#--------------------------------------
bestPunter <- apply(runningSummary[,1:(Number_of_Tipsters-1)],1,max)
plot(runningSummary[,Number_of_Tipsters]
,type='l'
,col='red'
,ylim=c(Tipster_Strength - 0.1,1)
,xlab='Season'
,ylab='% of Games Correct'
,main=myText)
lines(bestPunter,col='blue')
abline(h=Tipster_Strength,col='green')
bp <- mean(bestPunter)
abline(h=bp,col='blue')
mv <- mean(runningSummary[,Number_of_Tipsters])
abline(h=mv,col='red')
legend("topright"
,inset=.05
,c(paste("Majority Vote (avg=",mv,")")
,paste("Best Tipster (avg=",bp,")")
,paste('expected (',Tipster_Strength,')'))
,fill=c('red','blue','green')
,horiz=FALSE)
```
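The note at the top of this section mentions the base `parallel` package. Below is a minimal sketch of the same season simulation using `parallel` instead of `snowfall`. The worker count of 2 and the smaller number of games and seasons are assumptions chosen only to keep the example quick. It is an illustration, not the code used in the talk.

```r
library(parallel)

Number_of_Tipsters <- 12
Number_of_Games    <- 2000
Number_ofSeasons   <- 20
Tipster_Strength   <- 0.6

# simulate one season and return the mean score per column
simulateSeason <- function(f) {
  Results <- matrix(rbinom(Number_of_Games * Number_of_Tipsters, 1, Tipster_Strength),
                    Number_of_Games, Number_of_Tipsters)
  # last column is the majority vote of the other tipsters
  Results[, Number_of_Tipsters] <- apply(Results[, 1:(Number_of_Tipsters - 1)], 1, median)
  colMeans(Results)
}

cl <- makeCluster(2)                       # assumed worker count
clusterSetRNGStream(cl, 123)               # reproducible, independent random streams
clusterExport(cl, c("Number_of_Games", "Number_of_Tipsters", "Tipster_Strength"))
seasonSummary <- parLapply(cl, 1:Number_ofSeasons, simulateSeason)
stopCluster(cl)

# one row per season, one column per tipster
runningSummary <- do.call(rbind, seasonSummary)
```

The resulting `runningSummary` matrix has the same layout as in the `snowfall` version, so the plotting code above should work on it after the title text is adjusted.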

## Model Synergy

```
###############################################
# If two models have the same RMSE error
# but are different, what do we get by
# averaging them?
#
# Phil Brierley
# Oct 2013
#
###############################################
#clean all objects from memory
rm(list=ls())
#number of cases
cases <- 100000
#how much worse is one model than the other
worseness_factor = 1.0
#generate random errors
errors1 <- rnorm(cases,0,1)
errors2 <- rnorm(cases,0,1) * worseness_factor
#average the 2
errorsAve <- (errors1 + errors2)/2
#calculate the RMSE
rmse1 <- sqrt(sum(errors1 * errors1)/cases)
rmse2 <- sqrt(sum(errors2 * errors2)/cases)
rmseAve <- sqrt(sum(errorsAve * errorsAve)/cases)
#-----------------------------
# plot the results
#-----------------------------
op <- par(mfrow=c(2,2))
#bar chart of the RMSE of each model
bp <- barplot(c(rmse1,rmse2,rmseAve)
,ylim =c(0,1.1*max(rmse1,rmse2,rmseAve))
,ylab='rmse')
axis(side = 1, at = bp, labels = c('model1','model2','modelAverage'))
abline(h=rmseAve,col='red',lwd=4)
#lineplot of errors
num <- 25
plot(errors1[1:num]
,col='red'
,type='l'
,ylim=c(min(errors1,errors2),max(errors1,errors2))
,ylab='error'
,xlab='case')
lines(errors2[1:num],col='blue',type='l')
lines(errorsAve[1:num],col='forestgreen',type='l',lwd=3)
abline(h=0)
#error distribution
br <- seq(from=min(errorsAve,errors2),to=max(errorsAve,errors2),length.out=100)
hist(errorsAve,breaks=br,col=rgb(0,0,1,1/4),xlab='error',main='error distribution')
hist(errors2,breaks=br,col=rgb(1,0,0,1/4),add=TRUE)
legend("topright", inset=.05, c('model2','averaged models'), fill=c(rgb(1,0,0,1/4),rgb(0,0,1,1/4)), horiz=FALSE)
#scatterplot of errors
plot(errors1,errors2,main=paste('correlation=',cor(errors1,errors2)),xlab='model1 errors',ylab='model2 errors')
```

```
par(op)
```

## Model Synergy but varying correlation

```
###############################################
# If two models have the same RMSE error
# but are different, what do we get by
# averaging them?
#
# What happens as the correlation between the
# two models changes
#
# Phil Brierley
# Oct 2013
#
###############################################
#clean all objects from memory
rm(list=ls())
#memsize <- 6400
#if (memory.limit() < memsize) memory.limit(size=memsize)
#settings
cases <- 1000
number_of_correlations_to_test <- 20
threads <- 8
worseness_factor = 1.3
#initiate
errors1 <- rnorm(cases,0,1)
correlations <- seq(from=0,to=1, length.out = number_of_correlations_to_test)
#-----------------------------------------------------------------
# function to generate a correlated variable
#-----------------------------------------------------------------
generateCorrelatedVariable <- function(x1,rho){
if (rho==1) return(x1)
#adapted from http://stats.stackexchange.com/questions/15011/generate-a-random-variable-with-a-defined-correlation-to-an-existing-variable/15040#15040
n <- length(x1) # length of vector
#rho <- 0.8 # desired correlation = cos(angle)
theta <- acos(rho) # corresponding angle
#x1 <- rnorm(n, 1, 1) # fixed given data
x2 <- rnorm(n, 1, 1) # new random data
X <- cbind(x1, x2) # matrix
Xctr <- scale(X, center=TRUE, scale=FALSE) # centered columns (mean 0)
Id <- diag(n) # identity matrix
Q <- qr.Q(qr(Xctr[ , 1, drop=FALSE])) # QR-decomposition, just matrix Q
P <- tcrossprod(Q) # = Q Q' # projection onto space defined by x1
x2o <- (Id-P) %*% Xctr[ , 2] # x2ctr made orthogonal to x1ctr
Xc2 <- cbind(Xctr[ , 1], x2o) # bind to matrix
Y <- Xc2 %*% diag(1/sqrt(colSums(Xc2^2))) # scale columns to length 1
x <- Y[ , 2] + (1 / tan(theta)) * Y[ , 1] # final new vector
}
#---------------------------------------------------------------------
# function to calculate the errors
#---------------------------------------------------------------------
calcErrors <- function(myCorrelation){
errors2 <- worseness_factor * scale(generateCorrelatedVariable(errors1,myCorrelation))
#average the 2
errorsAve <- (errors1 + errors2)/2
#calculate the RMSE
cases <- length(errors1)
rmse1 <- sqrt(sum(errors1 * errors1)/cases)
rmse2 <- sqrt(sum(errors2 * errors2)/cases)
rmseAve <- sqrt(sum(errorsAve * errorsAve)/cases)
#return the results
c(myCorrelation,rmse1,rmse2,rmseAve)
}
#--------------------------------
# the parallel stuff
#--------------------------------
#load library
library(snowfall)
#initiate clusters
sfStop()
sfInit(parallel = TRUE, cpus = threads, type = "SOCK")
sfExport(list = c("errors1","generateCorrelatedVariable","calcErrors","worseness_factor"))
#do the calculation in parallel
allErrors <- sfClusterApplyLB(correlations,calcErrors)
#stack results together into a data frame
allErrors <- do.call(rbind.data.frame, allErrors)
colnames(allErrors) <- c('correlation','error1','error2','errorAve')
#stop clusters
sfStop()
#------------------------------------------
# plot the results
#------------------------------------------
yrange <- range(allErrors$errorAve,allErrors$error1,allErrors$error2)
plot(allErrors$correlation,allErrors$errorAve
,type='b'
,col='red'
,lwd=3
,xlab='correlation between the two models'
,ylab='RMS Error'
,main='The less correlated the two models, the more synergy when averaged'
,ylim=yrange)
lines(allErrors$correlation,allErrors$error1,col='blue',type='b')
lines(allErrors$correlation,allErrors$error2,col='black',type='b')
legend("left", inset=.05, c('model1','model2','average'), fill=c('blue','black','red'), horiz=FALSE)
```
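The shape of the averaged-model curve can also be written down directly, as a cross-check on the simulation. This sketch assumes both error series have zero mean. `theoreticalRmseAve` and `breakEven` are names introduced here for illustration only.

```r
# RMSE of the simple average of two zero-mean error series with
# standard deviations sd1 and sd2 and correlation rho:
#   Var((e1 + e2) / 2) = (sd1^2 + sd2^2 + 2 * rho * sd1 * sd2) / 4
theoreticalRmseAve <- function(rho, sd1 = 1, sd2 = 1) {
  sqrt((sd1^2 + sd2^2 + 2 * rho * sd1 * sd2) / 4)
}

correlations     <- seq(from = 0, to = 1, length.out = 20)
worseness_factor <- 1.3

theory <- data.frame(
  correlation = correlations,
  errorAve    = theoreticalRmseAve(correlations, sd1 = 1, sd2 = worseness_factor)
)
print(head(theory))

# correlation below which the average beats the better model (sd1 = 1):
# solve (1 + w^2 + 2 * rho * w) / 4 < 1 for rho
breakEven <- (3 - worseness_factor^2) / (2 * worseness_factor)
print(breakEven)
```

With `worseness_factor = 1.3`, averaging only beats model 1 on its own when the correlation is below roughly 0.5. The simulated curve will differ slightly because `errors1` is a finite random sample.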

## Variable Importance

```
#####################################################
#
# A generic method to calculate the importance
# of variables in any model
#
# Phil Brierley
# Oct 2013
#
#####################################################
#clean all objects from memory
rm(list=ls())
#set memory (Windows only; memory.limit() is just a stub from R 4.2.0 onwards)
memsize <- 3200
if (memory.limit() < memsize) memory.limit(size=memsize)
#libraries
library(nnet)
library(randomForest)
library(gbm)
#----------------------
#parameters to set
#-----------------------
#what model are we building
modTypes <- vector()
modTypes[1] = 'linear_regression'
modTypes[2] = 'neural_net'
modTypes[3] = 'gbm'
modTypes[4] = 'random_forest'
#a number >= 0 and < 1
deletion_threshold <- 0.05
#for data set generation
Number_of_Useful_Variables <- 10
Number_of_Junk_Variables <- 10
Number_of_Records <- 1000
Number_of_Removed_Useful_Variables <- 0
Include_Junk_Variables <- TRUE
#importance testing loops
numloopsImportance <- 100
#train test split
Train_Percent <- 0.5
#multithreading
threads <- 8
#-----------------------------------------
# error function
#-----------------------------------------
calc_error <- function(act,pred){
aact <- as.matrix(act)
ppred <- as.matrix(pred)
return (sqrt(colSums(((ppred) - (aact)) ^ 2) / nrow(aact)))
}
#------------------------
#generate a data set
#------------------------
#set seed if you want to regenerate the same data set
set.seed(42)
useful <- matrix(runif(Number_of_Records*Number_of_Useful_Variables,0,1),Number_of_Records,Number_of_Useful_Variables)
junk <- matrix(runif(Number_of_Records*Number_of_Junk_Variables,0,1),Number_of_Records,Number_of_Junk_Variables)
colnames(useful) <- paste('useful',1:ncol(useful),sep="_")
colnames(junk) <- paste('junk',1:ncol(junk),sep="_")
#create the target
useful_weightings <- sort(runif(Number_of_Useful_Variables,0,1),decreasing=TRUE)
target <- useful %*% useful_weightings
#remove some useful variables
useful <- useful[,1:(Number_of_Useful_Variables-Number_of_Removed_Useful_Variables)]
#create a data set
if (Include_Junk_Variables){
myData <- data.frame(cbind(useful,junk,target))
} else {
myData <- data.frame(cbind(useful,target))
}
#target - what we are predicting
theTarget <- 'target'
targindex <- ncol(myData)
colnames(myData)[targindex] <- theTarget
#----------------------------------------------------
# divide data set into train and test
#----------------------------------------------------
trainrows <- runif(nrow(myData)) < Train_Percent
if(length(which(trainrows)) < 2) stop('not enough training cases')
testrows <- !trainrows
#-------------------------------------------------
# function for calculating variable importance
#--------------------------------------------------
varImporatnce <- function(variable){
#initialise the errors
errorTrain <- 0
errorTest <- 0
#copy this variable data
temp <- myData[,variable]
for(i in 1:numloopsImportance){
#scramble the values of this variable
myData[,variable] <- temp[order(runif(length(temp)))]
#calculate the predictions
if (modType == 'neural_net'){
predictions <- predict(model,newdata=myData[,-targindex],type='raw')
}
if (modType == 'linear_regression'){
predictions <- predict(model, myData)
}
if (modType == 'random_forest'){
predictions <- predict(model, myData,type="response")
}
if (modType == 'gbm'){
predictions <- predict.gbm(model, myData[,-targindex],type="response",n.trees = model$n.trees)
}
#calculate the error
errorTest <- errorTest + calc_error(myData[testrows,theTarget],predictions[testrows])
errorTrain <- errorTrain + calc_error(myData[trainrows,theTarget],predictions[trainrows])
}
#return average train and test error
c(errorTrain/numloopsImportance,errorTest/numloopsImportance)
}
#----------------------------------------
# set up multithreading
#---------------------------------------
library(snowfall) #for parallel processing
library(rlecuyer)
sfInit(parallel = TRUE, cpus = threads, type = "SOCK")
sfClusterSetupRNG()
sfExport(list = c('myData','trainrows','testrows','numloopsImportance','calc_error','theTarget','targindex'))
####################################
# LOOP THROUGH ALL MODEL TYPES
####################################
variables <- setdiff(colnames(myData),theTarget)
candidates_for_deletion <- NULL
for (modType in modTypes){
#-----------------------------
#build a model
#----------------------------
if (modType == 'linear_regression'){
model <- lm(as.formula(paste(theTarget, " ~ . "))
, data=myData[trainrows,])
basePredictions <- predict(model, myData)
}
if (modType == 'neural_net'){
model <- nnet(x=myData[trainrows,-targindex]
,y=myData[trainrows,targindex]
,size=5
,linout=TRUE)
basePredictions <- predict(model,newdata=myData[,-targindex],type='raw')
}
if (modType == 'random_forest'){
model <- randomForest(x= myData[trainrows,-targindex]
,y=myData[trainrows,targindex]
,ntree=1000)
basePredictions <- predict(model,myData,type="response")
}
if (modType == 'gbm'){
model <- gbm(as.formula(paste(theTarget, " ~ . ")), # formula
data=myData[trainrows,], # dataset
distribution="gaussian", # see the help for other choices
n.trees=1000, # number of trees
shrinkage=0.05, # shrinkage or learning rate,
keep.data=FALSE, # keep a copy of the dataset with the object
verbose=FALSE, # don't print out progress
n.cores=1) # use only a single core (detecting the number of cores is error-prone, so avoided here)
basePredictions <- predict.gbm(object=model, newdata=myData[,-targindex],type="response",n.trees = model$n.trees)
}
#calculate the error
full_Train_Error <- calc_error(myData[trainrows,theTarget],basePredictions[trainrows])
full_Test_Error <- calc_error(myData[testrows,theTarget],basePredictions[testrows])
#Export model to threads
sfExport(list = c('modType','model'))
if (modType == 'neural_net') sfLibrary(nnet)
if (modType == 'random_forest') sfLibrary(randomForest)
if (modType == 'gbm') sfLibrary(gbm)
#-------------------------------------
# calculate variable importance
#-------------------------------------
s <- sfClusterApplyLB(variables,varImporatnce)
s <- do.call(rbind.data.frame,s)
colnames(s) <- c('Train','Test')
row.names(s) <- variables
#get the full model error
s$Train <- s$Train / full_Train_Error
s$Test <- s$Test / full_Test_Error
#scale to 0-1 (variables that do not increase the error score zero)
#logical indexing also handles the case where no variable increases the error
myRows <- s$Train > 1
s$Train <- ifelse(myRows, s$Train / max(s$Train), 0)
myRows <- s$Test > 1
s$Test <- ifelse(myRows, s$Test / max(s$Test), 0)
#pick candidates for deletion based on a threshold
my_candidates_for_deletion <- rownames(s[which(s$Test < deletion_threshold),])
candidates_for_deletion <- c(my_candidates_for_deletion,candidates_for_deletion)
#get the ranking of each variable
s1 <- s
s1 <- s1[order(s1$Test,decreasing = TRUE),]
s1$Rank <- 1:nrow(s1)
colnames(s1)[ncol(s1)] <- paste('Rank',modType,sep="_")
s1 <-s1[order(row.names(s1)),]
#combine the rankings
if (modType == modTypes[1]){
rankings <- s1[ncol(s1)]
} else {
rankings <- cbind(rankings,s1[ncol(s1)])
}
#---------------------------------------
# plot the chart
#---------------------------------------
s <- s[order(s$Test),]
x <- barplot(as.matrix(t(s))
,horiz=TRUE
,beside=TRUE
,main = paste(modType,'variable importance\nTrain RMSE =', formatC(full_Train_Error,digits = 5,format='f'),'\nTest RMSE = ',formatC(full_Test_Error,digits = 5,format='f'))
,col=c("aliceblue","forestgreen")
,xlim=c(-0.2,1)
,axes = FALSE
,axisnames = FALSE)
text(-0.1,colSums(x)/2,row.names(s),col='blue')
legend('bottomright',inset=0.05,c('Train','Test'),fill=c("aliceblue","forestgreen"))
abline(v=0)
} #end looping through model types
```

```
#end multithreading
sfStop()
#plot the candidates for deletion
barplot(sort(table(candidates_for_deletion)),main='Unimportant Variables',ylab='Votes',xlab='Variable Name')
```

```
#plot average ranking
d <- sort(rowMeans(rankings),decreasing = TRUE)
x <- barplot(
d
,main='Average Variable Ranking'
,ylab='Variable'
,xlab='Average Rank'
,horiz=TRUE
,xlim=c(-3,max(d))
,axes = TRUE
,axisnames = FALSE
)
text(-1.5,x,names(d),col='blue')
text(0.5,x+0.05,formatC(d,digits = 1,format='f'),col='blue',cex=0.8)
```

Excellent talk!

Curious about something in relation to the variable importance bit: it seemed you were advocating using all of your learners to vote on important variables. If so, do you then use this important set to refit all the models using only this predictor set? Doesn't that impact the ability of the disparate models when ensembling them together?

By the way, http://cran.r-project.org/web/packages/RRF/index.html appears to be a published R package for regularized random forest. Also: https://dl.dropboxusercontent.com/u/45301435/GRF.pdf

Jeff

Hi Jeff,

I wasn't advocating anything in particular - just wanted to demonstrate that this technique is algorithm independent and plot a few charts - the voting stuff was just one of those charts.

In reality what you could do is stick in a random number as a variable and use this as the cut off point. Any variables that are no more important than the random number across all algorithms can probably be eliminated - although this was hard to do in this demo as the numbers were random anyway.

The regularised random forest is not the same algorithm as Rie's, although thanks for introducing it to me.
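The random-number cut-off described above can be sketched as follows. This is only an illustration under stated assumptions: the data set, the single `lm` model, the in-sample scoring, the helper `permImportance` and the 5% margin over the probe are all made up for the example, whereas the suggestion above is to compare against the probe across all algorithms.

```r
set.seed(1)
n <- 1000

# hypothetical data: two useful variables, one junk variable, one random probe
myData <- data.frame(
  useful_1 = runif(n),
  useful_2 = runif(n),
  junk_1   = runif(n),
  probe    = runif(n)          # pure noise, added as a reference point
)
myData$target <- 3 * myData$useful_1 + 1 * myData$useful_2 + rnorm(n, 0, 0.1)

model <- lm(target ~ ., data = myData)
rmse  <- function(act, pred) sqrt(mean((act - pred)^2))
baseError <- rmse(myData$target, predict(model, myData))

# permutation importance: scramble one variable, re-score the same model
permImportance <- function(variable, loops = 20) {
  scrambled <- myData
  errs <- replicate(loops, {
    scrambled[[variable]] <- sample(myData[[variable]])
    rmse(myData$target, predict(model, scrambled))
  })
  mean(errs) / baseError       # ratio > 1 means the model got worse
}

variables  <- setdiff(colnames(myData), "target")
importance <- sapply(variables, permImportance)

# keep only variables that clearly beat the random probe (assumed 5% margin)
keep <- names(importance)[importance > importance["probe"] * 1.05]
print(round(importance, 2))
print(keep)
```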

Thanks for the clarification.

Hi Phil

Great talk, I have shared this with the modellers in my team. I'm curious why you perturb the data rather than leave each variable out? Most documented approaches I've seen do the latter. Is that something you've seen elsewhere, and what are the advantages?

Hi Shane,

This method gives an indication of the variable importance in the model we've built - so if it is your final model then it indicates the variables driving that model. In order to leave a variable out you need to build a model for each variable. My method saves you having to do this and will quickly indicate all the variables that can be eliminated as they are having no effect on the outcome of that model.

The variable importance will change depending on the variables in the model. For example, if two variables are essentially the same then they could share the importance, whereas if we got rid of one of them the other could suddenly become twice as important. If

This technique is not new. See Elements of Statistical Learning, bottom of page 593.

http://www-stat.stanford.edu/~tibs/ElemStatLearn/

Got it, thanks!

I enjoyed your presentation, thank you. There were some good tips. Another tip would be to create new features from existing features, e.g. feature9 = feature1 * feature5.

Kind Regards,

James
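A minimal sketch of that tip, assuming a made-up data set in which the target really does depend on the product of two features. The linear models and the in-sample RMSE comparison are chosen only to keep the example short.

```r
set.seed(2)
n <- 500
feature1 <- runif(n)
feature5 <- runif(n)
# hypothetical target that depends on the product of the two features
target <- feature1 * feature5 + rnorm(n, 0, 0.02)
d <- data.frame(feature1, feature5, target)

rmse <- function(act, pred) sqrt(mean((act - pred)^2))

# model on the raw features only
fitRaw <- lm(target ~ feature1 + feature5, data = d)

# add the engineered feature and refit
d$feature9 <- d$feature1 * d$feature5
fitNew <- lm(target ~ feature1 + feature5 + feature9, data = d)

rmseRaw <- rmse(d$target, fitted(fitRaw))
rmseNew <- rmse(d$target, fitted(fitNew))
print(c(raw = rmseRaw, withFeature9 = rmseNew))
```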

This was a really informative article

Good post.

Very nice article. Were there ever parts 2 & 3? I'd be interested in seeing those.

Sorry, no. But I am putting together some tutorials if you are Melbourne based.

Really good article/speech. I love the method of evaluating feature importance. Does this method have a name or any other reference?

(BTW: It was a bit unclear to me at first whether or not you retrained the model in between each feature shuffling. Apparently you do not retrain, but only consider the pretrained model. If you had retrained the model it would be similar to one step of RFE.)

Well... If I'd just read the other comments, I would have seen that you are referring to Elements of Statistical Learning 2ed. bottom of page 593. An updated link here: https://web.stanford.edu/~hastie/Papers/ESLII.pdf
