Wednesday, 8 June 2011

Data Anomalies

This is a place where I will put some quirks of the HHP data that may or may not be relevant in winning the $3 million.

1) Be aware of the claims truncated flag for the claims in Y1.

2) Some members have claims under the pediatrics specialty but are clearly not children. This might indicate family accounts or an incorrectly recorded age.
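
One way to look for the second quirk is to join the claims to the members table and count pediatrics claims for members whose age band is not a child band. This is only a rough sketch: the column names (MemberID, AgeAtFirstClaim, Specialty), the age band labels, the choice of which bands count as children, and the sample rows are all assumptions made for illustration, so check them against the actual files before relying on it.

```python
import csv
import io

# Hypothetical sample rows standing in for the members and claims files.
# Column names and values are assumptions, not taken from the real data.
MEMBERS_CSV = """MemberID,AgeAtFirstClaim,Sex
1,0-9,M
2,40-49,F
3,10-19,F
4,70-79,M
"""

CLAIMS_CSV = """MemberID,Year,Specialty
1,Y1,Pediatrics
2,Y1,Pediatrics
2,Y1,Internal
3,Y2,Pediatrics
4,Y1,Pediatrics
4,Y2,Pediatrics
"""

# Assumption: these age bands are the ones treated as children.
CHILD_AGE_BANDS = {"0-9", "10-19"}


def adult_pediatric_claims(members_file, claims_file):
    """Count pediatrics claims per member whose age band is not a child band."""
    # Map each member to their recorded age band.
    age = {row["MemberID"]: row["AgeAtFirstClaim"]
           for row in csv.DictReader(members_file)}
    counts = {}
    for row in csv.DictReader(claims_file):
        if row["Specialty"] != "Pediatrics":
            continue
        band = age.get(row["MemberID"], "")
        # Skip members with a missing age; only flag known non-child bands.
        if band and band not in CHILD_AGE_BANDS:
            counts[row["MemberID"]] = counts.get(row["MemberID"], 0) + 1
    return counts


suspects = adult_pediatric_claims(io.StringIO(MEMBERS_CSV),
                                  io.StringIO(CLAIMS_CSV))
print(suspects)
```

To run it on the real data, pass open file handles for the members and claims files instead of the in-memory samples. Members with a missing age are skipped here rather than flagged, which is a judgment call.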


  1. Thank you for your blog. I am new to data mining and am using HHP as a training ground. Your articles have really helped me think things through, as well as import the data into SQL Server within a few minutes.
    I am learning R and your examples are invaluable - please keep up the good work.

  2. Hi Vishal,

    I'm glad you are finding some useful stuff here. I'm no R expert and haven't been using it long myself. What is good about R is that you know it is possible to do anything you might want to do, even if finding out how to do it can be a long-winded process of Googling and trial and error. Having examples to work from is the best way to get the hang of things.

    My first encounter with R is documented in another forum, where you might find some useful code snippets.

    After another 2 years you will be a guru. Please remember me if you win the 3 million.

  3. Thank you for your response - your link is great and I will spend a few hours going through the posts.

    If I win the 3 million - you shall receive 10% for showing me the path....