[R] R how to find outliers and zero mean columns?

David Winsemius dwinsemius at comcast.net
Thu Mar 31 05:25:06 CEST 2016


> On Mar 30, 2016, at 6:39 PM, Norman Pat <normanmath1 at gmail.com> wrote:
> 
> Hi David,
> 
> > Please find the attached data sample.
> 
> No. Nothing attached. Please read the Rhelp Info page and the Posting Guide.
> I attached it. Anyway I have attached it again (sample train.xlsx).

I didn't say you didn't attach it. I only said there was nothing attached. There's a difference. The mail-server strips most attachments. I _told_ you to read certain documents. You are not demonstrating that you are capable of following basic instructions. 

-- 
David Winsemius


> 
> Who is assigning you this task? Homework? (Read the Posting Guide.)
> This is my new job role, so I have to do it. I know some basic R.
> 
> > 1. How to Identify features (names) that have all zeros?
> 
> That's generally pretty simple if "names" refers to columns in a data frame.
> Do you mean something like names(data.nrow(means==0))?
> 
> > 2. How to remove features that have all zeros from the dataset?
> 
> But maybe you mean to process by rows?
> In a column (feature).
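
If "features" means the columns of a data frame, questions 1 and 2 can be sketched roughly as follows. This is only a minimal sketch: `df` is a made-up toy data frame (not the poster's data), and `is_all_zero`, `zero_cols` and `df_clean` are names invented for the illustration. Note that "zero mean" and "all zeros" are different tests: c(-1, 1) has mean 0 but is not all zeros, so the sketch checks the values themselves.

```r
# Toy data frame standing in for the real one (hypothetical values).
df <- data.frame(a = c(0, 0, 0),
                 b = c(1, 0, 2),
                 c = c(0, 0, 0),
                 d = c("x", "y", "z"))

# TRUE for each numeric column whose non-missing values are all 0.
# Caveat: a numeric column that is entirely NA also passes this test.
is_all_zero <- vapply(
  df,
  function(col) is.numeric(col) && all(col == 0, na.rm = TRUE),
  logical(1)
)

# Question 1: names of the all-zero columns.
zero_cols <- names(df)[is_all_zero]

# Question 2: the data frame without those columns.
# drop = FALSE keeps a data frame even if only one column remains.
df_clean <- df[, !is_all_zero, drop = FALSE]
```

Non-numeric columns are never flagged here because of the is.numeric() guard; whether that is appropriate depends on how the real data are coded.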
> 
> > 3. How to identify features (names) that have outliers such as 99999,-1 in
> > the data frame.
> Please refer to the attached excel file
> 
> > 4. How to remove outliers?
> 
> You could start by defining "outliers" in something other than vague examples. If this is data from a real-life data gathering effort, then defining outliers would start with an explanation of the context.
> I need to find the outliers by looking at the data.
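
If values such as 99999 and -1 turn out to be placeholder codes for missing data rather than real measurements (an assumption that only the context of the data can confirm), one way to find the affected columns and blank the codes out is sketched below. The data frame `df`, the vector `sentinels` and the other names are hypothetical and exist only for this illustration.

```r
# Toy data frame (hypothetical values).
df <- data.frame(x = c(1, 99999, 3),
                 y = c(-1, 2, -1),
                 z = c(5, 6, 7))

# Assumed placeholder codes, taken from the examples in the question.
sentinels <- c(99999, -1)

# Number of placeholder values found in each column.
hits <- vapply(df, function(col) sum(col %in% sentinels), integer(1))

# Names of the columns that contain at least one placeholder value.
flagged <- names(hits)[hits > 0]

# Copy of the data with the placeholder values set to NA.
df_na <- df
for (nm in flagged) {
  df_na[[nm]][df_na[[nm]] %in% sentinels] <- NA
}
```

Setting the codes to NA keeps all rows, whereas dropping every row that contains a code could discard a large share of the observations. If -1 is a legitimate value in some columns, `sentinels` would have to be defined per column instead.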
> 
> Thanks 
> 
> 
> On Thu, Mar 31, 2016 at 12:20 PM, David Winsemius <dwinsemius at comcast.net> wrote:
> 
> > On Mar 30, 2016, at 3:56 PM, Norman Pat <normanmath1 at gmail.com> wrote:
> >
> > Hi team
> >
> > I am new to R so please help me to do this task.
> >
> > Please find the attached data sample.
> 
> No. Nothing attached. Please read the Rhelp Info page and the Posting Guide.
> 
> > But in the original data frame I
> > have 350 features and 400000 observations.
> >
> > I need to carry out these tasks.
> 
> Who is assigning you this task? Homework? (Read the Posting Guide.)
> 
> > 1. How to Identify features (names) that have all zeros?
> 
> That's generally pretty simple if "names" refers to columns in a dataframe.
> 
> >
> > 2. How to remove features that have all zeros from the dataset?
> 
> But maybe you mean to process by rows?
> 
> 
> > 3. How to identify features (names) that have outliers such as 99999,-1 in
> > the data frame.
> >
> > 4. How to remove outliers?
> 
> You could start by defining "outliers" in something other than vague examples. If this is data from a real-life data gathering effort, then defining outliers would start with an explanation of the context.
> 
> 
> >
> >
> > Many thanks
> 
> Please at least do the following "homework".
> 
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> David Winsemius
> Alameda, CA, USA
> 
> 
> <sample train .xlsx>

David Winsemius
Alameda, CA, USA


