[R] finding peaks in a simple dataset with R

Tuszynski, Jaroslaw W. JAROSLAW.W.TUSZYNSKI at saic.com
Mon Nov 28 15:41:28 CET 2005


Try,

  # work directly with data from the input files
  directory  = system.file("Test", package = "caMassClass")
  X = msc.rawMS.read.csv(directory, "IMAC_normal_.*csv")
  Peaks = msc.peaks.find(X) # Find Peaks
  cat(nrow(Peaks), "peaks were found in", Peaks[nrow(Peaks),2], "files.\n")
  stopifnot( nrow(Peaks)==424 )

Run this on my data first to see that everything works OK. Then I would
convert your "input.dat" to CSV format:

2.00, 233
2.04, 220
...
11.60, 540
12.00, 600   <-- a peak!
12.04, 450
...

On a Windows machine, you can do this by opening your file in Excel and
saving it as CSV, or possibly by using a text editor to replace ' ' with
', '. Then the script

  X = msc.rawMS.read.csv('.', "Input.csv")
  Peaks = msc.peaks.find(X)
  cat(nrow(Peaks), "peaks were found in", Peaks[nrow(Peaks),2], "files.\n")

 should work.
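
The CSV conversion can also be done from within R. This is only a sketch in
base R: the temporary files stand in for "input.dat" and "Input.csv", the
column names "mz" and "intensity" are made up for the demo, and I am assuming
the reader wants no header row, as in the sample above.

```r
# Self-contained demo: a small whitespace-delimited file stands in for "input.dat".
infile  <- tempfile(fileext = ".dat")
outfile <- tempfile(fileext = ".csv")
writeLines(c("mz intensity",          # hypothetical header row
             "2.00 233", "2.04 220", "11.60 540",
             "12.00 600", "12.04 450"), infile)

# Read the two columns, then write them back out comma-separated.
dat <- read.table(infile, header = TRUE)
write.table(dat, outfile, sep = ",",
            row.names = FALSE,        # no row-number column
            col.names = FALSE,        # assumption: no header line in the CSV
            quote = FALSE)

csv_lines <- readLines(outfile)
print(csv_lines)
```

Note that write.table drops trailing zeros, so 2.00 is written as 2.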

Another way is to try:

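  X = read.table("input.dat", header=TRUE)
  Y = as.matrix(X[,2])             # one-column matrix; a plain vector cannot take rownames
  rownames(Y) = signif(X[,1], 6)   # row names store the M/Z value of each row
  Peaks = msc.peaks.find(Y)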

This casts your data into the correct format, described in the documentation as:
"Spectrum data either in matrix format [nFeatures x nSamples] or in 3D array
format [nFeatures x nSamples x nCopies]. Row names (rownames(X)) store M/Z
mass of each row."
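
To make the [nFeatures x nSamples] layout concrete, here is a small base R
sketch that does not need caMassClass. The data frame stands in for the result
of read.table, and the names "mz", "intensity" and "sample1" are made up for
illustration. It also shows why a data frame is probably rejected with
"'x' must be atomic": a data frame is a list of columns, not an atomic matrix.

```r
# Stand-in for read.table("input.dat", header = TRUE); names are hypothetical.
X <- data.frame(mz        = c(2.00, 2.04, 11.60, 12.00, 12.04),
                intensity = c(233, 220, 540, 600, 450))

# A data frame is a list of columns, so it is not atomic.
print(is.atomic(X))

# One sample -> a matrix with nFeatures rows and 1 column,
# with the M/Z values stored as row names.
Y <- matrix(X[, 2], ncol = 1,
            dimnames = list(as.character(signif(X[, 1], 6)), "sample1"))

print(is.atomic(Y))
print(dim(Y))
print(rownames(Y))
```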

I hope one of those solutions works for you.
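
As a quick sanity check that is independent of the package, a naive local
maximum search in base R might look like the sketch below. This is not the
algorithm used by msc.peaks.find, only a cross-check on the sample values from
the original question; it does no smoothing and ignores flat-topped peaks.

```r
# Sample values from the question; variable names are illustrative.
mz        <- c(2.00, 2.04, 11.60, 12.00, 12.04)
intensity <- c(233, 220, 540, 600, 450)

# sign(diff(.)) is +1 where the signal rises and -1 where it falls;
# a change from +1 to -1 (difference of -2) marks a local maximum.
turn     <- diff(sign(diff(intensity)))
peak_idx <- which(turn == -2) + 1

peaks <- data.frame(mz = mz[peak_idx], intensity = intensity[peak_idx])
print(peaks)
```

On noisy data nearly every small wiggle would be reported this way, which is
the reason to prefer a proper peak finder for real spectra.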

Good Luck.

Jarek Tuszynski

-----Original Message-----
From: dylan.beaudette at gmail.com [mailto:dylan.beaudette at gmail.com] 
Sent: Wednesday, November 23, 2005 5:47 PM
To: r-help at stat.math.ethz.ch
Cc: Tuszynski, Jaroslaw W.
Subject: Re: [R] finding peaks in a simple dataset with R


On Wednesday 23 November 2005 10:15 am, Tuszynski, Jaroslaw W. wrote:
> >> I am looking for some way to locate peaks in a simple x,y data set.
>
> See my 'msc.peaks.find' function in 'caMassClass', it has a simple 
> peak finding algorithm.
>
> Jarek Tuszynski
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list 
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html

Jarek,

Thanks for the tip. I was able to install the caMassClass package and all of
its dependencies. In addition, I was able to run the examples on the manual
pages.

However, the format of the input data to the 'msc.peaks.find' function is not
apparent to me. In its simplest form, my data looks something like this:

2.00 233
2.04 220
...
11.60 540
12.00 600   <-- a peak!
12.04 450
...

Here is an example R session, trying out the function you suggested:

#importing my data like this:
X <- read.table("input.dat", header=TRUE)

#from the example:
Peaks = msc.peaks.find(X)

#errors with:
Error in sort(x, partial = unique(c(lo, hi))) :
        'x' must be atomic


Also: I have tried one of the functions ( 'getPeaks' ) listed on the 
'msc.peaks.find' manual page, however I am still having a problem with the 
format of my data vs. what the function is expecting.

#importing my data like this:
X <- read.table("input.dat", header=TRUE)

#setup an output file for peak information
peakfile <- paste("peakinfo.csv", sep="/")

#run the analysis:
getPeaks(X,peakfile)

#errors with:
Error in area/max(area) : non-numeric argument to binary operator
In addition: Warning message:
no finite arguments to max; returning -Inf

any ideas would be greatly appreciated!

-- 
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341



