[R] Writing loop to estimate ARCH test for multiple columns of a data frame?

Wed May 13 13:28:26 CEST 2020

Dear Sir,

I am sorry that, owing to certain inconveniences, I was late in trying your
suggested code and in replying to your email.

Thank you very much for your wonderful solution and suggestion for my
problem. As before, your suggested code worked very well. I also
successfully imported the required output into Word by following a path
similar to the one you suggested for the LibreOffice editor.

But I have certain queries about your suggested code, listed below, which I
would like to discuss with you for my further learning.

1. Is there any difference between reading a .tab file and a .text file in
R? When I used

sp_8_5<-read.table("sp_8_5.tab",sep="\t",
 header=TRUE,stringsAsFactors=FALSE)

it threw an error. But when I renamed sp_8_5.tab to sp_8_5.text, it worked.
So my query is: does R read tab and text files differently, even though
both files are similar?

2. In the code
"return(sprintf("ChiSq = %.1f, p = %.3f",archout$statistic,archout$p.value))",
sprintf prints the particular results (i.e., the statistic and p-value),
right? Further, "ChiSq = %.1f, p = %.3f" indicates that the values are shown
to 1 and 3 decimal places respectively, right? Kindly correct me if I am
wrong in my interpretation.

3. While opening a text file,

sink("sp_8_5.txt")
for(row in 0:2) {
 for(column in 1:4)
  cat(spout[[column+row*4]],ifelse(column < 4,"\t","\n"))
}
sink()

3.1. What does sink indicate? I think sink here arranges the statistics and
p-values in the required 3*4 layout in the generated text file, right?
Please educate me.
3.2. Hence, the results are arranged in 3 rows and 4 columns in the text
file. I understand the loop for the columns [i.e., for(column in 1:4)], but
I didn't understand the loop for the rows [i.e., for(row in 0:2)]. In
particular, what is the logic behind setting 2 rather than 3 for 3 rows in
"for(row in 0:2)"?
3.3. In the code "cat(spout[[column+row*4]],ifelse(column < 4,"\t","\n"))",
what does cat indicate? What is the logic behind [column+row*4] and
ifelse(column < 4,"\t","\n")? This is my major query in the entire code.
Please help me to understand this line.
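
The questions above about sprintf, sink, cat and the index column+row*4 can
be checked with a small self-contained sketch. It does not use real ARCH
results: "stat" and "pval" are made-up numbers standing in for
archout$statistic and archout$p.value, and "out_file" is a temporary file
rather than sp_8_5.txt. These names are assumptions for illustration only.

```r
# Made-up stand-ins for 12 test statistics and p-values (NOT real ARCH results)
stat <- seq(10, 120, by = 10) + 0.26
pval <- seq(0.01, 0.12, by = 0.01) + 0.0004

# sprintf() builds a string; %.1f and %.3f round to 1 and 3 decimal places
spout <- as.list(sprintf("ChiSq = %.1f, p = %.3f", stat, pval))

# row takes the values 0, 1, 2 (three values, hence three rows);
# column + row * 4 then visits elements 1-4, 5-8 and 9-12 of the list
idx <- sapply(0:2, function(row) 1:4 + row * 4)
t(idx)

# sink(file) redirects console output into the file until sink() is called
# again with no arguments; cat() prints its arguments as plain text
out_file <- tempfile(fileext = ".txt")
sink(out_file)
for (row in 0:2) {
  for (column in 1:4) {
    # a TAB follows columns 1-3, a newline ends the row after column 4;
    # sep = "" stops cat() from adding a space before the TAB
    cat(spout[[column + row * 4]], if (column < 4) "\t" else "\n", sep = "")
  }
}
sink()

# read the file back: 3 lines, each holding 4 TAB-separated strings
lines_written <- readLines(out_file)
fields <- strsplit(lines_written, "\t")
```

Starting row at 0 means the first row adds nothing to the column number;
0:2 still holds three values. With for(row in 1:3) the index would have to
be written as column+(row-1)*4 to reach the same elements.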

Along with the above queries about your suggested code, I have one more: is
it possible to name each row and column? I am asking because I have data
from 80 countries, and each country has 5 variables arranged in 5 columns.
In other words, the total number of columns in my study is 400. While doing
the ARCH test for each column, I may make a mistake in arranging the
results in the text file. Thus, I want to arrange the resulting statistics
for the 5 columns (for instance A1, A2, A3, A4, A5) of each country in the
following way, which I think will avoid any kind of typing mistake in
arranging the output in the text file. In other words, each row will hold
the results for one country, arranged in 5 columns for the 5 variables,
which makes it easy to identify the result for a particular column of a
particular country.

Country      A1     A2     A3     A4     A5
India        0.65   0.33   0.32   0.12   0.34
Israel       0.35   0.05   0.10   0.15   0.23
Australia    0.43   0.25   0.45   0.55   0.56

and so on.
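
One possible way to get such a labelled table, sketched under assumptions:
the results are stored in country order, with the five variables of each
country next to each other. The names "countries", "vars", "res" and "tab"
are hypothetical, and the numbers are only the three example rows above,
not real output.

```r
# Hypothetical labels; with the real data there would be 80 countries
countries <- c("India", "Israel", "Australia")
vars <- paste0("A", 1:5)

# Example values copied from the table above, in the order the columns
# would be processed: India A1..A5, Israel A1..A5, Australia A1..A5
res <- c(0.65, 0.33, 0.32, 0.12, 0.34,
         0.35, 0.05, 0.10, 0.15, 0.23,
         0.43, 0.25, 0.45, 0.55, 0.56)

# byrow = TRUE fills one row per country; dimnames attaches the labels
tab <- matrix(res, nrow = length(countries), byrow = TRUE,
              dimnames = list(countries, vars))

# write a TAB-separated file that includes the row and column labels;
# col.names = NA leaves an empty header cell above the country names
out_file <- tempfile(fileext = ".txt")
write.table(tab, out_file, sep = "\t", quote = FALSE, col.names = NA)

# individual results can then be looked up by name rather than position
tab["Israel", "A2"]
```

Because every value carries its country and variable name, a misplaced
result shows up as a wrong label rather than passing unnoticed.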

Thank you very much, Sir, for educating an R learner, for which I shall
always be grateful to you.

On Sat, May 9, 2020 at 8:58 AM Jim Lemon <drjimlemon using gmail.com> wrote:

> Hi Subhamitra,
> I have washed the dishes and had a night's sleep, so I can now deal with
> your text munging problem. First, I'll reiterate the solution I sent:
>
> sp_8_5<-read.table("sp_8_5.tab",sep="\t",
>  header=TRUE,stringsAsFactors=FALSE)
> library(tseries)
> library(FinTS)
> # create a function that returns only the
> # statistic and p.value as a string
> archStatP<-function(x) {
>  archout<-ArchTest(x)
>  # I have truncated the values here
>  return(sprintf("ChiSq = %.1f, p = %.3f",archout$statistic,archout$p.value))
> }
> # using "lapply", run the test on each column
> spout<-lapply(sp_8_5[,2:13],archStatP)
>
> If you look at "spout" you will see that it is a list of 12 character
> strings. I arranged this as you seem to want the contents of a 3x4 table in
> a document. This is one way to do it, there are others.
>
> First, create a text table of the desired dimensions. I'll do it with
> loops as you seem to be familiar with them:
>
> # open a text file
> sink("sp_8_5.txt")
> for(row in 0:2) {
>  for(column in 1:4)
>   cat(spout[[column+row*4]],ifelse(column < 4,"\t","\n"))
> }
> sink()
>
> If you open this file in a text editor (e.g. Notepad) you will see that it
> contains 3 lines (rows), each with four TAB separated strings. Now to
> import this into a word processing document. I don't have MS Word, so I'll
> do it with Libre Office Writer and hope that the procedure is similar.
>
> Move to where you want the table in your document
> Select Insert|Text from file from the top menu
> Select (highlight) the text you have imported
> Select Convert|Text to table from the top menu
>
> The highlighted area should become a table. I had to reduce the font size
> from 12 to 10 to get the strings to fit into the cells.
>
> There are probably a few more changes that you will want, so let me know
> if you strike trouble.
>
> Jim
>
>
> On Fri, May 8, 2020 at 11:28 PM Subhamitra Patra <
> subhamitra.patra using gmail.com> wrote:
>
>> Dear Sir,
>>
>> Thank you very much for your wonderful suggestion for my problem. Your
>> suggested code has excellently worked and successfully extracted the
>> statistics and p-value in another R object.
>>
>> Concerning your last suggestion, I attempted to separate the strings with
>> TAB characters in the "spout" object by using different packages like
>> dplyr, tidyr and qdap, and also by using the split and strsplit functions,
>> so that I can export the statistics and p-values for each column to Excel,
>> and later to the MS Word file, but got the errors below.
>>
>> By using the  split function, I wrote the code as,
>> *string[] split = s.Split(spout, '\t')*
>> where I got the following errors.
>> Error: unexpected symbol in "string[] split"
>> Error: unexpected symbol in "string[[]]split"
>> Error in strsplit(row, "\t") : non-character argument
>>
>> Then I tried with  strsplit function by the below code
>> *strsplit(spout, split)*
>> But, got the below error as
>> Error in as.character(split) :
>>   cannot coerce type 'closure' to vector of type 'character'.
>>
>> Then I used the dplyr and tidyr packages and wrote the code below
>> library(dplyr)
>> library(tidyr)
>> *separate(spout,value,into=c(“ChiSq”,”p”),sep=”,”)*
>> *separate(spout,List of length 12,into=c(“ChiSq”,”p”),sep="\t")*
>> But, got the errors as,
>> Error: unexpected input in "separate(spout,value,into=c(“"
>> Error: unexpected symbol in "separate(spout,List of"
>>
>> Then used qdap package with the code below
>>
>> *colsplit2df(spout,, c("ChiSq", "p"), ",")*
>> *colsplit2df(spout,, c("ChiSq", "p"), sep = "\t")*
>> But got the following errors
>> Error in dataframe[, splitcol] : incorrect number of dimensions
>> In addition: Warning message:
>> In colsplit2df_helper(dataframe = dataframe, splitcol = splitcols[i],  :
>>   dataframe object is not of the class data.frame
>> Error in dataframe[, splitcol] : incorrect number of dimensions
>> In addition: Warning message:
>> In colsplit2df_helper(dataframe = dataframe, splitcol = splitcols[i],  :
>>   dataframe object is not of the class data.frame
>>
>> Sir, please suggest me where I am going wrong in the above to separate
>> string in the "spout" object.
>>
>> Thank you very much for your help.
>>
>> On Fri, May 8, 2020 at 4:47 PM Jim Lemon <drjimlemon using gmail.com> wrote:
>>
>>> 1) In general, *apply functions return a list with the number of
>>> elements equal to the number of columns or other elements of the input
>>> data. You can assign that list as I have to "spout" in the first example.
>>>
>>> 2) spout<-list() assigns the name "spout" to an empty list. As we are
>>> processing columns 2 to 12 of the input data, spout[[i-1]] assigns the
>>> results to elements 1 to 11 of the list "spout". Just a low trick.
>>>
>>> 1a) Yes, you can create a "wrapper" function that will return only the
>>> statistic and p.value.
>>>
>>> # create a function that returns only the
>>> # statistic and p.value as a string
>>> archStatP<-function(x) {
>>>  archout<-ArchTest(x)
>>>  return(sprintf("ChiSq = %f, p = %f",archout$statistic,archout$p.value))
>>> }
>>> # using "lapply", run the test on each column
>>> spout<-lapply(sp_8_5[,2:12],archStatP)
>>>
>>> Note that I should have used "lapply". I didn't check the output
>>> carefully enough.
>>>
>>> 2a) Now you only have to separate the strings in "spout" with TAB
>>> characters and import the result into Excel. I have to wash the dishes, so
>>> you're on your own.
>>>
>>> Jim
>>>
>>> On Fri, May 8, 2020 at 8:26 PM Subhamitra Patra <
>>> subhamitra.patra using gmail.com> wrote:
>>>
>>>> Dear Sir,
>>>>
>>>> Thank you very much for such an excellent solution to my problem. I have
>>>> been trying the sapply function for the last few days, but was unable to
>>>> write it properly. Now I understand my mistake in using the sapply
>>>> function in the code. I have two queries regarding this which I want to
>>>> discuss here just for my learning purpose.
>>>>
>>>> 1. While using sapply function for estimating one method across the
>>>> columns of a data frame, one needs to define the list of the output table
>>>> after using sapply so that the test results for each column will be
>>>> consistently stored in an output object, right?
>>>>
>>>> 2. In the spout<- list() command, what spout[[i-1]]  indicates?
>>>>
>>>> Sir, one more possibility which I would like to ask related to my above
>>>> problem just to learn for further R programming language.
>>>>
>>>> After running your suggested code, all the results for each column are
>>>> being stored in the spout object. From this, I need only the statistics and
>>>> P-value for each column. So, my queries are:
>>>>
>>>> 1. Is there any way to extract only two values (i.e., statistics and
>>>> p-value) for each column that stored in spout object and save these two
>>>> values in another R data frame for each column?
>>>>  or
>>>> 2. Is there any possibility that the statistics and p-value
>>>> calculated for each column can directly export to a word file in a table
>>>> format (having 4 columns and 3 rows). In particular, is it possible to
>>>> extract both statistic and p-value results for each column to an MS word
>>>> file with the format of A1, A2, A3, A4 column results in 1st row, A5, A6,
>>>> A7, A8 column results in 2nd row, and A9, A10, A11, A12 column results in
>>>> the 3rd row of the table?
>>>>
>>>>
>>>> Like before, your suggestion will definitely help me to learn the
>>>> advanced R language.
>>>>
>>>> Thank you very much for your help.
>>>>
>>>> On Fri, May 8, 2020 at 2:37 PM Jim Lemon <drjimlemon using gmail.com> wrote:
>>>>
>>>>> Hi Subhamitra,
>>>>> This isn't too hard:
>>>>>
>>>>> # read in the sample data that was
>>>>> # saved in the file "sp_8_5.tab"
>>>>> sp_8_5<-read.table("sp_8_5.tab",sep="\t",
>>>>>  header=TRUE,stringsAsFactors=FALSE)
>>>>> library(tseries)
>>>>> library(FinTS)
>>>>> # using "sapply", run the test on each column
>>>>> spout<-sapply(sp_8_5[,2:12],ArchTest)
>>>>>
>>>>> The list "spout" contains the test results. If you really want to use
>>>>> a loop:
>>>>>
>>>>> spout<-list()
>>>>> for(i in 2:12) spout[[i-1]]<-ArchTest(sp_8_5[,i])
>>>>>
>>>>> Jim
>>>>>
>>>>>
>>>>> On Fri, May 8, 2020 at 5:27 PM Subhamitra Patra <
>>>>> subhamitra.patra using gmail.com> wrote:
>>>>>
>>>>>> Dear Sir,
>>>>>>
>>>>>> Herewith I am pasting a part of my sample data having 12 columns
>>>>>> below, and want to calculate ARCH test for the 12 columns by using a loop.
>>>>>>
>>>>>>
>>>>
>>>> --
>>>> *Best Regards,*
>>>> *Subhamitra Patra*
>>>> *Phd. Research Scholar*
>>>> *Department of Humanities and Social Sciences*
>>>> *Indian Institute of Technology, Kharagpur*
>>>> *INDIA*
>>>>

-- 
*Best Regards,*
*Subhamitra Patra*
*Phd. Research Scholar*
*Department of Humanities and Social Sciences*
*Indian Institute of Technology, Kharagpur*
*INDIA*
