[R] follow-up on Error when reading a SAS transport file (with sasxport.get from Hmisc)

Frank E Harrell Jr f.harrell at vanderbilt.edu
Fri Oct 10 22:45:31 CEST 2008


Peter Dalgaard wrote:
> Jean-Louis Abitbol wrote:
>> I have done what P. Dalgaard suggested and I don't find a
>> discrepancy between the number of values and the number of labels: there
>> are 15 of each...
>>
>> Any hint on what might be going wrong here?
>>   
> 
> Actually, I think you got it:
> 
>  > factor(1,c(NA,1:4),c(1:5))
> Error in factor(1, c(NA, 1:4), c(1:5)) :
>  invalid labels; length 5 should be 1 or 4
> 
> but
> 
>  > factor(1,c(NA,1:4),c(1:5),exclude=NULL)
> [1] 2
> Levels: 1 2 3 4 5
> 
> so the issue is more than likely that your SAS format puts a label on 
> "." (missing). You probably need something like
> 
> factor(x, f$value, f$label, exclude=if (!any(is.na(f$value))) NA)

Thanks Peter.  We will make this change in Hmisc for the next release.
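
For reference, here is a minimal sketch of what Peter's suggestion does. It uses a made-up, shortened format table, not the real Hmisc internals: the names make_factor, f and x are hypothetical, and this is not the actual code in sasxport.get.

```r
# Hypothetical, shortened format table in which the SAS format
# assigns a label to "." (missing), as VISITF does.
f <- list(
  value = c(NA, -10, 0, 14),
  label = c("INEXTXT", "Visit 1 [Screening]", "Visit 2 [Baseline]", "Visit 3")
)
x <- c(0, 14, NA, -10)

# Keep NA as a level only when the format itself defines a label for it.
# `if (FALSE) NA` evaluates to NULL, so exclude = NULL in that case;
# otherwise exclude = NA, which is factor()'s default.
make_factor <- function(x, f) {
  factor(x, f$value, f$label, exclude = if (!any(is.na(f$value))) NA)
}

res <- make_factor(x, f)
print(res)
```

With the default exclude = NA, factor() drops the NA level first, leaving 3 levels against 4 labels; that mismatch is what triggers the "invalid labels" error.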

Thomas - please take note.  Thanks.

Frank

>> Here is the output
>>
>> The SAS format from proc contents
>>
>> VISITF
>>     . = INEXTXT
>>   -10 = Visit 1 [Screening]
>>     0 = Visit 2 [Baseline]
>>     1 = CRF Tracking
>>     6 = Visit 6
>>     7 = Tel.Contact (day 7)
>>    14 = Visit 3
>>    21 = Tel.Contact (day 21)
>>    28 = Visit 4
>>    35 = Tel.Contact (day 35)
>>    43 = Visit 5 [EOT]
>>    65 = Visit 6 [Follow-up]
>>   777 = End of Study
>>   888 = Concomitant Med.
>>   999 = Adverse Events
>>
>> The cat output with sep=" * " (manual CR edit due to line length):
>>
>> Processing SAS dataset ADMIN     .x=  * 43 * 28 * 0 * 14 * 43 * 0 * 28 *
>> 14 * 28 * 43 * 14 * 0 * 28 * 14 * 0 * 43 * 43 * 28 * 14 * 0 * 0 * 43 * 
>> 28 * 14  * 0 * 43 * 28 * 14 * 0 * 14
>> * 0 * 43 * 28 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 
>> * 43 * 14 * 0 * 0 * 43 * 28 * 14 * 0 *
>> 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 14 * 0 * 43 * 28 * 0 * 28 * 43 * 
>> 14 * 14 * 0 * 28 * 43 * 0 * 43 * 0 *
>> 43 * 14 * 0 * 28 * 0 * 43 * 28 * 14 * 43 * 28 * 14 * 0 * 43 * 28 * 14 
>> * 0 * 43 * 28 * 14 * 0 * 0 * 43 * 43 * 28 *
>> 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14
>>  * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28
>>  * 14 * 0 * 28 * 14 * 0 * 43 * 14 * 0 * 43  * 28 * 0 * 14 * 43 * 28 * 
>> 14 * 43 * 28 * 0 * 28 * 14 * 0 * 43 * 0 * 43
>>  * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28  * 14 * 0 * 0 * 28 * 14 * 
>> 0 * 43 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 *
>>  43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43  * 28 * 14 * 0 * 43 * 28 * 
>> 14 * 0 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 28
>>  * 14 * 0 * 14 * 28 * 43 * 0 * 43 * 14
>>   * 28 * 0 * 28 * 14 * 43 * 0 * 0 * 14 * 0 * 28 * 43 * 43 * 28 * 14 * 0
>>   * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0
>>    * 14 * 0 * 43 * 28 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 *
>>    14 * 0 * 43 * 28 * 14 * 0 * 43 * 28    * 14 * 0 * 43 * 28 * 14 * 0 
>> * 43 * 28 * 14 * 0 * 14 * 0 * 43 * 28 * 0
>>    * 43 * 28 * 14 * 28 * 43 * 0 * 14
>>     * 0 * 43 * 28 * 14 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 *
>>     14 * 0 * 43 * 28 * 14 * 0 * 43     * 28 * 14 * 0 * 43 * 28 * 14 * 
>> 0 * 14 * 0 * 28 * 14 * 0 * 14 * 0 *
>>     43 * 28 * 14 * 0 * 43 * 28 * 14 * 0     * 43 * 28 * 14 * 0 * 43 * 
>> 28 * 14 * 0 * 43 * 28 * 14 * 0 * 0 * 43 *
>>     28 * 14 * 0 * 14 * 43 * 28 * 28 * 14     * 0 * 43 * 43 * 28 * 0 * 
>> 14 * 28 * 0 * 14 * 43 * 43 * 28 * 14 * 0 *
>>     43 * 28 * 14 * 0 * 43 * 28 * 14 * 0     * 28 * 14 * 0 * 43 * 43 * 
>> 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 *
>>     14 * 0 * 43 * 28 * 14 * 0 * 43 * 28     * 14 * 0 * 43 * 28 * 14 * 
>> 0 * 0 * 28 * 14 * 0 * 0 * 14 * 0 * 43 * 28
>>     * 0 * 43 * 14 * 28 * 14 * 43 * 28     * 0 * 0 * 43 * 28 * 14 * 43 
>> * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 *
>>     28 * 14 * 0 * 43 * 14 * 0 * 43 * 28     * 14 * 0 * 0 * 43 * 28 * 
>> 14 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 *
>>     43 * 28 * 14 * 0 * 43 * 28 * 14 * 0     * 43 * 28 * 14 * 0 * 43 * 
>> 28 * 14 * 0 * 43 * 28 * 14 * 0 * 0 * 43 *
>>     28 * 14 * 0 * 14 * 28 * 43 * 14 * 28     * 0 * 43 * 43 * 28 * 0 * 
>> 14 * 28 * 0 * 14 * 43 * 43 * 28 * 14 * 0 *
>>     43 * 28 * 14 * 0 * 43 * 28 * 14 * 0     * 28 * 14 * 0 * 43 * 43 * 
>> 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 *
>>     14 * 0 * 43 * 28 * 14 * 0 * 43 * 28     * 14 * 0 * 43 * 28 * 14 * 
>> 0 * 43 * 28 * 14 * 0 * 28 * 14 * 0 * 43 *
>>     14 * 0 * 28 * 43 * 14 * 0 * 43 * 28     * 43 * 28 * 0 * 14 * 43 * 
>> 14 * 0 * 28 * 0 * 43 * 28 * 14 * 43 * 28 *
>>     14 * 0 * 43 * 28 * 14 * 0 * 43 * 28     * 14 * 0 * 0 * 43 * 28 * 
>> 14 * 14 * 0 * 28 * 14 * 0 * 43 * 28 * 14 *
>>     0 * 0 * 43 * 28 * 14 * 0 * 43 * 28     * 14 * 0 * 43 * 28 * 14 * 0 
>> * 43 * 28 * 14 * 0 * 28 * 14 * 0 * 43 *
>>     14 * 0 * 43 * 28 * 14 * 28 * 0 * 28     * 43 * 0 * 14 * 43 * 28 * 
>> 14 * 0 * 14 * 0 * 43 * 28 * 43 * 28 * 14 *
>>     0 * 43 * 28 * 14 * 0 * 0 * 43 * 28     * 14 * 0 * 0 * 43 * 28 * 14 
>> * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 *
>>     43 * 28 * 14 * 0 * 43 * 28 * 14 * 0     * 43 * 28 * 14 * 0 * 43 * 
>> 28 * 14 * 0 * 43 * 28 * 14 * 0 * 0 * 43 *
>>     28 * 14 * 14 * 43 * 28 * 0 * 14 * 0     * 14 * 0 * 43 * 28 * 43 * 
>> 14 * 0 * 28 * 0 * 43 * 28 * 14 * 43 * 28 *
>>     14 * 0 * 43 * 28 * 14 * 0 * 43 * 28     * 14 * 0 * 0 * 28 * 14 * 
>> 28 * 14 * 0 *
>> f$value=  * NA * -10 * 0 * 1 * 6 * 7 * 14 * 21 * 28 * 35 * 43 * 65 * 777 * 888 * 999 *
>> f$label=  * INEXTXT * Visit 1 [Screening] * Visit 2 [Baseline] * CRF Tracking * Visit 6 *
>> Tel.Contact (day 7) * Visit 3 * Tel.Contact (day 21) * Visit 4 * Tel.Contact (day 35) *
>> Visit 5 [EOT] * Visit 6 [Follow-up] * End of Study * Concomitant Med. * Adverse Events *
>> Error in factor(x, f$value, f$label) :   invalid labels; length 15 should be 1 or 14
>>
>> Thanks again, JL
>>
>>
>> On Thu, 09 Oct 2008 17:33:06 +0200, "Peter Dalgaard"
>> <P.Dalgaard at biostat.ku.dk> said:
>>  
>>> Jean-Louis Abitbol wrote:
>>>    
>>>> Dear All,
>>>>
>>>> I get the following error when using either SASxport or Hmisc to read
>>>> an .xpt file:
>>>>
>>>>> w <- read.xport("D:/consult/Trophos/dnp/base/TRO_ds_20081006.xpt")
>>>> Error in factor(x, f$value, f$label) :   invalid labels; length 15 should be 1 or 14
>>>>> z <- sasxport.get("D:/consult/Trophos/dnp/base/TRO_ds_20081006.xpt")
>>>> Error in factor(x, f$value, f$label) :   invalid labels; length 15 should be 1 or 14
>>>>
>>>> I don't understand what is wrong with the labels! Is there a limit on
>>>> their length?
>>>> Could the problem be in the format labels?
>>> Hmmnoo...
>>>
>>> This is happening in R code, and the error is the same as you'd get from
>>>
>>>    
>>>> factor(1,levels=1:4,labels=1:5)
>>>>       
>>> Error in factor(1, levels = 1:4, labels = 1:5) :
>>>   invalid labels; length 5 should be 1 or 4
>>>
>>> So, not going into the actual code, I would suspect that it is
>>> encountering a problem where a user format has values and labels out of
>>> sync. This could well be a bug in the package(s), but I wouldn't rule
>>> out that your data could have gotten into some inconsistent state. You
>>> might try debugging to the trouble spot and see what is actually in
>>> f$value and f$label at that point.
>>>
>>>    
>>>> Just in case this might help, this is the output from
>>>> test <- lookup.xport("D:/consult/Trophos/dnp/base/TRO_ds_20081006.xpt")
>>>> print(test)
>>>>
>>>> for the first SAS dataset:
>>>> SAS xport file
>>>> --------------
>>>> Filename: `D:/consult/Trophos/dnp/base/TRO_ds_20081006.xpt'
>>>>
>>>> Variables in data set `ADMIN':
>>>>  dataset     name      type  format flength fdigits iformat iflength ifdigits                                  label nobs
>>>>    ADMIN      CEN   numeric               5       0                0        0                                 Centre  696
>>>>    ADMIN      PNO   numeric               6       0                0        0                      Pat./Subj. number  696
>>>>    ADMIN    VISIT   numeric  VISITF       0       0                0        0                              Visit no.  696
>>>>    ADMIN   VISITR   numeric               0       0                0        0                           Visit repeat  696
>>>>    ADMIN      PRO character               0       0                0        0                         Project number  696
>>>>    ADMIN    STUDY character               0       0                0        0                           Study number  696
>>>>    ADMIN  COLLDAT   numeric    DATE       7       0                0        0      Date collected (study medication)  696
>>>>    ADMIN   COMM_O character               0       0                0        0                                Comment  696
>>>>    ADMIN  INEXMET   numeric  YESNOF       0       0                0        0      In-/exclusion criteria still met?  696
>>>>    ADMIN LABEL_NO   numeric               4       0                0        0              Medication number (label)  696
>>>>    ADMIN  RAND_NO   numeric               4       0                0        0 Lowest randomisation/medication number  696
>>>>    ADMIN   RETMED   numeric               4       0                0        0            Number of capsules returned  696
>>>>    ADMIN     PAGE   numeric               0       0                0        0                                   Page  696
>>>>    ADMIN    PAGER   numeric               0       0                0        0                            Page repeat  696
>>>>    ADMIN CT_RECID character       $      40       0       $       40        0         for merge with notes and flags  696
>>>>    ADMIN      RNO   numeric               4       0                0        0                   Randomisation number  696
>>>>    ADMIN      SAF   numeric NOYESZF       0       0                0        0                                         696
>>>>    ADMIN      ITT   numeric NOYESZF       0       0                0        0                                         696
>>>>    ADMIN       PP   numeric NOYESZF       0       0                0        0                                         696
>>>>    ADMIN      SEX   numeric    SEXF       0       0                0        0                                    Sex  696
>>>>    ADMIN    AGE_C   numeric               4       0                0        0                               Age calc  696
>>>>    ADMIN      TRT   numeric    TRTF       0       0                0        0                                         696
>>>>    ADMIN CRF_VERS character               0       0                0        0                        CRF Version no.  696
>>>>
>>>> Thanks for any help,
>>>>
>>>> Best wishes, Jean-Louis
>>>>
>>>> PS: sessionInfo()
>>>> R version 2.7.1 RC (2008-06-20 r45965) i386-pc-mingw32
>>>> locale:
>>>> LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252 
>>>>
>>>>
>>>> attached base packages:
>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>
>>>> other attached packages:
>>>> [1] SASxport_1.2.3 Hmisc_3.4-3    foreign_0.8-29 RWinEdt_1.8-0
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] chron_2.3-24    cluster_1.11.11 grid_2.7.1      lattice_0.17-15
>>>>
>>>>
>>>> Jean-Louis Abitbol, MD
>>>> Chief Medical Officer
>>>> Trophos SA, Parc scientifique de Luminy, Case 931
>>>> Luminy Biotech Entreprises
>>>> 13288 Marseille Cedex 9 France
>>>> Email: jlabitbol at trophos.com ---- Backup Email: abitbol at sent.com
>>>> Cellular: (33) (0)6 24 47 59 34
>>>> Direct Line: (33) (0)4 91 82 82 73-Switchboard: (33) (0)4 91 82 82 82
>>>> Fax: (33) (0)4 91 82 82 89
>>>>
>>> -- 
>>>    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>>>   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
>>>  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
>>> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907
>>>
>>>
>>>
>>>     
> 
> 


-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University


