[R] follow-up on Error when reading a SAS transport file (with sasxport.get from Hmisc)

Peter Dalgaard p.dalgaard at biostat.ku.dk
Fri Oct 10 21:50:45 CEST 2008


Jean-Louis Abitbol wrote:
> I have done what P. Dalgaard suggested and I don't find a
> discrepancy between the number of values and the number of labels: there
> are 15 each...
>
> Any hint on what might be going wrong here?
>   

Actually, I think you got it:

 > factor(1,c(NA,1:4),c(1:5))
Error in factor(1, c(NA, 1:4), c(1:5)) :
  invalid labels; length 5 should be 1 or 4

but

 > factor(1,c(NA,1:4),c(1:5),exclude=NULL)
[1] 2
Levels: 1 2 3 4 5

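so the issue is more than likely that your SAS format puts a label on
"." (missing). By default, factor() drops NA from the levels (exclude=NA),
which leaves 14 levels for your 15 labels. You probably need something like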

factor(x, f$value, f$label, exclude=if (!any(is.na(f$value))) NA)
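
As a rough illustration of that idea, the sketch below rebuilds the VISITF
format from the values and labels quoted further down and applies the
conditional exclude. The list f mimics the f$value/f$label pair from the
error message; the helper to_factor(), the second format g, and the short
test vectors (including the added NA) are made up for the example and are
not taken from Hmisc or SASxport.

```r
# Stand-in for the format table seen in the error; values and labels are
# copied from the VISITF listing in this thread.
f <- list(
  value = c(NA, -10, 0, 1, 6, 7, 14, 21, 28, 35, 43, 65, 777, 888, 999),
  label = c("INEXTXT", "Visit 1 [Screening]", "Visit 2 [Baseline]",
            "CRF Tracking", "Visit 6", "Tel.Contact (day 7)", "Visit 3",
            "Tel.Contact (day 21)", "Visit 4", "Tel.Contact (day 35)",
            "Visit 5 [EOT]", "Visit 6 [Follow-up]", "End of Study",
            "Concomitant Med.", "Adverse Events")
)

# A few values as they occur in the posted data, plus one NA (assumed,
# the posted x shows no missing values).
x <- c(43, 28, 0, 14, NA)

# The plain call fails: NA is dropped from the levels, 15 labels remain.
plain <- try(factor(x, f$value, f$label), silent = TRUE)
failed <- inherits(plain, "try-error")

# Hypothetical helper: keep NA as a level only if the format labels it.
# When f$value contains NA the `if` yields NULL, i.e. exclude = NULL.
to_factor <- function(x, f) {
  factor(x, levels = f$value, labels = f$label,
         exclude = if (!any(is.na(f$value))) NA)
}

v <- to_factor(x, f)

# A made-up format without a label for missing: default behaviour is kept.
g <- list(value = c(0, 1), label = c("No", "Yes"))
w <- to_factor(c(0, NA, 1), g)
```

With this, a missing value in x is coded with the label the format assigns
to "." (here INEXTXT), whereas for a format without such an entry the
default exclude=NA still applies and missing values stay NA.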

> Here is the output
>
> The SAS format from proc contents
>
> VISITF                                                                   
>                 . = INEXTXT                                              
>               -10 = Visit 1 [Screening]                                  
>                 0 = Visit 2 [Baseline]                                   
>                 1 = CRF Tracking                                         
>                 6 = Visit 6                                              
>                 7 = Tel.Contact (day 7)                                  
>                14 = Visit 3                                              
>                21 = Tel.Contact (day 21)                                 
>                28 = Visit 4                                              
>                35 = Tel.Contact (day 35)                                 
>                43 = Visit 5 [EOT]                                        
>                65 = Visit 6 [Follow-up]                                  
>               777 = End of Study                                         
>               888 = Concomitant Med.                                     
>               999 = Adverse Events   
>              
> the cat output with  sep=" * " (manual CR edit due to line length)    
>          
> Processing SAS dataset ADMIN     .x=  * 43 * 28 * 0 * 14 * 43 * 0 * 28 *
> 14 * 28 * 43 * 14 * 0 * 28 * 14 * 0 
> * 43 * 43 * 28 * 14 * 0 * 0 * 43 * 28 * 14  * 0 * 43 * 28 * 14 * 0 * 14
> * 0 * 43 * 28 * 43 * 28 * 14 * 0 * 43 
> * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 14 * 0 * 0 * 43 * 28 * 14 * 0 *
> 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 
> * 14 * 0 * 43 * 28 * 0 * 28 * 43 * 14 * 14 * 0 * 28 * 43 * 0 * 43 * 0 *
> 43 * 14 * 0 * 28 * 0 * 43 * 28 * 14 * 43 
> * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 0 * 43 * 43 * 28 *
> 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14
>  * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28
>  * 14 * 0 * 28 * 14 * 0 * 43 * 14 * 0 * 43 
>  * 28 * 0 * 14 * 43 * 28 * 14 * 43 * 28 * 0 * 28 * 14 * 0 * 43 * 0 * 43
>  * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 
>  * 14 * 0 * 0 * 28 * 14 * 0 * 43 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 *
>  43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 
>  * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 28
>  * 14 * 0 * 14 * 28 * 43 * 0 * 43 * 14
>   * 28 * 0 * 28 * 14 * 43 * 0 * 0 * 14 * 0 * 28 * 43 * 43 * 28 * 14 * 0
>   * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0
>    * 14 * 0 * 43 * 28 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 *
>    14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 
>    * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 14 * 0 * 43 * 28 * 0
>    * 43 * 28 * 14 * 28 * 43 * 0 * 14
>     * 0 * 43 * 28 * 14 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 *
>     14 * 0 * 43 * 28 * 14 * 0 * 43 
>     * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 14 * 0 * 28 * 14 * 0 * 14 * 0 *
>     43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 
>     * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 0 * 43 *
>     28 * 14 * 0 * 14 * 43 * 28 * 28 * 14 
>     * 0 * 43 * 43 * 28 * 0 * 14 * 28 * 0 * 14 * 43 * 43 * 28 * 14 * 0 *
>     43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 
>     * 28 * 14 * 0 * 43 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 *
>     14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 
>     * 14 * 0 * 43 * 28 * 14 * 0 * 0 * 28 * 14 * 0 * 0 * 14 * 0 * 43 * 28
>     * 0 * 43 * 14 * 28 * 14 * 43 * 28 
>     * 0 * 0 * 43 * 28 * 14 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 *
>     28 * 14 * 0 * 43 * 14 * 0 * 43 * 28 
>     * 14 * 0 * 0 * 43 * 28 * 14 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 *
>     43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 
>     * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 0 * 43 *
>     28 * 14 * 0 * 14 * 28 * 43 * 14 * 28 
>     * 0 * 43 * 43 * 28 * 0 * 14 * 28 * 0 * 14 * 43 * 43 * 28 * 14 * 0 *
>     43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 
>     * 28 * 14 * 0 * 43 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 *
>     14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 
>     * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 28 * 14 * 0 * 43 *
>     14 * 0 * 28 * 43 * 14 * 0 * 43 * 28 
>     * 43 * 28 * 0 * 14 * 43 * 14 * 0 * 28 * 0 * 43 * 28 * 14 * 43 * 28 *
>     14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 
>     * 14 * 0 * 0 * 43 * 28 * 14 * 14 * 0 * 28 * 14 * 0 * 43 * 28 * 14 *
>     0 * 0 * 43 * 28 * 14 * 0 * 43 * 28 
>     * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 28 * 14 * 0 * 43 *
>     14 * 0 * 43 * 28 * 14 * 28 * 0 * 28 
>     * 43 * 0 * 14 * 43 * 28 * 14 * 0 * 14 * 0 * 43 * 28 * 43 * 28 * 14 *
>     0 * 43 * 28 * 14 * 0 * 0 * 43 * 28 
>     * 14 * 0 * 0 * 43 * 28 * 14 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 *
>     43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 
>     * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 0 * 43 *
>     28 * 14 * 14 * 43 * 28 * 0 * 14 * 0 
>     * 14 * 0 * 43 * 28 * 43 * 14 * 0 * 28 * 0 * 43 * 28 * 14 * 43 * 28 *
>     14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 
>     * 14 * 0 * 0 * 28 * 14 * 28 * 14 * 0 * 
> f$value=  * NA * -10 * 0 * 1 * 6 * 7 * 14 * 21 * 28 * 35 * 43 * 65 * 777
> * 888 * 999 * 
> f$label=  * INEXTXT * Visit 1 [Screening] * Visit 2 [Baseline] * CRF
> Tracking * Visit 6 * Tel.Contact (day 7) * Visit 3 
> * Tel.Contact (day 21) * Visit 4 * Tel.Contact (day 35) * Visit 5 [EOT]
> * Visit 6 [Follow-up] * End of Study * Concomitant Med. * Adverse Events
> * 
> Error in factor(x, f$value, f$label) : 
>   invalid labels; length 15 should be 1 or 14
>
> Thanks again, JL
>
>
> On Thu, 09 Oct 2008 17:33:06 +0200, "Peter Dalgaard"
> <P.Dalgaard at biostat.ku.dk> said:
>   
>> Jean-Louis Abitbol wrote:
>>     
>>> Dear All,
>>>
>>> I get the following error when using either SASxport or Hmisc to read an
>>> .xpt file:
>>>
>>>   
>>>       
>>>> w <- read.xport("D:/consult/Trophos/dnp/base/TRO_ds_20081006.xpt")
>>>>     
>>>>         
>>> Error in factor(x, f$value, f$label) : 
>>>   invalid labels; length 15 should be 1 or 14
>>>   
>>>       
>>>> z<- sasxport.get("D:/consult/Trophos/dnp/base/TRO_ds_20081006.xpt")
>>>>     
>>>>         
>>> Error in factor(x, f$value, f$label) : 
>>>   invalid labels; length 15 should be 1 or 14
>>>
>>> I don't understand what is wrong with the labels! Is there a limit on
>>> their length?
>>> Could the problem be in the format labels?
>>>   
>>>       
>> Hmmnoo...
>>
>> This is happening in R code, and the error is the same as you'd get from
>>
>>     
>>> factor(1,levels=1:4,labels=1:5)
>>>       
>> Error in factor(1, levels = 1:4, labels = 1:5) :
>>   invalid labels; length 5 should be 1 or 4
>>
>> So, not going into the actual code, I would suspect that it is
>> encountering a problem where a user format has values and labels out of
>> sync. This could well be a bug in the package(s), but I wouldn't rule
>> out that your data could have gotten into some inconsistent state. You
>> might try debugging to the trouble spot and see what is actually in
>> f$value and f$label at that point.
>>
>>     
>>> Just in case it helps, this is the output from
>>> test <- lookup.xport("D:/consult/Trophos/dnp/base/TRO_ds_20081006.xpt") 
>>> print(test)
>>>
>>> for the first SAS dataset:
>>> SAS xport file
>>> --------------
>>> Filename: `D:/consult/Trophos/dnp/base/TRO_ds_20081006.xpt'
>>>
>>> Variables in data set `ADMIN':
>>>  dataset     name      type  format flength fdigits iformat iflength ifdigits                                  label nobs
>>>    ADMIN      CEN   numeric               5       0                0        0                                 Centre  696
>>>    ADMIN      PNO   numeric               6       0                0        0                      Pat./Subj. number  696
>>>    ADMIN    VISIT   numeric  VISITF       0       0                0        0                              Visit no.  696
>>>    ADMIN   VISITR   numeric               0       0                0        0                           Visit repeat  696
>>>    ADMIN      PRO character               0       0                0        0                         Project number  696
>>>    ADMIN    STUDY character               0       0                0        0                           Study number  696
>>>    ADMIN  COLLDAT   numeric    DATE       7       0                0        0      Date collected (study medication)  696
>>>    ADMIN   COMM_O character               0       0                0        0                                Comment  696
>>>    ADMIN  INEXMET   numeric  YESNOF       0       0                0        0      In-/exclusion criteria still met?  696
>>>    ADMIN LABEL_NO   numeric               4       0                0        0              Medication number (label)  696
>>>    ADMIN  RAND_NO   numeric               4       0                0        0 Lowest randomisation/medication number  696
>>>    ADMIN   RETMED   numeric               4       0                0        0            Number of capsules returned  696
>>>    ADMIN     PAGE   numeric               0       0                0        0                                   Page  696
>>>    ADMIN    PAGER   numeric               0       0                0        0                            Page repeat  696
>>>    ADMIN CT_RECID character       $      40       0       $       40        0         for merge with notes and flags  696
>>>    ADMIN      RNO   numeric               4       0                0        0                   Randomisation number  696
>>>    ADMIN      SAF   numeric NOYESZF       0       0                0        0                                         696
>>>    ADMIN      ITT   numeric NOYESZF       0       0                0        0                                         696
>>>    ADMIN       PP   numeric NOYESZF       0       0                0        0                                         696
>>>    ADMIN      SEX   numeric    SEXF       0       0                0        0                                    Sex  696
>>>    ADMIN    AGE_C   numeric               4       0                0        0                               Age calc  696
>>>    ADMIN      TRT   numeric    TRTF       0       0                0        0                                         696
>>>    ADMIN CRF_VERS character               0       0                0        0                        CRF Version no.  696
>>>
>>> Thanks for any help,
>>>
>>> Best wishes, Jean-Louis
>>>
>>> PS: sessionInfo()
>>> R version 2.7.1 RC (2008-06-20 r45965) 
>>> i386-pc-mingw32 
>>>
>>> locale:
>>> LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base     
>>>
>>> other attached packages:
>>> [1] SASxport_1.2.3 Hmisc_3.4-3    foreign_0.8-29 RWinEdt_1.8-0 
>>>
>>> loaded via a namespace (and not attached):
>>> [1] chron_2.3-24    cluster_1.11.11 grid_2.7.1      lattice_0.17-15
>>>
>>>
>>> Jean-Louis Abitbol, MD
>>> Chief Medical Officer
>>> Trophos SA, Parc scientifique de Luminy, Case 931
>>> Luminy Biotech Entreprises
>>> 13288 Marseille Cedex 9 France
>>> Email: jlabitbol at trophos.com ---- Backup Email: abitbol at sent.com
>>> Cellular: (33) (0)6 24 47 59 34
>>> Direct Line: (33) (0)4 91 82 82 73-Switchboard: (33) (0)4 91 82 82 82  
>>> Fax: (33) (0)4 91 82 82 89
>>>
>>>   
>>>       
>> -- 
>>    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>>   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
>>  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
>> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907
>>
>>
>>
>>     


-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907


