Type: Package
Title: Reading Portable Encapsulated Projects
Version: 0.5.0
Date: 2023-11-16
Maintainer: Nathan Sheffield <nathan@code.databio.org>
Description: A PEP, or Portable Encapsulated Project, is a dataset that subscribes to the PEP structure for organizing metadata. It is written using a simple YAML + CSV format and is intended as a one-stop solution to metadata management across data analysis environments. This package reads this standardized project configuration structure into R. Described in Sheffield et al. (2021) <doi:10.1093/gigascience/giab077>.
Imports: yaml, stringr, pryr, data.table, methods, RCurl
Suggests: knitr, testthat, rmarkdown, curl
VignetteBuilder: knitr
License: BSD_2_clause + file LICENSE
BugReports: https://github.com/pepkit/pepr
RoxygenNote: 7.2.3
Encoding: UTF-8
NeedsCompilation: no
Packaged: 2023-11-21 12:55:04 UTC; nsheff
Author: Nathan Sheffield [aut, cph, cre], Michal Stolarczyk [aut]
Repository: CRAN
Date/Publication: 2023-11-21 14:10:06 UTC

Append constant attributes across all the samples

Description

Append constant attributes across all the samples

Usage

.appendAttrs(.Object)

Arguments

.Object

an object of Project-class

Value

an object of Project-class
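
As an illustration of appending constant attributes, the following is a minimal sketch on a plain data.frame. The helper append_attrs_sketch, the table and the attribute names are hypothetical; this is not the package's implementation, which operates on a Project object.

```r
# Hypothetical helper: add every constant as a new column, recycled
# across all rows (samples).
append_attrs_sketch <- function(samples, constants) {
  for (name in names(constants)) {
    samples[[name]] <- constants[[name]]
  }
  samples
}

# Illustrative sample table and constants
samples <- data.frame(
  sample_name = c("frog_1", "frog_2"),
  stringsAsFactors = FALSE
)
appended <- append_attrs_sketch(
  samples,
  list(genome = "hg38", read_type = "SINGLE")
)
```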


Apply amendments

Description

Overwrite and/or add Project attributes from the amendments section

Usage

.applyAmendments(cfg, amendments = NULL)

Arguments

cfg

config

amendments

list of amendments to apply

Value

possibly updated config
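
The overwrite-and/or-add behaviour can be sketched with a plain nested list and utils::modifyList. The config layout, the helper apply_amendments_sketch and all values below are assumptions made for illustration; the package's own logic may differ.

```r
# Hypothetical config with two amendments; all values are illustrative.
cfg <- list(
  sample_table = "samples.csv",
  output_dir = "results",
  amendments = list(
    newLib = list(sample_table = "samples_newlib.csv"),
    withGenome = list(genome = "hg38")
  )
)

# Apply the named amendments in order: each one overwrites existing
# top-level entries and adds entries that were not present before.
apply_amendments_sketch <- function(cfg, amendments = NULL) {
  if (is.null(amendments)) {
    return(cfg)
  }
  for (name in amendments) {
    if (is.null(cfg$amendments[[name]])) {
      stop("Amendment not found in config: ", name)
    }
    cfg <- utils::modifyList(cfg, cfg$amendments[[name]])
  }
  cfg
}

updated <- apply_amendments_sketch(cfg, c("newLib", "withGenome"))
```

With no amendments requested, the sketch returns the config unchanged, which mirrors the "possibly updated" wording of the Value section.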


Function for recursive config data imports

Description

Function for recursive config data imports

Usage

.applyImports(cfg_data, filename)

Arguments

cfg_data

config data, possibly including imports statement

filename

path to the file to get the imports for

Value

config data enriched in imported sections, if imports existed in the input


Merge samples with identical names

Description

If sample table specifies samples with non-unique names, try to merge these samples

Usage

.autoMergeDuplicatedNames(.Object)

Arguments

.Object

an object of "Project"

Value

an object of "Project"


Check for a section existence in a nested list

Description

Check for a section existence in a nested list

Usage

.checkSection(object, sectionNames)

Arguments

object

list to inspect

sectionNames

vector of characters with section names to check for

Value

logical indicating whether the sections were found in the list

Examples

l = list(a=list(b="test"))
.checkSection(l,c("a","b"))
.checkSection(l,c("c","b"))
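
For readers who want to see the idea without calling the internal function, here is a minimal base R sketch of a nested-section lookup. The helper check_section_sketch is hypothetical and handles names only; it is not the package's implementation.

```r
# Hypothetical helper: walk down the nested list one section name at a
# time and stop as soon as a level is missing.
check_section_sketch <- function(object, sectionNames) {
  current <- object
  for (name in sectionNames) {
    if (!is.list(current) || !(name %in% names(current))) {
      return(FALSE)
    }
    current <- current[[name]]
  }
  TRUE
}

l <- list(a = list(b = "test"))
found <- check_section_sketch(l, c("a", "b"))
missing_section <- check_section_sketch(l, c("c", "b"))
```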

Derive attributes

Description

Derive attributes

Usage

.deriveAttrs(.Object)

Arguments

.Object

an object of "Project"

Value

an object of "Project"


Duplicate a selected attribute across all the samples

Description

Duplicate a selected attribute across all the samples

Usage

.duplicateAttrs(.Object)

Arguments

.Object

an object of "Project"

Value

an object of "Project"


Recursively try to expand list of strings

Description

Recursively try to expand list of strings

Usage

.expandList(x)

Arguments

x

list, possibly of strings that are paths to expand

Value

list of strings with paths expanded

Examples

x = list(a=list(b=list(c="~/test.txt")))
.expandList(x)
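
The recursion can be sketched in base R as follows. The helper expand_list_sketch is hypothetical and uses path.expand as a stand-in for the package's own path expansion, so it only resolves a leading tilde.

```r
# Hypothetical helper: recurse into lists, expand character leaves,
# and leave every other value untouched.
expand_list_sketch <- function(x) {
  if (is.list(x)) {
    return(lapply(x, expand_list_sketch))
  }
  if (is.character(x)) {
    return(path.expand(x))
  }
  x
}

x <- list(a = list(b = list(c = "~/test.txt")), n = 1)
expanded <- expand_list_sketch(x)
```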

Expand system path

Description

This function expands system paths (non-absolute paths become absolute) and replaces environment variables (e.g., ${HOME}) with their values.

Usage

.expandPath(path)

Arguments

path

file path to expand. Potentially any string

Details

Most importantly, strings that are not system paths are returned untouched.

Value

Expanded path or untouched string

Examples


string = "https://www.r-project.org/"
.expandPath(string)
path = "$HOME/my/path/string.txt"
.expandPath(path)
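
The environment-variable substitution described above can be approximated in base R. This is a sketch under stated assumptions: the helper expand_env_sketch and the variable PEPR_SKETCH_DIR are hypothetical, and the sketch only substitutes variables and expands a leading tilde; it does not make relative paths absolute.

```r
# Hypothetical helper: replace ${VAR} and $VAR tokens with the values
# of set environment variables, then expand a leading tilde.
expand_env_sketch <- function(path) {
  pattern <- "\\$\\{?[A-Za-z_][A-Za-z0-9_]*\\}?"
  tokens <- regmatches(path, gregexpr(pattern, path, perl = TRUE))[[1]]
  for (token in unique(tokens)) {
    # Strip "$", "{" and "}" to obtain the bare variable name
    name <- gsub("[^A-Za-z0-9_]", "", token)
    value <- Sys.getenv(name, unset = NA)
    if (!is.na(value)) {
      path <- gsub(token, value, path, fixed = TRUE)
    }
  }
  path.expand(path)
}

# Illustrative variable, set only for this example
Sys.setenv(PEPR_SKETCH_DIR = "/data")
braced <- expand_env_sketch("${PEPR_SKETCH_DIR}/a.txt")
bare <- expand_env_sketch("$PEPR_SKETCH_DIR/a.txt")
url <- expand_env_sketch("https://www.r-project.org/")
```

Unset variables are left in place, so a string without recognisable variables, such as the URL, comes back unchanged.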

Get the sample table path from config

Description

Get the path to the sample table from config

Usage

.getSampleTablePathFromConfig(config)

Arguments

config

an object of "Config"

Value

a string which specifies a path to the sample table file


Get the subsample table paths from config

Description

Get the path(s) to the subsample table(s) from config

Usage

.getSubSampleTablePathFromConfig(config)

Arguments

config

an object of "Config"

Value

string/vector of strings/NULL depending on the configuration


Get list subscript

Description

Based on available list element names and subscript value determine index of the element requested

Usage

.getSubscript(lst, i)

Arguments

lst

list to search subscript for

i

character or numeric to determine final list index

Value

numeric index of the requested element in the list

Examples

l = list(a="a", b="b")
.getSubscript(l, 1) == .getSubscript(l, "a")
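
A minimal sketch of resolving a name or a position to a numeric index is shown below. The helper get_subscript_sketch is hypothetical, and its error handling is an assumption rather than the package's behaviour.

```r
# Hypothetical helper: map a character name or a numeric position to
# the numeric index of the list element.
get_subscript_sketch <- function(lst, i) {
  if (is.character(i)) {
    idx <- match(i, names(lst))
    if (is.na(idx)) {
      stop("No element named: ", i)
    }
    return(idx)
  }
  if (is.numeric(i) && i >= 1 && i <= length(lst)) {
    return(as.integer(i))
  }
  stop("Subscript out of bounds: ", i)
}

l <- list(a = "a", b = "b")
same <- get_subscript_sketch(l, 1) == get_subscript_sketch(l, "a")
```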

Set table indexes

Description

Get sample and subsample table indexes and save as a slot on the Project object

Usage

.getTableIndexes(.Object, stIndex, sstIndex)

Arguments

.Object

an object of "Project"

stIndex

character string indicating a constructor-specified sample table index

sstIndex

character string indicating a constructor-specified subsample table index

Details

This is the (sub)sample table index selection priority order:

  1. Project constructor specified

  2. Config specified

  3. Default value
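
The priority order above amounts to taking the first candidate that is not NULL. The sketch below illustrates this; the helper pick_index_sketch, the config value "id" and the default "sample_name" are assumptions made for the example.

```r
# Hypothetical helper: return the first non-NULL candidate in priority
# order (constructor, then config, then default).
pick_index_sketch <- function(constructorValue, configValue, defaultValue) {
  if (!is.null(constructorValue)) {
    return(constructorValue)
  }
  if (!is.null(configValue)) {
    return(configValue)
  }
  defaultValue
}

# Nothing passed to the constructor, so the config value is used
chosen <- pick_index_sketch(NULL, "id", "sample_name")
```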


Imply attributes

Description

Imply attributes

Usage

.implyAttrs(.Object)

Arguments

.Object

an object of "Project"

Value

an object of "Project"


Infer project name

Description

Based on dedicated config section or PEP enclosing dir

Usage

.inferProjectName(cfg, filename)

Arguments

cfg

config data

filename

path to the config file

Value

string project name


Determine whether a path is absolute.

Description

Determine whether a path is absolute.

Usage

.isAbsolute(path)

Arguments

path

The path to check for seeming absolute-ness.

Value

Flag indicating whether the path appears to be absolute.


Config file or annotation file

Description

Determine if the input file seems to be a project config file (based on the file extension).

Usage

.isCfg(filePath)

Arguments

filePath

a string to examine

Value

a boolean: TRUE if the path seems to point to a config file, FALSE if it seems to point to an annotation file.
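
An extension-based check of this kind might look like the sketch below. The helper is_cfg_sketch is hypothetical, and treating "yaml" and "yml" as the config extensions is an assumption based on the YAML + CSV format mentioned in the package description.

```r
# Hypothetical helper: treat YAML extensions as config files and
# everything else as an annotation file.
is_cfg_sketch <- function(filePath) {
  ext <- tolower(tools::file_ext(filePath))
  ext %in% c("yaml", "yml")
}

looksLikeConfig <- is_cfg_sketch("project_config.yaml")
looksLikeAnnotation <- !is_cfg_sketch("sample_table.csv")
```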


Determine whether the string is a valid URL

Description

Determine whether the string is a valid URL

Usage

.isValidUrl(str)

Arguments

str

string to inspect

Value

logical indicating whether a string is a valid URL
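
A purely syntactic URL check can be sketched with a regular expression. The helper is_valid_url_sketch and the accepted schemes are assumptions; the sketch does not test whether the URL is reachable and may differ from what the package does.

```r
# Hypothetical helper: accept a single string that starts with a known
# scheme and contains no whitespace.
is_valid_url_sketch <- function(str) {
  is.character(str) && length(str) == 1L &&
    grepl("^(https?|ftp)://[^[:space:]]+$", str)
}

urlOk <- is_valid_url_sketch("https://www.r-project.org/")
pathOk <- is_valid_url_sketch("$HOME/my/path/string.txt")
```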


Listify data frame columns

Description

This function turns each data frame column into a list, so that its cells can contain multiple elements

Usage

.listifyDF(DF)

Arguments

DF

an object of class data.frame

Value

an object of class data.frame

Examples

dataFrame=mtcars
listifiedDataFrame=.listifyDF(dataFrame)
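
To show why list columns are useful, here is a small base R sketch. The helper listify_df_sketch and the table contents are hypothetical; the point is that once a column is a list, a single cell can hold several values.

```r
# Hypothetical helper: convert every column to a list column.
listify_df_sketch <- function(DF) {
  for (col in names(DF)) {
    DF[[col]] <- as.list(DF[[col]])
  }
  DF
}

df <- data.frame(
  sample_name = c("frog_1", "frog_2"),
  stringsAsFactors = FALSE
)
listified <- listify_df_sketch(df)

# A list column can store two files for the first sample and one for
# the second
listified$file <- list(c("a.fq", "b.fq"), "c.fq")
```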

Load the config of a PEP

Description

Loads a PEP config file

Usage

.loadConfig(filename = NULL, amendments = NULL)

Arguments

filename

file path to config file

amendments

amendments to activate

See Also

https://pep.databio.org/


Read sample table from disk

Description

Read sample table from disk

Usage

.loadSampleAnnotation(sampleTablePath)

Arguments

sampleTablePath

a character string indicating a path to the sample table

Value

a data.frame with samples; one sample per row


Load single subsample annotation

Description

Load single subsample annotation

Usage

.loadSubsampleAnnotation(.Object, path)

Arguments

.Object

an object of "Project"

path

string, a path to the subsample table to read and incorporate

Value

an object of "Project"


Create an absolute path from a primary target and a parent candidate.

Description

Create an absolute path from a primary target and a parent candidate.

Usage

.makeAbsPath(perhapsRelative, parent)

Arguments

perhapsRelative

Path to primary target directory.

parent

a path to parent folder to use if target isn't absolute.

Value

Target itself if already absolute, else target nested within parent.
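
The rule in the Value section can be sketched as follows. The helper make_abs_path_sketch is hypothetical, and its absoluteness test (a leading "/" or "~") is a simplifying assumption that ignores Windows drive letters.

```r
# Hypothetical helper: keep absolute targets as they are, otherwise
# nest the target within the parent folder.
make_abs_path_sketch <- function(perhapsRelative, parent) {
  isAbsolute <- grepl("^(/|~)", perhapsRelative)
  if (isAbsolute) {
    return(perhapsRelative)
  }
  file.path(parent, perhapsRelative)
}

kept <- make_abs_path_sketch("/data/samples.csv", "/project")
nested <- make_abs_path_sketch("samples.csv", "/project")
```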


Create a list of matched files in the system and unmatched regular expressions

Description

Create a list of matched files in the system and unmatched regular expressions

Usage

.matchesAndRegexes(rgx)

Arguments

rgx

string to expand in the system

Value

a list of all the elements after possible expansion


Merge samples defined in sample table with ones in subsample table(s)

Description

Merge samples defined in sample table with ones in subsample table(s)

Usage

.mergeAttrs(.Object, subsampleAannotationPaths)

Arguments

.Object

an object of "Project"

subsampleAannotationPaths

a vector of strings specifying the paths to the subsample annotation tables

Value

an object of "Project"


Perform all the sample attribute modifications

Description

Perform all the sample attribute modifications

Usage

.modifySamples(object)

Arguments

object

an object of "Project"

Value

modified Project object


Print a nested list

Description

Prints a nested list in a way that looks nice

Usage

.printNestedList(lst, level = 0)

Arguments

lst

list object to print

level

the indentation level

Details

Useful for displaying the config of a PEP

Value

No return value, called for side effects

Examples

projectConfig = system.file("extdata",
"example_peps-master",
"example_basic",
"project_config.yaml",
package = "pepr")
p = Project(file = projectConfig)
.printNestedList(config(p),level=2)

Check config spec version and reformat if needed

Description

Check config spec version and reformat if needed

Usage

.reformat(object)

Arguments

object

an object of "Config"

Value

an object of "Config"


Remove attributes across all the samples

Description

Remove attributes across all the samples

Usage

.removeAttrs(.Object)

Arguments

.Object

an object of "Project"

Value

an object of "Project"


Check whether the string is a valid URL or an existing local path

Description

Check whether the string is a valid URL or an existing local path

Usage

.safeFileExists(path)

Arguments

path

string to be checked

Value

a logical indicating whether it's an existing path or valid URL


Format a string like Python's format method

Description

Given a string with environment variables (encoded like ${VAR} or $VAR), and other variables (encoded like {VAR}) this function will substitute both of these and return the formatted string, like the Python str.format() method. Other variables are populated from a list of arguments. Additionally, if the string is a non-absolute path, it will be expanded.

Usage

.strformat(string, args, parent = NULL)

Arguments

string

String with variables encoded

args

named list of arguments to use to populate the string

parent

a directory that will be used to make the path absolute

Value

Formatted string

Examples

.strformat("~/{VAR1}{VAR2}_file", list(VAR1="hi", VAR2="hello"))
.strformat("$HOME/{VAR1}{VAR2}_file", list(VAR1="hi", VAR2="hello"))
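
The {VAR} substitution step on its own can be sketched in base R. The helper strformat_sketch is hypothetical and deliberately leaves out the environment-variable and path-expansion steps described above.

```r
# Hypothetical helper: replace each {NAME} placeholder with the value
# of the matching element of a named list.
strformat_sketch <- function(string, args) {
  for (name in names(args)) {
    placeholder <- paste0("{", name, "}")
    string <- gsub(placeholder, args[[name]], string, fixed = TRUE)
  }
  string
}

formatted <- strformat_sketch(
  "{VAR1}{VAR2}_file",
  list(VAR1 = "hi", VAR2 = "hello")
)
```

Using fixed = TRUE keeps the braces from being interpreted as regular expression syntax.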

Config objects are specialized list objects that expand string attributes

Description

Config objects are used with the "Project" object

Usage

Config(file, amendments = NULL)

Arguments

file

a character with project configuration yaml file

amendments

a character with the amendments names to be activated

Value

an object of "Config" class

Examples

projectConfig = system.file("extdata", "example_peps-master",
"example_amendments1", "project_config.yaml", package="pepr")
c=Config(projectConfig)

The constructor of a class representing a Portable Encapsulated Project

Description

This is a helper that creates the project with empty samples and config slots

Usage

Project(
  file = NULL,
  amendments = NULL,
  sampleTableIndex = NULL,
  subSampleTableIndex = NULL
)

Arguments

file

a string specifying a path to a project configuration YAML file

amendments

a string with the amendments names to be activated

sampleTableIndex

a string indicating the sample attribute that is used to index the sample table

subSampleTableIndex

a string indicating the sample attribute that is used to index the subsample table

Value

an object of "Project"

Examples

projectConfig = system.file("extdata", "example_peps-master",
"example_amendments1", "project_config.yaml", package="pepr")
p=Project(projectConfig)

Portable Encapsulated Project object

Description

Provides an in-memory representation and functions to access project configuration and sample annotation values for a PEP.

Details

Can be created with the constructor: "Project"

Slots

file

character vector path to config file on disk.

samples

a data table object holding the sample metadata

config

a list object holding contents of the config file

sampleNameAttr

a string indicating the sample attribute that is used to index the sample table

subSampleNameAttr

a string indicating the sample attribute that is used to index the subsample table


Activate amendments in objects of "Project"

Description

This method switches between the amendments within the "Project" object

Usage

activateAmendments(.Object, amendments)

## S4 method for signature 'Project,character'
activateAmendments(.Object, amendments)

Arguments

.Object

an object of class "Project"

amendments

character with the amendment name

Details

To check the available amendment names, call listAmendments(p), where p is an object of "Project" class.

Value

an object of class "Project" with activated amendments

Methods (by class)

Examples

projectConfig = system.file("extdata",
"example_peps-master",
"example_amendments1",
"project_config.yaml",
package = "pepr")
p = Project(file = projectConfig)
availAmendments = listAmendments(p)
activateAmendments(p, availAmendments[1])

Check for existence of a section in the Project config

Description

This function checks for the section/nested sections in the config YAML file. Returns TRUE if it exist(s) or FALSE otherwise.

Usage

checkSection(object, sectionNames)

## S4 method for signature 'Config'
checkSection(object, sectionNames)

Arguments

object

object of "Config"

sectionNames

the name of the section or names of the nested sections to look for

Details

Element indices can be used instead of the actual names, see Examples.

Value

a logical indicating whether the section exists

Methods (by class)

Examples

projectConfig = system.file("extdata", "example_peps-master",
"example_amendments1", "project_config.yaml", package="pepr")
p=Project(projectConfig)
checkSection(config(p),sectionNames = c("amendments","newLib"))
checkSection(config(p),sectionNames = c("amendments",1))

Extract the config of a "Project"

Description

This method can be used to view the config slot of the "Project" class

Usage

config(object)

## S4 method for signature 'Project'
config(object)

Arguments

object

an object of "Project"

Value

project config

Methods (by class)

Examples

projectConfig = system.file("extdata", "example_peps-master",
"example_amendments1", "project_config.yaml", package="pepr")
p=Project(projectConfig)
config(p)


Collect samples fulfilling the specified requirements

Description

This function collects the samples from a data.table object that fulfill the requirements on an attribute attr, as specified with the func argument.

Usage

fetchSamples(samples, attr = NULL, func = NULL, action = "include")

Arguments

samples

an object of data.table class

attr

a string specifying a column in the samples

func

an anonymous function, see Details for more information

action

a string (either include or exclude) that specifies whether the function should select the row or exclude it.

Details

The anonymous function provided in the func argument has to return an integer vector that indicates the rows that the action should be performed on. Expressions that are useful for implementing it include which() combined with a comparison such as ==, and grep(), as used in the Examples.

Value

an object of data.table class filtered according to the specified requirements

Examples

projectConfig = system.file("extdata", "example_peps-master",
"example_amendments1", "project_config.yaml", package="pepr")
p = Project(projectConfig)
s = sampleTable(p)
fetchSamples(s,attr = "sample_name", func=function(x){ which(x=="pig_0h") },action="include")
fetchSamples(s,attr = "sample_name", func=function(x){ which(x=="pig_0h") },action="exclude")
fetchSamples(s,attr = "sample_name", func=function(x){ grep("pig_",x) },action="include")
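
The include/exclude logic can be illustrated without the package on a plain data.frame. The helper fetch_samples_sketch and the table below are hypothetical stand-ins for fetchSamples and a real sample table; this is a sketch of the idea, not the package's code.

```r
# Hypothetical helper: func returns row indices; the action decides
# whether those rows are kept or dropped.
fetch_samples_sketch <- function(samples, attr, func, action = "include") {
  rows <- func(samples[[attr]])
  if (action == "include") {
    return(samples[rows, , drop = FALSE])
  }
  if (action == "exclude") {
    keep <- setdiff(seq_len(nrow(samples)), rows)
    return(samples[keep, , drop = FALSE])
  }
  stop("action must be 'include' or 'exclude'")
}

# Illustrative sample table
s <- data.frame(
  sample_name = c("pig_0h", "pig_1h", "frog_0h"),
  protocol = c("RRBS", "RRBS", "ATAC"),
  stringsAsFactors = FALSE
)

included <- fetch_samples_sketch(
  s, "sample_name", function(x) which(x == "pig_0h"), "include"
)
excluded <- fetch_samples_sketch(
  s, "sample_name", function(x) which(x == "pig_0h"), "exclude"
)
pigs <- fetch_samples_sketch(
  s, "sample_name", function(x) grep("pig_", x), "include"
)
```

The exclusion uses setdiff rather than negative indexing because indexing with a negated empty vector would drop every row when func matches nothing.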

Extract samples

Description

This method extracts the samples

Usage

getSample(.Object, sampleName)

## S4 method for signature 'Project,character'
getSample(.Object, sampleName)

Arguments

.Object

An object of Project class

sampleName

character the name of the sample

Value

data.table one row data table with the sample associated metadata

Methods (by class)

Examples

projectConfig = system.file(
"extdata",
"example_peps-master",
"example_basic",
"project_config.yaml",
package = "pepr"
)
p = Project(projectConfig)
sampleName = "frog_1"
getSample(p, sampleName)

Extract subsamples

Description

This method extracts the subsamples

Usage

getSubsample(.Object, sampleName, subsampleName)

## S4 method for signature 'Project,character,character'
getSubsample(.Object, sampleName, subsampleName)

Arguments

.Object

An object of Project class

sampleName

character the name of the sample

subsampleName

character the name of the subsample

Value

data.table one row data table with the subsample associated metadata

Methods (by class)

Examples

projectConfig = system.file(
"extdata",
"example_peps-master",
"example_subtable1",
"project_config.yaml",
package = "pepr"
)
p = Project(projectConfig)
sampleName = "frog_1"
subsampleName = "sub_a"
getSubsample(p, sampleName, subsampleName)

List amendments

Description

Lists available amendments within a "Project" object.

Usage

listAmendments(.Object)

## S4 method for signature 'Project'
listAmendments(.Object)

Arguments

.Object

an object of "Project"

Details

The amendments can be activated by passing their names to the activateAmendments method

Value

names of the available amendments

Methods (by class)

Examples

projectConfig = system.file("extdata",
"example_peps-master",
"example_amendments1",
"project_config.yaml",
package = "pepr")
p = Project(file = projectConfig)
availAmendments = listAmendments(p)

Make selected sections absolute using config path

Description

Make selected sections absolute using config path

Usage

makeSectionsAbsolute(object, sections, cfgPath)

## S4 method for signature 'Config,character,character'
makeSectionsAbsolute(object, sections, cfgPath)

Arguments

object

"Config"

sections

character set of sections to make absolute

cfgPath

character absolute path to the config YAML file

Value

Config with selected sections made absolute

Methods (by class)


pepr

Description

Package documentation

Author(s)

Michal Stolarczyk, Nathan Sheffield

References

GitHub: https://github.com/pepkit/pepr, Documentation: https://code.databio.org/pepr/


View samples in the objects of "Project"

Description

This method can be used to view the samples slot of the "Project" class

Usage

sampleTable(object)

## S4 method for signature 'Project'
sampleTable(object)

Arguments

object

an object of "Project"

Value

a data.table with metadata about samples

Methods (by class)

Examples

projectConfig = system.file("extdata", "example_peps-master",
"example_amendments1", "project_config.yaml", package="pepr")
p=Project(projectConfig)
sampleTable(p)


Access "Config" object elements

Description

You can subset Config by identifier or by position using the `[`, `[[` or `$` operator. The string will be expanded if it's a path.

Usage

## S4 method for signature 'Config'
x[i]

## S4 method for signature 'Config'
x[[i]]

## S4 method for signature 'Config'
x$name

Arguments

x

a "Config" object.

i

position of the identifier or the name of the identifier itself.

name

name of the element to access.

Value

An element held in "Config" object

Examples

projectConfig = system.file("extdata", "example_peps-master",
"example_amendments1", "project_config.yaml", package="pepr")
c=Config(projectConfig)
c[[2]]
c[2]
c[["sample_table"]]
c$sample_table