Backwards Incompatible Changes:
locus
:
makeChangeoClone
General:
ExampleTrees
to use the igraph
1.5.0 format. See https://r.igraph.org/news/index.html#igraph-150 for
details.
collapseDuplicates.
Diversity:
plotDiversityCurve
and
plotAbundanceCurve
where limits were not being applied
correctly to zoom in the plots.
Gene:
groupGenes
where TCR chains were not
being considered when detecting heavy chain sequences prior to
subsetting.
General:
ape::read.fastq
.General:
junctionAlignment
, which counts the number of
nucleotides in the reference germline not present in the alignment, and
the number of V and J nucleotides in the CDR3.
Gene Usage:
getFamily
where temporary designation
gene names were not being correctly subset to the cluster (family)
level.
Lineage:
runPhylip
which was causing
buildPhylipLineage
to fail when run on Windows.
General:
readFastqDb
, which reads a repertoire’s .fastq
file and imports the sequencing quality scores for
sequence_alignment
. Added
maskPositionsByQuality
, which masks positions that have a
sequencing quality score lower than the specified threshold. The
convenience function getPositionQuality
will create a
data.frame
with quality scores per position.dplyr
dependency to v1.0.padSeqEnds
, the argument mod3=TRUE
has
been added so that sequences are padded to a length that is a multiple
of 3.translateDNA
where NA
values weren’t being translated properly.
Amino Acid Analysis:
aminoAcidProperties
, which will now default to
nt=TRUE
.
Diversity:
Added a parameter to countClones
(remove_na
) that will remove all rows with NA values in the
clone column if TRUE
(default) and issue a warning with how
many were removed. If FALSE
, those rows will be kept
instead.
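The `remove_na` behavior described above can be illustrated with a minimal base R sketch. The helper `drop_na_clones` and the toy data frame are hypothetical and exist only for illustration; this is not the package's implementation, just the filter-and-warn pattern the entry describes.

```r
# Hypothetical helper illustrating the remove_na pattern; not alakazam code.
drop_na_clones <- function(db, clone = "clone_id", remove_na = TRUE) {
  # With remove_na = FALSE the rows with NA clone identifiers are kept.
  if (!remove_na) {
    return(db)
  }
  is_na <- is.na(db[[clone]])
  # Report how many rows are dropped, as the changelog entry describes.
  if (any(is_na)) {
    warning(sum(is_na), " row(s) with NA in column '", clone, "' were removed.")
  }
  db[!is_na, , drop = FALSE]
}

# Toy repertoire table (assumed column names follow the AIRR convention).
db <- data.frame(
  sequence_id = c("s1", "s2", "s3", "s4"),
  clone_id = c("1", NA, "1", "2"),
  stringsAsFactors = FALSE
)

kept <- drop_na_clones(db)                          # warns, drops the NA row
unfiltered <- drop_na_clones(db, remove_na = FALSE) # keeps every row
```

Scripts that depend on NA rows being counted would need to pass `remove_na=FALSE` explicitly, since the default drops them.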
Gene Usage:
getLocus
to extract the locus
information from the segment call.getChain
to define the chain from
the segment or locus call.countGenes
to
give a warning instead of an error so as not to disrupt running
workflows.getSegment
where filtering of
non-localized genes was not being applied when called from
getFamily
, because the “NL” part of the name was removed
before the filtering step.getAllele
,
getGene
, getFamily
and getLocus
,
to parse constant region gene names correctly.getSegment
to be able to
parse constant region gene names correctly and not remove the “D” from
“IGHD” when strip_d=TRUE.
Lineage:
branch_length
argument to
buildPhylipLineage
, and augmented graphToPhylo
and phyloToGraph
to track intermediate sequence in nodes
for phylo object.countGenes
(remove_na
) that will remove all rows with NA values in the
gene column if TRUE
(default) and issue a warning with how
many were removed. If FALSE
, those rows will be kept
instead.
Diversity:
plotDiversityTest
that caused all values
of q
to appear on the plot rather than just the specified
one.
Gene Usage:
groupGenes
where the v_call column was used instead of the
j_call
column for J
gene grouping.groupGenes
.only_igh
argument of
groupGenes
to only_heavy.
Backwards Incompatible Changes:
V_CALL
(Change-O) as the default to
identify the field that stored the V gene calls, they now use
v_call
(AIRR). This means that scripts that relied on the default
values (previously, v_call="V_CALL"
) will now fail if
calls to the functions are not updated to reflect the correct value for
the data. If data are in the Change-O format, the current default value
v_call="v_call"
will fail to identify the column with the V
gene calls as the column v_call
doesn’t exist. In this
case, v_call="V_CALL"
needs to be specified in the function
call.ExampleDb
converted to the AIRR Rearrangement standard
and examples updated accordingly. The legacy Change-O version is
available as ExampleDbChangeo
.GRAVY
to
gravy
);countGenes
, countClones
(e.g.,
SEQ_COUNT
to seq_count
)estimateAbundance
(e.g., RANK
to
rank
)groupGenes
(e.g., VJ_GROUP
to
vj_group
)collapseDuplicates
and makeChangeoClone
(e.g., SEQUENCE_ID
to sequence_id
,
COLLAPSE_COUNT
to collapse_count
)summarizeTrees
,
getPathLengths
, getMRCA
,
tableEdges
, testEdges
) also return columns in
lower case (e.g., parent
, child
,
outdegree
, steps
, annotation
,
pvalue
)IG_COLOR
names converted to official C region
identifiers (IGHA, IGHD, IGHE, IGHG, IGHM, IGHK, IGHL).
General:
baseTheme
look is now consistent across
sizing
options.cpuCount
will now return 1
if the core
count cannot be determined.padSeqEnds
wherein the
pad_char
argument was being ignored.
Diversity:
estimateAbundance
slot clone_by
now
contains the name of the column with the clonal group identifier, as
specified in the function call. For example, if the function was called
with clone="clone_id"
, then the clone_by
slot
will be clone_id.
Lineage:
buildPhylipLineage
arguments
vcall
, jcall
and dnapars_exec
to
v_call
, j_call
and phylip_exec
,
respectively.
Deprecated:
rarefyDiversity
is deprecated in favor of
alphaDiversity
, which includes the same functionality.
testDiversity
is deprecated. The test calculations have
been added to the normal output of alphaDiversity
.General:
ape
and tibble
dependencies.Lineage:
readIgphyml
to read in IgPhyML output and
combineIgphyml
to combine parameter estimates across
samples.graphToPhylo
and phyloToGraph
to
allow conversion between graph and phylo formats.
Diversity:
estimateAbundance
where setting the
clone
column to a non-default value produced an error.estimateAbundance
through
the min_n
, max_n
, and uniform
arguments.estimateAbundance
. alphaDiversity
will call
estimateAbundance
for bootstrapping if not provided an
existing AbundanceCurve
object.DiversityCurve
and
AbundanceCurve
objects to accommodate the new diversity
methods.
Gene Usage:
groupGenes
now supports grouping by V gene, J gene, and
junction length (junc_len
), in addition to grouping
by V gene and J gene without junction length. Also added support for
single-cell input data with the addition of new arguments
cell_id
, locus
, and
only_igh.
General:
nonsquareDist
function to calculate the
non-square distance matrix of sequences.progressBar
, baseTheme
,
checkColumns
and cpuCount
.Diversity:
estimateAbundance
, and plotAbundanceCurve
,
will now allow group=NULL
to be specified to perform
abundance calculations on ungrouped data.
Gene Usage:
fill
argument to countGenes
. When
set to TRUE
this adds zeroes to the group
pairs
that do not exist in the data.groupGenes
to group sequences
sharing the same V and J gene.
Topology Analysis:
indirect=TRUE
.makeChangeoClone
will now issue an error and terminate,
instead of continuing with a warning, when all sequences are not the
same length.
General:
IUPAC_AA
wherein X was not properly
matching against Q.getAAMatrix
to treat * (stop codon)
as a mismatch.
General:
readChangeoDb
.padSeqEnds
function which pads sequences with
Ns to make them equal in length.
collapseDuplicates.
Diversity:
uniform
argument to
rarefyDiversity
allowing users to toggle uniform vs
non-uniform sampling.plotAbundance
to
plotAbundanceCurve
.estimateAbundance
return object from a
data.frame to a new AbundanceCurve
custom class.plot
call for AbundanceCurve
to plotAbundanceCurve
.annotate
argument from
plotDiversityCurve
to plotAbundanceCurve
.score
argument to
plotDiversityCurve
to toggle between plotting diversity or
evenness.plotDiversityTest
to generate a
simple plot of DiversityTest
object summaries.
Gene Usage:
omit_nl
argument to getAllele
,
getGene
and getFamily
to allow optional
filtering of non-localized (NL) genes.
Lineage:
makeChangeoClone
preventing it from
interpreting the id
argument correctly.pad_end
argument to
makeChangeoClone
to allow automatic padding of ends to make
sequences the same length.
General:
dry
argument to collapseDuplicates
which will annotate duplicate sequences but not remove them when set to
TRUE
.collapseDuplicates
was returning one
sequence if all sequences were considered ambiguous.
Lineage:
makeChangeoClone
and buildPhylipLineage
for
purposes of (optionally) treating indels as mismatches.buildPhylipLineage
when PHYLIP doesn’t
generate inferred sequences and has only one block.
General:
readChangeoDb
causing the
select
argument to do nothing.
Gene Usage:
countGenes
when the clone
argument is
specified to CLONE_COUNT
/CLONE_FREQ
.General:
readChangeoDb
and
writeChangeoDb
.General:
seqDist()
wherein distance was not
properly calculated in some sequences containing gap characters.getAAMatrix()
return
matrix.General:
readChangeoDb()
to wrap
data.table::fread()
instead of
utils::read.table()
if the input file is not
compressed.testSeqEqual()
, getSeqDistance()
and getSeqMatrix()
to C++ to improve performance of
collapseDuplicates()
and other dependent functions.testSeqEqual()
, getSeqDistance()
and getSeqMatrix()
to seqEqual()
,
seqDist()
and pairwiseDist()
,
respectively.pairwiseEqual()
which creates a logical sequence
distance matrix; TRUE if sequences are identical, FALSE if not,
excluding Ns and gaps.X
in translateDNA()
.collapseDuplicates()
wherein the input
data type sanity check would cause the vignette to fail to build under R
3.3.ExampleDb.gz
file with a larger, more
clonal, ExampleDb
data object.ExampleTrees
with a larger set of trees.multiggplot()
to gridPlot()
.Amino Acid Analysis:
normalize=FALSE
for charge calculations
to be more consistent with previously published repertoire sequencing
results.
Diversity Analysis:
progress
argument to
rarefyDiversity()
and testDiversity()
to
enable the (previously default) progress bar.estimateAbundance()
where the function
would fail if there was only a single input sequence per group.data
and summary
slots of DiversityTest
to uppercase for consistency with
other tools.plot
to
plotDiversityCurve
for DiversityCurve
objects.Gene Usage:
sortGenes()
function to sort V(D)J genes by name
or locus position.clone
argument to countGenes()
to
allow restriction of gene abundance to one gene per clone.
Topology Analysis:
General:
base::nchar()
.General:
Amino Acid Analysis:
aliphatic()
function
were not being passed through the ellipsis argument of
aminoAcidProperties()
.aminoAcidProperties()
.AA_TRANS
to ABBREV_AA
.Diversity:
rarefyDiversity()
output.Lineage:
ExampleTrees
data with example output from
buildPhylipLineage()
.General:
getDNADistMatrix()
and
getAADistMatrix()
to getDNAMatrix()
and
getAAMatrix()
, respectively.getSeqMatrix()
which calculates a pairwise
distance matrix for a set of sequences.multiggplot()
function for performing multiple
panel plots.Amino Acid Analysis:
gravy()
, bulk()
,
aliphatic()
, polar()
, charge()
,
countPatterns()
and
aminoAcidProperties()
.Annotation:
getSegment()
, getAllele()
,
getGene()
and getFamily()
. May be disabled by
providing the argument strip_d=FALSE
.countGenes()
to tabulate V(D)J allele, gene and
family usage.
Diversity:
countClones()
,
estimateAbundance()
and plotAbundance()
.resampleDiversity()
to
rarefyDiversity()
and changed many of the internals.
Bootstrapping is now performed on an inferred complete relative
abundance distribution.rarefyDiversity()
and
testDiversity()
.rarefyDiversity()
and testDiversity()
are now
calculated using the mean and standard deviation of the bootstrap
realizations, rather than the median and upper/lower quantiles.
plotDiversityCurve().
Initial public release.
General:
citation("alakazam")
command.Lineage:
buildPhylipLineage()
.Lineage:
buildPhylipLineage()
would hang on R
3.2 due to R change request PR#15508.
Prerelease for review.