Python: module starcall.reads

starcall.reads


Classes and functions to store and process the short reads that are generated from in situ sequencing. Reads are stored as the Read dataclass and in pandas dataframes.

Modules

collections
dataclasses
heapq
itertools
numpy
pandas
scipy
skimage
sklearn
time
starcall.utils

Classes



builtins.object

Heap
ReadsAccessor

collections.abc.Mapping(collections.abc.Collection)

Read

class Heap(builtins.object)

    Methods defined here:

__init__(self)
Initialize self.  See help(type(self)) for accurate signature.

empty(self)

pop(self)
Remove and return the lowest priority task. Raise KeyError if empty.

push(self, task, priority=0)
Add a new task or update the priority of an existing task.

remove(self, task)
Mark an existing task as REMOVED.  Raise KeyError if not found.

Data descriptors defined here:

__dict__

dictionary for instance variables (if defined)

__weakref__

list of weak references to the object (if defined)

Data and other attributes defined here:

REMOVED = '<removed-task>'
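The method names and the REMOVED sentinel match the priority queue recipe in the standard library heapq documentation. The sketch below follows that recipe to illustrate the documented behaviour; it is an assumption that Heap is implemented this way, and the name SketchHeap and the example task strings are hypothetical.

```python
import heapq
import itertools


class SketchHeap:
    """Illustrative priority queue; not the starcall implementation."""

    REMOVED = '<removed-task>'  # placeholder that marks a removed task

    def __init__(self):
        self.heap = []                    # entries arranged as a binary heap
        self.entry_finder = {}            # maps each live task to its entry
        self.counter = itertools.count()  # tie-breaker for equal priorities

    def empty(self):
        # True when no live (non-removed) tasks remain
        return not self.entry_finder

    def push(self, task, priority=0):
        # Add a new task, or update the priority of an existing task
        if task in self.entry_finder:
            self.remove(task)
        entry = [priority, next(self.counter), task]
        self.entry_finder[task] = entry
        heapq.heappush(self.heap, entry)

    def remove(self, task):
        # Mark the entry as removed; dict.pop raises KeyError if not found
        entry = self.entry_finder.pop(task)
        entry[-1] = self.REMOVED

    def pop(self):
        # Skip entries marked as removed and return the lowest priority task
        while self.heap:
            priority, count, task = heapq.heappop(self.heap)
            if task is not self.REMOVED:
                del self.entry_finder[task]
                return task
        raise KeyError('pop from an empty priority queue')


heap = SketchHeap()
heap.push('merge a-b', priority=0.3)
heap.push('merge c-d', priority=0.1)
heap.push('merge a-b', priority=0.05)  # updates the existing task
first = heap.pop()                     # the task with the lowest priority value
heap.remove('merge c-d')               # lazily deleted, skipped by later pops
```

Removal is lazy in this recipe: the entry stays in the underlying list and is discarded only when it reaches the top, which avoids re-heapifying on every update.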

class Read(collections.abc.Mapping)

    Read(position=None, values=None, sequence=None, image=None, channels=None, **kwargs)

    A sequencing read found in an image. Can be constructed with a position and the image, or with a position and the values from that position.

    A Read has three main components, each of which can be None depending on what is known about the read and which step of the sequencing process it is in. These attributes are:

        position (ndarray of shape (2,)): The position of the read, in pixels. This is relevant for all reads detected from the sequencing images and holds the position at which they were found. Reads without a position are library barcodes that need to be compared to sequencing reads, or cell consensus reads that don't have a single position.

        values (ndarray of shape (n_cycles, n_channels)): The values extracted from the sequencing images for this read. This is only present for reads that came from sequencing images; for example, barcodes from the library don't have any raw sequencing values related to them.

        sequence (string of len n_cycles): The sequence of the read. The sequence attribute is always present and is never None; if a sequence is not specified when creating the Read, it is inferred from the values by taking the maximum channel for each cycle.

    Additional attributes can be added as keyword arguments when creating the Read, and can be accessed as if the read were a dictionary.
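The inference of a sequence from values can be sketched as follows. This is a minimal illustration that assumes the channel order given by DEFAULT_CHANNELS; the helper name infer_sequence and the example values are hypothetical, and plain nested lists stand in for the ndarray so that the sketch has no dependencies.

```python
DEFAULT_CHANNELS = ('G', 'T', 'A', 'C')


def infer_sequence(values, channels=DEFAULT_CHANNELS):
    """Pick the channel with the largest value in each cycle.

    values is a sequence of rows, one per cycle, each holding one
    value per channel. With a NumPy array of shape
    (n_cycles, n_channels), values.argmax(axis=1) yields the same indices.
    """
    bases = []
    for cycle_values in values:
        # Index of the strongest channel in this cycle
        best = max(range(len(cycle_values)), key=cycle_values.__getitem__)
        bases.append(channels[best])
    return ''.join(bases)


example_values = [
    [0.9, 0.1, 0.0, 0.2],  # cycle 0: channel 'G' is strongest
    [0.1, 0.2, 0.8, 0.1],  # cycle 1: channel 'A' is strongest
    [0.0, 0.7, 0.1, 0.3],  # cycle 2: channel 'T' is strongest
]
sequence = infer_sequence(example_values)  # 'GAT'
```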

Method resolution order:

Read

collections.abc.Mapping

collections.abc.Collection

collections.abc.Sized

collections.abc.Iterable

collections.abc.Container

builtins.object

Methods defined here:

__getitem__(self, name)

__init__(self, position=None, values=None, sequence=None, image=None, channels=None, **kwargs)

__iter__(self)

__len__(self)

__repr__(self)
Return repr(self).

__setitem__(self, name)

__str__(self)
Return str(self).

Static methods defined here:

asread(obj)

Readonly properties defined here:

qualities

Data descriptors defined here:

__dict__

dictionary for instance variables (if defined)

__weakref__

list of weak references to the object (if defined)

sequence

sequence_array

Data and other attributes defined here:

DEFAULT_CHANNELS = ('G', 'T', 'A', 'C')

__abstractmethods__ = frozenset()

Methods inherited from collections.abc.Mapping:

__contains__(self, key)

__eq__(self, other)
Return self==value.

get(self, key, default=None)
D.get(k[,d]) -> D[k] if k in D, else d.  d defaults to None.

items(self)
D.items() -> a set-like object providing a view on D's items

keys(self)
D.keys() -> a set-like object providing a view on D's keys

values(self)
D.values() -> an object providing a view on D's values
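The inherited methods above come for free from collections.abc.Mapping once a class defines __getitem__, __iter__ and __len__. The sketch below shows that mechanism with a hypothetical SketchRead class; how Read itself stores its attributes is not shown in this listing, so the internal dictionary and the 'cell' attribute are assumptions.

```python
from collections.abc import Mapping


class SketchRead(Mapping):
    """Illustrative stand-in for dictionary-style access; not the real Read."""

    def __init__(self, sequence=None, **kwargs):
        # Assumption: all attributes live in one internal dictionary
        self._attrs = dict(sequence=sequence, **kwargs)

    def __getitem__(self, name):
        return self._attrs[name]

    def __iter__(self):
        return iter(self._attrs)

    def __len__(self):
        return len(self._attrs)


read = SketchRead(sequence='GAT', cell=7)

cell = read['cell']              # provided by __getitem__
names = list(read.keys())        # inherited from Mapping
missing = read.get('count', 0)   # inherited, falls back to the default
has_cell = 'cell' in read        # inherited __contains__
```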

Data and other attributes inherited from collections.abc.Mapping:

__hash__ = None

__reversed__ = None

Class methods inherited from collections.abc.Collection:

__subclasshook__(C) from abc.ABCMeta
Abstract classes can override this to customize issubclass(). This is invoked early on by abc.ABCMeta.__subclasscheck__(). It should return True, False or NotImplemented.  If it returns NotImplemented, the normal algorithm is used.  Otherwise, it overrides the normal algorithm (and the outcome is cached).

Class methods inherited from collections.abc.Iterable:

__class_getitem__ = GenericAlias(...) from abc.ABCMeta
Represent a PEP 585 generic type E.g. for t = list[int], t.__origin__ is list and t.__args__ is (int,).

class ReadsAccessor(builtins.object)

    ReadsAccessor(table)

    Accessor object that provides attributes and functions for dataframes containing in situ sequencing reads.

    A dataframe with the columns below can use the accessor:

        (optional) 'position_x', 'position_y': The pixel location of the sequencing read.

        'values_cycle00_G', 'values_cycle00_T' ... 'values_cycle11_A', 'values_cycle11_C': The sequencing values of the read, from the filtered sequencing images. Values from every cycle and each channel are stored.

        (optional) 'sequence': The sequence of the read.

    The accessor provides the custom properties listed below:

        'positions': a numpy array of shape (num_reads, 2). The positions of the reads.

        'values': a numpy array of shape (num_reads, num_cycles, num_channels). The sequencing values for all reads.

        'sequences': a numpy array of strings of shape (num_reads,). If there is a column called 'sequence' in the dataframe, this is just the contents of that column. Otherwise, the sequences are generated from the sequencing values by selecting the maximum channel in each cycle.

    Changes to positions and values both propagate back to the underlying dataframe, so you can do something like:

        table.reads.positions *= 2
        table.reads.values /= np.linalg.norm(table.reads.values, axis=2)[:,:,None]

    Full reference documentation is available at <https://fowlerlab.github.io/starcall-docs/starcall.html>
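The value column names in the docstring suggest a pattern of a two-digit, zero-padded cycle index followed by the channel letter. The sketch below generates names under that assumption; the helper value_columns is hypothetical and the pattern is inferred from the examples in the docstring rather than confirmed by the source.

```python
CHANNELS = ('G', 'T', 'A', 'C')  # same order as Read.DEFAULT_CHANNELS


def value_columns(n_cycles, channels=CHANNELS):
    """Build column names such as 'values_cycle00_G', cycle by cycle."""
    return [
        f'values_cycle{cycle:02d}_{channel}'
        for cycle in range(n_cycles)
        for channel in channels
    ]


columns = value_columns(12)
# 12 cycles x 4 channels = 48 columns, ordered so that a row can be
# reshaped to (num_cycles, num_channels)
```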

Methods defined here:

__getitem__(self, index)

__init__(self, table)
Initialize self.  See help(type(self)) for accurate signature.

__iter__(self)

__len__(self)

aggfuncs(self, position=None, values=None, **kwargs)

normalize(self, method='full')
Normalizes the values of this read set, based on the method specified.

Possible methods are:

    'full' (default): values are normalized across the channel axis, so that for each cycle the norm of the vector of all channels is 1.
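The 'full' method can be illustrated for a single read as below. This is a sketch, not the library code: it assumes the Euclidean norm (the default of np.linalg.norm, which the ReadsAccessor docstring uses), and the helper name normalize_full and the choice to leave all-zero cycles at zero are assumptions.

```python
import math


def normalize_full(values):
    """Scale each cycle so the vector of its channel values has norm 1.

    values is a list of rows, one per cycle, each holding one value
    per channel.
    """
    normalized = []
    for cycle_values in values:
        norm = math.sqrt(sum(v * v for v in cycle_values))
        if norm == 0:
            # Assumption: an all-zero cycle is left as zeros
            normalized.append([0.0 for _ in cycle_values])
        else:
            normalized.append([v / norm for v in cycle_values])
    return normalized


read_values = [
    [3.0, 4.0, 0.0, 0.0],  # norm 5
    [1.0, 1.0, 1.0, 1.0],  # norm 2
]
normalized = normalize_full(read_values)
```

After normalization, every cycle with a nonzero signal has unit norm, so cycles with different overall brightness become comparable.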

to_cell_table(self, cell_column='cell', include_attrs=['count'], cell_index=None)

Readonly properties defined here:

positions

sequences

sequences_array

values

Data descriptors defined here:

__dict__

dictionary for instance variables (if defined)

__weakref__

list of weak references to the object (if defined)

Functions

cluster_reads(distance_matrix, threshold=0.2, linkage='mean', debug=True, progress=False)

distance_matrix(table, cells=None, distance_cutoff=50, positional_weight=1.0, value_weight=1.0, sequence_weight=1.0, matrix=None, debug=True, progress=True)

join_contiguous_arrays(arrays)

make_readset(positions=None, values=None, sequences=None, image=None, channels=None, **kwargs)
Creates a pandas DataFrame that is compatible with the ReadsAccessor provided by this package.

Data

Optional = typing.Optional