eugene.interpret.evolve_seqs_sdata

eugene.interpret.evolve_seqs_sdata(model, sdata, rounds, seq_var='ohe_seq', axis_order=('_sequence', '_ohe', 'length'), add_seqs=True, return_seqs=False, device='cpu', batch_size=128, copy=False)

Evolve, in silico, a set of sequences stored in a SeqData object.

This function wraps the evolution function from the seqexplainer package. It takes a SeqData object containing sequences and evolves them in silico for the requested number of rounds, using the specified model to score them. When add_seqs is True, the evolved sequences are stored in the SeqData object as a new variable. When return_seqs is True, the function returns the evolved sequences.
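To make "in silico evolution" concrete, here is a minimal pure-Python sketch of one common scheme: in each round, try every single-base substitution and keep the one with the highest score. This is an assumption about the general technique, not a description of what seqexplainer does internally, and toy_score and evolve are hypothetical names that stand in for a real model and for the library routine.

```python
ALPHABET = "ACGT"


def toy_score(seq):
    """Hypothetical stand-in for a model: count occurrences of the motif GATA."""
    return sum(seq[i:i + 4] == "GATA" for i in range(len(seq) - 3))


def evolve(seq, score_fn, rounds):
    """Greedy evolution (assumed scheme): apply the best single substitution per round."""
    seq = list(seq)
    for _ in range(rounds):
        best_score, best_pos, best_base = None, None, None
        for pos in range(len(seq)):
            original = seq[pos]
            for base in ALPHABET:
                if base == original:
                    continue
                # Score the sequence with this one substitution applied.
                seq[pos] = base
                score = score_fn("".join(seq))
                if best_score is None or score > best_score:
                    best_score, best_pos, best_base = score, pos, base
            # Restore the position before moving on.
            seq[pos] = original
        if best_pos is None:
            break  # empty sequence, nothing to mutate
        seq[best_pos] = best_base
    return "".join(seq)


evolved = evolve("GATT", toy_score, rounds=1)
```

Note that this greedy variant always applies a substitution, even when no candidate improves the score, so running more rounds than needed can lower the final score.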

Parameters:
  • model (torch.nn.Module) – The model to score the sequences with

  • sdata (xr.Dataset) – The SeqData object containing the sequences to evolve

  • rounds (int) – The number of rounds of evolution to perform

  • seq_var (str, optional) – The name of the sequence variable in the SeqData object, by default “ohe_seq”

  • axis_order (tuple, optional) – The axis order of the sequence variable in the SeqData object, by default (“_sequence”, “_ohe”, “length”)

  • add_seqs (bool, optional) – Whether to add the evolved sequences to the SeqData object, by default True

  • return_seqs (bool, optional) – Whether to return the evolved sequences, by default False

  • device (str, optional) – The device to use for scoring the sequences, by default “cpu”

  • batch_size (int, optional) – The batch size to use for scoring the sequences, by default 128

  • copy (bool, optional) – Whether to copy the SeqData object before adding the evolved sequences, by default False

Returns:

The SeqData object with the evolved sequences added, or the evolved sequences themselves when return_seqs is True

Return type:

xr.Dataset (sdata)
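Example:

A call might be sketched as follows. The keyword defaults are copied from the signature above; build_kwargs and evolve_on_copy are hypothetical helpers introduced only for illustration, and the model and sdata arguments are assumed to be a trained torch.nn.Module and a SeqData object with an "ohe_seq" variable. EUGENe is imported lazily, so the helper definitions load without it.

```python
# Keyword defaults copied from the documented signature.
DEFAULTS = {
    "seq_var": "ohe_seq",
    "axis_order": ("_sequence", "_ohe", "length"),
    "add_seqs": True,
    "return_seqs": False,
    "device": "cpu",
    "batch_size": 128,
    "copy": False,
}


def build_kwargs(**overrides):
    """Hypothetical helper: documented defaults, copy=True, plus caller overrides."""
    return {**DEFAULTS, "copy": True, **overrides}


def evolve_on_copy(model, sdata, rounds, **overrides):
    """Hypothetical helper: evolve sequences without modifying the input SeqData."""
    from eugene import interpret  # requires EUGENe to be installed

    return interpret.evolve_seqs_sdata(model, sdata, rounds, **build_kwargs(**overrides))


# Example (not run here): five rounds on a GPU, leaving the input sdata untouched.
# result = evolve_on_copy(model, sdata, rounds=5, device="cuda")
```

Setting copy=True here is a design choice for the sketch: it keeps the input object unchanged, at the cost of duplicating it in memory.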