eugene.preprocess.make_unique_ids_sdata

eugene.preprocess.make_unique_ids_sdata(sdata, id_var='id', copy=False)

Generate a unique id for each sequence in a SeqData object and store the ids as a new xarray variable.

The dimension that indexes the sequences must be named “_sequence”; otherwise the call fails. Any existing variable with the name given by id_var is overwritten.

Parameters:
  • sdata (xr.Dataset) – SeqData object.

  • id_var (str, optional) – Name of the variable to store the ids in, by default “id”.

  • copy (bool, optional) – Whether to return a copy of the SeqData object, by default False.

Returns:

SeqData object with unique ids. If copy is True, a copy of the SeqData object is returned, else the original SeqData object is modified in place.

Return type:

xr.Dataset
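
Example:

The behaviour described above can be sketched with plain xarray. This is a hypothetical stand-in, not EUGENe's implementation: the function name make_unique_ids_sketch, the zero-padded "seq0"-style id format, the toy dataset, and the choice to always return the dataset are all assumptions made for illustration. It only shows the three documented points: the “_sequence” dimension is required, an existing variable with the same name is replaced, and copy controls whether the input is modified.

```python
import numpy as np
import xarray as xr


def make_unique_ids_sketch(sdata, id_var="id", copy=False):
    """Hypothetical stand-in for illustration; not EUGENe's code."""
    # Work on a shallow copy only when asked; otherwise mutate the input.
    sdata = sdata.copy() if copy else sdata
    # Raises KeyError if the dataset has no "_sequence" dimension.
    n_seqs = sdata.sizes["_sequence"]
    width = len(str(n_seqs))
    # Assumed id format: zero-padded strings such as "seq0", "seq1", ...
    ids = [f"seq{i:0{width}d}" for i in range(n_seqs)]
    # Assignment replaces any existing variable named `id_var`.
    sdata[id_var] = xr.DataArray(ids, dims=["_sequence"])
    return sdata


# Toy dataset (assumed layout): three sequences of length four.
toy = xr.Dataset(
    {
        "seq": (
            ("_sequence", "_length"),
            np.array([list("ACGT"), list("TTGA"), list("GGCC")]),
        )
    }
)

# In-place: `toy` itself gains an "id" variable.
in_place = make_unique_ids_sketch(toy)

# Copy: the ids go into "name" on a new dataset; `toy` gets no "name".
copied = make_unique_ids_sketch(toy, id_var="name", copy=True)
```

With the real function, the equivalent calls would be eugene.preprocess.make_unique_ids_sdata(sdata) and eugene.preprocess.make_unique_ids_sdata(sdata, id_var="name", copy=True); the exact id strings it produces are not specified on this page.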