pychopper(1)

package documentation

Section 1 python3-pychopper bookworm source

Description

PYCHOPPER

NAME

pychopper - package documentation

COMMAND LINE TOOLS

FULL API REFERENCE

pychopper

pychopper package

Subpackages

pychopper.phmm_data package

Module contents

pychopper.primer_data package

Module contents

pychopper.scripts package

Submodules

pychopper.scripts.pychopper module

pychopper.scripts.pychopper.main()

Parse command line arguments.

Module contents

pychopper.tests package

Submodules

pychopper.tests.test_detector module

class pychopper.tests.test_detector.TestDetector(methodName='runTest')

Bases: unittest.case.TestCase

Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.

testPairAlign()

testScoreCutoff()

pychopper.tests.test_regression_simple module

class pychopper.tests.test_regression_simple.TestIntegration(methodName='runTest')

Bases: unittest.case.TestCase

Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.

testIntegration()

Integration test.

testIntegration_umi()

Integration test.

Module contents

Submodules

pychopper.alignment_hits module

pychopper.alignment_hits.process_hits(hits, max_score)

Process alignment hits by removing overlaps

pychopper.chopper module

pychopper.chopper.analyse_hits(hits, config)

Segment reads based on alignment hits using dynamic programming. The algorithm is based on the rule that each primer alignment hit can be used only once. Hence if a segment is included, the next one has to be excluded.
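The "use each hit only once" rule can be illustrated with a small dynamic program. The following is a minimal sketch and not the pychopper implementation: it assumes that candidate segments are listed in read order, that neighbouring candidates share a primer hit, and that the score of a candidate is simply its length. The function name and its inputs are hypothetical, and the config argument of the real function is ignored here.

```python
def best_nonadjacent_sketch(lengths):
    """Pick non-adjacent candidate segments that maximise the total length.

    lengths[i] is the (hypothetical) score of candidate segment i.
    Candidates i and i + 1 are assumed to share one primer hit, so they
    cannot both be selected.
    """
    n = len(lengths)
    best = [0] * (n + 1)      # best[i]: best total using the first i candidates
    take = [False] * (n + 1)  # take[i]: whether candidate i - 1 is used in best[i]
    for i in range(1, n + 1):
        skip = best[i - 1]
        use = lengths[i - 1] + (best[i - 2] if i >= 2 else 0)
        if use > skip:
            best[i], take[i] = use, True
        else:
            best[i] = skip
    # Walk back through the table to recover the selected candidates.
    chosen = []
    i = n
    while i > 0:
        if take[i]:
            chosen.append(i - 1)
            i -= 2  # the neighbouring candidate shares a hit, so skip it
        else:
            i -= 1
    return best[n], chosen[::-1]


total, chosen = best_nonadjacent_sketch([300, 1200, 250, 900])
print(total, chosen)
```

With the four hypothetical candidates above, the second and fourth are selected, because taking either of them rules out its neighbours.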

pychopper.chopper.chopper_edlib(reads, primers, config, max_ed, cutoff,
pool, min_batch)

Segment using the edlib/parasail backend

pychopper.chopper.chopper_phmm(reads, phmm_file, config, cutoff, threads,
pool, min_batch)

Segment using the profile HMM backend

pychopper.chopper.segments_to_reads(read, segments, keep_primers, bam_tags,
detect_umis)

Convert segments to output reads with annotation

pychopper.common_structures module

class pychopper.common_structures.Hit(Ref, RefStart, RefEnd, Query,
QueryStart, QueryEnd, Score)

Bases: tuple

Create new instance of Hit(Ref, RefStart, RefEnd, Query, QueryStart, QueryEnd, Score)

Query

Alias for field number 3

QueryEnd

Alias for field number 5

QueryStart

Alias for field number 4

Ref

Alias for field number 0

RefEnd

Alias for field number 2

RefStart

Alias for field number 1

Score

Alias for field number 6
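The "Bases: tuple" and "Alias for field number N" entries are the usual signature of a named tuple. As an illustration only, an equivalent structure can be rebuilt with collections.namedtuple from the standard library; the field values below are made up, and whether pychopper defines Hit in exactly this way is an assumption.

```python
from collections import namedtuple

# Stand-in with the same field order as the documented Hit class.
Hit = namedtuple(
    "Hit",
    ["Ref", "RefStart", "RefEnd", "Query", "QueryStart", "QueryEnd", "Score"],
)

# Hypothetical values, chosen only to show field access.
hit = Hit("read_1", 10, 32, "primer_A", 0, 22, 21.0)

print(hit.Score)   # access by name
print(hit[6])      # same field by position ("field number 6")
print(hit._replace(Score=18.0))  # tuples are immutable; _replace returns a copy
```

The Segment and Seq classes documented below follow the same pattern with their own field lists.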

class pychopper.common_structures.Segment(Left, Start, End, Right, Strand,
Len)

Bases: tuple

Create new instance of Segment(Left, Start, End, Right, Strand, Len)

End

Alias for field number 2

Left

Alias for field number 0

Len

Alias for field number 5

Right

Alias for field number 3

Start

Alias for field number 1

Strand

Alias for field number 4

class pychopper.common_structures.Seq(Id, Name, Seq, Qual, Umi)

Bases: tuple

Create new instance of Seq(Id, Name, Seq, Qual, Umi)

Id

Alias for field number 0

Name

Alias for field number 1

Qual

Alias for field number 3

Seq

Alias for field number 2

Umi

Alias for field number 4

pychopper.edlib_backend module

pychopper.edlib_backend.find_locations(reads, all_primers, max_ed, pool,
min_batch)

Find alignment hits of all primers in all reads using the edlib/parasail backend

pychopper.edlib_backend.find_umi_single(params)

Find UMI in a single read using the edlib/parasail backend

pychopper.hmmer_backend module

pychopper.hmmer_backend.find_locations(reads, phmm_file, E, pool,
min_batch)

Find alignment hits of all primers in all reads using the pHMM/nhmmscan backend

pychopper.parasail_backend module

pychopper.parasail_backend.first_cigar(cigar)

Extract details of the first operation in a cigar string.
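As an illustration of what the first operation of a CIGAR string is, here is a minimal sketch. It assumes the textual CIGAR form used by SAM (a run length followed by a one-letter operation code); the function name and the (length, operation) return shape are hypothetical and may differ from what first_cigar actually returns.

```python
import re

# One run length followed by one SAM operation code, anchored at the start.
_FIRST_OP = re.compile(r"^(\d+)([MIDNSHP=X])")


def first_cigar_sketch(cigar):
    """Return (length, operation) for the first operation of a CIGAR string."""
    match = _FIRST_OP.match(cigar)
    if match is None:
        raise ValueError("not a CIGAR string: %r" % cigar)
    return int(match.group(1)), match.group(2)


print(first_cigar_sketch("5S20=1X"))
```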

pychopper.parasail_backend.pair_align(reference, query, query_name,
subs_mat, params)

Perform pairwise local alignment using parasail-python

pychopper.parasail_backend.process_alignment(aln, query, query_name,
aln_params)

Process an alignment, extracting score, start and end.

pychopper.parasail_backend.refine_locations(read, all_primers, locations,
aln_params={'gap_extend': 1, 'gap_open': 1, 'match': 1, 'mismatch': -2},
subs_mat=<parasail.bindings_v2.Matrix object>)

Refine alignment edges based on local alignment

pychopper.report module

class pychopper.report.Report(pdf)

Bases: object

Class for plotting utilities on top of matplotlib. Plots are saved in the specified file through the PDF backend.

Parameters

self -- object.

pdf -- Output pdf.

Returns

The report object.

Return type

Report

close()

Close PDF backend. Do not forget to call this at the end of your script or your output will be damaged!

Parameters

self -- object

Returns

None

Return type

object

plot_arrays(data_map, title='', xlab='', ylab='', marker='.',
legend_loc='best', legend=True, vlines=None, vlcolor='green',
vlwitdh=0.5)

Plot multiple pairs of data arrays.

Parameters

self -- object.

data_map -- A dictionary with labels as keys and tuples of data arrays (x,y) as values.

title -- Figure title.

xlab -- X axis label.

ylab -- Y axis label.

marker -- Marker passed to the plot function.

legend_loc -- Location of legend.

legend -- Plot legend if True.

vlines -- Dictionary with labels and positions of vertical lines to draw.

vlcolor -- Color of vertical lines drawn.

vlwidth -- Width of vertical lines drawn.

Returns

None

Return type

object

plot_bars_simple(data_map, title='', xlab='', ylab='', alpha=0.6,
xticks_rotation=0, auto_limit=False)

Plot simple bar chart from input dictionary.

Parameters

self -- object.

data_map -- A dictionary with labels as keys and data as values.

title -- Figure title.

xlab -- X axis label.

ylab -- Y axis label.

alpha -- Alpha value.

xticks_rotation -- Rotation value for x tick labels.

auto_limit -- Set y axis limits automatically.

Returns

None

Return type

object

plot_histograms(data_map, title='', xlab='', ylab='', bins=50,
alpha=0.7, legend_loc='best', legend=True, vlines=None)

Plot histograms of multiple data arrays.

Parameters

self -- object.

data_map -- A dictionary with labels as keys and data arrays as values.

title -- Figure title.

xlab -- X axis label.

ylab -- Y axis label.

bins -- Number of bins.

alpha -- Transparency value for histograms.

legend_loc -- Location of legend.

legend -- Plot legend if True.

vlines -- Dictionary with labels and positions of vertical lines to draw.

Returns

None

Return type

object

save_close()

Utility method to save and close figure.

pychopper.seq_utils module

pychopper.seq_utils.base_complement(k)

Return complement of base.

Performs the substitutions A<=>T, C<=>G, X=>X for both upper and lower case. The return value is identical to the argument for all other values.

Parameters

k -- A base.

Returns

Complement of base.

Return type

str
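The substitution rule described above can be sketched with a lookup table. This is an illustration written from the description only, not the library source, and the function name is hypothetical.

```python
# Complement pairs for both cases; X maps to itself.
_COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C", "X": "X",
    "a": "t", "t": "a", "c": "g", "g": "c", "x": "x",
}


def base_complement_sketch(k):
    """Return the complement of base k, or k itself if it is not in the table."""
    return _COMPLEMENT.get(k, k)


for base in "AgXN":
    print(base, base_complement_sketch(base))
```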

pychopper.seq_utils.errs_tab(n)

Generate list of error rates for qualities less than or equal to n.
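A Phred quality q corresponds to an error probability of 10^(-q/10), which matches the values shown in the default tab argument of mean_qual below (1.0, 0.794..., 0.630..., and so on). A minimal sketch of such a table, assuming the list is indexed by quality from 0 to n inclusive; the function name is hypothetical.

```python
def errs_tab_sketch(n):
    """Return error probabilities for Phred qualities 0..n (inclusive)."""
    return [10 ** (q / -10) for q in range(n + 1)]


tab = errs_tab_sketch(128)
print(len(tab))          # one entry per quality value from 0 to 128
print(tab[0], tab[10], tab[20])
```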

pychopper.seq_utils.get_primers(primers)

Load primers from fasta file

pychopper.seq_utils.get_runid(desc)

Parse out runid from sequence description.

pychopper.seq_utils.mean_qual(quals, qround=False, tab=[1.0,
0.7943282347242815, 0.6309573444801932, 0.5011872336272722,
0.3981071705534972, 0.31622776601683794, 0.251188643150958,
0.19952623149688797, 0.15848931924611134, 0.12589254117941673, 0.1,
0.07943282347242814, 0.06309573444801933, 0.05011872336272722,
0.039810717055349734, 0.03162277660168379, 0.025118864315095794,
0.0199526231496888, 0.015848931924611134, 0.012589254117941675, 0.01,
0.007943282347242814, 0.00630957344480193, 0.005011872336272725,
0.003981071705534973, 0.0031622776601683794, 0.0025118864315095794,
0.001995262314968879, 0.001584893192461114, 0.0012589254117941675, 0.001,
0.0007943282347242813, 0.000630957344480193, 0.0005011872336272725,
0.00039810717055349735, 0.00031622776601683794, 0.00025118864315095795,
0.00019952623149688788, 0.00015848931924611142, 0.00012589254117941674,
0.0001, 7.943282347242822e-05, 6.309573444801929e-05,
5.011872336272725e-05, 3.9810717055349695e-05, 3.1622776601683795e-05,
2.5118864315095822e-05, 1.9952623149688786e-05, 1.584893192461114e-05,
1.2589254117941661e-05, 1e-05, 7.943282347242822e-06, 6.30957344480193e-06,
5.011872336272725e-06, 3.981071705534969e-06, 3.162277660168379e-06,
2.5118864315095823e-06, 1.9952623149688787e-06, 1.584893192461114e-06,
1.2589254117941661e-06, 1e-06, 7.943282347242822e-07, 6.30957344480193e-07,
5.011872336272725e-07, 3.981071705534969e-07, 3.162277660168379e-07,
2.5118864315095823e-07, 1.9952623149688787e-07, 1.584893192461114e-07,
1.2589254117941662e-07, 1e-07, 7.943282347242822e-08, 6.30957344480193e-08,
5.011872336272725e-08, 3.981071705534969e-08, 3.162277660168379e-08,
2.511886431509582e-08, 1.9952623149688786e-08, 1.5848931924611143e-08,
1.2589254117941661e-08, 1e-08, 7.943282347242822e-09,
6.309573444801943e-09, 5.011872336272715e-09, 3.981071705534969e-09,
3.1622776601683795e-09, 2.511886431509582e-09, 1.9952623149688828e-09,
1.584893192461111e-09, 1.2589254117941663e-09, 1e-09,
7.943282347242822e-10, 6.309573444801942e-10, 5.011872336272714e-10,
3.9810717055349694e-10, 3.1622776601683795e-10, 2.511886431509582e-10,
1.9952623149688828e-10, 1.584893192461111e-10, 1.2589254117941662e-10,
1e-10, 7.943282347242822e-11, 6.309573444801942e-11, 5.011872336272715e-11,
3.9810717055349695e-11, 3.1622776601683794e-11, 2.5118864315095823e-11,
1.9952623149688828e-11, 1.5848931924611107e-11, 1.2589254117941662e-11,
1e-11, 7.943282347242821e-12, 6.309573444801943e-12, 5.011872336272715e-12,
3.9810717055349695e-12, 3.1622776601683794e-12, 2.5118864315095823e-12,
1.9952623149688827e-12, 1.584893192461111e-12, 1.258925411794166e-12,
1e-12, 7.943282347242822e-13, 6.309573444801942e-13, 5.011872336272715e-13,
3.981071705534969e-13, 3.162277660168379e-13, 2.511886431509582e-13,
1.9952623149688827e-13, 1.584893192461111e-13])

Calculate the average basecall quality of a read. Receives the ASCII quality scores of a read and returns the average quality for that read: first convert the Phred scores to error probabilities, then calculate the average error probability, and finally convert the average back to the Phred scale.
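The three steps can be sketched as follows. This is an illustration under stated assumptions rather than the library code: it assumes the common Phred+33 ASCII encoding, computes probabilities directly instead of using a lookup table, and returns 0.0 for an empty quality string. The function name and the offset parameter are hypothetical.

```python
import math


def mean_qual_sketch(quals, offset=33):
    """Average Phred quality of a read, averaged on the probability scale."""
    if not quals:
        return 0.0  # assumption: an empty read has no meaningful quality
    # Step 1: ASCII character -> Phred score -> error probability.
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quals]
    # Step 2: average error probability.
    mean_prob = sum(probs) / len(probs)
    # Step 3: back to the Phred scale.
    return -10 * math.log10(mean_prob)


print(mean_qual_sketch("5555"))  # four bases of Q20
print(mean_qual_sketch("+5"))    # one Q10 base and one Q20 base
```

Averaging on the probability scale means that low-quality bases dominate: the mix of Q10 and Q20 above comes out below the arithmetic mean of 15.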

pychopper.seq_utils.random(size=None)

Return random floats in the half-open interval [0.0, 1.0). Alias for random_sample to ease forward-porting to the new random API.

pychopper.seq_utils.readfq(fp, sample=None, min_qual=None, rfq_sup={})

The function below is taken from https://github.com/lh3/readfq/blob/master/readfq.py and parses large files much faster than Biopython.

pychopper.seq_utils.record_size(read, in_format='fastq')

Calculate record size.

pychopper.seq_utils.revcomp_seq(seq)

Reverse complement sequence record

pychopper.seq_utils.reverse_complement(seq)

Return reverse complement of a string (base) sequence.

Parameters

seq -- Input sequence.

Returns

Reverse complement of input sequence.

Return type

str
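A reverse complement reverses the sequence and complements each base. The sketch below uses str.translate from the standard library and leaves characters outside A, C, G, T unchanged; it is an illustration only, and the function name is hypothetical.

```python
# Translation table for upper- and lower-case bases; other characters pass through.
_COMPLEMENT_TABLE = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement_sketch(seq):
    """Return the reverse complement of a base sequence given as a string."""
    return seq[::-1].translate(_COMPLEMENT_TABLE)


print(reverse_complement_sketch("AACG"))
```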

pychopper.seq_utils.writefq(r, fh)

Write read to fastq file

pychopper.utils module

pychopper.utils.batch(iterable, size)
pychopper.utils.check_command(cmd)
pychopper.utils.check_min_hmmer_version(major, minor)
pychopper.utils.count_fastq_records(fname, size=128000000, opener=<built-in
function open>)
pychopper.utils.hit2bed(hit, read)
pychopper.utils.parse_config_string(s)

Module contents

AUTHOR

ONT Applications Group

COPYRIGHT

2022, Oxford Nanopore Technologies Ltd.