pychopper(1)
package documentation
Description
PYCHOPPER
NAME
pychopper - package documentation
COMMAND LINE TOOLS
FULL API REFERENCE
pychopper
pychopper package
Subpackages
pychopper.phmm_data package
Module contents
pychopper.primer_data package
Module contents
pychopper.scripts package
Submodules
pychopper.scripts.pychopper module
pychopper.scripts.pychopper.main()
Parse command line arguments.
Module contents
pychopper.tests package
Submodules
pychopper.tests.test_detector module
class pychopper.tests.test_detector.TestDetector(methodName='runTest')
Bases: unittest.case.TestCase
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
testPairAlign()
testScoreCutoff()
pychopper.tests.test_regression_simple module
class pychopper.tests.test_regression_simple.TestIntegration(methodName='runTest')
Bases: unittest.case.TestCase
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
testIntegration()
Integration test.
testIntegration_umi()
Integration test.
Module contents
Submodules
pychopper.alignment_hits module
pychopper.alignment_hits.process_hits(hits, max_score)
Process alignment hits by removing overlaps
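One way to picture overlap removal is a greedy pass that keeps the best-scoring hit whenever two hits overlap on the read. The sketch below is an illustration only and is not taken from pychopper: the helper remove_overlaps is hypothetical, it assumes that the Ref coordinates are half-open positions on the read and that a higher Score is better, and it does not model the max_score argument.

```python
from collections import namedtuple

# Same field layout as pychopper.common_structures.Hit.
Hit = namedtuple("Hit", "Ref RefStart RefEnd Query QueryStart QueryEnd Score")


def remove_overlaps(hits):
    """Hypothetical helper: greedily keep the best-scoring non-overlapping hits."""
    kept = []
    # Visit hits from best to worst score (assumption: higher is better).
    for hit in sorted(hits, key=lambda h: h.Score, reverse=True):
        # Keep the hit only if it is disjoint from every hit kept so far.
        if all(hit.RefEnd <= k.RefStart or hit.RefStart >= k.RefEnd for k in kept):
            kept.append(hit)
    # Report the survivors in read order.
    return sorted(kept, key=lambda h: h.RefStart)


example_hits = [
    Hit("read_1", 0, 10, "primer_a", 0, 10, 5),
    Hit("read_1", 5, 15, "primer_b", 0, 10, 9),
    Hit("read_1", 20, 30, "primer_a", 0, 10, 1),
]
non_overlapping = remove_overlaps(example_hits)
```

In this example the first two hits overlap, so only the higher-scoring one survives alongside the third hit.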
pychopper.chopper module
pychopper.chopper.analyse_hits(hits, config)
Segment reads based on alignment hits using dynamic programming. The algorithm is based on the rule that each primer alignment hit can be used only once. Hence if a segment is included, the next one has to be excluded.
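The "each hit can be used only once" rule means that two neighbouring candidate segments, which share a primer hit, cannot both be selected. A minimal sketch of that constraint as a dynamic program follows. It is not the pychopper implementation: the helper best_segments is hypothetical, segments are scored by length alone, and the check for valid primer configurations driven by config is not modelled.

```python
def best_segments(lengths):
    """Hypothetical helper: pick non-adjacent segments with maximal total length.

    lengths[i] is the score of the candidate segment between hit i and hit i + 1.
    Returns the indices of the selected segments in increasing order.
    """
    n = len(lengths)
    best = [0] * (n + 1)      # best[i]: best total using the first i segments
    take = [False] * (n + 1)  # take[i]: whether segment i - 1 is used in best[i]
    for i in range(1, n + 1):
        skip = best[i - 1]
        # Using segment i - 1 forbids its neighbour, segment i - 2.
        use = lengths[i - 1] + (best[i - 2] if i >= 2 else 0)
        if use > skip:
            best[i], take[i] = use, True
        else:
            best[i] = skip
    # Trace back through the table to recover the selected segments.
    chosen = []
    i = n
    while i > 0:
        if take[i]:
            chosen.append(i - 1)
            i -= 2
        else:
            i -= 1
    return chosen[::-1]
```

For candidate lengths [10, 1, 10] the sketch selects segments 0 and 2, skipping the short middle segment that shares a hit with both.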
pychopper.chopper.chopper_edlib(reads, primers, config, max_ed, cutoff, pool, min_batch)
Segment using the edlib/parasail backend
pychopper.chopper.chopper_phmm(reads, phmm_file, config, cutoff, threads, pool, min_batch)
Segment using the profile HMM backend
pychopper.chopper.segments_to_reads(read, segments, keep_primers, bam_tags, detect_umis)
Convert segments to output reads with annotation
pychopper.common_structures module
class pychopper.common_structures.Hit(Ref, RefStart, RefEnd, Query, QueryStart, QueryEnd, Score)
Bases: tuple
Create new instance of Hit(Ref, RefStart, RefEnd, Query, QueryStart, QueryEnd, Score)
Query
Alias for field number 3
QueryEnd
Alias for field number 5
QueryStart
Alias for field number 4
Ref
Alias for field number 0
RefEnd
Alias for field number 2
RefStart
Alias for field number 1
Score
Alias for field number 6
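The "alias for field number" entries describe a named tuple: each field can be read by name or by position. The sketch below rebuilds an equivalent type with collections.namedtuple rather than importing pychopper, and the field values are invented for illustration.

```python
from collections import namedtuple

# Stand-in with the same field order as pychopper.common_structures.Hit.
Hit = namedtuple(
    "Hit",
    ["Ref", "RefStart", "RefEnd", "Query", "QueryStart", "QueryEnd", "Score"],
)

# Invented example values, not real pychopper output.
hit = Hit("read_1", 0, 22, "primer_a", 0, 22, 3)

# Access by name and by position refer to the same field.
same_query = hit.Query == hit[3]
same_score = hit.Score == hit[6]

# Named tuples are immutable; _replace returns a modified copy.
shifted = hit._replace(RefStart=5)
```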
class pychopper.common_structures.Segment(Left, Start, End, Right, Strand, Len)
Bases: tuple
Create new instance of Segment(Left, Start, End, Right, Strand, Len)
End
Alias for field number 2
Left
Alias for field number 0
Len
Alias for field number 5
Right
Alias for field number 3
Start
Alias for field number 1
Strand
Alias for field number 4
class pychopper.common_structures.Seq(Id, Name, Seq, Qual, Umi)
Bases: tuple
Create new instance of Seq(Id, Name, Seq, Qual, Umi)
Id
Alias for field number 0
Name
Alias for field number 1
Qual
Alias for field number 3
Seq
Alias for field number 2
Umi
Alias for field number 4
pychopper.edlib_backend module
pychopper.edlib_backend.find_locations(reads, all_primers, max_ed, pool, min_batch)
Find alignment hits of all primers in all reads using the edlib/parasail backend
pychopper.edlib_backend.find_umi_single(params)
Find UMI in a single read using the edlib/parasail backend
pychopper.hmmer_backend module
pychopper.hmmer_backend.find_locations(reads, phmm_file, E, pool, min_batch)
Find alignment hits of all primers in all reads using the pHMM/nhmmscan backend
pychopper.parasail_backend module
pychopper.parasail_backend.first_cigar(cigar)
Extract details of the first operation in a cigar string.
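A sketch of what extracting the first operation of a CIGAR string can look like is given below. The helper first_cigar_op is hypothetical and its return format (length, operation) is an assumption; the actual return value of pychopper.parasail_backend.first_cigar is not documented here.

```python
import re

# One CIGAR operation: a run length followed by an operation code.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def first_cigar_op(cigar):
    """Hypothetical helper: return (length, op) of the first CIGAR operation.

    Returns None if the string does not start with a valid operation.
    """
    match = _CIGAR_OP.match(cigar)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


leading_op = first_cigar_op("5S20=1X3=")
```

Here leading_op is (5, "S"), that is, five soft-clipped bases at the start of the alignment.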
pychopper.parasail_backend.pair_align(reference, query, query_name, subs_mat, params)
Perform pairwise local alignment using parasail-python
pychopper.parasail_backend.process_alignment(aln, query, query_name, aln_params)
Process an alignment, extracting score, start and end.
pychopper.parasail_backend.refine_locations(read, all_primers, locations, aln_params={'gap_extend': 1, 'gap_open': 1, 'match': 1, 'mismatch': -2}, subs_mat=<parasail.bindings_v2.Matrix object>)
Refine alignment edges based on local alignment
pychopper.report module
class pychopper.report.Report(pdf)
Bases: object
Class for plotting utilities on top of matplotlib. Plots are saved in the specified file through the PDF backend.
Parameters
• self -- object.
• pdf -- Output pdf.
Returns
The report object.
Return type
Report
close()
Close PDF backend. Do not forget to call this at the end of your script or your output will be damaged!
Parameters
self -- object
Returns
None
Return type
object
plot_arrays(data_map, title='', xlab='', ylab='', marker='.', legend_loc='best', legend=True, vlines=None, vlcolor='green', vlwitdh=0.5)
Plot multiple pairs of data arrays.
Parameters
• self -- object.
• data_map -- A dictionary with labels as keys and tuples of data arrays (x, y) as values.
• title -- Figure title.
• xlab -- X axis label.
• ylab -- Y axis label.
• marker -- Marker passed to the plot function.
• legend_loc -- Location of legend.
• legend -- Plot legend if True.
• vlines -- Dictionary with labels and positions of vertical lines to draw.
• vlcolor -- Color of vertical lines drawn.
• vlwidth -- Width of vertical lines drawn.
Returns
None
Return type
object
plot_bars_simple(data_map, title='', xlab='', ylab='', alpha=0.6, xticks_rotation=0, auto_limit=False)
Plot simple bar chart from input dictionary.
Parameters
• self -- object.
• data_map -- A dictionary with labels as keys and data as values.
• title -- Figure title.
• xlab -- X axis label.
• ylab -- Y axis label.
• alpha -- Alpha value.
• xticks_rotation -- Rotation value for x tick labels.
• auto_limit -- Set y axis limits automatically.
Returns
None
Return type
object
plot_histograms(data_map, title='', xlab='', ylab='', bins=50, alpha=0.7, legend_loc='best', legend=True, vlines=None)
Plot histograms of multiple data arrays.
Parameters
• self -- object.
• data_map -- A dictionary with labels as keys and data arrays as values.
• title -- Figure title.
• xlab -- X axis label.
• ylab -- Y axis label.
• bins -- Number of bins.
• alpha -- Transparency value for histograms.
• legend_loc -- Location of legend.
• legend -- Plot legend if True.
• vlines -- Dictionary with labels and positions of vertical lines to draw.
Returns
None
Return type
object
save_close()
Utility method to save and close figure.
pychopper.seq_utils module
pychopper.seq_utils.base_complement(k)
Return complement of base.
Performs the substitutions A<=>T, C<=>G, X=>X for both upper and lower case. The return value is identical to the argument for all other values.
Parameters
k -- A base.
Returns
Complement of base.
Return type
str
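The substitution rule described for base_complement can be sketched with a translation table. This is an illustration of the documented behaviour, not the pychopper source, and the name complement_base is hypothetical.

```python
# A<=>T, C<=>G and X=>X, for upper and lower case.
_COMPLEMENT = str.maketrans("ATCGXatcgx", "TAGCXtagcx")


def complement_base(k):
    """Hypothetical helper: complement a base, leaving other characters as-is."""
    # str.translate passes through any character that is not in the table.
    return k.translate(_COMPLEMENT)


examples = [complement_base(b) for b in "AcXN"]
```

The characters "A", "c", "X" and "N" map to "T", "g", "X" and "N": "N" is not in the table and is therefore returned unchanged.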
pychopper.seq_utils.errs_tab(n)
Generate list of error rates for qualities less than or equal to n.
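A Phred quality q corresponds to an error probability of 10^(-q/10), so such a table can be sketched as a list comprehension over q = 0..n. The helper name error_table is hypothetical; the values it produces agree with the leading entries of the default tab argument shown for mean_qual (1.0 at q = 0, 0.1 at q = 10).

```python
def error_table(n):
    """Hypothetical helper: error probabilities for Phred qualities 0..n."""
    # Index q of the returned list holds 10 ** (-q / 10).
    return [10 ** (q / -10) for q in range(n + 1)]


table = error_table(20)
```

Indexing the table by quality avoids recomputing the power for every base of every read.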
pychopper.seq_utils.get_primers(primers)
Load primers from fasta file
pychopper.seq_utils.get_runid(desc)
Parse out runid from sequence description.
pychopper.seq_utils.mean_qual(quals,
qround=False, tab=[1.0,
0.7943282347242815, 0.6309573444801932, 0.5011872336272722,
0.3981071705534972, 0.31622776601683794, 0.251188643150958,
0.19952623149688797, 0.15848931924611134,
0.12589254117941673, 0.1,
0.07943282347242814, 0.06309573444801933,
0.05011872336272722,
0.039810717055349734, 0.03162277660168379,
0.025118864315095794,
0.0199526231496888, 0.015848931924611134,
0.012589254117941675, 0.01,
0.007943282347242814, 0.00630957344480193,
0.005011872336272725,
0.003981071705534973, 0.0031622776601683794,
0.0025118864315095794,
0.001995262314968879, 0.001584893192461114,
0.0012589254117941675, 0.001,
0.0007943282347242813, 0.000630957344480193,
0.0005011872336272725,
0.00039810717055349735, 0.00031622776601683794,
0.00025118864315095795,
0.00019952623149688788, 0.00015848931924611142,
0.00012589254117941674,
0.0001, 7.943282347242822e-05, 6.309573444801929e-05,
5.011872336272725e-05, 3.9810717055349695e-05,
3.1622776601683795e-05,
2.5118864315095822e-05, 1.9952623149688786e-05,
1.584893192461114e-05,
1.2589254117941661e-05, 1e-05, 7.943282347242822e-06,
6.30957344480193e-06,
5.011872336272725e-06, 3.981071705534969e-06,
3.162277660168379e-06,
2.5118864315095823e-06, 1.9952623149688787e-06,
1.584893192461114e-06,
1.2589254117941661e-06, 1e-06, 7.943282347242822e-07,
6.30957344480193e-07,
5.011872336272725e-07, 3.981071705534969e-07,
3.162277660168379e-07,
2.5118864315095823e-07, 1.9952623149688787e-07,
1.584893192461114e-07,
1.2589254117941662e-07, 1e-07, 7.943282347242822e-08,
6.30957344480193e-08,
5.011872336272725e-08, 3.981071705534969e-08,
3.162277660168379e-08,
2.511886431509582e-08, 1.9952623149688786e-08,
1.5848931924611143e-08,
1.2589254117941661e-08, 1e-08, 7.943282347242822e-09,
6.309573444801943e-09, 5.011872336272715e-09,
3.981071705534969e-09,
3.1622776601683795e-09, 2.511886431509582e-09,
1.9952623149688828e-09,
1.584893192461111e-09, 1.2589254117941663e-09, 1e-09,
7.943282347242822e-10, 6.309573444801942e-10,
5.011872336272714e-10,
3.9810717055349694e-10, 3.1622776601683795e-10,
2.511886431509582e-10,
1.9952623149688828e-10, 1.584893192461111e-10,
1.2589254117941662e-10,
1e-10, 7.943282347242822e-11, 6.309573444801942e-11,
5.011872336272715e-11,
3.9810717055349695e-11, 3.1622776601683794e-11,
2.5118864315095823e-11,
1.9952623149688828e-11, 1.5848931924611107e-11,
1.2589254117941662e-11,
1e-11, 7.943282347242821e-12, 6.309573444801943e-12,
5.011872336272715e-12,
3.9810717055349695e-12, 3.1622776601683794e-12,
2.5118864315095823e-12,
1.9952623149688827e-12, 1.584893192461111e-12,
1.258925411794166e-12,
1e-12, 7.943282347242822e-13, 6.309573444801942e-13,
5.011872336272715e-13,
3.981071705534969e-13, 3.162277660168379e-13,
2.511886431509582e-13,
1.9952623149688827e-13, 1.584893192461111e-13])
Calculate the average basecall quality of a read. Receives the ASCII quality scores of a read and returns the average quality for that read: first convert the Phred scores to error probabilities, then calculate the average error probability, and finally convert the average back to the Phred scale.
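The three steps (Phred to probability, average, back to Phred) can be sketched as follows. This is not the pychopper source: the name mean_phred is hypothetical, a Phred+33 ASCII encoding is assumed, and the qround and tab arguments are not modelled.

```python
import math


def mean_phred(quals):
    """Hypothetical helper: mean quality of a Phred+33 encoded quality string."""
    # Step 1: convert each ASCII character to an error probability.
    probs = [10 ** (-(ord(c) - 33) / 10) for c in quals]
    # Step 2: average the error probabilities.
    mean_err = sum(probs) / len(probs)
    # Step 3: convert the average back to the Phred scale.
    return -10 * math.log10(mean_err)


uniform = mean_phred("+++")  # three bases at Q10
mixed = mean_phred("!+")     # one base at Q0, one at Q10
```

Averaging in probability space is dominated by low-quality bases: the mixed example gives roughly Q2.6, well below the arithmetic mean of the two scores (Q5).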
pychopper.seq_utils.random(size=None)
Return random floats in the half-open interval [0.0, 1.0). Alias for random_sample to ease forward-porting to the new random API.
pychopper.seq_utils.readfq(fp, sample=None, min_qual=None, rfq_sup={})
Function taken from https://github.com/lh3/readfq/blob/master/readfq.py. Much faster parsing of large files compared to Biopython.
pychopper.seq_utils.record_size(read, in_format='fastq')
Calculate record size.
pychopper.seq_utils.revcomp_seq(seq)
Reverse complement sequence record
pychopper.seq_utils.reverse_complement(seq)
Return reverse complement of a string (base) sequence.
Parameters
seq -- Input sequence.
Returns
Reverse complement of input sequence.
Return type
str
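A reverse complement can be sketched by complementing every base and then reversing the string. The helper below is an illustration with a hypothetical name (revcomp), not the pychopper source; characters outside the translation table are passed through unchanged.

```python
# Complement table for upper and lower case bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Hypothetical helper: reverse complement of a base sequence string."""
    # Complement each base, then reverse the result.
    return seq.translate(_COMPLEMENT)[::-1]


rc = revcomp("AACG")
```

Applying the operation twice returns the original sequence, which makes for a convenient sanity check.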
pychopper.seq_utils.writefq(r, fh)
Write read to fastq file
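A FASTQ record consists of four lines: a header starting with "@", the sequence, a "+" separator and the quality string. The sketch below writes one such record from a tuple with the same fields as pychopper.common_structures.Seq. The helper write_fastq_record is hypothetical, and the use of the Name field as the header is an assumption about how writefq behaves.

```python
import io
from collections import namedtuple

# Stand-in with the same field order as pychopper.common_structures.Seq.
Seq = namedtuple("Seq", "Id Name Seq Qual Umi")


def write_fastq_record(read, handle):
    """Hypothetical helper: write one four-line FASTQ record to an open handle."""
    handle.write("@{}\n{}\n+\n{}\n".format(read.Name, read.Seq, read.Qual))


# Invented example read written to an in-memory buffer instead of a file.
buf = io.StringIO()
write_fastq_record(Seq("r1", "r1 runid=abc", "ACGT", "IIII", None), buf)
record = buf.getvalue()
```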
pychopper.utils module
pychopper.utils.batch(iterable, size)
pychopper.utils.check_command(cmd)
pychopper.utils.check_min_hmmer_version(major, minor)
pychopper.utils.count_fastq_records(fname, size=128000000, opener=<built-in function open>)
pychopper.utils.hit2bed(hit, read)
pychopper.utils.parse_config_string(s)
Module contents
AUTHOR
ONT Applications Group
COPYRIGHT
2022, Oxford Nanopore Technologies Ltd.