Edlib  1.1.2.
Lightweight, super fast C/C++ library for sequence alignment using edit distance.
edlib.h File Reference

Main header file, containing all public functions and structures.

Classes

struct  EdlibAlignConfig
 Configuration object for edlibAlign() function.
 
struct  EdlibAlignResult
 

Macros

#define EDLIB_STATUS_OK   0
 
#define EDLIB_STATUS_ERROR   1
 
#define EDLIB_EDOP_MATCH   0
 Match.
 
#define EDLIB_EDOP_INSERT   1
 Insertion to target = deletion from query.
 
#define EDLIB_EDOP_DELETE   2
 Deletion from target = insertion to query.
 
#define EDLIB_EDOP_MISMATCH   3
 Mismatch.
 

Enumerations

enum  EdlibAlignMode { EDLIB_MODE_NW, EDLIB_MODE_SHW, EDLIB_MODE_HW }
 
enum  EdlibAlignTask { EDLIB_TASK_DISTANCE, EDLIB_TASK_LOC, EDLIB_TASK_PATH }
 
enum  EdlibCigarFormat { EDLIB_CIGAR_STANDARD, EDLIB_CIGAR_EXTENDED }
 

Functions

EdlibAlignConfig edlibNewAlignConfig (int k, EdlibAlignMode mode, EdlibAlignTask task)
 
EdlibAlignConfig edlibDefaultAlignConfig (void)
 
void edlibFreeAlignResult (EdlibAlignResult result)
 
EdlibAlignResult edlibAlign (const char *query, int queryLength, const char *target, int targetLength, const EdlibAlignConfig config)
 
char * edlibAlignmentToCigar (const unsigned char *alignment, int alignmentLength, EdlibCigarFormat cigarFormat)
 

Detailed Description

Main header file, containing all public functions and structures.

Author
Martin Sosic

Enumeration Type Documentation

§ EdlibAlignMode

Alignment methods - how should Edlib treat gaps before and after query?

Enumerator
EDLIB_MODE_NW 

Global method. This is the standard method. Useful when you want to find out how similar the first sequence is to the second sequence.

EDLIB_MODE_SHW 

Prefix method. Similar to the global method, but with a small twist: a gap at the query end is not penalized. That is, deleting elements from the end of the second sequence is "free". For example, for "AACT" and "AACTGGC" the edit distance is 0, because removing "GGC" from the end of the second sequence is free and does not count toward the total edit distance. This method is appropriate when you want to find out how well the first sequence fits at the beginning of the second sequence.

EDLIB_MODE_HW 

Infix method. Similar to the prefix method, but with one more twist: gaps at both the query start and the query end are not penalized. That is, deleting elements from the start and end of the second sequence is "free". For example, for "ACT" and "CGACTGAC" the edit distance is 0, because removing "CG" from the start and "GAC" from the end of the second sequence is free and does not count toward the total edit distance. This method is appropriate when you want to find out how well the first sequence fits at any part of the second sequence. For example, if your second sequence is a long text and your first sequence is a slightly scrambled sentence from that text, you can use this method to discover how scrambled it is and where it fits in the text. In bioinformatics, this method is appropriate for aligning a read to a sequence.
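The three methods differ only in which target gaps are free. The following is a minimal, self-contained sketch of that difference using a plain dynamic-programming table. It is not Edlib's implementation, which is far faster, and the names SketchMode and sketch_edit_distance are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical mode names for this sketch; they mirror, but are not, EdlibAlignMode. */
typedef enum { SKETCH_MODE_NW, SKETCH_MODE_SHW, SKETCH_MODE_HW } SketchMode;

static int min3(int a, int b, int c) {
    int m = a < b ? a : b;
    return m < c ? m : c;
}

/* Returns the edit distance of query against target under the given mode,
 * or -1 if memory allocation fails. */
int sketch_edit_distance(const char *query, int queryLength,
                         const char *target, int targetLength,
                         SketchMode mode) {
    assert(query != NULL && target != NULL);
    assert(queryLength >= 0 && targetLength >= 0);

    /* col[i] = cost of aligning query[0..i) with the target consumed so far. */
    int *col = malloc((size_t)(queryLength + 1) * sizeof *col);
    if (col == NULL) return -1;

    for (int i = 0; i <= queryLength; i++) col[i] = i;  /* no target consumed yet */
    int best = col[queryLength];  /* minimum of the last row over all columns so far */

    for (int j = 1; j <= targetLength; j++) {
        int diag = col[0];  /* value of row 0 in the previous column */
        /* Infix: skipping a target prefix is free. Otherwise it costs j. */
        col[0] = (mode == SKETCH_MODE_HW) ? 0 : j;
        for (int i = 1; i <= queryLength; i++) {
            int upLeft = diag;  /* row i-1, previous column */
            diag = col[i];      /* row i, previous column; kept for the next row */
            int cost = (query[i - 1] == target[j - 1]) ? 0 : 1;
            col[i] = min3(upLeft + cost, col[i] + 1, col[i - 1] + 1);
        }
        if (col[queryLength] < best) best = col[queryLength];
    }

    /* Global: the whole target must be consumed. Prefix and infix: the
     * alignment may end at any target position, so the target suffix is free. */
    int result = (mode == SKETCH_MODE_NW) ? col[queryLength] : best;
    free(col);
    return result;
}
```

With this sketch, "AACT" against "AACTGGC" gives 3 in global mode and 0 in prefix mode, and "ACT" against "CGACTGAC" gives 0 in infix mode, matching the examples above.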

§ EdlibAlignTask

Alignment tasks - what do you want Edlib to do?

Enumerator
EDLIB_TASK_DISTANCE 

Find edit distance and end locations.

EDLIB_TASK_LOC 

Find edit distance, end locations and start locations.

EDLIB_TASK_PATH 

Find edit distance, end locations, start locations and alignment path.

§ EdlibCigarFormat

Describes cigar format.

See also
http://samtools.github.io/hts-specs/SAMv1.pdf
http://drive5.com/usearch/manual/cigar.html
Enumerator
EDLIB_CIGAR_STANDARD 

Match: 'M', Insertion: 'I', Deletion: 'D', Mismatch: 'M'.

EDLIB_CIGAR_EXTENDED 

Match: '=', Insertion: 'I', Deletion: 'D', Mismatch: 'X'.

Function Documentation

§ edlibAlign()

EdlibAlignResult edlibAlign (const char *query, int queryLength, const char *target, int targetLength, const EdlibAlignConfig config)

Aligns two sequences (query and target) using edit distance (Levenshtein distance). Through the config parameter, this function supports different alignment methods (global, prefix, infix) as well as different modes of search (tasks). It always returns the edit distance and the end locations of the optimal alignment in target. If you choose the appropriate task, it also returns the start locations of the optimal alignment in target and the alignment path.

Parameters
[in] query  First sequence. Character codes should be in range [0, 127].
[in] queryLength  Number of characters in first sequence.
[in] target  Second sequence. Character codes should be in range [0, 127].
[in] targetLength  Number of characters in second sequence.
[in] config  Additional alignment parameters, like alignment method and wanted results.
Returns
Result of alignment, which can contain edit distance, start and end locations and alignment path. Make sure to clean up the object using edlibFreeAlignResult() or by manually freeing needed members.
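Because both sequences are expected to contain only character codes in the range [0, 127], it can help to check input before aligning it. The helper below is a hypothetical sketch and not part of Edlib. It assumes only that the sequence is passed as a plain char buffer together with its length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Edlib: returns 1 if every character
 * code in the sequence is in [0, 127], otherwise 0. */
int sketch_sequence_in_range(const char *sequence, int length) {
    assert(sequence != NULL || length == 0);
    for (int i = 0; i < length; i++) {
        /* Plain char may be signed, so compare through unsigned char. */
        if ((unsigned char)sequence[i] > 127) return 0;
    }
    return 1;
}
```

The cast matters on platforms where char is signed: a byte such as 0xC3 would otherwise read as a negative value and pass a naive "<= 127" comparison.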

§ edlibAlignmentToCigar()

char* edlibAlignmentToCigar (const unsigned char *alignment, int alignmentLength, EdlibCigarFormat cigarFormat)

Builds cigar string from given alignment sequence.

Parameters
[in] alignment  Alignment sequence. 0 stands for match. 1 stands for insertion to target. 2 stands for insertion to query. 3 stands for mismatch.
[in] alignmentLength  Number of elements in the alignment sequence.
[in] cigarFormat  Cigar will be returned in specified format.
Returns
Cigar string, null terminated. I stands for insertion and D stands for deletion. In extended format, X stands for mismatch and = stands for match. In standard format, M stands for both match and mismatch. The needed memory is allocated by the function and the returned pointer is set to it. Do not forget to free it later using free()!
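The mapping from alignment codes to cigar characters can be sketched as a run-length encoding. The code below is an illustrative, self-contained version written for this page, not Edlib's own implementation. The names SketchCigarFormat and sketch_alignment_to_cigar are hypothetical, and the code assumes every element of the alignment is one of the codes 0 to 3 described above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical format names for this sketch; they mirror EdlibCigarFormat. */
typedef enum { SKETCH_CIGAR_STANDARD, SKETCH_CIGAR_EXTENDED } SketchCigarFormat;

/* Builds a null-terminated cigar string from alignment codes
 * (0 = match, 1 = insertion to target, 2 = insertion to query, 3 = mismatch).
 * Returns NULL if allocation fails. The caller frees the result with free(). */
char *sketch_alignment_to_cigar(const unsigned char *alignment,
                                int alignmentLength,
                                SketchCigarFormat format) {
    static const char standardOps[] = { 'M', 'I', 'D', 'M' };
    static const char extendedOps[] = { '=', 'I', 'D', 'X' };
    const char *ops = (format == SKETCH_CIGAR_EXTENDED) ? extendedOps : standardOps;

    assert(alignmentLength >= 0);
    /* A run of length L needs at most L + 1 <= 2L characters, so
     * 2 * alignmentLength characters plus the terminator always suffice. */
    char *cigar = malloc((size_t)alignmentLength * 2 + 1);
    if (cigar == NULL) return NULL;

    size_t pos = 0;
    int i = 0;
    while (i < alignmentLength) {
        assert(alignment[i] <= 3);
        char op = ops[alignment[i]];
        int run = 1;
        /* Extend the run while the next code maps to the same character.
         * In standard format this merges matches and mismatches into one M run. */
        while (i + run < alignmentLength && alignment[i + run] <= 3 &&
               ops[alignment[i + run]] == op) {
            run++;
        }
        pos += (size_t)sprintf(cigar + pos, "%d%c", run, op);
        i += run;
    }
    cigar[pos] = '\0';
    return cigar;
}
```

For the alignment 0, 0, 3, 1, 1, 2 this sketch produces "2=1X2I1D" in extended format and "3M2I1D" in standard format.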

§ edlibDefaultAlignConfig()

EdlibAlignConfig edlibDefaultAlignConfig ( void  )
Returns
Default configuration object, with the following defaults: k = -1, mode = EDLIB_MODE_NW, task = EDLIB_TASK_DISTANCE.

§ edlibFreeAlignResult()

void edlibFreeAlignResult ( EdlibAlignResult  result)

Frees memory in EdlibAlignResult that was allocated by edlib. If you do not use it, make sure to free needed members manually using free().

§ edlibNewAlignConfig()

EdlibAlignConfig edlibNewAlignConfig (int k, EdlibAlignMode mode, EdlibAlignTask task)

Helper method for easy construction of configuration object.

Returns
Configuration object filled with given parameters.