De Bruijn Graph Algorithm
1. K-mers
k-mers: A substring with length k, where k is a postive interger.
For example:
String: ATCGAT
4-mers: ATCG
TCGA
CGAT
2. De Bruijn Graph
A De Bruijn Graph is a directed graph that represents the overlap between strings. To generate a De Bruijn Graph, two inputs are required:
- reads: a list of strings, the target of assembly.
- k: a positive interger, it is the k-mer size used in the De Bruijn Graph
3. De Bruijn Graph Generation Example
Here is an example of the De Bruijn Graph generation, given the follow inputs.
reads = [ABCDEF, BCDEFG]
k = 4
Step 1
Copmuter 4-mers from every read.
Step 2
Spilt each 4-mer to a left 3-mer and a right 3-mer
Step 3
Draw a directed edge from each left 3-mer to the corresponding right 3-mer.
Step 4
Construct the De Bruijn Graph by merging every node that have the same 3-mer.