% This file, DNAdesign/notes-examples, is provided % to help familiarize you with this extremely preliminary % release of my DNA sequence design software. % % We will step by step construct sequences for the DAO A&B tiles % of the type used in my Nature paper. % % The DAO in Winfree,Liu,Wenzler,Seeman were designed by a previous % version of this software. The DAE were designed by Seeman's SEQUIN. % However, this file shows how similar sequences could be designed % using this software. % % Most recently tested with MATLAB 6.5.1 (6/18/04) % % DNAdesign package (Winfree lab, Caltech) % For folks in my lab and other users of the Caltech DNA Group cluster: % % In MATLAB, execute "addpath /research/src/DNAdesign", as below, where % the MATLAB files aleady exist. Also, the relevant files are already % compiled there, so you don't need to do that. % For other folks who % All the files in DNAdesign should be unpacked into a directory % of that name, possibly as a subdirectory of your working DNA design % directory. Matlab will need to know where all the *.m files are, % so that directory must be added into the path. Here, I use the % DNA cluster shared directory; I've commented out using my own local % copy. addpath /research/src/DNAdesign % We should compile the speed-critical functions. % Since the .mex files are put in the current working directory, % we cd to where we want them to stay, i.e. the DNAdesign directory. % Comments here are w/r to the MATLAB 6.5.1 R13 compiler. % (NOTE: if your computers don't have the MATLAB compiler, % skip the mcc commands. Everything will run 10x to 100x slower....) %% NOTE: MATLAB dies a horrible death if it tries to run a mexglx file %% from a directory found via an indirect path (e.g. '../DNAdesign'). %% Consequently, it is VERY IMPORTANT to use a full path name, above. %% Here, I compile in the DNA cluster /research directory. You shouldn't have %% to do this if you use the DNA cluster; it is already compiled for you. %% [This comment may only apply to MATLAB v 6.0, compiler v 2.1. %% Untested since then.] if 0 % This only needs to be done once. It's already done on the DNA cluster. workpath=pwd; cd /research/src/DNAdesign mcc -x -h spurious mcc -x -h spurious1 mcc -x -h spuriousW mcc -x -h dotplot mcc -x -h constrain mcc -x -h randbase mcc -x -h helix_dG mcc -x -h stickyends mcc -x -h verboten mcc -x -h make_bad6mer mcc -x -h score_spurious mcc -x -h score_eg mcc -x -h score_TX %% note: for debugging, it's best not to compile constraints.m bad6mer=make_bad6mer; save Bad6mer bad6mer cd(workpath) end % To understand stuff below, it helps to use help! % All the functions in this package have help information describing % their data structures and what they do. E.g. 'help constraints' % The four strands for each complex: % Sequence restrictions for each strand -- we hard code the junctions. DAOseqA = ['NNNNNNNNNNNCACCNNNNNNNNNNN ' ... 'NNNNNNGGACNNNNNNNNNNNNCACCNNNNNNNNNNNNCCTGNNNNNN ' ... 'NNNNNNCCTGNNNNNNNNNNNNGTGGNNNNNNNNNNNNGGACNNNNNN ' ... 'NNNNNNNNNNNGTGGNNNNNNNNNNN']; DAOseqB = ['NNNNNNNNNNNGGTGNNNNNNNNNNN ' ... 'NNNNNNCAGGNNNNNNNNNNNNGGTGNNNNNNNNNNNNGTCCNNNNNN ' ... 'NNNNNNGTCCNNNNNNNNNNNNCCACNNNNNNNNNNNNCAGGNNNNNN ' ... 'NNNNNNNNNNNCCACNNNNNNNNNNN']; % Define which ds helices exist. (see 'help constraints') DAOhelixA = [1 6 2 48 8; 3 25 2 40 16; 3 41 4 13 8; ... 2 1 1 21 8; 2 9 3 24 16; 4 14 3 8 8]; DAOhelixB = DAOhelixA; % Put it together into one spec with 8 strands and requirements for % the sticky-end complementarity. % Note the ' ' to separete the first DX from the second. % Note that the strands in B get renumbered when we put all 8 together. DAOseq = [DAOseqA ' ' DAOseqB ]; DAOhelixB = ones(6,1)*[4 0 4 0 0] + DAOhelixB ; DAOhelices = [ DAOhelixA; DAOhelixB ]; DAOsticky = [ 1 1 8 5 5; 1 22 8 26 5; 4 1 5 5 5; 4 22 5 26 5 ]; % Build the required sequence template (St), Watson-Crick base pairing (wc) % and base equality (eq) data structures. [St, wc, eq] = constraints(DAOseq, [DAOhelices; DAOsticky], []); displayDX(St, DAOhelixA, [ 1 2 3; 4 5 6 ] ) displayDX(St, DAOhelixB, [ 1 2 3; 4 5 6 ] ) % Now play around with the data structures a bit, just for fun. % Assign random bases according to the template. Note lack of complementarity. rS=randbase(St); displayDX(rS, DAOhelixA, [ 1 2 3; 4 5 6 ] ) % Enforce the base-pairing and equality rules S=constrain(rS,wc,eq); displayDX(S, DAOhelixA, [ 1 2 3; 4 5 6 ] ) displayDX(S, DAOhelixB, [ 1 2 3; 4 5 6 ] ) % save the data structures, so we don't have to go through this again. save DAOexample S St wc eq DAOhelixA DAOhelixB DAOhelices DAOseqA DAOseqB DAOseq DAOsticky % How good is the random sequence S? Diagonals show potential WC domains. % Mwc shows the actual sequence, Mbp shows only the required base-pairings. [Mwc,Mbp] = dotplot(S,wc,eq,Inf); figure(1); imagesc(1-exp(-Mwc/16)); figure(2); imagesc(1-exp(-Mbp/16)); % The help notes tell you what this means. How many places can % a 6-mer find a complementary subsequence that was NOT intended? % (i.e. how many unintended 6-long diagonals are there?) % (here, we do 3-mers to 8-mers, and consider intramolecular, intermolecular, % and intercomplex separately) spurious(S, 3,8, wc,eq) % As before, but allowing internal 1 mismatch (not counting perfect matches) spurious1(S, 3,8) % spuriousW doesn't give you the full breakdown, but it allows you % to make some parts of the complex more important that others % (e.g. junctions) ... however we don't do that here, to compare with above. sum(spurious(S, 3,8, wc,eq)') spuriousW(S, 3,8, wc, eq, ones(1,length(S)), [1 0 0]/2) spuriousW(S, 3,8, wc, eq, ones(1,length(S)), [0 1 0]/2) spuriousW(S, 3,8, wc, eq, ones(1,length(S)), [0 0 1]/2) % A typical scoring system. weights = 4*(St=='N') + 12*(St~='N'); % junctions cost more! prefactors = [2^(-8) 2^(-9) 2^(-12)]; % roughly same as Nature tiles spuriousW(S, 3,8, wc, eq, weights, prefactors) % What do these scores mean? They don't have a rigorous connection % to the probability of getting the correct DNA fold, unfortunately. % So one has to use intuition to determine the relative weighting % of various factors to set up in your scoring function. % 'score_eg.m' shows one set of choices, reasonable for the DAO tile % design. Make a new 'score_mine.m' and modify it for your % design task. % Now we'll do the stochastic optimization. % several variable need to be made global, so functions can read them clear global bestS DAOsticky bad6mer load Bad6mer load DAOexample optimize(S,St,wc,eq, 'score_eg') % hit control-C to stop it when you're bored... e.g. a few hours. % How well did it do? spurious(S, 3,8, wc,eq) spurious(bestS, 3,8, wc,eq) spurious1(S, 3,8) spurious1(bestS, 3,8) displayDX(bestS, DAOhelixA, [ 1 2 3; 4 5 6 ] ) displayDX(bestS, DAOhelixB, [ 1 2 3; 4 5 6 ] ) figure(1); [Mwc,Mbp] = dotplot(S,wc,eq,Inf); subplot(2,2,1); imagesc(log(1+Mwc)); subplot(2,2,2); imagesc(log(1+Mbp)); [Mwc,Mbp] = dotplot(bestS,wc,eq,Inf); subplot(2,2,3); imagesc(log(1+Mwc)); subplot(2,2,4); imagesc(log(1+Mbp)); %% The lattice-like scattering of matches in Mbp are due to the fixed %% (non-variable) sequences defined at the crossovers, e.g. GGAC. stickyends(S,DAOsticky) stickyends(bestS,DAOsticky) % to continute optimizing, optimize(bestS,St,wc,eq, 'score_eg') % Have Fun! ( Can you make a DAE? )