Alternative assembly implementation #165

manulera · 2023-12-05T17:49:42Z

Hi @BjornFJohansson I had started working on a similar refactoring to #157 for the assemblies (find which assemblies can be made with one function, execute them with another). To get this to work, I had to change things significantly, so my implementation has become quite different. I still use a graph, but the meaning of edges is different. I have not implemented the Contig functionality, but I could do something similar, if you think it would make sense to include this assembly in pydna. For now, it is in the ShareYourCloning repo.

You can find the implementation in this_file. It passes all the tests that pydna's assembly passes, excluding the order in which fragments are returned, and that it only returns unique assemblies (see test file).

If you want to have a look, I would start from the docstring of Assembly, and then add_edges_from_match to understand how it works. I think it's quite simplified from the original implementation, for example, the construction of the graph is very short (the function add_edges_from_match is quite short as well):

def __init__(self, frags: list[_Dseqrecord], limit=25, algorithm=common_sub_strings, use_fragment_order=True, use_all_fragments=False):
    # TODO: allow for the same fragment to be included more than once?
    G = _nx.MultiDiGraph()
    # Add positive and negative nodes for forward and reverse fragments
    G.add_nodes_from((i + 1, {'seq': f}) for (i, f) in enumerate(frags))
    G.add_nodes_from((-(i + 1), {'seq': f.reverse_complement()}) for (i, f) in enumerate(frags))

    # Iterate over all possible combinations of fragments
    edge_pairs = _itertools.combinations(filter(lambda x : x>0, G.nodes), 2)
    for index_first, index_secnd in edge_pairs:
        first = G.nodes[index_first]['seq']
        secnd = G.nodes[index_secnd]['seq']

        # Overlaps where both fragments are in the forward orientation
        matches_fwd = algorithm(str(first.seq).upper(), str(secnd.seq).upper(), limit)
        for match in matches_fwd:
            add_edges_from_match(match, index_first, index_secnd, first, secnd, G)

        # Overlaps where the first fragment is in the forward orientation and the second in the reverse orientation
        matches_rvs = algorithm(str(first.seq).upper(), reverse_complement(str(secnd.seq).upper()), limit)
        for match in matches_rvs:
            add_edges_from_match(match, index_first, -index_secnd, first, secnd, G)

Let me know what you think.

The text was updated successfully, but these errors were encountered:

BjornFJohansson · 2023-12-14T10:31:49Z

OK, lots to digest. In principle I don't mind putting this alongside and eventually replace the existing
code. Ill read it carefully and get back to you.

manulera · 2023-12-15T12:06:06Z

Yes, actually the code has slightly changed since, although it is still in the same spirit. I am currently using that new assembly module to do PCR, ligations (also partial) and homologous recombination. I will adapt for gibson and golden gate as well. I am just writing slightly different algorithm functions for each case.

If you think it can be useful, I can give you a short overview on zoom before you dive into it.

manulera · 2024-01-16T15:37:11Z

Hi @BjornFJohansson, when you merge the branches you mentioned in the call into cutsite_pairs, let me know and I can add the new assembly implementation.

manulera assigned BjornFJohansson and manulera Dec 5, 2023

manulera added the question label Dec 5, 2023

manulera mentioned this issue Dec 6, 2023

Minimal information to represent an assembly manulera/ShareYourCloning_backend#51

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Alternative assembly implementation #165

Alternative assembly implementation #165

manulera commented Dec 5, 2023

BjornFJohansson commented Dec 14, 2023

manulera commented Dec 15, 2023

manulera commented Jan 16, 2024

Alternative assembly implementation #165

Alternative assembly implementation #165

Comments

manulera commented Dec 5, 2023

BjornFJohansson commented Dec 14, 2023

manulera commented Dec 15, 2023

manulera commented Jan 16, 2024