Fix identifier storage in Alignment class #289

thomashopf · 2023-03-30T15:43:55Z

Currently, memory usage is determined by the longest identifier because identifiers are stored in a numpy array. This can create a large overhead if one header is much longer than the others, and numpy functionality is not that relevant for identifiers anyway.

Ideally, replace with a pd.Series to keep the slicing functionality while making use of pandas' better string memory management.
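
To illustrate the overhead, here is a minimal sketch (not the actual Alignment code; the identifiers and variable names are made up for the example). A numpy unicode array uses a fixed width equal to the longest string, so a single long header inflates every entry, whereas an object-dtype pd.Series stores each string separately:

```python
import numpy as np
import pandas as pd

# Hypothetical identifiers: 1000 short ones plus a single very long header
ids = ["seq%d" % i for i in range(1000)] + ["x" * 500]

# numpy picks a fixed-width unicode dtype sized to the longest string,
# and every element occupies that full width (4 bytes per character)
arr = np.array(ids)
numpy_bytes = arr.nbytes

# An object-dtype Series holds references to individual Python strings,
# so short identifiers stay small (dtype=object is set explicitly here)
ser = pd.Series(ids, dtype=object)
series_bytes = ser.memory_usage(deep=True, index=False)

print(arr.dtype.itemsize, numpy_bytes, series_bytes)

# Positional slicing still works on the Series
subset = ser.iloc[[0, 2]]
```

The exact Series footprint depends on the Python and pandas versions, but in this setup it should be well below the numpy figure.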

@aaronkollasch

thomashopf · 2023-03-30T15:47:19Z

Also add an option to the from_file method to split identifiers on the first whitespace (off by default).
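
A rough sketch of what the splitting could look like. The helper name `split_header` and its `split_on_whitespace` parameter are hypothetical and only illustrate the behaviour; they are not the signature of from_file:

```python
def split_header(header, split_on_whitespace=False):
    """Return the identifier part of a sequence header.

    Hypothetical helper for illustration only. With the option off
    (the default), the header is returned unchanged.
    """
    if not split_on_whitespace:
        return header
    # str.split() without a separator splits on any whitespace run;
    # maxsplit=1 means only the first token is separated off
    parts = header.split(maxsplit=1)
    # Empty or whitespace-only headers have no tokens; return them as-is
    return parts[0] if parts else header


full = "sp|P12345|NAME_HUMAN Some protein description"
kept = split_header(full)
trimmed = split_header(full, split_on_whitespace=True)
```

Keeping the option off by default preserves the existing behaviour for callers that rely on full headers as identifiers.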

thomashopf added the enhancement label Mar 30, 2023

thomashopf self-assigned this Mar 30, 2023

thomashopf added a commit that referenced this issue May 11, 2023

Fix alignment identifier memory usage, add header splitting, fixes #289

5cd4033
