Enhance support for non-ncs files #18
Comments
I would like to tackle this issue. I have been thinking about it and, as a first step, I thought that a base interface class from which all other interfaces (ncs, mat, etc.) inherit could help guarantee that all data files share a common interface. This way, the main scripts don't have to know whether data is stored in mat or ncs files. What do you think?
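A minimal sketch of what such a base interface could look like. All names here (`DataFile`, `InMemoryFile`, `describe`, and the `read` signature) are hypothetical and not part of Combinato; the toy backend only stands in for real ncs or mat readers.

```python
from abc import ABC, abstractmethod


class DataFile(ABC):
    """Hypothetical common interface for one recording channel."""

    @property
    @abstractmethod
    def sampling_rate(self):
        """Sampling rate in Hz."""

    @property
    @abstractmethod
    def channel_name(self):
        """Name of the recording channel."""

    @abstractmethod
    def read(self, start, stop):
        """Return (samples, timestamps_ms) for sample indices [start, stop)."""


class InMemoryFile(DataFile):
    """Toy backend, used only to show that backends are interchangeable."""

    def __init__(self, samples, sampling_rate, channel_name):
        self._samples = list(samples)
        self._sampling_rate = sampling_rate
        self._channel_name = channel_name

    @property
    def sampling_rate(self):
        return self._sampling_rate

    @property
    def channel_name(self):
        return self._channel_name

    def read(self, start, stop):
        samples = self._samples[start:stop]
        step_ms = 1000.0 / self._sampling_rate
        # Timestamps in milliseconds, derived from the sample index.
        times = [i * step_ms for i in range(start, start + len(samples))]
        return samples, times


def describe(data_file):
    """Stand-in for a main script: it relies only on the base interface."""
    return "{} @ {} Hz".format(data_file.channel_name, data_file.sampling_rate)
```

With this layout, a script like `describe` never needs to check the file extension; an ncs or mat backend would only have to implement the same three members.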
It's great that you want to do this. Thinking about software design in general, I think it is a great idea to start with a base class as you proposed. What I'm not sure about is whether it makes more sense to first collect all the places where the existence of ncs-files is assumed. For example, I just discovered that even css-plot-extracted assumes that there are ncs-files to read the header information from. These places don't want to read data, they want to have some information about the names or headers of the ncs files. This leads to the issue of meta-information. In several places, it's necessary to know the sampling rates and channel names. One option would be to create a simple text file in each data folder to store this information. It's also possible to use the attributes of h5 files, but then it's impossible to read this information without invoking python.
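As a rough illustration of the text-file option, the sketch below stores the sampling rate and channel name in a small JSON file inside the data folder. The file name `channel_info.json`, the key names, and both helper functions are assumptions made for this example; nothing like this exists in Combinato as far as this thread shows.

```python
import json
import os

# Hypothetical file name; one such file would live in each data folder.
META_NAME = "channel_info.json"


def write_meta(folder, channel_name, sampling_rate_hz):
    """Write the metadata of one data folder as human-readable JSON."""
    meta = {"channel_name": channel_name,
            "sampling_rate_hz": sampling_rate_hz}
    path = os.path.join(folder, META_NAME)
    with open(path, "w") as fid:
        json.dump(meta, fid, indent=2, sort_keys=True)
    return path


def read_meta(folder):
    """Read the metadata back without touching any ncs or h5 file."""
    with open(os.path.join(folder, META_NAME)) as fid:
        return json.load(fid)
```

A file like this can be opened in any text editor, which is the main advantage over h5 attributes mentioned above.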
Maybe we could look at it the other way around: where is it more comfortable for python to have the metadata? Having them in a text file rather than in the h5 only makes sense if a human is supposed to read it; I think it makes more sense that this information is gathered in one place and converted into human-readable form if and when a human needs to read it. The information could also be converted into different formats (e.g. txt, html, xml, GUI, etc.).
You are right, once the metadata is somewhere, it's easy to convert it. So far, I have had good experiences with text files (the header of ncs files is basically a text file, look at combinato/basics/nlxio.py, function ncs_nfo), but h5 attributes work, too. At least it would be no problem to change css-extract in such a way that it stores the header data in the h5 files when creating them. I already use h5 attributes in combinato/manager/create_session.py, function create_session. The important part would then be to rewrite all scripts that read header information, so that they use the information stored in the h5 files. If you decide to write a uniform interface, that could of course be part of the interface.
I was modifying css-find-concurrent, and one question occurred to me: are the times expressed in seconds or in milliseconds?
The timestamps that NcsFile.read returns are in microseconds. Everything else is in milliseconds. You might have noticed that concurrent.py divides timestamps by 1000 in two places. This is to convert from the ncs format to our h5 files.
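The conversion described here amounts to a single division. The helper below is only a sketch: the name `ncs_to_ms` and the constant are hypothetical, and whether concurrent.py uses true or integer division is not stated in this thread (true division is assumed here).

```python
# Microseconds per millisecond: ncs timestamps are in microseconds,
# everything else is in milliseconds.
US_PER_MS = 1000


def ncs_to_ms(timestamps_us):
    """Convert ncs timestamps (microseconds) to milliseconds."""
    return [t / US_PER_MS for t in timestamps_us]
```

Putting the factor in one named place would avoid repeating the bare `1000` in several scripts.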
Ok, thanks!
One question: I have seen that you sometimes use the tag "AcqEntName" from NCS files. As I'm not familiar with Neuralynx hardware, I wanted to ask what exactly that tag is and whether it is necessary when using .ncs files (e.g. you have many files with different "AcqEntName"s for each channel). In case it is not fundamental, I would drop it and only use the essential information (basically just the channel) to build filenames.
AcqEntName is the Neuralynx name for Acquisition Entity Name, basically the name of the recording channel, which can be different from the filename. If I had thought more carefully in the beginning, I would have created meta-data during the extraction of spikes. The only script that should be allowed to refer to AcqEntName should be css-extract when dealing with ncs files - but I didn't think hard enough about it when I programmed it.
Have you ever thought about using existing tools that deal with integrating different formats for neurophysiology data? Neo sounds like a useful tool here, since it is python based: https://pythonhosted.org/neo/
Hi @bergem1t,
Short answer: Very important idea for improvement, but I have no resources to fix it now.
Long answer: File handling in Combinato could clearly benefit from a thorough redesign. The details vary for the different
This includes getting rid of all the small pieces of code like glob('*.ncs') or fname[:3] and so on. Thanks to @eann for pointing this out!
Also see the wiki
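One way to retire scattered calls such as glob('*.ncs') would be a single helper that every script uses to list data files. The sketch below is an assumption-laden illustration: the function name `find_data_files` and the pattern list are hypothetical, and the set of formats Combinato should accept is not settled in this thread.

```python
import fnmatch
import os

# Assumed set of supported patterns; extend in one place when a format is added.
DATA_PATTERNS = ("*.ncs", "*.mat", "*.h5")


def find_data_files(folder, patterns=DATA_PATTERNS):
    """Return the sorted names of all files in folder matching any pattern."""
    names = sorted(os.listdir(folder))
    return [name for name in names
            if any(fnmatch.fnmatch(name, pat) for pat in patterns)]
```

Scripts would then call `find_data_files(folder)` instead of hard-coding one extension.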