In early setups, researchers used a single electrode—essentially, a pin attached to a computer chip—to record from single cells in brain tissue grown in a Petri dish or from the brains of freely moving mice or rats. But such recordings don’t capture the crosstalk between neurons that is fundamental to the complex brain processes that underlie cognitive functions, such as learning and memory. Ever more sophisticated probes being developed can simultaneously record electrical activity from hundreds or even thousands of neurons in animals as they go about their normal activities.
Electrophysiology in neuroscience is “blooming,” says Pierre Yger, a computational neuroscientist at the French National Institute of Health and Medical Research in Paris. But as researchers seek to put an increasing number of electrodes into ever-smaller regions of the brain, determining which neuron is “saying what, when” becomes a challenge, he says. Large arrays of probes with multiple recording channels also produce massive amounts of data—a 30-minute recording from 4,000 channels generates half a terabyte, for example.
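As a rough check on that figure, the arithmetic can be sketched as follows. The 30 kHz sampling rate and 2 bytes per sample are assumptions typical of extracellular recording hardware, not values given in the article, and the function name is purely illustrative.

```python
def recording_size_bytes(channels, minutes, sample_rate_hz=30_000, bytes_per_sample=2):
    """Raw size of a continuous multichannel recording.

    sample_rate_hz and bytes_per_sample are assumed values, not from the article.
    """
    seconds = minutes * 60
    return channels * sample_rate_hz * bytes_per_sample * seconds


size = recording_size_bytes(channels=4_000, minutes=30)
print(f"{size / 1e12:.2f} TB")  # 0.43 TB
```

Under those assumptions the recording comes to roughly 0.43 terabytes, in line with the half-terabyte estimate.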
Analyzing the resulting electrode signals is called spike sorting because it requires organizing spikes in electric potential, recorded from just outside each neuron, by shape. A neuron’s characteristic spike shape depends on several factors, explains applied mathematician Rodrigo Quian Quiroga of the University of Leicester in the U.K. These include the structure of the neuron’s branched extensions, called dendrites, and the neuron’s distance to and relative positioning around the recording electrode. Clustering spikes by shape allows researchers to assign them to their respective neurons, and thereby determine neurons’ firing patterns and locations. Doing this by eye, however, is difficult and time-consuming, which is why Yger, Quiroga, and others are developing software that can automate the entire process, saving researchers weeks or even years of sifting through spike data.
These computational efforts face several challenges. For one thing, explains Yger, the spike signals are noisy, making it hard to create an algorithm to characterize them. Additionally, the neurons move, or drift, relative to the electrodes during recording, which can put any given neuron out of range of the electrode that was recording it, and in the range of a different electrode. Such shifts can change the shape of spikes, making it harder to link the electrical signals to their neurons of origin. But tools that can tease apart a multitude of such signals at once could pave the way not only for new neuroscience insights, but also for vastly improved prosthetics and neural implants.
Here are several freely available software solutions for analyzing neuronal activity.
Waves for Spikes
A spike-sorting algorithm must perform three basic steps: detect spikes, extract distinct features of spikes, and cluster the spikes by the identified features. In 2004, Quiroga and his colleagues first introduced Wave_clus, a spike-sorting system that relies on two components (Neural Comput, 16:1661-87). The first is a mathematical tool called wavelets, which extract information from signals such as the neuronal spikes, and the second is a clustering algorithm that draws on ideas from statistical mechanics to group together spikes with similar shapes.
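The first of those steps, detection, is commonly done by flagging samples whose amplitude rises well above the background noise. The sketch below is one minimal way to do that, assuming a robust noise estimate based on the median of the absolute signal. The function name, the multiplier k, and the skip-ahead window are illustrative choices, not settings taken from any of the programs described here.

```python
import math
from statistics import median


def detect_spikes(signal, k=5.0, refractory=30):
    """Return sample indices where |signal| peaks above a robust threshold.

    Noise is estimated as median(|x|) / 0.6745, which is less inflated by the
    spikes themselves than a plain standard deviation would be.
    k and refractory (in samples) are illustrative values.
    """
    sigma = median(abs(x) for x in signal) / 0.6745
    threshold = k * sigma
    peaks = []
    i, n = 0, len(signal)
    while i < n:
        if abs(signal[i]) > threshold:
            # Take the largest sample in a short window as the spike time.
            window_end = min(i + refractory, n)
            peaks.append(max(range(i, window_end), key=lambda j: abs(signal[j])))
            i = window_end  # skip ahead so one spike is counted only once
        else:
            i += 1
    return peaks


# Toy trace: a small oscillation stands in for noise, with three large deflections.
trace = [math.sin(0.3 * i) for i in range(3000)]
for t in (500, 1500, 2500):
    trace[t] -= 20.0
print(detect_spikes(trace))  # [500, 1500, 2500]
```

Everything below the threshold is ignored at this stage, which is the sense in which a detector of this kind discards the rest of the data.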
The program first sifts through raw data and identifies major changes in the amplitude of the electrical signal; it labels those as spikes and discards the rest of the data. Then, using wavelet analysis, the spike shapes are characterized and fed into the clustering algorithm, which assigns each spike to a neuron and reveals when each neuron fired. In a final step, researchers must comb through the spike-sorting results and, by eye, search for mistakes—clusters that don’t look quite right, missing spikes, or false positives. Mistakes made by the algorithm and missed by the researcher reduce the reproducibility of results.
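To give a feel for what characterizing a spike shape with wavelets means, here is a minimal sketch of a Haar wavelet decomposition, the simplest wavelet family. It is an illustration only: the function name is hypothetical, and the selection of informative coefficients and the clustering stage that follow in a real sorter are not shown.

```python
import math


def haar_transform(samples):
    """Full Haar wavelet decomposition of a sequence whose length is a power of two.

    Returns coefficients ordered coarse to fine: the overall approximation
    first, then detail coefficients at progressively finer time scales.
    """
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = list(samples)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Each pair yields a scaled sum (approximation) and a scaled difference (detail).
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        details = detail + details
    return approx + details


coeffs = haar_transform([4, 2, 5, 5])
print([round(c, 3) for c in coeffs])  # [8.0, -2.0, 1.414, 0.0]
```

Each coefficient describes the waveform at a particular time and scale, so two spikes that differ only in, say, the width of their trough will differ in a handful of coefficients. Those few numbers, rather than the full waveform, can then be passed to a clustering algorithm.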
In the 14 years since its release, Wave_clus has undergone several updates that have improved its speed and automated some steps, reducing the time a user must spend correcting the algorithm’s mistakes. However, the software is not fully automated, and it can’t yet correct for drift. “There’s still a huge amount of room for improvement,” Quiroga says.
Sorting Kilos of Data
Software: KiloSort
Lead developer: Marius Pachitariu, Group Leader, HHMI Janelia Research Campus
Last year, Pachitariu and collaborators described a new, ultrathin silicon probe called Neuropixels that records from hundreds of neurons across multiple brain regions simultaneously in rodents (Nature, 551:232-36, 2017). Researchers hope that the probe can reveal the networks that connect neuronal firing patterns in distant parts of the rodent brain. But its unprecedented bandwidth—previous probes could record from just dozens of neurons—complicates spike sorting by adding vast amounts of data to the analysis.
To address the analytical complexity introduced by the tens of millions of spikes Neuropixels generates, Pachitariu developed and released KiloSort in 2016 (bioRxiv, doi:10.1101/061481). The spike-sorting system works by stamping out a pattern, or template, for every spike based on its features, then sifting through the raw data to identify spikes that match each template and stripping those from the noise. The algorithm scours the raw data repeatedly to find spike snippets based on the templates. No data are ever discarded, as they are in other algorithms, Pachitariu says.
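The idea of matching a template against raw data and stripping out each match can be sketched on a single channel. This is a generic greedy template-matching loop, not KiloSort's implementation, which learns its templates from the data and runs on graphics processors. The function name and the min_score cutoff are hypothetical, and the template is assumed to be known in advance.

```python
def match_and_subtract(trace, template, min_score=0.5):
    """Greedily find occurrences of `template` in `trace` and subtract them.

    On each pass the offset with the largest least-squares amplitude is chosen
    and the scaled template is subtracted from the residual, so the raw trace
    itself is never thrown away. Returns (sorted_offsets, residual).
    """
    residual = [float(x) for x in trace]
    m = len(template)
    norm = sum(t * t for t in template)
    found = []
    while True:
        best_offset, best_amp = None, 0.0
        for offset in range(len(residual) - m + 1):
            dot = sum(residual[offset + k] * template[k] for k in range(m))
            amp = dot / norm  # best-fitting scale of the template at this offset
            if amp > best_amp:
                best_offset, best_amp = offset, amp
        if best_offset is None or best_amp < min_score:
            break
        for k in range(m):
            residual[best_offset + k] -= best_amp * template[k]
        found.append(best_offset)
    return sorted(found), residual


template = [0, -1, -4, -2, 1, 1]
trace = [0] * 40
for start in (5, 20):
    for k, value in enumerate(template):
        trace[start + k] += value
offsets, residual = match_and_subtract(trace, template)
print(offsets)  # [5, 20]
```

In this noiseless toy case the residual is left at zero once both spikes are removed. With real data, what remains after subtraction is the noise plus any spikes that no template has yet explained.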
It’s also fast. An analysis that would take two weeks to conduct with other software programs can be run in 30 minutes with KiloSort. The program runs on graphics processing units and uses the programming language MATLAB installed on the computers doing the analysis, so the spike sorting can be done on-site. On the other hand, it can’t be done in the cloud, potentially complicating access to the software. But the biggest issue is, again, drift, Pachitariu says. “If we can fix drift, that’s when these software programs will become fully automated.”
Spying on Spikes
Software: SPYKing CIRCUS
Lead developer: Pierre Yger, computational neuroscientist, French National Institute of Health and Medical Research
One difficulty in designing spike-sorting algorithms that can handle massive amounts of data is having the program recognize that spikes from a single neuron will be recorded on several probes, says Yger. One neuron might register a full, well-developed spike on one probe, and its signal might also be picked up on another probe nearby. That’s especially true over time as the nerve cells drift, and probes that were recording one set of neurons begin recording another set. Another problem is analysis time. With more data, algorithms need to work more quickly to deliver results. The time software takes to sort spike data cannot be allowed to grow exponentially, even as new probes exponentially increase the amount of data being collected.
To address these problems, SPYKing CIRCUS, released in 2016, conducts its analysis in two main steps (bioRxiv, doi:10.1101/067843). First, it builds a dictionary of templates, each describing the spikes from a given neuron, and second, it matches those templates to spikes in the raw data. The first step reviews the pattern of activity that arises across many electrodes when a single neuron fires, and the second scans the data to find all examples of that activity. It works in a similar way to KiloSort. A user does have to review the templates in the dictionary, ensuring that the shapes properly match spikes in the raw data and that there aren’t any errors. But the work is minimal compared to other programs, Yger says, because researchers don’t have to build into the software assumptions about which data to remove. The team is now comparing the software’s performance to that of other spike-sorting programs. SPYKing CIRCUS is not yet fully automated, but the team is working on an update that can handle drift.
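Because a template here spans several electrodes, two neurons with identical spike shapes can still be told apart by where their signals are strongest. The sketch below illustrates that idea with a two-electrode dictionary. The function names, the toy waveforms, and the rule of picking the template whose fitted amplitude is closest to 1 are assumptions made for illustration, not the program's actual code.

```python
def fit_amplitude(snippet, template):
    """Least-squares scale of a multi-electrode template against a data snippet.

    Both arguments are lists of per-electrode waveforms (electrodes x samples).
    A value near 1 means the snippet looks like a single spike from this neuron.
    """
    dot = sum(s * t for s_row, t_row in zip(snippet, template)
              for s, t in zip(s_row, t_row))
    norm = sum(t * t for t_row in template for t in t_row)
    return dot / norm


def best_match(snippet, dictionary):
    """Name of the dictionary template whose fitted amplitude is closest to 1."""
    return min(dictionary,
               key=lambda name: abs(fit_amplitude(snippet, dictionary[name]) - 1.0))


# Same waveform shape, but neuron A is loudest on electrode 0 and B on electrode 1.
dictionary = {
    "A": [[0, -4, -2, 1], [0, -2, -1, 0]],
    "B": [[0, -2, -1, 0], [0, -4, -2, 1]],
}
snippet = [[0, -2, -1, 0], [0, -4, -2, 1]]
print(best_match(snippet, dictionary))  # B
```

Looking at either electrode alone, the snippet could plausibly belong to either neuron. Only the pattern across both electrodes singles out neuron B.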
Mining a Data Mountain
A few years ago, when Magland and his colleagues began developing software to sort spikes, they realized that no such fully automated software existed. In designing MountainSort, the team stuck to the standard spike-sorting steps: identifying spikes, discarding excess data, sorting the remaining data into small clusters, and then evaluating whether clusters next to each other should be combined.
The process is streamlined, Magland says, so users don’t have to adjust the parameters for how spikes are sorted. It also uses a generic statistical clustering method to group shapes of spikes together, keeping user intervention low. Then, the system looks at the time spikes occurred and de-mixes them if they are too close together.
A neuron can fire at most about once every few milliseconds, so if spikes assigned to the same neuron appear closer together than that, the software might be sorting them incorrectly. The team is working on teasing apart those signals with the software. Once the sorting step is done, the program assesses the quality of its sorting. If any of the clustering has a poor quality ranking, the data for that neuron are discarded. Depending on what data a particular lab wants, researchers can set the algorithm’s evaluation criteria, Magland says (Neuron, 95:1381-94, 2017).
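The timing check described above can be turned into a simple quality score: the fraction of a cluster's consecutive spikes that fall implausibly close together. This is a generic sketch rather than MountainSort's own quality metric, and the 2-millisecond default is an assumed value, since the article says only "a few milliseconds."

```python
def refractory_violation_fraction(spike_times_ms, refractory_ms=2.0):
    """Fraction of inter-spike intervals shorter than the refractory period.

    refractory_ms is an assumed value; the article gives no exact figure.
    """
    times = sorted(spike_times_ms)
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    if not intervals:
        return 0.0
    return sum(1 for dt in intervals if dt < refractory_ms) / len(intervals)


# One of the four intervals (10.0 -> 10.5 ms) is too short for a single neuron.
print(refractory_violation_fraction([0.0, 10.0, 10.5, 30.0, 55.0]))  # 0.25
```

A lab could then discard any cluster whose score exceeds a cutoff of its choosing, which is one way the evaluation criteria might be adjusted to suit the data a study needs.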
Similar to the SPYKing CIRCUS team, Magland and his collaborators are working on an update to MountainSort that can correct for drift in the data. The researchers are also building an online interface where users will be able to run several of the spike-sorting algorithms on the same datasets to see how each performs. This system will give researchers a way to validate their algorithms, compare them, and even take elements from one algorithm and integrate them into another to create hybrid versions of the software.
“Spike sorting is a rich and complex field,” Magland says.