A reader shared with me several concerns about potentially repetitive plots and values in this paper.
Extended Data Figure 6A shows "ISI distribution of 18 single neurons over the 7-month recording from a representative head-fixed, awake mouse." It is assumed that each of the 7 subplots shows the recording from a different month.
Unexpectedly, for each of the 18 neurons, all 7 subplots within its plot appear identical to each other. Is this expected?
Close-up of one of the plots. All subplots look the same.
Extended Data Figure 10C shows "Time evolution of ISI histograms of representative neurons from 5 months to 18 months. The x- and y-axes denote the time between subsequent action potentials of the firing neuron, and mouse age in months, respectively, and the z-axis denotes frequency. The bin size is 2 ms. Colors indicate individual neurons."
Two of the neurons (top row, right-most two plots) appear to be duplications of each other, as marked here with red boxes. Can the authors please check?
The Source Data spreadsheet for Figure 2F appears to contain some repetitive mean and standard error values that might be unexpected. These are marked below with boxes of the same color, for the 1-year Astrocyte and 12-week Microglia values. In some cases the complete values look identical, while in other cases the leading digits differ but the trailing digits are identical.
Dear Dr. Bik,
Thank you for carefully reviewing our manuscript and raising these questions.
Regarding Comment 1, we have identified the issue. The plots were generated by a Python script. Upon examining the script used to generate these plots, we discovered a bug that caused the data from the last month of each neuron to be plotted seven times, instead of plotting each month's data correctly. The subplots were not meant to be identical, and we make no such claim. We are currently preparing a corrected version of the figure for submission to the journal and will post an update here as soon as possible.
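To illustrate the kind of bug we are describing, the minimal sketch below shows how a histogram indexed once, outside the per-month loop, ends up drawn in every subplot. This is only a simplified illustration with hypothetical data and variable names, not the actual figure-generation script.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: one ISI histogram (100 bins of 2 ms) per month for one neuron.
rng = np.random.default_rng(0)
isi_by_month = rng.poisson(lam=np.linspace(5, 1, 100), size=(7, 100))

fig, axes = plt.subplots(1, 7, figsize=(14, 2), sharey=True)

# Buggy pattern: the histogram is taken once, outside the loop,
# so every subplot silently shows the same (last) month's data.
hist = isi_by_month[-1]
for month, ax in enumerate(axes):
    ax.bar(np.arange(100) * 2, hist, width=2)   # identical in all 7 panels
    ax.set_title(f"month {month + 1}")

# Correct pattern: index the current month inside the loop.
# for month, ax in enumerate(axes):
#     ax.bar(np.arange(100) * 2, isi_by_month[month], width=2)
plt.show()
```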
Regarding Comment 2, we have examined the raw data and confirm that they are correct. The images are too small to illustrate the differences between the data points, so we will soon provide enlarged images to better show their differences.
Regarding Comment 3, we are conducting further analysis to identify the cause of this. We will provide updates as soon as possible.
Thank you for bringing these issues to our attention. We appreciate your time and will provide updates as soon as possible.
Wow, this finding is just phenomenal. Even for the unboxed values, check this out: line 175 roughly equals line 390, line 176 equals line 391, line 177 equals line 392, line 179 equals line 394, and line 181 equals line 388! The whole dataset appears to have simply been jumbled up and rearranged to produce the other one. Crazy, right?!
The authors' response to Comment-2 fails to provide a satisfactory explanation. Upon analyzing the images in Extended Data Fig. 10c using the "Find Edges" function in ImageJ, I discovered a striking similarity between the two sets of histograms presented. The majority of these histograms exhibit an exact match, with the rest demonstrating near-perfect alignment, apart from a few minor discrepancies.
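For anyone who wants to reproduce this kind of check outside ImageJ, a roughly equivalent edge-based comparison can be scripted as follows. The file names are hypothetical crops of the two panels; this is a sketch, not the exact workflow used above.

```python
from PIL import Image, ImageChops, ImageFilter
import numpy as np

# Hypothetical crops of the two panels being compared (same pixel size assumed).
a = Image.open("panel_1.png").convert("L").filter(ImageFilter.FIND_EDGES)
b = Image.open("panel_2.png").convert("L").filter(ImageFilter.FIND_EDGES)

# Fraction of pixels whose edge maps disagree by more than a small threshold;
# a value near zero indicates the two panels are (near-)identical.
diff = np.asarray(ImageChops.difference(a, b))
print(f"{np.mean(diff > 10):.2%} of edge pixels differ")
```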
The authors maintain that they have scrutinized the raw data and verified its accuracy. However, if their assertion holds true, it raises even greater concerns. The high degree of similarity between the interspike interval (ISI) histograms of two purportedly different neurons casts serious doubt on the methods employed by the authors to assign individual neurons. It is worth noting that the ISI histogram is a direct output of single-unit spike trains; hence, identical noise patterns in the ISI histogram suggest near-identical single-unit spike trains recorded for two allegedly different neurons.
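To make that dependency explicit: an ISI histogram is computed directly from a unit's spike times, so two units can only match bin for bin if their spike trains are essentially the same. Here is a minimal sketch of that computation, using hypothetical spike times and the 2 ms bins stated in the figure caption:

```python
import numpy as np

def isi_histogram(spike_times_s, bin_ms=2, max_isi_ms=200):
    """Inter-spike-interval histogram of one sorted unit."""
    isi_ms = np.diff(np.sort(spike_times_s)) * 1000.0
    counts, _ = np.histogram(isi_ms, bins=np.arange(0, max_isi_ms + bin_ms, bin_ms))
    return counts

# Two "different" units with the same underlying spike train give identical
# histograms in every bin, which is the pattern the boxed panels appear to show.
rng = np.random.default_rng(1)
unit_a = np.cumsum(rng.exponential(0.05, size=1000))  # hypothetical spike times (s)
unit_b = unit_a.copy()
print(np.array_equal(isi_histogram(unit_a), isi_histogram(unit_b)))  # True
```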
I am compelled to challenge the authors' techniques for identifying "single neurons." Consequently, their claims regarding the ability to track neural activity from the same cells in this paper are called into question.
Author Correction, 21 April 2023: https://www.nature.com/articles/s41593-023-01329-0
"In the version of this article initially published, values for 12-week Iba-1 (blue curve in column 3, row 3: Fig. 2f) and 1-year GFAP (blue curve in column 4, row 2: Fig. 2f) of the control samples (thin film) were inadvertently mixed up, resulting in incorrect data shown in Figure 2h,i; Figure 2 source data; and the P values in the corresponding Supplementary Tables. Additionally, in Extended Data Figure 6a, the traces at 11 months were accidentally plotted for all months. The figures, source data and supplementary information have been updated in the HTML and PDF versions of the article. These changes do not affect the results or conclusions of the study."
Although I thank the authors for addressing the concerns raised for Figures 2 and 6, it is unexpected to see no mention in the Correction of the apparent duplication of two panels in Extended Data Fig. 10C.
I am also baffled by the wording in the Correction that "These changes do not affect the results" - they do affect the results, since several figures and tables have now been changed.
I am dissatisfied with the authors' correction note stating, "values for 12-week Iba-1 (blue curve in column 3, row 3: Fig. 2f) and 1-year GFAP (blue curve in column 4, row 2: Fig. 2f) of the control samples (thin film) were inadvertently mixed up."
First, while copying data from one program and pasting it into another may result in identical values, it is highly improbable for a subset of those values to be rearranged while the remainder retain their original order. Comment #6 above reveals that the entire dataset of one sample appears to have been selectively scrambled to produce the complete dataset of a different sample. A closer look at the dataset shows that, out of 21 values, the first 8 were altered in order while the other 13 remained in their original sequence.
Additionally, when copying data from one program and pasting it into another, one would expect completely identical values for all data points. In this case, however, only 9 of the 21 data points are identical, while the remaining 12 differ in their first two to three digits (a sketch of this comparison is given at the end of this comment). Such differences are unlikely to arise from the simple act of copying and pasting data. It is important to note that the first two to three digits largely determine the overall magnitude of a value, while the remaining digits have far less influence on the curve's shape.
Lastly, although it is possible that these patterns could be associated with the way "the data (was) divided by the same sample size" (https://www.spectrumnews.org/news/questions-arise-around-two-published-studies-from-harvard-group/), the likelihood of observing more than twenty consecutive data points with such a high level of agreement across multiple digits is extremely low. As evidenced in Comment #6 above, the two distinct samples share an identical set of 8-digit decimal patterns across 21 data points; encountering this degree of similarity in more than 20 data points is highly improbable. It is also important to note that the decimal-pattern similarity observed in these two samples is not present in any other samples within Fig. 2f. This inconsistency is particularly noteworthy, as all samples in Fig. 2f should be subject to the same sample size and calculation method for this experiment.
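For readers who want to verify this themselves, the two checks described above can be scripted as follows. The CSV export and the column names are hypothetical placeholders, not the journal's actual file layout.

```python
import pandas as pd

# Hypothetical export of the Fig. 2f source data, one column per sample;
# "GFAP_1yr_thinfilm" and "Iba1_12wk_thinfilm" are placeholder names.
df = pd.read_csv("fig2f_source_data.csv")
a = df["GFAP_1yr_thinfilm"].round(8)
b = df["Iba1_12wk_thinfilm"].round(8)

# 1) Same values in a different order? Compare the sorted multisets.
print("same multiset of values:", sorted(a) == sorted(b))

# 2) Identical trailing digits despite different leading digits?
def trailing(x, n=5):
    return f"{x:.8f}"[-n:]  # last n digits of the 8-decimal representation

matches = sum(trailing(x) == trailing(y) for x, y in zip(a, b))
print(f"{matches} of {len(a)} rows share identical trailing digits")
```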
I'm calling [********] on the explanation by Jia Liu for the repeating digits quoted in the Spectrum piece:
"...but the repeating digits that appear in the dataset are explained by the data being divided by the same sample size."
https://www.spectrumnews.org/news/questions-arise-around-two-published-studies-from-harvard-group/
What is measured in Fig. 2f is fluorescence intensity, which is essentially a continuous variable, so there is no reason to expect any reproducible discretisation, even if the means and sems were all calculated with n = 5 mice (if that is what Liu is trying to assert).
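A toy illustration of the point, with made-up numbers rather than the study's data: dividing two independent sets of continuous measurements by the same n does not make their means share decimal digits; the digits can only coincide if the underlying sums already coincide.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5  # same sample size for both groups

# Two independent, continuous "fluorescence intensity" samples (arbitrary units).
group_a = rng.uniform(50, 150, size=n)
group_b = rng.uniform(50, 150, size=n)

# Dividing by the same n imposes no shared decimal pattern:
# the digits of the two means coincide only if the sums already coincide.
print(f"{group_a.sum() / n:.8f}")
print(f"{group_b.sum() / n:.8f}")
```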
The authors are welcome to show their original data (not just a bland spreadsheet) and explain how the calculation generates the repeating digits. I believe it's impossible. It's notable that the correction notice suggests that the authors didn't attempt this explanation with the editors.
Of course, the editors have let this paper through with the weasel words "... upon reasonable request" in the Data Availability statement.
I examined Jia Liu’s PhD thesis from Harvard and found duplicated images assigned to apparently different samples. His thesis can be downloaded at https://dash.harvard.edu/handle/1/12274538.
On page 141 of the thesis, Figure 5-15g is the “Projection of 100 um thick volume (of a brain section) for the electronics injected into LV inside brain”. In addition, on page 143 of the thesis, Figure 5-16a(I) is the “Projection of 30 um thick volume of (brain) slice show(ing) the interface between electronics in in-vivo with subventricular zone”.
Considering the difference in the thickness of brain sections (100 µm as opposed to 30 µm), it would be reasonable to presume that these two images represent distinct samples.
However, Figure 5-15h, which is purportedly “the zoomed-in region highlighted by white dashed box” in Figure 5-15g, appears to be identical to Figure 5-16a(II), which is supposedly the “zoomed-in region” of “red dashed box” in Figure 5-16a(I). In addition, it seems that the same image is labeled with a different scale bar to represent the zoomed-in region of different samples.
What explanation could account for two distinct brain sections exhibiting precisely the same features within a specific sub-region?
To the moderator: I don’t know how to post a question pertaining to a thesis, so I have opted to share it here. If the moderator deems it more suitable to initiate a separate thread, please help me transfer my comment to the appropriate location.
Thank you for taking the time to review my thesis. I would like to clarify some points regarding the images presented:
Firstly, I would like to emphasize that all four images are from the same brain slice, as stated in the thesis: "...we injected electronics into the cavity of the lateral ventricle (LV) to target the subventricular zone...". Specifically, Figure 5-15h and Figure 5-16a(II) are images from the same region.
Secondly, I want to clarify that the projection thicknesses for Figure 5-15g (100 um) and Figure 5-16a(I) (30 um) refer to the imaging depths of these two figures, rather than to the thickness of the brain slice.
Lastly, I agree that the scale bar for Figure 5-15h should be consistent with Figure 5-16a(II), as they are both from the same region. This was corrected when we published the paper.
Thank you for the clarification. However, if Figure 5-15h does represent the white dashed box in Figure 5-15g, as claimed by the author, it ought to have a height of approximately 320 μm, according to the scale bar provided in Figure 5-15g. Similarly, as Figure 5-16a(II) is allegedly the zoomed-in region of the red dashed box in Figure 5-16a(I), its height should be a mere 200 μm.
This represents a difference of more than 50% ((320 - 200) / 200 = 60%), which should be quite noticeable.
How is it possible for the same image to represent regions of different sizes, even if they originate from the same brain section?
Here is another one:
Figure 5-15(i) is a "3D render of the zoomed-in region highlighted by white box in (g)", while Figure 5-16(d) is the "Projection of 80 um thick volume for the region highlighted by white box in Fig. 5-15 g". These two images should therefore be the same size, since they represent the same field of view.
However, according to the scale bar provided in each image, Figure 5-15(i) is 160 um wide, while Figure 5-16(d) has a width of 600 um, nearly 4x as large.