(C) PLOS One
This story was originally published by PLOS One and is unaltered.
. . . . . . . . . .



TimeTeller: A tool to probe the circadian clock as a multigene dynamical system [1]

['Denise Vlachou', 'Mathematics Institute', 'Zeeman Institute For Systems Biology', 'Infectious Disease Epidemiology Research', 'University Of Warwick', 'Coventry', 'United Kingdom', 'Maria Veretennikova', 'Laura Usselmann', 'Division Of Biomedical Sciences']

Date: 2024-04

Recent studies have established that the circadian clock influences onset, progression and therapeutic outcomes in a number of diseases including cancer and heart diseases. Therefore, there is a need for tools to measure the functional state of the molecular circadian clock and its downstream targets in patients. Moreover, the clock is a multi-dimensional stochastic oscillator and there are few tools for analysing it as a noisy multigene dynamical system. In this paper we consider the methodology behind TimeTeller, a machine learning tool that analyses the clock as a noisy multigene dynamical system and aims to estimate circadian clock function from a single transcriptome by modelling the multi-dimensional state of the clock. We demonstrate its potential for clock systems assessment by applying it to mouse, baboon and human microarray and RNA-seq data and show how to visualise and quantify the global structure of the clock, quantitatively stratify individual transcriptomic samples by clock dysfunction and globally compare clocks across individuals, conditions and tissues thus highlighting its potential relevance for advancing circadian medicine.

The cellular circadian clock consists of an interacting set of genes that through their interactions oscillate throughout the day. This oscillator also responds to external cues so that the genes oscillate in phase with external environmental rhythms. A cell therefore uses its circadian clock to provide its genes with information about the external time. In this way it can coordinate many of the processes taking place in the cell and allocate some of these processes to specific times of the day. It is becoming increasingly clear that the quality of this timing information influences onset progression and outcome in a number of chronic diseases such as cancer. Our aim is therefore to develop a machine-learning tool that can assess how well the clock is working. We want to use this with patients and therefore, for clinical utility, it needs to work with only a single clinical sample and to produce reproducible results that can be clearly interpreted and easily compared.

Funding: This work was supported by the UK Engineering & Physical Sciences Research Council (EPSRC) (MOAC Doctoral Training Centre grant number EP/F500378/1 for DV and EP/P019811/1 to DAR), by the UK Biotechnology and Biological Sciences Research Council (BB/K003097/1 to DAR), by Cancer Research UK and EPSRC (C53561/A19933 to MV, RD & DAR), by the Anna-Liisa Farquharson Chair in Renal Cell Cancer Research (to GAB) and the UK Medical Research Council Doctoral Training Partnership (MR/N014294/1 for LU and VV). The funders played no role in study design, data collection and analysis, the decision to publish, or the preparation of the manuscript.

Data Availability: A TimeTeller R package is available from GitHub at https://github.com/VadimVasilyev1994/TimeTeller . All data used except that of Bjarnasson et al. is publicly available from the sites given in the Supplementary Information. The Bjarnasson et al. data that is used is available with the R package.

It is also important to stress here that with the currently available data we will have to make and justify some assumptions on the cross-validity of data from different tissues in order to combine the data. For example, in order to estimate the probability model for a particular tissue we would ideally like to use training data that is only from that tissue. In particular, this is not possible for the mouse and baboon datasets as adequate amounts are not currently available and we therefore have to pool data from several tissues. To do this we choose an appropriate rhythmic gene panel based on good cross-tissue synchronicity and, after validation of this, use normalisation to overcome tissue differences in the way explained below and in Fig A in S1 Appendix . For our human datasets we pool across individuals rather than tissues. Another potential limitation comes from the fact that our current RNA-seq training data is only available at a few training time points around the day. Nevertheless, even with these handicaps we obtain very informative results and provide plenty of evidence that the approaches adopted work well. As more data becomes available this situation can only improve.

It is important to understand the limits on what we regard as dysfunction in our discussion. TimeTeller’s view of functionality is based on statistical analysis of gene expression and not on timing of physiological processes. The probability structure of the dynamical system behind a circadian clock is primarily described by the joint probability distribution P(t, g) of the external time t and the expression state g of the core clock genes or some representative function or subset of them. This distribution determines the conditional distributions P(t|g) and P(g|t). These distributions tell us respectively the distribution of g when the time is t and the probability distribution of times t that are found when the gene expression state of the clock is g. The distribution P(t|g) is a critical quantity because the cell has to use some function of the state of the gene products as a surrogate for t and the variance of P(t|g) tells us how well cells can tell the time by just seeing the clock gene state. If g comes from a test sample taken from a well aligned clock with internal time T (possibly distant from the SCT) then we would expect that P(g|T) would be relatively large and, as a function of time t, P(t|g) would be sharply peaked at t = T. From the point of view of TimeTeller if either of these breaks down then the sample’s clock is regarded as dysfunctional to some extent. We quantify this breakdown by a measure ML of the probability that g is drawn from the training clock and another that combines a measure of the variance of the clock’s estimate of the time and a quantity related to the existence of multiple peaks in P(t|g).

Our aim is to develop a tool that (i) provides a multidimensional picture of the clock’s dynamics and structure that integrates the behaviour of multiple genes, (ii) provides a quantitative analysis at the systems level of clock data, (iii) enables a quantitative comparison of different clocks and (iv) enables a quantitative assessment of clock dysfunction both in the core clock and in downstream target genes. We are aiming for a tool that can determine the presence of a dysfunction causing perturbation from just one sample and that can stratify individuals based on clock functionality, and, thus, might be useful to develop as a clinically actionable biomarker. For example, we show that such a stratification can enable the identification of differentially expressed genes between samples that have better and worse clocks. Finally, we consider new methods for comparing clocks across different individuals, tissues and conditions, identifying a “molecular chronotype” associated with these, and uncovering the effect of clock perturbations on downstream genes.

To effectively quantify functionality in individual transcriptomic samples such as those from patients requires reproducibility, comparability and interpretability. Therefore, the results on a given test sample should be independent of those on other test samples and should not depend upon the particular test dataset being considered. Even for timing estimation alone this does not seem possible with the phase-estimation algorithms mentioned above apart from TimeSignature [ 19 ] which requires two samples. However, the key point differentiating TimeTeller from TimeSignature and the other algorithms is that, apart from identifying timing deviations, these do not provide any other assessments of clock functionality or other quality controls on the individual timing assessments. This is essentially also true for ZeitZeiger [ 15 ] but with the caveat that it, like TimeTeller, uses a likelihood curve that it might be considered could be used in a similar way to TimeTeller’s to assess functionality. However, although differences in ZeitZeiger’s likelihood between WT/control and perturbed clocks in controlled experimental situations has been discussed [ 15 ], it has not been proposed or statistically analysed as a measure of dysfunction and has not been used as such when ZeitZeiger has been employed to analyse timing variation in populations [ 23 , 28 , 29 ]. Moreover, analysis by ZeitZeiger of new data as described in [ 15 ] involves renormalizing and batch-correcting this data with the training data and then retraining, resulting in a different predictive model every time and therefore potentially sacrificing the reproducibility, comparability and interpretability discussed above.

The core mammalian circadian clock involves more than a dozen genes [ 27 ] and therefore the regulatory system is a high dimensional stochastic dynamical system. Since emergent systems properties such as oscillation, synchronisation, entrainment, phase-locking, robustness, flexibility and temperature compensation are critical for the functioning of the clock, tools that enable the analysis of the circadian clock’s systems properties are very much needed. Moreover, a substantial amount of data is becoming available including whole transcriptome time-series that should facilitate such systems analysis using mathematical modelling, statistics and machine learning. However, probing the global behaviour of such a system is a highly non-trivial task and almost all analysis of clock data focuses on individual components and connections. This is not the case for the phase estimation algorithms mentioned above but they adopt a model-blind machine-learning approach. While such approaches can be effective it is difficult to see how to quantify clock functionality independently in individual samples without taking advantage of the clock’s structure as a stochastic dynamical system because it is this that determines the well-defined probabilistic structure describing the relationship between time and multidimensional gene state that, via statistical theory, can be linked to functionality.

As a result a number of phase-estimation algorithms have been designed to estimate the molecular clock phase of the circadian clock, i.e., its “internal time”, from the measured levels of rhythmic gene expression [ 14 – 22 ]. If the sample collection time (SCT) is known, then divergence between the estimated timing T and the SCT indicate the possible presence of clock dysfunction and, indeed, this internal phase T has been proposed as a clinically actionable biomarker [ 23 ]. There are problems with such an approach, the most obvious of which is that this internal time may well depend substantially upon genotype or environment (as we show below) and the consequent deviations are unlikely to be related to dysfunction. A different attempt at a systemic approach to define molecular clock disruption has used pair-wise correlations between clock genes across large transcriptomic datasets [ 24 ]. At the population level, this showed greater dysfunction in solid tumours compared to healthy tissue. However, this approach compared datasets of cohorts with each other and, as the authors pointed out, does not lend itself to assessing clock function in single samples. A similar approach using clock correlation matrices together with CYCLOPS ordering [ 18 ] and a measure called nCV [ 25 ] that correlates positively with clock amplitude was used to address clock dysfunction in pancreatic cancer cells [ 26 ].

The mammalian cell-endogenous circadian clock temporally regulates tissue-specific gene expression driving rhythmic daily variation in metabolic, endocrine, and behavioural functions. Indeed, up to half of all mammalian genes are expressed with a circadian rhythm in at least one tissue [ 1 , 2 ] and approximately 50% of all current drugs target the product of a circadian gene [ 1 ]. Moreover, recent studies demonstrated that the circadian clock influences therapeutic outcomes in a number of diseases including heart disease and cancer [ 3 – 9 ], and that disruption of the normal circadian rhythm and sleep (e.g., through shift work) is associated with a higher risk of obesity, hypertension, diabetes, chronic heart disease, stroke and cancer [ 10 – 13 ]. There is therefore a rapidly growing interest in developing circadian medicine tools that aid the incorporation of time in order to provide safer and more efficacious therapeutics.

Results

Multidimensional visualisation provides important information about phenotype When constructing the probability model, the TimeTeller algorithm projects the G-dimension REVs into fewer dimensions using a local version of principal component analysis (Methods, and Note E and Figs E and F in S1 Appendix). This gives a different projection for each time in the dataset and the algorithm extends this to all times around the day. If for the G-dimensional data the distributions P(g|t) are approximately multivariate normal (MVN) then the corresponding distributions of the projected data optimise the capture of the dominant gene-gene correlations after projection (see Methods). We find that for our datasets d = 3 is sufficient for this (e.g., see Fig F in S1 Appendix) and the resulting 3-dimensional model of the clock provides a very informative visualisation. Fig 1A shows such a visualisation for the mouse multi-organ microarray training data from Zhang et al. [1] when timecourse normalisation has been applied. TimeTeller actually produces such a local projection visualisation for each time in the training dataset as shown in Figs E and G in S1 Appendix but normally inspection of just one of these is adequate and we only show one in Fig 1. With each such visualisation we also show the curve given by the means of the estimated distributions P(g|t) as t varies over the day. Also in such plots we often provide for a sample of times t an ellipsoid showing the covariance structure of the estimated distribution P(g|t) (see caption of Fig 1). We color the training data points and mean curve by time with a color coding as given in the legend of Fig 1 using the sample time for the data points. The same color coding is used throughout the paper. Fig 1B plots microarray test data from Fang et al. [32] comparing it with the Zhang et al. microarray training data [1] in Fig 1A. This test data compares liver samples of Nr1d1 (Rev-erb α) knock-out (KO) and wild-type (WT) mice entrained to light-dark (LD)12:12 cycles. The gene Nr1d1 is a core clock gene of the mammalian circadian clock important in one of the interlocked feedback loops and a key link to metabolism [33]. Knocking it out leaves a functional but perturbed clock when compared to WT mice [32]. Since Nr1d1 is a member of the default REP it would not be surprising that TimeTeller could distinguish Nr1d1 KO mice from WT mice, and indeed this is the case. Therefore, for this validation, we exclude Nr1d1 from the REP genes. The visualisation shows that while the WT data appears to fit well with the training data, the KO data has a consistent substantial difference. TimeTeller is able to detect this apparent difference in each of the four KO samples and shows a coherent difference from WT. This suggests that the Nr1d1 KO mice have a significantly perturbed clock when compared to WT mice (Fig 1B). However, it is still somewhat functional as it gives approximately correct timing and the level of sample variation between WT and KO is similar. We investigate this further below. The other test data we visualise (Fig 1C) in this figure is from Kinouchi et al. [34]. This contains samples analysed by RNA-seq from mouse skeletal muscle taken around the clock in LD 12:12 [34], and compares mice that had been fed ad libitum (FED) with mice that had been starved for exactly 24 hrs prior to point of sampling (FAST). On the one hand, while FED samples align with the RNA-seq training data, the FAST samples are substantially perturbed (Fig 1E). On the other hand, the FAST samples show consistency in that for a given sample time they tend to cluster together. A similar visualisation for the liver samples from [34] is given in Fig H in S1 Appendix. It should be noted that the test samples from Kinouchi et al. [34] have been collected in LD whereas the training dataset was collected on the first three days in constant conditions. Interestingly, there is little difference between FED (control) and training dataset mice, which might be due to the fact that the free-running period of these WT C57Bl/6 mice is around 23.8 hours. Other examples demonstrating the utility of such visualisation are discussed below.

Analysis of single test samples TimeTeller’s estimate of P(t|g) from the training data is used to analyse test data. For a normalised test data REV g our estimate L g (t) of P(t|g), which we regard as a function of t, is referred to as the likelihood curve (LC) for the corresponding transcriptomics sample. The quantities for functionality assessment are associated with this LC. For example, we define the internal phase T of a test REV g as the time at which the estimated likelihood function L g (t) ≈ P(t|g) is maximal i.e., the maximum likelihood estimate. Given T, we define the likelihood ratio function (LRF) as R g (t) = L g (t)/L g (T), i.e. it is the LC but normalised so that the value at the maximum is 1. The internal phase can be compared with the SCT but, as noted above, there may be consistent phenotypic deviations of T from the SCT in genetically heterogeneous populations. It is important to emphasise that when we analyse test data the results for any test data sample are independent of the results for any other test data sample. This is because the calculation of the likelihood curve of a test data sample only involves the probability model and the test data sample and has nothing to do with the other test samples. Therefore, the result for any test sample will be exactly the same as if it were the only sample in the test dataset. Fig 2A shows the estimated LRFs for the Zhang et al. microarray data [1]. Each LRF’s highest peak is centered at 12noon to enable visual comparison of many LRFs, a plotting technique used throughout the figures. Many examples of estimated LCs and LRFs can be seen in Figs 1–5 and the S1 Appendix. The resulting predicted timing plotted against the sample time is shown in Fig 2A together with the times corrected to allow for the chronotype explained in the section below on timing. LCs for the Bjarnason et al. human training data [30] are shown in Fig 2C and Fig I in S1 Appendix. These show the general form of the LCs and demonstrate that one can clearly observe qualitative differences between one individual’s LC and those of the others. Apparent timing errors and timing deviations in the training data. For each training dataset we used an appropriate leave-one-out cross-validation approach to compare the sample collection time T a with the estimated time T and evaluated the apparent timing errors T − T a for each sample. The mean and median absolute timing errors (MAEs) for the training datasets are shown in Table 1. We then analysed how the mean timing error varies with tissue, individual or condition to see if there is a consistent timing deviation for any of these. When these deviations are clear and statistically significant we call the mean of them the timing displacement of the tissue, individual or condition. We show below that for the mouse and human training datasets the observed timing displacement is associated with coherent phase changes in the genes. Therefore, in assessing the performance of TimeTeller the apparent timing errors should be corrected to take account of this. The timing displacements of the different mouse tissue in the Zhang et al. data [1] are relatively small (Fig G(G) in S1 Appendix) but, for the more genetically heterogeneous human population of the Bjarnason et al. data [30], we found significant and consistent timing displacements on the individual level (Fig 2E and Fig I in S1 Appendix). When the apparent errors are adjusted for this they are often substantially reduced (Fig 2D and Table 1). For this data this reduction is of the order of 50%. Table 1 shows that timecourse and timecourse then intergene (both) normalisations are performing significantly better than intergene alone. It is difficult to compare performance with that of the published algorithms mentioned above as they have been used on different datasets collected under different conditions and there has been relatively little work on time-stamped genetically heterogeneous data. The Zhang et al. microarray dataset [1] was also analysed by ZeitZeiger and the mean absolute errors on cross-validation were between 0.6h and 1.1h [15]. On these tissues the results for timecourse normalisation with TimeTeller are very similar to those of ZeitZeiger (Table B in S1 Appendix). Moreover, TimeTeller’s apparent timing errors for the genetically heterogeneous human data compare well with those found in other studies which typically have a median absolute error (MdAE) greater than 1.4h. For example, in the study [23] the 1-sample method had a MdAE of 1.6h and the 2-sample method had a MdAE of 1.4h-1.7h and when CYCLOPS was validated against pre-frontal cortex biopsies with annotated time in [18] the MdAE was 1.69h. In an impressive application to data from four distinct human studies TimeSignature [19] reported MdAEs between 1.21h and 1.49h although this requires two samples for each individual. While TimeTeller’s timecourse normalised results for the genetically heterogeneous Bjarnason et al. [30] and Mure et al. data [2] (Table 1) compare favourably with these results we do not wish to claim timing superiority as there is great heterogeneity in the studies giving rise to the data that was analysed and in the transcriptomics platforms employed. Maximum likelihood ML. Given a test sample REV g, the value of ML = L g (T) (i.e. the maximum likelihood of g) is a key diagnostic as, if M denotes the maximum value of the distribution P(⋅|T), we can regard λ = log(ML/M) as a likelihood ratio test statistic for a pure significance test of the hypothesis that g is drawn from the training clock. Thus, a low value of ML relative to the values obtained by training or control data is indicative of the fact that g comes from a clock that is substantially different. We refer to dysfunction of this kind as low ML (lowML). An initial evaluations of the ML values for both training and test data is a key first step of an analysis using TimeTeller. Dysfunction metric Θ. Statistical theory tells us how to estimate the confidence interval for the maximum likelihood estimator T of internal timing for any given degree of confidence using the LRF (Methods and Note D in S1 Appendix). The variance of T arises because g is a random sample from the clock at time t and we want to know how T will vary with other such samples because high variance implies imprecise timing. We call such dysfunction high variance timing (highTvar). The Cramér-Rao Theorem [35] gives a lower bound for variance in terms that can be related to the LRF (Note D in S1 Appendix). Our metric Θ is the proportion of time in the day that the LRF spends above the curve C(t|T) defined in Methods and is associated with the length of such a confidence interval (Note D in S1 Appendix) and therefore Θ gives an assessment of this sort of dysfunction and higher Θ is associated with higher dysfunction. However, our likelihood curves often contain structures that are relevant to assessing dysfunction but which are not covered by this aspect of statistical theory. One important case is where when g has significant dysfunction of type lowML and the other is where the LC and LRF contain significant secondary peaks that have a lower likelihood than that at T. Complex data sets from diseases such as cancer can contain all of these dysfunction signatures with some samples displaying a single type and others a mixture of more than one. As well as seeking to characterise the type of dysfunction, we attempt to construct a statistic that integrates the different types into a single measure. This is our dysfunction metric Θ. As defined by us (see Methods) this metric will contain a contribution from all of these aspects that are present and therefore ML and Θ are not independent. We discuss this further in the following sections after discussing the values of Θ and ML in training data. Θ and ML for training data. To continue the evaluation of TimeTeller’s LCs, and the corresponding dysfunction metrics Θ and ML we first tested it on the Zhang et al. [1] and Bjarnason et al. [30] training datasets using the appropriate leave-one-out cross-validation approach. The results showed consistently low Θ values and relatively high maximum likelihoods across tissues for the genetically homogeneous mouse datasets and genetically inhomogeneous individuals for the human data (Figs G, I and J in S1 Appendix).

The potential for the use of the Θ stratification to identify differential effects in patients It is particularly interesting and important to apply TimeTeller to genetically heterogeneous human data because it allows us to test the idea that it can uncover corresponding heterogeneity in the “clock” phenotype or effects on individuals such as patients. We firstly consider data from a study of the effects of cigarette smoke on the human oral mucosal transcriptome, In this study (Boyle et al. [43]) transcriptomes from buccal biopsies of 39 current smokers (≥ 15 pack-year exposure) and 40 age- and sex-matched never smokers (< 100 cigarettes per lifetime) were analysed and compared. The authors found that smoking altered the expression of numerous genes but none of those found were core clock genes nor did they consider the effect of smoking on the circadian clock. They found smokers had increased expression of genes involved in xenobiotic metabolism, oxidant stress, eicosanoid synthesis, nicotine signalling and cell adhesion and decreases were observed in the genes CCL18, SOX9, IGF2BP3 and LEPR. It has been reported elsewhere that smoking has an impact on multiple sleep parameters and significantly lowers sleep quality [44–46] and this was confirmed in an experimental study which also correlates poor sleep to inflammation [47] while inflammation has been linked to clock disruption. Moreover, CS exposure has been shown to cause circadian disruption in the lungs of WT mice and this is exaggerated in the Nr1d1 knockouts [48] and has a connection to Arntl [49]. Interestingly, when analysed by TimeTeller (Fig 3A–3C) we see a clear and statistically significant difference between the Θ values of the never smoked and smoking individuals (Fig 3A) which is reflected in the 3D visualisation (Fig L(A) in S1 Appendix). Inspection of the LRFs show that the variations in Θ come mainly from second peaks rather than low ML (Fig 3B). Indeed, the ML values for smokers and non-smokers were not significantly different although the smokers had more observations with a very small ML (Fig L(A-C) in S1 Appendix). The lowest values were around e−11 suggesting that a l thresh of about -12 would be appropriate. A significant proportion of the smokers had Θ values similar to those of the never-smokers but many had much higher values (Fig 3A). Therefore, we asked if we could identify differentially expressed genes (DEGs) between the individuals with high Θ versus those with lower Θ. To do this we tested for differential gene expression between the n worst clocks (defined as the bad clock group (BCG)) and the others (good clock group (GCG)) adjusting the p-value appropriately to allow for the multiple testing. For a fixed l thresh in the range from 11 to 13 with n between 8 and 20 we found many differentially expressed genes (DEGs) at the appropriately adjusted p = 0.05 level including some clock genes (Fig L(D,E) in S1 Appendix). However, the particular genes found were sensitive to changing the value of l thresh among the suggested values of -11, -12 or -13 and changing the group size n. We calculated that the probability of finding such numbers of DEGs by chance is extremely low (Fig L(H) in S1 Appendix) and we noticed significant differences between the behaviour when the BCG size was in the range 8 to 20 from that when it was 30 to 40. Therefore, we estimated by simulation the probability p rand (m) of finding m or more DEGs by chance when we choose a random group of n individuals for our BCG and compared this to the probability p Θ (m) of finding m or more DEGs when the stratification by Θ is used to choose the BCG and l thresh and n are chosen randomly in the ranges -11 to -13 and 8 to 20. We find that uniformly in m, p Θ (m)/p rand (m) > 100 (Fig 3C). We get an interestingly different result if we instead let the group size n range between 30 and 40. The probability p rand (m) behaves in approximately the same way but p Θ (m) does not (Fig 3C(inset)). For m very small p Θ (m) is high but as m increases p Θ (m) rapidly decreases to values much smaller than those for p rand (m). There is a 34.26% chance of getting no DEGs but when this is not the case there is a more than 99% chance of getting the gene PER3 and a 68% chance of getting NR1D2. Thus, this analysis identifies two interesting groups of individuals with a nontrivial transcriptional phenotype that distinguishes them from the individuals with good clocks. One of these groups appears to be associated with differential expression of PER3 and NR1D2, genes not identified in the original paper where all non-smokers and smokers were compared. In conclusion, any link between smoking and clock dysfunction is likely to be complex, but these results suggest that in a genetically heterogeneous population where the effects of a perturbation such as smoking are likely to be diverse, TimeTeller’s Θ stratification can help identify individuals or groups where the smoking effect is significant. As a final example of this section we consider the distribution of Θ values by disease state for the transcriptomic data of healthy or dysplastic oral mucosa and oral squamous cell carcinoma (OSCC) from Feng et al. [50]. Since the lowest ML values were around e−11 a l thresh of -12 was used. The ML values for normals and cancer were not significantly different although the cancer group had more observations with a very small ML (Fig L(F) in S1 Appendix). However, there is a highly significant difference in median Θ values between the cancer group (167 individuals) and the the normal mucosa group (45 individuals) (p < 0.002) (Fig 3D). Moreover, there appears to be significant dysfunction in terms of timing estimation (Fig 3E) that can be significantly ameliorated if the second peaks in the LRF is used for timing when the first peak is clearly misleading (Fig 3F, details below). Inspection of the LRCs (Fig 3G) shows that, as for the Boyle et al. data [43], the variations in Θ come mainly from second peaks rather than low ML. As for the Boyle et al. data [43] we asked if there are DEGs between the worst clocks in the cancer group (high Θ) and the best clocks within the same group and carried out a similar analysis. For genes in general and BCG sizes n between 12 and 40 we find similar results with p Θ (m)/p rand (m) > 100 for the number m of DEGs between 2 and 1200 (Fig 3H). Many of these DEGs are associated with gene signatures such as DNA repair, E2F targets, G2M checkpoint and the mitotic spindle. However, we do not find any groups like that for the Boyle et al. data [43] (with n between 30 and 40) that have very low numbers of specific DEGs. A study of the estimated timing T for this data (Fig 3E) was very informative. The estimates for the normal data are generally between 7 am and 3 pm. A large number of cancer samples have unlikely times well outside the normal working day and the median is clearly much too early. Interestingly, it appears that the mistimed samples are primarily so because the likelihood curve has a second peak (Fig 3G) and the peak giving an unreasonable timing estimate is slightly higher than one giving the best estimate. In fact, there are 109 samples whose timing T is before 7am and 93 of these have a second peak. if we replace the timing by that given by the second highest peak, the great majority moved to a time firmly in the early afternoon between 12noon and 4pm (Fig 3F). As a result 74% of all samples then fall in this time slot and only 7% remain before 7am. The analysis in the next section indicates that this corrected timing is likely to be the correct time of sampling to within approximately 0.4h.

TimeTeller’s precision on non time-stamped cancer data The only method currently utilised to estimate the precision of timing/phase algorithms is to use time-stamped data and compare the algorithm’s predicted times T with the SCT time stamps t. However, such a measure of precision is problematic when the individuals, tissues or conditions have a nontrivial molecular chronotype as is the case with the human data considered here and cannot be done if the data is not time-stamped. A related test which avoids these problems is instead to determine the variance or standard deviation of the distribution P(T|g) where T is the predicted time and g is the relevant REV (Note F in S1 Appendix). This addresses the question of how well the estimated timing T is determined by the REV g. Interestingly, we can calculate this precision measure even in some cases where we have no timing data and where there is dysfunction and the Feng et al. data [50] gives a very informative example of this. To illustrate this we study the 77% (176 samples) of that data for which the estimated timing T after adjustment by second peaks is between 12 noon and 4pm (Fig 3F). We ask if within this data we can see coherent timing structure or not. We can estimate the required standard deviation by carrying out a principal component (PC) analysis of the expression data (see Note F in S1 Appendix) and plotting the projection of these data onto the first PC against the predicted time T (Fig 3I). An upper bound for the standard deviation of P(T|g) can be estimated from this (Note F in S1 Appendix) and we obtain an estimate of less than 0.4 hours. If we consider all the deviations from the mean for the timings T (given by the horizontal deviation of the relevant data point from the red curve in Fig 3I) across all of the REVs in this data we obtain the distribution shown in Fig 3J. Remarkably, although the data is not timestamped and has significant dysfunction giving rise to significant second peaks, TimeTeller is able to accurately measure the internal phase T of the clock as a function of the REV g. We carried out a similar analysis (see Note F in S1 Appendix) and found similar results but a bigger standard deviation of 0.83h for the large breast cancer dataset analysed by Cadenas et al. in [51]. In this case there is no need for adjustment for second peaks as 86% of the data has its predicted time T between 10am and 8pm (Note F in S1 Appendix). We believe this approach gives a new simple method to assess timing performance.

Comparing clocks across individuals, conditions and tissues Current analyses comparing the circadian clock across individuals, tissues and conditions such as the three studies we consider below proceed by analysing the behaviour of the individual interesting genes separately. Such analyses tend to focus on the level of expression and do not take into account correlations between related genes. We asked whether using TimeTeller such an analysis could be done in a more integrated way treating the clock as a noisy multigene dynamical system (and hence using correlations) and whether such an approach uncovers some aspects that are hard to see when done gene by gene. The key results here are that it enables us to identify coherent differences in timing across individuals, conditions and tissues and that using these we can determine in a quantifiable way if the timing differences come from a more or less coordinated change in gene phases. Using TimeTeller to identify a molecular chronotype. The human training data that we consider (Bjarnason et al. [30]) involves genetically heterogeneous individuals and therefore we also asked to what extent in this analysis of time-series data we could differentiate systematic variation of timing in an individual or tissue due, for example, to genetic and/or environmental factors, i.e., a molecular chronotype. We observed above that while the Θ and maximum likelihood values are reasonably consistent across individuals, the apparent timing error was not. For some individuals there were substantial timing displacements arising from intra-individually consistent deviations of the estimated time from the sampling time (Fig 2D). For example, the individuals labelled as 1 and 6 in Fig 2D have substantial statistically significant (p < 0.003) timing displacements in opposite directions. To further understand this, we hypothesised that the timing displacement of an individual might be largely a result of well-coordinated phase changes in the core clock genes. If this is the case there should be a definite relation between TimeTeller’s timing deviations and the phase of the genes. Moreover, since this relationship is local in that the timing displacements are small compared to 24 hours, it is reasonable to suspect that it might be approximately linear. Therefore, we tested for a linear relation between the phase variation of the genes in our panel and timing displacement. In this analysis, we regressed the timing displacement against the phase of each of the genes in the REP (Fig 4) using Cosinor [52] to measure gene expression phase. For all the probes used we observed an approximately linear relationship between timing displacement and the variation in the gene phase with a positive slope (Fig 4 and Fig M in S1 Appendix). For all genes the non-zero slope is statistically significant and the r2 value is greater than 0.7, and for many genes it is greater than 0.9. The latter measures the proportion of the variation in the gene phase that is predictable from the TimeTeller displacement using the linear relationship. Thus TimeTeller is able to clearly identify coherent and substantial phase variation in the clock genes for each individual across all genes in the rhythmic expression profile. It identifies a clear “chronotype” for each individual and a quantifiable phase difference. Moreover, the strong coherence between the time estimations and the gene phases are further validation of TimeTeller’s time estimation. These results suggest that if the sample collection time is known, by combining the observation of a Θ suggesting good clock function with an advanced or retarded time prediction, TimeTeller can help identify substantial coherent phase variation in an individual’s clock genes from a single sample. We will utilise such regression plots in the analyses below where we attempt to characterise the nature of the change in the clock caused by different conditions or in different tissues. We call such plots phase displacement plots (PDPs). Timing divergences and clock comparisons for time-restricted feeding in ageing mice. Recently, Acosta-Rodríguez et al. [37] studied the synergistic effects of various time-restricted feeding protocols with caloric restriction (CR) on the prolongation of life span in mice, focusing on the liver which is a major metabolic target of the circadian clock. After 6 weeks of baseline ab libitum (AL) food access, C57BL/6J male mice were subjected to 30% CR. Mice were fed nine to ten 300mg food pellets containing 9.72 to 10.8 kcal every 24 h starting at the beginning of the day (CR-day-2h) or night (CR-night-2h) constrained to consume their food within 2h. Two additional CR groups were fed a single 300mg pellet delivered every 90 min to distribute the food intake over a 12-h window either during the day (CR-day-12h) or during the night (CR-night-12h). A fifth CR group was fed a single 300mg pellet every 160 min continuously spread out over 24 h (CR-spread). Liver gene expression was profiled using RNA-seq in all six feeding conditions at 6 and 19 months of age. Livers were collected in constant darkness at 12 time points every 4 hours for 48 hours across two circadian cycles. We treat the data from time t and t + 24 as replicates of a 24h cycle. Together with a young and old group where feeding was ad libitum (AL) this results in 12 feeding conditions. We used TimeTeller to analyse this data asking if it could identify the nature of systemic changes in the core clock between the different feeding×age conditions. We used the Zhang et al. RNA-seq data [1] as training data. Thus, all feeding conditions of [37] are regarded as test data. We analysed this using both time-course and timecourse-matched normalisation for the test data. The results are very similar and we give the timecourse-matched results here. Visualisation showed that the test data fell nicely within the trained distribution close to the mean cycle. Analysis as in Note C in S1 Appendix points to using a l thresh of -8. The results on the predicted times T showed a substantial timing displacement (Fig 5A) for eight of the conditions with CR-day-2h being the most extreme. Only 12 of the possible 66 comparisons have p ≥ 0.05. Moreover, there is a striking apparent age-related difference for the CR-day-2h feeding conditions in that the timing displacements of the 6 month and 19 month mice differ by over 4 hours (p < 0.0001). There are some statistically significant differences between the Θ and ML values found for the different conditions (Fig 5B and 5C). This is also noticeable from the centred LRFs (Fig 5D). For example CR-spread-19m has significantly higher ML values than all other conditions and lower Θ values than most, and CR-night-2h-6m has significantly lower ML values and higher Θ values than all but CR-night-2h-19m (Fig 5). However, overall the ML values are relatively high and therefore confirm the observation that, although the timing can be displaced, the test data is close in data space to the training clock. This is compatible with the hypothesis that the different feeding condition induce a simple phase change in the clock. Given these timing displacements, we carried out a comparison of the clocks under the different conditions by analysis using PDP plots where we regressed the phases of the genes against the timing displacements of the various conditions to try and quantify the extent to which the observed timing differences are the result of a coherent phase adjustment of each gene (Fig 5F). For the feeding×age conditions the situation is very clear for the core clock genes considered because the r2 values for them (Fig 5F) are typically close to 1 implying the linear model almost completely explains the data. From this analysis, we conclude that it is likely that the different feeding×age conditions cause a change in the core clock that is primarily a simple phase change and that for some of the conditions such as CR-day-2h this is substantial. In summary, for this data, TimeTeller has enabled the discovery of substantial and coherent differences of the core clock systems state associated to the feeding conditions and provided quantified evidence that the core clocks corresponding to the different conditions differ by a simple phase change. This benefitted from a systems approach. Finally, note that although there is time series data in this instance, since our results on the test data samples are independent of each other having a time series is not necessary and also one could reduce the number of mice involved. This opens the possibility to use TimeTeller as a tool to determine a clock parameter in available QTL studies for longevity and other parameters [53]. Timing divergences and clock comparisons for the Mure et al. baboon data [2]. We found a different result when we compared the clocks in the different tissues studied in Mure et al. [2]. In this paper, the transcriptomes of 64 tissues of the diurnal primate Papio anubis (baboon) were analysed from one animal every 2 hours for 24 hours. The results of [2] demonstrate that many ubiquitously expressed genes that participate in essential cellular functions show a tissue-specific rhythmic pattern, and confirmed a shifted temporal organization of central and peripheral tissues between diurnal and nocturnal mammals. Since this RNA-seq dataset involves a genetically heterogeneous population and multiple transcriptionally heterogeneous tissues, we were keen to assess how well TimeTeller was able to analyse it. We studied 33 of the tissues leaving out those from the brain and some others with missing data. An initial leave-one-tissue-out analysis gave reasonably accurate timing (MdAE around 1.23h, Fig 5G and Table 1) and indicated that many tissues had a substantial timing displacement (Fig 5H and 5J) ranging from approximately -3.5h to +2.5h compared to the time the samples were taken. The standard deviation of the individual sample apparent timing errors around the timing displacement from a given tissue was generally much smaller than the 6h range of the timing displacements (Fig 5H). Moreover, the null hypothesis that the mth most advanced tissue has the same timing displacement as the mth most retarded is rejected at the p = 0.01 level for all m < 7 (Wilcoxon-Mann-Whitney test). Given many tissues had large absolute timing displacements, we then used only the 18 tissues with the smallest for the training data. This gives slightly better timing results than using all 33 tissues as can be seen in Table 1. Correcting the TimeTeller time predictions by adjusting them using the phase displacements of the tissues resulted in a substantial improvement of about half an hour in the timing accuracy (Fig 5G and Table 1). Given the heterogeneities in the data this results in a very reasonable performance with a mean absolute error of just over one hour. The analysis of the variation of the core clock across the 64 tissues in Mure et al. [2] is mainly concerned with the overall transcript abundance and rhythmicity of expression of the individual core clock genes. The authors note that the heterogeneity of this implies different composition of core activators, repressors, and modulators in different tissues. They do not mention the timing divergences we find in the data using TimeTeller. Using these timing divergences, for the limited set of 33 tissues, we can study this in a different and more integrated way. As above, we considered a comparison of the clocks in the different tissues by using a PDP plot (Fig 5K). For this dataset we see that the observed differences between them are not due to a simple coherent phase adjustment in the genes but involves a more complex interaction. This is because the r2 values, which measure of the proportion of total variation of outcomes explained by the linear model, are very low and much lower than those for the Bjarnason et al. [30] and Acosta-Rodríguez et al. data [37]. This suggests that the adjustment of the clock from tissue to tissue is more complex than a simple phase shift in the core clock genes. On the other hand, the relatively low p-values suggest that there is a definite correlation between gene phase and timing displacement suggesting that an appreciable component of the changes in the genes is a phase change. Again this analysis benefitted from a systems approach which enables us to identify coherent differences between tissues and relate this to changes in the core clock.

[END]
---
[1] Url: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011779

Published and (C) by PLOS One
Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.

via Magical.Fish Gopher News Feeds:
gopher://magical.fish/1/feeds/news/plosone/