# Tropical Cyclones and the Lunar Cycle Effect

## November 08, 2022

ABSTRACT

Evidence is herein presented of a clear association of Tropical Cyclone Variation with the Cycles of the Sun and Moon.

INTRODUCTION

“The Eyeball is the Statistician’s most Powerful Instrument”

Minium’s First Law of Statistics: Statistical Reasoning in Psychology and Education, 1970, Edward W. Minium

A favorite quote from my earliest statistical reference introduces my favorite research and presentation style: generous use of visual representation and graphics.

I am a recently self-taught amateur statistician of a novice level, focused solely on the subject of my lifetime interest: the discernment of astrological factors that are true, reliable, and replicable. Experience-wise, I am not statistically well rounded, and I usually rely on the Minitab stat program for guidance.

My lack of experience may cause errors in logic or procedures, but no attempts are here made to mislead or shape or ‘data-snoop’ the results. My writing style is often casual and light-hearted. Constructive criticism is welcome.

SOURCES & METHODS

The source of my tropical cyclone information is the IBTrACS website, from which storm data was downloaded into an MS Excel spreadsheet and then organized into a pivot table. Eight daily (3-hourly) readings of windspeed were sampled from each NHEM storm and then summed for each day of a synodic Lunar Cycle, New Moon to New Moon. In a given NHEM tropical cyclone season, May to mid-December, there are about 7 Lunar Cycles. Each of those 7 synodic Lunar Cycles is recorded longitudinally in a column as a time series, and the columns are then summed laterally for a cumulative total for that full seven-cycle season. Finally, the synodic cycles of all 51 seasons are summed laterally for a climatological grand total. The windspeed data is then standardized, and Lunar Phase categories are approximated and assigned, resulting in what appears to be a very reliable forecast model.
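The binning step described above can be sketched in Python. This is only an illustration, not the spreadsheet procedure actually used: the mean synodic month length and the reference New Moon (2000-01-06 18:14 UTC) are assumptions brought in for the sketch, the function names are hypothetical, and real New Moons drift by several hours around the mean cycle.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Assumptions (not from the article): a mean synodic month and a
# commonly cited reference New Moon (2000-01-06 18:14 UTC).
SYNODIC_MONTH_DAYS = 29.530588853
REFERENCE_NEW_MOON = datetime(2000, 1, 6, 18, 14, tzinfo=timezone.utc)

def synodic_day(timestamp):
    """Return the whole-day index (0..29) since the most recent mean New Moon."""
    elapsed_days = (timestamp - REFERENCE_NEW_MOON).total_seconds() / 86400.0
    return int(elapsed_days % SYNODIC_MONTH_DAYS)

def sum_wind_by_synodic_day(readings):
    """Sum windspeed (kt) per synodic day from (timestamp, wind_kt) pairs."""
    totals = defaultdict(float)
    for timestamp, wind_kt in readings:
        totals[synodic_day(timestamp)] += wind_kt
    return dict(totals)

# Hypothetical 3-hourly readings for one storm day, for illustration only.
start = REFERENCE_NEW_MOON + timedelta(days=10)
sample_readings = [(start + timedelta(hours=3 * i), 35.0 + i) for i in range(8)]
daily_totals = sum_wind_by_synodic_day(sample_readings)
```

Here all eight readings fall on synodic day 10, so `daily_totals` holds a single summed value for that day.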

Table One: Sample Data for Seasons 1970 and 2020

GRAPHICAL ANALYSIS

Chart One

Table Two: Descriptive Statistics

Included in the above line graphic is a summary of the information that makes up the data set introduced as Table One. I would like to emphasize the weightiness of the data as it includes virtually all of the reliable satellite information on the record.

When all of the storm data is added together, the time series distribution displays what appears to be a small, but long, waveform pattern that exhibits a sharp decrease of 8.2% (25,611 kt) from about day 8 to day 15, at the center of the distribution, and then an increase of 9% to about day 22. This variation is far larger than the Standard Error (1287 = 0.4% of the Mean) or even the Standard Deviation (7050 = 2.3%).
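As a quick arithmetic check, the quoted Standard Error is what one gets from the quoted Standard Deviation divided by the square root of 30. The sketch below assumes one summed value per synodic day (n = 30) and backs out the mean from the "2.3%" figure; neither number is stated directly in the text.

```python
import math

std_dev = 7050.0  # Standard Deviation quoted in the text (kt)
n_days = 30       # assumption: one summed value per synodic day

# Standard error of the mean: SD / sqrt(n)
std_error = std_dev / math.sqrt(n_days)

# Assumption: mean implied by "7050 = 2.3%" of the mean
implied_mean = std_dev / 0.023
std_error_pct = 100.0 * std_error / implied_mean

print(round(std_error))         # 1287
print(round(std_error_pct, 1))  # 0.4
```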

At first glance, it would seem that there is something significant here and would be worth investigating.

Chart Two

Magnified View of Observed TC Data

When the observed TC data are standardized, the obtained time series distribution becomes magnified into a very distinct wave pattern with some very large effect sizes. This process enables convenient comparisons with the smaller data sets of individual seasons. Also, I’ve approximated and labeled 30-degree intervals of the cycle as the points of primary interest, for reasons I will explain in a later work. For now, they can be understood as the 12 classical astrological ‘aspects’.
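Standardizing here means converting each daily total to a z-score. A minimal sketch, assuming the sample standard deviation is used (the exact Minitab settings are not given) and using made-up totals:

```python
import statistics

def standardize(values):
    """Convert values to z-scores: (value - mean) / sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # assumption: sample SD (n - 1 denominator)
    return [(v - mean) / sd for v in values]

# Hypothetical daily windspeed totals (kt), for illustration only.
daily_totals = [310_000, 318_000, 322_000, 305_000, 297_000, 301_000]
z_scores = standardize(daily_totals)
```

The standardized series has mean 0 and standard deviation 1, which is what lets a single season be compared with the 51-season total on the same scale.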

Whereas there seem to be some choppy oscillations around the New Moon period, the trend lines between the Quarters and the Full Moon are smoother, are turning points in the series, and each appears to be substantially different from the mean of the series. The range between the 1st Quarter and the Full Moon is quite remarkable, as it is a downward trend of almost 4 standard deviations.

Are the differences from the Mean significant? Is the series as a whole significant?

TESTING FOR SIGNIFICANCE

Table Three

For checking the difference from a Mean value, Minitab suggests the One-Sample T as a simple hypothesis test. A two-tailed test is selected, and the results in Table Three speak for themselves.

Chart Three

The rather fantastic significance values obtained from the t-tests may be viewed with some confidence, as the very large sample size results in 100% Power values for the obtained statistics of half of the Lunar Phases.

The significant discrepancies from the Mean at the four quarters are interesting, but if this distribution is to be a possible forecast model, then it is significance in the pattern as a whole that is most important.

THE CHI-SQUARE PROBLEM

To analyze a single variable categorical frequency distribution, Minitab suggests the Chi-square Goodness of Fit Test. The data set of this solar/lunar distribution has over 9 million observed counts (knots of windspeed) spread over 30 categories (synodic days). Even as a novice statistician I know this could be a problem, so I ask Google Man this question:

“What sample size is too large for chi-square?”

“Because of how the Chi-Square value is calculated, it is extremely sensitive to sample size – when the sample size is too large (~500), almost any small difference will appear statistically significant.”

Using Chi-Square Statistic in Research

Clearly, in view of this information, the application of chi-square to this problem would seem inappropriate and subject to criticism from knowledgeable statisticians. I’m not inclined to give up so easily, however, on what my best statistical tool (my GoodEye) tells me is an important clue in the genesis, maturation, and dissipation of tropical cyclones.
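The sample-size sensitivity in that quote is easy to see with toy numbers. In this sketch (hypothetical counts, equal expected frequencies assumed), multiplying every count by 100 leaves the proportions unchanged but multiplies the chi-square statistic by 100:

```python
def chi_square_gof(observed):
    """Chi-square goodness-of-fit statistic against equal expected counts."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

# Hypothetical counts: same proportions, 100 times the sample size.
small_sample = [48, 52, 50, 50]
large_sample = [count * 100 for count in small_sample]

chi2_small = chi_square_gof(small_sample)  # 0.16
chi2_large = chi_square_gof(large_sample)  # 16.0
```

With 3 degrees of freedom, 0.16 is nowhere near significant while 16.0 is well past the .01 critical value, even though the relative deviations are identical.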

Therefore, the first order of business is to reduce this data set to a more manageable size while maintaining its characteristic variation. For reasons I’ll explain in a later work, I am primarily interested in the windspeed scores of twelve specific points in the Solar/Lunar Synodic Cycle: each 30-degree interval beginning at the New Moon, as graphically presented in the previous Chart Two.

Chart Four

When the Observed and Expected Values are viewed in totality on a simple ratio scale, it becomes clear that even reducing the number of categories from 30 to 12 results in a still enormous sample size of over 3.5 million observed counts displaying a relatively small (~8.5%) variation.

Table Four

Nevertheless, I press on to the test and, as expected, obtain a statistically significant p-value. However, this statistic does not merely meet the commonly accepted significance level of p = .05, or even the stricter limit of p = .01. Extreme effects should produce extreme results, in my view, and this distribution does precisely that: a complete value isn’t shown because Minitab results max out at 16 decimal places:

Chi-Square Test (Minitab)

And when the same distribution is tested in Excel, the significance level becomes maxed out at 30 decimal places: p = 0.000000000000000000000000000000

And again, when the same Excel result is formatted in scientific notation, the returned significance is

p = 0.00E+00,

which means the value is smaller than Excel can represent, rather than literally zero (I had to look it up).
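One hedged reading of a result like `0.00E+00` is floating-point underflow: the true p-value is positive but too small to store, so software reports zero. The sketch below uses a hypothetical chi-square statistic and 2 degrees of freedom, chosen only because the tail probability then has the simple closed form exp(-x/2); the actual test in this article has different degrees of freedom.

```python
import math

chi2_stat = 2000.0  # hypothetical statistic, for illustration only

# For 2 degrees of freedom the upper-tail p-value is exactly exp(-x / 2).
p_value = math.exp(-chi2_stat / 2)  # underflows to 0.0 in double precision

# The same p-value on a log10 scale is still perfectly finite.
log10_p = -(chi2_stat / 2) / math.log(10)

shown = f"{p_value:.2E}"
print(shown)           # 0.00E+00
print(round(log10_p))  # -434
```

So the displayed zeros stand for "about 10 to the power of minus 434" in this toy case, which is tiny but not zero.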

So, the bottom line seems to be that though the chi-square test is challenged by such an enormous sample size, the significance of this obtained statistic is equally enormous.

Still, there are those who will claim the test invalid on account of traditional, conservative requirements, so I will make one more attempt to correctly reduce the sample size mathematically while maintaining the original observed variation.

Referring to the standardized climatological values presented in Table One of this article, I selected each of the values associated with the twelve points of interest in the synodic sample. Each value was then rounded down to two decimal places, a mean value was determined, the chi-square test was again performed, and this was the result: a still enormous significance level.

Table Five                              Chart Five

The original variation is maintained and still clearly apparent. The significance statistic is fantastic to comprehend. The standard error is minuscule. When the countable units are summed as absolute values, the sample size becomes 845 observed counts, somewhat more than recommended but much less than the raw data and, when balanced against the obtained statistic, seemingly acceptable to me. Distributed over 12 categories, this gives a more appropriate application of the Chi-square Goodness of Fit Test.

As I contemplated a spot on the shelf for my anticipated Nobel Prize, it occurred to me that I hadn’t as yet completed a control group for comparison.

NOW FOR THE BUZZ-KILL

Chart Six

The Lunar Cycle distribution graphically described in Chart Two is obviously not a normal distribution. Minitab finds that the “least bad fitting model” for the LC Synodic is from the Weibull family of probability distributions and also provides parameter information. So, a randomized distribution is created modeled on the original LC data set. Chart Six above is the resulting time series.
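A random control series of this kind can be sketched with Python's standard library. The scale and shape values below are placeholders, not the parameters Minitab reported, and the function name is hypothetical; the point is only that each of the 30 "days" is an independent draw with no built-in cycle.

```python
import random

def random_control_series(n_days, scale, shape, seed=None):
    """Draw n_days independent Weibull values to serve as a patternless control."""
    rng = random.Random(seed)
    # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape.
    return [rng.weibullvariate(scale, shape) for _ in range(n_days)]

# Placeholder parameters (assumptions), for illustration only.
control = random_control_series(n_days=30, scale=310_000.0, shape=50.0, seed=1)
```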

Chart Seven

When the Randomized Control Data is standardized and magnified, it becomes immediately obvious that, while no particular pattern appears, the differences from the Mean and the range of Standard Deviations are every bit as substantial as those of the Lunar Cycle Synodic that we have just been considering.

I have re-run all the tests on this data but to re-post them would be unnecessarily boring and redundant. I shall just stipulate for the record that the significance levels for the Random Control Group are the same as for the Lunar Cycle Synodic:

p = 0.00E+00

One of the earliest lessons on the journey to understand statistical reasoning is that just because a finding has statistical significance, it does not automatically follow that the finding has any practical significance.

TIME SERIES ANALYSIS TO THE RESCUE

The Minitab stat app includes a number of time series analysis tools that can be helpful here. This first one is the Wald-Wolfowitz Runs Test of Randomness.

Runs Test: Control Model

Descriptive Statistics

K = sample mean

Test

As expected, the Control Model results cannot reject the Null Hypothesis. So, while the Control Data differences are highly significant statistically, this distribution pattern may have no practical value and is just what it was created to be: random ‘noise’.

Runs Test: Lunar Cycle

Descriptive Statistics

K = sample mean

Test

When the same Runs Test is applied to the LC Synodic data, the resulting p-value is a familiar one: multiple Zeros. The Null is clearly rejected; the obtained Lunar Cycle distribution is not the least bit random.
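For readers without Minitab, the runs test can be sketched in a few lines. This version counts runs above and below the sample mean (K) and uses the usual normal approximation; it is a sketch of the textbook Wald-Wolfowitz procedure, not a reproduction of Minitab's exact output, and the smooth wave below is a made-up series.

```python
import math

def runs_test(values):
    """Wald-Wolfowitz runs test about the sample mean (normal approximation).

    Returns (observed_runs, expected_runs, z, two_sided_p).
    """
    k = sum(values) / len(values)
    above = [v > k for v in values]
    n1 = sum(above)
    n2 = len(values) - n1
    if len(values) < 3 or n1 == 0 or n2 == 0:
        raise ValueError("need at least 3 values, with some on each side of the mean")
    # A new run starts whenever the series crosses the mean.
    runs = 1 + sum(1 for a, b in zip(above, above[1:]) if a != b)
    n = n1 + n2
    expected = 2.0 * n1 * n2 / n + 1.0
    variance = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n * n * (n - 1.0))
    z = (runs - expected) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return runs, expected, z, p

# Hypothetical smooth 30-day wave: 15 days above the mean, then 15 below.
wave = [math.sin(2 * math.pi * (i + 0.5) / 30) for i in range(30)]
wave_runs, wave_expected, wave_z, wave_p = runs_test(wave)
```

The smooth wave produces only 2 runs where about 16 would be expected from a random ordering, so the Null of randomness is rejected. A long, slow waveform gives far too few runs in the same way.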

Testing for Autocorrelation

Chart Eight: ACF for Random Control

The Autocorrelation tool is another means of time series analysis useful in the detection of seasonal (recurring) patterns and their level of significance. The adjacent chart is the ACF for our known set of Random Control data and is an example graphic of no correlation with itself and no evidence of any patterns.

Chart Nine

By way of comparison, the ACF for the LC Synodic presents a very high correlation of r = .75, about twice the level of the Significance Limit (red line). The oscillating scallop pattern diminishing to zero is clear evidence of seasonality.
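The autocorrelation at a given lag can be computed directly. The sketch below uses the standard sample ACF formula and the rough large-lag-free limit of 2 divided by the square root of n; Minitab's plotted limits are computed somewhat differently and widen at higher lags, so treat the limit here as an approximation. The sine wave is a made-up series.

```python
import math

def autocorrelation(values, lag):
    """Sample autocorrelation of a series with itself shifted by `lag` steps."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    numer = sum((values[t] - mean) * (values[t + lag] - mean) for t in range(n - lag))
    return numer / denom

def approx_significance_limit(n):
    """Rough 5% limit for an autocorrelation: 2 / sqrt(n) (an approximation)."""
    return 2.0 / math.sqrt(n)

# Hypothetical 30-day wave, for illustration only.
wave = [math.sin(2 * math.pi * t / 30) for t in range(30)]
r1 = autocorrelation(wave, 1)
limit = approx_significance_limit(30)  # about 0.365
```

For a 30-point series the approximate limit works out to about 0.365, so an r of .75 sits at roughly twice that level.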

DISCUSSION

Cross Correlation Analysis

Chart Ten

So, the question of ‘statistical’ v. ‘practical’ still remains and is where Random Control Data can be most helpful. Adjacent is a combined time series graph of the data for the 1971 Tropical Cyclone Season compared to Random Control. As can be seen in this comparison, the 1971 TC distribution bears little resemblance to Random Control.

Chart Eleven

A cross correlation analysis confirms that at Lag 0, where the two series are synchronized, no significant correlation is present between the series.

Chart Twelve

The practical value of the LC Model becomes clear when compared to the actual performance of past TC seasons. Of the 51 seasons in my data set, the year 2019 is the closest to a near-perfect match with the LC Model.

Chart Thirteen

As can be seen in this adjacent graph, the cross correlation is quite high at r = .82, and if the TC data were moved just 1 lag (1 day) to the left on either graphic, the r value would go up to .87. And of course, the obtained significance level, the ‘p’ value, is, as has been usual in this study, double goose eggs: p = .00.
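The effect of sliding one series by a lag can be sketched as follows. This version takes the ordinary Pearson correlation over the overlapping days, which is an assumption made for simplicity and differs slightly from the cross correlation function Minitab plots; both series are made up.

```python
import math

def pearson_r(x, y):
    """Ordinary Pearson correlation between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def cross_correlation(x, y, lag):
    """Correlate x[t] with y[t + lag] over the overlapping portion."""
    if lag >= 0:
        return pearson_r(x[:len(x) - lag], y[lag:])
    return pearson_r(x[-lag:], y[:len(y) + lag])

# Hypothetical data: a 30-day model wave and a "season" that is the
# same wave delayed by one day.
model = [math.sin(2 * math.pi * t / 30) for t in range(30)]
season = [0.0] + model[:-1]

r_lag0 = cross_correlation(model, season, 0)
r_lag1 = cross_correlation(model, season, 1)
```

In this toy case the match is already high at Lag 0 and becomes perfect at Lag 1, the same kind of one-day shift that improves the correlation for the 2019 season.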

Chart Fourteen

A basic fact of life in the world of statistical reasoning is: samples vary! In the 51-season data set, 1976 is at the other end of the spectrum in that the storm activity was opposite the LC Model compared to 2019.

Chart Fifteen

The 1976 storm activity is not just different from the LC Model, it is almost completely opposite, showing a high negative correlation and yet another high significance level of (yawn) double goose eggs.

Clearly, the Lunar Cycle does explain some, but not all, tropical cyclone variation.

Other Sources of Variation

Chart Sixteen: Zodoc Model with Lunar Cycle Superimposed

The above chart is an example of an experimental TC forecast model that I have been using to predict hurricanes on my Twitter account this past summer of 2022. It is beyond the intended scope of this study to explain the details of this model; suffice it to say for now that my theory is that storm association with the Lunar Cycle is greatly modulated by the planets in the signs.

LC Model Correlation with Past Seasons

In order to test the significance of the Model correlation between the 51 seasons of observed TC data, the created Random Control Model, and the obtained LC Synodic Model, I ran correlations for all seasons, and the results are summarized as follows. Significance at the .05 level requires r ≥ .365.

Chart Seventeen

Clearly evident is that there are no significant correlations between the 51 TC Seasons and the Random Control Model. Most correlations are close to r = zero.

Chart Eighteen

When the LC Model is tested against the performance of the entire history of reliable NHEM TC data, the correlations are many and the validity of the Model is clearly affirmed.

CONCLUSION

• A 51-season sample of over 2900 tropical cyclones
• A distinct waveform pattern that follows the cycle of the Moon
• Large variances – effect sizes of almost 4 standard deviations
• Hi-significance differences from the mean: p < .001
• Whole pattern is unique, chi-square significance: p < .001
• Unique pattern is not random: p = .00
• Pattern is seasonal (repetitive) Autocorrelation Function: r = .75
• Model presents hi-levels of correlation with past seasons
• There are no associations of the TC data with the random control data

My conclusion is quite simple. The evidence is too extensive and too strong to accept a Null Hypothesis of no effect. Although the Lunar Cycle may as yet explain only a still unknown percentage of tropical cyclone activity, it has been useful at present for estimating and forecasting current storms. As a model, it may also be helpful in the analysis of residuals, which could lead to a better understanding of the effect of the planets in signs in the Zodoc model I have proposed.

That’s all I have for now. Thank you for your kind attention.