How do I calculate the theoretical frequency in a chi-square test?

I'll give your data some labels. The first column is called illness and the second column is called health; the first row is called smoking and the second row is called no smoking.

To calculate the theoretical frequency, you first need the theoretical probability. There are 200 people here, and 82 of them are sick, so the remaining 200 - 82 = 118 are not sick. The theoretical probability of getting sick is therefore 82/200, and the theoretical probability of not getting sick is 118/200.

Then divide these 200 people into two groups, smokers and non-smokers. There are 100 smokers, so according to the theoretical probability, the number of sick people among the 100 smokers should be 100*(82/200) = 41. That is the theoretical frequency for the smoking and illness cell.

Similarly, the theoretical frequency for smokers who are not sick is 100*(118/200) = 59. The other cells are calculated the same way.
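
If it helps, here is a minimal Python sketch of the same arithmetic. It uses only the total of 200, the 82 sick people and the 100 smokers from above; the not-sick count and the non-smoker count are derived as remainders, and the variable names are my own.

```python
# Counts taken from the answer: 200 people, 82 sick, 100 smokers.
total = 200
sick = 82
smokers = 100

# Derived as remainders (assumption: everyone is either sick or healthy,
# and either a smoker or a non-smoker).
healthy = total - sick
non_smokers = total - smokers

# Theoretical probabilities come from the column totals.
p_sick = sick / total
p_healthy = healthy / total

# Theoretical frequency = group size * theoretical probability.
expected = {
    ("smoking", "illness"): smokers * p_sick,
    ("smoking", "health"): smokers * p_healthy,
    ("no smoking", "illness"): non_smokers * p_sick,
    ("no smoking", "health"): non_smokers * p_healthy,
}

for cell, value in expected.items():
    print(cell, round(value, 2))
```

The four theoretical frequencies always add up to the total number of people, which is a quick way to check the calculation.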

Additional information:

Theoretical frequency, also known as expected frequency, is a statistical concept. It is the frequency you would expect in each cell if the theoretical rate held, and it is the reference against which the actual (observed) frequency is compared.

The theoretical frequency $E_{ij}$ of a two-way unordered contingency table equals the product of the row total $O_{i\cdot}$ and the column total $O_{\cdot j}$, divided by the total frequency $n$: $E_{ij} = O_{i\cdot} O_{\cdot j} / n$.
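
The row-total times column-total rule can be sketched as a small function. The function name and the cell counts in `observed` are hypothetical, chosen only so that the rows sum to 100 and the first column sums to 82 as in the example above; they are not data from the question.

```python
def expected_frequencies(observed):
    """Return the table of theoretical frequencies for a table of counts.

    Each entry is (row total * column total) / grand total.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical cell counts: rows are smoking / no smoking,
# columns are illness / health.
observed = [[50, 50],
            [32, 68]]

expected = expected_frequencies(observed)
print(expected)
```

The same function works for any number of rows and columns, since it only relies on the marginal totals.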

The chi-square test for fourfold (2 × 2) table data is used to compare two rates or two constituent ratios.

1. Special formula:

If the four cell frequencies of the fourfold table are a, b, c and d, the chi-square value is $\chi^2 = \dfrac{n(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)}$, where $n = a + b + c + d$ (or use the general goodness-of-fit formula).

Degrees of freedom: $\nu$ = (number of rows - 1)(number of columns - 1) = 1.
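
As a rough check, the special formula can be compared against the general sum of (observed - theoretical)^2 / theoretical over the four cells. This is only a sketch: the function names are mine and the counts a, b, c, d are hypothetical illustration values, not data from the question.

```python
def chi_square_2x2(a, b, c, d):
    """Special formula for a fourfold table with cells a, b (row 1) and c, d (row 2)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi_square_general(observed):
    """General formula: sum of (observed - theoretical)^2 / theoretical over all cells."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            chi2 += (o - e) ** 2 / e
    return chi2


# Hypothetical counts for illustration only.
a, b, c, d = 50, 50, 32, 68

shortcut = chi_square_2x2(a, b, c, d)
general = chi_square_general([[a, b], [c, d]])
degrees_of_freedom = (2 - 1) * (2 - 1)

print(round(shortcut, 3), round(general, 3), degrees_of_freedom)
```

Both routes give the same value for a 2 × 2 table; the special formula just avoids computing the theoretical frequencies explicitly.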

2. Application conditions:

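The sample size must be greater than 40, and the theoretical frequency of every cell must be no less than 5. When the sample size is greater than 40 but some cell has a theoretical frequency of at least 1 and below 5 (1 ≤ T < 5), the chi-square value needs a continuity correction; when the sample size is below 40 or any theoretical frequency is below 1, only an exact probability method can be used.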
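
A sketch of how these conditions are commonly applied in practice, under assumptions I should state clearly: I am using the widely taught convention (uncorrected chi-square when the sample is large and every theoretical frequency is at least 5, a continuity-corrected chi-square when the smallest theoretical frequency is between 1 and 5, and an exact test otherwise), and I treat a sample size of exactly 40 as large. The function names and example counts are hypothetical.

```python
def choose_2x2_method(n, min_expected):
    """Pick a test for a fourfold table from sample size and smallest theoretical frequency.

    Assumption: n >= 40 counts as a large sample; texts differ on whether
    40 itself is included.
    """
    if n < 40 or min_expected < 1:
        return "exact probability method"
    if min_expected < 5:
        return "chi-square with continuity correction"
    return "uncorrected chi-square"


def chi_square_2x2_corrected(a, b, c, d):
    """Continuity-corrected chi-square for a fourfold table (Yates correction)."""
    n = a + b + c + d
    return n * (abs(a * d - b * c) - n / 2) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )


# Hypothetical inputs for illustration only.
print(choose_2x2_method(n=200, min_expected=41.0))
print(choose_2x2_method(n=50, min_expected=3.2))
print(choose_2x2_method(n=30, min_expected=6.0))
print(round(chi_square_2x2_corrected(50, 50, 32, 68), 3))
```

The corrected value is always a little smaller than the uncorrected one, which makes the test more conservative when some theoretical frequencies are small.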

The chi-square test for row × column (R × C) table data is used to compare multiple rates or multiple constituent ratios.

1. Special formula:

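The chi-square value for an R × C table is $\chi^2 = n\left[\left(\dfrac{A_{11}^2}{n_{R1} n_{C1}} + \dfrac{A_{12}^2}{n_{R1} n_{C2}} + \cdots + \dfrac{A_{rc}^2}{n_{Rr} n_{Cc}}\right) - 1\right]$, where $A_{ij}$ is the observed frequency in row $i$ and column $j$, $n_{Ri}$ and $n_{Cj}$ are the corresponding row and column totals, and $n$ is the total frequency.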
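
A minimal sketch of the R × C formula, checked against the general sum of (observed - theoretical)^2 / theoretical. It assumes the formula has the form n * (sum of A_ij^2 / (row total * column total) - 1); the function names and the 2 × 3 table are hypothetical illustration values.

```python
def chi_square_rxc(observed):
    """R x C formula: n * (sum of A_ij^2 / (row total * column total) - 1)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    s = sum(
        observed[i][j] ** 2 / (row_totals[i] * col_totals[j])
        for i in range(len(row_totals))
        for j in range(len(col_totals))
    )
    return n * (s - 1)


def chi_square_from_expected(observed):
    """Reference value: sum of (observed - theoretical)^2 / theoretical."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return sum(
        (observed[i][j] - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i in range(len(row_totals))
        for j in range(len(col_totals))
    )


# Hypothetical 2 x 3 table for illustration only.
table = [[20, 30, 50],
         [30, 30, 40]]

chi2 = chi_square_rxc(table)
degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
print(round(chi2, 3), degrees_of_freedom)
```

For a 2 × 2 table this gives the same value as the fourfold special formula, so the R × C formula can be treated as the general case.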

2. Application conditions:

The theoretical frequency T in each cell must be greater than 5, or the number of cells with 1 < T < 5 must not exceed 1/5 of all cells.
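
A small check along these lines, offered as a sketch only. It assumes the commonly taught rule: no cell may have a theoretical frequency below 1, and at most one fifth of the cells may have a theoretical frequency below 5. The function name and the example tables are hypothetical.

```python
def rxc_conditions_met(expected):
    """Check the assumed application conditions on a table of theoretical frequencies."""
    cells = [t for row in expected for t in row]
    # Assumed rule 1: no theoretical frequency below 1.
    if any(t < 1 for t in cells):
        return False
    # Assumed rule 2: at most 1/5 of the cells may fall below 5.
    small = sum(1 for t in cells if t < 5)
    return small <= len(cells) / 5


# Hypothetical tables of theoretical frequencies.
print(rxc_conditions_met([[41.0, 59.0], [41.0, 59.0]]))
print(rxc_conditions_met([[3.0, 10.0, 10.0], [10.0, 10.0, 10.0]]))
print(rxc_conditions_met([[3.0, 4.0], [10.0, 10.0]]))
```

When the check fails, the usual options are to collect more data or to merge sparse rows or columns before running the test.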

References:

Baidu Encyclopedia-Theoretical Frequency