Simpson's Paradox

Simpson's paradox is one of those phenomena in statistics that are easy to understand and astonishing at the same time. It occurs whenever groups of data each show a certain trend, but that trend is reversed when the groups are combined. A simple example makes the paradox clear quickly.


We consider two disjoint sets \(\#1\) and \(\#2\) together with their union \(G = \#1 \cup \#2\), and examine the success rates of \(A\) and \(B\) within each of these sets:

$$\begin{array}{l|cc|c}
 & A & B & win \\ \hline
\#1 & \frac{1}{1}=100\% & \frac{3}{4}=75\% & A \\
\#2 & \frac{2}{5}=40\% & \frac{1}{3}=33\% & A \\
\#1 \cup \#2 & \frac{3}{6}=50\% & \frac{4}{7}=57\% & B
\end{array}$$

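It turns out that \(A\) is more successful than \(B\) in both \(\#1\) and \(\#2\), yet, surprisingly, \(B\) is more successful than \(A\) in \(G\). This example is also one of those with the smallest set \(G\), namely \(|G|=13\): there is no such \(G\) with \(|G|<13\) (proof by brute force). This minimality presumably assumes at least one success in every cell, because with zero successes allowed a smaller reversal exists: \(A\) with \(\frac{1}{1}\) and \(\frac{1}{4}\) against \(B\) with \(\frac{2}{3}\) and \(\frac{0}{1}\) gives \(\frac{2}{5}=40\%\) against \(\frac{2}{4}=50\%\) in the union, with \(|G|=9\).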
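The reversal in the two-set example can be checked with exact fractions, and the brute-force idea mentioned above can be sketched in a few lines. This is only a minimal sketch, not the search actually used for the text: the function names `is_reversal` and `search` and the parameter `min_successes` are assumptions introduced here. In particular, `min_successes` decides whether cells with zero successes are admitted, which changes which totals allow a reversal.

```python
from fractions import Fraction
from itertools import product


def rate(successes, trials):
    """Exact success rate, so that comparisons are free of rounding errors."""
    return Fraction(successes, trials)


def is_reversal(a1, b1, a2, b2):
    """Each argument is a (successes, trials) pair for one cell of the table.

    True if A beats B in set #1 and in set #2, but B beats A in the union.
    """
    a_union = (a1[0] + a2[0], a1[1] + a2[1])
    b_union = (b1[0] + b2[0], b1[1] + b2[1])
    return (rate(*a1) > rate(*b1)
            and rate(*a2) > rate(*b2)
            and rate(*a_union) < rate(*b_union))


def search(total, min_successes=1):
    """Yield all tables (a1, b1, a2, b2) with exactly `total` trials that reverse.

    `min_successes` is an assumption of this sketch: the smallest number of
    successes allowed in any single cell.
    """
    for n1, m1, n2 in product(range(1, total), repeat=3):
        m2 = total - n1 - m1 - n2
        if m2 < 1:
            continue  # every cell needs at least one trial
        sizes = (n1, m1, n2, m2)
        counts = [range(min_successes, n + 1) for n in sizes]
        for successes in product(*counts):
            cells = tuple(zip(successes, sizes))
            if is_reversal(*cells):
                yield cells


# The example from the table: A = 1/1 and 2/5, B = 3/4 and 1/3.
example = ((1, 1), (3, 4), (2, 5), (1, 3))
print(is_reversal(*example))  # True
```

Calling `list(search(n))` for each `n` below 13 is one way to re-examine the minimality claim under the chosen value of `min_successes`.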

Now we split the set \(G\) not into \(2\) but into \(3\) pairwise disjoint parts \(\#1, \, \#2, \, \#3\) with \(\#1 \cup \#2 \cup \#3 = G\). We then construct the interesting case in which the following holds for all non-empty elements \(e_k \neq \emptyset\) of the power set \(P(G)\), where each \(e_k\) is read as a union of parts and \(|e_k|\) as the number of parts it combines:
$$\forall e_1, e_2 \in P(G): \left(|e_1| \neq |e_2| \Rightarrow win(e_1) \neq win(e_2)\right) \land \left(|e_1| = |e_2| \Rightarrow win(e_1) = win(e_2)\right)$$
In other words, the winner depends on the number of combined parts and on nothing else. This requires a third candidate \(C\), since three different sizes need three different winners.
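The text does not define the function \(win\) explicitly. One reading that is consistent with the tables, stated here as an assumption, treats each \(e\) as a non-empty collection of parts and picks the candidate with the highest pooled success rate. The symbols \(s_X(p)\) and \(t_X(p)\), the number of successes and the number of trials of candidate \(X\) in part \(p\), are introduced only for this sketch:
$$win(e) = \arg\max_{X \in \{A, B, C\}} \frac{\sum_{p \in e} s_X(p)}{\sum_{p \in e} t_X(p)}, \qquad \emptyset \neq e \subseteq \{\#1, \#2, \#3\}$$
Under this reading the successes and trials are summed before dividing, which is exactly why the pooled rate can favour a different candidate than every individual part does.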

After a few hours of brute-force search on an ordinary Core i7, the following example can be found:

$$\begin{array}{l|ccc|c}
 & A & B & C & win \\ \hline
\#1 & \frac{6}{7}=85.71\% & \frac{12}{15}=80.00\% & \frac{22}{37}=59.46\% & A \\
\#2 & \frac{95}{167}=56.89\% & \frac{48}{88}=54.55\% & \frac{38}{67}=56.72\% & A \\
\#3 & \frac{48}{144}=33.33\% & \frac{16}{50}=32.00\% & \frac{2}{20}=10.00\% & A \\
\#1 \cup \#2 & \frac{101}{174}=58.05\% & \frac{60}{103}=58.25\% & \frac{60}{104}=57.69\% & B \\
\#1 \cup \#3 & \frac{54}{151}=35.76\% & \frac{28}{65}=43.08\% & \frac{24}{57}=42.11\% & B \\
\#2 \cup \#3 & \frac{143}{311}=45.98\% & \frac{64}{138}=46.38\% & \frac{40}{87}=45.98\% & B \\
\#1 \cup \#2 \cup \#3 & \frac{149}{318}=46.86\% & \frac{76}{153}=49.67\% & \frac{62}{124}=50.00\% & C
\end{array}$$
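The three-part table can be re-checked mechanically by enumerating every non-empty combination of parts and pooling the counts. The following is a minimal sketch with exact fractions; the names `PARTS`, `win` and `winners_by_size` are chosen here for illustration, and the counts are copied from the first three rows of the table.

```python
from fractions import Fraction
from itertools import combinations

# (successes, trials) per candidate and part, copied from the table.
PARTS = {
    "#1": {"A": (6, 7), "B": (12, 15), "C": (22, 37)},
    "#2": {"A": (95, 167), "B": (48, 88), "C": (38, 67)},
    "#3": {"A": (48, 144), "B": (16, 50), "C": (2, 20)},
}


def win(subset):
    """Return the candidate with the highest pooled success rate on `subset`."""
    rates = {}
    for candidate in "ABC":
        successes = sum(PARTS[p][candidate][0] for p in subset)
        trials = sum(PARTS[p][candidate][1] for p in subset)
        rates[candidate] = Fraction(successes, trials)
    best = max(rates.values())
    winners = [c for c, r in rates.items() if r == best]
    if len(winners) != 1:
        raise ValueError(f"tie on {subset}: {winners}")
    return winners[0]


def winners_by_size():
    """Map each number of combined parts k to the set of winners over all k-part unions."""
    return {
        k: {win(subset) for subset in combinations(PARTS, k)}
        for k in range(1, len(PARTS) + 1)
    }


print(winners_by_size())  # {1: {'A'}, 2: {'B'}, 3: {'C'}}
```

Each size maps to exactly one winner and the three winners differ, which is the property demanded by the condition on \(win\). Exact fractions matter here: \(\frac{143}{311}\) and \(\frac{40}{87}\) both round to \(45.98\%\) and only differ in later digits.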

In the same way (given finite computing time), examples with analogous behaviour can be found for partitions into \(n\) subsets. When such situations occur in reality, any conclusion drawn from the success rate of a group is as meaningful as it is meaningless.

At this point we recommend the fascinating book Causality: Models, Reasoning and Inference by Judea Pearl.
