Teaching


Web sites about mathematics should help people understand and appreciate mathematics, not confuse the crap out of them with misinformation. Unfortunately, Wolfram MathWorld does the latter.

Example 1. MathWorld explains here that “The numbers of palindromic numbers less than a given number are illustrated in the plot [below].”

PalindromicNumbers_800

So the left plot tells us that there are about 100 palindromes less than or equal to 20. But there are only 21 nonnegative integers less than or equal to 20, so there can’t be 100 palindromes among them. In fact, there are 11 palindromes less than or equal to 20: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, and 11. My guess is that the left plot illustrates the n-th palindromic number as a function of n. In any case, it’s not what MathWorld describes.

MathWorld begins its list of the “first few palindromic numbers” with 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 (these 10 numbers are palindromes and are all less than 10), but in the next paragraph, MathWorld states that the number of palindromic numbers less than 10 is 9. There are 9 if you don’t count zero for some strange reason, but if you don’t intend to, give a definition that excludes it (MathWorld’s definition is less than clear), and then don’t list it.
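The counts above are easy to check by brute force. Here is a minimal Python sketch, assuming the usual definition (a nonnegative integer whose decimal digits read the same forwards and backwards); the function names are my own, chosen for illustration.

```python
def is_palindrome(n: int) -> bool:
    """Return True if the decimal digits of n read the same in both directions."""
    digits = str(n)
    return digits == digits[::-1]


def palindromes_up_to(limit: int, include_zero: bool = True) -> list[int]:
    """List the palindromic numbers from 0 (or from 1) up to and including limit."""
    start = 0 if include_zero else 1
    return [n for n in range(start, limit + 1) if is_palindrome(n)]


print(palindromes_up_to(20))                          # 0 through 9, and 11
print(len(palindromes_up_to(20)))                     # 11
print(len(palindromes_up_to(9)))                      # 10, counting zero
print(len(palindromes_up_to(9, include_zero=False)))  # 9, not counting zero
```

Whether the answer for "less than 10" is 10 or 9 depends only on the `include_zero` choice, which is exactly the choice a definition ought to make explicit.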

Still confused? Read the Wikipedia article.

Example 2. Pascal’s Triangle should be hard to screw up, right? Wrong. Here’s MathWorld’s Pascal’s Triangle:

NumberedEquation2

This triangle needs to go to the shop for an alignment. The numbers are neither lined up in columns nor staggered (the latter being the usual presentation). What are the numbers in the column containing the rightmost 4? What numbers are along the diagonal through the top? (1, 1, 1, 1, 1, 5, 6?) As shown, MathWorld’s anyway-ill-worded “each subsequent row is obtained by adding the two entries diagonally above” is meaningless.
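For comparison, here is a rough Python sketch of the usual staggered presentation, in which "the two entries diagonally above" actually means something. The helper names are mine, and centering each row is only one simple way to stagger; with multi-digit entries the alignment is approximate.

```python
def pascal_rows(count: int) -> list[list[int]]:
    """Return the first `count` rows of Pascal's triangle."""
    rows = [[1]]
    for _ in range(count - 1):
        prev = rows[-1]
        # Pad the previous row with a zero on each side, so every new entry
        # is the sum of the two entries diagonally above it.
        rows.append([a + b for a, b in zip([0] + prev, prev + [0])])
    return rows[:count]


def staggered(rows: list[list[int]]) -> str:
    """Format the rows centered on one another, the usual staggered layout."""
    lines = [" ".join(str(x) for x in row) for row in rows]
    width = len(lines[-1]) if lines else 0
    return "\n".join(line.center(width).rstrip() for line in lines)


print(staggered(pascal_rows(5)))
```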

Example 3. In its article on Mersenne numbers (numbers that are one less than a power of two), MathWorld attempts to explain why “[i]n order for the Mersenne number [2^n-1] to be prime, n must be prime.” MathWorld’s justification: “This is true since for composite n with factors r and s, n = rs. Therefore, 2^n-1 can be written as 2^(rs)-1, which is a binomial number and can be factored.” That’s sloppy to say the least. First, if a composite number n has factors r and s, it’s not necessarily the case that n = rs. Furthermore, the fact that a number can be factored doesn’t prove it’s composite. Every Mersenne number 2^n-1 can be factored. It’s just that when n is composite, there’s definitely a factorization into positive integers neither of which equals 1. Explaining it isn’t hard: In order for 2^n-1 to be prime, n must be prime. For if not, n = rs where r and s are integers greater than 1 and less than n; then 2^n-1 = 2^(rs)-1 has a factor between 1 and 2^n-1, namely 2^r-1.
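The divisibility claim follows from the identity x^s - 1 = (x - 1)(x^(s-1) + ... + x + 1) with x = 2^r. As a quick sanity check, here is a small Python sketch; the function names are hypothetical, chosen for illustration.

```python
def mersenne(n: int) -> int:
    """Return the Mersenne number 2^n - 1."""
    return 2**n - 1


def factor_from_composite_exponent(r: int, s: int) -> tuple[int, int]:
    """For n = r*s with r, s > 1, return (2^r - 1, cofactor) whose product is 2^n - 1."""
    if r < 2 or s < 2:
        raise ValueError("r and s must both be greater than 1")
    small = mersenne(r)
    quotient, remainder = divmod(mersenne(r * s), small)
    # The identity above guarantees that the division is exact.
    assert remainder == 0
    return small, quotient


print(factor_from_composite_exponent(2, 3))  # 2^6 - 1 = 63 = 3 * 21
print(factor_from_composite_exponent(3, 2))  # 2^6 - 1 = 63 = 7 * 9
```

Both factors returned are strictly between 1 and 2^n - 1, which is what makes the number composite rather than merely "factorable."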

Example 4. MathWorld describes prime numbers as “numbers that cannot be factored.” Prime numbers, like all integers, however, can be factored, and elsewhere, MathWorld gives the factorization of several prime numbers, such as 7: 7 = 7×1.

Example 5. Any of MathWorld’s articles on statistics.

In the article on the Central Limit Theorem, what is lowercase n? What is f? The “limiting cumulative distribution function” of Xnorm is limiting in the sense of what approaching what? (It’s not clear to me that MathWorld’s statement of the theorem is even correct, but it’s clearly unclear.)

The article “explaining” the p-value has perhaps the worst definition of p-value I’ve ever seen when not grading exams. MathWorld says it’s “[t]he probability that a variate would assume a value greater than or equal to the observed value strictly by chance: P(z > z_observed)” (wrong). Wikipedia says “In statistical hypothesis testing, the p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true” (right).

In today’s number news (State-by-state cremation rates in U.S.), we learn that “slightly more than a third of all persons who died in 2006 were cremated, according to the Cremation Association of North America.” Happily, the article contained the raw data, but only as an alphabetical-by-state table of numbers.

Here’s an illumination, as MapPoint is my amanuensis. Click to embiggen.

Deaths2006 

Explanation: Pie areas are proportional to the number of deaths; the yellow slice is cremations, the red non-cremations.

Pies for our nation’s two newest states are not shown. Alaska’s looks like a two-thirds size Vermont pie; Hawai’i’s looks like a one-third size Oregon pie.

2 Responses to “To Die, Perchance To Cremate”

  1. Mike Says:

    Corollary: If you want to live longer, move to Wyoming. Fewer people die there than most other states.

  2. Using MapPoint to see Deaths - MapPoint Forums Says:

    [...] MapPoint to see Deaths Steve Kass To Die, Perchance To Cremate Good use of pie charts. Eric __________________ ~ Now taking orders for MapPoint 2010 ~ ~~ ~ [...]

CheesyBakedPasta

Today’s eLetter from the folks at Fine Cooking began “Baked Pasta 259,200 Ways. We did the math.” As you can imagine, I did the Baked Pasta Recipe Maker math, too. I figure it’s 16,128,000 ways, or about 60 times the number Fine Cooking found when they did the math. Here’s the calculation. The recipe maker walked me through the following steps:

  • Choose one or two of four Flavor Bases
  • Choose one of three Sauces
  • Choose two or three of nine Sauce Enhancers
  • Choose one of eight Pastas
  • Choose zero, one, or two of five Vegetables
  • Choose two or three of six Cheeses

Assuming no choice combinations are forbidden (the recipe maker doesn’t appear to prevent you from adding olives and sherry vinegar to sausage and chicken in pink sauce, for example), you find the total number of different ways to make a choice at every step by multiplying together the numbers of choices at each step.

It’s easy to count the number of ways to “choose this many of those Things.” If this many is k, and those Things are n in number, the number of ways to choose k of the n things is “n choose k,” sometimes written as C(n,k). These numbers can all be found in Pascal’s triangle. As it’s shown here, C(n,k) is in the row labeled with the n value, under the column labeled with the k value. Here’s how to use the triangle to find the value of C(9,3):

Pascal

  • To choose one or two of the four Flavor Bases, there are C(4,1) = 4 ways to choose one plus C(4,2) = 6 ways to choose two, for a total of 10 ways to choose this item.
  • To choose one of the three Sauces, there are C(3,1) = 3 ways.
  • To choose two or three of the nine Sauce Enhancers, there are C(9,2) = 36 ways to choose two plus C(9,3) = 84 ways to choose three, for a total of 120 ways.
  • There are C(8,1) = 8 ways to choose a pasta.
  • There are C(5,0) + C(5,1) + C(5,2) ways to choose up to two vegetables, or 1 + 5 + 10 = 16 ways.
  • There are C(6,2) + C(6,3) ways to choose the Cheeses, or 15 + 20 = 35 ways.

Multiplying these numbers of choices for each step yields 10·3·120·8·16·35 = 16,128,000 ways, about 60 times as many as Fine Cooking found when they did the math. Counting ways isn’t standard recipe math, and I’d like to note that Fine Cooking’s math is generally fine when it comes to ounces, grams, cups, servings, and calories.
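The whole calculation can be reproduced in a few lines of Python. This is a sketch under the same assumption as above (no combination is forbidden); the `steps` table simply restates the recipe maker's six steps, and the variable names are mine.

```python
from math import comb, prod

# Each step: (number of items available, allowed numbers of items to pick).
steps = {
    "Flavor Bases": (4, [1, 2]),
    "Sauces": (3, [1]),
    "Sauce Enhancers": (9, [2, 3]),
    "Pastas": (8, [1]),
    "Vegetables": (5, [0, 1, 2]),
    "Cheeses": (6, [2, 3]),
}


def ways_per_step(steps: dict[str, tuple[int, list[int]]]) -> dict[str, int]:
    """For each step, add up C(n, k) over every allowed pick size k."""
    return {name: sum(comb(n, k) for k in ks) for name, (n, ks) in steps.items()}


per_step = ways_per_step(steps)
total = prod(per_step.values())

print(per_step)                     # 10, 3, 120, 8, 16, 35 ways per step
print(total)                        # 16128000
print(round(total / 259_200, 1))    # 62.2, the ratio to Fine Cooking's figure
```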

2 Responses to “Cooking Fine, Counting Not So Much”

  1. Sarah Breckenridge Says:

    Hi Steve,
    Thanks for checking our math–and you’re correct in your calculation of the absolute maximum number of combinations for this recipe maker.

    We ran the permutation two different ways: on the lower numbers of the spectrum as well as on the higher number. We decided to go with the lower number in our headline, since, well, 259,200 is more pasta than I’ll ever get to in my lifetime (don’t know about you!).

    4 Flavor Bases (1 choice)

    3 Sauces (1 choice)

    9 Sauce Enhancers (2 choices)

    8 Pasta (1 choice)

    5 Vegetables (1 choice)

    6 Cheese (2 choices)

    4 x 3 x ((9 x 8)/2) x 8 x 5 x ((6 x 5)/2) = 259,200

    Now maybe you can help us grapple with an even trickier question: how many of these combinations do you think are actually tasty? :-)

  2. Steve Kass Says:

    Thanks for stopping by and resolving the mystery of the pasta number, Sarah.

    As for how many of these pasta combinations are tasty? That’s an easy one for me to calculate: lots and lots and lots! No mystery at all. :)

    Steve (happy subscriber of Fine Cooking since 1995)

Scientific American, you ruined my day, but thanks, I needed it.

Silly me for thinking the Math Wars ended when Mathland bit the dust a couple of years ago. Last May, according to this month’s Scientific American, the Seattle School Board adopted the “Discovering Mathematics series, a reform-math high school text that uses student investigations as a means of discovering math principles—such as using toothpick models to derive recursive sequences.”

I looked at it for as long as my stomach could bear — at least at the one chapter that’s available online as a .pdf file here. It’s wretched. Wrong. Not only wrong like in I-don’t-like-it wrong (which it also is), but falselike wrong. And bad, stupid, dumb, and foolish, among other things. It would take me too long to point out all the things wrong in just the first few pages. (I won’t lie. There were some good things, but not many.)

I don’t think the students who wouldn’t have gotten much out of mathematics curricula in the ‘60s will do any better with this. For the students who want to learn mathematics, unfortunately, school will be even more of a waste than it used to be. They should do their best (especially if they go to public school in Seattle) to learn mathematics from the Internet, which is not nearly so wrong as Discovering Mathematics. With luck, any poor grades they get in stupid reform math courses won’t count against them, and if College Board caves and reforms the SAT to correlate with grades in stupid reform math courses, there will hopefully still be pressure for them to keep the AP and SAT II tests. If everything falls apart, kids that like math can drop out of school, learn from the Internet, then make a living tutoring the hapless victims of the new reform math.

Oh, and if you ever see an elevator whose “control panel displays ‘0’ for the floor number,” when it’s at the basement, please take a photo and send it to me.

Early this morning, Wikileaks began posting alphanumeric pager messages from four carriers (Arch, Metrocall, Skytel, and Weblink_B) that were intercepted during a 24-hour period beginning early on September 11, 2001. Alphanumeric pager messages are unencrypted, and, like communications over a public 802.11 wireless network, they’re skimmable with the right (and not exotic) software and hardware.

  • “Due to today’s tragic events, it makes sense to cut back wherever feasible on payroll. Expect a very light business day. Please call all stores and review payroll issues”
  • “RING ALL CHICAGO AIPORTS AND EVERY MAJOR BUILDING DOWNTOWN. BUSH IS DOING A SPEECH.  THIS IS SERIOUS POOH..”
  • “Holy crap, are you watching the news.”
  • “I hope you have gone home by now. The BoA tower and space needle here are closed. I suspect tall buildings across the country will be closed. Take care my love.-cb”

This might be the most interesting public data mine since the AOL breach. The total volume is far less, but unlike the AOL data, this data hasn’t been anonymized. There are full names, phone numbers, and other identifying information in the mix.

Need to run a one-sample t-test or z-test? Here’s a little calculator written in Excel to help you out.

HypTest

Need to calculate z-scores, percentiles, or scores based on a normal distribution? Here’s a little calculator for that, too.

Normal
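If you would rather not open Excel, roughly the same calculations can be sketched in Python with the standard library. This is not a port of the spreadsheets, only an illustration: the function names are mine, and I've left out the t-test because the standard library has no t distribution (a package such as SciPy would be needed for that).

```python
from math import sqrt
from statistics import NormalDist


def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies above the mean."""
    return (x - mean) / sd


def percentile(x: float, mean: float, sd: float) -> float:
    """Percentage of a normal(mean, sd) population that falls at or below x."""
    return 100 * NormalDist(mean, sd).cdf(x)


def score_at_percentile(p: float, mean: float, sd: float) -> float:
    """Score below which p percent of a normal(mean, sd) population falls (0 < p < 100)."""
    return NormalDist(mean, sd).inv_cdf(p / 100)


def one_sample_z_test(sample_mean: float, mu0: float, sigma: float, n: int) -> tuple[float, float]:
    """Two-sided one-sample z-test with known sigma; returns (z, p-value)."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Probability, assuming the null hypothesis, of a z at least this extreme.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


print(z_score(130, 100, 15))               # 2.0
print(percentile(130, 100, 15))            # about 97.7
print(one_sample_z_test(105, 100, 15, 36)) # z = 2.0, p about 0.0455
```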

The following subheadline on the Scientific American website caught my eye today (and not only because of the missing period):

New research makes the case for hard tests, and suggests an unusual technique that anyone can use to learn

I may be a bit thick, because neither the article nor the research paper it mentioned suggested any unusual technique to me. But this was better than my last wild goose chase reading episode, when I vainly sought a footnote on a cereal box (there was a dagger: †, but no footnote. Can you believe that?).

Henry Roediger and Bridgid Finn, the Scientific American article’s authors, write that researchers Kornell, Hays, and Bjork found that “learning becomes better if conditions are arranged so that students make errors.” There’s that pesky word “better.” Better than what? The eternal unanswered question. My guess is that Scientific American is reporting that Kornell et al. have found that learning under a) conditions arranged so that students make errors is better than learning under b) conditions arranged so that students do not make errors. In other words, that the researchers found errorful learning to be better than errorless learning. Not that it’s a bad article, but it would be nice if Roediger and Finn had stated what they’re reporting a bit more clearly. (This is why I give writing assignments to my statistics students. By the end of the semester, they better learn not to use adjectives like better without answering “Better than what?”.)

Anyway, Kornell et al. do mention errorless learning in their paper, recently published in the Journal of Experimental Psychology: Learning, Memory and Cognition® (yes, the name of the journal is a registered trademark), but they don’t study it. The abstract notes that they examine the question of “what happens when one cannot answer a test question—does an unsuccessful retrieval attempt impede future learning or enhance it?” Kornell et al. didn’t exactly examine this question either, because they didn’t (and possibly couldn’t) isolate what part of the learning in their scenario was “future” learning. In addition, they only studied learning after wrong answers, so one must be careful not to assume their research sheds light on getting test questions wrong vs. getting them right. (Suppose a researcher reported that “Student learning among African-Americans is enhanced when they are given test questions they cannot answer.” If the researcher only studied African-Americans and made no comparison to other populations, the reported finding might easily be misinterpreted.)

What Kornell et al. did was compare two scenarios for learning previously unknown information. One scenario was unsuccessful retrieval attempts (the students were asked to provide the not-yet-learned information as answers to test questions, and they answered incorrectly). In this scenario, the retrieval attempt was followed by feedback that included a brief presentation of the new information (i.e., the correct test question answer). The second scenario was a longer-lasting presentation of the new information with no retrieval attempt (the students were not asked to answer a test question, and it’s unclear in some of the experiments whether the students knew what kind of test question they would later be asked). Not surprisingly, unsuccessful retrieval attempts enhanced learning (as measured by scores on a test containing questions like those in the retrieval attempt), when compared to presentation of new information with no retrieval attempts. Despite the Scientific American article’s subheadline, this research makes no case that “hard tests” are better for learning than non-hard tests. They may be, but this research doesn’t help us figure it out. The research does support the value of tests, hard or not-hard, so long as there’s feedback with the right answer.

One Response to “Using flashcards is better than just reading them”

  1. Andrew Willett Says:

    That’s what happens when you let Björk work on your research project. Half your research budget gets spent on crazy outfits and acts of visionary musical weirdness, which means you have to cut corners elsewhere.

    † (That would have driven me crazy as well.)

Almost every semester, I use the AOL Breach data as a point of departure for something in at least one of my classes. The data is fascinating. Most data is fascinating, but this data is particularly so: at once shocking, funny, creepy, poignant, sad, frightening, noble, ignoble, shrewd, and lewd. It’s also rich in the way data can be rich. Its completeness makes it ripe for discovering all sorts of patterns of human thought and behavior: for a sample of several thousand AOL accounts, it includes the complete account search history during March, April, and May of 2006, with timestamped search strings and the result rank and destination of click-throughs.

It’s AOL data week in one of my classes now. This morning, I proposed several nontrivial questions about the data that could be answered with SQL queries. We looked at the results and discussed what they might say about the unwitting study subjects. Then I asked my students to suggest some questions of their own. What are the typical time-of-day and day-of-week patterns of an individual AOL customer’s searches? Are there identifiable differences in the patterns (and by extension in the sleep, social, and perhaps employment or school behavior) of people whose searches included, say, “britney”? For what kinds of searches do users most often click through several pages of results? And so on.

One of my students suggested an excellent simple question. What are the most common searches of the form “how to …”? Out of millions of queries in the AOL data, there were many thousands of “how to … ?” searches. The most frequent was “how to tie a tie,” requested 92 times by a total of 47 distinct users. The rest of the top ten (in terms of most distinct users asking the question) were how to write a resume, gain weight, have sex, get pregnant, write a book, write a bibliography, start a business, lose weight, and make money, each sought by a dozen or more different people. AOL converted the queries to lower case and removed much of the punctuation, but they didn’t correct spelling. Consequently, how to masterbate and how to masturbate appear separately at ranks 49 and 51 respectively. The question would have nearly hit the top 10 without the misspellings.
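A query along these lines answers the “how to” question. The sketch below uses Python's built-in sqlite3 module so that it runs anywhere; the table name `searches`, its two columns, and the handful of rows are all made up for illustration and are not the real AOL data or the schema we used in class.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified schema: one row per search submitted.
conn.execute("CREATE TABLE searches (anon_id INTEGER, query TEXT)")
conn.executemany(
    "INSERT INTO searches VALUES (?, ?)",
    [
        # Toy rows, invented for this example.
        (1, "how to tie a tie"),
        (1, "how to tie a tie"),
        (2, "how to tie a tie"),
        (2, "how to write a resume"),
        (3, "how to write a resume"),
        (3, "weather"),
    ],
)

top_how_to = conn.execute(
    """
    SELECT query,
           COUNT(*)                AS times_asked,
           COUNT(DISTINCT anon_id) AS distinct_users
    FROM searches
    WHERE query LIKE 'how to %'
    GROUP BY query
    ORDER BY distinct_users DESC, times_asked DESC, query
    LIMIT 10
    """
).fetchall()

for row in top_how_to:
    print(row)
```

Ranking by `COUNT(DISTINCT anon_id)` rather than `COUNT(*)` keeps one persistent user from pushing a query up the list, which is why the two counts can differ as much as 92 and 47.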

Here’s a PDF file of the top 1000 “how to” queries submitted through AOL explorer by a sample of AOL users in the spring of 2006. You can probably guess that it’s not safe for work. Although there are no pictures, plenty of sex, drugs, and gambling is spelled out, and there are more than a few questions likely to offend in one way or another. Have a look.

2 Responses to “#836. How to be a sex goddess”

  1. Greg Everitt Says:

    Wow professor, this list is… people are interesting, is all I’m saying.

  2. Steve Kass » Why, why, why? Says:

    [...] AOL data (see #836. How to be a sex goddess) was a little thin on "why is he" queries, but a broader "why is" search [...]

A colleague recently drew my attention to this CNN.com article, titled “Confident students do worse in math; bad news for U.S.”

It’s interesting, I guess, but the analysis is flawed.

The article reports the results of a Brookings Institution study based on the 2003 Trends in International Mathematics and Science Study (TIMSS).

It’s the culture, stupid.

Where to start? First off, the countries with the best average math scores were East Asian countries. That confirms other studies and general perceptions, and it didn’t surprise anyone. Obviously, then, anything that correlates with East Asianness (straightness of hair, size of epicanthal folds, or facility in Chinese or Japanese) will also correlate with math scores when East Asian country-wide averages are compared with other country averages. One trait that correlates with East Asianness is self-effacement. Self-effacement tends to be highly valued in countries like Taiwan and Japan, but praising one’s self is not, and there was a survey question that measured self-effacement: “How strongly do you agree with the statement ‘I usually do well in mathematics.’”

It shouldn’t have surprised anyone that students in East Asian countries were less likely to answer “strongly agree” when asked how much they agree with the statement “I usually do well in mathematics,” so it shouldn’t have surprised anyone that country-wide averages on this survey question correlated (inversely) with mathematical ability. The correlation is simply explained by culture and the known difference between countries in mathematical ability.

Hurting the chances for a fair airing in the press (if there is ever any chance), the Brookings analysts repeatedly confuse expressed self-confidence with true self-confidence, too: “In the TIMSS data, when one looks at the math scores of students within each country, those who express confidence in their own math abilities do indeed score higher than those lacking in confidence,” and “The world’s most confident eighth graders are found…”, and “students in [East Asian] countries do not believe that they do very well in math.” Combine this with the fact that while the Brookings folks do notice the culture question, they dance around it enough to let reporters come away with conclusions that are a good six fallacious leaps away from the data and statistics, like “Happy, confident students do worse in math” headlining an article by Associated Press education writer Ben Feller. Nothing in the data suggests that confidence in mathematical ability is inversely related to actual mathematical ability at the level of individuals, but the headline gives that impression, strongly.

Another problem with the study is that it commits the ecological fallacy. It speculates about how confidence and ability are related in individual students from country-wide aggregate data, and the press drag these wrong conclusions in the wrong direction. The aggregate data is screwy to begin with, given the very strange list of countries surveyed. Many have small populations: Jordan, Israel, Bahrain, Cyprus, Latvia, the Palestinian Authority, Moldova, Lithuania, Scotland, Norway, Flemish Belgium, Botswana, Macedonia, and Serbia, and a few have large populations: Russian Federation, Japan, Indonesia, Malaysia, the United States, and the Philippines, but each country’s average mathematics score counted the same in the statistical analysis to which the press is paying the most attention. The country list was hardly comprehensive, either. Chile was the only American country on the list besides the U.S.

A regression weighted by population would have been a little better, but drawing any conclusions about individual students from country-wide averages is invalid. Here are a couple of good articles about this fallacy: (link) (link).

The TIMSS data contains plenty of useful information to support a conclusion opposite to the one reported. The data showed that all other things being equal (that is, among students within any one individual country) higher student confidence (which, ceteris paribus, should now correlate with expressed confidence) in math ability tended to be associated with higher math ability. The Brookings folks observed this, and they called it paradoxical: “So an interesting paradox emerges from the international data on student confidence and achievement. The relationships are the opposite depending on whether within-nation or between-nation data are examined.”

When Brookings writes “The international evidence makes at least a prima facie case that self-confidence, liking the subject, and relevance are not essential for mastering mathematics at high levels,” it’s easy to think they are suggesting that the ecological fallacy analysis is telling us something, and they are diminishing the different conclusions supported by the ceteris paribus analysis. And the reporters take the hook. I wonder how good the press corps and the Brookings researchers think they are at statistics.

One Response to “Math is fun, and that’s ok.”

  1. Steve Kass » Lost and Found Says:

    [...] Math is fun, and that’s ok. [...]
