… that teaching mathematical concepts with confusing real-world examples is not a good idea.

Last month, the journal Science published an article about learning mathematics. Many newspapers and magazines picked up the story and pitched it at readers, summarizing the research result more or less like ars technica did [here]: “[S]tudents who learn through real-world examples have a difficult time applying that knowledge to other situations.”

This doesn’t agree with my experience in the classroom, and I sought the source. The full text of the Science article isn’t free, and the following remarks are based on what seems to be an earlier report of the same research by the same authors: “Do Children Need Concrete Instantiations to Learn an Abstract Concept?,” in the Proceedings of the XXVIII Annual Conference of the Cognitive Science Society [accessed at http://cogdev.cog.ohio-state.edu/fpo644-Kaminski.pdf on May 6, 2008.] Here the Ohio State researchers described their experiment.

In the first phase, subjects learned a mathematical concept.

[The concept] was a commutative group of order three. In other words, the rules were isomorphic to addition modulo three. The idea of modular arithmetic is that only a finite number of elements (or equivalent [sic] classes) are used. Addition modulo 3 considers only the numbers 0,1, and 2. Zero is the identity element of the group and is added as in regular addition: 0 + 0 = 0, 0 + 1 = 1, and 0 + 2 = 2. Furthermore, 1 + 1 = 2. However, a sum greater than or equal to 3 is never obtained. Instead, one would cycle back to 0. So, 1 + 2 = 0, 2 + 2 = 1, etc.
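The quoted rules are easy to sanity-check in a few lines of Python (my own sketch, not part of the study):

```python
# Addition modulo 3: only the elements 0, 1, and 2 exist;
# any sum of 3 or more cycles back around through 0.
def add_mod3(a, b):
    return (a + b) % 3

# The group rules quoted above:
assert add_mod3(0, 2) == 2   # 0 is the identity
assert add_mod3(1, 1) == 2
assert add_mod3(1, 2) == 0   # cycles back to 0
assert add_mod3(2, 2) == 1
```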

Subjects learned this concept through one of two scenarios: a scenario using geometric symbols with “no relevant concreteness” or a scenario “with relevant concreteness” using measuring cups containing various levels of liquid.

Ok, I’ll bite. I was a New Math child, and Mrs. Szeremeta taught us modular arithmetic with (analog) clocks. A “liquid left over” scenario with measuring cups ought to work, too. Most students “know” the idea of measuring cups.

Mrs. Szeremeta’s clocks worked, because the clock scenario contained “relevant concreteness,” and because its concreteness was familiar. For a concrete example to be a good teaching tool for an abstract concept, the concreteness has to be both relevant and familiar. Relevant means the concrete instantiation has to behave in real life more or less according to the rules of the abstract concept. Familiar means students know or can quickly learn how it works in real life. As the authors correctly observe, “the perceptual information communicated by the symbols themselves can act as reminders of the structural rules.”

Oops.

In addition to scenarios “with no concreteness” and “with relevant concreteness,” other kinds of scenarios exist: ones “with confounding concreteness,” “with irrelevant concreteness,” or “with distracting concreteness.” A scenario with confounding concreteness is one that draws on the familiar, but where the familiar behavior works contrary to the rules of the abstract concept.

Here is the authors’ concrete scenario:

To construct a condition that communicates relevant concreteness, a scenario was given for which students could draw upon their everyday knowledge to determine answers to test problems. The symbols were three images of measuring cups containing varying levels of liquid. Participants were told they need to determine a remaining amount when different measuring cups of liquid are combined.

So far so good, but unfortunately, the three images used to represent the equivalence classes were those of a 1/3-full, a 2/3-full, and a full measuring cup, representing the equivalence classes [1], [2], and [0] respectively.

Yes, the authors used a full measuring cup, not an empty one, to represent the additive identity, zero. The upshot of the experiment, then, in my opinion, was to compare an abstract implementation (geometric symbols with no concreteness) with a second implementation having both relevant and confounding concreteness. Relevant, because combining liquid and considering remainders works like addition in this group, but confounding, because one notion of the abstract concept is the idea of an additive identity (zero, nothing), which students learned to equate with a full measuring cup, which contains something, not nothing.

Students were expected to report the amount of liquid remaining after combining two amounts (but the result could not be “none”). In both the “cups” domain and the “no relevant concreteness” domain where squiggle = 0, disk = 1, and rhombus = 2, students learned to report correct answers (called “remainders” or “results”, depending on the domain) from the action of combining items.

The now combining-savvy students were challenged to learn similar rules in a new domain. The new symbols were images of a round ladybug, a tallish vase, and (I think) a cabochon ring viewed so its projection on the page is an eccentric ellipse with larger axis horizontal. They were told that the rules of this new system were similar to what they had previously learned, and that they should figure them out using their new knowledge.

Coincidentally, or perhaps not, the three images somewhat resembled baroque renditions of the earlier disk, rhombus, and squiggle, but bore little similarity to variously-filled measuring cups. They were taught a “game” where two students each pointed to an item, and a “winner” then pointed to one. Subjects were expected to learn which object the winner would point to. Students might have discovered that the visual similarities disk~ladybug, rhombus~vase, and squiggle~cabochon explained the new behavior.

Certainly the new domain was an abstract one. A pointing game like this has no real-world counterpart, and there is no real-world “way” that vases, ladybugs, and cabochons combine. I’d consider this new domain a purely symbolic one. When we teach mathematics, we want students to be able to transfer their knowledge to new domains, both applied (with relevant concreteness) and abstract (symbolic, or with no relevant concreteness).

It doesn’t surprise me that students master a new abstract domain more easily if they’d previously mastered one, and I’d expect it to be easier to master a new domain with relevant concreteness if you’d previously mastered one of those.

The short of it? Interesting research, but the experimental design is flawed as far as answering the question posed. Certainly this is no reason to give up finding creative, relevant, and familiar examples for abstract mathematical ideas.

This food-themed ad for an insurance company in the middle of an article about a fatal shark attack didn’t strike me as particularly tasteful.

Ad capture

I can’t tell which would be more amazing – if the 1967 Atlantic article on data and privacy rediscovered and reproduced here is for real, or if Modern Mechanix has fooled us big time. Either way, an amazing article. (Thanks for pointing me to this, Jason.)

Before SQL Server 2005 was released, a calculation requiring a ranking was both relatively difficult to express as a single query and relatively inefficient to execute. That changed in SQL Server 2005 with support for the SQL analytic functions RANK(), ROW_NUMBER(), etc., and partial support for SQL’s OVER clause.

Spearman’s rho (Spearman’s correlation coefficient) is a useful statistic that can be calculated more easily in SQL Server 2005 than in earlier versions. Below is an implementation of Spearman’s rho for SQL Server 2005 and later.

SQL’s RANK() and the rank order required for the calculation of Spearman’s rho are slightly different: if for example four values are tied for third place, RANK() will equal 3 for all four of them. The Spearman’s formula requires them all to be ranked 4.5, the average of their positions (3rd, 4th, 5th, and 6th) in an ordered list of the data. To address this difference, the code below adjusts the SQL RANK() by adding to it 0.5 for each occurrence of a data value beyond the first. I used COUNT(*) with an OVER clause for this.
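To see the adjustment concretely, here is a minimal Python sketch (function names are mine) contrasting SQL-style competition ranks with the fractional ranks Spearman’s formula needs:

```python
def competition_rank(data):
    # SQL-style RANK(): tied values all share the lowest position.
    s = sorted(data)
    return [s.index(v) + 1 for v in data]

def fractional_rank(data):
    # Spearman's formula wants tied values to share the average of the
    # positions they occupy: RANK() + (count of ties - 1)/2.
    s = sorted(data)
    return [s.index(v) + 1 + (data.count(v) - 1) / 2.0 for v in data]

data = [10, 20, 30, 30, 30, 30]
competition_rank(data)  # [1, 2, 3, 3, 3, 3]
fractional_rank(data)   # [1.0, 2.0, 4.5, 4.5, 4.5, 4.5]
```

The four tied values rank 3 under RANK() but 4.5 (the average of positions 3, 4, 5, and 6) under fractional ranking, exactly the adjustment the SQL below makes with COUNT(*) OVER.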

The script below demonstrates the calculation for two data sets. The first one is from Wikipedia’s page on Spearman’s rho; I made up the second data set to include duplicate data values. I haven’t tested the code thoroughly, but for a variety of small test data sets, it matches hand calculations and the result here [1].
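For checking the SQL result outside the database, here’s a small Python sketch (my own, hypothetical function name) of the same simplified formula, 1 − 6Σd²/(n(n²−1)), applied to the first data set. Note that with ties this simplified formula is exactly what the SQL below computes, which is what makes it a faithful cross-check:

```python
def spearman_rho(xs, ys):
    # Fractional ("average") ranks, as Spearman's formula requires.
    def ranks(data):
        s = sorted(data)
        return [s.index(v) + 1 + (data.count(v) - 1) / 2.0 for v in data]
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    # Simplified Spearman formula: 1 - 6*sum(d^2) / (n*(n^2 - 1))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

xs = [106, 86, 100, 101, 99, 103, 97, 113, 112, 110]
ys = [7, 0, 27, 50, 28, 29, 20, 12, 6, 17]
spearman_rho(xs, ys)  # -29/165, about -0.1758
```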

create table SampleData (
ID int identity(1,1) primary key,
x decimal(5,2),
y decimal(5,2)
);

insert into SampleData(x,y) values(106,7);
insert into SampleData(x,y) values(86,0);
insert into SampleData(x,y) values(100,27);
insert into SampleData(x,y) values(101,50);
insert into SampleData(x,y) values(99,28);
insert into SampleData(x,y) values(103,29);
insert into SampleData(x,y) values(97,20);
insert into SampleData(x,y) values(113,12);
insert into SampleData(x,y) values(112,6);
insert into SampleData(x,y) values(110,17);
go

create procedure Spearman as
with RankedSampleData(ID,x,y,rk_x,rk_y) as (
select
ID,
x,
y,
rank() over (order by x) +
(count(*) over (partition by x) - 1)/2.0,
rank() over (order by y) +
(count(*) over (partition by y) - 1)/2.0
from SampleData
)
select
1e0 -
(
6
*sum(square(rk_x-rk_y))
/count(*)
/(square(count(*)) - 1)
)
from RankedSampleData;
go

exec Spearman;

go
truncate table SampleData;
go

insert into SampleData(x,y) values(1,3);
insert into SampleData(x,y) values(3,5);
insert into SampleData(x,y) values(5,8);
insert into SampleData(x,y) values(3,4);
insert into SampleData(x,y) values(4,7);
insert into SampleData(x,y) values(4,6);
insert into SampleData(x,y) values(3,4);
go

exec Spearman;
go

drop proc Spearman;
drop table SampleData;

[1] Wessa, P. (2008), Free Statistics Software, Office for Research Development and Education, version 1.1.22-r4, URL http://www.wessa.net/

A Reuters News Service article today reports [emphasis mine] “If both your parents have Alzheimer’s disease, you probably are more much likely than other people to get it, researchers said on Monday.” What if they both have dyslexia?

Headline: Nuclear Plant Shutdown Causes Massive Florida Power Outages

It’s the other way around, though, according to the article. A power outage caused the plant to automatically shut down, just like it’s supposed to. In the world of news, fear trumps truth.

Finding elapsed time in SQL Server is easy, so long as the clock is always running: just use DATEDIFF. But you often need to find elapsed time excluding certain periods, like weekends, nights, or holidays. A fellow SQL Server MVP recently posed a variation on this problem: to find the number of minutes between two times, where the clock is running only from 6:00am-6:00pm, Monday-Friday. He needed this to compute how long trouble tickets stayed at a help desk that was open for those hours.

I came up with a function DeskTimeDiff_minutes(@from,@to) for him. It requires a permanent table that spans the range of times you might care about, holding one row for every time the clock is turned on or off, weekdays at 6:00am and 6:00pm in this case.

The table also holds an “absolute business time” in minutes (ABT-m): the total number of “help desk open” minutes since a fixed but arbitrary “beginning of time.” Elapsed help desk time is then simply the difference between ABT-m values. While the table only records the ABT-m 10 times a week, you can find the ABT-m for an arbitrary datetime @d easily. Find the row of the table with time d closest to @d but not later. In that row you’ll find the ABT-m at time d, and you’ll also find out whether the clock was (or will be) running or not between d and @d. If not, the ABT-m at time @d is the same as at time d. Otherwise, add the number of minutes between d and @d.
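The lookup just described can be sketched in Python (an illustrative, in-memory stand-in for the SQL; the hypothetical schedule here covers only two days):

```python
from bisect import bisect_right
from datetime import datetime

# In-memory stand-in for the Minute_Count table:
# (clock time d, ABT-m at that instant, whether the clock runs after d).
SCHEDULE = [
    (datetime(2000, 1, 3, 6),  0,       "Running"),
    (datetime(2000, 1, 3, 18), 12 * 60, "Stopped"),
    (datetime(2000, 1, 4, 6),  12 * 60, "Running"),
    (datetime(2000, 1, 4, 18), 24 * 60, "Stopped"),
]

def abt_minutes(t):
    # Latest schedule row at or before t; t must fall within the schedule.
    # (The SQL does this with a TOP 1 ... ORDER BY d DESC index seek.)
    i = bisect_right([row[0] for row in SCHEDULE], t) - 1
    d, elapsed, timer = SCHEDULE[i]
    if timer == "Running":
        # Clock is running between d and t: add the minutes in between.
        elapsed += int((t - d).total_seconds()) // 60
    return elapsed

def desk_time_diff_minutes(t_from, t_to):
    # Elapsed help-desk time is just a difference of ABT-m values.
    return abt_minutes(t_to) - abt_minutes(t_from)
```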

Here’s the code. The reference table here is good from early 2000 until well past 2050, and you can easily extend it or adapt it to other business rules. A larger permanent table of times shouldn’t affect performance, because the function only performs (two) index seek lookups on the table.

If you cut and paste this for your own use, watch out for “smart quotes” or other WordPress/Live Writer formatting quirks.

create table Minute_Count(
  d datetime primary key,
  elapsed_minutes int not null,
  timer varchar(10) not null check (timer in ('Running','Stopped'))
);

insert into Minute_Count values ('2000-01-03T06:00:00',0,'Running');
insert into Minute_Count values ('2000-01-03T18:00:00',12*60,'Stopped');

insert into Minute_Count values ('2000-01-04T06:00:00',12*60,'Running');
insert into Minute_Count values ('2000-01-04T18:00:00',24*60,'Stopped');

insert into Minute_Count values ('2000-01-05T06:00:00',24*60,'Running');
insert into Minute_Count values ('2000-01-05T18:00:00',36*60,'Stopped');

insert into Minute_Count values ('2000-01-06T06:00:00',36*60,'Running');
insert into Minute_Count values ('2000-01-06T18:00:00',48*60,'Stopped');

insert into Minute_Count values ('2000-01-07T06:00:00',48*60,'Running');
insert into Minute_Count values ('2000-01-07T18:00:00',60*60,'Stopped');
/* any Monday-Friday week */

declare @week int;
set @week = 1;
while @week < 2100 begin
  insert into Minute_Count
    select
      dateadd(week,@week,d),
      elapsed_minutes + 60*@week*60,
      timer
  from Minute_Count
  set @week = @week * 2
end;

go

create function DeskTimeDiff_minutes(
  @from datetime,
  @to datetime
) returns int as begin
  declare @fromSerial int;
  declare @toSerial int;
  with S(d,elapsed_minutes,timer) as (
    select top 1 d,elapsed_minutes, timer
    from Minute_Count
    where d <= @from
    order by d desc
  )
    select @fromSerial =
      elapsed_minutes +
      case when timer = 'Running'
      then datediff(minute,d,@from)
      else 0 end
    from S;
  with S(d,elapsed_minutes,timer) as (
    select top 1 d,elapsed_minutes, timer
    from Minute_Count
    where d <= @to
    order by d desc
  )
    select @toSerial =
      elapsed_minutes +
      case when timer = 'Running'
      then datediff(minute,d,@to)
      else 0 end
    from S;
  return @toSerial - @fromSerial;
end;
go
select MAX(d) from Minute_Count
select dbo.DeskTimeDiff_minutes('2007-12-19T18:00:00','2007-12-24T17:51:00');
go

drop function DeskTimeDiff_minutes;
drop table Minute_Count;

Microsoft plans to support spatial data types in SQL Server 2008, and a preview is available to the community in the latest CTP (community technology preview), available here.

John O’Brien, a Windows Live Developer MVP, has been trying out the new spatial types in some cool Virtual Earth projects (John’s site is here), and in one of his projects, SQL Server threw an interesting error message. When he zoomed far enough out in Virtual Earth, then tried to create a polygon from the map bounds, SQL Server reacted with:

“The specified input does not represent a valid geography instance because it exceeds a single hemisphere. Each geography instance must fit inside a single hemisphere. A common reason for this error is that a polygon has the wrong ring orientation.”

John found a workaround, dividing the map into two pieces, but he was interested to know what the SQL Server folk thought about the situation. Here’s my reply. It’s less a response to John’s inquiry than it is a ramble about geometry and what hemispheres and orientation have to do with how you can or can’t specify polygons.

To begin, think of the earth’s Equator as a polygon. How would you answer the following questions?

  • “If I travel Eastbound around the earth along the equator, have I gone clockwise or counter-clockwise?”
  • “Is the north pole inside the equator or outside the equator?”

In the plane (or on a flat map of the world), a polygon or other closed non-self-intersecting curve has a well-defined “inside” and “outside”. A polygon separates the plane into two regions, one that has finite area and one that is unbounded. The finite region is deemed “inside” the polygon. On a sphere, however, a closed curve determines two finite regions, either of which might be what someone thinks of as the inside.

For example, the four-sided outline of the US state of Wyoming separates the earth into what you could call “Wyoming” and “anti-Wyoming.” But are we so sure which is the inside and which is the outside? Our intuition is that the smaller region is always the inside, but there’s nothing about geometry and geography to tell us that. Maybe Wyoming is most of the world. A single geographic region could contain most of the earth’s surface within its borders, couldn’t it?

Suppose Wyoming declared itself to be Great Wyoming and annexed all of North America, Europe, and continued to conquer the world. Suppose its armies crossed the equator and eventually took over almost everything—everything but Antarctica, in fact.

The boundary of Great Wyoming would then be the same as the boundary of Antarctica. You would probably want Great Wyoming to be inside the boundary of Great Wyoming and Antarctica to be inside the boundary of Antarctica, but how can that work—the boundaries are the same?

This is a problem. On a sphere, the naïve idea of interior/exterior isn’t well-defined. One solution would be to pass a law that every polygon on earth must fit inside a single hemisphere with room to spare. We could then define the interior of a polygon to be the smaller of the two regions it determines. This would place Antarctica, not Wyoming, within the borders of Great Wyoming—wrong, but unambiguous. And anyway, who would ever need to consider a region bigger than 640K that doesn’t fit inside a single hemisphere?

Fortunately, though, we don’t have to abandon or compromise the notion of interior and exterior on the earth’s surface: Antarctica can remain outside Greater Wyoming. All we need to do is be precise about the direction in which we describe a polygon. When specifying the boundary of a region, you can give a forwards/backwards or clockwise/counter-clockwise sense to the boundary by choosing the way you order the list of vertices. List them so that what you consider inside the region is on your left as you “connect the dots,” because we will adopt the convention that the left side as you walk the perimeter is the inside. What’s on the right will be interpreted as outside. Now you can describe the boundary of Great Wyoming. Just describe it as drawn from west to east, so Antarctica is on the right (exterior). (This works because a sphere is an “orientable surface.” SQL Server’s new geography data type isn’t supported on a Klein bottle, where CultureInfo.IsOrientableWorld—if such a property existed—would be false.)
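On a flat map, the left-side convention is easy to test with the shoelace formula: list the vertices so the interior is on your left (counter-clockwise, in the plane) and the signed area comes out positive. A minimal sketch, planar only—the spherical case is analogous:

```python
def signed_area(ring):
    # Shoelace formula: the result is positive when the vertices run
    # counter-clockwise, i.e. when the enclosed region lies on your left
    # as you walk the ring from vertex to vertex.
    s = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        s += x1 * y2 - x2 * y1
    return s / 2.0

# A unit square listed counter-clockwise: interior on the left.
ccw = [(0, 0), (1, 0), (1, 1), (0, 1)]
signed_area(ccw)        # 1.0
signed_area(ccw[::-1])  # -1.0: same boundary, opposite orientation
```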

Once we require polygons to be oriented, there’s no need to require that they fit within a single hemisphere, but nonetheless, SQL Server 2008’s geography data type adopts the hemisphere requirement. For geography objects of type Polygon, I think this is a good idea. I’m not sure whether it’s a standard GIS requirement or just SQL Server’s, but it prevents users from accidentally entering the coordinates of Wyoming in clockwise fashion only to discover later that Perth and Addis Ababa, but not Cheyenne, are in Wyoming. [For some of the other geography types, such as LineString, I don’t see a benefit from requiring the object to fit in a hemisphere, but consistency isn’t a bad thing.]

A week ago, a small-town reporter named Steve Pokin wrote an emotional story for the St. Charles (Missouri) Journal. With headings like “SHADOWY CYBERSPACE”, “AX AND SLEDGEHAMMER”, and “THE AFTERMATH IS PAIN”, Pokin wove a modern tragedy about the October 2006 death of young Megan Meier. In Pokin’s tale, a neighbor fabricated a MySpace account through which the girl was tormented to commit suicide.

Whatever tragedy might have taken place in 2006, another tragedy is taking place now, in reaction to Pokin’s story. The AP picked it up yesterday, and it’s being republished widely, often under the headline “Mom: Web Hoax Led Girl to Kill Herself.”

Pokin’s story is taken at face value, people react with outrage and anger, and the supposed villain of the story has been identified and named by bloggers.

I’m not suggesting Pokin got the facts wrong, nor am I suggesting he got them right, either. I wasn’t there. I’m also not suggesting Pokin expected or intended the kind of reaction his story has produced. Pokin appears to have based his story mostly, if not entirely, on the word of one guilt-ridden, aggrieved mother (“I have this awful, horrible guilt and this I can never change,” she said. “Ever.”) and on the second-hand words of the story’s villain, summarized in a police report about an incident between the two families involved. From the reaction so far, that’s apparently all the information many people need to call for (or take) mob action.

A local television station quotes the St. Charles County Prosecutor, Jack Banas, as saying, “Me personally, I’ve never seen anything on this case.”

Well, it’s been much more than a few days. While I might still recover what got lost in the change of hosting provider, there’s no point suspending the blog any longer. If I don’t keep posting, it’s laziness and procrastination from now on.
