  • Description of the data

Instant-win prize data for NJ Lottery games, specifically the number of large prizes remaining and the number originally offered. Other available data includes the date each game began, the total number of tickets initially produced, and the cost of each ticket. A sample Excel file, scratchoff.xls, shows how the data should be entered. Collecting all of this data shouldn’t take more than a couple of hours.

  • Why the data is interesting

The lottery is popular, and people probably want to maximize their chances of winning large prizes. There are no easy-to-find web sites comparing the odds of winning the different instant-win games, so we can offer more than other lottery-related sites do.

  • How we can obtain the data

From the New Jersey Lottery web site's page on instant-win games (http://www.state.nj.us/lottery/instant/2-1_unclaimed_prizes.htm). Linked from that page is a paragraph for each instant-win game that looks like this:

In the “SUPER CROSSWORD” Instant Game,

New Jersey allocates 65% of the gross receipts to prizes. On the average, better than 1 ticket in 5 wins a prize. In a game of 3,900,000 tickets there are 390,000 prizes of $5; 273,000 prizes of $7; 92,300 prizes of $10; 65,325 prizes of $12; 18,200 prizes of $15; 19,500 prizes of $50; 26,000 prizes of $100; 19,500 prizes of $150; 6 prizes of $750; 4 prizes of $7,500 and 6 prizes of $50,000. Odds and number of winners may vary based on sales, distribution and claims.

We can retrieve this paragraph for each game automatically, because the URLs are predictable (http://www.state.nj.us/lottery/instant/ig738.htm, http://www.state.nj.us/lottery/instant/ig739.htm, etc.). We can then combine it with the count of unclaimed large prizes on the main instant-win page, which contains information like this:

$50,000 - 6
$7,500 - 8
$750 - 11
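
As a rough sketch of how that retrieval step might be automated, the Python below builds a game URL from its number and pulls the ticket total and prize counts out of a paragraph like the one quoted above. The function names are made up for illustration, and the regular expressions assume every game page uses the same wording as the SUPER CROSSWORD example, which we have not checked. Actually downloading the pages is left out.

```python
import re

# URL pattern taken from the two example URLs above (ig738.htm, ig739.htm).
GAME_URL = "http://www.state.nj.us/lottery/instant/ig{number}.htm"

# Part of the SUPER CROSSWORD paragraph quoted above, used as sample input.
SAMPLE_PARAGRAPH = (
    "In a game of 3,900,000 tickets there are 390,000 prizes of $5; "
    "273,000 prizes of $7; 92,300 prizes of $10; 65,325 prizes of $12; "
    "18,200 prizes of $15; 19,500 prizes of $50; 26,000 prizes of $100; "
    "19,500 prizes of $150; 6 prizes of $750; 4 prizes of $7,500 and "
    "6 prizes of $50,000."
)


def game_url(number):
    """Return the (assumed) page URL for a given game number."""
    return GAME_URL.format(number=number)


def to_int(text):
    """Turn a string such as '3,900,000' into the integer 3900000."""
    return int(text.replace(",", ""))


def parse_game_paragraph(paragraph):
    """Extract the ticket total and a {prize amount: count} dictionary."""
    tickets = re.search(r"game of ([\d,]+) tickets", paragraph)
    prizes = {
        to_int(amount): to_int(count)
        for count, amount in re.findall(r"([\d,]+) prizes of \$([\d,]+)", paragraph)
    }
    return {
        "tickets": to_int(tickets.group(1)) if tickets else None,
        "prizes": prizes,
    }


def parse_unclaimed(lines):
    """Parse lines such as '$7,500 - 8' into {prize amount: number unclaimed}."""
    remaining = {}
    for line in lines:
        match = re.match(r"\s*\$([\d,]+)\s*-\s*(\d+)", line)
        if match:
            remaining[to_int(match.group(1))] = int(match.group(2))
    return remaining
```

Calling `parse_game_paragraph(SAMPLE_PARAGRAPH)` gives a ticket total of 3900000 and eleven prize levels, and `parse_unclaimed(["$50,000 - 6", "$7,500 - 8", "$750 - 11"])` gives the three unclaimed counts keyed by prize amount. The results could then be written out in the same layout as the sample Excel file.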

  • What specific questions the data will/can answer. Some of these need some thought about how they should be calculated. It would be good to start with two or three proposals for that.
    • Which games have the best odds of winning a large prize? (How do we calculate the odds for one game?)
    • Provide all the information about instant-win games (particularly large prizes remaining as a fraction of large prizes initially offered) on one page. (This shouldn’t be hard to do from the Excel spreadsheet.)
    • Provide a service that lets cell phone users send a query from a lottery retailer asking which of the available games has the best odds of winning a large prize. If possible, do this via picture messaging, so the cell phone user can just send a picture of the scratch-off display case to our service and have it analyzed. (We probably have no hope of doing this, for various reasons: insufficient technical expertise, the need for image recognition, the difficulty of testing, etc. Instead, maybe we could make a static web page, updated every couple of days, that presents the same information and, ideally, displays nicely on a cell phone or other small device.)
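
One possible starting proposal for the "how do we calculate the odds for one game?" question is sketched below in Python. It is only a sketch under stated assumptions: a "large" prize is taken to be $750 or more (the smallest amount in the unclaimed list above), the function names are hypothetical, and the remaining-prize counts in the example are invented for illustration. It computes two numbers per game: the number of tickets per large prize at launch, and the fraction of large prizes still unclaimed.

```python
from fractions import Fraction

# Assumption: "large" means $750 and up, the smallest amount in the
# unclaimed-prize list shown earlier. This cutoff is a guess, not a rule.
LARGE_PRIZE_MIN = 750


def large_prize_count(prizes, minimum=LARGE_PRIZE_MIN):
    """Add up the counts of all prizes worth at least `minimum` dollars."""
    return sum(count for amount, count in prizes.items() if amount >= minimum)


def tickets_per_large_prize(prizes, tickets, minimum=LARGE_PRIZE_MIN):
    """Tickets printed per large prize at launch (lower is better).

    Returns None if the game offers no large prizes.
    """
    count = large_prize_count(prizes, minimum)
    return Fraction(tickets, count) if count else None


def fraction_remaining(original, remaining, minimum=LARGE_PRIZE_MIN):
    """Large prizes still unclaimed as a fraction of those originally offered."""
    offered = large_prize_count(original, minimum)
    return Fraction(large_prize_count(remaining, minimum), offered) if offered else None


# Prize counts from the SUPER CROSSWORD paragraph (3,900,000 tickets).
super_crossword = {5: 390000, 7: 273000, 10: 92300, 12: 65325, 15: 18200,
                   50: 19500, 100: 26000, 150: 19500, 750: 6, 7500: 4, 50000: 6}

# Invented remaining counts, for illustration only (not real lottery data).
example_remaining = {750: 5, 7500: 4, 50000: 3}

launch_odds = tickets_per_large_prize(super_crossword, 3900000)
still_left = fraction_remaining(super_crossword, example_remaining)
print("About 1 ticket in", int(launch_odds), "wins a large prize at launch")
print("Fraction of large prizes remaining:", still_left)
```

With the SUPER CROSSWORD numbers there are 16 large prizes in 3,900,000 tickets, so roughly 1 ticket in 243,750 wins one at launch. Neither number accounts for how many tickets have already been sold, which the site does not appear to publish, so a second proposal would need some way to estimate that.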

1. For Wednesday, November 7 (but some submissions by Monday would be welcome), post (individually or in groups of 2) a data project description containing these parts:

  • Description of the data
  • Why the data is interesting
  • How we can obtain the data
  • What specific questions the data will/can answer

Between Wednesday and Friday, revise as desired.

2. [Evaluation] (process to happen after Friday, Nov. 9, in groups of at least three, each group evaluating five or six projects*)

Requirement for evaluation teams: For each project description you evaluate, write about a page, with about a paragraph for each of the categories below. If you want, give feedback to the proposers as well, or ask questions as you evaluate and invite proposers to revise their projects. Then write a summary page comparing the projects to one another, ranking them if possible on 1) the three C’s (clarity, correctness, coolness) and, separately, 2) feasibility.

  • Clarity
    • Goal: the data should be described sufficiently well so that someone in this class could go with that data description and get the data in a “hands-on” form (in an Excel file, written on paper, or other electronic or written format that is organized and ready for processing to answer the questions).
  • Correctness
    • Is the data available from sources as described?
    • Can the data answer the questions listed?
  • Coolness
    • Is the description creative, catchy, compelling?
    • Is there some value to this project in terms of originality, larger purpose, building on existing knowledge, etc.? (innovative, interesting, practical, reliable)
  • [not for grade, but to be evaluated] Feasibility
    • What kind of time frame and resources would this project take to complete?
    • Do we have the skills and access to information and tools needed?
    • Would you want to complete this project yourself (with some help)?

*See November 9 post for evaluation groups and projects to evaluate.

3. Even later, we will execute at least a couple of these projects.

Within each group, people are listed in the following order according to their primary role:

Manager
Programmer
Designer (interface)
Writer (documentation)
[Fifth persons share responsibility in their preferred role.]

Group A:

Emily Capkanis
Robert Gordon
Mike Beideman
Lainie Wilt

Group B:

Anna Forman
Noah Rosenfield
Sarit Ashkenazi
Andrew Schaffer

Group C:

Matt Fingerman
Scott Brandsdorfer
Georgia Cruz
Holly Tarnower
Sami Zavras

Group D:

Jen Dugan
David Sassouni
Jeremy Dery
KC Russell

Group E:

Whitney Chenoweth
Hidehiko Udagawa
Blake Gideon
Amanda Moutner

This is the CSCI 10 course blog for Fall 2007.