TheUMAPJournalPublisher
COMAP, Inc.Vol. 3?, No.??Executive Publisher
Solomon A. Garfunkel
ILAP Editor
Chris Arney
Dept. of Math’l Sciences
U.S. Military Academy
West Point, NY 10996david.arney@usma.eduOn Jargon Editor
Yves Nievergelt
Dept. of Mathematics
Eastern Washington Univ.
Cheney, WA 99004ynievergelt@ewu.eduReviews Editor
James M. Cargal
Mathematics Dept.
Troy University—
Montgomery Campus
231 Montgomery St.
Montgomery, AL 36104jmcargal@sprintmail.comChief Operating Of?cer
Laurie W. Arag′on
Production Manager
George Ward
Copy Editor
Julia Collins
Distribution
John Tomicek
Editor
Paul J. Campbell
Beloit College
700 College St.
Beloit, WI 53511–5595
campbell@beloit.edu
Associate Editors
Don Adolphson
Aaron Archer
Chris Arney
Ron Barnes
Arthur Benjamin
Robert Bosch
James M. Cargal
Murray K. Clayton
Lisette De Pillis
James P. Fink
Solomon A. Garfunkel
William B. Gearhart
William C. Giauque
Richard Haberman
Jon Jacobsen
Walter Meyer
Yves Nievergelt
Michael O’Leary
Catherine A. Roberts
John S. Robertson
Philip D. Straf?n
J.T. Sutcliffe
Brigham Young Univ.
AT&T Shannon Res. Lab.
U.S. Military Academy
U. of Houston—Downtn
Harvey Mudd College
Oberlin College
Troy U.— Montgomery
U. of Wisc.—Madison
Harvey Mudd College
Gettysburg College
COMAP, Inc.
Calif. State U., Fullerton
Brigham Young Univ.
Southern Methodist U.
Harvey Mudd College
Adelphi University
Eastern Washington U.
Towson University
College of the Holy Cross
Georgia Military College
Beloit College
St. Mark’s School, DallasSubscription Rates for 2012 Calendar Year: Volume 33Institutional Web Membership (Web Only) Institutional Web Memberships do not provide print materials. Web memberships allow
members to search our online catalog, download COMAP print materials, and
reproduce them ?or classroom use.
(Domestic) #3030 $467 (Outside U.S.) #3030 $467
Institutional Membership (Print Only)
Institutional Memberships receive print copies o? The UMAP Journal quarterly, our
annual CD collection UMAP Modules, Tools ?or Teaching, and our organizational newslet-
ter Consortium.
(Domestic) #3040 $312 (Outside U.S.) #3041 $351
Institutional Plus Membership (Print Plus Web)
Institutional Plus Memberships receive print copies o? the quarterly issues o? The UMAP
Journal, our annual CD collection UMAP Modules, Tools ?or Teaching, our organizational
newsletter Consortium, and online membership that allows members to search our
online catalog, download COMAP print materials, and reproduce them ?or classroom use.
(Domestic) #3070 $615 (Outside U.S.) #3071 $659
For individual membership options visit
www.comap.com?or more in?ormation.
To order, send a check or money order to COMAP, or call toll-?ree
1-800-77-COMAP (1-800-772-6627).
The UMAP Journal is published quarterly by the Consortium ?or Mathematics and Its
Applications (COMAP), Inc., Suite 3B, 175 Middlesex Tpke.,Bed?ord, MA, 01730, in coop-
eration with the American Mathematical Association o? Two-Year Colleges
(AMATYC), the Mathematical Association o? America (MAA), the National Council o?
Teachers o? Mathematics (NCTM), the American Statistical Association (ASA), the
Society ?or Industrial and Applied Mathematics (SIAM), and The Institute ?or Operations Re-
search and the Management Sciences (INFORMS). The Journal acquaints readers with a
wide variety o? pro?essional applications o? the mathematical sciences and provides a ?orum
?or the discussion o? new directions in mathematical education (ISSN 0197-3622).
Periodical rate postage paid at Bed?ord, MA and at additional mailing o?fces.
Send address changes to: info@comap.com
COMAP, Inc., Suite 3B, 175 Middlesex Tpke., Bed?ord, MA, 01730
? Copyright 2012 by COMAP, Inc. All rights reserved. Mathematical Contest in Modeling (MCM)?, High School Mathematical Contest in
Modeling (HiMCM)?, and Interdisciplinary Contest in Modeling(ICM)?are registered trade marks o? COMAP, Inc.Vol. 33, No.3????012
Table of Contents
Guest EditorialNetwork Science: What’s Math Got to Do with It?
Chris Arney .............................................................................. 185Editor’s NoteAbout This Issue........................................................................... 192MCM Modeling ForumResults of the 2012 Mathematical Contest in ModelingWilliam P. Fox ........................................................................... 193
A Close Look at Leaves
Bo Zhang, Yi Zhang, and TianKun Lu ......................................... 205
Judges’Commentary: The Outstanding Leaf Problem Papers
Peter Olsen.........................................................
....................... 223
ComputingAlong the Big Long River
Chip Jackson, Lucas Bourne, and Travis Peters ............................ 231
Judges’Commentary: The Outstanding RiverProblem Papers
Marie Vanisko ........................................................................... 247
Author’s Commentary: The Outstanding RiverProblem Papers
Catherine A. Roberts.................................................................. 253
Judges’Commentary: The Giordano Award forthe RiverProblem
Marie Vanisko and Richard D. West ............................................ 259ICM Modeling ForumResults of the 2012 Interdisciplinary Contest in Modeling
Chris Arney .............................................................................. 263
Finding Conspirators in the Network via Machine Learning
Fangjian Guo, Jiang Su, and
Jian Gao .......................................... 275
Judges’Commentary: Modeling forCrime Busting
Chris Arney and Kathryn Coronges............................................ 293Reviews............................................................................... 305Guest Editorial185Guest EditorialNetwork Science:
What’s Math Got to Do with It?1Chris ArneyDept. of Mathematical Sciences
U.S. Military Academy
West Point, NY10996david.arney@usma.eduIntroductionThis year’s ICMR?problem involved network science,or more precisely,
a component ofnetwork science—socialnetwork analysis. My post-contest
re?ections have led me to believe it is time for the mathematics community
to engage in this emerging subject to build a rigorous mathematical foun-
dation for this important science and to join in performing mathematical
modeling and interdisciplinary problem solving.
Some people call network science a “new” emerging discipline, yet, as
we know, mathematicians have been developing graph (network) theory
for centuries, and scientists and engineers have been modeling networks
for decades. What is new is that the traditional techniques have been re-
placed by an entirely new arsenal of mathematics, science, and modeling
associated with networks.
Others callnetwork science the “new” operations research in that it con-
nects quantitative concepts and elements from several disciplines such as
mathematics, computer science, and information science with the qualita-
tive models from sociology and other social sciences. By its very nature,
network science is interdisciplinary and involves emerging areas of science
such as complex adaptive systems, cooperative game theory, agent-based
modeling, data analytics, and social network analysis.1With both appreciation and apologies to Tina Turner and her emotional song “What’s Love
Got To Do With It” and fulltongue-in-cheek realization that unlike “love,” mathematics is certainly
not a second-hand emotion.
TheUMAPJournal33(3)(2012)185–191.
c
?Copyright2012by COMAP,Inc. Allrightsreserved.
Permission to make digital or hard copies of part or all of this work for personal or classroom use
is granted without fee provided that copies are not made or distributed for pro?t or commercial
advantage and that copies bear this notice. Abstracting with credit is permitted, but copyrights
for components of this work owned by others than COMAP must be honored. To copy otherwise,
to republish, to post on servers, or to redistribute to lists requires prior permission from COMAP.186TheUMAP Journal 33.3 (2012)
From another, perhaps simpler, perspective, you could merely call net-
work science a form of applied mathematics—applied dynamic graph the-
ory with additional data elements and attributes. Or perhaps from a mod-
eling perspective, it is simply modeling with a highly structured, entity-
linked complex adaptive system framework.
I do not pretend to know exactly what network science is, or where it
?ts in with today’s scienti?c world, or what it will become. However, I
believe that much of the strength of network modeling is in its ability to
embrace the complexity of the real world. For me, that makes network
science an important and empowering form of interdisciplinary modeling
and problem solving—worthy of ICM problems and much more.
In particular, I hope that the mathematics community does not ignore
it. Network science needs the engagement of the mathematics community
to produce its underlying framework and to invent new mathematics and
computational techniques for analysis of its complex structures, develop-
ment of its synergistic processes, and organizing of its overwhelming data.
Likewise, mathematics needs network science to establish the relevance of
mathematics in the modern information-based world. As the ICM teams
discovered, network science is exciting, relevant, enjoyable, and modern—
elements that mathematics currently desperately needs to bolster its future
place in society.Mathematical ElementsWhat are the mathematical elements of network science? One way to
de?ne a network is to establish its?components (nodes, links, data, processes);?properties (dynamic, functional, layered); and?applications(logistics,?ow,transportation,Internet,metabolicnetworks,social networks, organizational networks—perhaps there are just too
many categories to list!).
Another way is to use the concept of a mathematical graph (the nodal-
link structure) with its nontrivial topological features and then classify the
various types of graphs that occur (random, scale free, small world, scale
rich) and the data (often heavy-tailed)that need to be mined and analyzed.
A foundationalresearch management report on network science offered
a layered approach of network roles—physical, communicative, informa-
tional, biological and social/ cognitive—that connect together to produce
the overallweb-like network framework [NationalResearch Council2006].
No matter whatde?nition or theoreticalframeworkisused,network sci-
enceis inherently and essentially mathematicalat its core;there is plenty for
applied mathematicians to do. Most networks are suf?ciently complexthatGuest Editorial187
simply relying on visualization produceserroneousintuitiveconceptions—
or, worse yet, complete misunderstanding. De?ning,computing,and mea-
suring well-de?ned properties can counter those misguided perceptions
and improvenetwork modeling and analysis. Indispensablerolesfor math-
ematical modelers in network science are?working with social scientists to build explicative and empirical models,?creating appropriate measures for important applications, and??nding appropriate properties and formalizing their measurement sys-
tems and calculations. Like many other mathematical modeling con-
structs, these properties can be classi?ed as structural or functional; lo-
cal, global, or regional; discrete or continuous; dynamic or static; and
deterministic or stochastic.
The network science world needs mathematicians to help sort out these
characteristics and do even more.Signi?canceNetwork science has become a major global research thrust (with fund-
ing potential equal to or exceeding that of mathematics and many sciences)
in the research agenda of governments, societies, militaries, businesses,
and organizations. Its publication and citation qualities and quantities are
signi?cant. Just look at the remarkable citation record of the works by
someone such as Albert-László Barabási to see this subject’s in?uence in
science. New societies, new conferences, and new journals with a network
science theme are emerging at a dazzling pace. Network science is reaching
the popular press and also entering the business world with tremendous
fanfare.Mathematicians’EngagementOne could argue that all the Mathematics Awareness Month themes
since 1997,when the theme was “Mathematics and the Internet,” have been
related to networks in some way. The theme for 2004, “The Mathematics
of Networks,” established a?rm connection, and I still use the myriad
networks on that year’s poster as examples to students of the variety and
beauty of networks. However, while mathematicians are certainly aware
of network science, I still do not see much real engagement by the U.S.
mathematics community. Recent meetings of the Mathematical Associa-
tion ofAmericaand theAmerican MathematicalSocietyshow only minimal
mathematically-connected networkresearch. TheSocietyfor Industrialand188TheUMAP Journal 33.3 (2012)
Applied Mathematics and the Institute for Operations Research and Man-
agement Science and their members are slightly more engaged, although
“when it comes to the research agenda now popularized by network sci-
ence, [operations research] has been an underutilized resource” [Alderson
2008].
In my opinion,mathematics is a vastly underutilized and unfortunately
often missing part of network science. In a comment on Alderson [2008],
Nagurney adds that “it is not just the network topology and associated sta-
tistical aspects of networks that matter but?ows that must be incorporated
into network modeling as well as behavior” [Nagurney 2008]. Alderson
and colleagues also wrote about mathematics and its engagement in Inter-
net research that “surprising little attention has been paid in the mathemat-
ics and physics communities ... in the Internet research arena” [Willinger
et al. 2009]. Of course, there is much more to network science than the In-
ternet, but it is a signi?cant network that most of the world confronts many
times and in many ways every day.Family TreeIt may be worth looking at a family tree for network science. While
these relationships are subjective and incomplete, one can see inFigure 1a
?ow that brings together many elements ofmathematics. The mathematics
community should not miss out on an opportunity as rich and stimulating
asnetwork science. Ultimately,themathematicalelementsofthisdiscipline
will be accomplished somehow and by someone. I suggest that this work
be done by mathematicians—and the sooner, the better.Network Science forUndergraduatesDoes network science extend to the undergraduate curriculum? Based
on their interest in this year’s ICM problem, I believe that undergraduates
would respond in the af?rmative. In some institutions, network science is
even making inroads in establishing undergraduate programs,and courses
and programs traditionally offered at the graduate level are entering the
undergraduate realm. My mathematics students tell me that they want to
learn network science. As analysis ofsocialmedia and online games are be-
ginning to be seen to be parts ofnetwork science modeling,student interest
in network science is growing in leaps and bounds. The bottom line is that
network science is highly popular with students. Ihave taught three differ-
ent network science courses over the past four semesters and am designing
a fourth for the Fall semester of 2012. I usually have to turn away students
or add more sections. This past Spring,Iteam-taught socialnetwork analy-
sis with a sociologist and enjoyed the interdisciplinary modeling aspects ofGuest Editorial189Figure 1.Disciplinary connection network modelshowing some ofthe links between mathematics
and network science.this exciting subject. Last year’s award-winning publication for INFORMS
was a network science book by Easley and Kleinberg from Cornell written
for undergraduates [Easley and Kleinberg 2010]. Network science is on the
map of undergraduate education.Worldwide InterestThe ICM data show there may be differences among nations in per-
ceptions or interests in interdisciplinary modeling and network modeling.
The U.S. has always been slightly behind the rest the rest of the world in
ICM/ MCMR?interest ratio, as measured by the proportion of teams who
choose the ICM problem rather than one of the MCM problems. Usually,
there is about half as much interest in the ICM from U.S. teams compared
to teams from the rest of the world; this year, that ratio dropped to about
one-third. I do not know why this is so or if this phenomenon has any real
signi?cance. Perhaps American students have a more disciplinary focus on
mathematics or just haven’t been exposed to as many network or interdis-
ciplinary ideas. Whatever the reason, I personally hope that all students
(high school,undergraduate and graduate)in every nation have the oppor-
tunity to study some aspects of networks, and that the mathematics they
learn in doing so goes to excellent use.190TheUMAP Journal 33.3 (2012)ExhortationMathematicians, let’s not miss this opportunity. Take another look at
network science and see where you can contribute. Talk to colleagues in
other disciplinesand form teamstolearn,study,research,teach,and engage
in this enjoyable and important?eld of network science.ReferencesAlderson, David L. 2008. Catching the “network science” bug: Insight
and opportunity for the operations researcher.Operations Research56
(5) (September-October 2008): 1047–1065.http://www.informs.org/
content/download/255771/2414490/file/networks.pdf.
Easley, D., and J. Kleinberg. 2010.Networks,Crowds, and Markets: Reasoning
AboutaHighlyConnectedWorld.New York: CambridgeUniversityPress.
Nagurney, Anna. 2008. Comment on Alderson [2008].Operations Re-
search (OnlineForum Commentary)56 (5) (October–November 2008) on-
line commentary.http://www.informs.org/content/download/
255775/2414506/file/nagurney.pdf.
National Research Council. 2006.Network Science.Washington, DC: Na-
tional Academies Press.
Willinger, Walter, David Alderson, and John C. Doyle. 2009. Mathematics
and the Internet: A source of enormous confusion and great potential.
Notices of the American Mathematical Society56 (5) (May 2009): 286–299.http://www.ams.org/notices/200905/rtx090500586p.pdf.Guest Editorial191About the AuthorChrisArneygraduated from WestPointand served
as an intelligence of?cer in the U.S. Army. His aca-
demic studies resumed at Rensselaer Polytechnic In-
stitute with an M.S. (computer science) and a Ph.D.
(mathematics). He spent most of his 30-year military
career as a mathematics professor at West Point, be-
fore becoming Dean ofthe SchoolofMathematicsand
Sciences and Interim Vice President for AcademicAf-
fairs at the College of Saint Rose in Albany, NY. Chris then moved to RTP
(Research Triangle Park), NC, where he served for various durations as
chair of the Mathematical Sciences Division, of the Network Sciences Di-
vision, and of the Information Sciences Directorate of the Army Research
Of?ce. Chris has authored 22 books, written more than 120 technical arti-
cles, and given more than 250 presentations and 40 workshops. His techni-
cal interests include mathematicalmodeling,cooperative systems,pursuit-
evasion modeling,robotics, arti?cial intelligence,military operations mod-
eling, and network science;his teaching interests include using technology
and interdisciplinary problems to improve undergraduate teaching and
curricula. He is the founding director of COMAP’s Interdisciplinary Con-
test in Modeling (ICM)R?. In August 2009, he rejoined the faculty at West
Point as the Network Science Chair and Professor of Mathematics.192TheUMAP Journal 33.3 (2012)Editor’s NoteAbout This IssueThis year we had 5,000 (!) participating teams in the two contests com-
bined; the 18 (!) Outstanding papers ran to over 500 manuscript pages.
Editing and publishing all the Outstanding papers, which we once did, is
simply not possible any more.
Hence, as in 2010 and 2011, we are able to present in the pages of the
Journalonly oneOutstandingentry for each oftheMCM and ICM problems.
The selection of which papers to publish re?ected editorial considerations
and was done blind to the af?liations of the teams.
Allofthe18Outstandingpapersappearin theiroriginalform on the2012
MCM-ICM CD-ROM,which also has the press releases for the two contests,
the results, the problems, unabridged versions of the Outstanding papers,
and some of the commentaries. Information about ordering is athttp:
//www.comap.com/product/cdrom/index.htmlor at (800) 772-6627.Results of the2012 MCM193MCM Modeling ForumResults of the 2012
Mathematical Contest in ModelingWilliam P. Fox, MCM DirectorDept. of Defense Analysis
Naval Postgraduate School
1 University Circle
Monterey, CA 93943–5000wpfox@nps.eduIntroductionA total of 3,697 teams of undergraduates from hundreds of institutions and
departments in 16 countries spent a weekend in February working on applied
mathematicsproblemsin the28th MathematicalContestin Modeling(MCM)R?.
The 2012 MCM began at 8:00 P.M. EST on Thursday, February 9, and ended
at 8:00 P.M. EST on Monday, February 13. During that time, teams of up to
three undergraduates researched, modeled, and submitted a solution to one
of two open-ended modeling problems. Students registered, obtained contest
materials, downloaded the problem and data, and entered completion data
through COMAP’s MCM Website. After a weekend of hard work, solution
papers were sent to COMAP on Monday. Two of the top papers appear in this
issue ofTheUMAP Journal, together with commentaries.
In addition to this special issue ofThe UMAP Journal, COMAP offers asupplementary2012MCM-ICM CD-ROMcontaining thepress releasesfor thetwo contests,theresults,theproblems,unabridged versionsoftheOutstanding
papers, and judges’ commentaries. Information about ordering is athttp:
//www.comap.com/product/cdrom/index.htmlor at (800) 772–6627.Results and winning papers from the?rst 27 contests were published in
special issues ofMathematical Modeling(1985–1987) andThe UMAP Journal(1985–2011). The 1994 volume ofTools for Teaching, commemorating the tenth
anniversary of the contest, contains the 20 problems used in the?rst 10 years
of the contest and a winning paper for each year. That volume and the specialTheUMAPJournal33(3)(2012)193–204.
c
?Copyright2012by COMAP,Inc. Allrightsreserved.
Permission to make digital or hard copies of part or all of this work for personal or classroom use
is granted without fee provided that copies are not made or distributed for pro?t or commercial
advantage and that copies bear this notice. Abstracting with credit is permitted, but copyrights
for components of this work owned by others than COMAP must be honored. To copy otherwise,
to republish, to post on servers, or to redistribute to lists requires prior permission from COMAP.194TheUMAP Journal 33.3 (2012)MCM issuesoftheJournalfor thelast few yearsare availablefrom COMAP.The
1994volumeisalsoavailableon COMAP’sspecialModelingResourceCD-ROM.Also available isThe MCM at 21CD-ROM, which contains the 20 problems
from the second 10 years of the contest, a winning paper from each year, and
advice from advisors of Outstanding teams. These CD-ROMs can be ordered
from COMAP athttp://www.comap.com/product/cdrom/index.html.
This year, the two MCM problems represented signi?cant challenges:?Problem A, “The Leaves of a Tree,” asked teams to model the leaves on atree, classifying their shapes, investigating the effect of leaf shape on tree
pro?le and branching, and estimating the leaf mass of a tree.?Problem B, “Camping Along the Big Long River,” asked teams to design a
management plan for scheduling recreational multi-day rafting tours down
a long stretch of a river. The goal was to maximize the number of trips,
optimize campsite usage, and offer an optimal mix of trip lengths.
COMAP also sponsors:?The MCM/ ICM Media Contest (see p. 202).?TheInterdisciplinary Contest in Modeling (ICM)R?,which runsconcurrently
with MCM and next year will offer a modeling problem involving network
science. Results of this year’s ICM are on the COMAP Website athttp:
//www.comap.com/undergraduate/contests. The contest report, anOutstanding paper, and commentaries appear in this issue.?The High School Mathematical Contest in Modeling (HiMCM)R?, whichoffers high school students a modeling opportunity similar to the MCM.
Further details are athttp://www.comap.com/highschool/contests.2012 MCM Statistics?3,697 teams participated (with 1,329 more in the ICM)?8 high school teams (<0.5%)?341 U.S. teams (9%)?2,428 foreign teams (91%), from Canada, China, Finland, Germany, India,
Indonesia, Ireland, Malaysia, Mexico, Palestine, Singapore, South Africa,
South Korea, Spain, Turkey, and the United Kingdom?10 Outstanding Winners (<0.5%)?17 Finalist Winners (<0.5%)?405 Meritorious Winners (11%)?1,048 Honorable Mentions (28%)?2,211 Successful Participants (60%)Results of the2012 MCM195Problem A:The Leaves of a Tree“How much do the leaves on a tree weigh?” How might one estimate the
actual weight of the leaves (or for that matter any other parts of the tree)? How
might one classify leaves? Build a mathematicalmodel to describe and classify
leaves. Consider and answer the following:?Why do leaves have the various shapes that they have??Do the shapes “minimize” overlapping individual shadows that are cast,
so as to maximize exposure? Does the distribution of leaves within the
“volume” of the tree and its branches effect the shape??Speaking of pro?les, is leaf shape (general characteristics) related to tree
pro?le/ branching structure??How would you estimate the leaf mass of a tree? Is there a correlationbetween the leaf mass and the size characteristics of the tree (height, mass,
volume de?ned by the pro?le)?
In addition to your one-page summary sheet, prepare a one-page letter to
an editor of a scienti?c journal outlining your key?ndings.Problem B: Camping along the Big Long
RiverVisitorsto theBig Long River (225miles)can enjoy scenicviewsand exciting
white water rapids. The river is inaccessible to hikers, so the only way to enjoy
it is to take a river trip that requires several days of camping.
River trips all start at First Launch and exit the river at Final Exit, 225 miles
downstream. Passengers take either oar- powered rubber rafts, which travel
on average 4 mph or motorized boats, which travel on average 8 mph. The
trips range from 6 to 18 nights of camping on the river, start to?nish.
The government agency responsible for managing this river wants every
trip to enjoy a wilderness experience, with minimal contact with other groups
of boats on the river.
Currently,Xtrips travel down the Big Long River each year during a six-
month period (the rest of the year it is too cold for river trips). There areYcamp sites on the Big Long River, distributed fairly uniformly throughout the
river corridor. Given the rise in popularity of river rafting, the park managers
have been asked to allow more trips to travel down the river. They want to
determinehow theymightschedulean optimalmixoftrips,ofvaryingduration
(measured in nights on the river)and propulsion (motor or oar)that willutilize
the campsites in the best way possible. In other words, how many more boat
trips could be added to the Big Long River’s rafting season?196TheUMAP Journal 33.3 (2012)The river managers have hired you to advise them on ways in which to
develop the best schedule and on ways in which to determine the carrying
capacity of the river, remembering that no two sets of campers can occupy the
same site at the same time.
In addition to your one-page summary sheet, prepare a one-page memo to
the managers of the river describing your key?ndings.The ResultsThe solution papers were coded at COMAP headquarters so that names
and af?liations of the authors would be unknown to the judges. Each paper
was then read preliminarily by two “triage” judges at either Appalachian State
University (Leaf Problem) or at the National Security Agency (River Problem)
or at Carroll College (River Problem). At the triage stage, the summary and
overall organization are the basis for judging a paper. If the judges’ scores
diverged for a paper, the judges conferred; if they still did not agree, a third
judge evaluated the paper.
AdditionalRegionalJudgingsiteswerecreated attheU.S.MilitaryAcademy,
the Naval Postgraduate School, and Carroll College to support the growing
number of contest submissions.
Final judging took place at the Naval Postgraduate School, Monterey, CA.
The judges classi?ed the papers as follows:Honorable Successful
Outstanding Finalist Meritorious Mention Participation Total
Leaf Problem 4 8 226 482 862 1,582
River Problem 69 179 566 1,349 2,109
10 17 405 1,048 2,211 3,691We list here the 10 teams that the judges designated as Outstanding;the list
of all participating schools, advisors, and results is at the COMAP Website.Outstanding TeamsInstitution and Advisor Team Members
Leaf Problem“How to Measure the Weight of Leaves on a Tree ”
Hong Kong Baptist University
Kowloon, Hong Kong
Alex Wing Kee Mok
Xiaotian Wu
Qingran Li
Jin LiangResults of the2012 MCM197“A Close Look at Leaves”
Shanghai Foreign Language School
Shanghai, China
YiJung Wang
Bo Zhang
Yi Zhang
TianKun Lu
“Geometrical Tree”
National University of Singapore
Singapore
Weizhu Bao
Wenji Xu
Jing Zhang
Jingyi Lu
“The Secrets of Leaves”
Zhejiang University
Hangzhou, China
Zhiyi Tan
Cheng Fu
Danting Zhu
Hangqi ZhaoRiverProblem“Best Schedule to Utilize the Big Long River”
Peking University
Beijing, China
Liu Xu Feng
Nan Bi
Chenwei Tian
Yuan Liu
“Computing Along the Big Long River”
Western Washington University
Bellingham, WA
Edoh Y. Amiran
Chip Jackson
Lucas Bourne
Travis Peters
“Optimization of Seasonal Throughput and
Campsite Utilization on the Big Long River”
University of Colorado
Boulder, CO
Anne M. Dougherty
Stephen M. Kissler
Christopher Corey
Sean Wiese
“Getting Our Priorities Straight”
Bethel University
Arden Hills, MN
Nathan Gossett
Michael D. Tetzlaff
Blaine Goscha
Jacob Smith198TheUMAP Journal 33.3 (2012)“Optimal Scheduling for the Big Long River”
University of Colorado
Boulder, CO
Anne M. Dougherty
Tracy Babb
Christopher V. Aicher
Daniel J. Sutton
“C.A.R.S.: Cellular Automaton Rafting Simulation”
University of Louisville
Louisville, KY
Changbing Hu
SurajKannan
Joshua Mitchell
James JonesAwards and ContributionsEach participatingMCM advisor and team member received a certi?cate
signed by the Contest Director and the appropriate Head Judge.
INFORMS, the Institute for Operations Research and the Management
Sciences,recognized the team from the ShanghaiForeign Language School,
China (Leaf Problem) and the team of Babb, Aicher, and Sutton from the
University of Colorado (River Problem) as INFORMS Outstanding teams
and provided the following recognition:?a letter of congratulations from the current president of INFORMS to
each team member and to the faculty advisor;?a check in the amount of $300 to each team member;?a bronze plaque for display at the team’s institution, commemoratingteam members’achievement;?individual certi?cates for team members and faculty advisor as a per-sonal commemoration of this achievement;and?a one-year student membership in INFORMS for each team member,which includes their choice of a professional journal plus theOR/MS
Todayperiodical and the INFORMS newsletter.
The Society for Industrialand Applied Mathematics (SIAM)designated
one Outstanding team from each problem as a SIAM Winner. The SIAM
Award teams were from Zhejiang University (Leaf Problem) and the Uni-
versity of Louisville (River Problem). Each team member was awarded a
$300 cash prize, and the teams received partial expenses to present their
results in a special Minisymposium at the SIAM Annual Meeting in Min-
neapolis, MN in July. Their schools were given a framed hand-lettered
certi?cate in gold leaf.
The MathematicalAssociation ofAmerica (MAA)designated one North
American team from each problem as an MAA Winner. The Winner for theResults of the2012 MCM199
Leaf Problem was a Finalist team from Cornell University with members
Dennis Chua, Jessie Lin, and Alvin Wijaya, and advisor John R. Callister.
The winner for the River Problem was the Outstanding team of Kissler,
Corey, and Wiese from the University of Colorado. With partial travel sup-
port from theMAA,theteamspresented their solution at a specialsession of
the MAA Mathfest in Madison,WIin August. Each team member was pre-
sented a certi?cate by an of?cial ofthe MAA Committee on Undergraduate
Student Activities and Chapters.Ben Fusaro AwardOne Meritorious or Outstanding paper is selected for the Ben Fusaro
Award, named for the Founding Director of the MCM and awarded for
the ninth time this year. It recognizes an especially creative approach;
details concerning the award, its judging, and Ben Fusaro are in Vol. 25
(3) (2004): 195–196. The Ben Fusaro Award Winner, for the Leaf Problem,
was the Outstanding team from the National University of Singapore. A
commentary about it appears in this issue.Frank Giordano AwardFor the?rst time, the MCM is designating a paper with the Frank
Giordano Award. This award goes to a paper that demonstrates a very
good example of the modeling process in a problem featuring discrete
mathematics—this year, the River Problem. Having worked on the con-
test since its inception, Frank Giordano served as Contest Director for 20
years. The Frank Giordano Award for 2012 went to the Outstanding team
from Western Washington University in Bellingham, WA.JudgingDirector
William P. Fox, Dept. of Defense Analysis, Naval Postgraduate School,
Monterey, CA
AssociateDirectors
Patrick J. Driscoll, Dept. of Systems Engineering, U.S. Military Academy,
West Point, NY
Kelly Black, Mathematics Dept., Clarkson University, Potsdam, NYLeaf ProblemHead Judge
Patrick Driscoll, Dept. of Systems Engineering, U.S. Military Academy,
West Point, NY200TheUMAP Journal 33.3 (2012)
AssociateJudges
William C. Bauldry, Chair-Emeritus, Dept. of Mathematical Sciences,
Appalachian State University, Boone, NC (Head Triage Judge)
Karen Bolinger, Dept of Mathematics, Clarion University, Clarion, PA
Tim Elkins, Dept. of Systems Engineering, U.S. Military Academy,
West Point, NY(INFORMSJudge)
J. Douglas Faires, Youngstown State University, Youngstown, OH
(MAA Judge)
Ben Fusaro, Dept. of Mathematics,Florida State University, Tallahassee,FL
(SIAM Judge)
Michael Jaye, Dept. of Defense Analysis, Naval Postgraduate School,
Monterey, CA
Mario Juncosa, RAND Corporation, Santa Monica, CA (retired)
Peter Olsen, Johns Hopkins Applied Physics Laboratory, Baltimore, MD
John Scharf, Dept. of Mathematics, Engineering, and Computer Science,
Carroll College, Helena, MT (Fusaro Award Judge)
Michael Tortorella, Dept. of Industrial and Systems Engineering,
Rutgers University, Piscataway, NJ(Problem Author)
Dan Zwilliger, Raytheon, Boston, MA
Regional Judging Session at the U.S. Military Academy
Head Judge
Patrick J. Driscoll, Dept. of Systems Engineering
AssociateJudges
Tim Elkins, James Enos, Kenny McDonald, and Elizabeth Schott,
Dept. of Systems Engineering
Paul Steve Horton, Dept. of Mathematical Sciences
Jack Picciuto, Of?ce of Institutional Research
—all from the United States Military Academy at West Point, NY
Paul Heiney, Dept of Mathematics, U.S. Military Academy Preparatory
School, West Point, NY
Ed Pohl, Dept. of Industrial Engineering
Tish Pohl, Dept. of Civil Engineering
—both from University of Arkansas, Fayetteville, AR
Triage Session at Appalachian State University
Head TriageJudge
William C. Bauldry, Chair, Dept. of Mathematical Sciences
AssociateJudges
Bill Cook, Ross Gosky, Jeffry Hirst, Katie Mawhinney, Trina Palmer, Greg
Rhoads, Ren′e Salinas, Tracie McLemore Salinas, Kevin Shirley, and Nate
Weigl
—all from the Dept. of Mathematical Sciences, Appalachian State
University, Boone, NCResults of the2012 MCM201
Amy H. Erickson and Keith Erickson
—Dept. of Mathematics, Georgia Gwinnett College, Lawrenceville, GA
Steven Kaczkowski and Douglas Meade
—Dept. of Mathematics, University of South Carolina, Columbia, SCRiverProblemHead Judge
Maynard Thompson, Mathematics Dept., University of Indiana,
Bloomington, IN
AssociateJudges
Peter Anspach, National Security Agency, Ft. Meade, MD
(Head Triage Judge)
Robert Burks, Operations Research Dept., Naval Postgraduate School,
Monterey, CA
Jim Case, Baltimore, MD (SIAM Judge)
Veena Mendiratta, Lucent Technologies, Naperville, IL
Greg Mislick, Operations Research Dept., Naval Postgraduate School,
Monterey, CA
Scott Nestler, Operations Research Dept., Naval Postgraduate School,
Monterey, CA
Jack Picciuto, Of?ce of Institutional Research, U.S. Military Academy,
West Point, NY
Kathleen M. Shannon, Dept. of Mathematics and Computer Science,
Salisbury University, Salisbury, MD (MAA Judge)
Dan Solow, Case Western Reserve University, Cleveland, OH
(INFORMSJudge)
Marie Vanisko, Dept.ofMathematics,Engineering,and Computer Science,
Carroll College, Helena, MT (Giordano Award Judge)
Richard Douglas West, Francis Marion University, Florence, SC
(Giordano Award Judge)
Regional Judging Session at the Naval Postgraduate School
Head Judges
William P. Fox, Dept. of Defense Analysis
Frank R. Giordano, Dept. of Defense Analysis
AssociateJudges
Michael Jaye, Dept. of Defense Analysis
Robert Burks, Greg Mislick, and Scott Nestler, Operations Research Dept.
—all from the Naval Postgraduate School, Monterey, CA
Joanna Bieri, University of Redlands, Redlands, CA
Rich West, (retired) PA202TheUMAP Journal 33.3 (2012)
Triage Session at Carroll College
Head Judge
Marie Vanisko
AssociateJudges
Terry Mullen and Kelly Cline
—all from Dept. of Mathematics, Engineering, and Computer Science,
Carroll College, Helena, MT
Triage Session at the National Security Agency
Head TriageJudge
Peter Anspach, National Security Agency (NSA), Ft. Meade, MD
AssociateJudges
Jim Case, Dean McCullough, and judges from within NSASources of the ProblemsThe Leaf Problem was contributed by Lee Zia (Program Director, Na-
tional Science Foundation Division of Undergraduate Education). The
River Problem wascontributed by CatherineRoberts(Dept.ofMathematics
and Computer Science, College of the Holy Cross, Worcester, MA).AcknowledgmentsMajor fundingfor theMCMisprovided by theNationalSecurityAgency
(NSA) and by COMAP. Additional support is provided by the Institute for
Operations Research and the Management Sciences (INFORMS), the Soci-
ety for Industrial and Applied Mathematics (SIAM), and the Mathematical
Association of America (MAA). We are indebted to these organizations for
providing judges and prizes.
We also thank for their involvement and un?agging support the MCM
judges and MCM Board members, as well as?Two Sigma Investments.“This group of experienced, analytical, and
technical?nancial professionals based in New York builds and operates
sophisticated quantitative trading strategies for domestic and interna-
tional markets. The?rm is successfully managing several billion dol-
lars using highly-automated trading technologies. For more information
about Two Sigma, please visithttp://www.twosigma.com.”Results of the2012 MCM203CautionsTothereader of research journals:
Usually a published paper has been presented to an audience, shown
to colleagues, rewritten, checked by referees, revised, and edited by a jour-
nal editor. Each paper here is the result of undergraduates working on
a problem over a weekend. Editing (and usually substantial cutting) has
taken place; minor errors have been corrected, wording has been altered
for clarity or economy, and style has been adjusted to that ofThe UMAP
Journal. The student authors have proofed the results. Please peruse these
students’efforts in that context.
Tothepotential MCM advisor:
It might be overpowering to encounter such output from a weekend
of work by a small team of undergraduates, but these solution papers are
highly atypical. A team that prepares and participates will have an enrich-
ing learning experience, independent of what any other team does.
COMAP’sMathematicalContestin Modelingand InterdisciplinaryCon-
test in Modeling are the only international modeling contests in which
students work in teams. Centering its educational philosophy on mathe-
matical modeling, COMAP serves the educational community as well as
the world of work by preparing students to become better-informed and
better-prepared citizens.Editor’s NoteThe complete roster of participating teams and results has become too
longtoreproducein theJournal. Itcan now befound attheCOMAPWebsite,
in separate?les for each problem:http://www.comap.com/undergraduate/contests/mcm/contests/
2012/results/2012-Problem-A.pdf
http://www.comap.com/undergraduate/contests/mcm/contests/
2012/results/2012-Problem-B.pdf204TheUMAP Journal 33.3 (2012)Media ContestThis year, COMAP again organized an MCM/ ICM Media Contest.
Over the years, contest teams have increasingly taken to various forms
of documentation of their activities over the grueling 96 hours—frequently
in video, slide, or presentation form. This material has been produced to
provide comicrelief and let off steam, as well as to provide some memories
days, weeks, and years after the contest. Weloveit, and we want to encour-
age teams (outside help is allowed) to create media pieces and share them
with us and the MCM/ ICM community.
The media contest iscompletely separatefrom MCM and ICM. No matter
how creative and inventive the media presentation, it hasnoeffect on the
judging of the team’s paper for MCM or ICM. We do not want work on the
media project to detract or distract from work on the contest problems in
any way. This is a separate competition, one that we hope is fun for all.
Further information about the contest is athttp://www.comap.com/undergraduate/contests/mcm/media.html.
There were 11 entries, from Zhejiang University, United States Military
Academy, Dalian Maritime University, and Beijing Institute of Technology.
Outstanding Winners:?United States Military Academy, joint entry from three teams
(Nolan Miles, Andrew Lopez, Benjamin Garlick, Brian Kloiber, Calla
Glavin, Kailee Kunst, Samuel Ellis, Tanner Robertson, Robert Hume)?Zhejiang University
(Jiajun Chen, Yuchen Lei, Canyang Jin)
Finalists:?Dalian Maritime University
(Chengcheng Bi, Xuefu Bai, Bo Han)?Dalian Maritime University(Zuchen Tang, Zihao Yu, Bowen Zhang)
The remaining entries were awarded Honorable Mention.
Complete results, including links to the Outstanding and Finalist videos,
are athttp://www.comap.com/undergraduate/contests/mcm/contests/
2012/results/media/media.html.A CloseLook at Leaves205A Close Look at LeavesBo Zhang
Yi Zhang
Tiankun LuShanghai Foreign Language School
Shanghai, China
Advisor: YiJung WangAbstract
We construct four models to study leaf classi?cation, relationships be-
tween leaf shape and leaf distribution, correlations between leaf shape and
tree pro?le, and total leaf mass of a tree.
Model 1 deals with the classi?cation of leaves. We focus primarily on the
most conspicuous characteristic of leaves, namely, shape. We create seven
geometric parameters to quantify the shape. Then we select six common
types of leaves to construct a database. By calculating the deviation index of
the parameters of a sample leaf from those of typical leaves, we can classify
the leaf. To illustrate this classi?cation process, we use a maple leaf as a test
case.
Model 2 studies the relationship between leaf shape and leaf distribu-
tion. First, we simplify a tree into an idealized model and then introduce
the concept of solar altitude. By analyzing the overlapping individual shad-
ows through considering the relationship between leaf length and internode
length under different solar altitudes, we?nd that the leaf shape and distri-
bution are optimized to maximize sunlight exposure according to the solar
altitude. We apply the model to three test types of trees.
Model 3 discusses the possible association between tree pro?le and leaf
shape. Based on the similarity between the leaf veins and branch structure of
trees,weproposethatleafshapeisatwo-dimensionalmimicofthetreepro?le.
Employing the method of Model 1, we set several parameters re?ecting the
general shape of each tree and compare them with those of its leaves. With
the help of statistical tools, we demonstrate a rough association between tree
pro?le and leaf shape.
Model 4 estimates the total leaf mass of a tree given size characteristics.
Carbon dioxide (CO2) sequestration rate and tree age are introduced to es-
tablish the link between leaf mass and tree size. Since a unit mass of a leaf
sequesters CO2at a constant rate, the CO2sequestration rate has a quadratic
relationship with the age of the tree, and the size the tree experiences logistic
growth.TheUMAPJournal33(3)(2012)205–222.
c
?Copyright2012by COMAP,Inc. Allrightsreserved.
Permission to make digital or hard copies of part or all of this work for personal or classroom use
is granted without fee provided that copies are not made or distributed for pro?t or commercial
advantage and that copies bear this notice. Abstracting with credit is permitted, but copyrights
for components of this work owned by others than COMAP must be honored. To copy otherwise,
to republish, to post on servers, or to redistribute to lists requires prior permission from COMAP.206TheUMAP Journal 33.3 (2012)IntroductionWe tackle four main subproblems:?classi?cation of leaves,?the relationship between leaf distribution and leaf shape,?the relationship between the tree pro?le and the leaf shape, and?calculation of the total leaf mass of a tree.
To tackle the?rst problem, we select a set of parameters to quantify the
characters of the leaf shape and use the leaf shape as the main standard for
our classi?cation process.
For the second question, we use the overlapping area that one leaf’s
shadow casts on the leafdirectly under it as the link between the leafdistri-
bution and the leaf shape, since the leaf shape affects the overlapping. We
assume that the leaf distribution tries to minimize the overlapping area.
As for the third question, we set parameters for the tree pro?le and
comparethosewith theparametersfor thetree’sleafshapetojudgewhether
there is a relation between tree pro?le and leaf shape.
We use age to link the size of tree and the total weight of its leaves,
because the tree size has an obvious relationship with its age and the age
affectsatree’ssequestration ofcarbon dioxide,which affectsthetotalweight
of a tree’s leaves.Assumptions?The trees are all individual (“open grown”) trees, such as are typically
planted along streets, in yards, and in parks. Our calculation does not
apply to densely raised trees, as in typical reforestation projects where
large numbers of trees are planted close together.?The shape of the leaves does not re?ect special uses for the trees, such
as to resist extremely windy, cold, parched, wet, or dry conditions, or to
produce food.?The type of the leaf distribution (leaf length and internode distance rela-
tion) re?ects the tree’s natural tendency to sunlight.?The tree pro?le that we consider is the part above ground, including thetrunk, the branches, and leaves.?All parts of a leaf can lie?at, and the thickness or protrusion of veins canbe neglected.?Leaves are the only part of the tree that reacts in photosynthesis andrespiration, so that the carbon dioxide sequestration of a tree is the sum
of the sequestration of the leaves.A Close Look at Leaves207?The sequestration of a tree or a leaf is the net amount of CO2?xed in a
tree, which is the difference between the CO2released in respiration andthe CO2absorbed in photosynthesis.?The trees are in healthy, mature, and stable condition. Trees of the samespecies have same characteristics.Model 1: Leaf Classi?cationDecisive ParametersTo classify the shape of a leaf, we set seven parameters and establish adatabase for comparison.RectangularityWe de?ne the ratio of the area of the leaf to the area of its minimumbounding rectangle as the leaf’srectangularity(Figure 1).Figure 1.
Figure 2. Figure 3. Figure 4.
The photographs of leaves inFigures 1–4are reproduced (with overlays by the authors of this
paper) from Knight et al. [2010], by kind permission of that paper’s authors.Aspect RatioThe aspect ratio is the ratio of the height of the minimum bounding
rectangle to its width. (Figure 2).CircularityTo evaluate how round a leaf is, we consider that ratio if the ex-circle tothe in-circle. (Figure 3).Form FactorForm factor, a famous shape description parameter, is calculated as
FF=
4πA
P2,208TheUMAP Journal 33.3 (2012)
whereAis the area of the leaf andPis its perimeter.Edge Regularity Area IndexAlthough the aspect ratio and the rectangularity of two leaves may be
similar, the contour or the exact shape of two leaves may vary greatly.
To take the different contour of the leaf into consideration, we join every
convexdot along thecontour and develop what wecalltheboundingpolygon
area. The ratio between the leaf area and this bounding polygon area is the
edge regularity area index. The closer this ratio is to 1, the less jagged and
smoother the leaf’s contour is (Figure 3).Edge Regularity PerimeterIndexSimilarly,we develop another parameter,theboundingpolygon perimeter,
the perimeter of the polygon when we join the convex dots of a leaf. We
de?ne the ratio of the bounding polygon perimeter to the perimeter of the
leaf to be theedgeregularity perimeter index. The smaller this ratio, the more
jagged and irregular the contour of the leaf is (Figure 3).Proportional IndexSince it is also highly critical to capture the spatial distribution of dif-
ferent portions of a leaf along its vertical axis, we divide the minimum
bounding rectangle into four horizontal blocks of equal height, and then
calculate the proportion of the leaf area in a particular region to the total
leaf,which we refer to as theproportionalindex (PI)for that region (Figure 4).
Hence, the PI is a vector of length four.Common Types of LeavesWe develop a database of the six most common leaf types in North
America (Figure 6), using the seven parameters discussed above.Table
2gives the values of the parameters for each leaf type, as measured from
scans of photos of leaves in Knight et al. [2010].ComparisonGiven a speci?cleaf,we calculatethe seven characteristicsofit and com-
pare them with our database by calculating the squared deviation of each
parameter of the given leaf from the corresponding standard parameter of
each category. We realize that some of the parameters are somehow more
important than others. So in an effort to make our model more accurateA Close Look at Leaves209Figure 5.The six most common seen leaf types in North America. (The photos, from Knight et al.
[2010], are reproduced by kind permission of that paper’s authors.)
Table 1.
Parameter values for the six leaf types.
Type 1 2 3 4 5 6
Rectangularity 0.6627 0.5902 0.6250 0.4772 0.4876 0.6576
Aspect Ratio 0.8615 0.6600 0.1800 0.6383 0.4792 0.3111
Circularity 0.8140 0.5432 0.4564 0.3454 0.3123 0.3311
Form Factor 0.9139 0.6206 0.2823 0.2470 0.3662 0.4956
ER Area Index 0.9322 0.8780 0.9091 0.8500 0.7880 0.8895
ER Perimeter Index 0.8727 0.8889 0.9384 0.8602 0.8231 0.9903
PI10.0649 0.0769 0.1179 0.1909 0.1299 0.2920
PI20.2958 0.3555 0.2208 0.3892 0.3606 0.4187PI30.3439 0.4243 0.4139 0.3047 0.4123 0.2677PI40.2954 0.1433 0.2474 0.1152 0.0970 0.0220and reliable, we introduce a weightedindex of deviationID, withID=7?i=1wiIi,where eachIiis the squared deviation, except thatI7=
1
44?j=1(PIj?PInew,i)2.We determine the weights via the Analytical Hierarchy Process (AHP)
[Saaty 1982]. We build a7×7matrix reciprocal matrix by pair comparison:210TheUMAP Journal 33.3 (2012)?
?
?
?
?
?
?
?
?R AR C FF ERAI ERPI PI
R1 1/3 1 1/4 1/2 1/2 1/7AR3 1 3 1 2 2 1/3C1 1/3 1 1/4 1/2 1/2 1/7FF4 1 4 1 3 3 1/2ERAI2 1/2 2 1/3 1 1 1/4ERPI2 1/2 2 1/3 1 1 1/4PI7 3 7 2 4 4 1
?
?
?
?
?
?
?
?
?
.The meaning of the number in each cell is explained inTable 2. The
numbers themselves are based on our own subjective decisions.Table 2.The multiplication table ofD10.
Intensity of Value Interpretation
1 Requirementsiandjhave equal value.
3 Requirementihas a slightly higher value thanj.
5 Requirementihas a strongly higher value thanj.
7 Requirementihas a very strongly higher value thanj.
9 Requirementihas an absolutely higher value thanj.
2, 4, 6, 8 Intermediate scales between two adjacent judgments.
Reciprocals Requirementihas alowervalue thanj.We then input the matrix into a Matlab program that calculates the
weightwiof each factor, as given inTable 3.Table 3.
AHP-derived weights.
Factor R AR C FF ERAI ERPI PI
Weight 0.0480 0.1583 0.0480 0.2048 0.0855 0.0855 0.3701We test the consistency of the preferences for this instance of the AHP.
For good consistency [Alonso and Lamata 2006, 446–447]:?Theprincipaleigenvalueλmaxofthematrixshould beclosetothenumbernof alternatives, here 7; we getλmax= 7.05.?The consistency index CI =(λmax?n)/(n?1)should be close to 0; weget CI= 0.009.?The consistency ratio CR = CI/ RI (where RI is the average value of CIfor random matrices) should be less than 0.01; we get CR= 0.006.
Hence, our decision method displays perfectly acceptable consistency and
the weights are reasonable.A Close Look at Leaves211Model TestingWe use a maple leafofFigure 6to test our classi?cation model. Visually,
it resembles Category 4 most.Figure 6.Test maple leaf.Now we test this hypothesis with our model. First, we process theimage of the leaf, calculating rectangularity, aspect ratio, circularity, form
factor, edge regularity area index, edge regularity perimeter index, and the
proportional index, with values as inTable 2. The values of the seven
parameters are shown inTable 4.Table 4.
Parameter values for the sample maple leaf.
Factor R AR C FF ERAI ERPI PI1PI2PI3PI4Measured
0.355 0.908 0.269 0.157 0.625 0.379 0.097 0.463 0.431 0.009
valueFinally, we use our weights to calculate the index of deviationIDof the
maple leaf from each of the six categories of leaves considered earlier. We
show the results inTable 5.Table 5.
Index of deviation of maple leaf from six common leaf categories.
Category 1 2 3 4 5 6
Index of deviationID0.27 0.12 0.230.080.24 0.18Sincetheindexofdeviation between thegiven mapleleafand Category4
is smallest, the model predicts that the maple leaf falls into Category 4—
which conclusion is consistent with our initial hypothesis.ConclusionOur model is robust under reasonable conditions, as can be seen from
the testing above. However, since our database contains only the six
commonly-seen leaf types in North America, the variety in the database
has room for improvement.212TheUMAP Journal 33.3 (2012)Model2: LeafDistributionandLeafShapeIntroductionGeneticand environmentalfactors contribute to the pattern ofleafveins
and tissue, thereby determining leaf shape. In this model, we investigate
how leaf distribution in?uences leaf shape.Idealized Leaf Distribution ModelWe construct an idealized model that immensely simpli?es the complex
situation: The tree is made up of a branch perpendicular to the ground
surface, and two identical leaves grown on the branch ipsilaterally (on the
same side) and horizontally. The leaves face upward and point toward
the sun in the sky. We suppose that the tree is at latitudeL(Northern
Hemisphere). Let the greatest average solar altitude in a year, which is
attained at noon on the vernal equinox, beα.
Figure 7illustrates our primitive model of a tree at noon on the vernal
equinox.Figure 7.Primitive model of a tree, at noon on the vernal equinox.Analysis of Overlapping AreasOur focus is the partly shaded leaf inFigure 7. The output of the model
is what proportion of the leaf (PL) is shaded. We divide the situation into
three scenarios, depending on the in?uence of the angleαon PL.A Close Look at Leaves213SolarAltitude Near90?This situation usually takes place in tropical regions, where leaf shapes
are typically broad and wide and the tree crown usually contains only one
layer of leaves. This can be explained in terms ofFigure 7: Withαnear
90?, the shaded part of the lower leaf would be too big to supply enoughsolar energy for photosynthesis, and the greatest absorption of energy can
be achieved by a broad leaf shape.SolarAltitude Near0?This situation usually takes place in frigid zones, where leaves are typ-
ically acicular (needle-shaped) and the tree crown contains dense layers of
closely-grown leaves. In terms ofFigure 7: Withαnear 0?, the shaded part
ofthe lower leafwould approach zero, allowing a much more concentrated
distribution of leaves than in other situations. In addition, the maximum
absorption of energy can be best achieved by needle-like leaves.SolarAltitude within Normal RangeThis scenario is typical in the temperate zone on earth, where sunlight
irradiates the leaves in a tilted way. It is also the case in which our idealized
model is the most suitable. Another crucial factor that we control in this
case is the distancehbetween the two points connecting the leaves and
the branch. We assume that a tree’s leaf distribution tries to minimize the
overlappingarea between leaves,soour modelinvestigatesthequantitative
relationship between the overlapping area and the shape of the leaf.
To simplify the model, we model the leaf as a rhombus, whose major
axis has lengthLmajorand whose minor axis has lengthLminor. Also, we
?x the area of the leaf asA, to ensure constant exposure area to the sun.
With area?xed, now we only need to change the length of the major axis
to change the shape of the leaf (seeFigure 8).Figure 8.Two leaves of the same area but different lengths of major axis.214TheUMAP Journal 33.3 (2012)
Also, since we have?xed the area of the leaf and just adjust its shape,
the minimum proportion of the lower leaf shaded isE=
AoverlappingA
,whereAoverlappingis the smallest overlapping area.
The most ef?cient situation is for both leaves to be totally exposed to
sunlight, as inFigure 9a: For some valueh=h0, we achieveE= 0.Figure 9a.Upper leaf does not overlap
lower one.
Figure9b.Upper leafoverlapslower one.What ifh < h0, as inFigure 9b? We can easily give the relationship
amongh,Lmajor, andEfor a given?xed solar altitudeα:E=
?
Lmajortanα?h
Lmajortanα
?2=
?
1?
h
Lmajortanα
?2.For?xedhandα, the overlap area increases as the length of the leaf
increase. The closerLmajoris toh/tanα, the smaller the overlap.From our discussion, the best leaf distribution occurs whenh=h0,which meansh=Lmajortanα.Model TestingWe need to test whether this relation between leaf distribution and leafshape is right. We offer data on leaf lengthLmajorand internode distancehof several kinds of trees and use our formula to calculate the respectivesolar altitudes of the trees. By converting the solar altitude into latitude,
we can predict the origin of a tree! We choose species native to China:?Ligustrum quihoui Carr.(waxy-leaf privet or Quihou privet, a semi-
evergreen to evergreen shrub);A CloseLook at Leaves215?Osmanthusfragrans(sweet olive,tea olive,or fragrant olive,an evergreen
shrub or small tree that is the city?ower of Hangzhou, China); and?Camelliajaponica(Japanese camellia)
as our test trees.Table 6shows the results.Table 6.
Test of model for leaf shape as a function of latitude.
Tree kindLmajorhCalculated Latitude
tanαPredicted True
Ligustrum quihoui Carr.2 2.5 1.25 38.7?35–35?Osmanthus fragrans10 18.5 1.85 28.4?23–29?Camelliajaponica6 9 1.50 33.7?32–36?The predicted latitudes of origin are close to the true latitudes, con-
?rming our hypothesis of a relationship between leaf distribution and leaf
shape.Model 3: Tree Pro?le and Leaf ShapeHypothesisSince?the vein structure determines the leaf shape;?the branch structure determines the tree pro?le; and?to some degree, the leaf veins resemble branches,
we have a wild hypothesis that the leaf shape is two-dimensional mimic of
the tree pro?le.Comparison of Leaf Shape and Tree ContourThe leafshape is two-dimensional,so it is relatively easy to study its pa-
rameters. However, the tree pro?le is three-dimensional, so it is important
to?nd a two-dimensional characteristic of a tree to use for comparison.
Since the longitudinal section of a particular tree re?ects its general size
characteristics, we focus on that.Tree Pro?le Classi?cationIn theleafclassi?cation model,thereare6generalclassesofleaves. Since
we are comparing only the general resemblance between leaf and tree, we216TheUMAP Journal 33.3 (2012)
incorporate Class 5 (elliptic leaf with serrated margin) into Class 2 (elliptic
leaf,smooth margin). As a result,we get 5classes ofleaves and 5respective
types of trees:?Class 1: Cordate (Texas redbud)?Class 2 and Class 5: Elliptic (camphor tree)?Class 3: Subulate (pine)?Class 4: Palmate (oak)?Class 6: Obovate (mockernut hickory)Parameters of the TreeWe appoint three parameters for the longitudinal section that can be
compared with those of the leaf shape, namely, rectangularity, aspect ratio,
and circularity.
Table 7shows the measurements for both trees and leaves.Table 7.
Comparison of leaf parameters and tree parameters.
Class 1 2 and 5 3 4 6
Rectangularity (R)
Leaf 0.6627 0.5902 0.6250 0.4772 0.6576
Tree 0.6281 0.6846 0.5180 0.5292 0.6238
Aspect Ratio (AR)
Leaf 0.8615 0.6600 0.1800 0.6383 0.3111
Tree 0.7914 0.7243 0.6601 0.7980 0.6750
Circularity (C)
Leaf 0.6396 0.5698 0.1834 0.3069 0.2889
Tree 0.5800 0.5928 0.2895 0.4070 0.3866For each of the parameter types, we drew a scatterplot, calculated the
correlation, and investigated the statistical signi?cance of the resulting line
of best?t. Aspect ratio (AR) and circularity (C) were each statistically
signi?cant, pointing to linear relationships;rectangularity (R) was not.ConclusionThetestsofaspect ratio and circularity support thetheory that leafshape
is a two-dimensional mimic of the tree contour. Thus, the shape of leaf
resembles the shape of tree to some extent.A Close Look at Leaves217Model 4: Leaf MassIntroductionA simple way to calculate the total leaf mass is to multiply the number
of leaves by the mass of a single leaf. Our method is more accurate and less
demanding, in that our model is (surprisingly!) independent of these two
factors but dependent on a more reliable factor of a grown tree: photosyn-
thesis. Our methodology of estimating the leaf mass of a tree is based on
three variables:?tree age;?growth rate, which is determined by tree species; and?general type (hardwood or conifer).
In other words, given the age and type of a tree, we can estimate the total
mass of leaves. In this model, CO2is used as a calculating medium.Leaf Mass and Tree AgeLeaf Mass and CO2SequestrationTrees sequester CO2from the atmosphere in their leaves but mostly
elsewhere in the tree. A tree’s ability to sequester CO2is measured in termsof massASof CO2(in pounds)per gram of leaf. Hardwood trees sequestermore CO2per gram of leaf than conifers.A tree’s ability to sequester CO2is different from its ability to absorb it,sincethe tree also releasesCO2into the atmosphere as part ofits respiration.In other words,
CO2sequestration = CO2absorption?CO2release.
Now we need only to estimate the weight of CO2sequestered by the
tree and then calculate the total mass of the leaves as the ratio of the mass
of CO2sequestered to the mass of CO2sequestered per leaf:mleaves=
mCO2AS.CO2Sequestration and Tree AgeThe relationship between the amount of CO2sequestered, the age of
a tree, and the type of tree is given in a table by the Energy Information
Administration [1998, Table 2, 8–9], which also divides trees based on their
growth rate: fast, moderate, or slow.
For each growth rate, we graphed the annual sequestration rate vs. age
of the tree and?tted a quadratic model (seeFigure 10for conifer example).218TheUMAP Journal 33.3 (2012)Figure 10.Annual CO2sequestration rates, in pounds of carbon per tree per year, for three rates
of growth of conifer trees of increasing age.We were surprised to?nd that the curves?t the data perfectly! (This
fact strongly suggests that the original table values were not measured but
calculated from such a model.) From the equations of the?tted curves, we
can easily estimate the CO2sequestered for a tree ofa given age and growth
rate and consequently calculate the mass of the leaves.Tree Age and Tree SizeAbove, we used the age of a tree as a link between the two leaf massand the size characteristics of the tree. Since we now know the relationship
between the age ofa tree (ofa particular growth rate)and its totalleafmass,
now we only need to work out the relationship between the age of the tree
and the size characteristics of it. Tree size is the accumulation of growth,
which is a biological phenomenon of increase with time.
In its life cycle, a tree experiences logisticgrowth, leading to a model for
its “size,” or pro?le,P(height, mass, diameter) asP=k1?
1?ek2A?k3,henceA=k4ln
?
1?k5Pk6?
,whereAis the age of the tree and thekiare constants that depend on the
species of tree.Leaf Mass and Tree SizeFinally, we get to answer the question of whether there is a relationshipbetween leaf mass and tree size characteristics. Putting together our earlier
models, we have the relationships inFigure 11.A Close Look at Leaves219Figure 11.According to our earlier results,leafmassand treeagearerelated to each
other through CO2sequestration, and we have just determined a functionbetween tree age and tree size.Strengths and WeaknessesModel 1Strengths:Our modelis based on quantitativeanalysis,so the classi?cation processis both objective and ef?cient.
Our model is based on categories of leaf types that are the most typical
and common.Weakness:We divide leaves into only six categories, which may not cover all leaf
types.Model 2Strengths:Wehavetaken intoconsideration threeclimateconditions(tropicalzone,temperate zone,and frigid zone)in discussing the relationship between the
leaf distribution and the leaf shape.
The results of our model conform to the data that we found.220TheUMAP Journal 33.3 (2012)Weakness:We consider the leaf distribution on a single branch but have not con-
sidered the inner-in?uence between different leaves of different branches.Model 3Strength:The whole process uses data and quantitative analysis as foundations,so the output is objective and reasonable.Weakness:We have limited categories of tree pro?les.Model 4Strength:We use the carbon sequestration rate and age as the media to calculatethe total mass of leaves, which is better than trying to estimate the number
of leaves and the average weight of each.Weakness:The data are from a source that does not refer to the method of arriving
at the data.Letterto a Science Journal EditorDear Editor:
We present to you our key?ndings.
We?rst focus on the possible in?uence on leafshape ofthe leafdistribu-
tion on the tree. For survival reasons, a tree should develop an optimal leaf
distribution and shape pattern that adjust to the speci?c region of its ori-
gin, thereby gaining the most nutrients for photosynthesis by maximizing
the exposure area to sunshine. We demonstrate a mathematical relation-
ship among solar altitude, leaf shape, and leaf distribution. Based on this
?nding, we may be able to determine the best location for replanting or
assisted-migration of a tree species by observing its leaf distribution.
Our second key?nding is a rough relationship between the tree’s pro-
?le and its leaves. In fact, we hypothesize that a leaf is a two-dimensional
mimic of the tree. For several trees, we compared the shape of the leaf and
the contour of the tree,?nding similarities between certain characteristics.A CloseLook at Leaves221
This?nding is another instance of the natural world containing examples
of self-similarity, a mathematical concept that means that an object is ap-
proximately similar to a part of itself, as is the case for the mathematical
objects of the Koch snow?ake and the Mandelbrot set.
The third part of our study deals with the relationship between tree size
characteristics and the total mass of the leaves. The two are linked by the
CO2sequestration rate and the age of the tree. Hence, we can estimate the
total mass of the leaves given some pro?le parameters of a tree, such as its
height, diameter, volume, age, and type. This?nding might have potential
for agricultural and environmental uses, such as a new method to estimate
tea production or wood production, or estimation of the CO2sequestration
effect of a forest as an alleviator of global warming.
In hope of publishing our research in your journal, we enclose our re-
search paper for you to examine and judge. We are convinced that our
research on leaves promises to contribute to a variety of areas.
Sincerely yours,
Team 14990AcknowledgmentThe authors thank David Knight, James Painter, and Matthew Potter of
the Dept. of Electrical Engineering at Stanford University for permission to
reproduce photos of leaves from their paper Knight et al. [2010].ReferencesAlonso, Jos′
e Antonio, and Ma[Mar′?a] Teresa Lamata. 2006. Consistency
in the analytic hierarchy process: A new approach.International Journal
of Uncertainty, Fuzziness and Knowledge-Based Systems14 (4): 445–459.http://hera.ugr.es/doi/16515833.pdf.
Du,Ji-Xiang,Xiao-Feng Wang,and Guo-Jun Zhang. 2007. Leafshapebased
plant species recognition.Applied Mathematics and Computation185 (2)
(February 2007): 883–893.
Energy Information Administration, U.S. Department of Energy. 1998.
Method for calculating carbon sequestration by trees in urban and sub-
urban settings.ftp://ftp.eia.doe.gov/pub/oiaf/1605/cdrom/pdf/
sequester.pdf.
Im, C., H. Nishida, and T.L. Kunil. 1998. Recognizing plant species by
leaf shapes—a case study of theAcerfamily. InProceedings of 1998 IEEE222TheUMAP Journal 33.3 (2012)
International Conference on Pattern Recognition, Brisbane, August 1998,
vol. 2, 1171–1173.
Knight, David, James Painter, and Matthew Potter. 2010. Automatic plant
leaf classi?cation for a mobile?eld guide: An android application.http://www.stanford.edu/~jpainter/documents/Plant%20Leaf%
20Classification.pdfandhttp://www.stanford.edu/class/ee368/Project_10/Reports/Knight_Painter_Potter_PlantLeafClassification.pdf.
Saaty, Thomas L. 1982.Strategy and Organization, The Analytical Hierarchy
ProcessforDecisionsin aComplex World. Belmont,CA:Lifetime Learning
Pub.
Tsukaya, Hirokazu 2006. Mechanism of leaf-shape determination.Annual
Review of Plant Biology57 (1): 477–496.
Wang,Z.,Z.Chi,and D.Feng. 2003. Shape based leafimage retrieval.IEEE
Proceedings: Vision, Image, and Signal Processing150 (1) (February 2003):
34–43.
Team members Tiankun Lu, Bo Zhang, and Yi Zhang.Judges’Commentary223Judges’Commentary:
The Outstanding Leaf Problem
PapersPeter Olsen, P.E.Commander, US Coast Guard Reserve
Baltimore, MDpcolsen@gmail.comA manager would rather live with a problem he cannot solve than accept asolution hedoes not understand.—Robert E.D. “Gene” Woolsey [2003]IntroductionProblem A of the 2012 MathematicalContest in Modeling (MCM)tmwaswritten by Lee Zia, who posed a challenging problem, “How can you mea-
sure the weight of leaves on a tree?” and several equally challenging sub-
problems. The problems were easy to state, but there were no traditional
approaches. Successful teams would have to combine existing models,
data, and new ideas in creative and original ways.
The results were gratifying. The judges were impressed by the variety
of approaches submitted by the teams. The approaches were creative and
the models showed each team’s ability to use their own new ideas to re?ne
and extend work that had gone before.
No two of the Outstanding papers shared the same model. Some share
parts and data; but those are emphasized, combined, and used in different
ways. Existing work, often quickly?ndable on Google, forms the scaffold
on which each team built their own model. The?nal structures were a
pleasure to behold.Problem Statement“How much do the leaveson a tree weigh?” How might oneestimatethe
actual weight of the leaves (or for that matter any other parts of the tree)?TheUMAPJournal33(3)(2012)223–229.
c
?Copyright2012by COMAP,Inc. Allrightsreserved.
Permission to make digital or hard copies of part or all of this work for personal or classroom use
is granted without fee provided that copies are not made or distributed for pro?t or commercial
advantage and that copies bear this notice. Abstracting with credit is permitted, but copyrights
for components of this work owned by others than COMAP must be honored. To copy otherwise,
to republish, to post on servers, or to redistribute to lists requires prior permission from COMAP.224TheUMAP Journal 33.3 (2012)
How might one classify leaves? Build a mathematical model to describe
and classify leaves. Consider and answer the following:?Why do leaves have the various shapes that they have??Do the shapes “minimize” overlapping individualshadows that are cast,
so as to maximize exposure? Does the distribution of leaves within the
“volume” of the tree and its branches effect the shape??Speaking of pro?les, is leaf shape (general characteristics) related to tree
pro?le/ branching structure??How would you estimate the leaf mass of a tree? Is there a correlationbetween the leaf mass and the size characteristics of the tree (height,
mass, volume de?ned by the pro?le)?
In addition to your one-page summary sheet, prepare a one-page letter
to an editor of a scienti?c journal outlining your key?ndings.DataFor some of the subproblems, such as the leaf classi?cation, real data
could be obtained easily. For others, such as the calculation of the mass of
the leaves of the tree, it was dif?cult or impossible to obtain real data. In
these latter cases, the teams showed creativity?nding and using secondary
sources.Criteria forJudgingHere are some of the issues that kept papers from the?nal rounds:?Errors in mathematics, which quickly took them out of further consider-
ation.?Including mathematics that didn’t?t the?ow of the presentation. In afew cases, mathematics appears to have been inserted to make a paper
look more credible or to take the place of other work that had led to a
dead end.?Changing notation, sometimes even within a single section.?Using unde?ned or poorly de?ned symbols, or using symbols before
de?ning them.?Incomplete expressions, either because the team made an error or be-cause the expression did not survive the word-processor. (One of the
Outstanding papers addressed in this commentary had a few incomplete
expressions, probably because they didn’t survive the word-processor,Judges’ Commentary225
but the coherence of its model and the strength of its presentation over-
came that defect.)Modeling IssuesThis problem required two different types of model:?Increasing abstraction:The leaf classi?cation model abstracted from an
immense number of natural leaf characteristics a set of arti?cial ones
small enough to be useful for classi?cation.?Decreasing abstraction:The leaf mass problem took abstract models, ap-
plied them to data, and got concrete numerical results
Some models were dif?cult to understand; poor writing was the most
common cause. Another cause was the use of inapposite mathematics. If
the mathematics was a result of a “drive-by” insertion,?tting it into the
model could be dif?cult.
Here are a few of the modeling issues that hurt some papers’chance of
entering the?nal rounds:?Questionable,con?icting,orunjusti?ablyspeculativeassumptions. Good
papers did not assume any spherical cows (“a metaphor for highly sim-
pli?ed scienti?c models of complex real life phenomena” [Wikipedia
2012]).?Dependence ondeus ex machina: an assumption, equation, reference, or
procedure invoked without explanation or context. Often the invocation
would start with the phrase “It is well-known that...” It may be well-
known to those who know it well, but that is unlikely to be the customer
or client.?Confusing, missing, or misplaced model de?nitions; model de?nitions
are more complex and more important than mathematical ones, since
they must not only name thede?niendumbut also specify what it is and
what it is to be used for.?Failure to reach a conclusion.?Con?icting subproble models with unexplained con?icts between as-
sumptions.?Unexplained inconsistencies in data.?An unclear, incomplete, or unrepresentative letter to the journal editor.?A poor abstract:
–too much detail, so much that it was dif?cult to see the overall struc-
ture of the model or the strategy for using it; or226TheUMAP Journal 33.3 (2012)
–too little detail, so that it was dif?cult for the reader to what was
actually to be done; or
–an incomplete abstract, presenting only part of the problem.?Poor presentation, including bad prose style, poor vocabulary, and dis-
organized explanations. Good presentation won’t get a bad paper into
the?nals, but poor presentation may keep a good one out. (The weight
given to this criterion varies among the judges.)Letterto a Journal EditorThe one-page letter to a journal editor was an important part of the
problem. Its goal was to give insight into whether or not the teams could
explain their results clearly, simply, and directly. The most important crite-
rion of modeling is whether or not the models are used, either to increase
understanding directly (through use) or indirectly (through publications,
conferences,or professionaltools such as software). A modelthat cannot be
understood will not be used (see the quotation from Woolsey [2003] at the
head of this commentary). A good letter should present an overview of the
problem, technique, and results in a single page. The clarity of each team’s
letter is one indication of how their model might fare in the real world.The Outstanding PapersHong Kong Baptist UniversityThis team’s entry was nicely laid out and easy to follow. The tree-
classi?cation models appeared to be traceable back to the?rst principles of
physics.
Each model’s development began with a clear description of the ap-
proach the team intended to follow. For example, in the leaf classi?cation
subproblem theapproach wasto reduceallleafstructuresto a oneofseveral
polar coordinate functional shapes. These easily can be distinguished.
The team’s solution to the problem of?nding the mass of leaves on a
tree was unique. The team used the structural properties of the tree, not
properties of the tree canopy directly. The advantage of this approach is
that the team did not need any information about the size or density of the
canopy,the properties ofindividualleaves,or the number or distribution of
the leaves. Knowing each branch’s modulus of elasticity and its de?ection
under load provided enough information so that its leaf load could be
inferred from thebranch’sde?ection. Conceptually,thissolution wasmuch
simpler than most of the others. As a practical matter, users of this solution
might?nd dif?culty in obtaining some of the data, such as the de?ection ofJudges’Commentary227
an unloaded branch; but if they could, this would an ef?cient and elegant
solution.
The presentation was excellent for all models. The prose, graphics, and
equations?owed seamlessly throughout the paper.
The team’s letter to the editor was the paper’s one weakness. The team
employed a very high-level approach, laying out the overall goals for the
problem, but without giving insight into the models’operational details.Shanghai Foreign Language SchoolThis paper had a particularly strong beginning. Within the space of
three pages, the team?reorganized the problem into four consolidated subproblems,?stated their assumptions clearly and succinctly, and?provided a table listing their model’s parameters and their symbols.
The team’s leaf classi?cation model used seven simple measurement
procedures involving 10parameters,the most complicated ofwhich is area.
The measurementscan be conducted on-siteusing only a sheet of?ne-ruled
graph paper. Only one parameter requires calculation: division of the area
of a fractional leaf segment by the leaf’s entire area. (In times past, this
could have been done by eye with a simple nomograph. Now people will
stop and key data into calculators.)
As with the team from Hong Kong Baptist University, the model for
estimating leaf mass has an unusual approach. The model does not rely on
direct measurements of leaf characteristics or tree size. This can be used to
show that leaf-mass and tree size are correlated. The challenge in using this
model is determination of the rate of sequestration of carbon-dioxide. The
modelusessequestration data from a U.S.DepartmentofEnergy document.
The last section of the paper contained a clear and well-organized sum-
mary list of each problem’s strengths and weaknesses.
The team’s letter to the editor was clear and concise. It covered the high-
levelstatement ofthe problem,then gave enough detailofthe solution plan
that an knowledgeablebut non-expertreader could feelconversantwith the
approach.National University of SingaporeThis team’s leaf classi?cation algorithm is the simplest of the four de-
scribed in this commentary. It has four steps:?project the leaf onto a grid;?determine the grid squares covered by the projection to determine if theleaf has convexities:228TheUMAP Journal 33.3 (2012)?If it convex, it is a palmate leaf, exit;?if it is not convex, then perform further classi?cation.
The leafmass is calculated based on the team’s vector tree modeloftree-
structureand their insolation model. Thevector treemodelrepresentsa tree
as three-dimensional vectors; daughter branches are obtained by applying
a linear transform to the parents.
This entry made excellent use of graphics in presenting their models
and results.
This team’s letter to the editor successfully wove their research, their
results, and their ideas about further research into a single clear narrative.Zhejiang UniversityThis paper presented a neural-net-based leaf classi?er that was most
sophisticated of all of the leaf classi?cation schemes. The input layer had 4
nodes, the middle layer 10 nodes, and the output layer had 1 node.
The team divided a sample of leaves into four classes. They trained the
network on 32 exemplars of each class, then tested the network on 8 other
leaves drawn at random from the entire ensemble.
The network misclassi?ed 1 of the 8. In general, it’s impossible to tell
how a back-propagation reaches its results; but it’s reasonable to hypothe-
size that more training data might have corrected the one misclassi?cation.
The leaf mass estimation was the most traditional of these four papers.
It was based directly on the leaf mass constant, a known value that varies
with treespecies,and an estimateofthevolumeofan approximatingregular
solid.SummaryThese four solutions had strong similarities—importantly, not in the
solutions themselves. Models work when they provide understandable
bases for reasonable decisions. All four solutions met that criterion and
several others:?They were presented clearly.
–The descriptive text was clear. There were comparatively few errors
in grammar, vocabulary, or style; and these didn’t interfere with the
reader’s understanding.
–Graphics were appropriate and clear. They supported the argument
being made. The appropriate text referred to them.?The models were appropriate to the problem to be solved, in that
–the assumptions and goals were clearly stated;Judges’Commentary229
–thephysicswascorrectand appropriate—therewerenodeiexmachina
or spherical cows;
–there was no extraneous mathematics air-dropped into the model—
the solution was organized in sections; and
–the graphics were easy to?nd.ReferenceArney, Chris, and Kathryn Coronges. 2012. Modeling for crime busting.
TheUMAP Journal33 (3) (2012) 291–302.
Wikipedia. 2012. Spherical cow.http://en.wikipedia.org/wiki/Spherical_cow.
Woolsey, Robert E.D. 2003.Real World Operations Research: The Woolsey
Papers. Marietta, GA: Lionheart Publications.AcknowledgmentsThis paper bene?tted from insights in the Judges’Commentary by Chris
Arney and Kathryn Coronges [2012] in this issue.About the AuthorA graduate of the U.S. Coast Guard Academy, Peter Olsen retired fromthe Coast Guard Reserveas a Commander in 1997,after 27years service,ac-
tive and reserve. His most challenging assignment was to build the quanti-
tativemodelused to allocateresourcesfor theExxon Valdezoil-spillcleanup.
Of the model, Vice Admiral Robbins, the on-scene coordinator, wrote that
it was completed on time, it was used by the people who paid for it, and its
predictions were borne out by events.230TheUMAP Journal 33.3 (2012)Computing Along theBig Long River231Computing Along the Big Long RiverChip Jackson
Lucas Bourne
Travis PetersWestern Washington University
Bellingham, WA
Advisor: Edoh Y. AmiranAbstract
We develop a model to schedule trips down the Big Long River. The goal
is to optimally plan boat trips of varying duration and propulsion so as to
maximize the number of trips over the six-month season.
We model the process by which groups travel from campsite to campsite.
Subject to the given constraints, our algorithm outputs the optimal daily
schedule for each group on the river. By studying the algorithm’s long-term
behavior, we can compute a maximum number of trips, which we de?ne as
the river’s carrying capacity.
We apply our algorithm to a case study of the Grand Canyon, which has
many attributes in common with the Big Long River.
Finally, we examine the carrying capacity’s sensitivity to changes in the
distribution of propulsion methods, distribution of trip duration, and the
number of campsites on the river.IntroductionWe address scheduling recreational trips down the Big Long River so as
to maximize the number of trips. From First Launch to Final Exit (225 mi),
participants take either an oar-powered rubber raft or a motorized boat.
Trips last between 6and 18nights,with participants camping at designated
campsites along the river. To ensure an authentic wilderness experience,
at most one group at a time may occupy a campsite. This constraint limits
the number of possible trips during the park’s six-month season.
We model the situation and then compare our results to rivers with
similar attributes,thus verifying that our approach yields desirable results.
Our model is easily adaptable to?nd optimal trip schedules for rivers
of varying length, numbers of campsites, trip durations, and boat speeds.TheUMAPJournal33(3)(2012)231–246.
c
?Copyright2012by COMAP,Inc. Allrightsreserved.
Permission to make digital or hard copies of part or all of this work for personal or classroom use
is granted without fee provided that copies are not made or distributed for pro?t or commercial
advantage and that copies bear this notice. Abstracting with credit is permitted, but copyrights
for components of this work owned by others than COMAP must be honored. To copy otherwise,
to republish, to post on servers, or to redistribute to lists requires prior permission from COMAP.232TheUMAP Journal 33.3 (2012)De?ning the Problem?How should trips of varying length and propulsion be scheduled to
maximize the number of trips possible over a six-month season??How many new groups can start a river trip on any given day??What is the carrying capacity of the river—the maximum number ofgroups that can be sent down the river during its six-month season?Model OverviewWe design a model that?can be applied to real-world rivers with similar attributes (i.e.,the GrandCanyon);?is?exible enough to simulate a wide range of feasible inputs; and?simulates river-trip scheduling as a function of a distribution of triplengths (either 6, 12, or 18 days), a varying distribution of propulsion
speeds, and a varying number of campsites.
The model predicts the number of trips over a six-month season. It also
answers questions about the carrying capacity of the river, advantageous
distributions of propulsion speeds and trip lengths, how many groups can
start a river trip each day, and how to schedule trips.ConstraintsThe problem speci?es the following constraints:?Trips begin at First Launch and end at Final Exit, 225 miles downstream.?There are only two types of boats: oar-powered rubber rafts and motor-
ized boats.?Oar-powered rubber rafts travel 4 mph on average.?Motorized boats travel 8 mph on average.?Group trips range from 6 to 18 nights.?Trips are scheduled during a six-month period of the year.?Campsites are distributed uniformly along the river.?No two groups can occupy the same campsite at the same time.Computing Along theBig Long River233Assumptions?We can prescribe the ratio of oar-powered river rafts to motorized boats
that go onto the river each day.
Therecan beproblems if toomany oar-powered boats arelaunchedwith
short trip lengths.?The duration of a trip is either 12 days or 18 days for oar-powered rafts,
and either 6 days or 12 days for motorized boats.
Thissimpli?cation stillallowsourmodeltoproducemeaningfulresults
whileletting us comparetheeffect of varying trip lengths.?There can only be one group per campsite per night.
This agrees with thedesires of theriver manager.?Each day, a group can only move downstream or remain in its current
campsite—it cannot move back upstream.
Thisrestrictsthe?ow ofgroupstoasingledirection,greatly simplifying
how wecan movegroups from campsitetocampsite.?Groups can travel only between 8a.m.and 6p.m., a maximum of 9
hours of travel per day (one hour is subtracted for breaks/ lunch/ etc.).
Thisimpliesthat perday, oar-poweredraftscan travelat most 36miles,
and motorized boats at most 72 miles. This assumption allows us to
determinewhich groups can reasonably reach agiven campsite.?Groups never travelfarther than the distancethat they can feasibly travel
in a single day: 36 miles per day for oar-powered rafts and 72 miles per
day for motorized boats.?We ignore variables that could in?uence maximum daily traveldistance,
such as weather and river conditions.
Thereis noway of accurately including thesein themodel.?Campsites are distributed uniformly so that the distance between camp-
sites is the length of the river divided by the number of campsites.
Wecan thusrepresent theriverasan array ofequally-spacedcampsites.?A group must reach the end of the river on the?nal day of its trip:
–A group will not leave the river early even if able to.
–A group will not have a?nish date past the desired trip length.
This assumption?ts what we believe is an important standard for the
river manager and for thequality of thetrips.234TheUMAP Journal 33.3 (2012)Table 1.
Notation.
Symbol Meaning
gigroupi
titrip length for groupi, measured in nights;6≤ti≤18dinumber of nights groupihas spent on the riverYnumber of campsites on the river
cYlocation of campsiteYin miles downstream;0< cY<225
c0campsite representing First Launch (used to construct a waitlist of groups)c?nalcampsite (which is always “open”) representing Final Exitlilocation of groupi’s current campsite in miles down the river;0< li<225aiaverage distance that groupishould move each day to be on schedule;ai= 225/timimaximum distance that groupican travel in a single dayPipriority of groupi;Pi= (di/ti)(li/225)Gcset of groups that can reach campsitecRratio of oar-powered rafts to motorized boats launched each day
Xcurrent number of trips down Big Long River each year
Mpeak carrying capacity of the river (maximum number of groups
that can be sent down the river during its six-month season)
Ddistribution of trip durations of groups on the riverMethodsWe de?ne some terms and phrases:
Open campsite:A campsiteisopen ifthereisnogroup currentlyoccupying
it: Campsitecnis open if no groupgiis assigned tocn.
Moving to an open campsite:For a groupgi, its campsitecn, moving to
some other open campsitecm?=cnis equivalent to assigninggito thenew campsite. Since a group can move only downstream, or remain at
their current campsite, we must havem≥n.
Waitlist:The waitlist for a given day is composed ofthe groups that are not
yet on the river but will start their trip on the day when their ranking on
the waitlist and their ability to reach a campsitecincludes them in the
setGcof groups that can reach campsitec, and the groups are deemed“the highest priority.” Waitlisted groups are initialized with a current
campsite value ofc0(the zeroth campsite), and are assumed to have
priorityP= 1until they are moved from the waitlist onto the river.
Off the River:We consider the?rst space off of the river to be the “?nal
campsite”c?nal, and it is always an open campsite (so that any number of
groups can be assigned to it. This is consistent with the understanding
that any number of groups can move off of the river in a single day.Computing Along theBig Long River235The Farthest Empty CampsiteOur schedulingalgorithm usesan arrayasthedata structuretorepresent
the river, with each element of the array being a campsite. The algorithm
begins each day by?nding the open campsitecthat is farthest down the
river, then generates a setGcof all groups that could potentially reachcthat night. Thus,Gc={gi|li+mi≥c},whereliis the group’s current location andmiis the maximum distancethat the group can travel in one day.?The requirement thatmi+li≥cspeci?es that groupgimust be able to
reach campsitecin one day.?Gccan consist of groups on the river and groups on the waitlist.?IfGc=?, then we move to the next farthest empty campsite—located
upstream,closer to the start ofthe river. The algorithm always runs from
the end of the river up towards the start of the river.?IfGc?=?,then thealgorithm attemptstomovethegroup with thehighest
priority to campsitec.
The scheduling algorithm continues in this fashion until the farthest
empty campsite is the zeroth campsitec0. At this point, every group that
was able to move on the river that day has been moved to a campsite, and
we start the algorithm again to simulate the next day.PriorityOnce a setGchas been formed for a speci?c campsitec, the algorithm
must decide which group to move to that campsite. ThepriorityPiis ameasure of how far ahead or behind schedule groupgiis:?Pi>1: groupgiis behind schedule;?Pi<1: groupgiis ahead of schedule;?Pi=1: groupgiis precisely on schedule.
We attempt to move the group with the highest priority intoc.
Some examples of situations that arise, and how priority is used to re-
solve them, are outlined inFigures 1and2.Priorities and OtherConsiderationsOur algorithm always tries to move the group that is the most behind
schedule, to try to ensure that each group is camped on the river for a236TheUMAP Journal 33.3 (2012)Downstream?→
Campsite 1 2 3 4 5 6
Group A B C Open Open Farthest
PriorityPA= 1.1PB=1.5PC= 0.8open campsite
Figure 1.The scheduling algorithm has found that the farthest open campsite is Campsite 6 and
Groups A, B, and C can feasibly reach it. Group Bhas the highest priority, so we move Group Bto
Campsite 6.
Downstream?→
Campsite 1 2 3 4 5 6
Group A Open C Open Farthest B
Priority
PA=1.1PC= 0.8open campsite
Figure 2.As the scheduling algorithm progresses past Campsite 6, it?nds that the next farthest
open campsite is Campsite 5. The algorithm has calculated that Groups A and C can feasibly reach
it; sincePA> PC, Group A is moved to Campsite 5.number of nights equal to its predetermined trip length. However, in some
instances it may not be ideal to move the group with highest priority to
the farthest feasible open campsite. Such is the case if the group with the
highest priority isaheadof schedule (P <1).
We provide the following rules for handling group priorities:?Ifgiisbehindschedule,i.e.Pi>1,then movegitoc,its farthest reachableopen campsite.?Ifgiisaheadof schedule, i.e.Pi<1, then calculatediai, the number ofnights that the group has already been on the river times the average
distance per day that the group should travel to be on schedule. If the
result is greater than or equal(in miles)to the location ofcampsitec,then
movegitoc. Doing so amounts to movinggionly in such a way that itis no longer ahead of schedule.?Regardless ofPi, if the chosenc=c?nal, then do not movegiunlessti=
di. This feature ensures thatgi’s trip will not end before its designated
end date.
Theonecasewhereagroup’spriorityisdisregarded isshown inFigure3.Scheduling SimulationWe now demonstrate how our model could be used to schedule river
trips.
In the following example, we assume 50 campsites along the 225-mile
river, and we introduce 4 groups to the river each day. We project the tripComputing Along theBig Long River237Downstream?→
Campsite 170 171 ... 223 224 Off
Group D Open Open Open Open Farthest
PriorityPD=1.1open campsite
tD= 12dD= 11
Figure 3.The farthest open campsite is the campsite off the river. The algorithm?nds that Group
D could move there, but Group D hastD> dD—that is, Group D is supposed to be on the river for
12 nights but so far has spent only 11—so Group D remains on the river, at some campsite between
171 and 224 inclusive.schedules of the four speci?c groups that we introduce to the river on day
25. We choose a midseason day to demonstrate our model’s stability over
time. The characteristics of the four groups are:?g1: motorized,t1= 6;?g2: oar-powered,t2= 18;?g3: motorized,t3= 12;?g4: oar-powered,t4= 12.
Figure 5shows each group’s campsite number and priority value for
each night spent on the river. For instance, the column labeledg2gives
campsite numbers for each of the nights ofg2’s trip. We?nd that eachgiis off the river after spending exactlytinights camping, and thatP→1asdi→ti, showing that as time passes our algorithm attempts to get (andkeep) groups on schedule.Figures 6and7display our results graphically.
These?ndings are consistent with the intention of our method; we see in
this small-scale simulation that our algorithm produces desirable results.Case StudyThe Grand CanyonThe Grand Canyon is an ideal case study for our model, since it shares
many characteristics with the Big Long River. The Canyon’s primary river
raftingstretch is226miles,ithas235campsites,and itisopen approximately
six months of the year. It allows tourists to travel by motorized boat or by
oar-powered river raft for a maximum of12or 18days,respectively [Jalbert
et al. 2006].
Using the parameters of the Grand Canyon, we test our model by run-
ning a number ofsimulations. We alter the number ofgroups placed on the
water each day, attempting to?nd the carrying capacity for the river—the238TheUMAP Journal 33.3 (2012)Figure 4.Visual depiction of scheduling algorithm.Computing Along theBig Long River239
Campsite numbers and priority values for each group Number of nights spent on river
??
?? ?? ?? ?? ?? ?? ??
1 ????????????????????????????2 ?????????????????????????????3 ??????????????????????????????4 ????????????????????????????????5 ????????????????????????????????6 ?????????????????????????????7 ??????????????????????????????8 ??????????????????????????9 ??????????????????????????11 ?????????????????????????12 ??????????????????????????13 ???????????????????????14 ??????????????????????15 ??????????????16 ?????????????17 ??????????????18 ??????????????19 ??????????????20 ????????????Figure 5.Schedule for example of groups launched on Day 25.?
?
??
??
??
??
??
??
??
??
??
? ? ? ? ? ? ? ? ? ? ????????????????????
????????
??????
?????
????? ?
????? ?
????? ?
????? ?Figure 6.Movement of groups down the river based onFigure 5. Groups reach the end of the
river on different nights due to varying trip-duration parameters.240TheUMAP Journal 33.3 (2012)?
?
??
??
??
??
??
??
??
??
??
? ? ? ? ? ? ? ? ? ? ????????????????????
????????
??????
?????
????? ?
????? ?
????? ?
????? ?Figure 7.Priority values of groups over the course of each trip. Values converge toP= 1due to
the algorithm’s attempt to keep groups on schedule.maximum number ofpossibletrips over a six-month season. The main con-
straint is that each trip must last the group’s planned trip duration. During
its summer season, the Grand Canyon typically places six new groups on
the water each day [Jalbert et al.2006],so we use this value for our?rst sim-
ulation. In each simulation, we use an equal number of motorized boats
and oar-powered rafts, along with an equal distribution of trip lengths.
Our model predicts the number of groups that make it off the river
(completed trips), how many trips arrive past their desired end date (late
trips), and the number of groups that did not make it off the waitlist (total
left on waitlist). These values change as we vary the number ofnew groups
placed on the water each day (groups/ day).Table 2.
Results of simulations for the number of groups to launch each day.
Simulation Groups/ day Trips Left on
Completed Late waitlist
1 6 996
2 8 1328
3 10 1660
4 12 1992
5 14 2324
6 16 2656
7 17 2834
8 18 2988
9 19 3154 5
10 20 3248 10 43
11 21 3306 14 109Computing Along theBig Long River241
Table 1indicates that a maximum of 18 groups can be sent down the
river each day. Over the course of the six-month season, this amounts to
nearly 3,000 trips. Increasing groups/ day above 18 is likely to cause late
trips (some groups are stillon the river when our simulation ends)and long
waitlists. In Simulation 1, we send 1,080 groups down river (6 groups/ day×180days)but only 996groupsmakeit off;theother groupsbegan near the
end of the six-month period and did not reach the end of their trip before
the end of the season. These groups have negligible impact on our results
and we ignore them.Sensitivity Analysis of Carrying CapacityManagersoftheBig Long River arefaced with a similar task to that ofthe
managers of the Grand Canyon. Therefore, by?nding an optimal solution
for the Grand Canyon, we may also have found an optimal solution for
the Big Long River. However, this optimal solution is based on two key
assumptions:?Each day, we put approximately the same number of groups onto the
river; and?the river has about one campsite per mile.
We can make these assumptions for the Grand Canyon because they are
true for the Grand Canyon, but we do not know if they are true for the Big
Long River.
To dealwith theseunknowns,wecreateTable 3. Itsvaluesaregenerated
by?xing the numberYof campsites on the river and the ratioRof oar-
powered rafts to motorized boats launched each day, and then increasing
the number of trips added to the river each day until the river reaches peak
carrying capacity.Table 3.
Capacity of the river as a function of the number of campsites and the ratio of oarboats to
motorboats.
Number of campsites on the river
100 150 200 250 300
1:4 1360 1688 2362 3036 3724
Ratio 1:2 1181 1676 2514 3178 3854
oar : motor 1:1 1169 1837 2505 3173 3984
2:1 1157 1658 2320 2988 3604
4:1 990 1652 2308 2803 3402The peak carrying capacities inTable 3can be visualized as points in
a three-dimensional space, and we can?nd a best-?t surface that passes
(nearly) through the data points. This best-?t surface allows us to estimate242TheUMAP Journal 33.3 (2012)
the peak carrying capacityMof the river for interpolated values. Essen-
tially, it givesMas a function ofYandRand shows how sensitiveMis tochanges inYand/ orR.Figure 7is a contour diagram of this surface.Figure 7.Contour diagram of the best-?t surface to the points ofTable 3.The ridge along the vertical lineR= 1 : 1predicts that for any givenvalue ofYbetween 100 and 300, the river will have an optimal value ofMwhenR= 1 : 1. Unfortunately, the formula for this best-?t surface israther complex, and it doesn’t do an accurate job of extrapolating beyond
the data ofTable 3;so it is not a particularly usefultoolfor the peak carrying
capacity for other values ofR. The best method to predict the peak carrying
capacity is just to use our scheduling algorithm.Sensitivity Analysis of Carrying Capacity reRandDWe have treatedMas a function ofRandY, but it is still unknown to ushowMis affected by the mix of trip durations of groups on the river (D).Computing Along theBig Long River243
For example, if we scheduled trips of either 6 or 12 days, how would this
affectM? The river managers want to know what mix of trips of varying
duration and speed will utilize the river in the best way possible.
We use our scheduling algorithm to attempt to answer this question.
We?x the number of campsites at 200 and determine the peak carrying
capacity for values ofRandD. The results of this simulation are displayed
inTable 4.Table 4.Carrying capacity of the river by trip lengths and boat type.
Distribution of trip lengths
12 only 12 or 18 6 or 12 6, 12, or 18
1:4 2004 1998 2541 2362
Ratio 1:2 2171 1992 2535 2514
oar : motor 1:1 2171 1986 2362 2505
2:1 1837 2147 2847 2320
4:1 2505 2141 2851 2308Table 4is intended to address the question ofwhat mix oftrip durations
and speeds will yield a maximum carrying capacity. For example: If the
river managers are currently scheduling trips of length?6, 12, or 18: Capacity could be increased either by increasingRto be
closer to 1:1 or by decreasingDto be closer to “6 or 12.”?12 or 18: DecreaseDto be closer to “6 or 12.”?6 or 12: IncreaseRto be closer to 4:1.ConclusionThe river managers have asked how many more trips can be added totheBig Long River’sseason. Without knowing thespeci?csofhow theriver
is currently being managed, we cannot give an exact answer. However, by
applyingour modeltoa studyoftheGrand Canyon,wefound resultswhich
could be extrapolated to the context of the Big Long River. Speci?cally,
the managers of the Big Long River could add approximately(3,000?X)groups to the rafting season, whereXis the current number of trips and
3,000 is the capacity predicted by our scheduling algorithm.
Additionally, we modeled how certain variables are related to each
other;M,D,R, andY. River managers could refer to our?gures and
tables to see how they could change their current values ofD,R, andYtoachieve a greater carrying capacity for the Big Long River.
We also addressed scheduling campsite placement for groups moving
down the Big Long River through an algorithm which uses priority values
to move groups downstream in an orderly manner.244TheUMAP Journal 33.3 (2012)Limitations and ErrorAnalysisCarrying Capacity OverestimationOur model has several limitations. It assumes that the capacity of the
river is constrained only by the number of campsites, the trip durations,
and the transportation methods. We maximize the river’s carrying capac-
ity, even if this means that nearly every campsite is occupied each night.
This may not be ideal, potentially leading to congestion or environmental
degradation of the river. Because of this, our model may overestimate the
maximum number of trips possible over long periods of time.Environmental ConcernsOur case study of the Grand Canyon is evidence that our model omits
variables. We are con?dent that the Grand Canyon could provide enough
campsites for 3,000 trips over a six-month period, as predicted by our algo-
rithm. However, since the actual?gure is around 1,000 trips [Jalbert et al.
2006],the error is likely due to factors outside ofcampsite capacity, perhaps
environmental concerns.Neglect of RiverSpeedAnother variable that our model ignores is the speed of the river. River
speed increases with the depth and slope of the river channel, making
our assumption of constant maximum daily travel distance impossible
[Wikipedia 2012]. When a river experiences high?ow, river speeds can
double, and entire campsites can end up under water [National Park Ser-
vice 2008]. Again, the results of our model don’t re?ect these issues.ReferencesC.U. Boulder Dept. of Applied Mathematics. n.d. Fitting a surface to scat-
tered x-y-z data points.http://amath.colorado.edu/computing/Mathematica/Fit/.
Jalbert, Linda, Lenore Grover-Bullington, and Lori Crystal, et al. 2006.
Colorado River management plan. 2006.http://www.nps.gov/grca/
parkmgmt/upload/CRMPIF_s.pdf.
NationalPark Service. 2008. Grand Canyon NationalPark. High?ow river
permitinformation.http://www.nps.gov/grca/naturescience/high_
flow2008-permit.htm.
. 2011. Grand Canyon National Park. 2011 Campsite List.http:
//www.nps.gov/grca/parkmgmt/upload/2011CampsiteList.pdf.Computing Along theBig Long River245
Sullivan, Steve. 2011. Grand Canyon River Statistics Calendar Year 2010.http://www.nps.gov/grca/planyourvisit/upload/Calendar_
Year_2010_River_Statistics.pdf.
Wikipedia. 2012. River.http://en.wikipedia.org/wiki/River.Memo to Managers of the Big Long RiverIn response to your questions regarding trip scheduling and river ca-
pacity, we are writing to inform you of our?ndings.
Our primary accomplishment is the development of a scheduling al-
gorithm. If implemented at Big Long River, it could advise park rangers
on how to optimally schedule trips of varying length and propulsion. The
optimal schedule will maximize the number of trips possible over the six-
month season.
Our algorithm is?exible, taking a variety of different inputs. These
include the number and availability of campsites, and parameters associ-
ated with each tour group. Given the necessary inputs, we can output a
daily schedule. In essence, our algorithm does this by using the state of the
river from the previous day. Schedules consist of campsite assignments for
each group on the river, as well those waiting to begin their trip. Given
knowledge of future waitlists, our algorithm can output schedules months
in advance,allowing managementto scheduletheprecisecampsitelocation
of any group on any future date.
Sparing you the mathematical details, allow us to say simply that our
algorithm uses a priority system. It prioritizes groups who are behind
schedule by allowing them to move to further campsites, and holds back
groups who are ahead of schedule. In this way, it ensures that all trips will
be completed in precisely the length of time the passenger had planned for.
But scheduling is only part of what our algorithm can do. It can also
compute a maximum number of possible trips over the six-month season.
We call this the carrying capacity of the river. If we?nd we are below our
carrying capacity, our algorithm can tell us how many more groups we
could be adding to the water each day. Conversely, if we are experiencing
river congestion, we can determine how many fewer groups we should be
adding each day to get things running smoothly again.
An interesting?nding of our algorithm is how the ratio of oar-powered
river rafts to motorized boats affects the number oftrips we can send down-
stream. When dealing with an even distribution oftrip durations (from 6to
18 days), we recommend a 1:1 ratio to maximize the river’s carrying capac-
ity. If the distribution is skewed towards shorter trip durations, then our
model predicts that increasing towards a 4:1 ratio will cause the carrying
capacity to increase. Ifthe distribution is skewed the oppositeway,towards
longer trip durations, then the carrying capacity of the river will always be246TheUMAP Journal 33.3 (2012)
less than in the previous two cases—so this is not recommended.
Our algorithm has been thoroughly tested, and we believe that it is
a powerful tool for determining the river’s carrying capacity, optimizing
daily schedules,and ensuring that people will be able to complete their trip
as planned while enjoying a true wilderness experience.
Sincerely yours,
Team 13955Team members Chip Jackson, Lucas Bourne, and Travis Peters, and team advisor Edoh Amiran.Judges’Commentary247Judges’Commentary:
The Outstanding RiverProblem
PapersMarie VaniskoDept. of Mathematics, Engineering, and Computer Science
Carroll College
Helena, MT 59625mvanisko@carroll.eduProblem Overview and General RemarksThis year’s problem dealt with scheduling variable-length river trips
down a 225-mile stretch of a particular river, using either oar-powered
rubber rafts (at 4mph)or motor boats (at 8mph). A?xed starting point and
a?xed ending point were speci?ed for all trips, with campsites distributed
fairly uniformly down the river corridor. Minimal contact between groups
of visitors was desired, and no two groups could share the same campsite.
The goal was to maximize the number of trips over a six-month period,
utilizing both types of transportation and allowing for trip lengths of 6 to
18 nights on the river. In addition to the executive summary, teams were
required to write a memo to the managers of the river trips, advising them
on the optimal scheduling of trips of various lengths over the six-month
period, and taking the carrying capacity of the river into account.
The teams’ approaches varied greatly, especially regarding the num-
ber of campsites available—a factor that had a signi?cant impact on the
number of trips that could be scheduled. Many teams found that the “Big
Long River” greatly resembled a stretch of the Colorado River in the Grand
Canyon, and some used that as a case study for their models. Simulations
are available for scheduling trips on that river, but teams had to address all
of the issues raised in the problem statement and come up with a solution
that demonstrated their own creativity. The judges looked for that and for
carefully-explained mathematical model-building with sensitivity analysis
that went beyond what is found in the literature.TheUMAPJournal33(3)(2012)247–251.
c
?Copyright2012by COMAP,Inc. Allrightsreserved.
Permission to make digital or hard copies of part or all of this work for personal or classroom use
is granted without fee provided that copies are not made or distributed for pro?t or commercial
advantage and that copies bear this notice. Abstracting with credit is permitted, but copyrights
for components of this work owned by others than COMAP must be honored. To copy otherwise,
to republish, to post on servers, or to redistribute to lists requires prior permission from COMAP.248TheUMAP Journal 33.3 (2012)Executive Summary and MemoTheexecutivesummaryisofcriticalimportance,especiallyin earlyjudg-
ing. It should?motivate the reader;?be polished, with a good synopsis of key results;?give an overview of the model(s) used, together with the rationale forusing such a model and the primary results obtained from that model;
and?state speci?c results obtained for the optimal solution.
Teams were also asked to write a memo appropriate for the manager
of the Big River boat tours. Whereas the executive summary usually con-
tains technical details, this memo was intended for a nontechnical person
who wanted to know how best to schedule trips. Hence, the memo was
supposed to give speci?cs on how to schedule trips of various types to best
accommodate as many groups as possible. Vague generalizations were of
little or no value.DocumentationIn comparison with previous contests, the judges were pleased to ob-
servea noticeableimprovementin how referenceswereidenti?ed and in the
speci?cprecision oftheir documentation. Considering the online resources
available, proper documentation was an especially important factor in this
year’s problem. Despite the improvement, many papers contained charts
and graphs from Web sources with no documentation.
All graphs and tables should?have labels and/ or legends;?provide information about what is discussed in the paper;?be “called out” in the text of the paper, so as to refer the reader to them;
and?be explained in the text, including their signi?cance.
Thebestpapersused graphstohelp clarifytheirresults,and thosepapers
also documented trustworthy resources whenever used.AssumptionsTeams made many assumptions about travel along the river. Some
were appropriate and played integral roles in the models used;others wereJudges’Commentary249
super?uous. Some teams assumed that there would always be enough
customers to?ll any trips scheduled; other teams used probability distri-
butions to describe the demand for different trips at different times of the
season. Either approach could beused,but each led to different results. The
carrying capacity of the river was dependent on the number of campsites
available and the types of trips to be scheduled.
Since this is a modeling contest, much weight is put on whether or not
the model could be used (with modi?cation) in the real world. Therefore,
assumptions required for simpli?cation could not be totally unrealistic.
Also, clear writing and exposition is essential to motivate and explain as-
sumptions, and to derive and test models based on those assumptions.The Model(s)One can arrive at a fairly complete solution to this problem with pencil
and paper alone. Problem solvers should at least consider this possibility
before launching a simulation! Some teams began with a simple model,
then improved it to accommodate the requirements better. Teams should
be aware that it is not the quantity of models considered that is important,
but rather the quality of the model selected and its applicability to the case
at hand.
At a minimum, the solutions should try to come up with a mix of trips
that seem reasonable. Most teams recognized that for a 225-mile river, a
motor boat could run the entire distance in 6 to 8 nights, whereas a raft
powered by oars would need 12 to 18 nights. While it is true that requiring
only the shortest trip lengths would permit the most boats to get down
the river, it was important to consider that not all groups would choose to
travel that way.
Some teams considered a pro?t incentive when scheduling trips of
varying duration on the river and used selected numbers from the Grand
Canyon boat trips as a guide. For example, the cost of the trip might be a
constant?xed cost plus an amount based on the number of nights on the
river. In that case, shorter trips might allow more boats to launch and be
optimal in terms of pro?t. Or perhaps it would be more valuable to people
to get more time in this pristine wilderness, so they would pay a premium
for the longer trips—in which case it might be worth sending fewer boats
down the river. Many teams ignored cost/ pro?t issues.
Teams assigned campsites so as to ensure that no two sets of campers
occupied the same site at the same time. At the end of each night, the
teams had to be sure that all crafts camped in reasonable locations and that
the model did not require a boat to travel too far in any one day. Many
teams measured the percentage of campsites occupied each night as a help
in determining an optimalnumber ofcampsites and how good the solution
was from a manager’s perspective.250TheUMAP Journal 33.3 (2012)
In addition to having no two groups at the same campsite, minimum
contact between groups also implies minimizing crafts passing one another
on the river. Teams that took this into account showed true diligence. Some
teams even measured the average number of such contacts. Although ne-
glecting this aspect was not a fatal?aw, proper consideration of the cross-
ings gave the model added value.Testing the Model—Simulations and
Sensitivity AnalysisMCM teams are getting better at carrying out simulations,and this tech-
nique was of great value for the Big River problem. However, to carry out
a simulation properly, criteria had to be speci?ed for scheduling trips of
varying length. A good?owchart with examples was very powerful in
clarifying how a simulation was to be carried out. Some teams used a well-
de?ned prioritization scheme that assured that no two groups stayed at the
same campsite on any given night and rejected assignments that violated
that criterion.
Sensitivity analysis was an essential ingredient. The better papers con-
sidered how their solution was impacted by changing the number ofcamp-
sites and by changing the types of trips. This included varying the ratio of
motor boats to oar-powered rafts and varying the ratio of trip durations.
The graphical demonstration of the results of such sensitivity analysis was
a powerful way to communicate the outcomes and to check for patterns of
optimality.
Although sensitivityanalysiscould haveincluded issuesassociated with
boating accidents, inclement weather, and?ash?oods, most papers only
alluded to such possibilities. Few teams considered anything but constant
speeds for the river?ow and the boats. Some teams considered extending
the hours of travel.Strengths and WeaknessesA strong paper must assess its strengths and its weaknesses. One of the
greatest strengths of any model is how well it re?ects the real world situ-
ation. Hence, using a case study to validate a model is a powerful means
to make that case. Most papers recognized the limitations of their mod-
els in failing to consider weather, river, and individual camper issues. A
strong solution might mention among weaknessesthat assigning campsites
is something of a limitation, because an accident that prevents a boat from
reaching its assigned campsite could mess up the model. A more realistic
model would say that a given boat will go at most—rather than exactly—nJudges’Commentary251
miles per day; and a?exible model would ensure that a boat could?nd an
open campsite if it didn’t make it to its goal campsite.Concluding RemarksMathematical modeling is an art. It is an art that requires considerable
skill and practice in order to develop pro?ciency. The big problems that
we face now and in the future will be solved in large part by those with
the talent, the insight, and the will to model these real-world problems
and continuously re?ne those models. The judges are very proud of all
participants in this Mathematical Contest in Modeling and we commend
you for your hard work and dedication.About the AuthorMarie Vanisko is a Mathematics Professor Emerita from Carroll College
in Helena, Montana, where she taught for more than 30 years. She was
also a Visiting Professor at the U.S. Military Academy at West Point and
taught for?ve years at California State University, Stanislaus. She chairs
the Board of Directors at the Montana Learning Center on Canyon Ferry
Lake and serves on the Engineering Advisory Board at CarrollCollege. She
has been a judge for the MCM for 17 years and for the HiMCM for eight
years.252TheUMAP Journal 33.3 (2012)Author’s Commentary253Author’s Commentary:
The Outstanding RiverProblem
PapersCatherine A. RobertsDept. of Mathematics and Computer Science
College of the Holy Cross
Worcester MA 01610croberts@holycross.eduThis MCM problem was inspired by a research project for the Grand
Canyon NationalPark in Arizona,U.S.A.My collaboratorsand Ideveloped
a mathematicalmodel to simulate white-water rafting traf?c along the 225-
mile Colorado River corridor within the national park. The National Park
Service manages access to the river, guided by a document called the Col-
orado River Management Plan (CRMP).This research program began with
efforts to revise the 1989 CRMP in the late 1990s. Our model was used as a
tool by river managers at the National Park Service to explore options for
scheduling rafting traf?c.
At the time, every year (almost entirely over the summer months) more
than 19,000 people rafted the river on trips guided by 16 licensed commer-
cial companies, while approximately 3,500 private boaters paddled them-
selvesdown theriver. Demand for accessto theriver far exceeded supply—
a waiting list for a private river trip pass had over 7,000 names on it, and a
quarter of those people had already waited over a dozen years.
The hope was that this mathematical model would provide insight into
alternativemanagement scenariosso that park managerscould make smart
decisions that would enable as many visitors as possible to enjoy the river,
while at the same time maintaining standards for a wilderness experience.
Some simpli?cations were built into the MCM Problem, compared to
the actual situation on the Colorado River.?The campsites on the Colorado River are not distributed evenly through-
out the river corridor. Indeed,there’s a big congestion problem in a reach
of the river with few campsites and many popular attraction sites. Some
campsites are not suitable for motorized boats.TheUMAPJournal33(3)(2012)253–257.
c
?Copyright2012by COMAP,Inc. Allrightsreserved.
Permission to make digital or hard copies of part or all of this work for personal or classroom use
is granted without fee provided that copies are not made or distributed for pro?t or commercial
advantage and that copies bear this notice. Abstracting with credit is permitted, but copyrights
for components of this work owned by others than COMAP must be honored. To copy otherwise,
to republish, to post on servers, or to redistribute to lists requires prior permission from COMAP.254TheUMAP Journal 33.3 (2012)?It is permissible to have more than one group camping at the same site,
although theColoradoRiver ManagementPlan dictatesthattheschedule
should minimize any camping within sight or sound of another party.?There are two?xed-points on the river corridor—places where passen-
gers are exchanged via hiking in-and-out of the canyon or traveling via
helicopter. A trip with an exchange must make it to their site at a prede-
termined dateand time. Otherwise,therearenoassigned campsites—it’s
really impossible to assign a rafting trip a speci?ed set of campsite loca-
tions because so much (?ash?oods, boat spills, accidents, health prob-
lems) can interfere with a party’s ability to reach a certain location at a
?xed time. Moreover, the river culture is such that assigned campsites
would be anathema.
The model uses a Geographical Information System (GIS) to divide the
river into 90-meter cells. We assigned each cell speci?c attributes (camp-
site, lunch spot, dangerous rapid, hiking trail, waterfall, etc.). We used
hundreds of trip diaries and personal interviews with river guides to de-
termine appropriate weights for the popularity of camping and attraction
sites along the river corridor. Trip diaries also helped us estimate the av-
erage rate of travel of motor and oar boats through various reaches of the
river (when the river corridor narrows, the water’s velocity increases and
so boats travel through faster). The model captures the complex dynamics
of human visitors interacting with the environment and each other. It is
both temporal and spatial as it carefully tracks every move that every trip
makes.
Our model, titled the Grand Canyon River Trip Simulator (GCRTSim),
was programmed in VisualBasic. A user can build any imaginable launch
schedule and “run” the season down the virtual river. The results are then
analyzed and judged against criteria established by the Park Service.
Our model leveraged a number of mathematical theories and ideas.?Intelligentagenttheory: Each trip hasan assigned “personality”and makes
all of its decisions consistent with that personality to optimize each day.
Thus, a short commercial trip would be less likely to choose a long hike
when it needs more time just to paddle down the river. Each trip is an
intelligent agent operating within a complex system.?Decision theory: Each trip makes decisions based on a?xed set of choices
(e.g., to stop to camp or to continue to the next campsite). The model
measures the utility gained from each choice and seeks to maximize
the total utility for each trip (e.g., best campsites, key attractions, low
crowds).?Game theory: Strategic behavior and bargaining rules come into play as
each trip seeks to in?uence the decisions of other trips. For example,can
one trip claim a downstream campsite earlier in the day by communi-
cating its desire with the other trips that it encounters?Author’s Commentary255?Essentially, the GCRTSim model boils down to aconstrained optimization
problem where the success of the entire season depends on individual
decisions made by all of the trips, and the outcome depends on the com-
bined strategies. For the National Park Service to manage the Grand
Canyon rafting season successfully, the sum of all the individual deci-
sions over the course of the entire season contributes to an overall utility
that must be maximized.
The GCRTSim model suggested that the best solution was to expand
the rafting season into the shoulder months in the spring and fall. The new
CRMP was authorized in 2006, and the new approach to scheduling river
trips has been in place since 2008. The number of private launches was
dramatically increased without lowering the commercial use. The waiting
list was converted to a lottery system that appears to be in favor with the
private boaters. Yet, even with more trips being sent down the river each
year, the overall crowding at any particular moment was reduced because
the trips were spread out over additional months. The number of trips on
the river at any one time was reduced from a high of 70 to a high of 60, so
the perception of visitors is that the river is less crowded now than it used
to be. It is also quieter, since the number of months in which motorized
rafts and helicopter exchanges are allowed have been cut in half. A rafter
going through the Grand Canyon NationalPark on the Colorado River will
enjoy a genuine wilderness experience.Photo Credit: Catherine A. Roberts.256TheUMAP Journal 33.3 (2012)ReferencesBieri, Joanna A., and Catherine A. Roberts. 2000. Using the Grand Canyon
River Trip Simulator totestnew launch scheduleson theColoradoRiver.
AWIS [Association for Women in Science] Magazine29 (3): 6–10.http:
//mathcs.holycross.edu/~croberts/publications/AWIS.PDF.
Gimblett, R., T.C. Daniel, C.A. Roberts, and M. Ratliff. 1998. Update on
river research at the Grand Canyon: Grand Canyon River Trip Simu-
lator Project.(ColoradoRiver) Soundings: Newsletter of theColoradoRiver
Management Planning Process(May 1998): 1–2.
Gimblett, H. Randy, Catherine A. Roberts, Terry C. Daniel, Michael Ratliff,
Michael J. Meitner, Susan Cherry, Doug Stallman, Rian Bogle, Robert
Allred, Dana Kilbourne, and Joanna Bieri. 2000. An intelligent agent-
based modeling for simulating and evaluating river trip scheduling
scenarios along the Colorado River in Grand Canyon National Park.
InIntegrating GIS and Agent-Based Modeling Techniques for Simulating
Social and Ecological Processes, edited by H. Randy Gimblett, 245–275.
New York: Oxford University Press.http://mathcs.holycross.
edu/~croberts/RESEARCH/GCRTSim/SantaFe.PDF.
O’Brien, Gary, and Catherine Roberts. 1999. Evaluation of river beach car-
ryingcapacityinformation utilized bytheGrand Canyon RiverTrip Sim-
ulator: Analysis and recommendationsfor future study. Grand Canyon
Science Center (CA8210-99-002), Final Report.http://mathcs.
holycross.edu/~croberts/RESEARCH/Beach/BEACH.PDF.
Roberts, Catherine A. 2002a. How can a computer program aid the Col-
orado River planning process?The Waiting List: The Grand Canyon Pri-
vateBoatersAssociationQuarterly5(4): 6–8.http://mathcs.holycross.
edu/~croberts/RESEARCH/GCRTSim/waitinglist.pdf.
. 2002b. A computer model for the Colorado River Management
Plan.TheRiver Management Society News(Winter 2002): 6–7.
. 2007. Environmental mathematical modeling: Grand Canyon.
Math Horizons15 (1) (September 2007): 10–13.http://www.maa.org/
mathhorizons/MH-Sep2007_GrandCanyon.pdf.
, and Joanna A. Bieri. 2001. Impacts of low?ow rates on recre-
ational rafting traf?c on the Colorado River in Grand Canyon National
Park. Final Report. Bureau of Reclamation, Grand Canyon Monitoring
and Research Center.http://www.gcmrc.gov/library/reports/
cultural/Recreation/roberts2001.pdf. 2008. Summarized inSyn-opses of Studies Completed in Association with theLow Steady Summer Flow
Experimental Release from Glen Canyon Dan in WY2000, edited by B.E.
Ralston and J.L. Waring, 58–61. Washington, DC: U.S. Department of
Interior and U.S. Geological Survey.Author’s Commentary257
Roberts,CatherineA.,and Randy Gimblett. 2001. Computer simulation for
rafting traf?c on the Colorado River. InProceedings of the5th Conference
of Research on theColoradoPlateau, 19–30. Washington, DC:U.S. Geolog-
ical Survey.http://mathcs.holycross.edu/~croberts/RESEARCH/
GCRTSim/USGS.PDF.
Roberts,CatherineA.,Doug Stallman,and Joanna A.Bieri. 2002. Modeling
complex human-environment interactions: The Grand Canyon river
trip simulator.Journal of Ecological Modeling153 (2): 181–196.http://
mathcs.holycross.edu/~croberts/RESEARCH/GCRTSim/EcoMod.
pdf.About the AuthorCatherine Roberts is Chair of the Dept.
of Mathematics and Computer Science at the College of the Holy Cross
and Editor-in-Chief of the journalNatural Resource Modeling. She has an
A.B. magna cum laude from Bowdoin College in mathematics and art his-
tory and a Ph.D. from Northwestern University in applied mathematics
and engineering sciences. She has served on numerous committees of the
American Mathematical Society and the Association for Women in Mathe-
matics, and she is an Associate Editor of thisJournal.258TheUMAP Journal 33.3 (2012)GiordanoAward Commentary259Judges’Commentary:
The Giordano Award forthe
RiverProblemMarie VaniskoDept. of Mathematics, Engineering, and Computer Science
Carroll College
Helena, MT 59625mvanisko@carroll.eduRichard D. WestMathematics Dept.
Francis Marion University
Florence, SC 29501rwest@fmarion.eduIntroductionFor the?rst time in its history, the MCM is designating a paper with
the Frank Giordano Award. This designation goes to a paper that demon-
strates a very good exampleofthe modeling process in a problem involving
discrete mathematics.
Havingworked on thecontestsinceitsinception,FrankGiordanoserved
as Contest Director for 20 years. As Frank says:
It was my pleasure to work with talented and dedicated profession-
als to provide opportunities for students to realize their mathematical
creativity and whet their appetites to learn additional mathematics.
The enormous amount of positive feedback I have received from par-
ticipants and faculty over the years indicates that the contest has made
a huge impact on the lives of students and faculty, and also has had
an impact on the mathematics curriculum and supporting laborato-
ries worldwide. Thanks to all who have made this a rewarding and
pleasant experience!TheUMAPJournal33(3)(2012)259–262.
c
?Copyright2012by COMAP,Inc. Allrightsreserved.
Permission to make digital or hard copies of part or all of this work for personal or classroom use
is granted without fee provided that copies are not made or distributed for pro?t or commercial
advantage and that copies bear this notice. Abstracting with credit is permitted, but copyrights
for components of this work owned by others than COMAP must be honored. To copy otherwise,
to republish, to post on servers, or to redistribute to lists requires prior permission from COMAP.260TheUMAP Journal 33.3 (2012)
The Frank Giordano Award for 2012went to the Outstanding team from
Western Washington University (WWU) in Bellingham, WA. This solution
paper was characterized by?a high-quality application of the complete modeling process, including
assumptions with clear justi?cations, a well-de?ned simulation, a case
study, and sensitivity analysis;?originality and creativity in the modeling effort to solve the problem as
given; and?clear and concise writing, making it a pleasure to read.The RiverProblemThis year’s problem dealt with scheduling variable-length river tripsdown a 225-mile stretch of a particular river, using either oar-powered
rubber rafts (at 4 mph) or motor boats (at 8 mph). Fixed starting and
ending points were speci?ed for all trips, with campsites distributed fairly
uniformly down the river corridor. Minimal contact between groups of
visitors was desired and no two groups could share the same campsite.
The goal was to maximize the number of trips over a six-month period
utilizing both types of transportation and allowing for trip lengths of 6 to
18 nights on the river. In addition to the executive summary, teams were
required to write a memo to the managers of the river trips, advising them
of the carrying capacity of the river and how to schedule trips of various
lengths over the six-month period.
The approaches that teams took varied greatly, especially with regard to
the number of campsites available. That factor had a signi?cant impact on
the number of trips that could be scheduled. Many teams found that the
“Big Long River” greatly resembled a stretch of the Colorado River in the
Grand Canyon, and some looked at this as a case study for their models.
Simulations for scheduling trips on the Colorado River were available,
but teams had to address all the issues raised and come up with a solution
that demonstrated their own creativityTheWesternWashingtonUniversityPaperExecutive Summary Sheet and MemoAlthough well written, this team’s one-page sheet at the start was an
abstract rather than a one-page executive summary. Typically, an executive
summarycontainsmoreinformation (and often moresensitiveinformation)
than the abstract does. This team’s one-page summary was too short andGiordanoAward Commentary261
did not state results, but, to the team’s credit, it did motivate the reader to
read on.
Although it should have contained more speci?cs with regard to the
scheduling,the team’s memo, written in an appropriate nontechnical man-
ner, was done much better.AssumptionsOne of the?rst things that made this paper stand out from the others
was that assumptions were not merely listed but each one was justi?ed.
Assumptions were reasonable,and it was noted how the assumptions were
to be used in the algorithm. This is something that is most important in
the modeling process, but something that is frequently overlooked, so the
team is to be commended for their thoroughness in this regard.The Model and MethodsTheteam used a schedulingalgorithm. Thevariableswerewell-de?ned;
and it was clear how they arrived at their constraints, based on the stipu-
lations stated in the problem. This was one of the few papers that allowed
for groups to stay at a camp for more than one night, but that worked well
for their algorithm and did not con?ict with the problem statement. Using
a very speci?c de?nition for the priority that one group would have over
another group, the team was able to assign campsites in a successful man-
ner. Interestingly, they started at the end of the river;and using the priority
list, they placed the groups in campsites each night. One drawback with
their model was that they did not consider crossings ofgroups while on the
river.Testing TheirModelsThe?owchart for the team’s scheduling algorithm was clari?ed by the
useofexamplesand simulations. Thecasestudy,using data from theGrand
Canyon, enabled them to validate their model. They considered many
different numbers of campsites, ranging from 50 to 235. With regard to the
ratio of the types of boats and lengths of trips, they carried out sensitivity
analysis,although they limited their trip lengths to 6, 12,or 18nights on the
river. The use of contour maps to demonstrate their results and to observe
the “ridge” representing the 1:1 ratio of motor boats to oar-powered rafts
was particularly noteworthy.262TheUMAP Journal 33.3 (2012)Recognizing Limitations of the ModelRecognizing the limitations of a model is an important last step in the
completion of the modeling process. The students recognized that their
algorithm would have to be modi?ed if the river speed were taken into
account. They also acknowledged that their carrying capacity for trips
might be overestimated and that they had not considered environmental
concerns.References and BibliographyThe list of references was fairly thorough, and it was very good to see
speci?c documentation of where those references were used in the paper.ConclusionThe careful exposition in the development of the mathematical modelmade this paper one that the judges felt was worthy of the Frank Giordano
Award. The team is to be congratulated on their analysis, their clarity,
and using the mathematics that they knew to create and justify their own
creative model for scheduling camping trips along the Big Long River.About the AuthorsRich Westisa MathematicsProfessor Emeritusfrom FrancisMarion Uni-
versity in Florence, South Carolina,where he taught for twelve years. Prior
to his time at Francis Marion, he served in the U.S. Army for 30 years, 14 of
which were spent teaching at the U.S. Military Academy. He is currently
working on a National Science Foundation Grant on freshmen placement
tests. He also serves as a Reading Consultant for AP Calculus and as a
developmental editor for CLEP (College Level Equivalency Program) Cal-
culus Exam. He has judged for both the MCM and HiMCM for over 10
years.
Marie Vanisko is a Mathematics Professor Emerita from Carroll College
in Helena, Montana, where she taught for more than 30 years. She was
also a Visiting Professor at the U.S. Military Academy at West Point and
taught for?ve years at California State University, Stanislaus. She chairs
the Board of Directors at the Montana Learning Center on Canyon Ferry
Lake and serves on the Engineering Advisory Board at CarrollCollege. She
has been a judge for the MCM for seventeen years and for the HiMCM for
eight years.Results of the2012 ICM263ICM Modeling ForumResults of the 2012 Interdisciplinary
Contest in ModelingChris Arney, ICM DirectorDept. of Mathematical Sciences
U.S. Military Academy
West Point, NY10996david.arney@usma.eduIntroductionIn the 14th Interdisciplinary Contest in Modeling (ICM)R?,1,329teams from
six countries spent a weekend in February working on an applied modeling
problem involving a criminal network. This year’s contest began on Thursday,
February 9, and ended on Monday, February 14, 2012. During that time, teams
of up to three undergraduate or high school students researched, modeled,
analyzed, solved, wrote, and submitted their solutions to an open-ended inter-
disciplinary modeling problem involving a criminalconspiracy network. After
theweekend ofchallengingand productivework,thesolution papersweresent
to COMAP for judging. Seven of the papers were judged to be Outstanding by
the expert panel of judges.
COMAP’s Interdisciplinary Contest in Modeling (ICM) involves students
working in teams to model and analyze an open interdisciplinary problem.
Centeringitseducationalphilosophyon mathematicalmodeling,COMAPsup-
ports the use of mathematical tools to explore real-world problems. It serves
society by developing students as problem solvers in order to become better
informed and prepared ascitizens,contributors,consumers,workers,and com-
munity leaders. The ICM is an exampleofCOMAPsefforts in working towards
these goals.
Thisyear’sproblem waschallengingin itsdemand for teamsto utilizemany
aspects of science, mathematics, and analysis in their modeling and problem
solving. The problem required teams to investigate the relationships of theTheUMAPJournal33(3)(2012)263–273.
c
?Copyright2012by COMAP,Inc. Allrightsreserved.
Permission to make digital or hard copies of part or all of this work for personal or classroom use
is granted without fee provided that copies are not made or distributed for pro?t or commercial
advantage and that copies bear this notice. Abstracting with credit is permitted, but copyrights
for components of this work owned by others than COMAP must be honored. To copy otherwise,
to republish, to post on servers, or to redistribute to lists requires prior permission from COMAP.264TheUMAP Journal 33.3 (2012)members of a criminal conspiracy network within a business organization
through social network analysis of their message traf?c. It required teams
to understand concepts from the informational and social sciences to build
effective network and statistical models to analyze more than 400 messages,
categorized into 15 topics, among 83 people. To accomplish their tasks, the
students had to consider many dif?cult and complex disciplinary and interdis-
ciplinary issues. The problem also included the customary requirements in the
ICM to perform thorough analysis and research, exhibit creativity, and demon-
strate effective communication. All members of the 1,329 competing teams are
to be congratulated for their excellent work and dedication to interdisciplinary
modeling and problem solving.
Instructions to the teams included:?Your ICM submission should consist of a 1-page Summary Sheet and your
solution cannot exceed 20 pages for a maximum of 21 pages.?As modelers, you have to deal with the data you have and through valid
assumptions decide what to do with holes, irregularities,redundancies,and
errors.
Next year, we will continue the network science theme for the contest prob-
lem. Teams preparing for the 2013 contest should consider reviewing interdis-
ciplinary topics in the areas ofnetwork science and socialnetwork analysis and
assemble teams accordingly.The Problem Statement:
The Crime-Busting ProblemYour organization, the IntergalacticCrime Modelers (ICM),is investigating
a conspiracy to commit a criminal act. The investigators are highly con?dent
they know several members of the conspiracy, but hope to identify the other
members and the leaders before they make arrests. The conspirators and the
possible suspected conspirators all work for the same company in a large of?ce
complex. The company has been growing fast and making a name for itself in
developing and marketing computer software for banks and credit card com-
panies. ICM has recently found a small set of messages from a group of 82
workers in the company that they believe will help them?nd the most likely
candidates for the unidenti?ed co-conspirators and unknown leaders. Since
the message traf?c is for all the of?ce workers in the company, it is very likely
that some (maybe many)ofthe identi?ed communicators in the message traf?c
are not involved in the conspiracy. In fact,they are certain that they know some
people who are not in the conspiracy. The goal of the modeling effort will be to
identify people in the of?ce complex who are the most likely conspirators. A
priority list would be ideal so ICM could investigate, place under surveillance,
and/ or interrogate the most likely candidates. A discriminate line separatingResults of the2012 ICM265conspirators from non-conspirators would also be helpful to distinctly catego-
rize the people in each group. It would also be helpful to the district attorney
(DA) if the model nominated the conspiracy leaders.
Before the data of the current case are given to your crime modeling team,
your supervisor gives you the following scenario (called Investigation EZ)that
she worked on a few years ago in another city. Even though she is very proud
of her work on the EZ case, she says it is just a very small, simple example, but
it may help you understand your task. Her data follow:
The 10 people whom she was considering as conspirators were: Anne#,
Bob, Carol, Dave*, Ellen, Fred, George*, Harry, Inez, and Jaye# (* indicates
prior known conspirators, # indicate prior known non-conspirators).
Here is the chronology of the 28 messages that she had for her case, with
a code number for the topic of each message that she assigned based on her
analysis of the message:
Anne to Bob: Why were you late today? (1)
Bob to Carol: That darn Anne always watches me. I wasn’t late. (1)
Carol to Dave: Anne and Bob are?ghting again over Bob’s tardiness. (1)
Dave to Ellen: I need to see you this morning. When can you come by? Bring
the budget?les. (2)
Dave to Fred: I can come by and see you anytime today. Let me know when it
is a good time. Should I bring the budget?les? (2)
Dave to George: I will see you later — lots to talk about. I hope the others are
ready. It is important to get this right. (3)
Harry to George: You seem stressed. What is going on? Our budget will be
?ne. (2) (4)
Inez to George: I am real tired today. How are you doing? ( 5)
Jaye to Inez: Not much going on today. Wanna go to lunch today? (5)
Inez to Jaye: Good thing it is quiet. I am exhausted. Can’t do lunch today —
sorry! (5)
George to Dave: Time to talk — now! (3)
Jaye to Anne: Can you go to lunch today? (5)
Dave to George: I can’t. On my way to see Fred. (3)
George to Dave: Get here after that. (3)
Anne to Carol: Who is supposed to watch Bob? He is goo?ng off all the time.
(1)
Carol to Anne: Leave him alone. He is working well with George and Dave.
(1)
George to Dave: This is important. Darn Fred. How about Ellen? (3)
Ellen to George: Have you talked with Dave? (3)
George to Ellen: Not yet. Did you? (3)266TheUMAP Journal 33.3 (2012)Bob to Anne: I wasn’t late. And just so you know — I am working through
lunch. (1)
Bob to Dave: Tell them I wasn’t late. You know me. (1)
Ellen to Carol: Get with Anne and?gure out the budget meeting schedule for
next week and help me calm George. (2)
Harry to Dave: Did you notice that George is stressed out again today? (4)
Dave to George: Darn Harry thinks you are stressed. Don’t get him worried
or he will be nosing around. (4)
George to Harry: Just working late and having problems at home. I will be
?ne. (4)
Ellen to Harry: Would it be OK, if I miss the meeting today? Fred will be there
and he knows the budget better than I do. (2)
Harry to Fred: I think next year’s budget is stressing out a few people. Maybe
we should take time to reassure people today. (2) (4)
Fred to Harry: I think our budget is pretty healthy. I don’t see anything to
stress over. (2)
END of MESSAGE TRAFFIC
Your supervisor points out that she assigned and coded only 5 different
topics of messages:?1) Bob’s tardiness,?2) the budget,?3) important unknown issue but assumed to be part of conspiracy,?4) George’s stress, and?5) lunch and other social issues.
Asseen in themessagecoding,somemessageshad two topicsassigned because
of the content of the messages.
The way that your supervisor analyzed her situation was with a network
that showed the communication links and the types of messages.Figure 1is
a model of the message network that resulted, with the code for the types of
messages annotated on the network graph.
Your supervisor points out that in addition to known conspirators George
and Dave,Ellen and Carolwereindicted fortheconspiracybased on yoursuper-
visor’s analysis, and later Bob self-admitted his involvement in a plea bargain
for a reduced sentence, but the charges against Carol were later dropped. Your
supervisor is still pretty sure that Inez was involved, but the case against her
was never established. Your supervisor’s advice to your team is identify the
guilty parties so that people like Inez don’t get off, people like Carol are not
falsely accused, and ICM gets the credit so people like Bob do not have the
opportunity to get reduced sentences.Results of the2012 ICM267Figure 1.Network of messages from EZ Case.The Current CaseYour supervisor has put together a network-like database for the current
case,which has the same structure but is a bit larger in scope. The investigators
have some indications that a conspiracy is taking place to embezzle funds from
the company and use Internet fraud to steal funds from credit cards of people
who do business with the company. The small example that she showed you
for case EZ had only 10 people (nodes), 27 links (messages), 5 topics, 1 suspi-
cious/ conspiracy topic, 2 known conspirators, and 2 known non-conspirators.
So far, the new case has 83 nodes, 400 links (some involving more than 1
topic), over 21,000 words of message traf?c, 15 topics (3 have been deemed to
be suspicious), 7 known conspirators, and 8 known non-conspirators. These
data are given in the attached spreadsheet?les:Names.xls,Topics.xls, andMessages.xls1:?Names.xlscontains the key of node number to the of?ce workers’names.?Topics.xlscontainsthecodefor the15topicnumberstoa shortdescription
ofthe topics. Because ofsecurity and privacy issues,your team willnot have
direct transcripts of all the message traf?c.?Messages.xlsprovides the links of the nodes that transmitted messages
and the topic code numbers that the messages contained. Several messages
contained up to three topics.1These?les were available to contestants athttp://www.comap.com/undergraduate/
contests/mcm/contests/2012/problems/2012_ICM.zip.268TheUMAP Journal 33.3 (2012)To help visualize the message traf?c, a network model of the people and
message links is provided inFigure 2. In this case, the topics of the messagesare not shown in the?gure as they were inFigure 1. These topic numbers are
given in the?leMessages.xlsand described inTopics.xls.Figure 2.Visual of the network model of the 83 people (nodes) and 400 messages between these
people (links).Requirements:Requirement 1:So far, it is known that Jean, Alex, Elsie, Paul, Ulf, Yao, and
Harvey are conspirators. Also, it is known that Darlene, Tran, Jia, Ellin, Gard,
Chris, Paige, and Este are not conspirators. The three known suspicious mes-
sage topics are 7, 11, and 13. There is more detail about the topics in?le Top-
ics.xls. Build a model and algorithm to prioritize the 83 nodes by likelihood
of being part of the conspiracy and explain your model and metrics. Jerome,
Delores, and Gretchen are the senior managers of the company. It would beResults of the2012 ICM269very helpful to know if any of them are involved in the conspiracy.Requirement2:How would the priority list change ifnew information comes
to light that Topic 1 is also connected to the conspiracy and that Chris is one of
the conspirators?Requirement3:A powerfultechnique to obtain and understand text informa-
tion similar tothismessagetraf?ciscalled semanticnetworkanalysiswhereasa
methodology in arti?cial intelligence and computationallinguistics it provides
a structure and process for reasoning about knowledge or language. Another
computationallinguisticscapability in naturallanguageprocessing is text anal-
ysis. For our crime busting scenario, explain how semantic and text analyses
of the content and context of the message traf?c, if you could obtain the orig-
inal messages, could empower your team to develop even better models and
categorizations of the of?ce personnel. Did you use any of these capabilities
on the topic descriptions in?le Topics.xls to enhance your model?Requirement4:Your complete report will eventually go to the DA, so it must
be detailed and clearly state your assumptions and methodology;but it cannot
exceed 20 pages of write-up. You may include your programs as appendices
in separate?les that do not count in your page restriction, but including these
programs is not necessary. Your supervisor wants ICM to be the world’s best in
solving white-collar,high-tech conspiracy crimesand wantsyour methodology
will contribute to solving important cases around the world, especially those
with very large databases of message traf?c (thousands of people with tens
of thousands of messages and possibly millions of words). She speci?cally
asked you to include a discussion on how more thorough network, semantic,
and text analyses of the message contents could help with your model and
recommendations. As part ofyour report to her, explain the network modeling
techniques you have used and why and how they can be used to identify,
prioritize, and categorize similar nodes in a network database of any type, not
just crimeconspiraciesand messagedata. For instance,could your method?nd
the infected or diseased cells in a biological network where you had various
kinds of image or chemical data for the nodes indicating infection probabilities
and already identi?ed some infected nodes?The ResultsThe 1,329 solution papers were coded at COMAP headquarters so that
names and af?liations of the authors were unknown to the judges. Each paper
was then read preliminarily by triage judges at the U.S. Military Academy at
West Point, NY. At the triage stage, the summary, the model description, and
overallorganization are the primary elements in judging a paper. Finaljudging
by a team of modelers, analysts, and subject-matter experts took place in late
March. The judges classi?ed the 1,329 submitted papers as follows:270TheUMAP Journal 33.3 (2012)Honorable Successful
Outstanding Finalist Meritorious Mention Participant Total
Crime-Busting 7 4 125 640 553 1,329Outstanding TeamsInstitution and Advisor Team Members“Social Network Analysis in Crime Busting”
Northwesteren Polytechnical University
Xi’an, China
Bingchang Zhou
Chen Dong
Cunle Qian
Jianjun Ma
“Message Network Modeling for Crime Busting”
Nanjing Univ. of Information Science and Technology
Nanjing, Jiangsu, China
Guosheng Cheng
Yizhou Zhuang
Senfeng Liu
Liusi Xiao
“Crime Busting by an Iterative Two-Phase Propagation
Method”
Shanghai Jiaotong University
Shanghai, China
Zulin Sun
Xilun Chen
Hang Qiu
Chunzhi Yang
“Finding Conspirators in the Network: Machine
Learning with Resource-Allocation Dynamics”
Univ. of Electronic Science and Technology of China
Chengdu, Sichuan, China
Tao Zhou
Fangjian Guo
Jiang Su
Jian Gao
“iRank Model: A New Approach to Criminal Network
Detection”
Mathematical Modeling Innovative Practice Base,
Zhuhai College of Jinan University
Zhuhai, Guangdong, China
Jianwen Xie
Yi Zheng
Yi Zeng
You Tian
“Extended Criminal Network Analysis Model Allows
Conspirators Nowhere to Hide”
Huazhong University of Science and Technology
Wuhan, Hebei, China
Zhengyang Mei
Dekang Zhu
Junmin Yang
Xiang Chen
“Crime Ring Analysis with Electric Networks”
Cornell University
Ithaca, NY
Alexander Vladimirsky
Michael Luo
Anirvan Mukherjee
Myron ZhangResults of the2012 ICM271Awards and ContributionsEach participating ICM advisor and team member received a certi?cate
signed by the Contest Director. Additional awards were presented to the team
from Cornell University by the Institute for Operations Research and the Man-
agement Sciences (INFORMS).JudgingContest DirectorsChris Arney, Dept. of Mathematical Sciences, U.S. Military Academy,
West Point, NY
Joseph Myers, Computing Sciences Division, Army Research Of?ce,
Research Triangle Park, NCAssociateDirectorRodney Sturdivant, Dept. of Mathematical Sciences, U.S. Military Academy,
West Point, NYJudgesDimitris Christopoulos, University of the West of England, Bristol,United Kingdom
Kathryn Coronges, Dept. of Behavioral Sciences and Leadership,
U.S. Military Academy, West Point, NY
Kayla de la Haye, RAND Corporation, Santa Monica, CA
Tina Hartley, Dept. of Mathematical Sciences, U.S. Military Academy,
West Point, NY
Brian Macdonald, Dept. of Mathematical Sciences, U.S. Military Academy,
West Point, NY
Christopher Marcum, RAND Corporation, Santa Monica, CA
Robert Ulman, Network Sciences Division, Army Research Of?ce,
Research Triangle Park, NCTriageJudgesChris Arney,John Bacon,Jocelyn Bell,Kevin Blaine,Nicholas Clark,Gabe
Costa, Michelle Craddock, Kevin Cummiskey, Chris Eastburg, Michael
Findlay, James Gatewood, Andy Glen, Tina Hartley, Alex Heidenberg,
Steven Horton, Nicholas Howard, John Jackson, Bill Kaczynski, Phil La-
Casse, Bill Pulleyblank, Elizabeth Russell, Mick Smith, James Starling,
Rodney Sturdivant, Andrew Swedberg, Csilla Szabo, Ben Thirey, Johan
Thiel, Chris Weld, and Shaw Yoshitani.
—all of Dept. of Mathematical Sciences, U.S. Military Academy, West Point,
NY; and272TheUMAP Journal 33.3 (2012)Joseph Myers, Army Research Of?ce, Research Triangle Park, NC
Michelle Isenhour, George Mason University, VA
Hise Gibson and Chris Farrell, U.S. Army; and
Amanda Beecher, Dept. of Mathematics, Ramapo College of New Jersey,
Mahwah, NJ.AcknowledgmentsWe thank:?the Institute for Operations Research and the Management Sciences (IN-
FORMS)for its support in judging and providing prizes for a winning team,
and?all the ICM judges for their valuable and un?agging efforts.CautionsTothereader of research journals:Usually a published paper has been presented to an audience, shown to
colleagues, rewritten, checked by referees, revised, and edited by a journal
editor. Each of the team papers here is the result of undergraduates working
on a problem over a weekend. Editing (and usually substantial cutting) has
taken place; minor errors have been corrected, wording has been altered for
clarity or economy, and style has been adjusted to that ofThe UMAP Journal.
The student authors have proofed the results. Please peruse these students’
efforts in that context.Tothepotential ICM advisor:It might be overpowering to encounter such output from a weekend of
work by a small team of undergraduates, but these solution papers are highly
atypical. A team that prepares and participates will have an enriching learning
experience, independent of what any other team does.Editor’s NoteThe complete roster of participating teams and results has become too
longtoreproducein theJournal. Itcan now befound attheCOMAPWebsite:http://www.comap.com/undergraduate/contests/mcm/contests/2012/results/2012_ICM_Results.pdfResults of the2012 ICM273About the AuthorChrisArneygraduated from WestPointand served
as an intelligence of?cer in the U.S. Army. His aca-
demic studies resumed at Rensselaer Polytechnic In-
stitute with an M.S. (computer science) and a Ph.D.
(mathematics). He spent most of his 30-year military
career as a mathematics professor at West Point, be-
fore becoming Dean ofthe SchoolofMathematicsand
Sciences and Interim Vice President for AcademicAf-
fairs at the College of Saint Rose in Albany, NY. Chris then moved to RTP
(Research Triangle Park), NC, where he served for various durations as
chair of the Mathematical Sciences Division, of the Network Sciences Di-
vision, and of the Information Sciences Directorate of the Army Research
Of?ce. Chris has authored 22 books, written more than 120 technical arti-
cles, and given more than 250 presentations and 40 workshops. His techni-
cal interests include mathematicalmodeling,cooperative systems,pursuit-
evasion modeling,robotics, arti?cial intelligence,military operations mod-
eling, and network science;his teaching interests include using technology
and interdisciplinary problems to improve undergraduate teaching and
curricula. He is the founding director of COMAP’s Interdisciplinary Con-
test in Modeling (ICM)R?. In August 2009, he rejoined the faculty at West
Point as the Network Science Chair and Professor of Mathematics.274TheUMAP Journal 33.3 (2012)Finding Conspirators275Finding Conspirators in the Network
via Machine LearningFangjian Guo
Jiang Su
Jian GaoWeb Sciences Center
University of Electronic Science and Technology of China
Chengdu, Sichuan, China
Advisor: Tao ZhouKey ConceptsMachine learning
Logistic regression
Semantic diffusion
Bipartite graph
Resource-allocation
dynamics
Kendall’s tau
Problem Clari?cation:A conspiracy network is
embedded in a network of employees of a com-
pany, with each edge representing a message sent
from one employee (node) to another and catego-
rized by topics. Given a few known criminals, a
few known non-criminals, and suspicious topics,
we seek to estimate the probability of criminal in-
volvement for other individuals and to determine
the leader of the conspirators.
Assumptions?Conspirators and non-conspirators are linearly
separable in the space spanned by localfeatures
(necessary for machine learning).?A conspirator is reluctant to mention to an out-
sider topics related to crime.?Conspirators tend not to talk frequently witheach other about irrelevant topics.?The leader of the conspiracy tries to minimizerisk by restricting direct contacts.?A non-conspirator has no idea of who are con-spirators, hence treats conspirators and non-
conspirators equally.TheUMAPJournal33(3)(2012)275–292.
c
?Copyright2012by COMAP,Inc. Allrightsreserved.
Permission to make digital or hard copies of part or all of this work for personal or classroom use
is granted without fee provided that copies are not made or distributed for pro?t or commercial
advantage and that copies bear this notice. Abstracting with credit is permitted, but copyrights
for components of this work owned by others than COMAP must be honored. To copy otherwise,
to republish, to post on servers, or to redistribute to lists requires prior permission from COMAP.276TheUMAP Journal 33.3 (2012)Key TechniquesGradient Descent
Revised LeaderRank
Model Design and Justi?cationFor an unidenti-
?ed node (an employee not identi?ed as a conspir-
ator or non-conspirator), we model the probabil-
ity of conspiracy as a sigmoid function of a lin-
ear combination of the node’s features (logistic re-
gression). Those features are formulated from lo-
cal topological measures and the node’s semantic
messaging patterns. Parameters of the model are
trained on a subset of identi?ed conspirators and
non-conspirators. The performance of the model
is enhanced by discovering potential similarities
among topics via topic-word diffusion dynamics
on a bipartite graph. We also perform resource-
allocation dynamics to identify the leader of the
conspirators; the identi?cation is supported by
empirical evidence in criminal network research.
Strengths and WeaknessesThe combination
of topological properties and semantic af?nity
among individuals leads to good performance.
The time complexity of the algorithm is linear, so
the method is suitable for large amounts of data.
However, our model requires assistance from se-
mantic network analysis to form an expert dictio-
nary. Also, intrinsic differences among networks
may hinder portability of the model’s features.IntroductionAs shown inFigure 1, criminals and conspirators tend to form organi-
zational patterns, interconnected with one another for collaboration, while
still maintaining social ties with the outside, thus providing a natural con-
text for description and analysis via networks [Baker and Faulkner 1993].
Criminal networks can be captured from various information, resulting
in different types of networks, where each node represents a person, and
an edge is present when two nodes collaborate in the same task, share the
same family name, or (as in this case) exchange messages [Krebs 2002].
Since the nodes in the graph can be a mixture ofboth criminals and non-
criminals, it is desirable to determine suspected criminals from topologi-
cal properties of the network and other prior knowledge, which includes
known criminals, known non-criminals, and information related to their
interactions. Moreover, we desire a priority list of descending criminal
likelihood so as to identify the primary leader of the organization.Finding Conspirators277Figure 1.The 83-employee network. Red (darker gray) nodes are known conspirators and the
blue (lighter gray) nodes are known non-conspirators.Many authors have adopted centrality measures of the graph for ana-
lyzing the characteristics of criminals. Criminals with high betweenness-
centrality are usually brokers, while those with high degree-centrality en-
joy better pro?t by taking higher risk [Krebs 2002]. Morselli [2010] pro-
posed that leaders ofa criminal organization tend to balance pro?t and risk
by making a careful trade-off between degree-centrality and betweenness-
centrality.
However, centrality approaches, which utilize local properties, tend to
overlook the complex topology of the whole network. Therefore, social
network analysis (SNA)methods,including subgroup detection and block-
modeling,have been introduced,which try to discover the hidden topolog-
ical patterns by partitioning the big network into small closely-connected
cliques [Xu and Chen 2005]. Despite the light that they shed on the internal
structuresofcriminalnetworks,thesemethodsstillsuffer from intimidating
complexity with large databases [Wheat 2007].
We carefully combine the local-feature-based methods with approaches
related to global topology of conspiracy networks. We propose a machine
learning scheme to leverage local features, so as to estimate each node’s
likelihood ofconspiracy involvement. We adopt dynamics-based methods,
which are less computationally expensive than most other topology-based
approaches, to help identify the lead conspirator and to discover semantic
connections between topics.
We start with the formulation of useful local features of a node in the
network,which then lead to the machinelearning scheme. We feed a subset
of known conspirators and non-conspirators as a training samples into the
learning algorithm. We then use the algorithm to estimate the probability
of being a conspirator for every other individual in the network.278TheUMAP Journal 33.3 (2012)
Since highly suspicious topics are essential to the performance of ma-
chine learning, we then try to discover similarities between topics, by per-
forming simple source-allocation dynamics on the bipartite semantic net-
work made up of topics and sensitive words. Those?ndings expand our
knowledge on suspicious topics, thus enhancing the accuracy of our ma-
chine learning model.
To?nd the leader of the conspirators,we apply a dynamics-based rank-
ing algorithm on a subgraph extracted from the network. Our?ndings
are in agreement with empirical knowledge about the centrality balance of
criminal leaders.
Finally, we perform sensitivity analysis to test the robustness of our
approach.A Machine Learning SolutionWe use machine learning mainly because of its adaptiveness and reor-
ganization, which simulate humans’actions to obtain fresh knowledge.
We describe the construction of our machine learning framework in
detail, including feature formulation, core learning methods, and experi-
mental results. Through statistical analysis on the results, we propose an
enhancement based on semantic diffusion.
We commence with several necessary assumptions:?All the data and information about the EZ case network and the 83-node
network are relatively stable over a long period.?The contents of the communication among conspirators tends to be rel-
evant about suspicious topics or some formal issues, rather than gossip.?The two networks feature similar core mechanisms for communication
transmission.Feature formulation?CentralityWe exploit three types of centrality—degree centrality, betweenness
centrality, and closeness centrality—to determine the center of the sus-
picious network from different aspects:?Degreecentrality.Degree centrality [Freeman 1979] indicates active-
ness of a member, and a member who tends to have more links to
others may be the leader. However, as explained in Xu and Chen
[2003], degree centrality is not quite reliable to indicate the team
leader in a criminal network. For a graphG(V,E), the normalized
degree centrality of nodeiisFinding Conspirators279CD(i) =
?|V|
j=1ν(i,j)
|V| ?1
, i?=j,(1)
whereνis a binary indicator showing whether there exists a link be-
tween twonodes. Sinceour graph isdirected,wecalculateseparately
the in-degree and out-degree of every node.?Betweenness centrality.Betweenness centrality [Freeman 1979] de-
scribes how much a node tends to be on the shortest path between
other nodes. A node with large betweenness centrality does not nec-
essarily have large degree but illustrates the role of “gatekeeper”—
someone who is more likely to be a intermediary when two other
members exchange information. The normalized betweenness cen-
trality isCB(i) =
?|V|
j=1?|V|k<jωj,k(i)
|V| ?1
, k?=i,(2)
whereωj,k(i)indicates whether the shortest path between nodejand nodekpasses through nodei.?Closeness centrality.Closeness centrality [Sabidussi 1966] is usually
utilized to measure how far away one node is from the others. Close-
ness of a node is de?ned as the inverse of the sum of its distances to
all other nodes and can be treated as a measure of ef?ciency when
spreading information from itself to all other nodes sequentially. It
indicates how easily an individual connects with other members.
The normalized closeness centrality isCc(i) =
?|V|
j=1ρ(i,j)?CcminCcmax?Ccmin, i?=j,(3)
whereρ(i,j)is the length ofthe shortest path connecting nodesiandj.CcminandCcmaxare the minimum and maximum lengths of the
shortest paths.f?Numberof known neighboring conspiratorsWe consider as a signi?cant feature the number of known neighboring
conspirators of a node. The interaction among conspirators in a mes-
sage network suggests a much stronger connectivity than that among
non-conspirators: A conspirator is more likely to communicate with an
accomplice. As shown inFigure 2, we calculate the ratio of known con-
spirators among one’s adjacent neighbors, which measures proximity
with known accomplices: The value is 1 if the individual connects with
all the known conspirators, and 0 means that no conspirators connect to
theindividual. Theknown suspiciouscliqueobviouslyrepresentsamore
compact connectivity. Therefore, the more known conspirators among280TheUMAP Journal 33.3 (2012)
an individual’s neighbors, the greater the possibility that the individual
is an accomplice.30
1
00
00.25
0.50.75
0.25
0.5
0.75Known non-conspirators
Known conspiratorsHarvey
Elsie
Alex
Yao
Ulf
Paul
Jean
Chris
Paige
Derlene
Gard
Ellin
Tran
Este
JiaFigure 2.Ratio of known conspirators among adjacent neighbors. To avoid the overlapping of
names with a linear scale, we adopt a topographic map type of diagram, with a peak at the center
and symmetric contour circles around it. The closer a person is to the center, the more likely that
the person is a conspirator.?Numberof currentnon-suspicious messages from known conspirators
Table 1shows the topics mentioned between known conspirators.1Aknown conspirator rarely talks with accomplices about topics irrelevant
to their conspiracy, though a very small proportion of unknown topics
appear. If most of the information received from a known conspirator is
irrelevant, the receiver is probably not a conspirator.Table 1.
Topics among known conspirators. Known conspiratorial topics have an asterisk and are
highlighted in blue (light gray).
Jean Alex Elsie Poul Ulf Yao Harvey
Jean11*8 14
Alex1 13*11*3,7*
Elsie11*13*
Poul11*7*7*4
Ulf7*,11*,13*13*
Yao13*7*,11*,13*7*,9 13*2,7*
Harvey13*1Topic 16 in the raw data is regarded as wrong and thus discarded.Finding Conspirators281MethodsWe use logistic regression to model the probability of a node being in-
volved in the conspiracy. We obtain the parameters ofthe model by using a
gradient descent algorithm to solve an optimization problem on a training
set.Logistic RegressionWe consider a training set{(x(i),y(i))}of sizem, wherex(i)is ann-
dimensional feature vector andy(i)indicates the classi?cation of the node,
i.e.,y(i)= 1for conspirators andy(i)= 0for non-conspirators. The nodesin the training set are drawn from the 15 known conspirators and non-
conspirators.
As a specialization of a generalized linear model for Bernoulli distri-
bution, logistic regression estimates the probability of being a conspirator
asP(y= 1|x;θ) =hθ(x) =
1
1 +e?θTx,(4)
whereθ∈Rnis the parameter vector.
Under the framework of the generalized linear model, themaximum a
posteriori(MAP) estimate of the parameterθisminθJ(θ),(5)
where the cost function is given byJ(θ) =
1
mm?i=1?
?y(i)log
?
hθ(x)(i)?
?
?
1?y(i)?log
?
1?hθ(x(i))
??
+
λ
2mn?j=1θ2
j,(6)
withλbeing a regularization parameter.Gradient DescentThe cost functionJ(θ)is minimized by gradient descent, which drivesθdown the locally steepest slope, in hope of reaching the global minimum
of the cost function.
At every iteration before convergence, a newθreplaces the oldθviaθ:=θ?α?θJ(θ),(7)
whereαis a small positive constant.282TheUMAP Journal 33.3 (2012)Leave-One-Out Cross ValidationSince we are informed of the correct classi?cation of onlyN0nodes
(N0= 15in our case), in a given round we only use (N0?1) of them asthe training set, while leaving one out for cross validation (C-V). At every
round, the next correctly classi?ed node is left out and the others serve as
the training set; then the trained hypothesis is tested on the left-out node.
In this way, by averagingN0rounds without overlapping, the errors for
both the training set and the cross validation set can be evaluated.
Suppose,for example,that in thej-th round sample(x(j),y(j))is left out
and the training set is given bySj={(x(l),y(l))|l= 1,2,···,j?1,j+ 1,···,N0}.(8)
Using this training set, parameter vectorθ(j)is obtained, and the corre-
sponding hypothesis is tested on bothSjand the left-out(x(j),y(j)), ob-
taining this round’s training errorεSjand C-V errorεj.Hence, by averaging overj, the training error and C-V error areεS=
1
N0
N0?j=1εSj, ε=
1
N0
N0?j=1εj.(9)Setting the Regularization ParameterThe regularization parameterλ >0is selected to minimize the cross
validation error, i.e.,λ= argminλ>0ε.(10)ResultsBy training the logistic regression model with our leave-one-out cross
validation strategy,λis optimally set to1.9and the overall C-V error isε= 0.27(with training errorεS= 0). Then, while?xing the chosenλ, we
retrain the hypothesis on the maximum training set, making full use of
known conspirators and non-conspirators.Table 2.
Top 10 in the priority list (known conspirators excluded).
Name Dolores* Crystal Jerome* Sherri Neal Christina Jerome William Dwight Beth
Node No. 10 20 34 3 17 47 16 50 28 38
Probability of
conspiracy .56 .51 .39 .32 .30 .27 .25 .25 .24 .23Finding Conspirators283
Thetrained hypothesisgivestheestimated probability for nodeibeing a
conspirator, resulting in a priority list of suspects, ranked in descent order
of criminal likelihood. The top 10 suspects are shown inTable 2, with
managers marked by an asterisk.
Figure 3illustrates the probability of criminal involvement estimated
byhθ(x)versus the corresponding rank in the priority list, where three
managers (Jerome, Dolores, and Gretchen)2are marked by circles.Dolores (manager) is indeed the person deserving highest suspicion,
and Jerome (manager) is also likely to be involved in conspiracy.10
20 30 40 50 60 70 80
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1Rank in the priority listProbability of being criminal
All the members
Gretchen (manager)
Jerome (manager)
Dolores (manager)Figure 3.Probability of conspiracy vs. corresponding rank in the priority listSemantic Model EnhancementSemanticinformation ismoreimportanttohumansthan thecomplicated
topology structure. For example,similar text information in their messages
motivates us to conclude in the EZ case that Inez is similar to George, who
is de?nitely a conspirator (for instance,“tired” when describing Inez and
“stressed” when describing George). Similar cases can be also found in the
83-people network case: The word “Spanish” from known conspiratorial
topic7ishighly suspiciousand appearsrepeatedlyin other unknown topics
(e.g., topic 2 and 12). The contents about “computer security,” which is2Since several nodes are named either Gretchen or Jerome, we select those with bigger out-
degrees to be managers, that is, Node 32 is Gretchen (manager) and Node 34 is Jerome (manager).284TheUMAP Journal 33.3 (2012)
treated as part of the key in the whole conspiracy, is also active in many
other unknown topics, such as 5 and 15. Hence, it is natural to train a
computer to measure similarities among topics so as to reveal potential
information.Conspiratorial Message 1
Conspiratorial Message 2
Conspiratorial Message x
suspicious
word 3
suspicious
word n
suspicious
word 1
suspicious
word 2
Conspiratorial
Dictionary
New conspiratorial Message
New non-conspiratorial Message
Undetermined Message 1
Undetermined Message 2
Undetermined Message 3
Undetermined Message 4
Undetermined Message m
Legend:Figure 4.Framework of topic semantic diffusion.Lexicalambiguity exists widely among words, which can have different
meanings depending on context. So it is not wise to abandon human intel-
ligence and depend only on algorithms. Therefore,we draw the problem of
topic semantic diffusion into a topic-similarity measurement task based on
an expert dictionary. We form the bipartite network illustrated inFigure 4,
between the conspiratorial dictionary constructed from the conspiratorial
messages about known suspicious topics, and all of the information in the
message traf?c. We exploit a resource allocation mechanism to extract the
hidden information ofnetworks [Zhou et al.2007]and unfold the similarity
among different topics.
The bipartite network is modeled as the bipartite graphG= (D,T,E),
where?D={di}is the dictionary of suspicious words, shown in the middlecolumn inFigure 4;?T={tl}is the message set, which is divided into two categories:
–messages with known status (left column inFigure 4), and
–undetermined messages (right column inFigure 4);?Eis an edge set, with(di,tl)∈Eindicating that worddiin the conspir-
atorial dictionaryDoccurred in messagetlof the message setT;
Weinitially give1unit ofresourceto each known conspiratorialmessage
inTand 0 to the remaining messages. The process of semantic diffusionFinding Conspirators285
includes two steps, namely the redistribution of resource from message
topics to keywords, and that from keywords back to topics.
We commence with the?rst allocation from setTto setD:f(di) =n?i=1ailf(tl)
K(tl)
.(11)
Equation(11)expresses the calculation of the resource held bytlafter the
?rst step, whereK(tl)denotes the degree of the nodetl,f(x)denotes theresource carried byx, andailis de?ned asail=
?
1,(di,tl)∈E;
0,otherwise.(12)
Theintuitiveexplanation ofStep 1is simply theprocessofredistributing
resource fromTtoD, with the transferred amount regulated by the degree
of nodes inT.This is followed by Step 2, which is to re?ect the resource?ow back toTfromDobeying the same rule but in the inverse direction, as shown fromthe middle column to the right column inFigure 4. So the resource?nally
locates ontiand satis?esf?(ti) =m?l=1ailf(dl)
K(dl)
=m?l=1ailK(dl)n?j=1ajif(tj)
K(tj)
.(13)
After this two-fold process, the amount of resource held by every element
inTcan be interpreted as a score ofsimilarity. The rank ofa topicaccording
to its score represents the degree of its similarity to the information in the
dictionary—thatis,thehigher thescore,themore likely thetopicis a newly-
found conspiratorial topic.
We setD={’Spanish’, ’system’, ’network’, ’computer’, ’meeting’}as
the conspiratorial dictionary, andTable 3illustrates the?nal result for all
15topics in the 83-people network case. The known suspicious topics are 7,
11, and 13. They are naturally the top three, and topic 5 is more suspicious
than other unknown topics. Topics 2, 12, and 15 are among the group with
the second highest possibility in unknowns;and the remaining topics tend
to be irrelevant to the conspiracy.
We then append topic 5 to the set of known conspiratorial topics and
train the model again; the overall C-V error decreases from 0.27 to 0.13.
Since Since topics 2, 12, and 15 are less similar to known suspicious topics,
as shown inTable 3, appending them to model training does not evidently
in?uence the correctness. The enhanced correctness here indicate that with
enough keywords in the conspiratorial dictionary and enough topics with
abundant contents, such a method is likely to perform very well.
On the other hand, if we utilize the speaker instead of the keywords
to construct a bipartite graph with the topics, we will also get similarity286TheUMAP Journal 33.3 (2012)Table 1.
Rank of all topics based on similarity to known suspicious topics (known conspiratorial topics
have an asterisk and are highlighted in blue).
Rank Topic Number Similarity to known suspicious topics
111* 0.750
27* 0.667
313* 0.667
4 5 0.417
5 2 0.167
6 12 0.167
7 15 0.167
8 1,3,4,6,8,9,10,14 0among topics based on the transmitting speaker. However, the determina-
tion ofthe relationship between differing results under these two standards
is de?nitely beyond this paper.
The resource allocation method is also highly ef?cient: Its time com-
plexity of computation is linear in the number of edges of the graph, which
enables good performance with huge amounts of data.Identifying the Leaderof the ConspiracyOur machine learning scheme tries to estimate the likelihood of a node
committing conspiracy. However, the likelihood does not proportionally
indicate leadership inside the network,because the identi?cation ofleaders
is further complicated by the network’s topology.
We adopt LeaderRank, a node-ranking algorithm closely related to net-
work topology,to?nd theleader. Weextractfrom thenetworkthesubgraph
connected by known suspicious topics. Because of its robustness against
random noise, LeaderRank is appropriate for addressing criminal network
problems, which usually suffer from incompleteness and incorrectness.LeaderRankThe LeaderRank algorithm is a state-of-the-art achievement on node
rankingthatismoretolerantofnoisy data and robustagainstmanipulations
than traditionalalgorithmssuch asHITSand PageRank[L¨u etal.2011]. This
method is mathematically equivalent to a random-walk mechanism on the
directed network with adaptive probability, leading to a parameter-free
algorithm readily applicable to any type of graph.
We insert a ground node, which connects with every node through
newly-added bidirectionallinks,in ordertomaketheentirenetworkstrongly
connected, so that the random walk will de?nitely converge.Finding Conspirators287
For a graphG= (V,E), every node in the graph obtains 1 unit of re-
source except the ground node. After the beginning of the voting process,
nodeiat steptwillget an adaptivevoting scoreν(t)according to the voting
from its neighbors:νi(t+ 1) =|V|+1?j=1μijDout(j)
νi(t),(14)
whereμijis a binary indicator with value 1 if nodeipoints tojand 0
otherwise.Dout(j)denotes the out-degree of nodej. The quotient of theabove two arguments could be considered as the probability that a random
walker atigoes tojin the next step. Finally, the leadership score of nodeiisνi(Tc) +νgn(Tc)/|V|, whereνgn(Tc)is the score of the ground node at
steady state.Suspicious Topic Subnetwork ExtractionWe extract from the network of company employees the subnetworkGTSconnected by suspicious topics only, so as to minimize the coupling ofthe company’s hierarchical structure to the conspiracy relations.
Suppose thatTijdenotes the set of topics mentioned by messages from
nodeito nodej, andTSdenotes the set of known suspicious topics (TS={7,11,13}). ThenGTSis the maximum subgraph of the original graphG,whereasTij?TS,for all(i,j)?ETS.(15)Edge ReverseTheoriginalLeaderRankalgorithm dealswith?ndingleadersin Internetsocial networks, where the direction of an edge has a dissimilar meaning
from our case: If A points to (follows) B in Twitter, then B is considered to
be a leader of A. However, in our communication network, an edge from A
to Bsuggests that A has sent Ba message. Therefore,assuming that a leader
in a criminal network tends to be a sender rather than a receiver, each edge
inGTShas to be reversed to be compatible with LeaderRank’s design. We
denote byG?
TSthe reversed subnetwork induced by suspicious topics.ResultsBy running LeaderRank onG?TS, a ranking score is assigned to every
node in this subgraph, which generates a list of possible leaders ranked in
descent order, as shown inTable 4.
Yao (node number 67) is ranked as the chief leader of the conspiracy.288TheUMAP Journal 33.3 (2012)Table 4.
Partial results of LeaderRank onG?
TS.
Name LeaderRank score
Yao 2.67
Alex 2.21
Paul 1.92
Elsie 1.62Empirical SupportEmpirical analysis of criminal networks?nds that a leader of a criminal
organization tends to carefully balance degree-centrality and betweenness-
centrality. It has been proposed that the leader usually maintains a high
betweenness-centralitybut a relatively low degree-centrality,for enhancing
ef?ciency while ensuring safety [Morselli 2010].Figure 5.The joint distribution of betweenness centrality and degree centrality. Yao is at the lower
right.Figure 5illustrates the joint distribution of betweenness centralityCBand degree centrality (Din+Dout) for the 7 known conspirators and 10
other nodes with high conspiracy likelihood,where two dashed lines mark
average values of the displayed nodes. Yao’s high betweenness-centrality
with relatively low degree-centrality accord with the identity of a leader.
Our conclusion that Yao is the leader is thus empirically supported.Finding Conspirators289DiscussionWeidentifytheleader ofthecriminalnetworkby performingtheLeader-
Rankalgorithm on theextracted,edge-reversed,suspicious-topic-connected
subgraph;and our?ndings are strengthened by empirical research results.Evaluating the ModelSensitivity AnalysisConsidering the usual incompleteness, imprecision, and even inconsis-
tency in mapping criminalsocialnetworks [Xu and Chen 2005],the method
for deducing criminality should be robust enough to cope with minor al-
ternations of the network. Otherwise,there could be mistaken accusations.
Therefore, we perform a sensitivity analysis on our approach.
Requirement 2 of the problem statement provides an appropriate sce-
nario for such a test: While other conditions remain unchanged, new in-
formation indicates that Topic 1 is also connected to criminal activity, and
Chris, who was considered innocent before, is now proven guilty.Priority ListBy applyingour methodstothesealtered conditions,we?nd thatamong
the top 10 of the previous priority list (the 7 known conspirators excluded),
7 of them are still in the new top 10, while the remaining 3?nd their new
places at 12th, 14th, and 16th.
A more sophisticated measurement of the sensitivity of the priority list
isKendall’s taucoef?cientτ[Sen 1968]. Given two priority lists{pk}=
{p1,p2,···,pn}and{qk}={q1,q2,···,qn}—for example,p2= 5meansnode 2 is ranked 5th in the{pk}list—then?(i,j)(fori?=j) is aconcordant pairif their relative rankings agree in thetwo lists, i.e.,pi> pjandqi> qj, orpi< pjandqi< qj;?otherwise,ifpi< pjbutqi> qj,orpi> pjbutqi< qj(i,j)is adiscordantpair.
Kendall’s tauis de?ned asτ=(number of concordant pairs)?(number of discordant pairs)1
2n(n?1)
,(16)
which lies in[?1,1], with1for perfect ranking agreement and?1for utter
disagreement.
The value ofKendall’s taufor the two priority lists of Requirement 1 and
Requirement 2isτ= 0.86,justifying therobustnessofthemachinelearning
approach.290TheUMAP Journal 33.3 (2012)
Let us assume that known conspirators and non-conspirators are inde-
pendently wrongly classi?ed with the same speci?c probability.Figure 6
depicts the expected Kendall’s tau vs.the misclassi?cation probability, sep-
arately for conspirators and non-conspirators. Even if the misclassi?cation
probability is as high as 0.5, Kendall’s tau does not drop below 0.8, sub-
stantially proving the inherent stability of our methods.?
???? ??? ???? ??? ???? ??? ???? ??? ???? ???
???
????
????
????
????
???
????
????
????
????
??????????????????????????????????????????????????
?????????????
????????????????Figure 6.The expected Kendall’s tau declines as misclassi?cation probability increases.Probability In?ationFigure 7illustrates the change of estimated conspiracy probability due
to the modi?ed conditions of Requirement 2, with the previous value asx-axis,and thenew asy-axis. Generally,mostnodesexhibita small“in?ation”in criminal probability, as indicated by the distribution of dots skewed
from the diagonal line. The augmented probability is compatible with the
new information that expands both the set of suspicious topics and known
conspirators.
The analysis suggests that our machine learning method is insensitive
to minor alterations and can still produce reasonable results with new in-
formation.ReferencesBaker, Wayne E., and Robert R. Faulkner. 1993. The social organization of
conspiracy: Illegal networks in the heavy electrical equipment industry.Finding Conspirators2910 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1Probability of being a conspirator (Requirement 1)Probability of being a conspirator (Requirement 2 )
Training set
Unknown nodes
Chris
Gretchen (manager)
Jerome (manager)
Dolores (manager)Figure 7.Criminal probabilities before and after the change of conditions.American Sociological Review58 (6) (December 1993): 837–860.http://
webuser.bus.umich.edu/wayneb/pdfs/networks/Conspiracy.pdf.
Chen, Hao, and Burt M. Sharp. 2004. Content-rich biological network
constructed by mining PubMed abstracts.BMC Bioinformatics5: 147.http://www.biomedcentral.com/1471-2105/5/147,doi:10.1186/
1471-2105-5-147.
Freeman, Linton C. 1979. Centrality in social networks conceptual clari?-
cation.Social Networks1 (3) (1978–1979): 215–239.http://psyonline.com.br/portal/administrator/components/
com_jresearch/files/publications/freeman.pdf.
Girvan, M., and M.E.J. Newman. 2002. Community structure in social and
biologicalnetworks.Proceedingsof theNationalAcademy of Sciences99(12)
(11 June 2002): 7821–7826.http://techlab.bu.edu/members/gail/
710Girvan,Newmann2002.pdf,doi:10.1073/pnas.122653799.
Krebs,ValdisE.2002.Mappingnetworksofterroristcells.Connections24(3):
43–52.http://vlado.fmf.uni-lj.si/pub/networks/doc/Seminar/
Krebs.pdf.
L¨u, Linyuan, Yi-Cheng Zhang, Chi Ho Yeung, and Tao Zhou. 2011.
Leaders in social networks, thedeliciouscase.PloS ONE, 6 (6):292TheUMAP Journal 33.3 (2012)
e21202.http://www.plosone.org/article/info:Adoi/10.1371/
journal.pone.0021202,doi:10.1371/journal.pone.0021202.
Morselli, Carlo. 2010. Assessing vulnerable and strategic positions in a
criminal network.Journalof Contemporary CriminalJustice26 (4) (Septem-
ber 2010): 382–392.http://ccj.sagepub.com/content/26/4/382.
short,doi:10.1177/1043986210377105.
Sabidussi, Gert. 1966. The centrality index of a graph.Psychometrika31 (4):
581–603.
Sen, Kumar Pranab. 1968. Estimates of the regression coef?cient based on
Kendall’s tau.Journal of theAmerican Statistical Association63 (December
1968): 1379–1389.
Wheat, Christopher. 2007. Algorithmic complexity and structural mod-
els ofsocialnetworks.http://scripts.mit.edu/~cwheat/research/
modelsel.20070416.
Xu, Jennifer, and Hsinchun Chen. 2003. Untangling criminal networks: A
case study. InIntelligence and Security Informatics: Lecture Notes in Com-
puter Science2665, edited by G. Goos, J. Hartmanis, and J. van Leeuwen,
232–248. New York: Springer, 2003.
. 2005. Criminal network analysis and visualization.Communica-
tionsoftheAssociation forComputingMachinery48(6)(June2005): 100–107.
Zhou, Tao, Jie Ren, Mat′uˇs Medo, and Yi-Cheng Zhang. 2007. Bipar-
tite network projection and personal recommendation.Physical Re-
view E76 (4): 046115.http://doc.rero.ch/lm.php?url=1000,43,
2,20071213113651-JT/zhang_bnp.pdf.Jiang Su, Jian Gao, Tao Zhou (advisor), and Fangjian Guo.Judges’Commentary293Judges’Commentary:
Modeling forCrime BustingChris ArneyDept. of Mathematical Sciences
U.S. Military Academy
West Point, NY10996david.arney@usma.eduKathryn CorongesDept. of Behavioral Sciences and Leadership
U.S. Military Academy
West Point, NY10996IntroductionThe new topicarea for this year’s Interdisciplinary Contest in Modeling
(ICM) was network science. The shift was popular with the student teams,
sincea record 1,329teamssubmitted papersin solution to a “crime-busting”
problem. Network science and/ or social network analysis will continue to
be the topic area for next year’s problem as well. So, for teams that enjoyed
this year’s problem or want to prepare early for next year’s contest,prepare
by studying network modeling and assemble a team with that subject in
mind.
The ICM continues to be an opportunity for teams of students to tackle
challenging, real-world problems that require a wide breadth of under-
standing in multiple academicsubjects. These elements are practically part
of the de?nition of network science—an emerging subject that blends con-
cepts, theories, structures, processes, and applications from mathematics,
computer science, operations research, sociology, information science, and
several other?elds. ICM problems are often open-ended and challenging.
Some, like the one this year, could be termed “wicked,” in that there is
not one correct answer nor a set or established method to model such a
problem.TheUMAPJournal33(3)(2012)293–303.
c
?Copyright2012by COMAP,Inc. Allrightsreserved.
Permission to make digital or hard copies of part or all of this work for personal or classroom use
is granted without fee provided that copies are not made or distributed for pro?t or commercial
advantage and that copies bear this notice. Abstracting with credit is permitted, but copyrights
for components of this work owned by others than COMAP must be honored. To copy otherwise,
to republish, to post on servers, or to redistribute to lists requires prior permission from COMAP.294TheUMAP Journal 33.3 (2012)
ThecomplexnatureoftheICM problemsand theshort timelimit require
effective communication and coordination of effort among team members.
One of the most challenging issues for the team is how to best organize and
collaborate to use each team member’s skills and talents. Teams that solve
this organizationalchallenge often submit solutions that can rise to the?nal
rounds of judging.The Criminal Network Analysis ProblemThe Information Age, along with its information-laden and highly-
linked Internet,has brought us many amazing capabilities,along with new
ways to commit crimes. This year’s problem focused on potential conspir-
ators within a company’s communication network plotting to commit a
crime. Some people were already identi?ed either as known conspirators
or as known non-conspirators. The goal of the model was to identify the
most likely conspirators from the remaining people in the network through
the analysis of con?scated and categorized message traf?c. The many con-
nections and links between the people and the messages made this an es-
pecially appropriate topic for network modeling. The main tasks expected
of the students were to:?Requirement 1:Build a model to prioritize the 83 people by likelihood
of being part of the conspiracy and explain your model and metrics. Are
any senior managers of the company involved in the conspiracy??Requirement 2:As new information comes to light, use your model
to analyze this changing situation. A good network model is?exible
and able to handle the changing nature, structure and information in a
dynamic network setting.?Requirement 3:If you could obtain the original messages, explain how
semanticand text analyses ofthe message traf?ccould help you develop
even better models.?Requirement 4:Explain the network modeling techniques you devel-
oped and how they can be used to identify, prioritize, and categorize
nodes in a network involving other kinds of data sources, not just crime
and message data. Does your model generalize to other important prob-
lems in society? Again,this is the mark ofstrong models within network
science and their potential to impact society.Judges’CriteriaThe panelofexpert judges were impressed both by the strength ofmany
of the submissions of individual teams, and fascinated by the variety ofJudges’Commentary295
innovative approaches that students used to address the issues, challenges,
and questions that were posed by the problem. The papers were rich in
modelingmethodologyand creativity. In order toensurethattheindividual
judges assessed submissions on the same criteria, a rubric was developed.
The framework used to evaluate submissions is described below.Executive SummaryIt was important that studentssuccinctly and clearly explained the high-
lights of their submissions. The executive summary should contain brief
descriptions of both the modeling approach and the bottom-line results.
The remaining report provides a more detailed statement of the contents of
the executive summary. One mark of an Outstanding paper is a summary
with a well-connected and concise description of the approach used, the
results obtained and any recommendations.ModelingModels and measures were needed to classify the people in the organi-
zation to identify conspirators. Many teams used probability or likelihood
measures for criminal-like behavior of the people within the context of the
known data. Other used decision-making criteria as their basic modeling
framework. Some teams used the explicit structures of networks or graphs
to determine classic local or global network metrics, properties, node clus-
ters, or performance outcomes. For such a structure, critical assumptions,
such as thedirectionality ofin?uenceand connection within thegraph,lead
to viable network models. Other teams ignored some of the aspects of the
network structure and performed data mining, element classi?cation, and
discrimination. Those teams often found prioritization and ordering easier
than discrimination.
Where to draw the line and commit to predict a conspirator was some-
timesdif?cult. Nomatterthemodelingframework,theassumptionsneeded
for these models and the careful and appropriate development of these
models were important in evaluating the quality of the solutions. The
better submissions explicitly discussed why key assumptions were made
and how assumptions affected the model development. Stronger submis-
sions presented a balanced mix of mathematics and prose rather than a
series of equations and parameter values without explanation. One major
discriminator was the use or misuse of arbitrary parameters without any
explanation or analysis. Establishing and explaining parameter values in
models are at least as signi?cant as making and validating assumptions.296TheUMAP Journal 33.3 (2012)ScienceSemantic and text analysis are elements of the science of computational
linguistics or natural language processing involving many challenging sci-
enti?cand technologicalissuesrelated to the nature,valueand understand-
ing of information and the production of knowledge or intelligence. Cur-
rently, many information-rich systems and organizations are facing data
deluge and overload. Vast amounts of unstructured textual data are often
collected and held for practically impossible human analysis. The magni-
tude ofdata makes this potentially valuable information at best a worthless
distraction. Through natural language processing using semantic and text
analysis the potentially valuable but hidden information can become visi-
ble, understandable, organized, and useful.
The ultimate goals of semantic and text analysis are to identify con-
text, meaning, categorization, and entity attributes, and thereby produce
human-ready synopses and standardized, interconnected, structured data
(information networks). These highly sophisticated and complexprocesses
are exactly what would be needed to model and solve this network con-
spiracy problem. Some teams did effective research and insightful analysis
that tackled the complexity of the problem and included elements of text
or semantic analysis in their model or described how their model could ac-
commodate such capability had the raw message data been available. No
matter what modeling was performed by the teams, the interdisciplinary
nature of this problem was fully revealed in this requirement. These areas
of information science and analytics will experience signi?cant scienti?c
and technological improvements in the future, and the ICM teams were
exposed to this developing?eld in the context of their interdisciplinary
science research.Data/Validity/SensitivityOnce the model was created, the use of test data and checks on the
accuracy and robustness of the solution help to build con?dence in the
modeling approach. Sensitivity analysis of models to determine the effects
of changing data and errors can often be more meaningful than speci?c
output values. This is especially true for highly-structured and powerful
data-rich models like networks. Some network structures are highly robust
and?exible while others are fragile and highly sensitive to data. While this
is a challenging element of network modeling, it was important to address
this issue in the report.Strengths/WeaknessesA discussion of the strengths and weaknesses of the models is often
wherestudentsdemonstratetheir understandingofwhattheyhavecreated.Judges’Commentary297
The ability of a team to make useful recommendations fades quickly if
team members do not understand the limitations or constraints of their
assumptions or the implications oftheir modeling methodology. Networks
are complex structures and, therefore, the strengths and weaknesses are
often hidden from direct view or control of the modeler. Again, the better
teams were able to discuss these elements despite these challenges.Communication/Visuals/ChartsTo clearly explain solutions, teams must use multiple modes of expres-
sion including diagrams and graphs, and, in the case of this competition,
English. A solution that could not be understood did not progress to the
?nal rounds of judging. The judges were delighted by the amazing array
ofpowerfulcharts and graphs that explained both models and results.Fig-
ures 1–3on the this page and the next are intended as samples to show the
richness of this kind of graphical analysis and reporting.Figure 1.Teamsprovided informativegraphicschematicsto show therelationship and connections
uncovered by their models. This graphic is from Team 12460 from Harbin Institute of Technology
in Harbin, Heilongjiang, China.RecommendationsTeams were speci?cally asked to discuss their conspiratorial priorities
and thepotentialinvolvementofsenior managersin their report that would
be read by the district attorney. The ability of teams to evaluate the results
of their analysis and make recommendations was important in identifying
strong submissions.298TheUMAP Journal 33.3 (2012)Figure 2.This network portrayal vividly showing the likelihood of conspirators is from Team
16075 from Huazhong University of Science and Technology in Wuhan, Hubei, China.?Figure 3.Teams that performed data analysis often used probability charts like this one from Team13104 from Southeast University, Jiulonghu Campus, Nanjin, Jiangsu, China, to demonstrate their
results.Judges’Commentary299Discussion of the Outstanding PapersAs you will discover in this section, many different approaches were
used by ICM teams to model various aspects of the problem. Some teams
used the basic structure of networks and their properties and computed
classic centrality measures to tackle the issues. Some chose to model using
a data mining framework. The Analytic Hierarchy Process (AHP) was a
common method for addressing discrimination in the identi?cation ofa po-
tential conspirator. As a result, the submissions this year were diverse and
interesting to read. Overall, the basic modeling was often sound, creative,
and sometimes quite powerful. Those that did not reach?nal judging gen-
erally suffered from two shortcomings. Some lacked clear explanation or
evidence to support their conclusions and recommendations. They seemed
to jump from their modeling directly to the results without suf?cient anal-
ysis. Others failed to connect their mathematical models to the aspects
and basic elements of information science. In general, poor communica-
tion was the most signi?cant discriminator in determining which papers
reached the?nal judging stage. Although the outstanding papers used
different methodologies, they all addressed the problem in a comprehen-
sive way by embracing the complexity of the issues, data, questions, and
team objectives. These papers were generally well written and presented
explanations of their modeling procedures. In several outstanding papers,
a unique or innovative approach distinguished them from the rest of the
?nalists. Others were noteworthy for either the thoroughness oftheir mod-
eling or the power of their communicated results.Huazhong University of Science and TechnologyThe ICM team from Huazhong University of Science and Technology,
Wuhan, China performed a thorough network analysis of the information
?ow and relationshipsofemployeesin theorganization. In their paper,“Ex-
tended Criminal Network Analysis Model Allows Conspirators Nowhere
to Hide,” they provided an in-depth analysis of the relationships between
people and the way the criminal network operated and expanded. This
report presented their framework, models, analysis, and results in power-
ful visual formats that enabled readers to understand their work and feel
con?dent in their results. In many ways, this paper is an excellent exam-
ple of the potential of network modeling and the power of social network
analysis to sort out nodal,edge,and data attributes through use ofnetwork
measures and data analysis.Mathematical Modeling Innovative Practice BaseThe report entitled “iRank Model: A New Approach to Criminal Net-
work Detection” wassubmitted by a team from theMathematicalModeling300TheUMAP Journal 33.3 (2012)
Innovative Practice Base, China. The Mathematical Modeling Innovative
Practice Base,China,established in 2008,is an institute that promotes inter-
disciplinary research and educational activities, integrating mathematical
modeling and computational approaches to address problems arising in
various areas of science and engineering. Their report contained creative
analysis of the available data from several perspectives, starting with basic
analysis as shown by:
Carefully examining into the patterns of information exchanges and
social connections in the network, we can see that only 24% messages
carry conspiratorial information, which seems not systematically sig-
ni?cant given that 20% of all the topics are conspiratorial. Therefore,
two patterns can be inferred from the statistical results:?Although conspirators are generally more active than the known
innocent people, they exchange irrelevant information with each
other. Conspiratorial messages only take a small portion in their
message traf?c.?Since the existing 7 conspirators have already involved in spread-
ing about 40% of the total conspiratorial messages, it is very likely
that the total number of conspirators is less than 20.
They also performed a very thorough social network analysis of the
message network. This report contained excellent visualizations to explain
their algorithm, analysis and results.Nanjing University of Information Science and TechnologyTheICM team from NanjingUniversityofInformation Scienceand Tech-
nology, Nanjing, China, built three different models for?nding and sepa-
rating conspirators and then merged these for their best-case solution. A
fourth model was used to identify the conspiracy leaders. Their paper,
“Message Network Modeling for Crime Busting,” was an excellent synop-
sis of the diverse methods one could use to approach this problem. Their
emphasis was in classical network analysis and data mining algorithms.
Once again, this team did a thorough job analyzing semantic analysis and
its utility for information and network modeling.Northwestern Polytechnical UniversityFinding the hidden features of a network was the theme of the paper
entitled “Social Network Analysis in Crime Busting,” by the ICM team
from Northwestern Polytechnical University, Shaanxi, China. This paper
started with the foundations of graphs and networks and built the concept
of cooperation within the network. This concept was a fundamentally
sound and deeper approach than those of many of the other models. The
resulting model was a powerful one for understanding a conspiracy andJudges’Commentary301
the team did an excellent job in their creative modeling and analysis. Their
discussion on semantics and text analysis was thorough and insightful in
?nding ways for possible inclusion of these more powerful methodologies
in their models.Shanghai Jiaotong University“Crime Busting by an Iterative 2-phase Propagation Method,” was sub-
mitted by a team from Shanghai Jiaotong University, Shanghai, China.
Their classic propagation model of performing iterative and alternating
computation of person suspiciousness and topic suspiciousness from each
other was creative and powerful. Upon convergence of their model, they
produced a priority list ofconspirators and performed a thorough analysis.
This team’s model was both mathematically and scienti?cally simple yet
elegant.University of Electronic Science and Technology of ChinaThe report and work entitled “Finding Conspirators in the Network:
Machine learning with Resource-allocation Dynamics” from the University
ofElectronicScienceand Technology ofChina,Chengdu,China,wasstrong
from start to?nish. This team made careful and thorough assumptions:
(i)Two classes,conspiratorsand non-conspirators,are linearly separa-
ble in the space spanned by localfeatures ofa node,which is necessary
to machine learning. (ii) A conspirator is reluctant to mention topics
related to crime when talking with an outsider. (iii)Conspirators tend
not to talk about irrelevant topics frequently with each other. (iv) The
leader of conspiracy tries to minimize risk by restricting direct con-
tacts. (v) A non-conspirator has no idea of who are conspirators, thus
treating conspirator and non-conspirators equally.
Then they used machine learning and logistic regression to build their
model. They were careful to show their analysis of leader selection and
other problem requirements. They followed up their modeling and anal-
ysis with sensitivity analysis and a careful discussion of the strengths and
weaknesses of their model and its approach. Most impressive was their
ability to discuss the incorporation of semantic analysis into their model
and the discussion of the power of information modeling to the future.Cornell UniversityThe team from Cornell University, Ithaca, NY, took a very different ap-
proach than the other Outstanding papers. Their paper “Crime Ring Anal-
ysis with Electric Networks” presented a model using an electrical circuit
analogy for the conspiracy where the interactions between people, repre-
sented as circuit nodes, were considered a conductance term. This model302TheUMAP Journal 33.3 (2012)
was creative in its structure and enabled the team to perform an interesting
analysis of the conspiracy factors. This team was selected as the INFORMS
winner.ConclusionAmong the 1,329 papers, there were many strong submissions, which
made judging dif?cult. However, it was gratifying to see so many students
with the ability to combine modeling,science and effective communication
skills in order to understand such a complex problem and recommend
solutions. We look forward to next year’s competition, which will involve
another problem in network science and hopefully, the participation of
many teams of competent and passionate interdisciplinary modelers.Recommendations forFuture Participants?Answerthe problem.Weak papers sometimes do not address a signif-
icant part of the problem. Outstanding teams often cover all the bases
and then go beyond.?Time management is critical.Every year there are submissions that do
an outstanding job on one aspect of the problem, then “run out of gas”
and are unable to complete their solution. Outstanding teams have a
plan and adjust as needed to submit a complete solution.?Coordinate yourplan.It is obvious in many weak papers how the work
and writing was split between group members,then pieced together into
the?nal report. For example, the output from one model doesn’t match
the input for the next model or a section appears in the paper that does
not?t with the rest of the report. The more your team can coordinate the
efforts of its members, the stronger your?nal submission will be.?The model is not the solution.Some weak papers present a strong
model,and then stop. Outstandingteamsusetheir modelstounderstand
the problem and recommend or produce a solution.?Explain what you are doing and why.Weak teams tend to use too
many equations and too few words. Problem approaches appear out of
nowhere. Outstanding teams explain what they are doing and why.Judges’Commentary303About the AuthorsChrisArneygraduated from WestPointand served
as an intelligence of?cer in the U.S. Army. His aca-
demic studies resumed at Rensselaer Polytechnic In-
stitute with an M.S. (computer science) and a Ph.D.
(mathematics). He spent most of his 30-year military
career as a mathematics professor at West Point, be-
fore becoming Dean ofthe SchoolofMathematicsand
Sciences and Interim Vice President for Academic Affairs at the College of
SaintRosein Albany,NY.Christhen moved toRTP(Research TrianglePark),
NC,where he served for various durations as chair ofthe MathematicalSci-
ences Division, of the Network Sciences Division, and of the Information
Sciences Directorate of the Army Research Of?ce. Chris has authored 22
books,written morethan 120technicalarticles,and given morethan 250pre-
sentations and 40 workshops. His technical interests include mathematical
modeling, cooperative systems, pursuit-evasion modeling, robotics, arti-
?cial intelligence, military operations modeling, and network science; his
teaching interests include using technology and interdisciplinary problems
toimproveundergraduateteachingand curricula. Heisthefoundingdirec-
tor of COMAP’s Interdisciplinary Contest in Modeling (ICM)R?. In August
2009, he rejoined the faculty at West Point as the Network Science Chair
and Professor of Mathematics.
Kate Coronges is an Assistant Professor in the De-
partment of Behavioral Sciences and Leadership and a
research fellow in the Network Science Center at the U.S.
Military Academy. She has a Master’s in Public Health
and a Ph.D. in Health Behavior Research from the Uni-
versity of Southern California. Kate teaches courses in
social network analysis and public policy, working with
cadets to apply analytic tools to understand and model
complex systems, particularly as they relate to public policy issues such
as energy, education, information security, and health care. Her primary
research effort involves a social network study of leadership and organi-
zational performance. She also is working on an analysis of social accept-
ability of automatic biometric authentication tools, social determinants of
phishing security vigilance, and modeling social media data to understand
how protests turn to riots. Her publications in network science include the
study of education, drug addiction, DADT (“Don’t ask, don’t tell”) policy,
coalition building, and security.304TheUMAP Journal 33.3 (2012)Reviews305ReviewsMaasz, Juergen, and John O’Donoghue (eds.). 2011.Real-World Problems
for Secondary School Mathematics Students: CaseStudies. Rotterdam, The
Netherlands: Sense Publishers; ix + 281 pp, $49.99 (P). ISBN 978–94–
6091–541–3.
Secondary school mathematics teachers seek resources for bringing rel-
evant applications of mathematics to their students, and this book is de-
scribed as being “full of ideas for introducing real world problems into
mathematics classrooms.” The collection of 16 papers promises to provide
teachers with a wealth of applications from a wide variety of school con-
tent areas (e.g., statistics, geometry, and calculus), and to focus on topics
that should appeal to a student audience with diverse interests (e.g., en-
ergy issues,traveling to Mars,rugby and snooker, lotteries,logisticgrowth,
worldwide oil reserves, and even Dirk Nowitzki).
That is, in fact, what this collection does provide. Many of the authors
offer suggestions on how to format the material into classroom lessons,
yet they also encourage teachers to individualize the lessons for their own
students and their own circumstances. There is also an international?avor
to thecollection,a fact that willappealto many teachersand many students.
There are several cautions, however:?Although the content is timely, class time will be needed for students
to pro?t from these lessons; most lessons are not one- or two-period
explorations. Teachers who are already short on time will have to weigh
whether the advantageofproviding studentswith interesting,nontrivial
real world applications is enough to warrant requisite class days.?Unless a teacher has a multiple-courseassignment,he or she willnot?nd
a variety of lessons from which to select if one of the requirements is to
illustrate applications of the mathematical topics covered in a particular
course.?Since the mathematics is accessible, but de?nitely nontrivial, much of
the content may be daunting for students who are not already mathe-
matically pro?cient.
For these reasons, this collection may best be seen as an excellent resource
for a mathematics department rather than for one particular teacher. ItTheUMAPJournal33(3)(2012)305–308.
c
?Copyright2012by COMAP,Inc. Allrightsreserved.
Permission to make digital or hard copies of part or all of this work for personal or classroom use
is granted without fee provided that copies are not made or distributed for pro?t or commercial
advantage and that copies bear this notice. Abstracting with credit is permitted, but copyrights
for components of this work owned by others than COMAP must be honored. To copy otherwise,
to republish, to post on servers, or to redistribute to lists requires prior permission from COMAP.306TheUMAP Journal 33.3 (2012)
would also be a?ne resource for an independent study course or for an
upper-level course in mathematical modeling.
J.T.Sutcliffe,MathematicsDepartment,St.Mark’sSchoolofTexas,10600Preston
Road, Dallas, TX 75230;sutcliffe@smtexas.org.
Thomson, Brian S. 2010.The Calculus Integral.North Charleston, SC: Cre-
ateSpace. x + 291pp,$14.95(P).ISBN 978–1–442180956. Free download
athttp://classicalrealanalysis.info/documents/
T-CalculusIntegral-AllChapters-Portrait.pdf.
The study of integration within a?rst course of calculus is always prob-
lematic. Thestandard approach isto begin with theproblem ofdetermining
areas under curves,createapproximating sums,moveon to thegeneralRie-
mann sums, de?ne the de?nite integral as a limit of these sums, and then
prove the Fundamental Theorem of Calculus that links these limits of Rie-
mann sumsto what Ishallrefer to asantiderivatives,also known asprimitives
orinde?niteintegrals.
Thomson is one in a long line of mathematicians dissatis?ed with this
approach. It hasmany?aws. Thede?nition ofthede?niteintegralasa limit
ofRiemann sums is incredibly sophisticated. For most students,the formal
de?nition is quickly forgotten and the working de?nition of integration
becomes antidifferentiation. The problem with this is that many students
lose the link between antidifferentiation and Riemann sums. The reason
that the standard textbook approach is problematic is that it is seriously
ahistorical. Riemann created his de?nition in the 1850s for the speci?c
purpose of determining how discontinuous a function might be yet still be
integrable. His formulation is ideally suited for this purpose, a purpose
that bears no relevance for the?rst year of university calculus.
The fact is that from the time that Newton?rst recognized the power of
reversing differentiation as a tool for computing areas until Cauchy sought
a characterization of integration that would enable him to assert that every
continuous function is integrable, integration was de?ned as antidiffer-
entiation. Thomson embraces this natural and historical de?nition of the
integral, what he calls “The Calculus Integral,” and uses it as the starting
point for an exploration into our modern understanding of integration.
This book is described as appropriate for a course of honors calculus
or a?rst course in real analysis. In either context, it would be challeng-
ing but do-able with the right students. The development is elegant and
extremely original. After a dense?rst chapter that introduces the basic
theorems needed to work with limits, sequences and series, continuity, and
differentiability, Thomson begins by de?ning the inde?nite integral offon
an open interval as a continuous function whose derivative coincides withfexcept possibly at?nitely many points. De?nite integrals are de?nedReviews307
in terms of inde?nite integrals. The Fundamental Theorem is introduced
in two steps: First is the use of the Mean Value Theorem to establish the
existence of a sequence of tagsζisuch that?b
af(x)dx=n?i=1f(ζi)(xi?xi?1).Second comes the theorem that the de?nite integral can be uniformly
approximated by Riemann sums with arbitrary tags. The emphasis has
switched in a pedagogically signi?cant way from de?ning the de?nite in-
tegral as a limit of Riemann sums to demonstrating that it can be approxi-
mated arbitrarilycloselybyRiemann sums,simplybycontrollingthelength
of the subintervals in the partition. This approach opens the door to the
result that de?nite integrals are also uniformly approximated by Robbins
sums, an interesting variation on the Riemann sum that was described by
Herbert E. Robbins [1943].
The text continues through the study of sequences and series of inte-
grals and the monotone convergence theorem, then into Cantor sets, sets of
measure zero, functions with zero variation, and absolute continuity. The
most original aspect of this text is the de?nition of the Lebesgue integral.
Parallel to the Calculus Integral, the inde?nite Lebesgue integral offon
an open interval is de?ned as an absolutely continuous function in the Vi-
tali sense whose derivative coincides withfexcept possibly on a set of
measure zero. The Lebesgue integral offover the interval[a,b]is thende?ned asF(b)?F(a), whereFis an inde?nite Lebesgue integral offonthis interval. Connecting this de?nition to Riemann sums leads naturally
into a discussion of the Henstock-Kurzweil integral, where the text ends.
Traditional measure theory is nowhere to be found.
One of the most distinctive features of this book is that none of the
theorems or corollaries is proven in the text. Instead, Thomson leads the
reader through a series of exercises that build to each proof. The actual
text is quite short, only 150 pages. It is followed by an almost equally
long presentation of the solutions to the exercises. Just the text, leaving the
scaffolded proofs to the students without the option of looking them up,
would provide an excellent inquiry-based introduction to real analysis or
a challenging senior seminar.
Thomson has given us a rich introduction to the complexities ofintegra-
tion with many historical references and intriguing asides. I agree with his
use of the Calculus Integral and his approach to Riemann sums. De?ning
integration as a limit of Riemann sums makes no sense for?rst-year calcu-
lus. I am not convinced that his approach to Lebesgue integration makes
better pedagogical sense than a more traditional route, but it does form
part of a coherent and consistent approach to integration. The student who
completes this book willbe very wellversed in realanalysis and fully ready
to tackle measure theory.308TheUMAP Journal 33.3 (2012)ReferenceRobbins, Herbert E. 1943. Note on the Riemann integral.American Mathe-
matical Monthly50 (10) (December 1943): 617–618.
DavidBressoud,MathematicsandComputerScience,MacalesterCollege,St.Paul,
MN 55105;bressoud@macalester.edu.