A must-read case for lonely geeks: how a single University of California geek used big data to find a girlfriend

math3056        

    Posted 2014-4-17 10:14
    Last edited by wangzheng3056 on 2014-4-17 16:24

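    Back home for Spring Festival, facing relentless questions from assorted aunts and relatives, most homebody geeks would vote "Have you found a girlfriend yet?" the least welcome sentence of all. But in this era of big data we live in a world full of data, so is finding a girlfriend really that hard? Some might ask, "All this talk about big data, can it actually help me find a girlfriend?" The answer is yes: with the help of data, the odds improve considerably. Consider the classic case of Chris McKinlay from the United States: how to find your other half with big data.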


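    In a cramped cubicle on the fifth floor of UCLA's math sciences building, lit by a single bulb and the glow of his monitor, Chris McKinlay was using a supercomputer in Colorado for his PhD dissertation (subject: large-scale data processing and parallel numerical methods). Three in the morning was the best time to squeeze cycles out of the machine. While it ran, he opened a second window to check his inbox on OkCupid, a leading American online dating site.

    McKinlay, a lanky 35-year-old with tousled hair, was one of about 40 million Americans looking for romance through websites such as Match.com, J-Date, and e-Harmony, and an unremarkable one. He had been searching in vain for nine months, since his last breakup. He had sent introductory messages to dozens of women OkCupid recommended as potential matches, but most were ignored; he had gone on a total of six first dates.

    On that early morning in June 2012, with his compiler crunching code in one window and his forlorn dating profile idle in the other, it dawned on him that he was doing it wrong. He had been approaching online matchmaking like any other user. Instead, he realized, he should be dating like a mathematician.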

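    OkCupid was founded by Harvard math majors in 2004 and first caught daters' attention because of its computational approach to matchmaking. Members answer multiple-choice survey questions on everything from politics, religion, and family to love, sex, and smartphones.

    On average, users select 350 questions from a pool of thousands, such as "Which of the following is most likely to draw you to a movie?" or "How important is religion/God in your life?" For each question, the user records an answer, specifies which answers would be acceptable in a mate, and rates how important the question is on a five-point scale from "irrelevant" to "mandatory." OkCupid's matching engine uses that data to calculate a couple's compatibility: the closer the score is to 100 percent, the better the match.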
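The article does not give the matching formula, so the following is only a minimal sketch of how a score could be built from the three inputs it names (answer, acceptable answers, importance). The point values in `IMPORTANCE_POINTS`, the intermediate scale labels, the data layout, and the use of a geometric mean are all assumptions for illustration, not OkCupid's confirmed implementation.

```python
from math import sqrt

# Five-point importance scale. "irrelevant" and "mandatory" come from the
# article; the other labels and all point values are illustrative assumptions.
IMPORTANCE_POINTS = {
    "irrelevant": 0,
    "a little important": 1,
    "somewhat important": 10,
    "very important": 50,
    "mandatory": 250,
}


def satisfaction(rater, other):
    """Return (share of the rater's importance points earned by `other`,
    number of shared questions).

    Each profile maps question id -> {"answer", "acceptable", "importance"}.
    Only questions that both people answered are counted.
    """
    shared = rater.keys() & other.keys()
    earned = possible = 0
    for q in shared:
        points = IMPORTANCE_POINTS[rater[q]["importance"]]
        possible += points
        if other[q]["answer"] in rater[q]["acceptable"]:
            earned += points
    return (earned / possible if possible else 0.0), len(shared)


def match_percentage(a, b):
    """Geometric mean of the two one-sided satisfaction scores, as a percentage."""
    sat_a, shared = satisfaction(a, b)
    sat_b, _ = satisfaction(b, a)
    if shared == 0:
        return 0.0
    return 100 * sqrt(sat_a * sat_b)
```

Because only shared questions count, a profile that answers questions few others have chosen gives the engine little to work with, which is the visibility problem described in the article.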

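    Mathematically, though, McKinlay's compatibility with women in Los Angeles was abysmal. OkCupid's algorithm uses only the questions that both potential matches have chosen to answer, and the questions McKinlay had picked, more or less at random, had proven unpopular. When he scrolled through his matches, fewer than 100 women appeared above the 90 percent compatibility mark, in a city of some 2 million women (about 80,000 of them on OkCupid). On a site where compatibility equals visibility, he was practically invisible.

    McKinlay realized he had to boost that number. If statistical sampling could tell him which questions mattered to the kind of women he liked, he could build a new profile that answered those questions honestly and ignored the rest. He could then match every woman in Los Angeles who might be right for him, and none who were not.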

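    [Image] Chris McKinlay used Python scripts to pull large numbers of OkCupid survey questions, then sorted female daters into seven clusters, such as "Diverse" and "Mindful," each with its own characteristics. (Credit: Maurico Alejo)

    Even for a mathematician, Chris McKinlay's story is unusual. Raised in a Boston suburb, he graduated from Middlebury College in 2001 with a degree in Chinese. In August of that year he took a job translating Chinese into English on the 91st floor of the World Trade Center in New York. The towers fell five weeks later. (McKinlay was not due at the office until 2 pm that day, and so escaped the September 11 attacks.) "After that I asked myself what I really wanted to be doing," he says. A friend at Columbia recruited him into an offshoot of MIT's blackjack team, and he spent the next few years shuttling between New York and Las Vegas, earning up to $60,000 a year.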

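    The experience kindled his interest in applied math and ultimately led him to earn a master's and then a PhD in the field. "They were capable of using mathematics in lots of different situations," he says. "They could see some new poker game, then go home, write some code, and come up with a strategy to beat it."

    Now he would do the same for love. First he needed data. While his dissertation work continued to run on the side, he set up 12 fake OkCupid accounts and wrote a Python script to manage them. The script searched his target demographic (heterosexual and bisexual women between the ages of 25 and 45), visited their pages, and scraped every available piece of information from their profiles: ethnicity, height, smoker or nonsmoker, astrological sign, and so on.

    To get the survey answers, he had to do some extra work. OkCupid lets users see other people's answers only to questions they have answered themselves. McKinlay had his bots answer each question randomly. He was not using the dummy profiles to attract any of the women, so the answers did not matter; he only wanted to collect the women's answers in a database.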

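    McKinlay watched with satisfaction as his bots did their work. Then, after about a thousand profiles had been collected, he hit his first roadblock. OkCupid has a system in place to prevent exactly this kind of data harvesting, and one by one his bot accounts were banned.

    He would have to train them to act human.

    He turned to his friend Sam Torrisi, a neuroscientist who had recently taught McKinlay music theory in exchange for advanced math lessons. Torrisi was also on OkCupid, and he agreed to install spyware on his computer to monitor his own use of the site. With that data, McKinlay programmed his bots to simulate Torrisi's click rates and typing speed. He also brought in a second computer from home and plugged it into the math department's broadband line so it could run uninterrupted 24 hours a day.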
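The article only says the bots were programmed to imitate one person's recorded click rates and typing speed. As a rough sketch of that idea, the function below resamples pauses from a list of observed intervals and adds a little random variation. The function name, the `jitter` parameter, and the sample numbers are hypothetical, and nothing here is specific to any real site.

```python
import random


def humanlike_delays(observed_intervals, count, jitter=0.2, rng=None):
    """Yield `count` pauses (in seconds) resampled from recorded intervals.

    observed_intervals: pauses measured from a real person's session.
    jitter: maximum relative variation applied to each sampled pause.
    rng: optional random.Random instance, useful for reproducible runs.
    """
    rng = rng or random.Random()
    for _ in range(count):
        base = rng.choice(observed_intervals)
        # Scale the recorded pause by a random factor in [1 - jitter, 1 + jitter].
        yield max(0.0, base * (1 + rng.uniform(-jitter, jitter)))


# Example: pauses drawn from three recorded intervals, seeded for repeatability.
pauses = list(humanlike_delays([2.0, 3.5, 5.0], count=5, rng=random.Random(42)))
```

Resampling from real measurements keeps the timing distribution close to the recorded person's, which a fixed or uniformly random delay would not.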

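    After three weeks he had harvested 6 million questions and answers from 20,000 women all over the country. As he dove into the data, his dissertation was relegated to a side project. He was already sleeping in his cubicle most nights; now he gave up his apartment entirely and moved in, laying a thin mattress across his desk when it was time to sleep.

    For McKinlay's plan to work, he had to find a pattern in the survey data: a way to roughly group the women by their similarities. The breakthrough came when he coded up a modified version of a Bell Labs algorithm called K-Modes. First used in 1998 to analyze diseased soybean crops, it takes categorical data and clumps it together. With some fine-tuning he could adjust how finely the results clumped, from many thin clusters to a single solid glob.

    He played with the dial and found a natural resting point at which the 20,000 women fell into seven statistically distinct clusters based on their questions and answers. "I was ecstatic," he says. "That was the high point of June."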
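McKinlay's modified version is not described in the article, so the following is a minimal sketch of plain k-modes clustering for categorical data, not his code. Distance is the number of mismatched answers, and each cluster centre is the most common answer per question. The function names, the seeded initialisation, and the toy data are assumptions.

```python
import random
from collections import Counter


def hamming(a, b):
    """Number of positions where two answer tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most common value in each column (ties go to the value seen first)."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(rows, k, max_iter=20, seed=0):
    """Cluster tuples of categorical values into k groups.

    Returns (labels, modes): one cluster index per row, and the mode of each cluster.
    """
    rng = random.Random(seed)
    # Start from k distinct rows chosen at random.
    modes = rng.sample(sorted(set(rows)), k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each row joins the cluster with the nearest mode.
        new_labels = [min(range(k), key=lambda c: hamming(row, modes[c]))
                      for row in rows]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: recompute each cluster's mode from its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                modes[c] = column_modes(members)
    return labels, modes


# Toy survey: each tuple is one person's answers to three questions.
answers = [
    ("a", "x", "p"), ("a", "x", "q"), ("a", "y", "p"),
    ("b", "z", "r"), ("b", "z", "s"), ("b", "w", "r"),
]
labels, modes = k_modes(answers, k=2)
```

In this plain version the "dial" is simply `k`, the number of clusters; the article reports that seven gave clearly separated groups on his data.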

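    He then set his bots to gather another sample: 5,000 women in Los Angeles and San Francisco who had logged on to OkCupid in the past month. Another pass through K-Modes confirmed that they clustered in a similar way. His statistical sampling had worked.

    Now he just had to decide which cluster suited him best. He looked at some profiles from each. One cluster was too young, two were too old, and another was too Christian. But he lingered over a cluster dominated by women in their mid-twenties who looked like independent types, musicians and artists. McKinlay thought he might find true love in this group.

    Actually, a neighboring cluster looked good too: slightly older women who held professional creative jobs, such as editors and designers. He decided to go for both. He would set up two profiles, one optimized for the A group and one for the B group.

    He text-mined the two clusters and found that teaching was a popular topic in both, so he wrote a bio that emphasized his work as a math professor. He then picked out the 500 questions that were most popular with both clusters and filled in his honest answers, because he did not want to build his future relationship on a foundation of computer-generated lies. But he let his computer work out how much importance to assign each question, using a machine-learning algorithm called adaptive boosting to derive the best weightings.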
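The article names adaptive boosting but does not say how it was set up, so this is only a generic sketch of the algorithm itself. It assumes each row records, per question, whether some condition holds (1 or 0) and each label says whether the profile belongs to a target group (+1 or -1); both the framing and the function name are hypothetical. The total stump weight accumulated per feature is one possible reading of "how much importance to assign each question".

```python
from math import exp, log


def adaboost_feature_weights(X, y, rounds=10):
    """AdaBoost with one-feature decision stumps on 0/1 features.

    X: list of rows of 0/1 features; y: list of labels in {-1, +1}.
    Returns the total stump weight (alpha) accumulated per feature.
    """
    n, d = len(X), len(X[0])
    w = [1.0 / n] * n          # example weights, start uniform
    totals = [0.0] * d
    for _ in range(rounds):
        # Pick the stump (feature j, polarity s) with the lowest weighted error.
        # The stump predicts s when row[j] == 1 and -s otherwise.
        best = None
        for j in range(d):
            for s in (1, -1):
                err = sum(wi for wi, row, yi in zip(w, X, y)
                          if (s if row[j] else -s) != yi)
                if best is None or err < best[0]:
                    best = (err, j, s)
        err, j, s = best
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid division by zero
        alpha = 0.5 * log((1 - err) / err)
        totals[j] += alpha
        # Increase the weight of misclassified examples, then renormalise.
        w = [wi * exp(-alpha * yi * (s if row[j] else -s))
             for wi, row, yi in zip(w, X, y)]
        z = sum(w)
        w = [wi / z for wi in w]
    return totals


# Toy data: feature 0 determines the label, feature 1 is noise.
X = [[1, 0], [1, 1], [0, 0], [0, 1]]
y = [1, 1, -1, -1]
weights = adaboost_feature_weights(X, y, rounds=5)
```

On this toy data all of the weight lands on feature 0, since it alone separates the two classes.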


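    With that, he created two profiles, one with a photo of him rock climbing and the other with a photo of him playing guitar at a gig. One question read: "Regardless of future plans, what's more interesting to you right now? Sex or love?" His answer, obviously: love. But for the younger A cluster, he followed his computer's direction and rated the question "very important." For the B cluster, it was "mandatory."

    When the last question had been answered and ranked, he searched OkCupid for women in Los Angeles, sorted by match percentage. At the top was a full page of women matched at 99 percent. He scrolled down, and down, and down. Ten thousand women from all over Los Angeles scrolled by, and he was still in the 90s.

    He needed one more step to get noticed. OkCupid members are notified when someone views their page, so he wrote a new program to visit the pages of his top-rated matches, cycling by age: a thousand 41-year-old women on Monday, another thousand 40-year-old women on Tuesday, and so on, looping back when he reached 27-year-olds two weeks later. Women reciprocated by visiting his profiles, some 400 a day. Then the messages began to roll in.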
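The rotation described above (one age per day, 41 down to 27, then start over) can be sketched as a simple schedule generator. The function name and the start date are hypothetical; the ages and the thousand-per-day batch size come from the article. The sketch only produces the plan and does not visit anything.

```python
from datetime import date, timedelta
from itertools import cycle


def visit_plan(start, days, oldest=41, youngest=27, per_day=1000):
    """Yield (day, age, batch_size) tuples, one age per day, oldest first.

    After the youngest age is reached, the cycle restarts at the oldest.
    """
    ages = cycle(range(oldest, youngest - 1, -1))
    for offset, age in zip(range(days), ages):
        yield start + timedelta(days=offset), age, per_day


# Assumed start date for illustration: Monday, 4 June 2012.
plan = list(visit_plan(date(2012, 6, 4), days=16))
```

With 15 ages in the range, one full pass takes 15 days, which matches the article's "two weeks later".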

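    "I haven't until now come across anyone with such winning numbers, AND I find your profile intriguing," one woman wrote. "Also, something about a rugged man who's really good with numbers ... Thought I'd say hi."

    "Hey there, your profile really struck me and I wanted to say hi," another wrote. "I think we have quite a lot in common, maybe not the math but certainly a lot of other good stuff!"

    "Can you really translate Chinese?" yet another asked. "I took a class briefly but it didn't go well."

    The math portion of McKinlay's search was done. Only one thing remained: he had to leave his cubicle and go on dates.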

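    On June 30, McKinlay showered at the UCLA gym and drove his beat-up Nissan across town for his first data-mined date. Sheila was a web designer from the A cluster of young artist types. They met for lunch at a cafe in Echo Park. "It was scary," McKinlay says. "Up until this point it had almost been an academic exercise."

    By the end of his date with Sheila, it was clear to both that the attraction was not there. He went on his second date the next day, with an attractive blog editor from the B cluster. He had planned a romantic walk around Echo Park Lake but found that it was being dredged. She had been reading Proust and feeling down about her life. "It was kind of depressing," he says.

    Date three was also from the B group. He met Alison at a bar in Koreatown. She was a screenwriting student with a tattoo of a Fibonacci spiral on her shoulder. McKinlay got drunk and woke up in his cubicle the next day with a painful hangover. He sent Alison a follow-up message on OkCupid, but she did not write back.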

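    The rejection stung, but he was still getting 20 messages a day. Dating with his computer-optimized profiles was a completely different game. He could ignore messages he did not care for and respond to the ones that showed a sense of humor or something interesting in their bios. Back when he was the pursuer, he had swapped three to five messages to get a single date. Now he sent just one reply: "You seem really cool. Want to meet?"

    By date 20, he noticed latent variables emerging. In the younger cluster, the women invariably had two or more tattoos and lived on the east side of Los Angeles. In the other, a disproportionate number owned midsize dogs that they adored.

    His earliest dates were carefully planned. But as he worked through his queue, he resorted to casual meetups over lunch or coffee, often stacking two dates in a day. He developed a set of personal rules to get through his marathon love search. No more drinking, for one. End the date when it is over; do not let it trail off. And no concerts or movies. "Nothing where your attention is directed at a third object instead of each other," he says. "It's inefficient."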

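    Love is a data field

    McKinlay's code found that the women clustered into statistically identifiable groups who tended to answer their OkCupid survey questions in similar ways. One group, which he dubbed the Greens, were online dating newbies; another, the Samanthas, tended to be older and more adventurous. The charts showed how each cluster answered four of the most popular questions.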


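    After a month of dating from both profiles, he decided he was spending too much time on the freeway reaching east-side women from the tattoo cluster. He deleted his A-group profile. His efficiency improved, but the results were the same. As summer drew to a close, he had been on more than 55 dates, each one dutifully logged in a lab notebook. Only three had led to second dates; only one had led to a third.

    Most unsuccessful daters confront self-esteem issues. For McKinlay it was worse: he had to question his calculations.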

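    Then came a message from Christine Tien Wang, a 28-year-old artist and prison abolition activist. McKinlay had popped up in her search for 6-foot guys with blue eyes near UCLA, where she was pursuing a master's in fine arts. They were a 91 percent match.

    He met her at the sculpture garden on campus, and from there they walked to a college sushi place. He felt it immediately. They talked about books, art, and music. When she confessed that she had made some tweaks to her profile before messaging him, he told her the whole story of his love hacking.

    "I thought it was dark and cynical," she says. "I liked it."

    It was first date number 88. A second date followed, then a third. After two weeks they both suspended their OkCupid accounts.

    "I think that what I did is just a slightly more algorithmic, large-scale, and machine-learning-based version of what everyone does on the site," McKinlay says. Everyone tries to create an optimal profile; he just had the data to engineer one.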

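    It is one year after their first date, and McKinlay and Tien Wang have met me at the Westwood sushi bar where their relationship began. McKinlay has his PhD; he teaches math and is now working on a postgraduate degree in music. Tien Wang was accepted into a one-year art fellowship in Qatar and is in California to visit McKinlay. They have been staying in touch on Skype, and she has come back to visit a couple of times.

    At my request, McKinlay has brought his lab notebook. Tien Wang has not seen it before. It is page after page of formulas and equations, ending in a neatly ordered list of women and dates with a few terse notes about each. Tien Wang leafs through it, laughing at some of the highlights. On August 24, she notices, he took two women to the same beach on the same day. "That's horrible," she says.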

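    To Tien Wang, McKinlay's OkCupid hacking is a funny story to tell. But all the math and coding is merely a prologue to their story together. The real hacking in a relationship comes after you meet. "People are much more complicated than their profiles," she says. "So the way we met was kind of superficial, but everything that happened after is not superficial at all. It's been cultivated through a lot of work."

    "It's not like, we matched and therefore we have a great relationship," McKinlay agrees. "It was just a mechanism to put us in the same room. I was able to use OkCupid to find someone."

    She bristles at that. "You didn't find me. I found you," she says, touching his elbow. McKinlay pauses to think, then admits she is right.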

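    A week later Tien Wang is back in Qatar, and the couple is on one of their daily Skype calls when McKinlay pulls out a diamond ring and holds it up to the webcam. She says yes.

    They are not yet sure when they will get married. There is research to be done to determine the optimal wedding day.

    English original: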

    Chris McKinlay was folded into a cramped fifth-floor cubicle in UCLA’s math sciences building, lit by a single bulb and the glow from his monitor. It was 3 in the morning, the optimal time to squeeze cycles out of the supercomputer in Colorado that he was using for his PhD dissertation. (The subject: large-scale data processing and parallel numerical methods.) While the computer chugged, he clicked open a second window to check his OkCupid inbox.

    McKinlay, a lanky 35-year-old with tousled hair, was one of about 40 million Americans looking for romance through websites like Match.com, J-Date, and e-Harmony, and he’d been searching in vain since his last breakup nine months earlier. He’d sent dozens of cutesy introductory messages to women touted as potential matches by OkCupid’s algorithms. Most were ignored; he’d gone on a total of six first dates.

    On that early morning in June 2012, his compiler crunching out machine code in one window, his forlorn dating profile sitting idle in the other, it dawned on him that he was doing it wrong. He’d been approaching online matchmaking like any other user. Instead, he realized, he should be dating like a mathematician.

    OkCupid was founded by Harvard math majors in 2004, and it first caught daters’ attention because of its computational approach to matchmaking. Members answer droves of multiple-choice survey questions on everything from politics, religion, and family to love, sex, and smartphones.

    On average, respondents select 350 questions from a pool of thousands—“Which of the following is most likely to draw you to a movie?” or “How important is religion/God in your life?” For each, the user records an answer, specifies which responses they’d find acceptable in a mate, and rates how important the question is to them on a five-point scale from “irrelevant” to “mandatory.” OkCupid’s matching engine uses that data to calculate a couple’s compatibility. The closer to 100 percent—mathematical soul mate—the better.

    But mathematically, McKinlay’s compatibility with women in Los Angeles was abysmal. OkCupid’s algorithms use only the questions that both potential matches decide to answer, and the match questions McKinlay had chosen—more or less at random—had proven unpopular. When he scrolled through his matches, fewer than 100 women would appear above the 90 percent compatibility mark. And that was in a city containing some 2 million women (approximately 80,000 of them on OkCupid). On a site where compatibility equals visibility, he was practically a ghost.

    He realized he’d have to boost that number. If, through statistical sampling, McKinlay could ascertain which questions mattered to the kind of women he liked, he could construct a new profile that honestly answered those questions and ignored the rest. He could match every woman in LA who might be right for him, and none that weren’t.


    Even for a mathematician, McKinlay is unusual. Raised in a Boston suburb, he graduated from Middlebury College in 2001 with a degree in Chinese. In August of that year he took a part-time job in New York translating Chinese into English for a company on the 91st floor of the north tower of the World Trade Center. The towers fell five weeks later. (McKinlay wasn’t due at the office until 2 o’clock that day. He was asleep when the first plane hit the north tower at 8:46 am.) “After that I asked myself what I really wanted to be doing,” he says. A friend at Columbia recruited him into an offshoot of MIT’s famed professional blackjack team, and he spent the next few years bouncing between New York and Las Vegas, counting cards and earning up to $60,000 a year.

    The experience kindled his interest in applied math, ultimately inspiring him to earn a master’s and then a PhD in the field. “They were capable of using mathematics in lots of different situations,” he says. “They could see some new game—like Three Card Pai Gow Poker—then go home, write some code, and come up with a strategy to beat it.”

    Now he’d do the same for love. First he’d need data. While his dissertation work continued to run on the side, he set up 12 fake OkCupid accounts and wrote a Python script to manage them. The script would search his target demographic (heterosexual and bisexual women between the ages of 25 and 45), visit their pages, and scrape their profiles for every scrap of available information: ethnicity, height, smoker or nonsmoker, astrological sign—“all that crap,” he says.

    To find the survey answers, he had to do a bit of extra sleuthing. OkCupid lets users see the responses of others, but only to questions they’ve answered themselves. McKinlay set up his bots to simply answer each question randomly—he wasn’t using the dummy profiles to attract any of the women, so the answers didn’t matter—then scooped the women’s answers into a database.

    McKinlay watched with satisfaction as his bots purred along. Then, after about a thousand profiles were collected, he hit his first roadblock. OkCupid has a system in place to prevent exactly this kind of data harvesting: It can spot rapid-fire use easily. One by one, his bots started getting banned.

    He would have to train them to act human.

    He turned to his friend Sam Torrisi, a neuroscientist who’d recently taught McKinlay music theory in exchange for advanced math lessons. Torrisi was also on OkCupid, and he agreed to install spyware on his computer to monitor his use of the site. With the data in hand, McKinlay programmed his bots to simulate Torrisi’s click-rates and typing speed. He brought in a second computer from home and plugged it into the math department’s broadband line so it could run uninterrupted 24 hours a day.

    After three weeks he’d harvested 6 million questions and answers from 20,000 women all over the country. McKinlay’s dissertation was relegated to a side project as he dove into the data. He was already sleeping in his cubicle most nights. Now he gave up his apartment entirely and moved into the dingy beige cell, laying a thin mattress across his desk when it was time to sleep.

    For McKinlay’s plan to work, he’d have to find a pattern in the survey data—a way to roughly group the women according to their similarities. The breakthrough came when he coded up a modified Bell Labs algorithm called K-Modes. First used in 1998 to analyze diseased soybean crops, it takes categorical data and clumps it like the colored wax swimming in a Lava Lamp. With some fine-tuning he could adjust the viscosity of the results, thinning it into a slick or coagulating it into a single, solid glob.

    He played with the dial and found a natural resting point where the 20,000 women clumped into seven statistically distinct clusters based on their questions and answers. “I was ecstatic,” he says. “That was the high point of June.”

    He retasked his bots to gather another sample: 5,000 women in Los Angeles and San Francisco who’d logged on to OkCupid in the past month. Another pass through K-Modes confirmed that they clustered in a similar way. His statistical sampling had worked.

    Now he just had to decide which cluster best suited him. He checked out some profiles from each. One cluster was too young, two were too old, another was too Christian. But he lingered over a cluster dominated by women in their mid-twenties who looked like indie types, musicians and artists. This was the golden cluster. The haystack in which he’d find his needle. Somewhere within, he’d find true love.

    Actually, a neighboring cluster looked pretty cool too—slightly older women who held professional creative jobs, like editors and designers. He decided to go for both. He’d set up two profiles and optimize one for the A group and one for the B group.

    He text-mined the two clusters to learn what interested them; teaching turned out to be a popular topic, so he wrote a bio that emphasized his work as a math professor. The important part, though, would be the survey. He picked out the 500 questions that were most popular with both clusters. He’d already decided he would fill out his answers honestly—he didn’t want to build his future relationship on a foundation of computer-generated lies. But he’d let his computer figure out how much importance to assign each question, using a machine-learning algorithm called adaptive boosting to derive the best weightings.


    With that, he created two profiles, one with a photo of him rock climbing and the other of him playing guitar at a music gig. “Regardless of future plans, what’s more interesting to you right now? Sex or love?” went one question. Answer: Love, obviously. But for the younger A cluster, he followed his computer’s direction and rated the question “very important.” For the B cluster, it was “mandatory.”

    When the last question was answered and ranked, he ran a search on OkCupid for women in Los Angeles sorted by match percentage. At the top: a page of women matched at 99 percent. He scrolled down … and down … and down. Ten thousand women scrolled by, from all over Los Angeles, and he was still in the 90s.

    He needed one more step to get noticed. OkCupid members are notified when someone views their pages, so he wrote a new program to visit the pages of his top-rated matches, cycling by age: a thousand 41-year-old women on Monday, another thousand 40-year-old women on Tuesday, looping back through when he reached 27-year-olds two weeks later. Women reciprocated by visiting his profiles, some 400 a day. And messages began to roll in.

    “I haven’t until now come across anyone with such winning numbers, AND I find your profile intriguing,” one woman wrote. “Also, something about a rugged man who’s really good with numbers … Thought I’d say hi.”

    “Hey there—your profile really struck me and I wanted to say hi,” another wrote. “I think we have quite a lot in common, maybe not the math but certainly a lot of other good stuff!”

    “Can you really translate Chinese?” yet another asked. “I took a class briefly but it didn’t go well.”

    The math portion of McKinlay’s search was done. Only one thing remained. He’d have to leave his cubicle and take his research into the field. He’d have to go on dates.

    On June 30, McKinlay showered at the UCLA gym and drove his beat-up Nissan across town for his first data-mined date. Sheila was a web designer from the A cluster of young artist types. They met for lunch at a cafe in Echo Park. “It was scary,” McKinlay says. “Up until this point it had almost been an academic exercise.”

    By the end of his date with Sheila, it was clear to both that the attraction wasn’t there. He went on his second date the next day—an attractive blog editor from the B cluster. He’d planned a romantic walk around Echo Park Lake but found it was being dredged. She’d been reading Proust and feeling down about her life. “It was kind of depressing,” he says.

    Date three was also from the B group. He met Alison at a bar in Koreatown. She was a screenwriting student with a tattoo of a Fibonacci spiral on her shoulder. McKinlay got drunk on Korean beer and woke up in his cubicle the next day with a painful hangover. He sent Alison a follow-up message on OkCupid, but she didn’t write back.

    The rejection stung, but he was still getting 20 messages a day. Dating with his computer-endowed profiles was a completely different game. He could ignore messages consisting of bad one-liners. He responded to the ones that showed a sense of humor or displayed something interesting in their bios. Back when he was the pursuer, he’d swapped three to five messages to get a single date. Now he’d send just one reply. “You seem really cool. Want to meet?”

    By date 20, he noticed latent variables emerging. In the younger cluster, the women invariably had two or more tattoos and lived on the east side of Los Angeles. In the other, a disproportionate number owned midsize dogs that they adored.

    His earliest dates were carefully planned. But as he worked feverishly through his queue, he resorted to casual afternoon meetups over lunch or coffee, often stacking two dates in a day. He developed a set of personal rules to get through his marathon love search. No more drinking, for one. End the date when it’s over, don’t let it trail off. And no concerts or movies. “Nothing where your attention is directed at a third object instead of each other,” he says. “It’s inefficient.”

    LOVE IS A DATA FIELD

    McKinlay’s code found that the women clustered into statistically identifiable groups who tended to answer their OkCupid survey questions in similar ways. One group, which he dubbed the Greens, were online dating newbies; another, the Samanthas, tended to be older and more adventuresome. Here’s how each cluster answered four of the most popular questions.

    The Questions


    (1) About how long do you want your next relationship to last?
    One night
    A few months to a year
    Several years
    The rest of my life


    (2) Say you’ve started seeing someone you really like. As far as you’re concerned, how long will it take before you have sex?
    1-2 dates
    3-5 dates
    6 or more dates
    Only after the wedding


    (3) Have you ever had a sexual encounter with someone of the same sex?
    Yes, and I enjoyed myself
    Yes, and I did not enjoy myself
    No, and I would never
    No, but I’d like to


    (4) How important is religion/God in your life?
    Extremely important
    Somewhat important
    Not very important
    Not important at all




    After a month of dating equally from both of his profiles, he decided he was spending too much time on the freeway reaching east-side women from the tattoo cluster. He deleted his A-group profile. His efficiency improved, but the results were the same. As summer drew to a close, he’d been on more than 55 dates, each one dutifully logged in a lab notebook. Only three had led to second dates; only one had led to a third.

    Most unsuccessful daters confront self-esteem issues. For McKinlay it was worse. He had to question his calculations.

    Then came the message from Christine Tien Wang, a 28-year-old artist and prison abolition activist. McKinlay had popped up in her search for 6-foot guys with blue eyes near UCLA, where she was pursuing her master’s in fine arts. They were a 91 percent match.

    He met her at the sculpture garden on campus. From there they walked to a college sushi joint. He felt it immediately. They talked about books, art, music. When she confessed that she’d made some tweaks to her profile before messaging him, he responded by telling her all about his love hacking. The whole story.

    “I thought it was dark and cynical,” she says. “I liked it.”

    It was first date number 88. A second date followed, then a third. After two weeks they both suspended their OkCupid accounts.

    “I think that what I did is just a slightly more algorithmic, large-scale, and machine-learning-based version of what everyone does on the site,” McKinlay says. Everyone tries to create an optimal profile—he just had the data to engineer one.

    It’s one year after their first date, and McKinlay and Tien Wang have met me at the Westwood sushi bar where their relationship began. McKinlay has his PhD; he’s teaching math and is now working on a postgraduate degree in music. Tien Wang was accepted into a one-year art fellowship in Qatar. She’s in California to visit McKinlay. They’ve been staying connected on Skype, and she has returned for a couple of visits.

    At my request, McKinlay has brought his lab notebook. Tien Wang hasn’t seen it before today. It’s page after page of formulas and equations in McKinlay’s tight handwriting, ending in a neatly ordered list of women and dates, a few terse notes about each. Tien Wang leafs through it, laughing at some of the highlights. On August 24, she notices, he took two women to the same beach on the same day. “That’s horrible,” she says.

    To Tien Wang, McKinlay’s OkCupid hacking is a funny story to tell. But all the math and coding is merely prologue to their story together. The real hacking in a relationship comes after you meet. “People are much more complicated than their profiles,” she says. “So the way we met was kind of superficial, but everything that happened after is not superficial at all. It’s been cultivated through a lot of work.”

    “It’s not like, we matched and therefore we have a great relationship,” McKinlay agrees. “It was just a mechanism to put us in the same room. I was able to use OkCupid to find someone.”

    She bristles at that. “You didn’t find me. I found you,” she says, touching his elbow. McKinlay pauses to think, then admits she’s right.

    A week later Tien Wang is back in Qatar, and the couple is on one of their daily Skype calls when McKinlay pulls out a diamond ring and holds it up to the webcam. She says yes.

    They’re not entirely sure when they’ll get married. There’s research to be done to determine the optimal wedding day.



    Original article: How a Math Genius Hacked OkCupid to Find True Love (translated by Arron and 毛梦琪; editor: 仲浩). Compiled and edited by CSDN.

