UNIT 25 Data Mining

Data Mining. You've probably heard of it. Or maybe you've heard of data warehousing. Mining and warehousing are related: warehousing brings your data together for analysis. Mining sorts through the data you've collected and turns up interesting and useful connections.

Understanding data mining may be important to you. After all, [1] Forrester Research Inc. in Cambridge, Mass., predicts that the next two years will see an explosion of data mining projects, with almost four times the number that currently exist, says Frank Gillett, a Forrester analyst.

It all starts with a load of finely detailed historical data that needs to be sifted through for gems. [2] Then, you need to decide what discrete problem you want to solve: increasing direct-mail response rate, finding mortgage customers or boosting grocery sales, for example.

[3] To get through all the data, you need mining tools based on algorithms that scan through the data looking for patterns (such as grocery shoppers buying peanut butter and jelly together). Most mining tools need to have data in a flat file format in order to start sorting through it, so the data is extracted and put in a flat text file. Then the mining process can begin.

The tools themselves work in a variety of ways. Some are desktop-based, others are client/server. Some, like Right Point Software Inc.'s, have one algorithm that does one type of search. Others, such as SAS Institute Inc.'s offering, include a toolkit of several algorithms.

You have to carefully select the variables. If you don't include a key variable, you may not get the relationship you're looking for, while too many variables produce too much output. But an over-reliance on tools' capabilities could lead to trouble. [4] There are other areas that could cause problems if not addressed in the beginning stages of a data mining project. [5] You must have someone who knows what they're doing as your mining expert. [6] To think you can do data mining without a statistical or mining background is mind-boggling.

Properly selecting which searches to run is imperative. For example, a project leader with a statistical background may not understand that a customer's age wouldn't be as good a predictor as an age-to-income ratio. On the other hand, if the project leader has only a statistical and business background, he may not understand data storage, transportation and maintenance requirements. Some projects suffer because too much attention is spent on preparing the data instead of refining the mining models.

New Words
mine v. 挖掘
warehouse n. 仓库
analysis n. 分析
sort v. 分类
explosion n. 爆炸
analyst n. 分析家
historical adj. 历史上的
sift v. 筛选
discrete adj. 不连续的
mortgage n. 按揭
scan v. 扫描
grocery n. 食品
format n. 模式
extract v. 提取
toolkit n. 工具箱
variable n. 变量
statistical adj. 统计的
imperative adj. 急需的,必要的
maintenance n. 维护
transportation n. 传递
refine v. 推敲

Phrases and Expressions
data mining 数据挖掘
data warehousing 数据仓库
turn up 出现
after all 毕竟
get through 完成
scan through 扫描
look for 寻找
sort through 筛选
put in 插入
lead to 导致

The Explanation of Difficult Statements

1. Forrester Research Inc. in Cambridge, Mass., predicts that the next two years will see an explosion of data mining projects, with almost four times the number that currently exist, says Frank Gillett, a Forrester analyst.
Analysis: This sentence involves translating a multiple; see "Translation Skills" later in this unit.
Meaning (Chinese): 麻省坎布里奇市的Forrester研究公司预测,数据挖掘项目今后两年将有突破性发展,Forrester公司的分析师Frank Gillett称,几乎达到目前已有数量的四倍。

2. Then, you need to decide what discrete problem you want to solve: increasing direct-mail response rate, finding mortgage customers or boosting grocery sales, for example.
Analysis: "what discrete problem you want to solve" is a nominal clause serving as the object of the sentence, equivalent to "the things that ...".
Meaning (Chinese): 然后你需要决定你想解决哪些具体问题,如增加直邮的回答率,找出按揭的客户或者提高食品的销售等等。

3. To get through all the data, you need mining tools based on algorithms that scan through the data looking for patterns (such as grocery shoppers buying peanut butter and jelly together).
Analysis: In "based on algorithms that scan through the data looking for patterns", "based" is a past participle and "looking" is a present participle, both used as modifiers. The difference is that a present participle expresses an action that is in progress and active, while a past participle expresses an action that is completed and passive.
Meaning (Chinese): 为看完所有的数据,你需要挖掘工具,而这些工具以扫描数据、找出模式(如食品店中同时购买花生酱和果冻的人)的算法为基础。

4. There are other areas that could cause problems if not addressed in the beginning stages of a data mining project.
Analysis: "if not addressed in the beginning stages of a data mining project" is an elliptical clause. The full form is "if they are not addressed in the beginning stages of a data mining project". In English, in clauses introduced by as if, if, no matter what, once, though/although, unless, when, where, whether, while and the like, the subject and the verb "be" of the clause are usually omitted when the verb is "be" and the subject is the same as that of the main clause or is "it". For example: Although (he was) exhausted by the climb, he continued his journey.
Meaning (Chinese): 还有一些方面,如果不在数据挖掘项目的开始阶段就注意到的话,也会出问题。

5. You must have someone who knows what they're doing as your mining expert.
Analysis: "what they're doing" is a nominal clause serving as an object, equivalent to "the things that they're doing".
Meaning (Chinese): 你必须用了解情况的人作你的挖掘专家。

6. To think you can do data mining without a statistical or mining background is mind-boggling.
Analysis: Here an infinitive phrase serves as the subject. Such a sentence is usually written with a formal subject, that is, "It is mind-boggling to think you can do data mining without a statistical or mining background."
Meaning (Chinese): 那种认为没有统计或挖掘背景的人也能做好数据挖掘,是不健全的想法。

Grammar: Translation Skills

Translating Multiples

English and Chinese differ considerably in how they express increases and decreases in numbers (or multiples), so take particular care when translating. Several common patterns are listed below.

1. Pattern "... + by + number or multiple + ..."
When this pattern contains a comparative, or a verb or participle expressing increase or decrease, what follows "by" is the net amount of the decrease or the net multiple of the increase. For example:
This year the value of our industrial output has increased by twice as compared with that of last year.
今年我们工业的产值比去年增加了两倍。

2. Pattern "... + number or multiple + comparative + than + ..."
In this pattern the number or multiple is usually the net change, that is, a net increase of n times or a decrease to 1/(n+1). For example:
A is twice less than B.
A是B的1/3。

3. Pattern "... as much (many, large, fast) again as ..."
This pattern expresses a net increase of 100% (一倍). For example:
Wheel A turns as fast again as wheel B.
A轮转动比B轮快一倍。
A is half as long again as B.
A的长度是B的一倍半(或A比B长一半)。

4. Pattern "verb or participle expressing increase or decrease + to + numeral + ..."
This pattern expresses an increase to (or a decrease to) a given figure. For example:
The members have increased to 1000.
成员增加到1000名。
By using this new process the loss of metal was reduced to 20%.
采用这种新工艺使金属耗损降到20%。

5. Pattern "as + adjective (such as high, many, much) + as + specific figure"
This pattern means "as (high, many) as" the stated figure. For example:
The temperature is as high as 6000.
温度高达6000。

6. Pattern "predicate expressing increase or decrease + by a factor of n"
This pattern is rendered in Chinese as a net increase of (n-1) 倍, or as a decrease by (n-1)/n (that is, down to 1/n). For example:
In case of electronic scanning the beam width is broader by a factor of two.
电子扫描时,波束宽度展宽一倍。
The equipment under development will reduce the error probability by a factor of 7.
正在研制的设备将使出错概率降低6/7(或降到1/7)。

7. Pattern "multiple + as + adjective or adverb + as + ..."
This pattern is rendered in Chinese as a net increase of (n-1) 倍, or as a decrease to 1/n (that is, a decrease by (n-1)/n). For example:
This substance reacts three times as fast as the other one.
这种物质的反应速度是另一种物质的三倍(或比另一物质快两倍)。
This substance reacts one-tenth as fast as the other one.
这种物质的反应速度是另一种物质的1/10(或比另一物质慢9/10)。

8. Pattern "predicate or phrase expressing increase or decrease + multiple"
This pattern is rendered in Chinese as a net increase of (n-1) 倍, or as a decrease by (n-1)/n (that is, down to 1/n). For example:
The production has increased three times.
生产增加了两倍。
The principal advantage over the old-fashioned machine is a four-fold reduction in weight.
与旧式机器相比的主要优点是重量减少了四分之三。

The patterns above are common sentence structures for expressing increases and decreases in multiples, fractions and percentages. English is not limited to these, but once you grasp the characteristics of each way of expressing them, you can understand the original meaning correctly. For example:
We can use a total of 170 repeaters to achieve approximately double the original bandwidth.
使用170台增音机可以使带宽达到原有带宽的两倍左右(或比原有带宽大一倍左右)。
As the high voltage was abruptly trebled, all the valves burnt.
由于高压突然增加了2倍,管子都烧坏了(或突然增加到原来的3倍)。

Exercises

I. Fill in the blanks according to the text:
(1) ______ sorts through the data you've collected and turns up interesting and useful connections.
(2) To get through all the data, you need mining tools based on ______ that scan through the data looking for patterns.
(3) Most mining tools need to have data in a ______ in order to start sorting through it, so the data is extracted and put in a flat text file.
(4) The tools themselves work in a variety of ways. Some are desktop-based, others are ______.
(5) If you don't include a ______, you may not get the relationship you're looking for, while too many variables produce too much output.

II. Decide whether each of the following statements is true or false according to the text:
(1) Some, like Right Point Software Inc.'s, include a toolkit of several algorithms.
(2) To think you can do data mining without a statistical or mining background is mind-boggling.
(3) An over-reliance on tools' capabilities couldn't lead to trouble.
(4) A project leader with a statistical background may understand that a customer's age wouldn't be as good a predictor as an age-to-income ratio.
(5) Some projects suffer because too much attention is spent on preparing the data instead of refining the mining models.

III. Translate the following sentences into Chinese:
(1) The impact on an unrestrained driver or passenger would be about 10 times greater than the impact on the car.
(2) Car accidents increased by 2.5 times compared with the late 1980s.
(3) In Mexico, English-speaking secretaries can double their wages; in Egypt, their pay goes up 10 times.
(4)
The principal advantage over the old-fashioned typewriter is a four-fold reduction in weight.
(5) In 1766 the English scientist Henry Cavendish had shown that hydrogen was seven times lighter than air.

IV. Translate the following paragraphs into Chinese:
A digital wallet is software that enables users to pay for goods on the Web. It holds credit card numbers and other personal information such as a shipping address. Once entered, the data automatically populates order fields at merchant sites.
When using a digital wallet, consumers don't need to fill out order forms on each site when they purchase an item because the information has already been stored and is automatically updated and entered into the order fields across merchant sites. Consumers also benefit when using digital wallets because their information is encrypted or protected by a private software code. And merchants benefit by receiving protection against fraud.

Reading Materials

Digital Wallets (I)

A digital wallet is software that enables users to pay for goods on the Web. It holds credit card numbers and other personal information such as a shipping address. Once entered, the data automatically populates order fields at merchant sites.

When using a digital wallet, consumers don't need to fill out order forms on each site when they purchase an item because the information has already been stored and is automatically updated and entered into the order fields across merchant sites. Consumers also benefit when using digital wallets because their information is encrypted or protected by a private software code. And merchants benefit by receiving protection against fraud.

Digital wallets are available to consumers free of charge, and they're fairly easy to obtain. For example, when a consumer makes a purchase at a merchant site that's set up to handle server-side digital wallets, he types his name and payment and shipping information into the merchant's own form. At the end of the purchase, the consumer is asked to sign up for a wallet of his choice by entering a user name and password for future purchases. Users can also acquire wallets at a wallet vendor's site. Although a wallet is free for consumers, vendors charge merchants for wallets.

New Words and Expressions
order fields 订货域
merchant sites 商家网站
encrypt vt. 加密
fraud n. 欺诈(的行为)

Questions
(1) What is a digital wallet?
(2) How is a digital wallet used, according to the text?

Digital Wallets (II)

Digital wallets come in two main types: client-side and server-side. Within those divisions are wallets that work only on specific merchant sites and those that are merchant-agnostic.

Client-based digital wallets, the older of the two types, are falling by the wayside, according to analysts, because they require users to download the wallet application and input payment and mailing information. At that point, the information is secured and encrypted on the user's hard drive. The user retains control of his credit card and personal information locally.

With a server-based wallet, a user fills out his personal information, and a cookie is automatically downloaded. (A cookie is a text file that contains information about the user.) In this scenario, the consumer information resides on the server of a financial institution or a digital wallet vendor rather than on the user's PC.

Server-side wallets provide assurance against merchant fraud because they use certificates to verify the identity of all parties. When a party makes a transaction, it presents its certificate to the other parties involved. A certificate is an attachment to an electronic message used to verify the identity of the party and to provide the receiver with the means to encode a reply.

Furthermore, the cardholder's sensitive data is typically housed at a financial institution, so there's an extra sense of security because financial environments generally provide the highest degree of security.

But even though wallets provide easy shopping online, adoption hasn't been widespread. Standards are pivotal to the success of digital wallets. Last month, major vendors, including Microsoft Corp., Sun Microsystems Inc. and America Online Inc., announced their endorsement of a new standard called ECML, or E-Commerce Modeling Language, to give Web merchants a standardized way to collect electronic data for shipping, billing and payment.

New Words and Expressions
scenario n. 情况
reside on 驻留在
financial institution 金融机构
assurance n. 承诺,担保,保证,保险
certificate n. 证书
attachment n. 附件
cardholder n. 信用卡持有人
ECML (E-Commerce Modeling Language) 电子商务建模语言

Questions
(1) What types do digital wallets come in?
(2) What are the benefits of using server-based digital wallets?
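
As an aid to the questions above, the "enter once, populate order fields everywhere" behaviour described in the reading can be sketched in a few lines of Python. This is only an illustrative sketch, not how any real wallet product works. The class `Wallet`, the function `populate_order_form`, the field names and the sample data are all hypothetical, and the encryption and certificates mentioned in the text are left out. The sketch also assumes every merchant uses the same field names, which is the kind of agreement the text says ECML is meant to standardize.

```python
from dataclasses import dataclass, asdict


@dataclass
class Wallet:
    """Data the user enters once (all field names are hypothetical)."""
    name: str
    card_number: str
    shipping_address: str


def populate_order_form(wallet: Wallet, form_fields: list[str]) -> dict[str, str]:
    """Fill each form field the wallet has a value for; leave the others blank."""
    stored = asdict(wallet)
    return {field: stored.get(field, "") for field in form_fields}


# Dummy data, entered a single time by the user.
wallet = Wallet(
    name="A. Shopper",
    card_number="0000000000000000",
    shipping_address="1 Example Street",
)

# Two hypothetical merchants with different order forms reuse the same data.
bookshop_form = populate_order_form(
    wallet, ["name", "card_number", "shipping_address"]
)
grocer_form = populate_order_form(
    wallet, ["name", "shipping_address", "gift_note"]
)
```

The grocer's form asks for a field the wallet does not hold (`gift_note`), so that field stays empty for the shopper to fill in by hand, while the shared fields are filled without retyping.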