Chapter 2: Data Representation
School of Software, Tianjin University (天津大学软件学院)
Introduction to Computer Science (计算机导论)

2.1 DATA TYPES

Data today come in different forms, such as numbers, text, images, audio, and video, and people need to process all of these data types. The computer industry uses the term "multimedia" for information that contains numbers, text, images, audio, and video.

Analog and Digital Information

Computers are finite. Computer memory and other hardware devices have only so much room to store and manipulate data. The goal of data representation is to represent enough of the world to satisfy our computational needs and our senses of sight and sound.

Information can be represented in one of two ways: analog or digital.

Analog data: a continuous representation, analogous to the actual information it represents.
Digital data: a discrete representation, breaking the information up into separate elements.

A mercury thermometer exemplifies analog data: the mercury rises and falls continuously in direct proportion to the temperature. A digital display shows only discrete values.

2.2 DATA INSIDE THE COMPUTER

All data types from outside a computer are transformed into a uniform representation when stored in the computer and transformed back when leaving it. This universal format is called a bit pattern.

BIT
A bit (binary digit) is the smallest unit of data that can be stored in a computer; it is either 0 or 1.

BIT PATTERN
A bit pattern is a sequence (sometimes called a string) of bits that can represent a symbol.

BYTE
A bit pattern of length 8 is called a byte.
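
To make the definitions of bit, bit pattern, and byte concrete, here is a minimal Python sketch that lists every bit pattern of a given length. The function name `bit_patterns` is a hypothetical helper chosen for illustration; it is not part of the course material.

```python
from itertools import product


def bit_patterns(length):
    """Return every bit pattern of the given length, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=length)]


# All patterns of length 3: 000, 001, ..., 111.
three_bit = bit_patterns(3)
print(three_bit)

# A byte is a pattern of length 8, so there are 2**8 possible bytes.
byte_count = len(bit_patterns(8))
print(byte_count)
```

Each extra bit doubles the number of available patterns, which is why a pattern of length 8 can take 256 different values.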

[Figure: Examples of bit patterns]

2.3 REPRESENTING DATA

TEXT
A piece of text in any language is a sequence of symbols used to represent an idea in that language. Each symbol can be represented by a bit pattern. For example, the text "BYTE", which is made of four symbols, can be represented as four bit patterns, each pattern defining a single symbol.

How many bits are needed in a bit pattern to represent a symbol in a language? The length of the bit pattern depends on the number of symbols used in that language: more symbols mean a longer bit pattern. The relationship is not linear but logarithmic. If you need n symbols, the length is $\log_2 n$ bits, rounded up to a whole number when n is not a power of 2.

Number of symbols    Bit pattern length
2                    1
4                    2
8                    3
16                   4
128                  7
256                  8

Codes
Different sets of bit patterns have been designed to represent text symbols. Each set is called a code, and the process of representing symbols is called coding.

ASCII
The American National Standards Institute (ANSI) developed a code called the American Standard Code for Information Interchange (ASCII). This code uses 7 bits for each symbol, so it can define $2^7 = 128$ different symbols.

[Figure: ASCII code table]

The Unicode Character Set
Unicode is a unified character encoding standard. The slide describes it as encoding each character with two bytes, which matches the original 16-bit design; later versions of the standard define code points beyond 16 bits.

[Figure 3.6: A few characters in the Unicode character set]

AUDIO
Audio is by nature analog data: it is continuous, not discrete. It is converted to digital data through sampling, quantization, and coding, and the resulting bit patterns are then stored. Common audio file formats include WAV, AU, AIFF, VQF, and MP3.

IMAGES
Images today are represented in a computer by one of two methods: bitmap graphic or vector graphic.
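
Returning briefly to the text-coding slides above, the pattern-length rule and the 7-bit ASCII code can be checked with a small Python sketch. The helper name `pattern_length` is hypothetical. The sketch assumes that the length for n symbols is log2 n rounded up to a whole number of bits, and it relies on Python's built-in `ord`, whose values coincide with the ASCII codes for these letters.

```python
def pattern_length(num_symbols):
    """Smallest number of bits that gives each of num_symbols a distinct pattern."""
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, using integers only.
    return (num_symbols - 1).bit_length()


# Symbol counts taken from the table in the text.
for n in (2, 4, 8, 16, 128, 256):
    print(n, pattern_length(n))

# Encode "BYTE" as four 7-bit ASCII patterns, one per symbol.
ascii_patterns = [format(ord(ch), "07b") for ch in "BYTE"]
print(ascii_patterns)
```

The loop reproduces the pattern lengths 1, 2, 3, 4, 7, and 8 from the table, and the four-letter word yields four patterns, one per symbol.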

Bitmap Graphic
In this method, an image is divided into a matrix of pixels (picture elements), where each pixel is a small dot. The size of the pixel depends on the resolution. After the image is divided into pixels, each pixel is assigned a bit pattern. The size and the value of the pattern depend on the image.

To represent color images, each colored pixel is decomposed into three primary colors: red, green, and blue (RGB). The intensity of each color is then measured, and a bit pattern (usually 8 bits) is assigned to it. In other words, each pixel has three bit patterns: one for the intensity of red, one for the intensity of green, and one for the intensity of blue.

Common bitmap file formats include BMP, GIF, JPEG, PNG, TIFF, XBM, and PCX.

[Figure: Digitized Images]

Vector Graphic
The vector graphic method does not store a bit pattern for each pixel. Instead, an image is decomposed into a combination of curves and lines, and each curve or line is represented by a mathematical formula. For example, a line may be described by the coordinates of its endpoints, and a circle may be described by the coordinates of its center and the length of its radius. The combination of these formulas is stored in the computer. When the image is to be displayed or printed, the size of the image is given to the system as an input. The system redraws the image at the new size using the same formulas, so the formulas are reevaluated each time the image is drawn.

Formats that use vector graphics include WMF, PICT, EPS, SVG, SWF, and TrueType fonts.

Representing Video
To simulate motion, movies need to record (and play back) at least 12 frames per second. However, good sound quality requires 24 frames/s.

24 frames/s = 1,440 frames/minute = 86,400 frames/hour

If each frame has a resolution of 1024 x 768, there are 786,432 pixels in a frame. If the color of each pixel is stored as 24 bits (3 bytes) of data, one frame alone requires 2,359,296 bytes (about 2.25 MB) of memory. An hour of film then requires 203,843,174,400 bytes (194,400 MB, or roughly 190 GB) of storage just for the images.

Data Compression
It is important to store and transmit data efficiently, which leads computer scientists to look for ways to compress it. Data compression is a reduction in the amount of space needed to store a piece of data. The compression ratio is the size of the compressed data divided by the size of the original data.

A data compression technique can be lossless, which means the data can
be retrieved without any loss of the original information, or lossy, which means some information may be lost in the process of compaction.

As examples, consider these three techniques:
keyword encoding
run-length encoding
Huffman encoding

SUMMARY
Numbers, text, images, audio, and video are all forms of data. Computers need to process all types of data.
All data types are transformed into a uniform representation called a bit pattern for processing by computers.
A bit is the smallest unit of data that can be stored in a computer.
A bit pattern is a sequence of bits that can represent a symbol.
A byte is 8 bits.
Coding is the process of transforming data into a bit pattern.
ASCII is a popular code for symbols.
Images use the bitmap graphic or vector graphic method for data representation.
In the bitmap method, the image is broken up into pixels, which can then
be assigned bit patterns.
Audio data are transformed into bit patterns through sampling, quantization, and coding.
Video data are a set of sequential images.

EXERCISES
2-1; 2-2; 2-11; 2-12; 2-13; 2-14; 2-15
2-23; 2-24; 2-25; 2-26; 2-27
2-34; 2-35; 2-36; 2-37; 2-38; 2-39

Chapter 3: Number Representation

Number System
The decimal system is based on 10 and uses the digits 0-9.
The binary system is based on 2 and uses the digits 0-1.
Octal notation is based on 8 and uses the digits 0-7.
Hexadecimal notation is based on 16 and uses the digits 0-9 and the letters A-F.

3.1 DECIMAL AND BINARY
Two numbering systems are dominant today in the world of computers: decimal and binary.

DECIMAL SYSTEM
[Figure: decimal system]

BINARY SYSTEM
The binary system is based on 2. There are only two digits in the binary system, 0 and 1.

OCTAL NOTATION
Octal notation is based on 8. This means there are 8 symbols: 0, 1, 2, 3, 4, 5, 6, 7. Each octal digit represents 3 bits.

Bit pattern    Octal digit
000            0
001            1
010            2
011            3
100            4
101            5
110            6
111            7

[Figure: Binary to octal and octal to binary transformation]

HEXADECIMAL NOTATION
Hexadecimal notation is based on 16 (the name combines the Greek hexa, six, with the Latin decem, ten). This means there are 16 symbols (hexadecimal digits): 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F. Each hexadecimal digit represents 4 bits, and every group of 4 bits can be represented by one hexadecimal digit.

Bit pattern    Hexadecimal digit
0000           0
0001           1
0010           2
0011           3
0100           4
0101           5
0110           6
0111           7
1000           8
1001           9
1010           A
1011           B
1100           C
1101           D
1110           E
1111           F

CONVERSION
Converting from a bit pattern to hexadecimal is done by organizing the pattern into groups of four bits and finding the hexadecimal value for each group. For hexadecimal to bit pattern conversion, convert each hexadecimal digit to its 4-bit equivalent.

Hexadecimal notation is written in several formats. In the first format, a lowercase (or uppercase) x is placed before the digits, for example xA34. In the second format, the base of the number (16) is written as a subscript after the number, for example $(\mathrm{A34})_{16}$. A third convention appends the letter H, for example A34H.

Show the hexadecimal equivalent of the bit pattern 1100 1110 0010.
Each group of 4 bits is translated to one hexadecimal digit. The equivalent is xCE2.

Show the hexadecimal equivalent of the bit pattern 0011100010B.
Divide the bit pattern into 4-bit groups (from the right). In this case, add two extra 0s at the left to make the number of bits divisible by 4. This gives 0000 1110 0010, which is translated to 0E2H.

What is the bit pattern for x24C?
Write each hexadecimal digit as its equivalent bit pattern to get 0010 0100 1100.

3.2 CONVERSION

BINARY TO DECIMAL CONVERSION
Start with the binary number and multiply each binary digit by its weight. Since each binary bit can be
only 0 or 1, the result will be either 0 or the value of the weight. After multiplying all the digits, add the results.

Convert the binary number 10011 to decimal.
Write out the bits and their weights. Multiply each bit by its corresponding weight and record the result. At the end, add the results to get the decimal number.

Binary     1    0    0    1    1
Weights    16   8    4    2    1
Results    16 + 0  + 0  + 2  + 1
Decimal    19

DECIMAL TO BINARY CONVERSION
To convert from decimal to binary, use repetitive division.

Convert the decimal number 35 to binary.
Write the number at the right. Divide it repeatedly by 2, writing down the quotient and the remainder each time. The quotients move to the left, and each remainder is recorded under the quotient it produced. Stop when the quotient is zero.

Quotient     0   1   2   4   8   17   35   (decimal)
Remainder    1   0   0   0   1   1         (binary)

Reading the remainders from left to right gives 100011, so 35 in binary is 100011.

3.3 INTEGER REPRESENTATION
Integers are whole numbers (i.e., numbers without a fraction). An integer can be positive or negative.

[Figure: number line of integers, from negative to positive]

To use computer memory more efficiently, two broad categories of integer representation have been developed: unsigned integers and signed integers. Signed integers may in turn be represented in three distinct ways.

UNSIGNED INTEGERS FORMAT
An unsigned integer is an integer without a sign. Most computers define a constant called the maximum unsigned integer. An unsigned integer ranges between 0 and this constant. The maximum unsigned integer depends on the number of bits the computer allocates to store an unsigned integer.

Range: $0 \ldots (2^N - 1)$, where N is the number of bits allocated to represent one unsigned integer.

Number of bits    Range
8                 0 ... 255
16                0 ... 65,535

Representation
Storing unsigned integers is a straightforward process, as outlined
in the following steps:
1. The number is changed to binary.
2. If the number of bits is less than N, 0s are added to the left of the binary number so that there is a total of N bits.

Store 7 in an 8-bit memory location.
7 in binary is 111. Five 0s are added on the left, giving 00000111.

Store 258 in a 16-bit memory location.
258 in binary is 100000010. Seven 0s are added on the left, giving 0000000100000010.

Overflow
If you try to store an unsigned integer such as 256 in an 8-bit memory location, you get a condition called overflow.

Decimal      8-bit allocation    16-bit allocation
7            00000111            0000000000000111
234          11101010            0000000011101010
258          overflow            0000000100000010
24,760       overflow            0110000010111000
1,245,678    overflow            overflow

Interpretation
How do you interpret an unsigned binary representation in decimal? The process is simple: change the N bits from the binary system to the decimal system.

Interpret 00101011 in decimal if the number was stored as an unsigned integer.
Adding the weights of the 1 bits gives 32 + 8 + 2 + 1 = 43.

SIGNED INTEGERS FORMAT

SIGN-AND-MAGNITUDE FORMAT
In sign-and-magnitude representation, the leftmost bit defines the sign of the number. If it is 0, the number is positive. If it is 1, the number is negative.

Range: $-(2^{N-1} - 1) \ldots +(2^{N-1} - 1)$

There are two 0s in sign-and-magnitude representation, positive and negative. In an 8-bit allocation:
+0    00000000
-0    10000000

Number of bits    Range
8                 -127 ... -0, +0 ... +127
16                -32,767 ... -0, +0 ... +32,767
32                -2,147,483,647 ... -0, +0 ... +2,147,483,647

Representation
Storing a sign-and-magnitude integer is a straightforward process:
1. The number is changed to binary; the sign is ignored.
2. If the number of bits is less than N-1, 0s are added to the left of the number so that there is a total of N-1 bits.
3. If the number is positive, 0 is added to the left (to make it N bits). If the number is negative, 1 is added to the left (to make it N bits).

Store +7 in an 8-bit memory location using
sign-and-magnitude representation.
7 in binary is 111. Four 0s are added to make 7 bits (0000111), and a 0 is added on the left because the number is positive: 00000111.

Store -258 in a 16-bit memory location using sign-and-magnitude representation.
258 in binary is 100000010. Six 0s are added to make 15 bits (000000100000010), and a 1 is added on the left because the number is negative: 1000000100000010.

Interpretation
How do you interpret a sign-and-magnitude binary representation in decimal? The process is simple:
1. Ignore the first (leftmost) bit.
2. Change the N-1 bits from binary to decimal as shown at the beginning of the chapter.
3. Attach a + or a - sign to the number based on the leftmost bit.

Decimal     8-bit allocation    16-bit allocation
+7          00000111            0000000000000111
-124        11111100            1000000001111100
+258        overflow            0000000100000010
-24,760     overflow            1110000010111000

Interpret 10111011 in decimal if the number was stored as a sign-and-magnitude integer.
The leftmost bit is 1, so the number is negative. The remaining bits, 0111011, equal 32 + 16 + 8 + 2 + 1 = 59, so the number is -59.

ONE'S COMPLEMENT FORMAT
The one's complement of a number is obtained by changing all 0s to 1s and all 1s to 0s. The leftmost bit defines the sign of the number. If it is 0, the number is positive. If it is 1, the number is negative.

A positive number is represented exactly as in sign-and-magnitude format. A negative number is represented by the one's complement of the corresponding positive number.

There are two 0s in one's complement representation, positive and negative. In an 8-bit allocation:
+0    00000000
-0    11111111

Representation
Storing one's complement integers requires the following steps:
1. The number is changed to binary; the sign is ignored.
2. 0s are added to the left of the number to make a total of N bits.
3. If the sign is positive, no more action is needed. If the sign is negative, every bit is complemented (changed from 0 to 1 or from 1 to 0).

-81632-12