Scientific computing is a vast field that now influences all branches of science, and it has been developing very rapidly. Doing justice to such a field in a short introductory course is impossible; hence I shall define scientific computing very narrowly. Computations are broadly divided into two types:
The first deals with numbers, the second with symbols. You may already have come across the most common example of symbolic computation software, Mathematica. Symbolic computation is often built on a programming paradigm called functional programming, in which one deals not with data but with functions. This technique is closer to mathematical operations, and has been projected as one of the techniques that will become more and more important as our programs become more and more complex. Of course, functional programming is not limited to symbolic computations. If you are interested in what functional programming is and why it is of interest, you might start by trying out functional programming in python.
In this course we shall steer clear of both functional programming and symbolic computations. We shall be knee-deep in numbers. Our choice of language is Fortran 90. Although, strictly speaking, a language has nothing to do with the computations we do, our choice of language does set constraints on the programs we write and eventually even on the way we think. We have chosen Fortran because it is still the most commonly used language in the academic scientific community, although C++ has by now made quite an inroad. In the rest of this course, unless otherwise stated, Fortran will mean Fortran 90, neither Fortran 77 (god forbid) nor Fortran 2003.
There are three principal aims of this course:
!******************************
program empty
endprogram
!******************************
Compile this smallest possible Fortran program and run the resulting executable, which by default is called a.out.
program add_integers
implicit none
integer :: i,j,k
endprogram
!******************************
Here we have first added a "declaration" that i, j and k are integers. Historically this was not necessary in Fortran: there was an implicit convention that a variable whose name started with the letter i, j or k (or, in fact, any letter from i to n) was an integer, while all others were real numbers. We shall never use this convention, but out of a mistaken respect for backward-compatibility Fortran compilers still honour it. You can rule it out by putting the statement implicit none in the second line of your program. I strongly recommend that you use this in every piece of Fortran code that you write.
Note that we have declared the integers but we have not yet given them any value. Some compilers initialize all variables to zero, but you must not rely on this, and in some cases even zero-initialization may not be what you want. As a good programming practice, always initialize the variables you use. To check what your compiler does, write out the integers by adding a write statement to our code:
!******************************
program add_integers
implicit none
integer :: i,j,k
write(*,*) 'i,j,k', i,j,k
endprogram
!******************************
> ./a.out
i,j,k 1 222093376 32767
> ./a.out
i,j,k 1 76013632 32767
> ./a.out
i,j,k 1 245846080 32767
Clearly, on my computer these integers are initialized to random values, and each time I run the code I get a different set. (By the way, this is not a good way to actually generate random integers.) So the first lesson of today is: always initialize. This turns our code into
!******************************
program add_integers
implicit none
integer :: i=10,j=5,k=0
write(*,*) 'i,j,k', i,j,k
k=i+j
write(*,*) 'i,j,k=i+j', i,j,k
endprogram
!******************************
The gfortran compiler offers the option -finit-integer=n. Use this option with the first version of the program add_integers to initialize i, j and k to any value you want.
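For instance, assuming the source is saved in a file named add_integers.f90 (the file name is just an illustration) and the compiler is gfortran:
> gfortran -finit-integer=12345 add_integers.f90
> ./a.out
Every integer that is not explicitly initialized should then start out with the value supplied to the flag.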
!addition
k=i+j
!subtraction
k=i-j
!multiplication
k=i*j
!division
k=i/j
!exponentiation
k=i**j
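One of these operations deserves a warning: the division of one integer by another is again an integer, the fractional part being discarded (truncation towards zero). A small sketch, with the values 7 and 2 chosen just for illustration:
!******************************
program integer_division
implicit none
integer :: i=7,j=2
! Integer division truncates towards zero:
! 7/2 gives 3 and -7/2 gives -3.
write(*,*) ' 7/2 =', i/j
write(*,*) '-7/2 =', (-i)/j
endprogram
!******************************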
The results of such operations can easily overflow the default (4-byte) integers. If you need larger integers, with most compilers you can declare them as integer(8), or in the older non-standard notation, integer*8.
!remainder of integer division
k=mod(i,j)
!or
k=modulo(i,j)
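The two intrinsics differ only when their arguments have opposite signs: mod(i,j) carries the sign of i, while modulo(i,j) carries the sign of j. A one-line check, with illustrative values:
!******************************
program mod_vs_modulo
implicit none
! mod(-7,3) gives -1 (sign of the first argument),
! modulo(-7,3) gives 2 (sign of the second argument).
write(*,*) mod(-7,3), modulo(-7,3)
endprogram
!******************************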
Euclid's algorithm: start with two integers $x$ and $y$ with $x < y$. Obtain the remainder of the division of $y$ by $x$. If the remainder is zero, $x$ is the gcd. If not, replace $y$ by $x$, replace $x$ by the remainder, and repeat. If in this fashion you go all the way down to $1$, then the gcd is $1$.
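A minimal sketch of this algorithm in Fortran; the starting values 12 and 42 are just an illustration:
!******************************
program euclid_gcd
implicit none
integer :: x,y,r
x = 12
y = 42
! Replace (x,y) by (remainder,x) until the remainder
! vanishes; x is then the gcd.
do
   r = mod(y,x)
   if (r == 0) exit
   y = x
   x = r
enddo
write(*,*) 'gcd', x
endprogram
!******************************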
Any computation that you do in a computer with integers is the cleanest computation you can do. You shall appreciate this when I describe how computers deal with real numbers.
In a computer we must use a finite amount of memory to store a number. Of course, this limits the largest (in absolute value) integer you can store, but smaller integers are stored exactly. Not so in the case of real numbers. The most common way real-number arithmetic is done in a computer is by representing the numbers as floating point numbers. For example, the number $1/10$ has the floating point representation $1.00\times10^{-1}$. This particular floating point representation uses three digits and one exponent. In general the representation may look like \begin{equation} \pm d_{0}.d_1d_2\ldots d_{p-1}\times b^{e} \end{equation} where we use a representation with $p$ significant digits (together forming the ``significand''), base $b$ and exponent $e$. The representation has the following properties:
Due to (2) above, most real numbers will have an error when represented as floating point numbers. This process of constructing a floating point number from a real number is called rounding, or rounding off, and the error is called the rounding error. Let us say we want to use a floating point representation with $p=3$, $b=10$, $e_{\rm max}=2$ and $e_{\rm min}=-2$. We shall use this representation to illustrate several important issues in floating point arithmetic. Now let us try to represent the number $0.3141$ in this system. The floating point representation will be $3.14{\rm E}{-1}$. The rounding error is $0.0001$, so the {\it relative} rounding error is $\frac{0.0001}{0.3141} \approx 0.00032$. A different way of writing the same error is to say that it is $0.1$ {\it units in the last place}, or in short ${\tt ulp}$. In terms of this unit, the maximum rounding error is $0.5\,{\tt ulp}$.
Theorem 1 The relative error corresponding to $\frac{1}{2}\,{\tt ulp}$ is bounded by \begin{equation} \frac{1}{2}b^{-p} \leq \frac{1}{2}\,{\tt ulp} \leq \frac{b}{2}b^{-p} \end{equation} where we are using a floating point representation with base $b$ and $p$ significant digits.
This implies that an error that is fixed in ${\tt ulp}$ can vary, as a relative error, by a factor of $b$. This factor is called the wobble. Note that the relative rounding error is always bounded by $\varepsilon = \frac{b}{2}b^{-p}$. This number is called the machine precision. The relative error of a calculation is often given in units of $\varepsilon$.
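In Fortran you can query this number for the real kinds available on your machine with the intrinsic function epsilon(). (Strictly, the intrinsic returns $b^{1-p}$, which is twice the $\varepsilon$ defined above.) A minimal sketch:
!******************************
program machine_eps
implicit none
! epsilon(x) returns b**(1-p) for the kind of its argument.
write(*,*) 'single precision:', epsilon(1.0)
write(*,*) 'double precision:', epsilon(1.0d0)
endprogram
!******************************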
Theorem 2 For a floating point representation with base $b$ and $p$ significant digits, the relative error of a subtraction done without a guard digit can be as large as $b-1$.
The error we have noted so far comes merely from the floating-point representation of real numbers. But floating point arithmetic can magnify the error by huge amounts. The basic unit of arithmetic is addition. An addition in floating point is done by shifting the decimal point until both numbers being added have the same exponent $e$; then the significands are added. For example, let us add $1.45{\rm E}{-1}$ to $3.75{\rm E}2$. The addition proceeds as follows: \begin{eqnarray} 1.45{\rm E}{-1} = 0.00145{\rm E}2 \approx 0.00{\rm E}2 \nonumber \\ 3.75{\rm E}2 + 0.00{\rm E}2 = 3.75{\rm E}2 \end{eqnarray} The exact sum is $375.145$, which when rounded off to three significant digits is exactly $3.75{\rm E}2$, so here no harm is done.
Now consider the following subtraction: find $x-y$ where $x=10.1$ and $y=9.95$. The actual answer is $0.15$. In floating point operations, $y$ must first be shifted to the exponent of $x$, and with only $p=3$ digits the final $5$ is lost: \begin{eqnarray} x=10.1 = 1.01{\rm E}1 \nonumber \\ y=9.95 \approx 0.99{\rm E}1 \nonumber \\ x-y \approx 0.02{\rm E}1 \end{eqnarray} The correct answer is $0.15$, so the error is $50\,{\tt ulp}$: the result is wrong in every digit !
Now let us look at a slightly improved method of subtraction. In this method the smaller number is kept with one extra digit after the shift, and the result of the subtraction is then rounded off to the usual number of digits. Let us see what happens in exactly the same problem: \begin{eqnarray} x=10.1 = 1.01{\rm E}1 \nonumber \\ y=9.95 = 0.995{\rm E}1 \nonumber \\ x-y = 0.015{\rm E}1 = 1.5{\rm E}{-1} \end{eqnarray} In this case the answer is exact ! The answer being exact is a lucky break, but in general, by adding one extra digit, called the guard digit, the accuracy of subtraction can be vastly improved.
Theorem 3 If $x$ and $y$ are two floating-point numbers in a representation with base $b$ and $p$ significant digits, and if the subtraction is done with one guard digit (i.e. with $p+1$ digits), then the relative rounding error of the result is less than $2\varepsilon$.
In other words, the guard digit is useful when subtracting two numbers which are almost equal to each other. But the guard digit unfortunately is not a magic bullet. If the two numbers themselves are results of other floating point operations, then the guard digit will not help. To illustrate, let us look at the roots of the quadratic equation, \begin{equation} ax^2+bx+c = 0 \label{eq:quadratic} \end{equation} the solutions of which are \begin{equation} x = \frac{-b \pm \sqrt{b^2-4ac}}{2a} \label{eq:root_quadratic} \end{equation} Let us take $a=0.08$, $b=2.75$ and $c=23.43$. In exact arithmetic we have \begin{eqnarray} b^2 = 7.5625 \nonumber \\ ac = 1.8744 \nonumber \\ 4ac = 7.4976 \nonumber \\ b^2-4ac = 0.0649 \nonumber \\ \sqrt{b^2-4ac} \approx 0.25476 \nonumber \\ x_1,x_2 \approx -15.595, -18.780. \end{eqnarray}
Let us see what we get by floating-point operations with $p=3$: \begin{eqnarray} a = 8.00{\rm E}{-2} \nonumber \\ b = 2.75{\rm E}0 \nonumber \\ c \approx 2.34{\rm E}1 \nonumber \\ b^2 \approx 7.56{\rm E}0 \nonumber \\ ac \approx 1.87{\rm E}0 \nonumber \\ 4ac = 7.48{\rm E}0 \nonumber \\ b^2 - 4ac = 8.00{\rm E}{-2} \nonumber \\ \sqrt{b^2 - 4ac} \approx 2.83{\rm E}{-1} \nonumber \\ x_1,x_2 \approx -1.54{\rm E}1, -1.89{\rm E}1 \end{eqnarray} So the error in the square root is $0.283-0.255 = 0.028$, which is $28\,{\tt ulp}$. Note that in this process the actual subtraction of $4ac$ from $b^2$ (once each of them was in its floating point representation) was exact: $7.56-7.48=0.08$ needs no rounding. Using guard digits would not have improved this calculation at all.
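You can observe the same effect on a real computer by evaluating the discriminant in single and in double precision and comparing the two results; the little program below is a sketch of ours, not part of the calculation above:
!******************************
program cancellation
implicit none
real :: a,b,c
double precision :: ad,bd,cd
a=0.08
b=2.75
c=23.43
ad=0.08d0
bd=2.75d0
cd=23.43d0
! b*b and 4*a*c are both about 7.5 while their difference
! is only about 0.065, so the subtraction keeps the rounding
! errors already present in b*b and 4*a*c but loses the
! leading accurate digits.
write(*,*) 'single precision:', b*b-4.0*a*c
write(*,*) 'double precision:', bd*bd-4.0d0*ad*cd
endprogram
!******************************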
Here the subtraction has merely brought forth the error that was already inherent in the two terms being subtracted; such a subtraction can potentially cause most of the accurate digits to disappear and keep merely the error from previous floating point operations. This is termed catastrophic cancellation. Problems such as these must be dealt with on a case-by-case basis. For example, for two floating point numbers $x$ and $y$, which of the two following methods is better for calculating $x^2-y^2$: evaluating $x*x - y*y$ directly, or evaluating $(x+y)*(x-y)$ ?
Let us now take this example of the quadratic formula as an excuse to study the solution of quadratic equations. The equation to solve is \begin{equation} 0.08x^2 + 2.75x + 23.43 = 0 \end{equation} Consider this equation as a linear equation with the quadratic term as a mere perturbation. But to proceed with this method, let us first try an even simpler equation that can be solved by hand with this technique: \begin{equation} x^2-10x+1 = 0 \end{equation} which has the roots (up to four decimal places) $9.8990$ and $0.1010$. Write it in the following form: \begin{equation} x = \frac{1}{10}+\frac{x^2}{10} \end{equation} Now let us try an iterative method. First ignore the second term on the right. This gives us our first approximation, \begin{equation} x_1 = 0.1 \end{equation} Now the second one, \begin{equation} x_2 = 0.1 + \frac{0.1\times 0.1}{10} \end{equation} and so on: \begin{equation} x_j = 0.1 + 0.1\,x_{j-1}^2 \end{equation} Let us try to put this in a Fortran code. To begin with, let us forget about the beginning and end of the program and write its core first.
x = 0.1 + 0.1*x*x
do i=1,10
x = 0.1+0.1*x*x
enddo
do i=1,10
x = 0.1+0.1*x*x
write(*,*)'i,x',i,x
enddo
x=0.
do i=1,10
x = 0.1+0.1*x*x
write(*,*)'i,x',i,x
enddo
program root_quadratic
implicit none
integer :: i
real :: x
x=0.
do i=1,10
x = 0.1+0.1*x*x
write(*,*)'i,x',i,x
enddo
end program
i,x 1 0.10000000
i,x 2 0.10100000
i,x 3 0.10102011
i,x 4 0.10102051
i,x 5 0.10102051
i,x 6 0.10102051
i,x 7 0.10102051
i,x 8 0.10102051
i,x 9 0.10102051
i,x 10 0.10102051
program root_quadratic
implicit none
integer :: i
real :: x
real :: a,b,c
a = 0.08
b = 2.75
c = 23.43
x=0.
do i=1,10
x = -(c/b)-(a/b)*x*x
write(*,*)'i,x',i,x
enddo
end program
1 -8.5200005
2 -10.631721
3 -11.808247
4 -12.576282
5 -13.121101
6 -13.528387
7 -13.844138
8 -14.095569
9 -14.299929
10 -14.468740
program root_quadratic
implicit none
integer, parameter :: N = 100
integer :: i
real :: x,y,xmin,xmax,dx
real :: a,b,c
! The coefficients of the quadratic equation.
a=0.08
b=2.75
c=23.43
! First plot the function.
! For this we need a minimum, a maximum and
! an interval.
xmin=-20.
xmax=-10.
dx=(xmax-xmin)/float(N)
! Now generate the function for N points
! and write it to a file.
! First open a file
open(unit=10,file='quadratic')
do i=1,N
x = xmin+dx*float(i-1)
y = a*x*x + b*x + c
write(10,*)x,y
enddo
close(10)
! Now go into iterative method of finding the
! root
x=0.
do i=1,100
x = -(c/b)-(a/b)*x*x
write(*,*)'i,x',i,x
enddo
end program
Looking into the file quadratic we find that the function changes sign twice:
-18.900000 3.17995623E-02
-18.799999 5.19947615E-03
-18.699999 -1.98005978E-02
-18.600000 -4.32002284E-02
...
-15.700000 -2.58000903E-02
-15.599999 -1.19998469E-03
-15.499999 2.50001326E-02
-15.400000 5.27999885E-02
So one root lies between -18.8 and -18.7, and the other between -15.6 and -15.5. The iteration output, meanwhile, creeps slowly towards the second root:
i,x 86 -15.594783
i,x 87 -15.594830
i,x 88 -15.594872
i,x 89 -15.594910
i,x 90 -15.594944
i,x 91 -15.594975
i,x 92 -15.595004
i,x 93 -15.595030
i,x 94 -15.595054
i,x 95 -15.595075
i,x 96 -15.595094
i,x 97 -15.595111
i,x 98 -15.595127
i,x 99 -15.595141
i,x 100 -15.595155
Indeed we are converging, but the convergence is much slower than in the first problem we solved. Can you improve the convergence ?
While solving this equation we have actually found a much more general method of root finding. By just tabulating the function and looking at its values, we have found brackets for both roots. We can then simply refine ${\tt dx}$ above and find each root to higher accuracy. Let us write a code that does this; it will turn out that such a code is applicable to higher order polynomial equations and even to transcendental equations. This technique of root-finding is called a binary chop.
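Here is a minimal sketch of such a binary chop, using the bracket $(-15.6,-15.5)$ read off from the table above (the bracket and the number of halvings are choices for illustration):
!******************************
program binary_chop
implicit none
integer :: i
real :: a,b,c,xlo,xhi,xmid,flo,fmid
a=0.08
b=2.75
c=23.43
! Bracket of the root read off from the tabulated values.
xlo=-15.6
xhi=-15.5
flo = a*xlo*xlo + b*xlo + c
do i=1,20
   ! Evaluate the function at the midpoint and keep the
   ! half of the bracket in which the sign change lies.
   xmid = 0.5*(xlo+xhi)
   fmid = a*xmid*xmid + b*xmid + c
   if (flo*fmid <= 0.) then
      xhi = xmid
   else
      xlo = xmid
      flo = fmid
   endif
   write(*,*)'i,xmid',i,xmid
enddo
endprogram
!******************************
Each step halves the bracket, so about ten steps gain three decimal digits of accuracy, independently of the equation being solved.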
Before we end, let us make a last note on rounding. We have noted that to subtract two floats which are close to each other, it is useful to have a guard digit and then round off. The question then is how exactly we round off. Till now we have just been discarding the digits beyond the $p$-th. A better method is to take them into account and use them to change the $p$-th digit of our float. An obvious way is to round down if the $(p+1)$-th digit is less than $5$ and round up otherwise. Another school of thought says that an exact half should be rounded up half the time and rounded down the other half. One way of doing this is to demand that the rounded number have an even least significant digit.
Now let us look at an example: \begin{equation} x = 1.00, \quad y = -0.555 \end{equation} and the sequence \begin{eqnarray} x_0 = x, \nonumber \\ x_1 = (x_0-y)+y, \nonumber \\ x_2 = (x_1-y)+y, \nonumber \\ \cdots \nonumber \\ x_n = (x_{n-1}-y)+y. \end{eqnarray} Rounding halves up, we get the following sequence: \begin{eqnarray} x_0 = 1.00 \nonumber \\ x_1 = (1.00+0.555) - 0.555 \approx 1.56 - 0.555 = 1.005 \approx 1.01 \nonumber \\ x_2 = (1.01+0.555) - 0.555 \approx 1.57 - 0.555 = 1.015 \approx 1.02 \nonumber \\ \cdots \end{eqnarray} This makes $x$ increase by $0.01$ at every iteration. Now let us round to even. Then the sequence looks like \begin{eqnarray} x_0 = 1.00 \nonumber \\ x_1 = (1.00+0.555) - 0.555 \approx 1.56 - 0.555 = 1.005 \approx 1.00 \nonumber \\ \cdots \end{eqnarray} and there is no problem at all. Round to even is in fact the default rounding mode of IEEE floating point arithmetic.
(Topics to return to: machine precision, round-off versus truncation error; see also the picture of how real numbers are stored in a computer in the first chapter of Numerical Recipes.)
This connects us to the fascinating world of nonlinear maps. Let us start by considering the logistic map: \begin{equation} x_{n+1} = \lambda x_n (1-x_n) \end{equation} So far we have used similar iterations to solve quadratic equations, and our principal concern has been the rate of convergence of such iterations. But it may happen that an iteration does not converge to one value at all. From the point of view of finding roots this is of course a numerical nightmare, but the phenomenon is worthy of study in its own right.
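Here is a minimal sketch to play with; the value of $\lambda$ and the starting point are illustrative choices (for $\lambda$ larger than about $3.57$ the iterates never settle down):
!******************************
program logistic_map
implicit none
integer :: n
real :: x,lambda
lambda = 3.9
x = 0.1
! Iterate the map and watch whether x converges,
! oscillates, or wanders chaotically.
do n=1,20
   x = lambda*x*(1.-x)
   write(*,*)'n,x',n,x
enddo
endprogram
!******************************
Try varying $\lambda$ between $2.5$ and $4$: below $3$ the iterates converge to a single value, beyond that they first oscillate between several values and eventually become chaotic.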
Please return the solutions of these problems by midnight on Thursday, the 21st of February. The computer programs should be checked in to your directory in the Google Drive that we share. The handwritten answers should be in my mailbox by the morning of the 22nd of February, or handed over to me in class on the same day. The marks allocated to each problem are given inside parentheses. The total marks of this homework add up to $100$.
Let us use a floating point representation with base $b$ and number of significant digits $p$. Then prove Theorems 1, 2 and 3 stated above. (Hint: these three theorems, which were mentioned in class, are proved in the beautiful article "What Every Computer Scientist Should Know About Floating-Point Arithmetic" by David Goldberg, originally published in the March 1991 issue of Computing Surveys. The purpose of this exercise is to make you read this paper.)