Scripting vs regular programming#
What is a script?#
Very high-level, often short, program written in a high-level scripting language
Scripting languages:
This course: Python + a taste of Bash (Unix shell)
Why Python? It is the most popular scripting programming language#
TIOBE Index - very long term history#
Characteristics of a script#
Glue other programs together and automate tasks
Often special-purpose code
Extensive text processing
File and directory manipulation
Many small interacting scripts may yield a big system
Perhaps a special-purpose GUI on top
(Sometimes) portable across Unix, Windows, Mac
Interpreted program (no compilation+linking)
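As an illustration of the "glue" and text-processing roles listed above, here is a minimal sketch that runs another program and parses its output. A second Python process stands in for an arbitrary external tool; the three output lines and the name `table` are made up for this example.

```python
import subprocess
import sys

# Run an external program and capture what it prints.
# sys.executable (the current Python interpreter) stands in for any tool.
result = subprocess.run(
    [sys.executable, "-c", "print('alpha 1'); print('beta 2'); print('gamma 3')"],
    capture_output=True,
    text=True,
    check=True,
)

# Text-process the captured output into a dictionary
table = {}
for line in result.stdout.splitlines():
    name, value = line.split()
    table[name] = int(value)

print(table)
```

The same pattern applies to any command-line program whose output is plain text.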
Why not stick to Java or C/C++?#
Features of scripting languages compared to Java, C/C++ and Fortran:
shorter, more high-level programs
much faster software development
more convenient programming
you feel more productive:
no variable declarations, but lots of consistency checks at run time
technical details are hidden: no pointers, automatic garbage collection, …
easy to combine software components and interact with the OS
lots of standardized libraries and tools
Scripts yield short code#
Consider reading real numbers from a file, where each line can contain an arbitrary number of real numbers:
1.1 9 5.2
1.762543E-02
0 0.01 0.001
9 3 7
Python solution:
with open("myfile.txt") as infile:  # the file is closed automatically
    n = infile.read().split()
print(n)
['1.1', '9', '5.2', '1.762543E-02', '0', '0.01', '0.001', '9', '3', '7']
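The list above still holds strings. A minimal sketch of the conversion to floating-point numbers follows, assuming the same four-line file. The file is written to a temporary directory first so that the example is self-contained; the names `path` and `numbers` are illustrative.

```python
import os
import tempfile

# Recreate the sample file from above in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "myfile.txt")
with open(path, "w") as f:
    f.write("1.1 9 5.2\n1.762543E-02\n0 0.01 0.001\n9 3 7\n")

# Read all whitespace-separated tokens and convert each one to float
with open(path) as f:
    numbers = [float(token) for token in f.read().split()]

print(len(numbers), "numbers read, largest is", max(numbers))
```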
Using regular expressions (1)#
Suppose we want to read complex numbers written as text
(-3, 1.4)
or (-1.437625E-9, 7.11)
or ( 4, 2 )
Python solution:
import re
m = re.search(r"\(\s*([^,]+)\s*,\s*([^,]+)\s*\)", "( -3,1.4)")
real, imag = (float(x) for x in m.groups())  # avoid the name re, which would shadow the module
print("Real:", real, "Imag:", imag)
Real: -3.0 Imag: 1.4
(This only finds the first match of the regular expression; use re.findall to return a list of all matches.)
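A minimal sketch of the `re.findall` variant, applied to a string that contains all three sample pairs from above. The variable names `text`, `pattern`, `pairs` and `numbers` are illustrative.

```python
import re

text = "(-3, 1.4) or (-1.437625E-9, 7.11) or ( 4, 2 )"
pattern = r"\(\s*([^,]+)\s*,\s*([^,]+)\s*\)"

# With two groups in the pattern, findall returns one
# (real, imag) tuple of strings per match
pairs = re.findall(pattern, text)

# float() ignores surrounding whitespace, so "2 " converts fine
numbers = [complex(float(r), float(i)) for r, i in pairs]
print(numbers)
```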
Using regular expressions (2)#
Regular expressions like
\(\s*([^,]+)\s*,\s*([^,]+)\s*\)
constitute a powerful language for specifying text patterns
Doing the same thing, without regular expressions, in Fortran and C requires quite some low-level code at the character array level
Remark: we could read pairs like (-3, 1.4) without using regular expressions:
s = "(-3, 1.4 )"
real, imag = (float(x) for x in s[1:-1].split(","))
Script variables are not declared#
Example of a Python function:
import os
def debug(leading_text, variable):
    if os.environ.get("MYDEBUG", "0") == "1":
        print(leading_text, variable)
Dumps any printable variable (number, list, hash, heterogeneous structure)
Printing can be turned on/off by setting the environment variable MYDEBUG
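A short usage sketch of the function above. Here MYDEBUG is set from within the script through `os.environ`, purely for demonstration; in practice it would typically be set in the shell before the script starts.

```python
import os


def debug(leading_text, variable):
    # Print only when the environment variable MYDEBUG is set to "1"
    if os.environ.get("MYDEBUG", "0") == "1":
        print(leading_text, variable)


os.environ["MYDEBUG"] = "1"  # switch debugging output on
debug("a number:", 3.14)
debug("a list:", [1, "two", 3.0])
debug("a dict:", {"key": (1, 2)})

os.environ["MYDEBUG"] = "0"  # switch it off again
debug("not shown:", 42)
```

The same function accepts a number, a list and a dictionary without any type declarations.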
The same function in C++#
Templates can be used to mimic dynamically typed languages
Not as quick and convenient programming:
#include <cstdlib>
#include <iostream>
#include <string>

template <class T>
void debug(std::ostream &o, const std::string &leading_text,
           const T &variable) {
  const char *c = std::getenv("MYDEBUG");
  bool defined = false;
  if (c != nullptr) {  // if MYDEBUG is defined ...
    if (std::string(c) == "1") {  // if MYDEBUG is true ...
      defined = true;
    }
  }
  if (defined) {
    o << leading_text << " " << variable << std::endl;
  }
}
The relation to OOP#
Object-oriented programming can also be used to parameterize types.
Introduce a base class A and a range of subclasses, all with a (virtual) print function
Let debug work with variable as an A reference
Now debug works for all subclasses of A
Advantage: complete control of the legal variable types that debug is allowed to print (may be important in big systems to ensure that a function can only make transactions with certain objects)
Disadvantage: much more work, much more code, less reuse of debug on new occasions
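The class-based design can be sketched in Python as well, to keep to one language. This is an illustration only: the subclasses `Scalar` and `Pair` are hypothetical, the MYDEBUG check is left out for brevity, and an explicit `isinstance` test plays the role that the static type of an A reference would play in C++.

```python
class A:
    """Base class: everything debug may print derives from A."""

    def print(self):
        raise NotImplementedError


class Scalar(A):  # hypothetical subclass
    def __init__(self, value):
        self.value = value

    def print(self):
        return f"Scalar({self.value})"


class Pair(A):  # hypothetical subclass
    def __init__(self, first, second):
        self.first, self.second = first, second

    def print(self):
        return f"Pair({self.first}, {self.second})"


def debug(leading_text, variable):
    # Only objects of type A (or a subclass) are legal arguments
    if not isinstance(variable, A):
        raise TypeError("debug only accepts instances of A")
    text = f"{leading_text} {variable.print()}"
    print(text)
    return text


debug("s:", Scalar(2))
debug("p:", Pair(1, "x"))
```

Every new type must now be written as a subclass of A before debug accepts it, which is the extra work mentioned above.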
Flexible function interfaces (1)#
User-friendly environments (Python, Matlab, Maple, Mathematica, S-Plus, …) allow flexible function interfaces
First try:
# f is some data
plot(f)
More control of the plot:
plot(f, label='f', xrange=[0,10])
More fine-tuning:
plot(f, label='f', xrange=[0,10], title='f demo',
     linetype='dashed', linecolor='red')
Flexible function interfaces (2)#
In C++, some flexibility is obtained using default argument values, e.g.,
void plot(const double data[], const char label[] = "",
          const char title[] = "", const char linecolor[] = "black");
Limited flexibility, since the order of arguments is significant.
Python uses keyword arguments = function arguments with keywords and default values, e.g.,
def plot(data, label='', xrange=None, title='',
linetype='solid', linecolor='black', ...)
The sequence and number of arguments in the call can be chosen by the user.
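A minimal sketch of how such calls behave. The function body is a stand-in that only returns the settings it received; it is not a real plotting routine, and the elided arguments of the signature above are simply left out.

```python
def plot(data, label="", xrange=None, title="",
         linetype="solid", linecolor="black"):
    """Stand-in for a plotting routine: return the settings in effect."""
    return {
        "npoints": len(data),
        "label": label,
        "xrange": xrange,
        "title": title,
        "linetype": linetype,
        "linecolor": linecolor,
    }


f = [0.0, 0.5, 1.0]  # some data

a = plot(f)                              # all defaults
b = plot(f, linecolor="red", label="f")  # keywords in any order ...
c = plot(f, label="f", linecolor="red")  # ... give the same result

print(a["linecolor"], b == c)
```

Arguments that are not mentioned in the call keep their default values, so a call only needs to name what differs from the defaults.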
Classification of languages (1)#
Many criteria can be used to classify computer languages.
Dynamically vs statically typed (or type-safe)#
Python (dynamic):
c = 1 # c is an integer
c = [1,2,3] # c is a list
C (static):
double c; c = 5.2; // c can only hold doubles
c = "a string..."; // compiler error
Classification of languages (2)#
Weakly vs strongly typed#
Perl (weak):
$b = '1.2';
$c = 5*$b; # implicit type conversion: '1.2' -> 1.2
Python (strong):
import math
b = "1.2"
# c = 5*b # legal, but probably not the result you want
# c = math.exp(b) # illegal, no implicit type conversion
c = math.exp(float(b)) # legal
print(c)
3.3201169227365472
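The two commented-out lines above can be tried safely as follows. This is a small sketch; the names `repeated` and `message` are illustrative.

```python
import math

b = "1.2"

# Legal, but * means sequence repetition for strings, not arithmetic
repeated = 5 * b
print(repeated)

# Illegal: math.exp does not convert a string to a number implicitly
try:
    math.exp(b)
except TypeError as err:
    message = str(err)
    print("TypeError:", message)

# Explicit conversion works
c = math.exp(float(b))
print(c)
```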
Classification of languages (3)#
More classifications:
Interpreted vs compiled languages
High-level vs low-level languages (Python-C)
Scripting vs system languages
Turning files into code (1)#
Code can be constructed and executed at run-time
Consider an input file with the syntax
a = 1.2
no of iterations = 100
solution strategy = 'implicit'
c1 = 0
c2 = 0.1
A = 4
How can we read this file and define variables a, no_of_iterations, solution_strategy, c1, c2, A with the specified values?
Turning files into code (2)#
The answer lies in this short and generic code:
file = open("inputfile.dat")
for line in file:
    variable, value = line.split("=")  # separate the statement by the = sign
    variable = variable.strip()  # strip leading and trailing blanks
    variable = variable.replace(" ", "_")  # replace blanks by _
    exec(variable + "=" + value)  # magic...
print(A)  # noqa
4
This cannot be done in Fortran, C or C++! Why?
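A variant of the same idea, sketched under different design assumptions: `ast.literal_eval` replaces `exec`, and the values are collected in a dictionary (hypothetical name `params`) instead of becoming variables. This accepts only Python literals such as numbers and quoted strings, so the input cannot run arbitrary code. The file contents are inlined as a list of lines to keep the sketch self-contained.

```python
import ast

# The lines of the input file shown earlier
lines = [
    "a = 1.2",
    "no of iterations = 100",
    "solution strategy = 'implicit'",
    "c1 = 0",
    "c2 = 0.1",
    "A = 4",
]

params = {}
for line in lines:
    name, value = line.split("=", 1)           # split at the first = sign
    name = name.strip().replace(" ", "_")      # blanks -> underscores
    params[name] = ast.literal_eval(value.strip())  # text -> Python object

print(params["A"], params["solution_strategy"])
```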
Scripts can be slow#
Perl and Python scripts are first compiled to byte-code.
The byte-code is then interpreted.
Text processing is usually as fast as in C.
Loops over large data structures might be very slow:
for i in range(len(A)):
    A[i] = ...
Fortran, C and C++ compilers are good at optimizing such loops at compile time and produce very efficient assembly code (e.g. 100 times faster).
Fortunately, long loops in scripts can easily be migrated to Fortran or C.
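A rough way to see the effect without leaving Python is to compare an explicit loop with the built-in `sum`, whose loop runs in compiled C code inside the CPython interpreter. This is only a sketch; the helper `loop_sum` is illustrative, and the timings depend on the machine and the Python version.

```python
import timeit

A = list(range(1_000_000))


def loop_sum(values):
    # Explicit index loop, executed step by step by the interpreter
    total = 0
    for i in range(len(values)):
        total += values[i]
    return total


# Time three repetitions of each variant
t_loop = timeit.timeit(lambda: loop_sum(A), number=3)
t_builtin = timeit.timeit(lambda: sum(A), number=3)

print(f"explicit loop: {t_loop:.3f} s, built-in sum: {t_builtin:.3f} s")
```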
Scripts may be fast enough#
Read 100 000 (x,y) data pairs from a file and write (x,f(y)) out again
Pure Python: 4s
Pure Perl: 3s
Pure Tcl: 11s
Pure C (fscanf/fprintf): 1s
Pure C++ (iostream): 3.6s
Pure C++ (buffered streams): 2.5s
Numerical Python modules: 2.2s (!)
Remark: in practice, 100 000 data points are written and read in binary format, resulting in much smaller differences
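A sketch of what the pure Python version of this task might look like. The slides do not specify f or the file format, so the function `f`, the file names and the two-column text layout are all assumptions made for illustration. No timings are claimed.

```python
import math
import os
import tempfile


def f(y):
    # Hypothetical transformation; the original f is not given
    return math.log(1.0 + y * y)


workdir = tempfile.mkdtemp()
infile = os.path.join(workdir, "xy.dat")
outfile = os.path.join(workdir, "xfy.dat")

# Generate 100 000 (x, y) pairs as text, one pair per line
n = 100_000
with open(infile, "w") as out:
    for i in range(n):
        out.write(f"{i * 0.001:.6e} {i * 0.002:.6e}\n")

# Read each pair and write (x, f(y)) to the output file
with open(infile) as src, open(outfile, "w") as dst:
    for line in src:
        x, y = (float(token) for token in line.split())
        dst.write(f"{x:.6e} {f(y):.6e}\n")

# Count the lines that were written
with open(outfile) as result:
    n_written = sum(1 for _ in result)

print(n_written, "lines written")
```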
When scripting is convenient#
The application’s main task is to connect together existing components
The design of the application code is expected to change significantly
The application performs extensive string/text manipulation
The application can be made short if it operates heavily on list or hash structures
CPU-time intensive parts can be migrated to C/C++ or Fortran
When to use C, C++, Java, Fortran#
Does the application implement complicated algorithms and data structures?
Does the application manipulate large datasets so that execution speed is critical?
Are the application’s functions well-defined and changing slowly?
Will type-safe languages be an advantage, e.g., in large development teams?
Some personal applications of scripting#
Get the power of Unix also in non-Unix environments
Automate manual interaction with the computer
Customize your own working environment and become more efficient
Increase the reliability of your work (what you did is documented in the script)
Have more fun!
Some business applications of scripting#
Many business sectors make use of scripting languages internally:
Financial sector (Model prototyping, R&D)
Mobile App & Web companies (Development language)
Engineering (Setup of simulation models, R&D)
Python/Bash knowledge is a welcome skill for many jobs.
What about mission-critical operations?#
Scripting languages are free
What about companies that do mission-critical operations?
Can we use Python when sending people (or robots) to Mars?
Who is responsible for the quality of products?
The reliability of scripting tools#
Scripting languages are developed as a world-wide collaboration of volunteers (open source model)
The open source community as a whole is responsible for the quality
There is a single repository for the source code (plus mirror sites)
This source is read, tested and controlled by a very large number of people (and experts)
The reliability of large open source projects like Linux, Python, and Perl appears to be very good - at least as good as commercial software