CSC447: Lecture 1

Overview of Course

Overview of Programming Languages

Implementation Strategies

Describing Syntax

Imperative Programming

Introduction to Perl

Overview of Course

Why study Programming Languages?

Learn new languages more easily

Choose appropriate languages

Program better

Design better languages

Contact Information

Office: CS&T Center 604

Office Hours: Wed 4:30 - 5:30, Thu 4:30 - 5:30

Phone: (312) 362-8322, Fax: (312) 362-6116

Email: ajeffrey@cs.depaul.edu

Course Web Page: http://klee.cs.depaul.edu/csc447/

Languages taught

Imperative: Perl

Functional: ML

Object-oriented: Java

Texts

Recommended:

Ravi Sethi, Programming Languages: Concepts and Constructs, Addison-Wesley, 1996.

Optional:

Randal L. Schwartz, Erik Olson & Tom Christiansen, Learning Perl on Win32 Systems, O'Reilly and Associates, 1997.

Ken Arnold and James Gosling, The Java Programming Language, Second Edition, 1998.

Larry Wall, Tom Christiansen & Randal L. Schwartz, Programming Perl, 2nd Edition, O'Reilly 1997.

Jeffrey D. Ullman, Elements of ML Programming (ML97 Edition), Prentice Hall, 1998.

Prerequisites

C++ Programming (CSC215 or CSC225)

Foundations of Computer Science (CSC415, CSC416, CSC417)

Operating Systems (CSC343)

Computer Architecture (CSC345)

Grading

Weekly homeworks: total 50% of course marks.

Mid-term: 25% of course marks.

Final: 25% of course marks.

Students are expected to write their own assignments. The penalty for plagiarism is a grade of F in the course.

Late homework will not be accepted.

Web Pages

http://klee.cs.depaul.edu/csc447/

Course Page

Announcements

Syllabus

Lectures

Homework Assignments

Useful Links

Video Students

Same due dates and assignments

Must come to Midterm and Final

Get assignments from the web page

Keep in touch (email, office, telephone)

Come to class when you can

Overview of Programming Languages

Toward Higher-level Languages

Binary machine language: 0101100 01110101 10011000...

Assembly language: MOV L1 R1; MOV L2 R2; ADD; MOV R3 L3

Imperative programming: z := x + y;

Structured programming: while, for, but goto considered harmful.

Functional programming: := considered harmful.

OO programming: encapsulate state and code together in objects.

Other paradigms...

Imperative Programming Languages (1957-)

Fortran I, Fortran II, Fortran IV, Fortran 77, Fortran 90

ALGOL 58, ALGOL 60, ALGOL W

Pascal, MODULA-2, MODULA-3, Oberon

CPL, BCPL, B, C, ANSI C

BASIC, Visual Basic

COBOL

Ada

Perl

Functional Programming Languages (1958-)

Lisp, Scheme, Common Lisp

ML, Standard ML, CAML, ML97

Lazy ML

Miranda

Haskell, Gofer

Object-Oriented Programming Languages (1967-)

Simula

Smalltalk

C++

Eiffel

Java

Other Programming Paradigms

Logic

Concurrent

Constraint

Visual

Some Language Evaluation Criteria

Syntax

Control Structures

Data types

Support for Abstraction

Expressiveness

Type System

Efficiency

Some Influences on Programming Language Design

Application Domain

Available Hardware

Programming Methodology

Implementation Methods

Programming Environments

Translating from high-level to low-level languages

Programming languages are much higher-level now than in the 1950s.

Machine architecture hasn't changed much (additions: virtual memory, caches, pipelines).

How do we get from high-level:

  x := y + z;

to low-level:

  MOV Y R1
  MOV Z R2
  ADD
  MOV R3 X
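As a rough illustration of the translation step, the sketch below turns one assignment of the form target := left + right into the MOV/ADD notation used above. The function name compile_assignment is hypothetical, and the sketch assumes that ADD sums R1 and R2 and leaves the result in R3. A real compiler handles arbitrary expressions and does far more work than this.

```python
def compile_assignment(statement):
    """Translate 'x := y + z;' into a list of pseudo-assembly lines."""
    # Split 'x := y + z;' into the target and the two operands.
    target, expression = statement.rstrip(";").split(":=")
    left, right = expression.split("+")
    target, left, right = target.strip(), left.strip(), right.strip()
    return [
        f"MOV {left.upper()} R1",    # load the first operand
        f"MOV {right.upper()} R2",   # load the second operand
        "ADD",                       # assumed: R3 = R1 + R2
        f"MOV R3 {target.upper()}",  # store the result in the target
    ]

for line in compile_assignment("x := y + z;"):
    print(line)
```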

Compilers

Compilers use the traditional compile/link/run strategy:

Examples: C, C++, ML.

Interpreters

Interpreters execute the source code directly:

Examples: BASIC, Perl, Tcl/Tk, ML.

Virtual Machines

Virtual machines use an intermediate byte code rather than native object code:

Examples: Pascal p-code, Java.
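To make the idea of byte code concrete, here is a minimal sketch of a stack-based virtual machine. The opcodes LOAD, ADD and STORE are invented for this example and are not the actual instruction set of Pascal p-code or of the Java VM.

```python
def run(bytecode, variables):
    """Execute a list of (opcode, argument) pairs on a tiny stack machine."""
    stack = []
    for opcode, argument in bytecode:
        if opcode == "LOAD":       # push the value of a variable
            stack.append(variables[argument])
        elif opcode == "ADD":      # pop two values, push their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "STORE":    # pop a value into a variable
            variables[argument] = stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return variables

# Hypothetical byte code for x := y + z
program = [("LOAD", "y"), ("LOAD", "z"), ("ADD", None), ("STORE", "x")]
print(run(program, {"y": 2, "z": 3}))  # {'y': 2, 'z': 3, 'x': 5}
```

The same byte code runs unchanged wherever the run function is available, which is the portability argument for virtual machines.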

Just-in-time compilers

Just-in-time compilers are like VMs, but compile the byte code to native code at run time.

Examples: Java.

Describing Syntax

Structure of Programming Languages

Syntax: does it `look like a program'?

Static semantics: does it compile?

Dynamic semantics: what does it do when it runs?

Programming Language Descriptions

Tutorials

Reference Manuals

Formal Definitions:

In this course, we shall concentrate on syntax, and `wave our hands' about statics and dynamics.

Lexical Structure

The lexical structure of a program defines the lexical tokens of the language, including keywords, identifiers, literals, operators and punctuation.

For example, in:

  return -x;

the tokens are return, -, x and ;

The lexical structure also says which characters should be ignored, such as white space or comments.
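A lexer along these lines can be sketched with regular expressions. The token kinds and the // comment syntax below are assumptions made for this example, not the lexical structure of any particular language.

```python
import re

# Hypothetical token kinds, tried in order; "skip" covers ignored characters.
TOKEN_KINDS = [
    ("skip",        r"\s+|//[^\n]*"),
    ("keyword",     r"return\b"),
    ("identifier",  r"[A-Za-z][A-Za-z0-9_]*"),
    ("number",      r"[0-9]+"),
    ("operator",    r"[-+*/<>]"),
    ("punctuation", r";"),
]
TOKEN_PATTERN = re.compile(
    "|".join(f"(?P<{kind}>{regex})" for kind, regex in TOKEN_KINDS)
)

def tokenize(source):
    """Return a list of (kind, text) pairs, dropping white space and comments."""
    tokens = []
    position = 0
    while position < len(source):
        match = TOKEN_PATTERN.match(source, position)
        if match is None:
            raise SyntaxError(f"unexpected character {source[position]!r}")
        if match.lastgroup != "skip":
            tokens.append((match.lastgroup, match.group()))
        position = match.end()
    return tokens

print(tokenize("return -x;  // negate x"))
```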

Lexical structure

A language's lexical structure is given by a grammar.

For example:

  <identifier> ::= <alphabetic> { <alphanumeric> }
  <alphanumeric> ::= <alphabetic> | <numeric> | _
  <alphabetic> ::= a-z | A-Z
  <numeric> ::= 0-9

The grammar defines a language, that is, a set of words. For example, the language includes:

  a  a1  aFish  A_FISH  

but not:

  @  1a  a$nake  _A_FOUL
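The identifier grammar above can be checked mechanically. The sketch below writes it as a regular expression (one possible encoding, with the helper name is_identifier chosen for this example) and tries it on the words listed above.

```python
import re

# <identifier> ::= <alphabetic> { <alphanumeric> }
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

def is_identifier(word):
    """True if the whole word is generated by the identifier grammar."""
    return IDENTIFIER.fullmatch(word) is not None

for word in ["a", "a1", "aFish", "A_FISH", "@", "1a", "a$nake", "_A_FOUL"]:
    print(word, is_identifier(word))
```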

A quick reminder about grammars coming up...

Grammars

You have already seen grammars in CSC417.

A grammar is given by definitions of nonterminals:

  <foo> ::= { a | b }

A grammar defines a language: in this case any sequence of a's and b's (including the empty sequence):

  a  b  aa  ab  ba  bb  aaa  aab ...

Extended BNF grammars

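The syntax for BNF we will use is: nonterminals written in angle brackets such as <foo>, definitions written with ::=, alternatives separated by |, and terminals written literally (quoted where they could be confused with grammar symbols).

EBNF adds some useful syntax sugar: { <stuff> } for zero or more repetitions of <stuff>, and [ <stuff> ] for an optional <stuff>.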

In addition, we shall use ranges such as a-z or 0-9.

For example:

  <integer> ::= [ '+' | '-' ] <digit> { <digit> }
  <digit> ::= 0-9
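EBNF constructs map quite directly onto code: an optional part becomes an if, and a repetition becomes a loop. The hand-written recognizer below is a sketch of that correspondence for the integer grammar above. The function name is_integer is hypothetical.

```python
def is_integer(text):
    """Recognize <integer> ::= [ '+' | '-' ] <digit> { <digit> }."""
    digits = "0123456789"
    position = 0
    # [ '+' | '-' ]: the optional sign becomes an if.
    if position < len(text) and text[position] in "+-":
        position += 1
    # <digit>: at least one digit is required.
    if position == len(text) or text[position] not in digits:
        return False
    position += 1
    # { <digit> }: the repetition becomes a loop.
    while position < len(text) and text[position] in digits:
        position += 1
    # The whole input must have been consumed.
    return position == len(text)

print(is_integer("-105"))  # True
print(is_integer("4-2"))   # False
```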

Converting EBNF to BNF

EBNF is just syntax sugar to make BNF more readable.

We can rewrite:

  <foo> ::= { <stuff> }
  <bar> ::= [ <stuff> ]

as:

  <foo> ::= <empty> | <stuff> <foo>
  <bar> ::= <empty> | <stuff>
  <empty> ::= 

We can unfold 0-9 as 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9.

Expressions

Programs have more structure than just a stream of tokens.

Tokens can be built together to form expressions:

An expression consists of an operator and a collection of subexpressions.

Abstract syntax trees

Abstract syntax trees (ASTs) show the structure of an expression independent of the original programming language.

For example x + y is written differently in different languages:

but they all have the same AST:

In general, the AST for op(E1, ..., Ek) is:
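One way to hold such a tree in a program is a node with an operator and a tuple of subexpressions. The sketch below builds the AST for x + y and prints it in three concrete notations. The Node class and the choice of infix, Lisp-style prefix and postfix are illustrative assumptions, not a record of the original figure.

```python
from dataclasses import dataclass

@dataclass
class Node:
    """An AST node: an operator applied to a tuple of subexpressions."""
    op: str
    children: tuple = ()   # a leaf such as x has no subexpressions

def infix(node):           # C, Java or Perl style (binary operators only)
    if not node.children:
        return node.op
    left, right = node.children
    return f"{infix(left)} {node.op} {infix(right)}"

def prefix(node):          # Lisp style
    if not node.children:
        return node.op
    return "(" + " ".join([node.op] + [prefix(c) for c in node.children]) + ")"

def postfix(node):         # stack-language style
    if not node.children:
        return node.op
    return " ".join([postfix(c) for c in node.children] + [node.op])

tree = Node("+", (Node("x"), Node("y")))
print(infix(tree))    # x + y
print(prefix(tree))   # (+ x y)
print(postfix(tree))  # x y +
```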

Abstract syntax trees

Abstract syntax trees `scale up' to entire programs.

For example:

  if (x > 0) { return x; } else { return -x; }

is drawn:

Trees larger than this are stored inside compilers, not drawn by people!

Grammars

Expression languages are specified using grammars.

For example:

  <exp> ::= <identifier> | <literal> | <unary> <exp> | <exp> <binary> <exp>
  <binary> ::= '<' | '>' | '+' | '-' | ...
  <unary> ::= '-' | ...

This generates a language including:

  x+1  x+y+z  1+2-3  ...

From the grammar and an expression we can generate a parse tree.

For example x + y + z is parsed:

Abstract vs Concrete syntax

The grammar defines a concrete syntax.

The parse trees contain a lot more detail than the abstract syntax trees.

Compare:

with:

Ambiguity

We could have parsed x + y + z as:

Such grammars are ambiguous, which is a bad thing!

We need to resolve the ambiguity by specifying whether + associates to the right:

or to the left:

This is either done informally, or by editing the grammar.
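For + on integers both trees happen to give the same value, so the sketch below uses - (also a binary operator in the grammar above) to show why the choice of associativity matters. Trees are written as nested tuples, a representation chosen for this example.

```python
def evaluate(tree, env):
    """Evaluate a tree given as a variable name or (operator, left, right)."""
    if isinstance(tree, str):
        return env[tree]
    operator, left, right = tree
    a, b = evaluate(left, env), evaluate(right, env)
    return a + b if operator == "+" else a - b

env = {"x": 1, "y": 2, "z": 3}
left_assoc = ("-", ("-", "x", "y"), "z")    # (x - y) - z
right_assoc = ("-", "x", ("-", "y", "z"))   # x - (y - z)
print(evaluate(left_assoc, env))   # -4
print(evaluate(right_assoc, env))  # 2
```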

Ambiguity

How many ways can we parse x + y * z?

Most languages include precedence of operators.

For example, we can specify that * is of higher precedence than + by saying:

  <exp> ::= <plusexp>
  <plusexp> ::= <timesexp> | <plusexp> '+' <plusexp>
  <timesexp> ::= <atomexp> | <timesexp> '*' <timesexp>
  <atomexp> ::= <identifier> | <literal> | '(' <exp> ')'

Now how many ways are there to parse x + y * z?
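A small recursive-descent parser shows the layering at work: one function per precedence level, with the lower-precedence level calling the higher one. This is a sketch under two assumptions that go beyond the grammar above: repeated + or * is grouped to the left, and atoms include identifiers as well as literals. Trees are returned as nested tuples.

```python
import re

def parse(text):
    """Parse + and * expressions into nested (operator, left, right) tuples."""
    tokens = re.findall(r"[A-Za-z][A-Za-z0-9_]*|[0-9]+|[+*()]", text)
    position = 0

    def peek():
        return tokens[position] if position < len(tokens) else None

    def advance():
        nonlocal position
        if position >= len(tokens):
            raise SyntaxError("unexpected end of input")
        position += 1
        return tokens[position - 1]

    def plus_exp():      # lowest precedence: <timesexp> { '+' <timesexp> }
        tree = times_exp()
        while peek() == "+":
            advance()
            tree = ("+", tree, times_exp())
        return tree

    def times_exp():     # higher precedence: <atomexp> { '*' <atomexp> }
        tree = atom_exp()
        while peek() == "*":
            advance()
            tree = ("*", tree, atom_exp())
        return tree

    def atom_exp():      # identifier, literal or parenthesized expression
        token = advance()
        if token == "(":
            tree = plus_exp()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return tree
        return token

    tree = plus_exp()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse("x + y * z"))    # ('+', 'x', ('*', 'y', 'z'))
print(parse("(x + y) * z"))  # ('*', ('+', 'x', 'y'), 'z')
```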

Ambiguity

With the grammar:

  <statement> ::= if <exp> then <statement>
               |  if <exp> then <statement> else <statement>

how do we parse:

  if A then if B then C else D
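The grammar allows the else to belong to either if. A common convention, used for example by C and Java, is to attach it to the nearest if. The sketch below implements that convention with a greedy parser. It assumes conditions and simple statements are single tokens, and the function name parse_statement is hypothetical.

```python
def parse_statement(tokens):
    """Parse a token list; return (tree, remaining tokens)."""
    if tokens and tokens[0] == "if":
        condition = tokens[1]
        if tokens[2] != "then":
            raise SyntaxError("expected 'then'")
        then_branch, rest = parse_statement(tokens[3:])
        else_branch = None
        # Greedy choice: an 'else' is claimed by the innermost open 'if'.
        if rest and rest[0] == "else":
            else_branch, rest = parse_statement(rest[1:])
        return ("if", condition, then_branch, else_branch), rest
    # Anything else is treated as a one-token statement.
    return tokens[0], tokens[1:]

tree, rest = parse_statement("if A then if B then C else D".split())
print(tree)  # ('if', 'A', ('if', 'B', 'C', 'D'), None)
```

The other reading, in which the else belongs to the outer if, would be the tree ('if', 'A', ('if', 'B', 'C', None), 'D').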

Parser tools

Parsing programs by hand is painful.

Writing programs to parse programs is still painful (there are lots of pitfalls, and writing an efficient parser is a black art...)

Use a parser generator instead.

Parser generators take an input which is part programming language and part grammar.

For example Sun's JavaCC tool takes input like:

  void exp() : {} {
    literal() | unary() exp() | exp() binary() exp()
  }

Parser generators can also spot ambiguity, and generate efficient parsers.

Imperative Programming

Common Features of Imperative Languages

Imperative languages are based on assignment.

Size and layout of data structures are determined statically (at compile-time).

Storage is allocated and deallocated explicitly.

The type declaration determines the data representation.

Types in Imperative languages

Base types:

Compound types:

Some languages also supply common Abstract Datatypes such as enumeration types, hash tables, lists, etc.

Issues with Types

`Good design' for imperative languages:

Perl: a sample imperative language

How `well-designed' is Perl?

Perl:

i.e. it breaks every rule in language design.

Larry Wall (the author of Perl) says:

The language is intended to be practical (easy to use, efficient, complete) rather than beautiful (tiny, elegant, minimal).

Language features

Powerful string manipulation.

Associative arrays (aka hash tables) built in.

Powerful libraries for internet programming, generating on-the-fly web pages, graphics, etc.

Quick prototyping (it's an interpreted language).

`Write-only programming language'.

Great for sysadmin tasks.

Why system administrators are a lot like Santa Claus (from rec.humor.funny)

When you ask Santa for something, the odds of receiving what you wanted are infinitesimal.

Santa seldom answers your mail.

When you ask Santa where he gets all the stuff he's got, he says, `Elves make it for me'

Santa doesn't care about your deadlines.

Nobody knows who Santa has to answer to for his actions.

Santa laughs entirely too much.

Santa thinks nothing of breaking into your $HOME.

Only a lunatic says bad things about Santa in his presence.

Pragmatics

Perl is available for download, see the Useful Links section of the web page.

Download and install it.

Create a Perl program in your favourite text editor:

  print "Hello world!\n";

Save it as a text file named hw.pl.

Run perl on it at the command line:

  perl hw.pl

That's it!

Next Class

More Perl

More issues in imperative programming