Python help

Erik Price erikprice at mac.com
Mon Feb 9 19:10:49 EST 2004


On Feb 8, 2004, at 9:31 PM, p.lussier at comcast.net wrote:

>> Java requires even more verbosity.
>
> This is my general impression of Java.  Is the verbosity a good thing
> or not?  It seems verbose to the point of redundancy.  Is this
> helpful, or does it just get in the way?

The answer to that is that it's a matter of perspective, but I don't 
think that Java would be as successful as it is if a majority of people 
found it -too- redundant.  There's no doubt about it -- the verbosity 
of Java is overkill for simpler tasks like what you've accomplished 
with your Python script, especially when there are languages like 
Python, Perl, Ruby, and bash which can make this kind of thing a lot 
easier.  And, if you know you'll only ever invoke the program on a Unix 
machine, you can do what another poster suggested and simply glue a 
bunch of existing Unix tools together like awk, sed, grep, find, et 
cetera, reducing the burden of actually programming the task to a level 
of merely asking other tools to perform certain actions on the text.  
But when things get a bit more complicated, this verbosity can be 
helpful.  (see below)


>> It took longer to write, even though I had already
>> prototyped the design in Python (the two designs are nearly 
>> identical),
>
> Was it just the verbosity of Java which made it take so long?

Yes and no.  No if you mean did the verbosity take longer to type.  
Yes, because Java requires a great deal more syntax to say the same 
thing that can be said with less syntax in Python, and subtle issues 
surfaced when trying to compile the code and then run it.  For 
instance, there were numerous times I was attempting to use a class 
without having imported it first.  Another thing is that the verbosity 
of Java means that there is more text to be conscious of using 
correctly, so that right there leads to greater potential to make a 
mistake.

I wrote that Java implementation in a text editor, which is a usable 
but relatively primitive tool for a language that can be as verbose as 
Java.  It would have been faster to use an IDE.  At work we use WSAD, 
which speeds development by automatically importing any classes I 
attempt to use, automatic method completion, popup API documentation 
(sparing me a trip to the docs), realtime compiler error-checking, and 
other luxuries.  So, using a text editor, I'd make a change and then 
jump back to the shell and type the command to compile the file.  This 
doesn't take long, because I use keyboard shortcuts and the command 
history, but then I have to examine the compiler output when there's a 
problem and jump back to the text editor and fix the mistake -- an IDE 
will highlight the erroneous code, making it much faster to figure out 
what you've done wrong.

But Python offers the interactive interpreter, which is a godsend when 
trying to debug a problem or even just sketching out the script.  If 
you write a Python script and invoke it with the -i option (python -i 
scriptname), then after the script completes you are automatically 
dropped into the interactive interpreter, and any variables and 
functions your script defined at the top level are available in that 
session, so you can invoke functions using arbitrary arguments, inspect 
the values of variables, and enjoy other conveniences.
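
To make that concrete, here is a minimal sketch.  The file name 
groups.py, the function, and the sample data are all made up for 
illustration; this is not the script discussed earlier in the thread.

```python
# groups.py: hypothetical example script, invented for illustration

def member_count(line):
    """Return the number of members in one /etc/group-style line."""
    members = line.strip().split(":")[3]
    return len(members.split(",")) if members else 0

# Top-level names like these are what remain available after `python -i`.
sample = "wheel:x:10:root,erik"
count = member_count(sample)
```

Running "python -i groups.py" should leave you at the >>> prompt once 
the script finishes, with sample, count, and member_count still 
defined, so you can try something like member_count("staff:x:50:") 
without editing the file.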

(Yeah, I know that Real Programmers (tm) use vi/emacs/ed, but Real 
Programmers also don't consider Java a Real Programming Language. ;)

>> and IMHO would also be more work to modify/extend.  That said, if
>> handed a several million-line application written by some other
>> development team, I would rather the application be written in Java
>> than Python.
>
> Why?  Performance, cleaner code, more robust language?  What makes
> Java better than Python for some things?  What types of things is
> Java best at?

In this particular case, the reason I say that is because Java is a 
statically typed language, and Python is dynamically typed.  There are 
a lot of arguments about which is better, and I won't say one is better 
than the other for all occasions -- but I happen to find a statically 
typed language like Java to be easier to read once the application 
exceeds a certain level of complexity.  I anticipate some dissent on 
this topic, mind you.  But when I'm trying to navigate my way through a 
twisted legacy framework of poorly-written source code to find a bug, 
it's nice to see the type declarations reminding me that foo is a 
FileManager and bar is a BufferedXmlMessage.

Of course, static type declarations are a pain in the ass in smaller 
programs, or programs that I'll be writing entirely myself, since I 
can't enjoy such flexibilities as this:

for item in iterableSequence:
     item.doSomething()

Where iterableSequence could be a reference to a file object (so "item" 
would be a line), or a database resultset object (so "item" would be a 
row), or an object representing a list of users (so "item" would be a 
user), etc.  Unlike Java, Python doesn't require that your objects 
actually declare that they implement a specific interface, so you can 
substitute any object that will accept the message you are sending to 
it ("doSomething") -- the compiler never checks to make sure that 
"item" is the right type before attempting to send it the "doSomething" 
message.  Of course, if "item" doesn't support that message, you'll get 
an exception, which you can trap if you want or allow to propagate 
upward.
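
A small runnable sketch of the same idea follows.  The User class and 
the shout function are hypothetical names of my own choosing; the 
point is only that a file-like object and a list of unrelated objects 
both work with the same loop, and that the failure shows up as an 
exception at call time.

```python
import io

class User:
    """Hypothetical class; it only needs an upper() method to fit in."""
    def __init__(self, name):
        self.name = name

    def upper(self):
        return self.name.upper()

def shout(iterable_sequence):
    # No interface declaration is checked here; each item just has to
    # respond to upper() at the moment the call is made.
    return [item.upper() for item in iterable_sequence]

from_file = shout(io.StringIO("alpha\nbeta\n"))   # items are lines
from_users = shout([User("erik"), User("paul")])  # items are User objects

try:
    shout([1, 2, 3])          # ints have no upper() method
except AttributeError as exc:
    # Trapped here; without the try block it would propagate upward.
    error_message = str(exc)
```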

As for performance, cleaner code, robustness of the language: I do not 
have any evidence on hand, though no doubt there's plenty to be found 
on the web.  But I would bet that in many situations, Python is 
comparable to Java in performance, in the sense that the difference 
between Python and Java is nowhere near as great as the difference 
between Python and C or Java and C.  Clean code can be written in 
Python or Java, though I would say that Python's easier to read in the 
short run, but I stand by my assertion that for truly complex 
applications, Java is easier to navigate.  And about robustness, I 
think they're probably pretty comparable.

The other really important thing about Java to keep in mind is that 
it's designed to be binary-compatible across platforms.  I realize that 
in reality this doesn't always work quite as cleanly as Sun would like 
to have you believe, but the very code I compiled on my MacOSX machine 
when I finally tested the working version of EtcGroupToLdif should work 
just fine on your Linux box or even a Windows box.  Should.  There are 
plenty of stories of incompatibilities, and supposedly it's best to 
compile your code on the architecture in which it's intended to be 
deployed.  But overall, Java code really can be deployed on just about 
every major operating system.  Now, it's true that most Perl, Python, 
Ruby, and PHP scripts can too -- and if you write them without making 
use of OS-specific features, they should work even more seamlessly than 
Java.  But Sun does its best to provide a Java-specific way to do 
everything.  The earlier suggestion about gluing a bunch of command 
line tools together to do something would be great for a shell script, 
since that's what it's designed for.  It would work fine for a Python 
script, though Python often provides its own libraries for doing some 
of those things, and you would really limit yourself to a single OS by 
using them (Unix).  But in Java, it's actually a pain in the ass to 
step outside of the Java mindset and work in collaboration with 
OS-specific services and tools.  In this sense, Java doesn't really fit 
in well with the Unix philosophy -- when writing Java code, it's more 
accurate to think of Java as its -own- platform or operating system 
that happens to run on top of other platforms.  It even provides its 
own multi-threaded programming model, which AFAIU doesn't directly 
correspond to the threading model of the OS that the program happens to 
be running on.

By contrast, if you tell a Python programmer that "Python's too slow 
for my project", then s/he will probably say "write it in Python, then 
find the bottleneck and implement that part in C".  This makes perfect 
sense, assuming you're not trying to distribute the software as a 
wholly platform-neutral shrinkwrapped product.  I would argue that most 
Python programs bigger than a homebrewed script/project are deployed by 
at least somewhat knowledgeable Python users themselves, who can at 
least try to diagnose a problem, or who will know which forum/mailing 
list to turn to for help if there's a problem.  I bet they are less 
likely to require a support contract from an ISV before trying an 
application.  And I bet that in a majority of situations, the software 
is being custom-developed or customized for a set of end users, so 
having a truly shrinkwrappable product is not an objective of the 
development effort anyway.
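
For the "find the bottleneck" step, one common approach (an assumption 
on my part, not something the advice above specifies) is the profiler 
in the standard library of current Python versions.  A rough sketch, 
with slow_sum and main as hypothetical stand-ins for real program 
code:

```python
import cProfile
import io
import pstats

def slow_sum(n):
    """Deliberately naive loop standing in for a real hot spot."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def main():
    return slow_sum(200_000)

# Profile one run of main() and collect the timing data.
profiler = cProfile.Profile()
profiler.enable()
result = main()
profiler.disable()

# Write the five most expensive entries (by cumulative time) to a string.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
profile_text = report.getvalue()
```

Printing profile_text shows which functions dominate the run time; 
those are the candidates for a rewrite in C, while the rest of the 
program stays in Python.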

>> import java.io.BufferedReader;
>> import java.io.File;
>> import java.io.FileReader;
>> import java.io.IOException;
>>
>> import java.util.Arrays;
>> import java.util.Iterator;
>> import java.util.List;
>> import java.util.SortedSet;
>> import java.util.TreeSet;
>
> This almost seems ridiculous :)  9 library imports, it seems that
> some should be so commonly used that they'd just be built-in
> or at least combined into something like stdio.

Each one of these entities is a class, and the classes actually came 
from only two different packages -- java.io and java.util.  In all 
fairness, I should say that I could have simply written:

import java.io.*;
import java.util.*;

It's just that using wildcard imports is a pet peeve of mine, because 
it means that when someone's reading the code and wants to read the 
documentation on each class being used, they aren't sure whether 
TreeSet is from the java.io package or the java.util package.  (In this 
case, it's pretty obvious that a Set comes from java.util, but in 
giganto classes with dozens of imported classes, it's not always so 
easy.)  Also, if you're working on a substantial Java application then 
you're probably using an IDE, which does all the importing for you (and 
even tells you when you've imported a class that's not really being 
used).  IDEs also make it much easier to figure out whether a class 
came from one package or the other even when wildcard imports -are- 
being used.

In fact, I have to say this about Java -- it's a great language for 
very large applications composed of many hundreds or thousands of 
classes IMHO, but it's also a language that pretty much demands an IDE.  
I learned Java using a text editor and the command line, but those two 
tools don't scale up well to real application development, and really 
don't make the most of your time when there are some great free open 
source IDEs out there that can basically take a lot of the "work" out 
of using Java.  WSAD is based on the open source Eclipse project, and 
Eclipse is simply an amazing tool if you're a Java developer.

But I have to say this about Python -- if your project is one that is 
manageable using Python, then I personally feel it's the better 
language for the job.  It's a lot simpler than Java (or other languages 
IMHO), and I really believe in keeping it simple.  But it's got a lot of 
the things I like about Java, such as a clean OO paradigm and strong 
typing, and a bunch of things that Java just doesn't offer -- such as 
the ability to use functions, dynamic typing/method dispatch, and just 
general ease of use.

Although I know a couple of other languages, and make my living using 
Java, I use Python for nearly *everything* I do in my own time, and 
even to write quick tools that I use at work to make my life easier.

> Thanks a bunch.  Part of trouble with learning languages is the lack
> of "real world" applications to try them out with.  One reason I know
> perl so well is because the language is designed to do exactly what I
> do all the time; text munging.  Something C and Java aren't
> especially efficient at it seems :)

Perl is the ideal language for text munging IMHO.  It has regexes built 
right into the language's syntax, a lot of conveniences for reading 
text and files and reformatting text too.  I'd say that between an 
equally skilled Perl programmer and Python programmer, the Perl 
programmer will get a text munging task done quicker, simply for the 
very reason that you mention, which is that the language was designed 
with this task in mind.  But I'm glad you're giving it a try in Python.  
I personally do all my text munging in Python (and am even starting to 
forget some of the Perlisms that used to be second nature, such as 
whether the regex match operator is =~ or ~= , or what the various 
quoting operators are [qq, qr, qx, etc]) and have no problems, nor do I 
feel less productive than I was when I used Perl for that purpose.  
Some things are even easier [for me] in Python, like using an XML 
parser.
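
As a rough illustration of what text munging looks like in Python, 
here is a sketch that uses the standard re module where Perl would use 
its built-in match operator.  The pattern and the parse_group function 
are hypothetical, written for /etc/group-style lines; they are not 
taken from the script discussed in this thread.

```python
import re

# Hypothetical pattern for lines shaped like name:password:gid:members
GROUP_LINE = re.compile(r"^(?P<name>[^:]+):[^:]*:(?P<gid>\d+):(?P<members>.*)$")

def parse_group(line):
    """Return (name, gid, members) for a matching line, else None."""
    match = GROUP_LINE.match(line.strip())
    if match is None:
        return None
    # An empty member field yields an empty list rather than [""].
    members = [m for m in match.group("members").split(",") if m]
    return match.group("name"), int(match.group("gid")), members

parsed = parse_group("wheel:x:10:root,erik\n")
```

It takes a few more lines than the Perl equivalent would, since the 
regex has to be compiled and the match object queried explicitly, but 
the named groups keep the fields readable.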

I guess every language has its real strengths and weaknesses.  It seems 
to me that C is great when you need really fast code, but I find it 
harder to work with than Java in terms of maintaining and debugging 
problems.  Perl is great for text munging, but I hate reading other 
people's Perl code (except that of Randal Schwartz, who writes really 
clean and understandable Perl) and I really dislike Perl's OO model.  
Python has a clean syntax and OO model, but isn't a particularly fast 
language if you need speed, and I don't think it's quite as 
maintainable as Java for very very large programs.  Java is overall a 
pretty well-designed language, but not designed for scripting, nor as 
fast as straight C for most tasks.  Some people say that Java's biggest 
advantage is when an HR dept is looking at your resume.

;)


Erik



