Python help
Erik Price
erikprice at mac.com
Mon Feb 9 19:10:49 EST 2004
On Feb 8, 2004, at 9:31 PM, p.lussier at comcast.net wrote:
>> Java requires even more verbosity.
>
> This is my general impression of Java. Is the verbosity a good thing
> or not? It seems verbose to the point of redundancy. Is this
> helpful, or does it just get in the way?
The answer to that is that it's a matter of perspective, but I don't
think that Java would be as successful as it is if a majority of people
found it -too- redundant. There's no doubt about it -- the verbosity
of Java is overkill for simpler tasks like what you've accomplished
with your Python script, especially when there are languages like
Python, Perl, Ruby, and bash which can make this kind of thing a lot
easier. And, if you know you'll only ever invoke the program on a Unix
machine, you can do what another poster suggested and simply glue a
bunch of existing Unix tools together like awk, sed, grep, find, et
cetera, reducing the burden of actually programming the task to a level
of merely asking other tools to perform certain actions on the text.
But when things get a bit more complicated, this verbosity can be
helpful. (see below)
>> It took longer to write, even though I had already
>> prototyped the design in Python (the two designs are nearly
>> identical),
>
> Was it just the verbosity of Java which made it take so long?
Yes and no. No, if you mean that the verbosity took longer to type.
Yes, because Java requires a great deal more syntax to say the same
thing that can be said with less syntax in Python, and subtle issues
surfaced when trying to compile the code and then run it. For
instance, there were numerous times I was attempting to use a class
without having imported it first. Another thing is that the verbosity
of Java means that there is more text to be conscious of using
correctly, so that right there leads to greater potential to make a
mistake.
I wrote that Java implementation in a text editor, which is a usable
but relatively primitive tool for a language that can be as verbose as
Java. It would have been faster to use an IDE. At work we use WSAD
(IBM's WebSphere Studio Application Developer), which speeds development
with automatic imports of any classes I attempt to use, method
completion, popup API documentation (sparing me a trip to the docs),
realtime compiler error-checking, and other luxuries. So, using a text
editor, I'd make a change and then
jump back to the shell and type the command to compile the file. This
doesn't take long, because I use keyboard shortcuts and the command
history, but then I have to examine the compiler output when there's a
problem and jump back to the text editor and fix the mistake -- an IDE
will highlight the erroneous code, making it much faster to figure out
what you've done wrong.
But Python offers the interactive interpreter, which is a godsend when
trying to debug a problem or even just sketching out a script. If
you write a Python script and invoke it with the -i option (python -i
scriptname), then after the script completes you are automatically
dumped into the interactive interpreter, and any variables and functions
defined in your script are available in that session, so you can invoke
functions with arbitrary arguments, inspect the values of variables, and
so on.
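
To make that concrete, here is a minimal sketch of the kind of script I
mean. The file name (scratch.py), the function name, and the sample line
are all invented for illustration; they are not from the script discussed
earlier in this thread.

```python
# scratch.py -- hypothetical example script; every name here is made up.

def parse_group_line(line):
    """Split one /etc/group-style line into (name, gid, members)."""
    name, _password, gid, members = line.strip().split(":")
    # An empty member field yields [""], so filter out empty strings.
    return name, int(gid), [m for m in members.split(",") if m]

sample = "staff:x:50:alice,bob"
parsed = parse_group_line(sample)
```

Run it as "python -i scratch.py" and you land at the >>> prompt with
"sample", "parsed", and parse_group_line all still defined, so you can
poke at the result or call the function again with a different line.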
(Yeah, I know that Real Programmers (tm) use vi/emacs/ed, but Real
Programmers also don't consider Java a Real Programming Language. ;)
>> and IMHO would also be more work to modify/extend. That said, if
>> handed a several million-line application written by some other
>> development team, I would rather the application be written in Java
>> than Python.
>
> Why? Performance, cleaner code, more robust language? What makes
> Java better than Python for some things? What types of things is
> Java best at?
In this particular case, the reason I say that is because Java is a
statically typed language, and Python is dynamically typed. There are
a lot of arguments about which is better, and I won't say one is better
than the other for all occasions -- but I happen to find a statically
typed language like Java to be easier to read once the application
exceeds a certain level of complexity. I anticipate some dissent on
this topic, mind you. But when I'm trying to navigate my way through a
twisted legacy framework of poorly-written source code to find a bug,
it's nice to see the type declarations reminding me that foo is a
FileManager and bar is a BufferedXmlMessage.
Of course, static type declarations are a pain in the ass in smaller
programs, or programs that I'll be writing entirely myself, since I
can't enjoy such flexibilities as this:
    for item in iterableSequence:
        item.doSomething()
Where iterableSequence could be a reference to a file object (so "item"
would be a line), or a database resultset object (so "item" would be a
row), or an object representing a list of users (so "item" would be a
user), etc. Unlike Java, Python doesn't require that your objects
actually declare that they implement a specific interface, so you can
substitute any object that will accept the message you are sending to
it ("doSomething") -- the compiler never checks to make sure that
"item" is the right type before attempting to send it the "doSomething"
message. Of course, if "item" doesn't support that message, you'll get
an exception at runtime, which you can trap if you want or allow to
propagate upward.
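
Here is a small sketch of that idea in runnable form. I am using the
built-in string method upper() as a stand-in for "doSomething", and the
UserList class and shout_all function are hypothetical names made up for
this example.

```python
import io

class UserList:
    """Hypothetical container; it only needs to be iterable."""
    def __init__(self, names):
        self.names = names

    def __iter__(self):
        return iter(self.names)

def shout_all(iterable_sequence):
    """Send the same message (upper) to every item, whatever its type."""
    results = []
    for item in iterable_sequence:
        try:
            results.append(item.upper())
        except AttributeError:
            # The item does not support the message; trap it and move on.
            results.append(None)
    return results

# A file-like object yields lines; a UserList yields whatever it holds.
from_file = shout_all(io.StringIO("first line\nsecond line\n"))
from_users = shout_all(UserList(["erik", 42]))
```

Nothing declares an interface here. The loop works on the file-like
object and on the UserList alike, and the integer 42 only causes trouble
at the moment upper() is actually sent to it.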
As for performance, cleaner code, robustness of the language: I do not
have any evidence on hand, though no doubt there's plenty to be found
on the web. But I would bet that in many situations, Python is
comparable to Java in performance, in the sense that the difference
between Python and Java is nowhere near as great as the difference
between Python and C or Java and C. Clean code can be written in
Python or Java, though I would say that Python's easier to read in the
short run, but I stand by my assertion that for truly complex
applications, Java is easier to navigate. And about robustness, I
think they're probably pretty comparable.
The other really important thing about Java to keep in mind is that
it's designed to be binary-compatible across platforms. I realize that
in reality this doesn't always work quite as cleanly as Sun would like
to have you believe, but the very code I compiled on my MacOSX machine
when I finally tested the working version of EtcGroupToLdif should work
just fine on your Linux box or even a Windows box. Should. There are
plenty of stories of incompatibilities, and supposedly it's best to
compile your code on the architecture on which it's intended to be
deployed. But overall, Java code really can be deployed on just about
every major operating system. Now, it's true that most Perl, Python,
Ruby, and PHP scripts can too -- and if you write them without making
use of OS-specific features, they should work even more seamlessly than
Java. But Sun does its best to provide a Java-specific way to do
everything. The earlier suggestion about gluing a bunch of command
line tools together to do something would be great for a shell script,
since that's what the shell is designed for. It would work fine for a
Python script too, though Python often provides its own libraries for
doing some of those things, and by calling out to the Unix tools you
would really limit yourself to a single OS (Unix). But in Java, it's
actually a pain in the ass to
step outside of the Java mindset and work in collaboration with
OS-specific services and tools. In this sense, Java doesn't really fit
in well with the Unix philosophy -- when writing Java code, it's more
accurate to think of Java as its -own- platform or operating system
that happens to run on top of other platforms. It even provides its
own multi-threaded programming model, which AFAIU doesn't directly
correspond to the threading model of the OS that the program happens to
be running on.
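
As an aside on the point about Python providing its own libraries: a
rough stand-in for something like "find | grep" can be written with
nothing but os.walk and the re module, and it should run unchanged on
any OS that has Python. This is only a sketch; the function name
grep_tree and its parameters are my own invention.

```python
import os
import re

def grep_tree(root, pattern, suffix=".txt"):
    """Return (path, line number, line) for matching lines under root."""
    regex = re.compile(pattern)
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            if not filename.endswith(suffix):
                continue
            path = os.path.join(dirpath, filename)
            # errors="replace" keeps odd bytes from aborting the scan.
            with open(path, encoding="utf-8", errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if regex.search(line):
                        hits.append((path, lineno, line.rstrip("\n")))
    return hits
```

Something like grep_tree(".", r"^error") would then list every line
starting with "error" in the .txt files below the current directory.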
By contrast, if you tell a Python programmer that "Python's too slow
for my project", then s/he will probably say "write it in Python, then
find the bottleneck and implement that part in C". This makes perfect
sense, assuming you're not trying to distribute the software as a
wholly platform-neutral shrinkwrapped product. I would argue that most
Python programs bigger than a homebrewed script/project are deployed by
at least somewhat knowledgeable Python users themselves, who can at
least try to diagnose a problem, or who will know which forum/mailing
list to turn to for help if there's a problem. I bet they are less
likely to require a support contract from an ISV before trying an
application. And I bet that in a majority of situations, the software
is being custom-developed or customized for a set of end users, so
having a truly shrinkwrappable product is not an objective of the
development effort anyway.
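
For the "find the bottleneck" step, the standard library's cProfile and
pstats modules are the usual starting point. The sketch below assumes a
current Python 3, and the functions being profiled (main, cheap_step,
slow_sum_of_squares) are toy examples I made up to have something to
measure.

```python
import cProfile
import io
import pstats

def slow_sum_of_squares(n):
    """Deliberately naive loop; this is the intended hot spot."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def cheap_step():
    return sum(range(100))

def main():
    cheap_step()
    return slow_sum_of_squares(200_000)

def profile_report(func):
    """Run func under the profiler and return the stats as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats()
    return buffer.getvalue()

report = profile_report(main)
```

Printing "report" shows one row per function with its call count and
timings. Whatever dominates the cumulative time is the candidate for a
rewrite in C; everything else can stay in Python.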
>> import java.io.BufferedReader;
>> import java.io.File;
>> import java.io.FileReader;
>> import java.io.IOException;
>>
>> import java.util.Arrays;
>> import java.util.Iterator;
>> import java.util.List;
>> import java.util.SortedSet;
>> import java.util.TreeSet;
>
> This almost seems rediculous :) 9 library imports, it seems that
> some should be so commonly used that they'd just be built-in
> or at least combined into something like stdio.
Each one of these entities is a class, and the classes actually came
from only two different packages -- java.io and java.util. In all
fairness, I should say that I could have simply written:
import java.io.*;
import java.util.*;
It's just that using wildcard imports is a pet peeve of mine, because
it means that when someone's reading the code and wants to read the
documentation on each class being used, they aren't sure whether
TreeSet is from the java.io package or the java.util package. (In this
case, it's pretty obvious that a Set comes from java.util, but in
giganto classes with dozens of imported classes, it's not always so
easy.) Also, if you're working on a substantial Java application then
you're probably using an IDE, which does all the importing for you (and
even tells you when you've imported a class that's not really being
used). IDEs also make it much easier to figure out whether a class
came from one package or the other even when wildcard imports -are-
being used.
In fact, I have to say this about Java -- it's a great language for
very large applications composed of many hundreds or thousands of
classes IMHO, but it's also a language that pretty much demands an IDE.
I learned Java using a text editor and the command line, but those two
tools don't scale up well to real application development, and really
don't make the most of your time when there are some great free open
source IDEs out there that can basically take a lot of the "work" out
of using Java. WSAD is based on the open source Eclipse project, and
Eclipse is simply an amazing tool if you're a Java developer.
But I have to say this about Python -- if your project is one that is
manageable using Python, then I personally feel it's the better
language for the job. It's a lot simpler than Java (or other languages
IMHO), and I really believe in keeping things simple. But it's got a lot
of the things I like about Java, such as a clean OO paradigm and strong
typing, and a bunch of things that Java just doesn't offer -- such as
standalone functions outside of any class, dynamic typing/method
dispatch, and just general ease of use.
Although I know a couple of other languages, and make my living using
Java, I use Python for nearly *everything* I do in my own time, and
even to write quick tools that I use at work to make my life easier.
> Thanks a bunch. Part of trouble with learning languages is the lack
> of "real world" applications to try them out with. One reason I know
> perl so well is because the language is designed to do exactly what I
> do all the time; text munging. Something C and Java aren't
> especially efficient at it seems :)
Perl is the ideal language for text munging IMHO. It has regexes built
right into the language's syntax, plus a lot of conveniences for reading
text and files and for reformatting text. I'd say that between an
equally skilled Perl programmer and Python programmer, the Perl
programmer will get a text munging task done quicker, simply for the
very reason that you mention, which is that the language was designed
with this task in mind. But I'm glad you're giving it a try in Python.
I personally do all my text munging in Python (and am even starting to
forget some of the Perlisms that used to be second nature, such as
whether the regex match operator is =~ or ~= , or what the various
quoting operators are [qq, qr, qx, etc]) and have no problems, nor do I
feel less productive than I was when I used Perl for that purpose.
Some things are even easier [for me] in Python, like using an XML
parser.
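
To give a feel for what regex-based munging looks like in Python, here
is a sketch that turns one /etc/group-style line into an LDIF-style
entry using the re module. It is not the script from earlier in this
thread; the function name, the pattern, and the base DN
(ou=Group,dc=example,dc=com) are assumptions made up for illustration.

```python
import re

# name:password:gid:comma,separated,members
GROUP_LINE = re.compile(
    r"^(?P<name>[^:]+):[^:]*:(?P<gid>\d+):(?P<members>.*)$"
)

def group_to_ldif(line, base_dn="ou=Group,dc=example,dc=com"):
    """Return an LDIF-style entry for one group line, or None if no match."""
    match = GROUP_LINE.match(line.strip())
    if match is None:
        return None
    name = match.group("name")
    entry = [
        "dn: cn=%s,%s" % (name, base_dn),
        "objectClass: posixGroup",
        "cn: " + name,
        "gidNumber: " + match.group("gid"),
    ]
    # Skip the empty string produced by a group with no members.
    for member in match.group("members").split(","):
        if member:
            entry.append("memberUid: " + member)
    return "\n".join(entry)
```

Where Perl would use =~ and $1, $2, and so on, Python hands back a match
object and you ask it for the named groups. It's a little more typing,
but I find it easier to read six months later.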
I guess every language has its real strengths and weaknesses. It seems
to me that C is great when you need really fast code, but I find it
harder to work with than Java in terms of maintaining and debugging
problems. Perl is great for text munging, but I hate reading other
people's Perl code (except that of Randal Schwartz, who writes really
clean and understandable Perl) and I really dislike Perl's OO model.
Python has a clean syntax and OO model, but isn't a particularly fast
language if you need speed, and I don't think it's quite as
maintainable as Java for very very large programs. Java is overall a
pretty well-designed language, but not designed for scripting, nor as
fast as straight C for most tasks. Some people say that Java's biggest
advantage is when an HR dept is looking at your resume.
;)
Erik