Those who know me are aware that I’ve been following the Play framework, and actively taking part in its community, for a couple of years.
Play framework 2.0 is right around the corner, and its core is written in Scala, so it’s a wonderful opportunity to give this object-oriented / functional hybrid beast a try…
Like many others, I will pick a very simple script to take my first steps…
Finding an excuse to give Scala a try
A couple of friends and I are translating the Play framework documentation to Spanish (have a look at it at http://playdoces.appspot.com/; by the way, you are more than welcome to collaborate with us).
The documentation is composed of a bunch of .textile files, and I had a very simple and silly bash script to track our progress. Every file that has not yet been translated has the phrase “todavía no ha sido traducida” (“has not been translated yet”) in its first line.
    echo pending: `grep "todavía no ha sido traducida" * | wc -l` / `ls | wc -l`
Which printed the number of pending files, a slash, and the total number of files.
Pretty simple, right?
I just wanted to develop a simple Scala script to count the translated files, and also their size, to know how much work we had ahead.
Scala as a scripting language
Using Scala as a scripting language is pretty simple. Just enter some Scala code in a text file, and execute it with “scala file.scala”.
You can also try it with the interactive interpreter, better known as the REPL (well, it’s not really an interpreter but a Read-Evaluate-Print Loop, which is where the name REPL comes from).
On Linux, you can also execute scripts directly from the shell by marking the Scala file as executable and adding a short shell header to the beginning of the file.
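The scala command documents a shell preamble for exactly this purpose. Here is a minimal sketch, assuming the Scala 2 script runner and a scala launcher on the PATH; the println line and the file name hello.scala are placeholders for illustration, not part of the original script:

```scala
#!/bin/sh
exec scala "$0" "$@"
!#
// Everything between the first line and "!#" is executed by the shell, which
// re-launches this same file with scala. The Scala code starts below.
println("Hello from a Scala script, arguments: " + args.mkString(", "))
```

After chmod +x hello.scala, the script should run as ./hello.scala, with any extra arguments passed through to args.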
Classes and type inference in Scala
So I created a DocumentationFile class, with name, length and isTranslated properties.
    import java.io.{BufferedReader, File, FileReader}

    class DocumentationFile(val file: File) {

      val name = file.getName
      val length = file.length
      // a file counts as translated when its first line lacks the "not yet translated" marker
      val isTranslated = (firstLine.indexOf("Esta página todavía no ha sido traducida al castellano") == -1)

      def firstLine = new BufferedReader(new FileReader(file)).readLine
    }
Scala takes away a lot of boilerplate code. The constructor is right
there, along with the class declaration. In our case, the
DocumentationFile constructor takes a java.io.File as argument.
Scala also makes heavy use of type inference to relieve us of having to declare every variable’s type. That’s why you don’t have to specify that name is a String, length a Long and isTranslated a Boolean. You still have to declare the types of method arguments, but you can usually omit them everywhere else.
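To make the inference visible, here is a small sketch that writes the same members twice, once with every type spelled out and once without. AnnotatedFile, InferredFile and hasTextileExtension are made-up names for illustration and are not part of the script:

```scala
import java.io.File

// Hypothetical variant with every type written explicitly.
class AnnotatedFile(val file: File) {
  val name: String = file.getName
  val length: Long = file.length
  val hasTextileExtension: Boolean = name.endsWith(".textile")
}

// Same members relying on inference; the compiler derives the same types.
class InferredFile(val file: File) {
  val name = file.getName                                  // String
  val length = file.length                                 // Long
  val hasTextileExtension = name.endsWith(".textile")      // Boolean
}
```

Only the constructor argument needs its type declared in both versions, because it is a parameter.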
Working with collections
Next I needed to get all textile files from the current directory,
instantiate a DocumentationFile for each of them, and save them in an
Array for later processing.
    import java.io.File

    val docs = new File(".").listFiles
      .filter(_.getName.endsWith(".textile"))   // process only textile files
      .map(new DocumentationFile(_))
Technically speaking, it’s just one line of code. The “_” is just syntactic sugar; we could have written it in a more verbose way like this:
    val docs = new File(".").listFiles
      .filter( file => file.getName.endsWith(".textile") )   // process only textile files
      .map( file => new DocumentationFile(file) )
Or, if you are a curly braces fan:
    val docs = new File(".").listFiles
      .filter { file =>
        file.getName.endsWith(".textile")   // process only textile files
      }
      .map { file =>
        new DocumentationFile(file)
      }
Higher order functions
Once we have all textile files, we’ll need the translated ones.
    val translated = docs.filter(_.isTranslated)
Here we are passing the filter method a function as a parameter (a method that takes a function as an argument is called a higher-order function). That function is evaluated for every item in the Array, and if it returns true, that item is added to the resulting Array. The “_.isTranslated” bit is once again just syntactic sugar. We could also have written the function as follows:
    val translated = docs.filter( (doc: DocumentationFile) => doc.isTranslated )
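The same idea can be tried without any files on disk, for instance by pasting the following into the REPL. It is only a sketch: Doc, sampleDocs, keep and countWhere are made-up names standing in for the real DocumentationFile data.

```scala
// Stand-in for DocumentationFile so the example needs no files.
case class Doc(name: String, isTranslated: Boolean)

val sampleDocs = Array(
  Doc("home.textile", isTranslated = true),
  Doc("routes.textile", isTranslated = false),
  Doc("model.textile", isTranslated = true)
)

// Three equivalent ways of handing filter its predicate.
val withPlaceholder = sampleDocs.filter(_.isTranslated)
val withLambda = sampleDocs.filter(doc => doc.isTranslated)
val keep: Doc => Boolean = doc => doc.isTranslated   // a function stored in a val
val withNamedValue = sampleDocs.filter(keep)

// A small higher-order function of our own: it receives the predicate as an argument.
def countWhere(docs: Array[Doc], predicate: Doc => Boolean): Int =
  docs.filter(predicate).length
```

All three filter calls select the same two documents; countWhere simply shows that writing a function that accepts another function takes nothing more than a function-typed parameter.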
Functional versus imperative: To var or not to var
Now I need to calculate the number and size of the translated and not yet translated files. Counting the files is pretty easy: I just have to use “translated.length” to know how many files have been translated so far. But to get their size I have to sum the size of each one of them.
This was my first attempt:
    var translatedLength = 0L
    translated.foreach( translatedLength += _.length )
In Scala we can declare variables with the “var” and “val” keywords. The former are mutable, while the latter are immutable. Mutable variables are read-write, while immutable variables can’t be reassigned once their value has been established (think of them as final variables in Java).
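A minimal sketch of the difference, using a hypothetical totalSize helper that sums file sizes the imperative way (the names are illustrative only):

```scala
// Sums sizes with a mutable accumulator; sizes stands in for the file lengths.
def totalSize(sizes: Array[Long]): Long = {
  val initial = 0L    // val: bound once, like a final variable in Java
  // initial = 1L     // would not compile: reassignment to val
  var sum = initial   // var: may be reassigned
  sizes.foreach(size => sum += size)
  sum
}
```

Uncommenting the reassignment of initial makes the compiler reject the code, whereas sum can be updated freely.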
While Scala allows you to work in an imperative or a functional style, it really encourages the latter. Programming in Scala, kind of the Scala bible, even teaches how to refactor your code to avoid mutable variables and get your head used to a more functional programming style.
These are several ways I’ve found to calculate it in a more functional style (thanks to Stack Overflow!):
    // four alternatives; a real script would keep only one of them

    // explicit types everywhere (foldLeft, because the accumulator is a Long and the elements are not)
    val translatedLength: Long = translated.foldLeft(0L)( (acum: Long, element: DocumentationFile) => acum + element.length )

    // type inference to the rescue
    val translatedLength = translated.foldLeft(0L)( (acum, element) => acum + element.length )

    // placeholder syntax
    val translatedLength = translated.foldLeft(0L)( _ + _.length )

    // yes, the if statement is also an expression, just like the a ? b : c Java operator
    val translatedLength = if (translated.length == 0) 0L else translated.map(_.length).sum
I finally settled on this simple and short form:
    val translatedLength = translated.map(_.length).sum
    val docsLength = docs.map(_.length).sum
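To check that these forms agree, here is a sketch on made-up byte counts; the sizes array is invented for illustration and stands in for translated.map(_.length):

```scala
// Invented byte counts standing in for the lengths of the translated files.
val sizes: Array[Long] = Array(1200L, 830L, 4096L)

val viaFoldLeft = sizes.foldLeft(0L)((acc, size) => acc + size)
val viaPlaceholders = sizes.foldLeft(0L)(_ + _)
val viaSum = sizes.sum

// sum on an empty collection yields the numeric zero,
// so the short form needs no explicit emptiness check.
val emptySum = Array.empty[Long].sum
```

All three values come out as 6126, and emptySum is 0.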
Default parameters and passing functions as arguments
Now I have all the information I needed, so I just have to show it on screen. I also wanted to show the file size in kilobytes.
Once again this was my first attempt:
    println(
      "translated size: " + asKB(translatedLength) + "/" + asKB(docsLength) + " " +
      translatedLength * 100 / docsLength + "% "
    )

    println(
      "translated files: " + translated.length + "/" + docs.length + " " +
      translated.length * 100 / docs.length + "% "
    )

    def asKB(length: Long) = (length / 1000).toString + "kb"
And this was the output:
    translated size: 256kb/612kb 41%
    translated files: 24/64 37%
Well, it worked, but it could definitely be improved: there was too much code duplication.
So I created a function that took care of it all:
    def status(
      title: String = "status",
      current: Long, total: Long,
      format: (Long) => String = (x) => x.toString): String = {

      val percent = current * 100 / total

      title + ": " + format(current) + "/" + format(total) + " " +
      percent + "%" +
      " (pending " + format(total - current) + " " +
      (100 - percent) + "%)"
    }
The only tricky part is the format parameter. It’s just a function passed as an argument (which makes status a higher-order function), and by default it simply converts the number it receives to a String.
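Default values and function-typed parameters can also be tried in isolation. The sketch below uses a hypothetical label helper, shaped like status but much smaller; none of these names come from the original script:

```scala
// Hypothetical helper: title and format both have defaults, value does not.
def label(title: String = "status", value: Long, format: Long => String = _.toString): String =
  title + ": " + format(value)

val positional = label("size", 2048L)                                    // format falls back to its default
val customFormat = label("size", 2048L, v => (v / 1000).toString + "kb") // format supplied by the caller
val defaultTitle = label(value = 2048L)                                  // named argument, so title can be omitted
```

Because title comes first, its default can only be used when the remaining arguments are passed by name, as in the last call.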
We use that function like this:
    println(
      status("translated size", translatedLength, docsLength, (length) => asKB(length) )
    )

    println(
      status("translated files", translated.length, docs.length)
    )
And that’s it.
It’s really easy to achieve this kind of stuff using Scala as a scripting language, and along the way you may learn a couple of interesting concepts and take your first steps into functional programming.
This is the complete script; there is also a GitHub gist of it, and you can find it in the Play Spanish documentation project as well.
    import java.io.{BufferedReader, File, FileReader}

    val docs = new File(".").listFiles
      .filter(_.getName.endsWith(".textile"))   // process only textile files
      .map(new DocumentationFile(_))

    val translated = docs.filter(_.isTranslated)    // only already translated files

    val translatedLength = translated.map(_.length).sum
    val docsLength = docs.map(_.length).sum

    println(
      status("translated size", translatedLength, docsLength, (length) => asKB(length) )
    )

    println(
      status("translated files", translated.length, docs.length)
    )

    def status(
      title: String = "status",
      current: Long, total: Long,
      format: (Long) => String = (x) => x.toString): String = {
      val percent = current * 100 / total

      title + ": " + format(current) + "/" + format(total) + " " +
      percent + "%" +
      " (pending " + format(total - current) + " " +
      (100 - percent) + "%)"
    }

    def asKB(length: Long) = (length / 1000).toString + "kb"

    class DocumentationFile(val file: File) {
      val name = file.getName
      val length = file.length
      val isTranslated = (firstLine.indexOf("Esta página todavía no ha sido traducida al castellano") == -1)

      override def toString = "name: " + name + ", length: " + length + ", isTranslated: " + isTranslated

      def firstLine = new BufferedReader(new FileReader(file)).readLine
    }
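One detail the one-line firstLine leaves open is that the reader is never closed, and readLine returns null for an empty file. A possible hardening, sketched here as a standalone readFirstLine function (a hypothetical name, not part of the original script), could look like this:

```scala
import java.io.{BufferedReader, File, FileReader}

// Reads the first line of a file, closing the reader afterwards.
// Returns an empty string when the file has no content at all.
def readFirstLine(file: File): String = {
  val reader = new BufferedReader(new FileReader(file))
  try {
    // readLine returns null at end of input, e.g. for an empty file
    Option(reader.readLine()).getOrElse("")
  } finally {
    reader.close()
  }
}
```

For a handful of documentation files the original version does the job; this variant mostly matters when many files are scanned or when some of them may be empty.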
Source: http://playlatam.wordpress.com/2011/12/05/first-steps-with-scala-say-goodbye-to-bash-scripts/