First steps with Scala, say goodbye to bash scripts…

  opensas        2012-01-12 06:45:16

Those who know me are aware that I’ve been following Play framework, and actively taking part in its community, for a couple of years.

Play framework 2.0 is right around the corner, and its core is written in Scala, so it’s a wonderful opportunity to give this object-oriented / functional hybrid beast a try…

Like many others, I will pick a very simple script to take my first steps…

Finding an excuse to give Scala a try

With a couple of friends, I am translating the Play framework documentation into Spanish (have a look at it at http://playdoces.appspot.com/; by the way, you are more than welcome to collaborate with us).

The documentation is composed of a bunch of .textile files, and I had a very simple and silly bash script to track our progress. Every file that has not yet been translated has the phrase “todavía no ha sido traducida” (“has not been translated yet”) in its first line.

echo pending: `grep "todavía no ha sido traducida" * | wc -l` / `ls | wc -l`

Which produced something like:

pending: 40 / 63

Pretty simple, right?

I just wanted to write a simple Scala script to count the translated files, and also their size, to know how much work we had ahead.

Scala as a scripting language

Using Scala as a scripting language is pretty simple. Just enter some Scala code in a text file, and execute it with “scala file.scala”. You can also try it with the interactive interpreter, better known as the REPL (well, it’s not really an interpreter, but a Read-Eval-Print Loop; that’s where the REPL name comes from).

On Linux, you can also execute scripts directly from the shell by marking the Scala file as executable and adding these lines to the beginning of the file:

#!/bin/sh
exec scala "$0" "$@"
!#

Classes and type inference in Scala

So I created a DocumentationFile class, with a name, a length and an isTranslated property.

import java.io._

class DocumentationFile(val file: File) {

  val name = file.getName
  val length = file.length
  // a file counts as translated when its first line no longer carries the "not yet translated" marker
  val isTranslated = (firstLine.indexOf("Esta página todavía no ha sido traducida al castellano") == -1)

  // reads the first line of the file
  def firstLine = new BufferedReader(new FileReader(file)).readLine

}

Scala takes away a lot of boilerplate code. The constructor is right there, along with the class declaration. In our case, the DocumentationFile constructor takes a java.io.File as its argument.
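One thing the class above glosses over: firstLine opens a reader that is never closed, and readLine returns null for an empty file. Here is a minimal sketch of a more defensive variant. It assumes the files are UTF-8 encoded and uses scala.io.Source from the standard library; the helper name readFirstLine is mine and is not part of the original script.

```scala
import java.io.File
import scala.io.Source

// Hypothetical helper: returns the first line of a file, or "" if the file is empty.
// Assumption: the file is UTF-8 encoded.
def readFirstLine(file: File): String = {
  val source = Source.fromFile(file, "UTF-8")
  try {
    val lines = source.getLines()
    if (lines.hasNext) lines.next() else ""
  } finally {
    source.close()   // always release the file handle, even if reading fails
  }
}
```

For a script that reads a few dozen files and then exits, the leak is harmless in practice, so take this as a habit worth having rather than a required fix.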

Scala also makes heavy use of type inference to relieve us of having to declare every variable’s type. That’s why you don’t have to specify that name is a String, length a Long and isTranslated a Boolean. You still have to declare the types of method parameters, but usually you can omit them everywhere else.
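To make the inference visible, this small sketch declares the same kind of values twice, once inferred and once with the types written out. The file name and the describe method are made up for the illustration.

```scala
import java.io.File

val file = new File("index.textile")   // hypothetical file name, it does not need to exist

// Inferred: the compiler works out the types from the right-hand side.
val name = file.getName
val length = file.length

// Explicit: the same declarations with the types spelled out.
val nameExplicit: String = file.getName
val lengthExplicit: Long = file.length
val isLarge: Boolean = lengthExplicit > 1000

// Method parameters always need a type; the return type (String here) is inferred.
def describe(n: String, l: Long) = n + " (" + l + " bytes)"
```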

Working with collections

Next I needed to get all textile files from the current directory, instantiate a DocumentationFile for each of them, and save them in an Array for later processing.

import java.io._

val docs = new File(".").listFiles
  .filter(_.getName.endsWith(".textile"))   // process only textile files
  .map(new DocumentationFile(_))

Technically speaking, it’s just a single statement. The “_” is just syntactic sugar; we could have written it in a more verbose way like this:

val docs = new File(".").listFiles
  .filter( file => file.getName.endsWith(".textile") )   // process only textile files
  .map( file => new DocumentationFile(file) )

Or if you are a curly braces fan:

val docs = new File(".").listFiles
  .filter { file =>
    file.getName.endsWith(".textile")         // process only textile files
  }
  .map { file =>
    new DocumentationFile(file)
  }
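A caveat with all three versions: java.io.File.listFiles returns null when the path is not a directory (or on an I/O error), so the chain would fail with a NullPointerException. A possible sketch of a guard, wrapping the result in an Option; the helper textileNames is hypothetical and only collects file names to keep the example short.

```scala
import java.io.File

// Hypothetical helper: sorted names of the .textile files in a directory,
// or an empty array when the path is not a readable directory.
def textileNames(dir: File): Array[String] =
  Option(dir.listFiles)              // Option(null) is None
    .getOrElse(Array.empty[File])
    .map(_.getName)
    .filter(_.endsWith(".textile"))
    .sorted
```

For the current directory this should never matter, but it does once the script takes the directory as an argument.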

Higher order functions

Once we have all textile files, we’ll need the translated ones.

val translated = docs.filter(_.isTranslated)

Here we are passing a function to the filter method as a parameter (a method that takes a function as an argument is what is called a higher-order function). That function is evaluated for every item in the Array, and if it returns true, that item is added to the resulting Array. The “_.isTranslated” part is once again just syntactic sugar. We could have also written the function as follows:

val translated = docs.filter( (doc: DocumentationFile) => doc.isTranslated )
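Writing a higher-order function of your own works the same way. This is a small sketch on plain strings; countMatching and isTextile are names I made up for the example.

```scala
// A function value stored in a val: takes a String, returns a Boolean.
val isTextile: String => Boolean = name => name.endsWith(".textile")

// A hypothetical higher-order function: it receives the predicate as an argument.
def countMatching(names: List[String], predicate: String => Boolean): Int =
  names.count(predicate)

val names = List("home.textile", "README", "routes.textile")

val textileCount = countMatching(names, isTextile)       // 2
val shortCount   = countMatching(names, _.length < 8)    // 1, only "README" is shorter than 8
```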

Functional versus imperative: To var or not to var

Now I need to calculate the number and size of the translated and not yet translated files. Counting the files is pretty easy: I just have to use “translated.length” to know how many files have been translated so far. But to get their total size I have to sum the size of each one of them.

This was my first attempt:

var translatedLength = 0L
translated.foreach( translatedLength += _.length )

In Scala we can declare variables with the “var” and “val” keywords: the former are mutable, while the latter are immutable. Mutable variables are read-write, while immutable variables can’t be reassigned once their value has been established (think of them like final variables in Java).
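A quick sketch of the difference, with made-up values. As with final in Java, a val fixes the reference and not the contents of what it points to.

```scala
val total = 10L        // immutable: writing total = 20L would not compile
var counter = 0L       // mutable: can be reassigned

counter += total
counter += 5L          // counter is now 15

// A val fixes the reference, not the contents: the array itself is still mutable.
val sizes = Array(1L, 2L, 3L)
sizes(0) = 100L
```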

While Scala allows you to work in an imperative or functional style, it really encourages the latter. Programming in Scala, kind of the Scala bible, even teaches how to refactor your code to avoid the use of mutable variables, and get your head used to a more functional programming style.

These are several ways I’ve found to calculate it in a more functional style (thanks to Stack Overflow!):

// four equivalent alternatives; keep only one of them in a real script

// explicit types everywhere
val translatedLength: Long = translated.foldLeft(0L)( (acum: Long, element: DocumentationFile) => acum + element.length )

// type inference to the rescue
val translatedLength = translated.foldLeft(0L)( (acum, element) => acum + element.length )

// syntactic sugar
val translatedLength = translated.foldLeft(0L)( _ + _.length )

// yes, if is also an expression, just like Java's a ? b : c operator
val translatedLength = if (translated.length == 0) 0L else translated.map(_.length).sum

I’ve finally settled on this simple and short form:

val translatedLength = translated.map(_.length).sum
val docsLength = docs.map(_.length).sum
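To convince yourself that the fold and the sum are equivalent, here is a sketch on a plain list of made-up file sizes, so it runs without any files on disk.

```scala
val lengths = List(1200L, 300L, 4500L)   // made-up file sizes in bytes

// foldLeft threads an accumulator through the list, starting from 0L.
val viaFold = lengths.foldLeft(0L)( (acc, n) => acc + n )

// sum does the same for numeric collections.
val viaSum = lengths.sum

// sum of an empty collection is 0, so no special case is needed for it.
val emptySum = List.empty[Long].sum
```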

Default parameters and passing functions as arguments

Now I have all the information I needed, so I just have to show it on screen. I also wanted to show the file size in kilobytes.

Once again this was my first attempt:

println(
  "translated size: " + asKB(translatedLength) + "/" + asKB(docsLength) + " " +
  translatedLength * 100 / docsLength + "% "
)

println(
  "translated files: " + translated.length + "/" + docs.length + " " +
  translated.length * 100 / docs.length + "% "
)

// whole kilobytes (integer division drops the remainder)
def asKB(length: Long) = (length / 1000).toString + "kb"

And this was the output:

translated size: 256kb/612kb 41%
translated files: 24/64 37%
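The percentages come out as whole numbers because Int and Long arithmetic uses integer division, which is also why the multiplication by 100 has to come first. A small sketch with the file counts from the output above (the variable names are mine):

```scala
val translatedFiles = 24L
val totalFiles = 64L

val percent = translatedFiles * 100 / totalFiles       // 2400 / 64 = 37 (37.5 truncated)
val wrongOrder = translatedFiles / totalFiles * 100    // 24 / 64 = 0, then 0 * 100 = 0

// Hypothetical alternative if you want the fraction instead of truncation.
val precise = translatedFiles * 100.0 / totalFiles     // 37.5
```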

Well, it worked, but it could definitely be improved: there was too much code duplication.

So I created a function that took care of it all:

def status(
  title: String = "status",
  current: Long, total: Long,
  format: (Long) => String = (x) => x.toString): String = {

  val percent = current * 100 / total

  title + ": " + format(current) + "/" + format(total) + " " +
  percent + "%" +
  " (pending " + format(total - current) + " " +
  (100-percent) + "%)"
}

The only tricky part is the format parameter. It’s just a function passed as an argument (which makes status a higher-order function), and by default it just converts the passed number to a String.

We use that function like this:

println(
  status("translated size", translatedLength, docsLength, (length) => asKB(length) )
)

println(
  status("translated files", translated.length, docs.length)
)
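One detail about the defaults: since title is the first parameter and the ones after it have no default, its default value only kicks in when the other arguments are passed by name. The sketch below repeats the status function so it runs on its own; the sample numbers and the " bytes" formatter are made up.

```scala
def status(
  title: String = "status",
  current: Long, total: Long,
  format: (Long) => String = (x) => x.toString): String = {

  val percent = current * 100 / total

  title + ": " + format(current) + "/" + format(total) + " " +
  percent + "%" +
  " (pending " + format(total - current) + " " +
  (100 - percent) + "%)"
}

// Named arguments let us skip title and fall back on both defaults.
val withDefaults = status(current = 24, total = 64)
// "status: 24/64 37% (pending 40 63%)"

// Overriding only the formatter, again by name.
val inBytes = status(current = 24, total = 64, format = n => n.toString + " bytes")
// "status: 24 bytes/64 bytes 37% (pending 40 bytes 63%)"
```

Note that status divides by total, so it throws an ArithmeticException if total is 0 (for example, when the directory contains no .textile files).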

And that’s it.

It’s really easy to achieve this kind of stuff using Scala as a scripting language, and along the way you may learn a couple of interesting concepts and take your first steps into functional programming.

This is the complete script. It is also available as a GitHub gist, and you can find it in the Play Spanish documentation project.

#!/bin/sh
exec scala "$0" "$@"
!#

import java.io._

val docs = new File(".").listFiles
  .filter(_.getName.endsWith(".textile"))   // process only textile files
  .map(new DocumentationFile(_))

val translated = docs.filter(_.isTranslated)    // only already translated files

val translatedLength = translated.map(_.length).sum
val docsLength = docs.map(_.length).sum

println(
  status("translated size", translatedLength, docsLength, (length) => asKB(length) )
)

println(
  status("translated files", translated.length, docs.length)
)

def status(
  title: String = "status",
  current: Long, total: Long,
  format: (Long) => String = (x) => x.toString): String = {

  val percent = current * 100 / total

  title + ": " + format(current) + "/" + format(total) + " " +
  percent + "%" +
  " (pending " + format(total - current) + " " +
  (100-percent) + "%)"
}

// whole kilobytes (integer division drops the remainder)
def asKB(length: Long) = (length / 1000).toString + "kb"

class DocumentationFile(val file: File) {

  val name = file.getName
  val length = file.length
  // a file counts as translated when its first line no longer carries the "not yet translated" marker
  val isTranslated = (firstLine.indexOf("Esta página todavía no ha sido traducida al castellano") == -1)

  override def toString = "name: " + name + ", length: " + length + ", isTranslated: " + isTranslated

  // reads the first line of the file
  def firstLine = new BufferedReader(new FileReader(file)).readLine

}

Source:http://playlatam.wordpress.com/2011/12/05/first-steps-with-scala-say-goodbye-to-bash-scripts/
